On a Minimal Spanning Tree Approach in the Cluster Validation Problem

2009-01-01
Barzily, Zeev
Volkovich, Zeev
Öztürk, Başak
Weber, Gerhard Wilhelm
In this paper, a method for the study of cluster stability is proposed. We draw pairs of samples from the data according to two sampling distributions. The first distribution corresponds to the high-density zones of the data-element distribution and is thus associated with the cluster cores. The second, associated with the cluster margins, is related to the low-density zones. The samples are clustered and the two resulting partitions are compared. The partitions are considered consistent if the obtained clusters are similar. The resemblance is measured by the total number of edges in the clusters' minimal spanning trees that connect points from different samples. We use the Friedman and Rafsky two-sample test statistic. Under the homogeneity hypothesis, this statistic is normally distributed. Thus, it can be expected that the true number of clusters corresponds to the empirical distribution of the statistic that is closest to normal. Numerical experiments demonstrate the ability of the approach to detect the true number of clusters.
INFORMATICA
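
The edge-counting step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Euclidean distances, builds the minimal spanning tree of the pooled points with Prim's algorithm, and counts the edges that join points from different samples. The function names `mst_edges` and `cross_sample_edges` are hypothetical, and the sampling distributions, the clustering step, and the normalization of the statistic are not covered.

```python
import math


def mst_edges(points):
    """Return the MST edges (i, j) of the complete Euclidean graph on `points`.

    Uses Prim's algorithm in O(n^2) time, which is adequate for small samples.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known connection to the tree
    best_from = [-1] * n        # tree vertex realizing that connection
    in_tree[0] = True
    for j in range(1, n):
        best_dist[j] = math.dist(points[0], points[j])
        best_from[j] = 0
    edges = []
    for _ in range(n - 1):
        # Attach the outside vertex that is closest to the current tree.
        j = min((k for k in range(n) if not in_tree[k]),
                key=lambda k: best_dist[k])
        in_tree[j] = True
        edges.append((best_from[j], j))
        # Update the cheapest connections using the newly added vertex.
        for k in range(n):
            if not in_tree[k]:
                d = math.dist(points[j], points[k])
                if d < best_dist[k]:
                    best_dist[k] = d
                    best_from[k] = j
    return edges


def cross_sample_edges(sample_a, sample_b):
    """Count MST edges of the pooled sample that join a point of A to a point of B."""
    pooled = list(sample_a) + list(sample_b)
    labels = [0] * len(sample_a) + [1] * len(sample_b)
    return sum(1 for i, j in mst_edges(pooled) if labels[i] != labels[j])


# Two well-separated samples: only one MST edge bridges them.
separated = cross_sample_edges([(0, 0), (0, 1), (1, 0)],
                               [(10, 10), (10, 11), (11, 10)])

# Two interleaved samples on a line: every MST edge joins different samples.
interleaved = cross_sample_edges([(0,), (2,), (4,)], [(1,), (3,), (5,)])
```

In this toy setting, few cross-sample edges indicate that the two samples occupy different regions, while many indicate that they are well mixed. In the approach described above, the count would be taken within each cluster and summed over the clusters.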

Citation Formats
Z. Barzily, Z. Volkovich, B. Öztürk, and G. W. Weber, “On a Minimal Spanning Tree Approach in the Cluster Validation Problem,” INFORMATICA, pp. 187–202, 2009, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/54010.