Probabilistic Distance Clustering: Algorithm and Applications

2009-02-01
The probabilistic distance clustering method of the authors [2, 8] assumes that the cluster membership probabilities are given in terms of the distances of the data points from the cluster centers and the cluster sizes. A resulting extremal principle is then used to update the cluster centers (as convex combinations of the data points) and the cluster sizes (if not given). Progress is monitored by the joint distance function (JDF), a weighted harmonic mean of the above distances, which approximates the data by capturing the data points in its lowest contours. The method is described and applied to clustering, location problems, and mixtures of distributions, where it is a viable alternative to the Expectation–Maximization (EM) method. The JDF also helps to determine the “right” number of clusters for a given data set.
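
The abstract states no formulas, so the following is only a minimal sketch of the kind of iteration it describes, under several assumptions that are not taken from the text: equal cluster sizes, Euclidean distances, membership probabilities inversely proportional to distance (so that probability times distance is the same for every cluster at a given point), and center-update weights of the form p²/d. The function names `memberships`, `jdf`, and `pd_cluster` are hypothetical and chosen for illustration.

```python
import math

EPS = 1e-12  # guards against division by zero when a point sits on a center


def memberships(dists):
    """Membership probabilities, ASSUMED inversely proportional to distance."""
    inv = [1.0 / max(d, EPS) for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


def jdf(dists):
    """Joint distance function, ASSUMED to be 1 / sum_k (1 / d_k).

    This is the harmonic mean of the distances divided by their count.
    """
    return 1.0 / sum(1.0 / max(d, EPS) for d in dists)


def pd_cluster(points, centers, iters=100, tol=1e-9):
    """Update centers as convex combinations of the data points.

    The weights p_k(x)^2 / d_k(x) are an assumption of this sketch.
    Returns the final list of centers.
    """
    centers = [list(c) for c in centers]
    dim = len(centers[0])
    for _ in range(iters):
        num = [[0.0] * dim for _ in centers]
        den = [0.0] * len(centers)
        for x in points:
            d = [math.dist(x, c) for c in centers]
            p = memberships(d)
            for k in range(len(centers)):
                w = p[k] ** 2 / max(d[k], EPS)
                den[k] += w
                for j in range(dim):
                    num[k][j] += w * x[j]
        # Each new center is a weighted average (convex combination) of points.
        new = [[n / den[k] for n in num[k]] for k in range(len(centers))]
        shift = max(math.dist(a, b) for a, b in zip(new, centers))
        centers = new
        if shift < tol:
            break
    return centers


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    found = pd_cluster(data, [(2.0, 2.0), (8.0, 8.0)])
    # Total JDF over the data: smaller values mean the centers fit better.
    total = sum(jdf([math.dist(x, c) for c in found]) for x in data)
    print(found, total)
```

Summing `jdf` over the data after each pass gives one way to monitor progress, and comparing that total across different numbers of centers is one plausible reading of how the JDF could guide the choice of cluster count. Unequal cluster sizes, which the abstract also mentions, are not handled here.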

Citation Formats
C. İyigün, Probabilistic Distance Clustering: Algorithm and Applications. 2009, p. 52.