Wednesday, November 20, 2019
Cluster Analysis Essay Example | Topics and Well Written Essays - 3750 words
Several statistics are associated with cluster analysis and are used to interpret its results. Clustering can be hierarchical or non-hierarchical, and each type is further divided into various methods. Hierarchical clustering develops a tree-like structure and can be either agglomerative or divisive. In agglomerative clustering each object starts out as a separate cluster; clusters are then grouped into bigger and bigger clusters, and the process continues until all the cases are members of a single cluster. Agglomerative methods include linkage methods, error sum of squares or variance methods, and centroid methods. The linkage methods are single linkage, complete linkage and average linkage. Single linkage is based on the minimum distance between two clusters. Complete linkage is based on the maximum distance. Average linkage is based on the average distance between all pairs of objects, where one member of each pair comes from each of the two clusters. Variance methods try to minimize the within-cluster variance. Ward's procedure is a variance method in which the squared Euclidean distance to the cluster means is minimized. In the centroid method the distance between two clusters is computed as the distance between their centroids. Average linkage and Ward's method are generally reported to perform better than the other procedures.
Now we shall discuss the statistics associated with cluster analysis. The agglomeration schedule gives information on the cases or clusters being combined at each stage of a hierarchical clustering. The cluster centroid is the mean value of the variables over all the cases in a cluster. A dendrogram is a tree-like graph that displays the results of the cluster analysis. Vertical lines represent clusters that are joined together, and the position of each line on the scale indicates the distance at which the clusters were joined. The graph is generally read from left to right.
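The methods described above can be illustrated with a minimal sketch in plain Python. It is not taken from the essay: the function names (`centroid`, `cluster_distance`, `agglomerate`), the label `coefficient` and the five sample points are all hypothetical, and the Ward rule is written in its usual form as the increase in the within-cluster sum of squares caused by a merge. The loop is a naive one, meant only to show how an agglomeration schedule is built up stage by stage.

```python
from itertools import combinations, product
from math import dist


def centroid(cluster):
    """Mean of each variable over the cases in a cluster."""
    n = len(cluster)
    return tuple(sum(values) / n for values in zip(*cluster))


def cluster_distance(a, b, method="average"):
    """Distance between clusters a and b (lists of points) under one rule."""
    if method == "centroid":
        return dist(centroid(a), centroid(b))
    if method == "ward":
        # Increase in within-cluster sum of squares if a and b are merged.
        weight = len(a) * len(b) / (len(a) + len(b))
        return weight * dist(centroid(a), centroid(b)) ** 2
    # Linkage methods look at every pair with one member from each cluster.
    pair_dists = [dist(p, q) for p, q in product(a, b)]
    if method == "single":
        return min(pair_dists)
    if method == "complete":
        return max(pair_dists)
    if method == "average":
        return sum(pair_dists) / len(pair_dists)
    raise ValueError(f"unknown method: {method}")


def agglomerate(points, method="average"):
    """Naive agglomerative clustering; returns the agglomeration schedule."""
    clusters = [[i] for i in range(len(points))]  # every case starts alone
    schedule = []

    def d(pair):
        a, b = pair
        return cluster_distance(
            [points[i] for i in a], [points[i] for i in b], method
        )

    while len(clusters) > 1:
        # Merge the closest pair of clusters and record the stage.
        a, b = min(combinations(clusters, 2), key=d)
        schedule.append((sorted(a), sorted(b), d((a, b))))
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
    return schedule


points = [(1, 1), (1, 2), (5, 5), (5, 6), (10, 10)]  # made-up cases
for stage, (a, b, coefficient) in enumerate(agglomerate(points, "single"), 1):
    print(stage, a, b, round(coefficient, 3))
```

Each printed row is one stage of the schedule: the case numbers in the two clusters being combined and the distance at which they were joined. Passing `"complete"`, `"average"`, `"centroid"` or `"ward"` instead of `"single"` shows how the choice of method changes those distances.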
The distance between cluster centers indicates how well separated the pairs of clusters are. Clusters that are widely separated and distinct are desirable. An icicle diagram is another graph that displays the clustering results. It is so called because it resembles a row of icicles hanging from the eaves of a house. The columns represent the cases being clustered and the rows correspond to the number of clusters. This diagram is read from bottom to top. In this case, the Chestnut Ridge Club clustering is based on the attitudes of the respondents toward joining a club. The respondents expressed their attitudes on a scale of 1 to 5. The objective is to group similar cases, which requires a measure of how similar or different the cases are. The approach is to measure similarity in terms of the distance between pairs of objects. There are different ways to measure this distance, and the results obtained with different measures can be compared. Here agglomerative hierarchical clustering is selected, with Ward's procedure as the clustering method. Generally the choice of clustering method and the choice of distance measure are related. The variables are all measured on the same five-point scale. Ward's method is a variance method rather than a linkage method: at each stage the squared Euclidean distance to the cluster means is minimized. The important outputs obtained here are agglomeration schedule which shows the number of clusters combined at each