Clustering Analysis: Difference between revisions

Revision as of 06:08, 21 March 2019

The Clustering Analysis view groups the cases in the model so that the cases within a group are similar to each other (e.g. cases that have the same case attribute values are placed in the same group). See this Wikipedia article for more about the idea behind clustering.

You can use the Clustering Analysis view, for example, to check data integrity: the analysis might reveal that the model actually contains data from two different processes.

Left Panel

You can use the left panel to filter cases. You are not limited to the Flowchart analysis: to change it, right-click the analysis and select a different type of analysis from those shown on the panel.

Right Panel

The right panel contains the clustering analysis. The table shows the clusters, how many cases are in each cluster, and the following details for each cluster:

Feature and Value: These two columns list the case attribute and other values that are common to the cases in the cluster.
Total Density %: share of cases having this feature value in the whole data set (i.e. the total number of cases having the value shown on the row, divided by the total number of cases and multiplied by 100).
Cluster Density %: share of cases having this feature value within the cluster (i.e. the number of cases in this particular cluster having the value shown on the row, divided by the number of cases in the cluster and multiplied by 100).
Contribution %: the share of cases whose membership in this cluster can be explained by this feature value. The scale is such that 0% means that the feature value isn't specific to this cluster and 100% means that all cases belonging to this cluster can be explained by this feature value.
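
The two density columns described above can be illustrated with a small calculation. The following is only a sketch under stated assumptions, not the product's implementation: the function name `density_table`, the dictionary keys, and the sample `Region` attribute are all hypothetical, and the cluster assignment is taken as given rather than computed. Contribution % is left out because the text above describes its scale but not its formula.

```python
from collections import Counter


def density_table(cases, clusters, feature):
    """Compute Total Density % and Cluster Density % for one feature.

    cases    -- list of dicts mapping case attribute names to values (assumed shape)
    clusters -- list of cluster ids, one per case, in the same order as cases
    feature  -- name of the case attribute to summarise
    """
    total_cases = len(cases)
    # Number of cases per feature value in the whole data set.
    value_counts = Counter(case[feature] for case in cases)
    # Number of cases per cluster.
    cluster_sizes = Counter(clusters)
    # Number of cases per (cluster, feature value) pair.
    pair_counts = Counter(
        (cluster, case[feature]) for case, cluster in zip(cases, clusters)
    )

    rows = []
    for (cluster, value), count in sorted(pair_counts.items()):
        rows.append({
            "cluster": cluster,
            "feature": feature,
            "value": value,
            # Cases with this value in the whole data set / all cases * 100.
            "total_density_pct": 100.0 * value_counts[value] / total_cases,
            # Cases with this value in this cluster / cases in the cluster * 100.
            "cluster_density_pct": 100.0 * count / cluster_sizes[cluster],
        })
    return rows


# Hypothetical sample: four cases split into two clusters of two cases each.
sample_cases = [
    {"Region": "EU"},
    {"Region": "EU"},
    {"Region": "US"},
    {"Region": "EU"},
]
sample_clusters = [0, 0, 1, 1]

for row in density_table(sample_cases, sample_clusters, "Region"):
    print(row)
```

In this sample, the value EU has a Total Density of 75% (three of the four cases) but a Cluster Density of 100% in cluster 0 and 50% in cluster 1. A large gap between the two densities is what marks a feature value as characteristic of a cluster.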

@@ Line 9: / Line 9: @@
 The right panel contains the clustering analysis. The table shows the clusters, how many cases are in each cluster, and the following details for each cluster:
 * '''Feature''' and '''Value''': These two columns list the case attribute and other values that are common to the cases in the cluster.
-* '''Cluster Density''': share of cases having this feature value within the cluster (i.e. the number of cases having the value shown on the row in this particular cluster divided by the number of cases in the cluster * 100).
+* '''Total Density %''': share of cases having this feature value in the whole data set (i.e. the total number of cases having the value shown on the row divided by the total number of cases * 100).
-* '''Total Density''': share of cases having this feature value in the whole data set (i.e. the total number of cases having the value shown on the row divided by the total number of cases * 100).
+* '''Cluster Density %''': share of cases having this feature value within the cluster (i.e. the number of cases having the value shown on the row in this particular cluster divided by the number of cases in the cluster * 100).
 * '''Contribution %''': the amount of cases that can be explained to belong to this cluster because of this feature value. The scale is such that 0% means that the feature value isn't specific to this cluster and 100% means that all cases belonging to this cluster can be explained by this feature value.
