| Title: | Universal Clustering Analysis Platform |
|---|---|
| Description: | An interactive platform for clustering analysis and teaching based on the 'shiny' web application framework. Supports multiple popular clustering algorithms including k-means, hierarchical clustering, DBSCAN (Density-Based Spatial Clustering of Applications with Noise), PAM (Partitioning Around Medoids), GMM (Gaussian Mixture Model), and spectral clustering. Users can upload datasets or use built-in ones, visualize clustering results using dimensionality reduction methods such as Principal Component Analysis (PCA) and t-distributed Stochastic Neighbor Embedding (t-SNE), evaluate clustering quality via silhouette plots, and explore method-specific visualizations and guides. For details on implemented methods, see: Reynolds (2009, ISBN:9781598296975) for GMM; Luxburg (2007) <doi:10.1007/s11222-007-9033-z> for spectral clustering. |
| Authors: | Yijin Zhou [aut, cre] |
| Maintainer: | Yijin Zhou <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.3 |
| Built: | 2026-05-18 07:38:28 UTC |
| Source: | https://github.com/cran/clusterWebApp |
Calculates the average silhouette coefficient from a silhouette object.
compute_silhouette(sil)
| Argument | Description |
|---|---|
| sil | A silhouette object as returned by |
A numeric value indicating the average silhouette width, or NA if input is NULL.
data <- scale(iris[, 1:4])
cl <- kmeans(data, 3)$cluster
sil <- cluster::silhouette(cl, dist(data))
if (interactive()) {
  compute_silhouette(sil)
}
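As a rough illustration of what "average silhouette width" means here, the sketch below averages the `sil_width` column of a silhouette object and returns NA for NULL input. The helper name `avg_sil_width` is hypothetical and is not taken from the package source; the actual implementation of `compute_silhouette()` may differ.

```r
library(cluster)  # provides silhouette()

# Hypothetical helper (an assumption, not the package's code):
# mean silhouette width, or NA when no silhouette object is available.
avg_sil_width <- function(sil) {
  if (is.null(sil)) {
    return(NA_real_)
  }
  # A silhouette object is a matrix with a "sil_width" column.
  mean(sil[, "sil_width"])
}

set.seed(42)  # kmeans uses random starts
data <- scale(iris[, 1:4])
cl <- kmeans(data, centers = 3, nstart = 10)$cluster
sil <- silhouette(cl, dist(data))
avg <- avg_sil_width(sil)
print(avg)
```

The same value is also available from `summary(sil)$avg.width`.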
Uses within-cluster sum of squares (WSS) to help determine the optimal number of clusters.
plot_elbow(data)
| Argument | Description |
|---|---|
| data | A numeric matrix or data frame for clustering. |
A ggplot object showing the elbow plot.
data <- scale(iris[, 1:4])
if (interactive()) {
  plot_elbow(data)
}
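The quantity behind an elbow plot can be sketched with base R alone: run k-means for a range of k and record the total within-cluster sum of squares. The helper `wss_by_k`, the range 1 to 10, and `nstart = 10` are assumptions made for this illustration; `plot_elbow()` itself returns a ggplot object and may use different settings.

```r
# Hypothetical helper (an assumption, not the package's code):
# total within-cluster sum of squares for k = 1..max_k.
wss_by_k <- function(data, max_k = 10, nstart = 10) {
  vapply(seq_len(max_k), function(k) {
    kmeans(data, centers = k, nstart = nstart)$tot.withinss
  }, numeric(1))
}

set.seed(42)  # kmeans uses random starts
data <- scale(iris[, 1:4])
wss <- wss_by_k(data)

# Base-graphics stand-in for the elbow plot; look for the bend in the curve.
if (interactive()) {
  plot(seq_along(wss), wss, type = "b",
       xlab = "Number of clusters k", ylab = "Total WSS")
}
```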
Displays the medoids of each PAM cluster using a polar radar chart.
plot_radar(data, clusters)
| Argument | Description |
|---|---|
| data | A numeric matrix or data frame for clustering. |
| clusters | An integer indicating the number of clusters. |
A ggplot object showing the radar chart of cluster medoids.
data <- scale(iris[, 1:4])
if (interactive()) {
  plot_radar(data, clusters = 3)
}
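The medoids that a radar chart like this would display can be inspected directly with `cluster::pam()`. This is a minimal sketch under the assumption that the chart is built from the `medoids` component of a PAM fit; how `plot_radar()` actually derives and scales its values is not shown in this documentation.

```r
library(cluster)  # provides pam()

data <- scale(iris[, 1:4])
fit <- pam(data, k = 3)

# One row per cluster; each medoid is an actual observation from the data.
medoids <- fit$medoids
print(round(medoids, 2))

# Row indices of the medoid observations in the input data.
print(fit$id.med)
```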
Plots the silhouette diagram for a given clustering result.
plot_silhouette(sil)
| Argument | Description |
|---|---|
| sil | A silhouette object as returned by |
A silhouette plot if input is not NULL, otherwise a placeholder text.
data <- scale(iris[, 1:4])
cl <- kmeans(data, 3)$cluster
sil <- cluster::silhouette(cl, dist(data))
if (interactive()) {
  plot_silhouette(sil)
}
Loads and preprocesses a built-in dataset for clustering analysis. Depending on the dataset name provided, different cleaning steps are applied.
prepare_data(dataset)
| Argument | Description |
|---|---|
| dataset | A string specifying the dataset name. Options are: "iris", "USArrests", "mtcars", "CO2", "swiss", "Moons". |
iris: The classic iris dataset, excluding the species column.
USArrests: State-wise arrest data. Missing values are removed.
mtcars: Motor Trend car dataset. No transformation applied.
CO2: CO2 uptake in grass plants. Only numeric columns are selected and rows with missing values are removed.
swiss: Swiss fertility and socio-economic indicators. Used as-is.
Moons: Synthetic non-linear dataset generated by mlbench::mlbench.smiley().
A cleaned data.frame containing only numeric variables and no missing values.
data <- prepare_data("iris")
head(data)
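The cleaning described above (numeric columns only, no missing values) can be sketched generically as follows. The helper `keep_numeric_complete` is hypothetical, and the injected NA is illustrative data only; the per-dataset steps inside `prepare_data()` may be implemented differently.

```r
# Hypothetical helper (an assumption, not the package's code):
# keep numeric columns, then drop rows that contain missing values.
keep_numeric_complete <- function(df) {
  df <- as.data.frame(df)
  numeric_cols <- vapply(df, is.numeric, logical(1))
  cleaned <- df[, numeric_cols, drop = FALSE]
  cleaned[stats::complete.cases(cleaned), , drop = FALSE]
}

# iris: the factor column Species is dropped, all 150 rows are kept.
cleaned_iris <- keep_numeric_complete(iris)
str(cleaned_iris)

# Inject one missing value to show that the affected row is removed.
iris_with_na <- iris
iris_with_na[1, "Sepal.Length"] <- NA
cleaned_na <- keep_numeric_complete(iris_with_na)
print(nrow(cleaned_na))
```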
This function launches the Shiny web application located in the inst/app directory of the installed package. The application provides an interactive interface for clustering analysis.
run_app()
No return value. This function is called for its side effect (launching the app).
if (interactive()) {
  run_app()
}
This function performs clustering on a numeric matrix using one of six common clustering methods: KMeans, Hierarchical, DBSCAN, PAM, Gaussian Mixture Model (GMM), or Spectral Clustering.
run_clustering(data, method, k = 3, eps = 0.5, minPts = 5)
| Argument | Description |
|---|---|
| data | A numeric matrix or data frame, typically standardized, to be clustered. |
| method | A string indicating the clustering method to use. Options are: "KMeans", "Hierarchical", "DBSCAN", "PAM", "GMM", "Spectral". |
| k | An integer specifying the number of clusters. Required for KMeans, Hierarchical, PAM, GMM, and Spectral. |
| eps | A numeric value specifying the epsilon parameter for DBSCAN. Default is 0.5. |
| minPts | An integer specifying the minimum number of points for DBSCAN. Default is 5. |
A list containing two elements:
cluster: A vector of cluster labels assigned to each observation.
silhouette: An object of class silhouette representing silhouette widths.
data(iris)
result <- run_clustering(scale(iris[, 1:4]), method = "KMeans", k = 3)
print(result$cluster)
if (interactive()) {
  plot(result$silhouette)
}
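To make the return structure concrete, the sketch below dispatches on a method name and returns a list with `cluster` and `silhouette` elements. It covers only three of the six methods, using `stats` and `cluster`; the function name `cluster_sketch`, the default `hclust()` linkage, and `nstart = 10` are assumptions for illustration, not the package's actual implementation.

```r
library(cluster)  # provides pam() and silhouette()

# Hypothetical sketch (an assumption, not the package's code):
# dispatch on the method name, return labels plus a silhouette object.
cluster_sketch <- function(data,
                           method = c("KMeans", "Hierarchical", "PAM"),
                           k = 3) {
  method <- match.arg(method)
  data <- as.matrix(data)
  d <- dist(data)  # Euclidean distances, reused for the silhouette
  labels <- switch(method,
    KMeans = kmeans(data, centers = k, nstart = 10)$cluster,
    Hierarchical = cutree(hclust(d), k = k),  # default complete linkage
    PAM = pam(data, k = k)$clustering
  )
  list(cluster = labels, silhouette = silhouette(labels, d))
}

set.seed(42)  # only the KMeans branch is random
result <- cluster_sketch(scale(iris[, 1:4]), method = "Hierarchical", k = 3)
print(table(result$cluster))
```

Density-based and model-based methods such as DBSCAN or GMM need additional packages and parameters (for example `eps` and `minPts`), which is why they are left out of this sketch.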