Type: Package
Depends: R (≥ 3.5.0)
Title: Whale Optimization Algorithm for K-Medoids Clustering
Version: 0.2.2
Date: 2026-02-08
Encoding: UTF-8
Description: Implements the Whale Optimization Algorithm (WOA) for k-medoids clustering, providing tools for effective and efficient cluster analysis of various data sets. The methodology is based on "The Whale Optimization Algorithm" by Mirjalili and Lewis (2016) <doi:10.1016/j.advengsoft.2016.01.008>.
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Imports: dtwclust, proxy, cluster, Rcpp (≥ 1.0.11), RcppParallel
LinkingTo: Rcpp, RcppParallel
SystemRequirements: GNU make
RoxygenNote: 7.3.3
LazyData: true
NeedsCompilation: yes
Packaged: 2026-02-18 13:56:36 UTC; huang
Author: Chenan Huang [aut, cre], Narumasa Tsutsumida [aut]
Maintainer: Chenan Huang <hualianchan@gmail.com>
Repository: CRAN
Date/Publication: 2026-02-18 14:20:09 UTC

Lightning7 Data for Testing

Description

An example dataset from the UCR Time Series Classification Archive, included for testing purposes. It contains 73 time series of 319 time points each, grouped into 7 classes; the correct class labels are stored in the first column. The best DTW window length for this dataset is 5.

Usage

data(Lightning7)

Format

A data frame with 73 rows and 320 columns. The first column (V1) is a factor vector of correct classifications, and the remaining 319 columns (V2 to V320) are numeric vectors of time series data.

Source

UCR Time Series Classification Archive

Examples

data(Lightning7)
head(Lightning7)

Fitness function for WOA optimization (C++ implementation)

Description

Combines cluster assignment and total distance calculation. Returns Inf if any cluster has fewer than 2 members.

Usage

fitnessFunction_cpp(dist_vec, nrows, medoids)

Arguments

dist_vec

Distance vector (from dist object)

nrows

Number of rows in the original data

medoids

Integer vector of medoid indices (1-indexed)

Value

total_dist (double), Inf if invalid clustering
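
The behaviour described above can be illustrated with a minimal pure-R sketch. It is not the package's C++ code: the name fitness_sketch is hypothetical, and the sketch assumes that dist_vec is the numeric content of a stats::dist object (lower triangle, column by column) and that each sample is assigned to its nearest medoid.

```r
# Hypothetical pure-R reference for the documented behaviour;
# an illustration only, not the package's implementation.
fitness_sketch <- function(dist_vec, nrows, medoids) {
  # Rebuild the full symmetric matrix from the 'dist' vector, which
  # stores the lower triangle in column-major order.
  m <- matrix(0, nrows, nrows)
  m[lower.tri(m)] <- dist_vec
  m <- m + t(m)

  # Distances from every sample to each medoid (nrows x k).
  to_medoids <- m[, medoids, drop = FALSE]

  # Assign each sample to its nearest medoid.
  cluster <- apply(to_medoids, 1, which.min)

  # Reject clusterings in which any cluster has fewer than 2 members.
  sizes <- tabulate(cluster, nbins = length(medoids))
  if (any(sizes < 2)) return(Inf)

  # Total distance of all samples to their assigned medoids.
  sum(to_medoids[cbind(seq_len(nrows), cluster)])
}

# Toy data: two well-separated groups on a line.
x <- matrix(c(0, 0.1, 0.2, 5, 5.1, 5.2), ncol = 1)
d <- dist(x)

valid   <- fitness_sketch(as.numeric(d), nrow(x), medoids = c(2L, 5L))
invalid <- fitness_sketch(as.numeric(d), nrow(x), medoids = c(1L, 2L))
print(valid)    # finite total distance
print(invalid)  # Inf: the cluster around sample 1 has a single member
```

In the second call, samples 1 and 2 are both medoids in the same group, so sample 1 keeps only itself and the clustering is rejected.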


Deterministic serial fitness function for final evaluation

Description

Deterministic serial fitness function for final evaluation

Usage

fitnessFunction_serial_cpp(dist_vec, nrows, medoids)

Arguments

dist_vec

Distance vector (from dist object)

nrows

Number of rows in the original data

medoids

Integer vector of medoid indices (1-indexed)

Value

total_dist (double), Inf if invalid clustering
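
As a hedged illustration of the dist_vec and nrows arguments, the following sketch builds both from a stats::dist object and shows where a given pairwise distance sits inside the vector. The helper dist_index is hypothetical; it uses the index formula given in the stats::dist documentation, and the assumption that the function expects exactly this layout is inferred from the argument description rather than confirmed.

```r
# Toy data: five one-dimensional samples.
x <- matrix(c(1, 2, 4, 7, 11), ncol = 1)

d <- dist(x)                  # Euclidean distances by default
dist_vec <- as.numeric(d)     # plain vector of length n * (n - 1) / 2
nrows <- attr(d, "Size")      # number of rows in the original data

# Hypothetical helper: position of the distance between samples i and j
# (1-indexed, i < j) inside a 'dist' vector for n samples.
dist_index <- function(i, j, n) n * (i - 1) - i * (i - 1) / 2 + j - i

from_vector <- dist_vec[dist_index(2, 4, nrows)]
from_matrix <- as.matrix(d)[2, 4]
print(c(from_vector, from_matrix))  # both give the distance between samples 2 and 4
```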


Check if two medoid solutions are identical

Description

Check if two medoid solutions are identical

Usage

medoidsEqual_cpp(medoids1, medoids2)

Arguments

medoids1

First medoid vector (1-indexed)

medoids2

Second medoid vector (1-indexed)

Value

TRUE if identical (regardless of order), FALSE otherwise
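
An order-independent comparison of this kind can be sketched in base R as follows. The function name medoids_equal_sketch is hypothetical and the code only illustrates the documented behaviour; it is not the package's C++ implementation.

```r
# Hypothetical pure-R equivalent: two medoid sets are equal when they
# contain the same indices, regardless of the order in which they appear.
medoids_equal_sketch <- function(medoids1, medoids2) {
  length(medoids1) == length(medoids2) &&
    identical(sort(as.integer(medoids1)), sort(as.integer(medoids2)))
}

same      <- medoids_equal_sketch(c(3L, 10L, 42L), c(42L, 3L, 10L))
different <- medoids_equal_sketch(c(3L, 10L, 42L), c(3L, 10L, 41L))
print(c(same, different))
```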


Project MDS coordinates to nearest unique sample indices

Description

Finds the nearest sample in the MDS embedded space for each medoid coordinate, while enforcing a one-to-one mapping between medoids and samples (no duplicates).

Usage

projectToIndex_cpp(medoid_coords, Z)

Arguments

medoid_coords

NumericMatrix of medoid coordinates in MDS space (ClusNum x mds_dim)

Z

NumericMatrix of sample coordinates in MDS space (nrows x mds_dim)

Value

IntegerVector of nearest unique sample indices (1-indexed, length = ClusNum)
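
One simple way to obtain such a one-to-one mapping is a greedy pass over the medoids, sketched below in base R. This is an assumption for illustration: the source does not state how the package resolves conflicts between medoids that share a nearest sample, and the name project_to_index_sketch is hypothetical.

```r
# Hypothetical greedy projection: each medoid, in order, takes the
# nearest sample that has not already been claimed by an earlier medoid.
project_to_index_sketch <- function(medoid_coords, Z) {
  k <- nrow(medoid_coords)
  stopifnot(ncol(medoid_coords) == ncol(Z), k <= nrow(Z))

  used <- logical(nrow(Z))   # marks samples that are already taken
  idx <- integer(k)
  for (m in seq_len(k)) {
    # Squared Euclidean distance from medoid m to every sample.
    d2 <- rowSums(sweep(Z, 2, medoid_coords[m, ])^2)
    d2[used] <- Inf          # exclude samples that are already taken
    idx[m] <- which.min(d2)
    used[idx[m]] <- TRUE
  }
  idx
}

# Toy embedding: four samples in two dimensions.
Z <- rbind(c(0, 0), c(1, 0), c(0, 1), c(5, 5))
# The first two medoids are both closest to sample 1.
medoid_coords <- rbind(c(0.1, 0.1), c(0.2, 0.0), c(4.0, 4.5))

projected <- project_to_index_sketch(medoid_coords, Z)
print(projected)
```

Because sample 1 is claimed by the first medoid, the second medoid falls back to its next-nearest sample, so no index appears twice.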


Whale Optimization Algorithm for K-Medoids Clustering

Description

This function implements the Whale Optimization Algorithm (WOA) for K-Medoids clustering. Supported distance measures are Dynamic Time Warping (DTW) and Euclidean Distance (ED).

Usage

woa_kmedoids(
  data,
  ClusNum,
  distance_method = c("dtw", "ed"),
  learned_w = NULL,
  Max_iter = 200,
  n = 5,
  early_stopping = TRUE,
  patience = 5,
  verbose = FALSE
)

Arguments

data

Data matrix

ClusNum

Number of clusters

distance_method

Distance calculation method, either "dtw" or "ed"

learned_w

Window size for DTW (only used if distance_method is "dtw")

Max_iter

Maximum number of iterations (default is 200; adjust according to the size of the dataset)

n

Population size, i.e. the number of whales (default is 5; adjust according to the size of the dataset)

early_stopping

Logical. If TRUE, stop early when the best solution converges (default is TRUE)

patience

Number of consecutive iterations without improvement before early stopping (default is 5)

verbose

Logical. If TRUE, print progress messages (default is FALSE)
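
The interaction between early_stopping and patience can be illustrated with a generic patience counter. The sketch below is an assumption about how such a rule typically works, not the package's code: the helper stop_iteration and the fitness trace are hypothetical, and "improvement" is taken to mean a strict decrease in the best fitness value.

```r
# Hypothetical helper: given the best fitness value recorded at each
# iteration, return the iteration at which a patience rule would stop.
stop_iteration <- function(best_fitness, patience = 5) {
  best <- Inf
  stale <- 0L                      # consecutive iterations without improvement
  for (iter in seq_along(best_fitness)) {
    if (best_fitness[iter] < best) {
      best <- best_fitness[iter]   # improvement: reset the counter
      stale <- 0L
    } else {
      stale <- stale + 1L
    }
    if (stale >= patience) return(iter)
  }
  length(best_fitness)             # ran to the end without triggering
}

# Made-up fitness trace for illustration only.
trace <- c(10, 8, 8, 7, 7, 7, 7, 7, 7, 6)
stopped_at <- stop_iteration(trace, patience = 5)
print(stopped_at)
```

With this trace the rule stops before the late improvement in the final iteration is reached, which is the usual trade-off of a small patience value.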

Value

A 'woa_clustering' object containing the clustering result and the medoids

Author(s)

Chenan Huang, Narumasa Tsutsumida

References

Huang C. and Tsutsumida N. (2025) A scalable k-medoids clustering via whale optimization algorithm, Array, 28, 100599. https://doi.org/10.1016/j.array.2025.100599

Examples

# NOTE: This example only shows how to call woa_kmedoids on the sample data.
# The results carry no substantive meaning.
data(Lightning7)
Lightning7_data <- Lightning7[, -1]  # Drop the first column (class labels)
result <- woa_kmedoids(Lightning7_data, ClusNum = 7, distance_method = "dtw", learned_w = 5)
print(result)