Posts

Introducing the somhca R package

Complex datasets often contain patterns that are difficult to interpret using traditional statistical methods alone. One effective approach is to combine self-organizing maps (SOM) with hierarchical cluster analysis (HCA). Together, these techniques provide a powerful framework for exploring, visualizing and grouping high-dimensional data. SOM is an unsupervised neural network method that projects high-dimensional data onto a two-dimensional grid while preserving local relationships in the original data. Similar observations are positioned close to one another on the map, so patterns, relationships, trends and structures in complex datasets become visually apparent. This makes SOM an excellent tool for dimensionality reduction and exploratory data analysis. However, SOM is not a clustering method by itself; it is primarily a topology-preserving mapping technique. When a large number of SOM units is used, similar observati...

somhca Package – Part 1: Training and Visualizing Self-Organizing Maps in R

This is part 1 of a two-part series on the somhca R package. Start here for an introduction. Overview The four functions from the somhca R package presented in this post provide a complete workflow for preparing data, selecting an appropriate self-organizing map (SOM) configuration, training the SOM, and visualizing the resulting patterns. Together, they simplify the process of applying SOM-based exploratory analysis to high-dimensional numeric datasets such as spectra, sensor measurements, or other multivariate observations. These functions are particularly useful when working with complex datasets where pattern exploration, clustering, dimensionality reduction, and visualization are important goals. By automating tasks such as data preprocessing, SOM grid optimization, model training, and graphical interpretation, this workflow helps users build robust and reproducible SOM analyses with minimal manual tuning. loadMatrix() The loadMatrix() funct...

somhca Package – Part 2: Performing Hierarchical Cluster Analysis in R

This is part 2 of a two-part series on the somhca R package. Start here for an introduction. Overview The three functions from the somhca R package presented in this post provide a complete workflow for hierarchical cluster analysis (HCA), from grouping observations after SOM training or other types of dimensionality reduction and pattern extraction, to retrieving and exploring the results. Together, these functions help transform complex, high-dimensional datasets into interpretable groups that can be validated, visualized and used for further analysis. Typical problems these functions help solve include: Simplifying interpretation of SOM results; Identifying natural groupings in complex datasets; Comparing clustering strategies (e.g., SOM-based vs PCA-based clustering); Detecting outliers or unusual observations; Assigning cluster labels for visualization or statistical analysis; Preparing grouped datasets for downstream machine learning or reporting. c...

Introducing the spectrakit R package

If you regularly work with spectral data, you’ve probably run into the same set of challenges: producing clear spectral plots, combining data from multiple files, applying consistent normalization, exploring patterns in the data, and assembling publication-ready figures. The spectrakit R package is designed to streamline this entire workflow. At its core, spectrakit provides a small set of focused tools for handling, analyzing and visualizing spectral data, from raw files all the way to final figures. What spectrakit does The package covers four common tasks: Visualizing spectra with flexible plotting options; Combining spectra from multiple files into a single dataset; Supporting exploration of spectral data using principal component analysis (PCA); Creating composite figures for publication-ready output. A simple workflow Once the package is installed and loaded via install.packages(...

spectrakit Package – Part 1: Plotting Spectra in R with plotSpectra()

This is part 1 of a series on the spectrakit R package. Start here for an introduction. Overview The plotSpectra() function from the spectrakit R package reads spectral data from multiple files in a folder, applies optional normalization, and produces publication-ready plots with extensive customization options. It supports multiple plotting modes, color palettes, axis formatting, annotations and automatic export of figures. This function is especially useful when working with batches of raw spectra files that require consistent and reproducible visualization. Typical use cases include exploratory comparison of spectra and generating standardized figures for reports, presentations or scientific publications. Syntax plotSpectra( folder = ".", file_type = "csv", sep = ",", header = TRUE, normalization = c("none", "simple", "min-max", "z-score", "area", "vec...