Framework for analysing large-scale metabolomic data

May 30, 2025

Statisticians from the National University of Singapore (NUS) have developed a pioneering approach for analysing population-scale metabolomic data, marking a major advancement in the precision and depth of metabolic profiling. The new method promises to advance both personalised healthcare and preventive medicine by improving the accuracy and interpretability of metabolic analyses.

The framework, developed by a team of researchers led by Associate Professor Zhigang YAO from the Department of Statistics and Data Science at the NUS Faculty of Science, employs advanced mathematical techniques to fit low-dimensional manifolds within the high-dimensional space of Nuclear Magnetic Resonance (NMR)-based metabolic biomarkers. This reduces noise and reveals meaningful patterns associated with metabolic change. The framework can be used to better stratify individuals based on their metabolic profiles and associated risk of disease. The research was carried out in collaboration with Professor YAU Shing-Tung of Tsinghua University.

Their findings were published in the Proceedings of the National Academy of Sciences of the United States of America.

 

Exploiting manifold fitting techniques to decipher metabolic heterogeneity

Metabolomic profiling, particularly through NMR-based biomarkers, offers rich insights into human metabolism. However, the complexity and dimensionality of such data have long challenged conventional analytical techniques. Traditional methods often struggle to uncover the subtle and structured biological variations underpinning disease risks.

The new framework is designed to overcome these limitations. It begins by clustering 251 metabolic biomarkers, measured in more than 210,000 UK Biobank participants, into seven biologically meaningful categories that reflect the modular organisation of human metabolism. Manifold fitting is then applied to each category to reveal smooth, low-dimensional structures that capture the essential variations in metabolic states.
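To make the first step concrete, here is a minimal sketch of grouping biomarkers into categories by how strongly they correlate across participants. The article does not say which clustering algorithm the team used, so the average-linkage procedure, the distance 1 - |r|, and the synthetic data (six biomarkers driven by two hypothetical latent factors, `lipid` and `amino`) are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cluster_biomarkers(columns, n_groups):
    """Average-linkage agglomerative clustering of biomarker columns.

    Distance between two biomarkers is 1 - |Pearson r| (an assumed choice).
    Returns a sorted list of sorted index lists, one per category.
    """
    n = len(columns)
    dist = [[0.0 if i == j else 1.0 - abs(pearson(columns[i], columns[j]))
             for j in range(n)] for i in range(n)]
    groups = [[i] for i in range(n)]
    while len(groups) > n_groups:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                pairs = len(groups[a]) * len(groups[b])
                d = sum(dist[i][j] for i in groups[a] for j in groups[b]) / pairs
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        groups[a] = groups[a] + groups[b]  # merge the two closest groups
        del groups[b]
    return sorted(sorted(g) for g in groups)


# Synthetic stand-in data: 300 "participants", 6 "biomarkers".
rng = random.Random(0)
lipid = [rng.gauss(0, 1) for _ in range(300)]   # hypothetical latent factor
amino = [rng.gauss(0, 1) for _ in range(300)]   # hypothetical latent factor
columns = ([[v + rng.gauss(0, 0.2) for v in lipid] for _ in range(3)]
           + [[v + rng.gauss(0, 0.2) for v in amino] for _ in range(3)])

groups = cluster_biomarkers(columns, n_groups=2)
print(groups)
```

In the study itself the same idea would apply to 251 biomarkers and seven categories; the toy sizes here only keep the example quick to run.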

At the core of this framework is the manifold fitting module, which models how individuals are distributed in a low-dimensional space based on their metabolic profiles. This geometric representation not only reduces noise but also enhances interpretability by uncovering coherent metabolic patterns that correlate with health and disease outcomes.
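The noise-reduction idea can be illustrated with a toy example. The sketch below is not the team's manifold fitting algorithm, which the article does not describe in detail. It swaps in a much simpler technique, local PCA projection, applied to synthetic 2-D points scattered around a circle (a 1-D manifold). The neighbourhood `radius`, the noise level, and all function names are assumptions.

```python
import math
import random


def local_pca_project(points, radius):
    """Project each 2-D point onto the principal axis of its neighbourhood.

    A simplified stand-in for manifold fitting: near a smooth curve, the
    local principal axis approximates the tangent, so projecting onto it
    removes most of the noise in the normal direction.
    """
    fitted = []
    for px, py in points:
        nbrs = [(qx, qy) for qx, qy in points
                if (qx - px) ** 2 + (qy - py) ** 2 <= radius ** 2]
        k = len(nbrs)  # at least 1, because each point is its own neighbour
        mx = sum(q[0] for q in nbrs) / k
        my = sum(q[1] for q in nbrs) / k
        sxx = sum((q[0] - mx) ** 2 for q in nbrs)
        syy = sum((q[1] - my) ** 2 for q in nbrs)
        sxy = sum((q[0] - mx) * (q[1] - my) for q in nbrs)
        # Orientation of the leading eigenvector of the 2x2 covariance matrix.
        theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
        ux, uy = math.cos(theta), math.sin(theta)
        t = (px - mx) * ux + (py - my) * uy
        fitted.append((mx + t * ux, my + t * uy))
    return fitted


def mean_offset(points):
    """Mean absolute distance of the points from the unit circle."""
    return sum(abs(math.hypot(x, y) - 1.0) for x, y in points) / len(points)


# Synthetic data: 400 noisy samples around the unit circle.
rng = random.Random(0)
noisy = []
for _ in range(400):
    angle = rng.uniform(0.0, 2.0 * math.pi)
    noisy.append((math.cos(angle) + rng.gauss(0, 0.05),
                  math.sin(angle) + rng.gauss(0, 0.05)))

fitted = local_pca_project(noisy, radius=0.3)
error_before = mean_offset(noisy)
error_after = mean_offset(fitted)
print(f"mean offset before: {error_before:.4f}, after: {error_after:.4f}")
```

The projected points sit closer to the underlying curve than the raw ones, which is the sense in which a fitted manifold "reduces noise". Real biomarker data are far higher-dimensional, and a flat local projection like this one introduces a small inward bias on curved structures, so it should be read only as an intuition aid.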

The key innovation lies in the method's ability to stratify the population. In three of the seven categories, the fitted manifolds clearly divide individuals into two major subgroups, each associated with distinct risks of metabolic disorders, cardiovascular disease, and autoimmune conditions.
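As a rough sketch of what such stratification involves, the example below splits individuals into two subgroups along a single manifold coordinate and compares disease prevalence between them. Everything here is hypothetical: the coordinates are simulated, the prevalences (5% and 20%) are invented for illustration rather than taken from the study, and 1-D two-means clustering is an assumed stand-in for however the authors separate the subgroups.

```python
import random


def two_means_1d(values, n_iter=50):
    """Split 1-D values into two groups with a basic two-means procedure.

    Returns a list of labels: 0 for the lower group, 1 for the upper group.
    """
    lo, hi = min(values), max(values)  # initial centres
    for _ in range(n_iter):
        cut = (lo + hi) / 2.0
        low = [v for v in values if v <= cut]
        high = [v for v in values if v > cut]
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # centres stopped moving
        lo, hi = new_lo, new_hi
    cut = (lo + hi) / 2.0
    return [int(v > cut) for v in values]


# Simulated cohort: two well-separated clusters along one coordinate,
# with invented disease probabilities (illustration only).
rng = random.Random(1)
coords, disease = [], []
for _ in range(1000):
    coords.append(rng.gauss(-3.0, 0.5))
    disease.append(rng.random() < 0.05)
for _ in range(1000):
    coords.append(rng.gauss(3.0, 0.5))
    disease.append(rng.random() < 0.20)

labels = two_means_1d(coords)


def prevalence(group):
    """Fraction of individuals in the given subgroup who have the disease."""
    cases = [d for d, lab in zip(disease, labels) if lab == group]
    return sum(cases) / len(cases)


risk_low_group = prevalence(0)
risk_high_group = prevalence(1)
relative_risk = risk_high_group / risk_low_group
print(f"prevalence: {risk_low_group:.3f} vs {risk_high_group:.3f}, "
      f"relative risk {relative_risk:.2f}")
```

A relative risk well above 1 is the kind of signal that would mark the two subgroups as clinically distinct; a real analysis would also adjust for age, sex, and other covariates.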

During a plenary lecture at the 2025 International Congress of Chinese Mathematicians (ICCM), Associate Professor Yao explained, “The new approach allows us to identify meaningful metabolic subgroups by fitting low-dimensional manifolds to high-dimensional biomarker data. This will significantly improve our ability to relate metabolic states to susceptibility to disease.”

Compared with traditional analyses, the manifold-based framework better preserves biological signals, identifies disease-relevant subgroups, and aligns with demographic, clinical, and lifestyle factors. These strengths position it as a powerful tool for metabolic research and precision health applications.

 

Future directions: Advancing genetic and longitudinal insight into metabolic health

Building on the success of this framework, the research team is now exploring several promising directions to deepen their understanding of metabolic heterogeneity and its clinical implications.

One key avenue involves integrating genetic data with the identified metabolic subgroups. By conducting genome-wide association studies within each manifold-defined subgroup, the researchers aim to uncover genetic variants linked to specific metabolic patterns. This could provide critical insights into the hereditary basis of metabolic diversity and help elucidate the genetic architecture underlying complex metabolic traits and their associated disease risks.
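The core computation in such a study is a per-variant association test, repeated within each subgroup. The sketch below shows one common form of that test, a simple linear regression of a trait on allele dosage (0, 1 or 2), using simulated data. The variant count, allele frequency, effect size, and function names are all assumptions, and a real genome-wide analysis would add covariates, population-structure correction, and multiple-testing control.

```python
import math
import random


def association_t(dosage, trait):
    """t statistic for the slope of trait regressed on allele dosage."""
    n = len(dosage)
    mg, mt = sum(dosage) / n, sum(trait) / n
    sxx = sum((g - mg) ** 2 for g in dosage)
    sxy = sum((g - mg) * (y - mt) for g, y in zip(dosage, trait))
    slope = sxy / sxx
    sse = sum((y - mt - slope * (g - mg)) ** 2 for g, y in zip(dosage, trait))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope / std_err


# Simulated subgroup: 500 individuals, 3 variants, only variant 1 is causal.
rng = random.Random(2)
n_people, maf = 500, 0.3  # assumed minor allele frequency
genotypes = [[(rng.random() < maf) + (rng.random() < maf)
              for _ in range(n_people)] for _ in range(3)]
trait = [0.5 * genotypes[1][i] + rng.gauss(0, 1) for i in range(n_people)]

t_stats = [association_t(g, trait) for g in genotypes]
top_variant = max(range(3), key=lambda j: abs(t_stats[j]))
print([round(t, 2) for t in t_stats], "strongest:", top_variant)
```

Running the same scan separately on each manifold-defined subgroup, then comparing which variants stand out in each, is the design the paragraph above describes.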

Another focus is the longitudinal analysis of metabolic manifolds to assess their stability and evaluate their potential as predictive biomarkers. By analysing time-series metabolomic data, the team seeks to trace how individuals transition between metabolic states over time and determine whether these shifts are associated with disease onset or progression. Such findings could pave the way for early detection systems and more precisely timed preventive interventions.
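One simple way to summarise such transitions is a transition matrix: for each metabolic state at a first visit, the share of individuals found in each state at a later visit. The sketch below assumes every individual has a state label at two visits; the labels "A" and "B" and the six example individuals are hypothetical.

```python
from collections import Counter


def transition_matrix(first_visit, second_visit, states):
    """Row-normalised transition proportions between two visits.

    matrix[s][t] is the fraction of individuals in state s at the first
    visit who are in state t at the second visit.
    """
    counts = Counter(zip(first_visit, second_visit))
    matrix = {}
    for s in states:
        row_total = sum(counts[(s, t)] for t in states)
        matrix[s] = {t: (counts[(s, t)] / row_total if row_total else 0.0)
                     for t in states}
    return matrix


# Hypothetical state labels for six individuals at two visits.
first_visit = ["A", "A", "A", "A", "B", "B"]
second_visit = ["A", "A", "A", "B", "B", "B"]

matrix = transition_matrix(first_visit, second_visit, states=["A", "B"])
print(matrix)
```

Large diagonal entries would indicate stable subgroups, while off-diagonal mass identifies the shifts whose link to disease onset the team wants to test.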

“Our framework not only captures the current structure of metabolic variation but also lays the foundation for investigating its genetic origins and temporal dynamics. These future directions could significantly enhance personalised healthcare by enabling earlier and more targeted responses to metabolic risk,” added Associate Professor Yao.

This ongoing research continues to expand the frontiers of metabolic profiling, providing a robust and adaptable platform for population health studies and precision medicine.

Visual representation of a new framework that applies manifold fitting to large-scale Nuclear Magnetic Resonance (NMR) biomarker data, uncovering meaningful metabolic patterns for precise risk stratification. This breakthrough has the potential to transform metabolic health profiling and drive advances in personalised healthcare interventions.

 

Reference

Li B; Su J; Lin R; Yau S-T; Yao Z*, “Manifold Fitting Reveals Metabolomic Heterogeneity and Disease Associations in UK Biobank Populations”, Proceedings of the National Academy of Sciences of the United States of America. DOI: pending. Published: 2025.