Clinical and lifestyle related factors influencing whole blood metabolite levels – A comparative analysis of three large cohorts

Carl Beuchel, Susen Becker, Julia Dittrich, Holger Kirsten, Anke Toenjes, Michael Stumvoll, Markus Loeffler, Holger Thiele, Frank Beutner, Joachim Thiery, Uta Ceglarek, Markus Scholz

Proper identification of factors affecting metabolite levels across multiple studies is highly relevant for the standardized translation of metabolite biomarkers into clinical applications and for understanding possible confounders of disease associations. However, only limited data exist on the kind, number, and relevance of possible influencing factors. Beuchel and colleagues investigated the effects of 29 clinical and lifestyle-related factors on mass spectrometry-derived metabolite levels in dried whole blood in three large human studies with different designs, comprising a total of 16,222 subjects. They developed a generic and adaptable workflow and interpreted the discovered associations biologically by applying pathway-based methods.

Objective: Human blood metabolites are influenced by a number of lifestyle and environmental factors. Identifying these factors and properly quantifying their relevance provides insights into human biological and metabolic disease processes, is key for the standardized translation of metabolite biomarkers into clinical applications, and is a prerequisite for comparability of data between studies. However, only limited data exist so far from large and well-phenotyped human cohorts, and current analysis methods do not fully account for the characteristics of these data. The primary aim of this study was to identify, quantify, and compare the impact of a comprehensive set of clinical and lifestyle-related factors on metabolite levels in three large human cohorts. To achieve this goal, we improved current methodology by developing a principled analysis approach that could be translated to other cohorts and metabolite panels.

Methods: 63 metabolites (amino acids and acylcarnitines) were quantified by liquid chromatography tandem mass spectrometry in three cohorts (total N = 16,222). Supported by a simulation study evaluating various analytical approaches, we developed an analysis pipeline including preprocessing, identification, and quantification of factors affecting metabolite levels. We comprehensively identified uni- and multivariable metabolite associations considering 29 environmental and clinical factors and performed metabolic pathway enrichment and network analyses.

Results: Inverse normal transformation of batch-corrected metabolite levels after outlier removal, combined with linear regression analysis, proved to be the best-suited method for these metabolite data. Association analyses revealed numerous significant uni- and multivariable associations. Fifteen of the 29 analyzed factors explained >1% of variance for at least one of the metabolites. The strongest factors were application of steroid hormones, reticulocytes, waist-to-hip ratio, sex, haematocrit, and age. Effect sizes of factors were comparable across studies.
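
As an illustration of the transformation and regression step named above, the following is a minimal sketch and not the pipeline used in the study. It assumes a rank-based inverse normal transformation with the Blom offset (c = 3/8) and average ranks for ties, and it computes explained variance for a single factor only; the function names and the example values are hypothetical.

```python
from statistics import NormalDist


def inverse_normal_transform(values, c=3.0 / 8.0):
    """Rank-based inverse normal transformation.

    Assumption: Blom offset c = 3/8 and average ranks for ties;
    the study itself may have used a different variant.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # Map each rank to a quantile of the standard normal distribution.
    return [std_normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]


def explained_variance(x, y):
    """R^2 of the simple linear regression y ~ x (squared Pearson correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical example values, for illustration only (not study data).
metabolite = [12.1, 8.4, 30.2, 9.9, 15.3, 11.0]
age = [34.0, 29.0, 61.0, 45.0, 52.0, 38.0]

transformed = inverse_normal_transform(metabolite)
r_squared = explained_variance(age, transformed)
print(f"explained variance: {r_squared:.3f}")
```

Under these assumptions, a factor would pass the >1% threshold reported above when `r_squared` exceeds 0.01. A multivariable analysis would require a regression with several covariates, which this sketch does not cover.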

Conclusions: We introduced a principled approach for the analysis of MS data that allows identification and quantification of the effects of clinical and lifestyle factors on metabolite levels. We detected a number of known and novel associations, broadening our understanding of the regulation of the human metabolome. The large heterogeneity observed between cohorts could be explained almost completely by differences in the distribution of influencing factors, emphasizing the necessity of proper confounder analysis when interpreting metabolite associations.