Integrated genomics and metabolomics to identify cause-specific biomarkers for chronic kidney disease in a Korean population
Abstract
Background
The heterogeneity of chronic kidney disease (CKD) and fragmented analysis methods hinder the precise identification of novel biomarkers. We addressed this challenge using two independent cohorts to integrate genomics and metabolomics, aiming to identify cause-specific biomarkers for CKD in the Korean population.
Methods
A longitudinal genome-wide association study based on the Cox proportional hazards model was conducted in the Ansan and Ansung cohort. To validate these genomic biomarkers and integrate them with plasma metabolomics biomarkers, we utilized a hospital-based biopsy cohort to identify cause-specific CKD biomarkers. Within the biopsy cohort, we analyzed four disease subsets, namely type 2 diabetic kidney disease (DKD), hypertensive nephropathy (HN), immunoglobulin A nephropathy (IgAN), and membranous nephropathy (MN), and compared them with healthy individuals. Significant single nucleotide polymorphisms (SNPs) and metabolites for each CKD subset were identified through logistic regression and correlation-based network analyses. Subsequently, we analyzed the risk of disease progression associated with the identified pairs.
Results
A total of 448 variants associated with CKD occurrence were identified, and several genetic variants and metabolites differed significantly between patients with DKD, HN, IgAN, or MN and healthy individuals. Among 36 SNP-metabolite pairs, those involving FOXB1 and ZFP42 were associated with DKD, whereas pairs involving MMRN1 and SYNJ2 were linked to MN. Notably, the pair of the rs1025170 variant in FOXB1 and tyrosine was associated with DKD progression.
Conclusion
Integrating genomics and metabolomics across independent cohorts enables the discovery of cause-specific biomarkers for the occurrence and progression of CKD in the Korean population.
Introduction
Chronic kidney disease (CKD) is a major cause of global mortality, affecting more than 10% of the general population worldwide [1,2]. In Korea, data from the Korean National Health and Nutrition Examination Survey show that the prevalence of CKD among adults aged over 18 years increased from 8.0% in 2012 to 8.4% in 2021 [3]. CKD is an independent risk factor for end-stage kidney disease requiring dialysis, as well as cardiovascular events and all-cause death [4,5]. Various factors, including hypertension, diabetes mellitus (DM), acute kidney injury, metabolic syndrome, and smoking, have been linked to CKD development [6–13]. Despite efforts to manage these risk factors, CKD remains a significant threat, which is partially due to unidentified risk factors associated with its onset and progression [14]. CKD is considered to be incurable, and available treatment options are limited, which is likely due to the heterogeneous nature of the underlying mechanisms that are involved [15]. Thus, there is a critical need to discover biomarkers for both the occurrence and progression of CKD to facilitate tailored treatment strategies and the development of novel therapeutic interventions.
Early prediction of CKD is crucial because it can significantly enhance quality of life and overall survival [16]. Genome-wide association studies (GWAS) have become a cornerstone in identifying genetic variants associated with disease. The primary advantage of GWAS is its ability to systematically scan the genome for single nucleotide polymorphisms (SNPs) that correlate with disease traits without prior hypotheses about their location, thereby identifying numerous genomic biomarkers for various diseases. In CKD, GWASs have significantly contributed to the identification of more than 250 associated genomic loci [17,18]. However, the identified lead SNPs may account for approximately 20% of the estimated genomic heritability of the estimated glomerular filtration rate (eGFR) [17]. Moreover, these biomarkers do not entirely elucidate CKD susceptibility. On the other hand, metabolomic analysis involves the comprehensive profiling of small molecules (metabolites) in biological samples. Metabolomics can capture the end-products of gene expression and provide a direct snapshot of the physiological state, revealing perturbations in metabolic pathways associated with CKD [19,20]. Nonetheless, metabolites are influenced by environmental factors such as diet, medications, and the microbiome, thus making it challenging to detect true associations due to individual variability [21]. When considering the heterogeneous causes of CKD and the complex interplay of numerous genotypes and metabolites in its pathophysiology, a multiomics approach may be essential for identifying definitive risk factors or treatment targets [22]. Combining GWAS with plasma metabolomics enables us to capture both the genetic predispositions and the biochemical consequences of genetic and environmental interactions, providing a comprehensive understanding of CKD pathophysiology. 
We selected plasma over urine for metabolomics analysis due to its broader systemic representation and greater stability, which provide better insights into metabolic disruptions contributing to CKD. In contrast, urine metabolites are more variable and can be influenced by factors such as hydration and diet [23]. Despite several multiomics studies on CKD, there is insufficient research to pinpoint the definite biomarkers for CKD and its progression, thus necessitating further investigation [24].
Given the aforementioned research constraints, our objective was to explore CKD biomarkers by integrating GWASs and metabolomics. Initially, we performed a GWAS by using a representative cohort of the Korean population, known as the Ansan and Ansung cohort of the Korean Genome and Epidemiology Study (KoGES). Subsequently, we validated these findings and integrated them with plasma metabolomics in a cohort with biopsy-proven CKD subsets to identify cause-specific biomarkers.
Methods
Study design and subjects
This study was approved by the Institutional Review Board of Seoul National University Hospital (No. H-2104-120-1214), and the requirement for informed consent was waived due to the retrospective study design. All clinical investigations were conducted in accordance with the guidelines of the Declaration of Helsinki.
A longitudinal cohort study was used to identify biomarkers associated with CKD. This study comprised two primary cohorts: the Ansan and Ansung study of the KoGES and a biopsy cohort of Seoul National University Hospital. The Ansan and Ansung cohort, which represents a component of KoGES, recruited the Korean population from 2001 to 2002 and conducted eight biennial follow-ups through 2017–2018. Genotyped and imputed data for a total of 5,493 participants aged 40 to 69 years were obtained using the Korean Biobank Array (KoreanChip, KCHIP) [25,26]. CKD was defined by using creatinine-based eGFR and proteinuria. The eGFR was calculated by using the CKD-EPI (Chronic Kidney Disease Epidemiology Collaboration) equation [27]. Participants with a baseline eGFR <60 mL/min/1.73 m2, proteinuria ≥1+, or missing data for eGFR or proteinuria at baseline were excluded from the analysis. Additionally, participants with self-reported underlying CKD at baseline were excluded. The analysis included 5,194 non-CKD participants from the Ansan and Ansung cohort of KoGES (Supplementary Fig. 1, available online). CKD development was defined as an eGFR <60 mL/min/1.73 m2 or proteinuria (defined as ≥1+ on the dipstick test) at follow-up.
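The eGFR calculation and the CKD definition described above can be sketched as follows. This is a minimal illustration, not the study's code: it assumes the 2009 creatinine-based CKD-EPI equation without the race coefficient (the text does not state which version of the equation was used), and the function names and the integer coding of the dipstick grade are hypothetical.

```python
def egfr_ckd_epi_2009(scr_mg_dl: float, age_years: float, female: bool) -> float:
    """Creatinine-based eGFR in mL/min/1.73 m2.

    Assumption: 2009 CKD-EPI creatinine equation, race term omitted.
    """
    kappa = 0.7 if female else 0.9          # sex-specific creatinine scaling
    alpha = -0.329 if female else -0.411    # sex-specific exponent below kappa
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    return egfr * 1.018 if female else egfr


def meets_ckd_definition(egfr: float, dipstick_grade: int) -> bool:
    """CKD as defined in the text: eGFR < 60 or dipstick proteinuria >= 1+.

    dipstick_grade is a hypothetical coding: 0 = negative or trace, 1 = 1+, 2 = 2+, ...
    """
    return egfr < 60.0 or dipstick_grade >= 1
```

Applied at baseline, `meets_ckd_definition` would flag participants to exclude; applied at follow-up visits, it would flag incident CKD.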
The biopsy cohort comprised patients who underwent kidney tissue biopsy between January 2009 and March 2021 at Seoul National University Hospital, Korea, thus enabling the identification of the cause of CKD. The biopsy cohort provides access to both genomic variant data and plasma biospecimens. Patients with active diseases such as ischemic heart disease, stroke, malignancy, tuberculosis, or autoimmune disorders were excluded. For comparison, healthy individuals who attempted to donate a kidney for transplantation and who had no kidney disease or comorbidities were included in the analysis. Only patients and healthy individuals with available blood biospecimens at the time of kidney biopsy and before transplantation, respectively, were included in the study. Subsequently, patients were categorized into four disease subsets by using 1:1 propensity score matching based on age and sex. The hypertensive nephropathy (HN) subset had a limited sample size, which prevented propensity score matching. The disease subsets included type 2 diabetic kidney disease (DKD) (n = 64), HN (n = 24), immunoglobulin A nephropathy (IgAN) (n = 66), and membranous nephropathy (MN) (n = 66), along with a group of healthy individuals (n = 66) (Supplementary Fig. 2, available online). As a composite endpoint, kidney progression was defined as a decline in eGFR of more than 50%, a doubling of serum creatinine levels, or progression to end-stage kidney disease (i.e., dialysis or kidney transplantation).
GWAS with the Cox proportional hazards model in the Ansan and Ansung cohort
The Ansan and Ansung cohort of the KoGES served as the discovery set. It was genotyped with the KCHIP, which was designed with imputation-aware SNPs, is optimized for the Korean population, and comprises 833,535 genotyped SNPs. Because the array is optimized for the Korean genome structure, it provides high coverage of common and low-frequency variants. Approximately 95% of the genotyped SNPs are tagging SNPs of GWAS markers (600 K) and functional variants (200 K) [26,28]. In the quality control process, variants or samples meeting the following criteria were excluded: minor allele frequency <0.01, p for Hardy-Weinberg equilibrium <0.001, missing genotype rate >0.01, and missing rate per person >0.05. After quality control, a total of 451,172 SNPs remained for analysis.
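The variant-level quality control filters can be illustrated with a short sketch. It is not the pipeline used in the study: the Hardy-Weinberg test here is a 1-degree-of-freedom chi-square approximation (dedicated tools typically use an exact test), the direction of the missingness filter (excluding variants with a missing genotype rate above 0.01) is an assumption, and all function names are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def variant_qc_stats(genotypes: Sequence[Optional[int]]) -> Dict[str, float]:
    """Summarize one variant; genotypes are allele counts 0/1/2, None = missing."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return {"maf": 0.0, "hwe_p": 1.0, "missing_rate": 1.0}
    missing_rate = 1.0 - n / len(genotypes)
    n_aa, n_ab, n_bb = (called.count(k) for k in (0, 1, 2))
    p = (2 * n_bb + n_ab) / (2 * n)          # frequency of the counted allele
    q = 1.0 - p
    maf = min(p, q)
    # Chi-square goodness of fit to Hardy-Weinberg proportions (1 df).
    observed = (n_aa, n_ab, n_bb)
    expected = (n * q * q, 2 * n * p * q, n * p * p)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))  # survival function of chi-square, 1 df
    return {"maf": maf, "hwe_p": hwe_p, "missing_rate": missing_rate}


def passes_variant_qc(stats: Dict[str, float], maf_min: float = 0.01,
                      hwe_min: float = 0.001, max_missing: float = 0.01) -> bool:
    """Keep a variant only if it clears all three thresholds."""
    return (stats["maf"] >= maf_min
            and stats["hwe_p"] >= hwe_min
            and stats["missing_rate"] <= max_missing)
```

The per-person missing rate filter (>0.05) would be applied analogously across samples rather than across variants.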
Imputed data were generated from the KCHIP genotypes using IMPUTE4, with the 1000 Genomes Project Phase 3 data and the Korean reference genome as reference panels. Variants with an imputation quality score <0.8 or minor allele frequency <1% were excluded, resulting in a total of 8,056,211 remaining variants [26,28]. The same quality control criteria for the genotyped data were applied to the imputed data.
Validation of significant single nucleotide polymorphisms in a biopsy cohort
To validate the significant SNPs that were identified in the Ansan and Ansung cohort and to identify biomarkers, a biopsy cohort was utilized. A total of 286 patients were genotyped by using Axiom_KORV1_1 (Macrogen Inc.), which is identical to KCHIP in KoGES. We applied the same quality control criteria that were used for the Ansan and Ansung cohort to the GWAS data from the biopsy cohort. Among the 827,783 SNPs, 514,187 SNPs remained after quality control.
Metabolomics in a biopsy cohort
Plasma metabolites were measured by using the Biocrates AbsoluteIDQ p180 platform (Biocrates Life Sciences AG). The combined method of flow-injection analysis and liquid chromatography enabled the quantification of amino acids, acylcarnitines, sphingomyelins, lysophosphatidylcholines, phosphatidylcholines, hexoses, and biogenic amines. A total of 188 metabolites were quantified in the plasma samples of CKD patients and healthy individuals. These assays were conducted by using an SCIEX 5500 QTrap mass spectrometer (SCIEX) equipped with a Waters ACQUITY ultra-performance liquid chromatography I-Class (Waters) with electrospray ionization. Calibration standards, internal standards, quality controls, and 10 µL of human plasma samples were applied to a 96-well extraction plate and dried under nitrogen gas. After derivatization, all of the metabolites were extracted for mass spectrometry analysis. Five batches were normalized by using MetIDQ software (Biocrates Life Sciences AG) with pooled quality control samples to adjust for batch effects, and metabolites with 80% missing values were excluded from further analysis.
Network analysis of GWAS and metabolomics data
To integrate GWAS and metabolomics data, we conducted a correlation-based network analysis, as it allows for the identification of novel relationships between different omics layers through a data-driven approach that may not be captured by knowledge-based network analysis, which is limited to pre-existing biological mechanisms [29,30]. The partial Spearman-rank correlation test was performed by using the ppcor package in R software [31]. Significant correlations were selected based on a false discovery rate-adjusted p-value <0.01 and visualized as a network by using Cytoscape software (version 3.10.1). We defined cause-specific biomarkers as those for which both SNPs and metabolites were exclusively correlated with a specific CKD subset. Linkage disequilibrium (LD) SNPs with an r2 value greater than 0.2 for the cause-specific biomarkers were identified by examining the ±1,000 kb region in the biopsy cohort. Annotation plots were generated by using LocusZoom [32], and the 1000 Genomes Project ASN data (GRCh37/hg19) were utilized as the reference panel.
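As an illustration of the correlation step, the sketch below computes a first-order partial Spearman correlation between a SNP and a metabolite while controlling for a single covariate. It is an independent Python illustration rather than the R analysis used in the study; the choice of one covariate `z` is an assumption (the text does not list the variables that were partialled out), and the helper names are hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_spearman(x: Sequence[float], y: Sequence[float],
                     z: Sequence[float]) -> float:
    """Spearman correlation of x and y after removing the rank association with z."""
    rx, ry, rz = average_ranks(x), average_ranks(y), average_ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

Here `x` could hold minor allele counts, `y` a metabolite concentration, and `z` a covariate such as age; controlling for several covariates would require applying the same formula recursively or working with the inverse correlation matrix.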
Significant single nucleotide polymorphism/metabolite pairs
By conducting network analysis of both GWAS and metabolomics data, we identified variant/metabolite pairs that were significantly correlated with each other and associated with a single CKD subset. Subsequent analysis was performed to ascertain the risk of kidney progression associated with these selected pairs. To determine the risk associated with the presence of a minor allele, SNPs were encoded into two categories based on the number of minor alleles that were present: 0 or 1–2 minor alleles. Furthermore, the metabolites were divided into two categories based on their concentration levels: above or below the optimal cutoff of the receiver operating characteristic curve in which the metabolite level was used to classify the presence of the minor allele of the paired SNP. Consequently, each SNP/metabolite pair was encoded into four categories resulting from the combination of the two SNP categories and the two metabolite categories: minor allele count 0/lower metabolite level, minor allele count 0/upper metabolite level, minor allele count 1–2/lower metabolite level, and minor allele count 1–2/upper metabolite level.
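The four-category encoding can be sketched as follows. The text does not say how the "optimal" point on the receiver operating characteristic curve was chosen; the sketch assumes Youden's J statistic (sensitivity + specificity - 1), taken in absolute value so that either direction of association is handled. Function names and category labels are hypothetical.

```python
from typing import Sequence


def youden_cutoff(levels: Sequence[float], carrier: Sequence[bool]) -> float:
    """Metabolite cutoff that best separates minor-allele carriers from non-carriers.

    Assumption: the optimal ROC point is the one maximizing |Youden's J|.
    A sample counts as 'upper' when its level is >= the returned cutoff.
    """
    n_pos = sum(1 for c in carrier if c)
    n_neg = len(carrier) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both carriers and non-carriers")
    best_cutoff, best_j = None, -1.0
    for t in sorted(set(levels)):
        tpr = sum(1 for v, c in zip(levels, carrier) if c and v >= t) / n_pos
        fpr = sum(1 for v, c in zip(levels, carrier) if not c and v >= t) / n_neg
        j = abs(tpr - fpr)
        if j > best_j:
            best_cutoff, best_j = t, j
    return best_cutoff


def encode_pair(minor_allele_count: int, level: float, cutoff: float) -> str:
    """Combine the SNP category (0 vs 1-2 minor alleles) with the metabolite category."""
    snp_part = "MAC 0" if minor_allele_count == 0 else "MAC 1-2"
    met_part = "upper" if level >= cutoff else "lower"
    return f"{snp_part}/{met_part}"
```

The resulting label can then be used as a four-level categorical covariate in a survival model, with one level chosen as the reference group.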
Statistical analysis
In the Ansan and Ansung cohort, significant genetic variants for CKD development were identified by using a saddlepoint approximation implementation based on the Cox proportional hazard regression model (SPACox) for GWAS, adjusted for age, sex, education, eGFR, and histories of hypertension, DM, and cardiovascular diseases at baseline [33]. SNPs with p-values <0.001 were considered to be statistically significant for the association. GWAS using the Cox proportional hazard model was conducted by using PLINK software (version 2.0) and the SPACox package in R software (version 4.2.2; R Foundation for Statistical Computing).
In the biopsy cohort, a logistic regression model adjusted for age and sex was used to identify SNPs exhibiting significant correlations with cause-specific CKD subsets. SNPs with p-values <0.05 were considered to be valid. The GWAS analysis was conducted by using the same software as previously mentioned. Similarly, a logistic regression model adjusting for age and sex was employed to identify metabolites demonstrating significant associations with particular subsets in the biopsy cohort. Metabolites exhibiting significant correlations with specific subsets were selected when the false discovery rate-adjusted p-value was less than 0.05. Pathway enrichment analysis of the significant metabolites was performed by using MetaboAnalyst (version 5.0). Hit metabolites from this analysis were identified by using the Small Molecule Pathway Database (https://smpdb.ca).
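The false discovery rate adjustment applied to the metabolite p-values can be sketched as below. The text does not name the procedure; the sketch assumes the Benjamini-Hochberg step-up method, and the function name is hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A metabolite would then be retained when its adjusted value falls below 0.05, for example `[q < 0.05 for q in benjamini_hochberg(p_values)]`.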
For the cause-specific SNPs selected from the network analysis, the hazard ratio (HR) for kidney progression was calculated by using a Cox proportional hazards model, adjusting for age, sex, eGFR, and random urine protein-to-creatinine ratio (uPCR). For cause-specific SNP/metabolite pairs that were identified from the network analysis, the risk of kidney progression was assessed by using a Cox proportional hazards model after adjusting for age, sex, eGFR, and uPCR. Due to the extreme rarity of cases with two risk alleles, it was statistically challenging to analyze groups defined by both metabolites and the presence of two risk alleles. Therefore, we categorized risk alleles based on the presence of at least one allele. This approach allowed for consistent grouping by metabolite levels and avoided the need for an additive model.
The LD clumping based on the imputed results was performed by focusing on the CKD cause-specific SNPs selected from the network analysis. The threshold of 0.001 was applied for both index SNPs (p1) and linked SNPs (p2), with a ±1,000 kb window and r2 value of 0.2 for the clumped results.
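The clumping logic can be illustrated with a simplified greedy sketch using the thresholds stated above (p1 = p2 = 0.001, ±1,000 kb, r2 = 0.2). It is not the implementation used in the study; the `Snp` record, the `r2` lookup callable, and the SNP identifiers in the usage example are hypothetical.

```python
from typing import Callable, Dict, List, NamedTuple, Sequence, Set


class Snp(NamedTuple):
    rsid: str
    chrom: str
    pos: int      # base-pair position
    p: float      # association p-value


def clump(snps: Sequence[Snp], r2: Callable[[str, str], float],
          p1: float = 0.001, p2: float = 0.001,
          window_kb: int = 1000, r2_min: float = 0.2) -> Dict[str, List[str]]:
    """Greedy clumping: map each index (lead) SNP to the SNPs grouped under it."""
    clumps: Dict[str, List[str]] = {}
    assigned: Set[str] = set()
    for index in sorted(snps, key=lambda s: s.p):
        if index.p > p1:
            break                      # remaining SNPs cannot be index SNPs
        if index.rsid in assigned:
            continue                   # already grouped under a stronger SNP
        assigned.add(index.rsid)
        members: List[str] = []
        for other in snps:
            if other.rsid in assigned or other.chrom != index.chrom:
                continue
            if abs(other.pos - index.pos) > window_kb * 1000:
                continue
            if other.p <= p2 and r2(index.rsid, other.rsid) >= r2_min:
                members.append(other.rsid)
                assigned.add(other.rsid)
        clumps[index.rsid] = members
    return clumps


# Usage with made-up SNPs and a made-up pairwise r2 table.
example = [Snp("snpA", "4", 1_000, 1e-5), Snp("snpB", "4", 500_000, 5e-4),
           Snp("snpC", "4", 3_000_000, 2e-4), Snp("snpD", "6", 1_000, 0.01)]
r2_table = {frozenset(("snpA", "snpB")): 0.5}
result = clump(example, lambda a, b: r2_table.get(frozenset((a, b)), 0.0))
```

A lead SNP with an empty member list, as for `snpC` here, corresponds to the situation in which no other SNP lies within its LD block.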
Results
Baseline characteristics
A total of 5,194 participants from the Ansan and Ansung cohort were included in the study. Among them, 836 (16.1%) experienced CKD occurrence, whereas 4,358 (83.9%) did not (Table 1; Supplementary Fig. 1, available online). The mean age of the population was 51.4 ± 8.5 years, and the proportion of male subjects was 48.0%. The risk of CKD occurrence differed according to age, the presence of hypertension, DM, and cardiovascular disease. However, there was no difference in the risk of CKD occurrence based on sex.
The biopsy cohort comprised 286 participants: 64 diagnosed with DKD, 24 with HN, 66 with IgAN, and 66 with MN, along with 66 healthy individuals (Supplementary Fig. 2, available online). The mean age of the patients was 53.1 ± 13.1 years (Table 2), which was consistent across the different CKD subsets. The proportion of males was 62.2%, and similar proportions were observed within each subset. The mean eGFR was 72.3 ± 40.2 mL/min/1.73 m2, with the MN group exhibiting the highest eGFR and the HN group showing the lowest. The mean uPCR was 3.1 ± 3.8 g/g, with the highest uPCR observed in the DKD subset.
GWAS of the two cohorts
Among the 451,172 SNPs that were analyzed after quality control, 448 SNPs were found to be associated with CKD occurrence in the Ansan and Ansung cohort. The Manhattan plot and quantile-quantile plot, which visualize the distribution and significance of these associations, are provided in Supplementary Fig. 3 (available online). Among these 448 SNPs associated with CKD occurrence, logistic regression analyses identified 11, 10, 13, and 21 SNPs that were validated in the DKD, HN, IgAN, and MN subsets, respectively (Table 3, Fig. 1). Among the identified variants, we further categorized those that were significantly associated with only one specific CKD subset. As a result, nine variants were found to be significantly associated exclusively with DKD, nine variants with HN, six variants with IgAN, and 15 variants with MN.
Figure 1. Flow chart of the analysis.
CKD, chronic kidney disease; DKD, diabetic kidney disease; FDR, false discovery rate; HN, hypertensive nephropathy; IgAN, immunoglobulin A nephropathy; KoGES, Korean Genome and Epidemiology Study; Met, metabolite; MN, membranous nephropathy; QC, quality control; SNP, single nucleotide polymorphism.
Metabolomics of the biopsy cohort
Using a targeted metabolomics platform, we analyzed the profiles of 188 metabolites in plasma samples from both CKD patients and healthy individuals. Principal component analysis demonstrated that a minority of patient samples showed distinct metabolite trends within certain CKD subsets, such as DKD and MN (Supplementary Fig. 4, available online). A total of 51, 28, 29, and 71 metabolites showed significant associations with DKD, HN, IgAN, and MN, respectively (Supplementary Table 1, available online).
Among the significant metabolites in each disease group, eight, five, eight, and nine metabolites were identified as hit metabolites from pathway enrichment analysis (Supplementary Table 2 and Supplementary Fig. 5; available online). The most noteworthy metabolic pathway associated with DKD was arginine and proline metabolism. Key metabolites within this pathway, such as glutamic acid, proline, glycine, and citrulline, were identified as being significant contributors. For HN, the most significant metabolic pathways were malate-aspartate shuttle and phenylalanine and tyrosine metabolism, with hit metabolites such as glutamic acid and phenylalanine being observed. Regarding IgAN, the most significant pathway was arginine and proline metabolism, with hit metabolites including glycine, glutamic acid, proline, and citrulline. Finally, for MN, glycine and serine metabolism exhibited the highest significance, with hit metabolites including glycine, glutamic acid, threonine, serine, and ornithine.
Combined analysis of single nucleotide polymorphisms and metabolites
Correlation analysis was conducted for all 46 SNPs and 103 metabolites that were significantly associated with CKD subsets in the biopsy cohort (Fig. 2). A total of 36 SNP/metabolite pairs exhibited significant correlations (Supplementary Table 3, available online). Among these 36 pairs, only five SNP/metabolite pairs were correlated with one specific CKD subset each, including rs1025170 (FOXB1)/tyrosine, rs1125327 (ZFP42)/PC aa C30:0, rs1125327 (ZFP42)/PC aa C38:3, rs17016322 (MMRN1)/ornithine, and rs2295894 (SYNJ2)/lysoPC a C20:3 pairs. Among the five selected pairs, the rs1025170/tyrosine, rs1125327/PC aa C30:0, and rs1125327/PC aa C38:3 pairs were associated with DKD, whereas the rs17016322/ornithine and rs2295894/lysoPC a C20:3 pairs were associated with MN. Regarding LD SNPs, rs10516848 of the MMRN1 gene, which is significantly linked with rs17016322, exhibited differences between the MN and healthy individual groups (r2 > 0.2, p < 0.05); rs1109511 (r2 > 0.6, p < 0.05) and rs9365723 (r2 > 0.2, p < 0.05) of the SYNJ2 gene linked with rs2295894 differed between the MN and healthy individual groups (Supplementary Fig. 6, available online). No significant metabolites paired with SNPs were found in the pathway analysis.
Figure 2. Correlation-based network of single nucleotide polymorphisms and metabolites according to the cause of chronic kidney disease in the biopsy cohort.
Correlations of the single nucleotide polymorphisms and metabolites are represented as solid edges, with red and blue colors indicating positive and negative correlations, respectively. Specifically, 36 correlations (adjusted p after correction for false discovery rate <0.01) involve 47 nodes and 116 edges. Red and blue arrows denote correlations that were exclusively identified in the diabetic kidney disease and membranous nephropathy subsets, respectively.
CKD, chronic kidney disease; DKD, diabetic kidney disease; HN, hypertensive nephropathy; IgAN, immunoglobulin A nephropathy; MN, membranous nephropathy; SNP, single nucleotide polymorphism.
When kidney progression was analyzed according to the four SNPs included in the five selected SNP/metabolite pairs, only rs1025170 showed a risk for kidney progression (HR, 2.68; 95% confidence interval [CI], 1.38–5.20) (Table 4). Survival analysis was performed on the five selected SNP/metabolite pairs, and only the rs1025170/tyrosine pair showed a risk for kidney progression (p < 0.001) (Fig. 3). Patients with a lower tyrosine level and one or more rs1025170 minor alleles had a higher risk of kidney progression than did their counterpart patients with a higher tyrosine level and no minor allele (HR, 3.47; 95% CI, 1.42–8.48) (Table 5). Other SNP/metabolite pairs were not associated with a risk for kidney progression (p ≥ 0.05) (Supplementary Fig. 7, available online). The rs2970384 variant was in LD with rs1025170 but did not exhibit a significant correlation with DKD (HR, 1.60; 95% CI, 0.79–3.26; p = 0.20) (Fig. 4).
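As a reading aid for the hazard ratios above, a reported estimate and its 95% CI can be converted back into an approximate standard error, z statistic, and two-sided p-value. The sketch assumes a Wald-type interval that is symmetric on the log scale, which may not match the exact method used to compute the published values; the function name is hypothetical.

```python
import math
from typing import Tuple


def wald_from_ci(estimate: float, lower: float, upper: float,
                 z_crit: float = 1.96) -> Tuple[float, float, float]:
    """Approximate (SE of log estimate, z, two-sided p) from a ratio and its CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail probability
    return se, z, p


# Example with the rs1025170 hazard ratio reported in the text.
se, z, p = wald_from_ci(2.68, 1.38, 5.20)
```

An interval that excludes 1, as for rs1025170, yields a p-value below 0.05 under this approximation, whereas an interval that contains 1 does not.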
Figure 3. Adjusted cumulative rates of kidney progression according to the single nucleotide polymorphism variant for rs1025170 and plasma tyrosine levels.
MAC, minor allele count.
Table 5. Risk for kidney progression based on the minor allele count of rs1025170 (FOXB1) and plasma tyrosine levels
Figure 4. Linkage disequilibrium of rs1025170.
The odds ratio and the relevant p-value are 1.60 (95% confidence interval, 0.79–3.26) and 0.16, respectively.
SNP, single nucleotide polymorphism.
rs17016322 and rs1125327 (both on chromosome 4) were identified as lead SNPs whose clumps included two and one other SNPs, respectively, at an r2 threshold of 0.2. rs2295894 (chromosome 6) was one of the significant clumped SNPs of rs9365766, a lead SNP whose clump includes 25 other SNPs. Although rs1025170 (chromosome 15) was identified as a lead SNP, no other SNPs were in LD with it (Supplementary Table 4, available online).
Discussion
Numerous GWASs have been conducted and have identified a plethora of genomic biomarkers associated with kidney function [34,35]. However, the heterogeneity of CKD features and the fragmented nature of analysis methods may impede the precise identification of novel biomarkers. We identified biomarkers for CKD occurrence and progression by integrating GWAS and metabolomics data from both a representative cohort of the Korean population and a biopsy cohort in which the cause-specific subsets of CKD were accurately determined. Notably, we confirmed that the pair of rs1025170 and plasma tyrosine was associated with the risk of DKD and its progression.
We used partial correlation-based networks to integrate the genomic and metabolic markers related to each CKD subset. Although this approach is limited in its ability to elucidate causal associations between omics layers, the data-driven network analysis that we used for the current study has the advantage of potentially identifying novel relationships by exploring connections between different omics layers that have not been previously recognized [30,36]. We integrated genomic and metabolomic data to build a network that reflects the relationships between genes and metabolites, which allowed us to propose complex interactions according to CKD subsets.
Initially, we identified SNPs that were significantly associated with CKD occurrence in the Ansan and Ansung cohort [25]. However, as this cohort does not provide insight into the causes of CKD, the specific disease subset associated with the identified SNPs is unknown. Therefore, by using the biopsy cohort diagnosed via tissue examinations, we confirmed the association of SNPs that were identified in the Ansan and Ansung cohort with the biopsy-proven CKD subset. This process lends significance to these SNPs as being specific biomarkers for CKD based on their etiology. Finally, the 46 SNPs associated with CKD subsets in the biopsy cohort, which are indicative of a high CKD progression risk, were newly discovered and not reported in previous CKD or cause-related studies [17].
The ZFP42 gene, associated with DKD, is involved in the regulation of pluripotency and differentiation of embryonic stem cells [37,38]. Its role in DKD may be linked to its regulatory functions in cell growth and development, potentially affecting renal cell turnover and repair mechanisms in diabetic nephropathy. The MMRN1 gene, associated with MN, encodes multimerin 1, a protein involved in platelet function and endothelial interactions, suggesting a possible role in the endothelial changes observed in MN [39,40]. SYNJ2, another gene linked to MN, encodes synaptojanin 2, a phosphoinositide phosphatase involved in cell proliferation and apoptosis, which may influence podocyte function and glomerular integrity in MN [41]. Understanding these associations can provide new insights into disease mechanisms and potential therapeutic targets.
The current study demonstrated that rs1025170 is associated with DKD and subsequent renal function deterioration, including the risk of future dialysis. Previous research has not identified any known association between rs1025170 and CKD or a specific disease subset. The FOXB1 gene, which is located near rs1025170, has been previously linked to central nervous system development [42,43]. Neurons expressing forkhead box protein B1 (FOXB1) are involved in defense against life-threatening conditions [44]. The rs335810 variant, which is located adjacent to the FOXB1 gene, is associated with type 2 DM in African Americans [45]. Due to the involvement of FOXB1 in regulating the autonomic nervous system, the FOXB1 gene variant potentially exerts an influence on DKD kidney function through altered sympathetic activity [46–48]. Despite this possibility, further preclinical and clinical studies investigating the mechanism and factors underlying this finding are warranted.
In patients with CKD, there is a deficiency in the conversion of phenylalanine to tyrosine, thus resulting in decreased plasma levels of tyrosine [49,50]. Additionally, there have been reports suggesting an association between plasma tyrosine and insulin resistance, as well as the occurrence of type 2 DM [51,52]. However, it is not yet established whether low plasma tyrosine levels have an impact on DKD occurrence and progression. The potential relationship between tyrosine levels and the rs1025170 variant of the FOXB1 gene regarding DKD progression may be inferred from the role of tyrosine-derived neurotransmitters, which serve as primary messengers within the sympathetic nervous system. Increased sympathetic activity may lead to elevated norepinephrine biosynthesis and subsequently heightened tyrosine hydroxylase activity, thus potentially resulting in low plasma tyrosine levels. This theory is bolstered by preclinical findings, wherein rats subjected to streptozotocin-induced DM exhibited elevated tyrosine hydroxylase activity within the sympathetic nervous system [53].
Compared to those in healthy individuals, plasma metabolites that significantly differed in DKD, HN, IgAN, and MN patients were more closely related to the clinical phenotype. For instance, PC aa C38:3, which was identified in this study to be associated with DKD, has been associated with type 2 DM [54,55]. However, there have been no reports regarding the association of PC aa C30:0 with CKD or DKD. Although ornithine was associated with MN in this study and the ornithine/citrulline ratio is associated with CKD [56,57], no research has yet demonstrated a specific link between ornithine and MN. Further research is required to elucidate the relationships and underlying mechanisms between the identified novel metabolomic biomarkers and CKD.
Although this study provides insightful information, it has certain limitations. Metabolites were analyzed in bulk rather than being selected based on their association with SNPs, thus potentially limiting our ability to confidently assert shared metabolic pathways between SNPs and metabolites. Additionally, the outcome analysis of kidney progression was based on the biopsy cohort with a small sample size, warranting validation in an external large cohort. Although we examined the association of the biomarkers with cause-specific subsets, the underlying pathophysiology linking CKD with these biomarkers remains unexplored. The relatively high p-value cutoff applied to the discovery set may increase the risk of false-positive associations. Although we validated our results using the same genotyping array (KCHIP), further studies with a larger sample size are required to confirm our findings. Moreover, further studies are needed to validate causal variants considering the imputed genomic data. In addition to these limitations, the selection of target metabolites relied on pathway-based approaches and curated datasets rather than solely on statistical inference, which might have introduced inconsistencies in the methodology. While this approach aimed to prioritize biological interpretability, it may have overlooked metabolites with significant changes that could be functionally relevant.
In conclusion, we have identified genetic and metabolomic biomarkers of CKD associated with its cause, demonstrating their association with the risk of future disease deterioration. The minor allele of rs1025170 and low tyrosine levels, which showed associations with each other, were associated with DKD and indicated a risk factor for DKD progression. These findings will serve as a clue in understanding the mechanisms behind CKD occurrence and progression.
Supplementary Materials
Supplementary data are available at Kidney Research and Clinical Practice online (https://doi.org/10.23876/j.krcp.24.135).
Notes
Conflicts of interest
All authors have no conflicts of interest to declare.
Funding
This work was supported by the National Institute of Health Research Project (project no. 2021-0317C93-00).
Acknowledgments
The human biospecimens were provided by the Biobank of Seoul National University Hospital, a member of the Korea Biobank Network (KBN4_A03), which is supported by the Korea Centers for Disease Control and Prevention (#4845-303).
Data sharing statement
The KoGES data are accessible to researchers who have obtained approval through the appropriate proposal process from the National Biobank of Korea (https://biobank.nih.go.kr/eng). Genomic data from Seoul National University Hospital can be provided upon further request to the corresponding author. The metabolomic data were deposited in the Metabolomics Workbench and Korea BioData Station. This study does not report any original code for the analyses. Any additional information required to reanalyze the data is available from the lead contact author upon request.
Authors’ contributions
Funding acquisition: SSH, JY Cho, JY Choi
Conceptualization, Methodology: SSH, JY Cho, JY Choi
Data curation, Resources, Supervision: SK, SB
Formal analysis: MWK, JEK, JK, JYP
Investigation: MWK, JEK, JK, SSH, SK
Validation: SSH, JY Cho, JY Choi, YSK
Writing–original draft: MWK, JEK, JK, SSH
Writing–review & editing: All authors
All authors approved the final manuscript.
