The Joshi group uses machine learning and data science to uncover how the regulation of hundreds of glycosylation-related genes contributes to the overall glycosylation process that takes place in the endoplasmic reticulum and Golgi.
In silico glycomics
Where there is life, there are sugars. Carbohydrates, and the biosynthetic machinery that builds glycans (the glycosylation metabolic network), are found in every domain of life. In eukaryotes, even though the basic structure of the secretory glycosylation network is shared, the diversity of glycans between species is massive. Glycans are largely found in the extracellular space on proteins and lipids, where they serve many general functions in assembly of the glycocalyx and extracellular matrix, protection from and interaction with the environment, and lubrication and clearance of microorganisms. Glycans also play highly specific roles in myriad fundamental protein functions, such as co-regulation of proprotein convertase processing and ectodomain shedding, modulation of receptor activation and interactions, and modulation of peptide hormone stability and ligand-binding propensity. Inside the cell, in the endoplasmic reticulum and Golgi, glycans serve non-specific roles in ensuring the correct folding and sorting of proteins, e.g. to lysosomal compartments. Glycosylation also takes place in the nucleus and cytoplasm where, through cross-talk with phosphorylation, it co-regulates most cell signalling, including regulation of the cell cycle. Thus, most cellular proteins undergo one or more types of glycosylation, and there is great potential for the discovery of specific roles of glycosylation in defined cellular contexts.
The biosynthesis of glycans is a complex, non-template-driven process that involves the orchestrated expression of over 700 genes, including glycosyltransferases, glycosyl hydrolases, nucleotide sugar transporters and other enzymes. Of these 700 genes, the builders, approximately 250 glycosyltransferases and sulfotransferases (glycogenes), are arguably the most important, because they directly catalyse the synthesis and modification of glycans in a stepwise manner. Read naïvely, these glycogenes predict millions of potential glycans.
The biosynthesis of glycans takes place within a single cell, and the result of the glycosylation process for that cell (the glycome) is tailored to suit its functional needs. Dysregulation of the glycosylation process results in aberrant glycosylation and impairs those cellular functions that depend on glycosylation. In order to understand the myriad functions of glycans, we need to understand not only how they are regulated from cell to cell, but also how this process can be dysregulated. Direct analysis of glycans is difficult, owing both to the heterogeneity of glycans and to technological challenges.
In the Joshi group, we use computational and data science approaches to uncover patterns of regulation within the glycosylation process, taking advantage of large amounts of publicly available transcriptomic, proteomic and glycomic data. For example, by mining transcriptomic data we can bring order to this large family of genes, so that we can predict the activities of these genes, their patterns of regulation, and the potential impact of their dysregulation on health and disease.
Taking glycomics to the single-cell level
We have recently performed a first analysis of how the capacity to perform glycosylation varies between cell types derived from many human organs. Our analysis revealed the overall patterns of regulation for glycosyltransferases, and enabled prediction of glycosylation capacity for individual cell types. The tools we have developed enable translation of any single-cell data into a prediction of the overall glycosylation capacity of a cell.
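To make the idea of a glycosylation capacity concrete, the following is a minimal Python sketch, not the group's actual tools. It assumes single-cell counts are available as one dictionary per cell, and it calls a capacity present in a cell type when every gene in a rule is detected in at least a chosen fraction of that type's cells. The rule names, gene sets, thresholds (`min_count`, `min_fraction`), function names and the toy counts are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical, highly simplified rules: a capacity is called present only
# if ALL listed genes are detected. Real pathways involve many more genes.
CAPACITY_RULES = {
    "O-GalNAc core 1": ["C1GALT1", "C1GALT1C1"],
    "complex N-glycan": ["MGAT1", "MGAT2"],
}


def fraction_expressing(cells, min_count=1):
    """Return {cell_type: {gene: fraction of cells with count >= min_count}}.

    `cells` is a list of (cell_type, {gene: count}) pairs.
    """
    totals = defaultdict(int)
    hits = defaultdict(lambda: defaultdict(int))
    for cell_type, counts in cells:
        totals[cell_type] += 1
        for gene, count in counts.items():
            if count >= min_count:
                hits[cell_type][gene] += 1
    return {
        cell_type: {gene: n / totals[cell_type] for gene, n in hits[cell_type].items()}
        for cell_type in totals
    }


def predict_capacity(fractions, rules=CAPACITY_RULES, min_fraction=0.25):
    """Call each capacity True/False per cell type from detection fractions."""
    return {
        cell_type: {
            name: all(genes.get(g, 0.0) >= min_fraction for g in required)
            for name, required in rules.items()
        }
        for cell_type, genes in fractions.items()
    }


# Made-up toy counts for two unnamed cell types (not real data).
cells = [
    ("type_A", {"MGAT1": 3, "MGAT2": 2, "C1GALT1": 0}),
    ("type_A", {"MGAT1": 1, "MGAT2": 1}),
    ("type_B", {"C1GALT1": 4, "C1GALT1C1": 2, "MGAT1": 1}),
    ("type_B", {"C1GALT1": 2, "C1GALT1C1": 1}),
]

fractions = fraction_expressing(cells)
capacity = predict_capacity(fractions)
for cell_type, calls in capacity.items():
    print(cell_type, calls)
```

Using the fraction of cells in which a gene is detected, rather than mean expression, is one simple way to cope with the sparsity of single-cell data; it is a design choice made for this sketch only.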
Data science to uncover patterns of regulation of glycosylation
Building on our analysis of glycosylation capacities at the single-cell level, we are investigating how to reveal even more detail about how glycosylation is regulated, and what the hallmarks of glycosylation dysregulation are in different diseases.
Prediction of protein glycosylation
At the Copenhagen Center for Glycomics, we hold in-house the largest repository of glycoproteomic data covering O-linked glycosylation, which gives us a unique opportunity to mine this information to build useful prediction models. We first used this kind of data to build the NetOGlyc4.0 tool, and more recently we have been using language models to develop the next generation of predictors for protein glycosylation.
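As a rough illustration of a typical first step for site-level predictors of O-linked glycosylation, the sketch below extracts a fixed-width sequence window around every serine and threonine in a protein. This is a generic preprocessing step, not a description of how NetOGlyc4.0 or the group's language-model predictors work; the function name `candidate_windows`, the `flank` and `pad` parameters and the example peptide are assumptions made for illustration.

```python
def candidate_windows(sequence, flank=5, pad="-"):
    """Return (position, residue, window) for every Ser/Thr in `sequence`.

    Positions are 1-based. Each window holds `flank` residues on either
    side of the candidate site, padded with `pad` at the protein termini.
    """
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue in "ST":
            # In `padded` the site sits at index i + flank, so the slice
            # starting at i is centred on the candidate residue.
            windows.append((i + 1, residue, padded[i : i + 2 * flank + 1]))
    return windows


# Made-up peptide, used only to show the output shape.
for position, residue, window in candidate_windows("MAPTSK", flank=2):
    print(position, residue, window)
```

Each window could then be encoded (for example one-hot, or as embeddings from a protein language model) and passed to a classifier trained on experimentally mapped glycosites.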
- Dworkin LA, Clausen H, Joshi HJ (2022) Applying transcriptomics to study glycosylation at the cell type level. iScience 25: 104419
- Larsen I, Povolo L, Zhou L, Tian W, Mygind K, Hintze J, Jiang C, Hartill V, Prescott K, Johnson C, Mullegama S, McConkie-Rosell A, McDonald M, Hansen L, Vakhrushev S, Schjoldager K, Clausen H, Joshi HJ, Halim A (2022) The SHDRA syndrome associated gene TMEM260 encodes a protein-specific O-mannosyltransferase. Research Square preprint.
- Schjoldager KT*, Narimatsu Y*, Joshi HJ*, Clausen H* (2020) Global view of human protein glycosylation pathways and functions. Nat Rev Mol Cell Biol.
- Larsen ISB, Narimatsu Y, Clausen H, Joshi HJ, Halim A (2019) Multiple distinct O-Mannosylation pathways in eukaryotes. Curr Opin Struct Biol 56: 171–178.
- Narimatsu Y, Joshi HJ, Schjoldager KT, Hintze J, Halim A, Steentoft C, Nason R, Mandel U, Bennett EP, Clausen H, Vakhrushev SY (2019) Exploring Regulation of Protein O-Glycosylation in Isogenic Human HEK293 Cells by Differential O-Glycoproteomics. Mol Cell Proteomics RA118.001121.
- Narimatsu Y, Joshi HJ, Nason R, Coillie JV, Karlsson R, Sun L, et al. (2019) An Atlas of Human Glycosylation Pathways Enables Display of the Human Glycome by Gene Engineered Cells. Molecular Cell 75: 394–407.e5.
- Joshi HJ, Narimatsu Y, Schjoldager KT, Tytgat HLP, Aebi M, Clausen H, et al. (2018) SnapShot: O-Glycosylation Pathways across Kingdoms. Cell 172: 632–632.e2.
- Joshi HJ, Jørgensen A, Schjoldager KT, Halim A, Dworkin LA, Steentoft C, et al. (2018) GlycoDomainViewer: A bioinformatics tool for contextual exploration of glycoproteomes. Glycobiology 28: 131–136.
- Joshi HJ, Hansen L, Narimatsu Y, Freeze HH, Henrissat B, Bennett E, et al. (2018) Glycosyltransferase genes that cause monogenic congenital disorders of glycosylation are distinct from glycosyltransferase genes associated with complex diseases. Glycobiology 28: 284–294.
- Steentoft C*, Vakhrushev SY*, Joshi HJ*, Kong Y, Vester-Christensen MB, Schjoldager KT-BG, et al. (2013) Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology. EMBO J 32: 1478–88.