O-GlcNAcylation Prediction: An Unattained Objective
Résumé
Background: O-GlcNAcylation is an essential post-translational modification (PTM) in mammalian cells. It consists in the addition of a N-acetylglucosamine (GlcNAc) residue onto serines or threonines by an O-GlcNAc transferase (OGT). Inhibition of OGT is lethal, and misregulation of this PTM can lead to diverse pathologies including diabetes, Alzheimer’s disease and cancers. Knowing the location of O-GlcNAcylation sites and the ability to accurately predict them is therefore of prime importance to a better understanding of this process and its related pathologies.Purpose: Here, we present an evaluation of the current predictors of O-GlcNAcylation sites based on a newly built dataset and an investigation to improve predictions.Methods: Several datasets of experimentally proven O-GlcNAcylated sites were combined, and the resulting meta-dataset was used to evaluate three prediction tools. We further defined a set of new features following the analysis of the primary to tertiary structures of experimentally proven O-GlcNAcylated sites in order to improve predictions by the use of different types of machine learning techniques.Results: Our results show the failure of currently available algorithms to predict O-GlcNAcylated sites with a precision exceeding 9%. Our efforts to improve the precision with new features using machine learning techniques do succeed for equal proportions of O-GlcNAcylated and non-O-GlcNAcylated sites but fail like the other tools for real-life proportions where ∼ 1.4% of S/T are O-GlcNAcylated.Conclusion: Present-day algorithms for O-GlcNAcylation prediction narrowly outperform random prediction. The inclusion of additional features, in combination with machine learning algorithms, does not enhance these predictions, emphasizing a pressing need for further development. We hypothesize that the improvement of prediction algorithms requires characterization of OGT’s partners.
Origine | Fichiers éditeurs autorisés sur une archive ouverte |
---|