DTU Health Tech
Department of Health Technology
The NetNGlyc server predicts N-glycosylation sites in human proteins using artificial neural networks that examine the sequence context of Asn-Xaa-Ser/Thr sequons.
For publication of results, please cite:
Gupta R, Brunak S.
Prediction of glycosylation across the human proteome and the correlation to protein function.
Pac Symp Biocomput. 2002;:310-22.
PMID: 11928486
The sequence must be written using the one letter
amino acid code:
`acdefghiklmnpqrstvwy' or
`ACDEFGHIKLMNPQRSTVWY'.
Other letters will be converted to `X' and treated as unknown
amino acids.
Other characters, such as whitespace and
numbers, will simply be ignored.
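The input rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the server's own code: the function name `clean_sequence` is hypothetical, and converting lowercase letters to uppercase is an assumption made here for convenience.

```python
# The twenty one-letter amino acid codes accepted by the server.
VALID = set("ACDEFGHIKLMNPQRSTVWY")

def clean_sequence(raw: str) -> str:
    """Apply the input rules: keep amino acid letters, turn other letters
    into X, and drop everything that is not a letter."""
    cleaned = []
    for ch in raw:
        if not ch.isalpha():
            continue  # whitespace, digits and punctuation are ignored
        upper = ch.upper()  # assumption: case is normalised to uppercase
        cleaned.append(upper if upper in VALID else "X")
    return "".join(cleaned)

print(clean_sequence("mpl 10 LBZ\n"))  # prints MPLLXX
```

Here `B` and `Z` are letters outside the twenty-residue alphabet, so they become `X`, while the space, the digits and the newline are dropped.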
# Predictions for N-Glycosylation sites in 1 sequence

Name: CBG_HUMAN Length: 405

(Sequence) Asn-Xaa-Ser/Thr sequons (including Asn-Pro-Ser/Thr) are shown in blue. Asparagines predicted to be N-glycosylated are shown in red. Note that not all sequons are predicted glycosylated.

MPLLLYTCLLWLPTSGLWTVQAMDPNAAYVNMSNHHRGLASANVDFAFSLYKHLVALSPKKNIFISPVSISMALAMLSLG      80
TCGHTRAQLLQGLGFNLTERSETEIHQGFQHLHQLFAKSDTSLEMTMGNALFLDGSLELLESFSADIKHYYESEVLAMNF     160
QDWATASRQINSYVKNKTQGKIVDLFSGLDSPAILVLVNYIFFKGTWTQPFDLASTREENFYVDETTVVKVPMMLQSSTI     240
SYLHDSELPCQLVQMNYVGNGTVFFILPDKGKMNTVIAALSRDTINRWSAGLTSSQVDLYIPKVTISGVYDLGDVLEEMG     320
IADLFTNQANFSRITQDAQLKSSKVVHKAVLQLNEEGVDTAGSTGVTLNLTSKPIILRFNQPFIIMIFDHFTWSSLFLAR     400
VMNPV

(Annotation line) `N' represents a predicted N-glycosylation site. `n' represents an Asn with a positive score, but not occurring within an Asn-Xaa-Ser/Thr sequon.

..............................N.................................................      80
...............N................................................................     160
................................................................................     240
...................N............................................................     320
................................................N...............................     400
.....

(Threshold=0.5)
--------------------------------------------------------------------------------
SeqName    Position  Potential    Jury     NGlyc
                                agreement  result
--------------------------------------------------------------------------------
CBG_HUMAN    31 NMSN   0.7166     (9/9)     ++     <-- Predicted as N-glycosylated (++)
CBG_HUMAN    96 NLTE   0.6356     (8/9)     +      <-- Predicted as N-glycosylated (+)
CBG_HUMAN   176 NKTQ   0.3941     (7/9)     -      <-- A negative site
CBG_HUMAN   260 NGTV   0.7400     (9/9)     ++
CBG_HUMAN   330 NFSR   0.4223     (7/9)     -          see below for
CBG_HUMAN   369 NLTS   0.6684     (9/9)     ++         more information
--------------------------------------------------------------------------------
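As a rough illustration of which positions end up in the table, the Asn-Xaa-Ser/Thr sequons of the example sequence can be located with a regular expression. The sketch below only finds candidate sequons; it does not reproduce the neural network scoring. The function name `find_sequons` and the `allow_proline` parameter are hypothetical.

```python
import re

# CBG_HUMAN sequence from the example output (405 residues).
CBG_HUMAN = (
    "MPLLLYTCLLWLPTSGLWTVQAMDPNAAYVNMSNHHRGLASANVDFAFSLYKHLVALSPKKNIFISPVSISMALAMLSLG"
    "TCGHTRAQLLQGLGFNLTERSETEIHQGFQHLHQLFAKSDTSLEMTMGNALFLDGSLELLESFSADIKHYYESEVLAMNF"
    "QDWATASRQINSYVKNKTQGKIVDLFSGLDSPAILVLVNYIFFKGTWTQPFDLASTREENFYVDETTVVKVPMMLQSSTI"
    "SYLHDSELPCQLVQMNYVGNGTVFFILPDKGKMNTVIAALSRDTINRWSAGLTSSQVDLYIPKVTISGVYDLGDVLEEMG"
    "IADLFTNQANFSRITQDAQLKSSKVVHKAVLQLNEEGVDTAGSTGVTLNLTSKPIILRFNQPFIIMIFDHFTWSSLFLAR"
    "VMNPV"
)

def find_sequons(seq, allow_proline=True):
    """Return (1-based Asn position, context of up to 4 residues) for each
    Asn-Xaa-Ser/Thr sequon. With allow_proline=False, Asn-Pro-Ser/Thr is skipped."""
    # A zero-width lookahead lets overlapping sequons be reported as well.
    pattern = r"(?=N.[ST])" if allow_proline else r"(?=N[^P][ST])"
    return [(m.start() + 1, seq[m.start():m.start() + 4])
            for m in re.finditer(pattern, seq)]

for position, context in find_sequons(CBG_HUMAN):
    print(position, context)
```

For this sequence the scan reports positions 31, 96, 176, 260, 330 and 369, the same six sequons listed in the table.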
The graph illustrates predicted N-glycosylation sites across the protein chain (the x-axis represents sequence position from the N- to the C-terminus). A position whose potential (vertical line) crosses the threshold (horizontal line at 0.5) is predicted to be glycosylated. Additional thresholds at 0.32, 0.75 and 0.90 are shown as horizontal dotted lines; they are explained below. An Encapsulated PostScript version of the graph is available for inclusion in publications.
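The threshold rule can be tried out on the potentials from the CBG_HUMAN example. The sketch below is illustrative only: `predicted_sites` is a hypothetical helper, and treating a potential exactly equal to the threshold as not crossing it is an assumption.

```python
# Potentials copied from the CBG_HUMAN example output (position -> potential).
POTENTIALS = {31: 0.7166, 96: 0.6356, 176: 0.3941,
              260: 0.7400, 330: 0.4223, 369: 0.6684}

def predicted_sites(potentials, threshold=0.5):
    """Return the positions whose potential crosses the given threshold."""
    return sorted(pos for pos, p in potentials.items() if p > threshold)

print(predicted_sites(POTENTIALS))        # default threshold of 0.5
print(predicted_sites(POTENTIALS, 0.32))  # looser threshold
print(predicted_sites(POTENTIALS, 0.75))  # stricter threshold
```

At the default threshold four of the six sequons are predicted glycosylated (31, 96, 260, 369); at 0.32 all six cross the line, and at 0.75 none do.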
The Asn-Xaa-Ser/Thr sequon
Asn-Pro-Ser/Thr
Thresholds and confidence
Glycosylated sites:
+     Potential > 0.5
++    Potential > 0.5 AND Jury agreement (9/9) OR Potential > 0.75
+++   Potential > 0.75 AND Jury agreement
++++  Potential > 0.90 AND Jury agreement
and non-glycosylated sites:
-     Potential < 0.5
--    Potential < 0.5 AND Jury agreement (all nine < 0.5)
---   Potential < 0.32 AND Jury agreement
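The confidence labels can be expressed as a small decision function. This is a sketch under stated assumptions, not the server's implementation: positive labels are read as applying to potentials above 0.5 (consistent with the example output, where 0.7166 with 9/9 agreement is reported as ++), the jury figure is taken to be the number of networks that agree with the call, and a potential of exactly 0.5 is treated as negative. The name `confidence_label` is hypothetical.

```python
def confidence_label(potential, agreeing, n_networks=9):
    """Map a potential and a jury count to a +/- confidence label.

    Assumption: `agreeing` counts the networks that agree with the call,
    so jury agreement means agreeing == n_networks.
    """
    unanimous = agreeing == n_networks
    if potential > 0.5:
        if potential > 0.90 and unanimous:
            return "++++"
        if potential > 0.75 and unanimous:
            return "+++"
        if unanimous or potential > 0.75:
            return "++"
        return "+"
    # Non-glycosylated side (assumption: exactly 0.5 is treated as negative).
    if potential < 0.32 and unanimous:
        return "---"
    if unanimous:
        return "--"
    return "-"

# Rows from the CBG_HUMAN example output.
for pos, potential, jury in [(31, 0.7166, 9), (96, 0.6356, 8), (176, 0.3941, 7)]:
    print(pos, confidence_label(potential, jury))
```

Applied to the six rows of the example table, this reading yields the labels shown there: ++, +, -, ++, - and ++.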
Warnings and notes in the right margin
SEQUON ASN-XAA-SER/THR. If you request a prediction on all Asparagines (instead of the default to predict only on Asn-Xaa-Ser/Thr sequons), then this note will appear for Asparagine positions which do occur within the Asn-Xaa-Ser/Thr sequon.
WARNING: PRO-X1. Proline occurs just after the Asparagine residue. This makes it highly unlikely that the Asparagine is glycosylated, presumably due to conformational constraints.
WARNING: PRO-X2. Proline occurs at the 3rd position C-terminal to the Asparagine in question (2nd 'X' in NX[ST]X). This makes it somewhat unlikely that the Asparagine is glycosylated, but this condition is not as harsh as the PRO-X1 condition.
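The two proline conditions depend only on the residues one and three positions after the Asparagine, so they are easy to check by hand or in code. The following sketch assumes 1-based positions as in the output table; the function name `proline_warnings` is hypothetical and the server's actual checks may differ in detail.

```python
def proline_warnings(seq, asn_pos):
    """Return the proline warnings that apply to the Asn at 1-based asn_pos."""
    i = asn_pos - 1  # convert to a 0-based index
    if seq[i] != "N":
        raise ValueError("position %d is not an Asparagine" % asn_pos)
    warnings = []
    # PRO-X1: proline immediately after the Asn (the first X in NX[ST]X).
    if i + 1 < len(seq) and seq[i + 1] == "P":
        warnings.append("PRO-X1")
    # PRO-X2: proline three residues after the Asn (the second X in NX[ST]X).
    if i + 3 < len(seq) and seq[i + 3] == "P":
        warnings.append("PRO-X2")
    return warnings

print(proline_warnings("ANPSA", 2))   # prints ['PRO-X1']
print(proline_warnings("ANGTPA", 2))  # prints ['PRO-X2']
```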
Contrary to widespread belief, acceptor sites for N-linked
glycosylation on protein sequences are not well
characterised. The consensus sequence, Asn-Xaa-Ser/Thr
(where Xaa is not Pro), is known to be a prerequisite for
the modification. However, not all of these sequons are
modified and it is thus not discriminatory between
glycosylated and non-glycosylated asparagines. We train
artificial neural networks on the surrounding sequence
context, in an attempt to discriminate between acceptor and
non-acceptor sequons. In a cross-validated performance, the
networks could identify 86% of the glycosylated and 61% of
the non-glycosylated sequons, with an overall accuracy of
76%. The method can be optimised for high specificity
or high sensitivity. Apart from characterising individual
proteins, the prediction method can rapidly
scan complete proteomes.
Glycosylation is an important post-translational
modification, and is known to influence protein folding,
localisation and trafficking, protein solubility,
antigenicity, biological activity and half-life, as well as
cell-cell interactions. We investigate the spread of known
and predicted N-glycosylation sites across functional
categories of the human proteome.
The network will be updated, so predictions may change between versions. The network is balanced to give optimal predictions whether or not you submit sequences with homology to the known N-glycosylated proteins. If, however, the submitted sequence is very close or identical to a sequence in our training dataset, the accuracy can be expected to be higher than reported above.
We would appreciate any confirmation or refutation of our predictions. Since an expanded data set with additional N-glycosylated sequences would increase the performance of the network, we are very interested in receiving such material. User feedback is the only way we will learn to enhance the performance of the method. Any other comments regarding the predictions or the data may be sent to:
If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0) and the options you have selected. If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).
If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.
Correspondence:
Technical Support: