About

1 Prediction

To start using PON-IDR, provide UniProt IDs and amino acid substitution information. For protein sequences, there are two ways to input: directly by providing sequences in Fasta format or list of IDs (UniProt). These data can be written or pasted to the boxes in the input forms or uploaded as a file. Both types allow the submission of variants in multiple queries simultaneously. To look for an input example, please click the "Example" text on the input page.

1.1 Input sequence(s)

FASTA sequences have to contain a header line starting with greater than sign (>) followed by amino acids sequence. Amino acids sequence has to start from a new line.

Information for amino acid substitutions has to contain the same header line as the sequence. An amino acid substitution consists of three parts in HGVS format: original amino acid, position, and new amino acid. For example, "A61T" means 61st amino acid A (alanine) is substituted by T (threonine). Use single letter amino acid codes. Each protein sequence can contain multiple amino acid substitutions, each one indicated in a different line.

Example

>P06400
MPPKTPRKTAATAAAAAAEPPAPPPPPPPEEDPEQDSGPEDLPLVRLEFEETEEPDFTALCQKLKIPDHVRERAWLTWEKVSSVDGVLGGYIQKKKELWGICIFIAAVDLDEMSFTFTELQKNIEISVHKFFNLLKEIDTSTKVDNAMSRLLKKYDVLFALFSKLERTCELIYLTQPSSSISTEINSALVLKVSWITFLLAKGEVLQMEDDLVISFQLMLCVLDYFIKLSPPMLLKEPYKTAVIPINGSPRTPRRGQNRSARIAKQLENDTRIIEVLCKEHECNIDEVKNVYFKNFIPFMNSLGLVTSNGLPEVENLSKRYEEIYLKNKDLDARLFLDHDKTLQTDSIDSFETQRTPRKSNLDEEVNVIPPHTPVRTVMNTIQQLMMILNSASDQPSENLISYFNNCTVNPKESILKRVKDIGYIFKEKFAKAVGQGCVEIGSQRYKLGVRLYYRVMESMLKSEEERLSIQNFSKLLNDNIFHMSLLACALEVVMATYSRSTSQNLDSGTDLSFPWILNVLNLKAFDFYKVIESFIKAEGNLTREMIKHLERCEHRIMESLAWLSDSPLFDLIKQSKDREGPTDHLESACPLNLPLQNNHTAADMYLSPVRSPKKKGSTTRVNSTANAETQATSAFQTQKPLKSTSLSLFYKKVYRLAYLRLNTLCERLLSEHPELEHIIWTLFQHTLQNEYELMRDRHLDQIMMCSMYGICKVKNIDLKFKIIVTAYKDLPHAVQETFKRVLIKEEEYDSIIVFYNSVFMQRLKTNILQYASTRPPTLSPIPHIPRSPYKFPSSPLRIPGGNIYISPLKSPYKISEGLPTPTKMTPRSRILVSIGESFGTSEKFQKINQMVCNSDRVLKRSAEGSNPPKPLKKLRFDIEGSDEADGSKHLPGESKFQQKLAEMTSTRTRMQKQKMNDSMDTSNKEEK

>P06400
R358G
Y93C

1.2 Input Protein ID

PON-IDR accepts also sequence IDs.The IDs should be preceded by greater than sign (>). After that, provide amino acid substitutions starting from the next line. All the variants in a protein in a single list. After that, details for another sequence can be provided. Substitutions are provided in the HGVS format.

Example

>P06400
R358G
Y93C

-----------------------------------------------------------------------------------------------------------

2 Prediction result

-----------------------------------------------------------------------------------------------------------

3 Data sets

Dataset	pathogenic	benign	total
Training dataset	544	464	1008
Blind Test dataset	50	50	100
noIDR of PON-ALL test set	2080	2267	4347
noIDR of ClinVar test set	7167	2089	9976

The dataset can be download from here.

Training set

Blind test set

-----------------------------------------------------------------------------------------------------------

4 Contact

If you have any problems, please contact Wei Wang (20215227100@stu.suda.edu.cn).

HELP

-----------------------------------------------------------------------------------------------------------

1 Prediction

1.1 Input sequence(s)

Example

1.2 Input Protein ID

Example

-----------------------------------------------------------------------------------------------------------

2 Prediction result

-----------------------------------------------------------------------------------------------------------

3 Data sets

-----------------------------------------------------------------------------------------------------------

4 Contact