Outbreaks of infectious diseases caused by coronaviruses pose a significant threat to human health. For example, severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS) have caused many deaths and widespread panic in recent years. Identifying the potential hosts of these viruses and predicting the possibility of interspecies transmission are important for preventing and controlling such emerging infectious diseases (EIDs). The main objective of this tool is to identify the likely hosts of coronaviruses among six host groups: Human, Porcine, Bovine, Bat, Murine and Avian.
In our study, we used a dual-model approach, combining support vector machines (SVM) and a Mahalanobis distance discriminant (MD), to infer the potential hosts of coronaviruses from the mononucleotide and dinucleotide biases of their spike genes. This freely available web tool implements that approach: given the nucleotide sequences of spike genes, it outputs the predicted hosts. As a rough guide, if the hosts predicted by SVM and MD differ, the one predicted by MD is likely to be the natural host, whereas SVM tends to reveal a new host for the virus.
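To make the idea of mononucleotide and dinucleotide biases concrete, the following is a minimal sketch of how such features might be computed from a spike gene sequence. It is not the tool's actual implementation: the function name `nucleotide_features` is hypothetical, and the definition of dinucleotide bias as the observed/expected ratio f(XY) / (f(X) f(Y)) is an assumption based on common practice.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def nucleotide_features(seq):
    """Return 4 mononucleotide frequencies and 16 dinucleotide odds ratios.

    Hypothetical helper: the odds-ratio definition of dinucleotide bias
    is an assumption, not necessarily the tool's exact feature set.
    """
    seq = seq.upper().replace("U", "T")
    # Count only unambiguous bases.
    mono = Counter(b for b in seq if b in BASES)
    total_mono = sum(mono.values())
    if total_mono == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    # Count overlapping dinucleotides whose two bases are both unambiguous.
    pairs = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    total_pairs = sum(pairs.values())

    features = {b: mono[b] / total_mono for b in BASES}
    for x, y in product(BASES, repeat=2):
        observed = pairs[x + y] / total_pairs if total_pairs else 0.0
        expected = features[x] * features[y]
        features[x + y] = observed / expected if expected else 0.0
    return features
```

The resulting 20-value vector per sequence is the kind of input that a classifier such as an SVM or a distance-based discriminant could be trained on.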
Please choose a file of coronavirus spike gene sequences (an example):
and/or enter the sequences directly here (in FASTA format, or as raw sequences with genes separated by one or more blank lines):
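The two accepted input layouts can be illustrated with a small parser. This is only a sketch under stated assumptions: the function name `parse_sequences` and the automatic labels `seq1`, `seq2`, ... for unnamed raw sequences are hypothetical, and the tool's own parsing rules may differ.

```python
def parse_sequences(text):
    """Split pasted input into (name, sequence) pairs.

    Hypothetical helper. If any line starts with '>', the input is
    treated as FASTA; otherwise blank lines separate raw sequences.
    """
    lines = text.splitlines()
    records = []
    if any(line.lstrip().startswith(">") for line in lines):
        name, chunks = None, []
        for line in lines:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:].strip(), []
            elif line and name is not None:
                # Lines before the first header are ignored.
                chunks.append("".join(line.split()))
        if name is not None:
            records.append((name, "".join(chunks)))
    else:
        chunks = []
        # The trailing empty string flushes the last raw sequence.
        for line in lines + [""]:
            line = line.strip()
            if line:
                chunks.append("".join(line.split()))
            elif chunks:
                records.append((f"seq{len(records) + 1}", "".join(chunks)))
                chunks = []
    return records
```

For example, `parse_sequences("ACG\nT\n\nGG")` treats the two blocks separated by a blank line as two genes, joining the wrapped lines of each.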
Threshold p values in SVM for suspicious hosts: and highly suspicious hosts:
Threshold values in MD for suspicious hosts: and highly suspicious hosts:
2. Sequences for training (Optional)
Users may supply their own training sequences:
Carry out a leave-one-out cross-validation on the training data: Yes / No
Notes for the training sequence file:
If you have any questions, please contact us by sending feedback to Dr. Xiao-Qin Xia.