Help

Users must provide sequences for both the host and pathogen datasets. Sequences can be provided either by uploading a file or by pasting them directly into the text area. An uploaded file takes priority over text area sequences: if a user uploads a file for the host dataset and also pastes a sequence into the host text area, the service will use the sequences from the file for the prediction.
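The priority rule can be illustrated with a minimal sketch. The function name `select_sequences` and its arguments are hypothetical and are not part of the DeepHPI code; the sketch only mirrors the behaviour described here, where an uploaded file wins over pasted text.

```python
def select_sequences(uploaded_text, pasted_text):
    """Return the FASTA text to use for one dataset (host or pathogen).

    Hypothetical helper: an uploaded file takes priority over the text area.
    """
    if uploaded_text and uploaded_text.strip():
        return uploaded_text
    if pasted_text and pasted_text.strip():
        return pasted_text
    raise ValueError("no sequences provided for this dataset")


# Both inputs present: the uploaded file is used.
chosen = select_sequences(">fileProtein\nMKTLLV", ">pastedProtein\nMVLSPA")
```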

Users must also select which type of predictive model they would like to use, as well as which kind of model (Human-pathogen, Animal-pathogen, or Plant-pathogen). By default, Sensitive mode and the plant-pathogen model are used; please make sure that these settings suit your dataset, and change the default values if they do not.

Additionally, users can provide an email address to receive a notification when the job is complete.

After all parameters have been set, users can click the run prediction button. A loading screen will then appear and automatically redirect to the result page when the job finishes. The loading screen also shows a link to the result page, which users can save to access the results later.


FASTA is the input format accepted by the DeepHPI service. Users can either upload a multi-FASTA file or paste sequence(s) into the boxes. The service was originally intended for amino acid sequences only; however, if nucleotide sequences are provided, DeepHPI offers the option to translate them into protein sequences and run the service. Users will be asked to agree before the sequences are translated.

The system ensures that sequences are either nucleotide or amino acid by running a backend validation. This validation also checks that the provided sequences follow the FASTA format.
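Users who want to check their input before submitting a job can run a similar check locally. The following is a minimal sketch and not the DeepHPI backend validation: the function names, the accepted alphabets, and the error messages are all assumptions. A sequence made up only of A, C, G, T, U, and N is treated as nucleotide, which is a heuristic and can misclassify very short peptides.

```python
# Assumed alphabets; the actual service may accept a different set of symbols.
NUCLEOTIDE = set("ACGTUN")
AMINO_ACID = set("ACDEFGHIKLMNPQRSTVWYBXZUO*")


def parse_fasta(text):
    """Parse FASTA text into {identifier: sequence}; raise ValueError if malformed."""
    records = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("header line has no identifier")
            current = fields[0]
            if current in records:
                raise ValueError(f"duplicate identifier: {current}")
            records[current] = ""
        elif current is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[current] += line.upper()
    if not records:
        raise ValueError("no FASTA records found")
    for name, seq in records.items():
        if not seq:
            raise ValueError(f"record {name} has no sequence")
    return records


def classify(seq):
    """Return 'nucleotide' or 'protein' based on the letters in the sequence."""
    letters = set(seq)
    if letters <= NUCLEOTIDE:
        return "nucleotide"
    if letters <= AMINO_ACID:
        return "protein"
    raise ValueError(f"unexpected characters: {sorted(letters - AMINO_ACID)}")


records = parse_fasta(">p1 example protein\nMKT\nLLV\n>n1\nACGT\n")
kinds = {name: classify(seq) for name, seq in records.items()}
```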

The developer is currently working to establish a threshold that limits the number of sequences that can be sent in a job, so there is not yet a limit on the number of sequences this service handles. With that in mind, we ask our users to use the service responsibly. Please be aware that the service runs all combinations of host and pathogen proteins; e.g., if 100 host proteins and 100 pathogen proteins are given, the input dataset will be 10,000 potential PPIs.
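To estimate the size of a job before submitting it, the number of candidate pairs is simply the product of the two list sizes. The sketch below illustrates this with placeholder identifiers; the function name `candidate_pairs` is hypothetical and does not come from the DeepHPI code.

```python
from itertools import product


def candidate_pairs(host_ids, pathogen_ids):
    """Yield every (host, pathogen) combination, as described above."""
    return product(host_ids, pathogen_ids)


# Placeholder identifiers for illustration only.
host_ids = [f"host_{i}" for i in range(100)]
pathogen_ids = [f"pathogen_{i}" for i in range(100)]

# The job size grows multiplicatively with both inputs.
n_pairs = len(host_ids) * len(pathogen_ids)
print(n_pairs)  # 10000
```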


Running time depends on the number of sequences provided. PyTorch calculations are run in an HPC environment that gives priority to the users of this service. For a small list of proteins, a job takes less than 30 seconds.


When the prediction is complete, users are redirected to a result page. If at least one HPI is predicted, users will see a sortable and filterable table that can also be downloaded in Excel or CSV format for further analysis in a spreadsheet editor (e.g., LibreOffice, Excel) or in R or SAS.

This table is straightforward: it contains only the sequence identifiers of the host and pathogen proteins that were predicted to interact.
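The downloaded CSV can also be processed with a short script. The sketch below groups the predicted pathogen partners by host protein. The column names `Host` and `Pathogen` are assumptions, since the exact headers of the downloaded file are not stated here; adjust them to match your file.

```python
import csv
import io
from collections import defaultdict


def partners_by_host(csv_file, host_col="Host", pathogen_col="Pathogen"):
    """Map each host identifier to the set of pathogen identifiers predicted to interact."""
    partners = defaultdict(set)
    for row in csv.DictReader(csv_file):
        partners[row[host_col]].add(row[pathogen_col])
    return dict(partners)


# In-memory sample standing in for a downloaded result file (made-up identifiers).
sample = io.StringIO("Host,Pathogen\nH1,P1\nH1,P2\nH2,P1\n")
result = partners_by_host(sample)
```

For a real download, pass an open file handle instead of the in-memory sample, for example `open("results.csv", newline="")`, where `results.csv` is a placeholder file name.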


DeepHPI offers the option to visualize the resulting HPI network. From the result page, users can click the visualization button (green button) to be redirected to the visualization app.

Within this app, users can see how the network is composed. DeepHPI also provides the closest Swiss-Prot hit associated with each protein predicted to interact, along with the Gene Ontology annotation and description of that Swiss-Prot homolog.

The network visualization environment includes links to several web sources, such as UniProt and NCBI, for more information about the proteins in the network.

Users can download the resulting network in JSON format to open it in network analysis software such as Gephi or Cytoscape. In addition, networks can be downloaded in SVG format to produce publication-quality images (> 300 DPI).


When the visualization app is launched, it shows the information for the protein with the largest degree. The degree is simply the number of interactions of that particular protein. If your network is too large and you would like to reduce its complexity, a good starting point is to check the proteins with a large degree. However, it is recommended to download the network and perform topology analysis and functional enrichment of the nodes within the HPI network.
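Degree can also be computed outside the app from a list of predicted interactions. The following is a minimal sketch assuming the network is available as (host, pathogen) identifier pairs, for example as read from the result table; the function name and the sample identifiers are hypothetical.

```python
from collections import Counter


def degrees(edges):
    """Count the number of distinct interactions per protein."""
    counts = Counter()
    # set() drops exact duplicate pairs so each interaction is counted once.
    for host, pathogen in set(edges):
        counts[host] += 1
        counts[pathogen] += 1
    return counts


# Made-up interactions for illustration.
edges = [("H1", "P1"), ("H1", "P2"), ("H2", "P1"), ("H1", "P3")]
deg = degrees(edges)
top_protein, top_degree = deg.most_common(1)[0]
print(top_protein, top_degree)  # H1 3
```

Sorting proteins by degree in this way gives a quick shortlist of hubs to inspect first; it does not replace a full topology analysis.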


If you can't get something to work, or if you want to suggest something to make this web server work better, the developer would be happy to address any questions at crissloaiza@gmail.com or cdloaiza@aggiemail.usu.edu.


OS      | Version        | Chrome        | Firefox | Safari     | Opera
Linux   | CentOS 7       | 72.0.3626.96  | 60.5.0  | n/a        | 58.0.3135.53
MacOS   | Mojave 10.14.2 | 72.0.3626.121 | 65.0.1  | 12.0.2     | not tested
Windows | 7              | 72.0.3626.121 | 63.0.3  | not tested | 45.0.2552.888