is-pero

Predicting the localisation of peroxisomal proteins

Help

is-pero is a tool for the prediction and classification of peroxisomal proteins. It uses machine learning models and sequence analysis techniques to provide accurate and interpretable results. The server supports high-throughput proteomics studies by allowing batch predictions. The tool is available as a web interface and can also be run locally from the versions distributed on GitHub and DockerHub.

Server Input

The is-pero webserver accepts protein sequences in FASTA format or UniProt IDs as input. Users can submit up to 100 protein sequences per job. Sequences can either be pasted directly into the input box or uploaded as a FASTA file. To predict whether the submitted proteins are peroxisomal, select "Yes" among the "Predict peroxisomal protein" options. When this option is selected, the downstream predictions are performed only on the subset of proteins predicted as peroxisomal. When it is not selected, the tool treats all submitted proteins as peroxisomal and performs the downstream analysis on every sequence. Two example inputs can be loaded through links on the submission page.
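For datasets larger than the per-job limit, the input can be split into several FASTA files before submission. The following is a minimal sketch, not part of is-pero itself: the function names (read_fasta, batches, to_fasta) are hypothetical helpers, and the only value taken from this page is the limit of 100 sequences per job.

```python
from typing import Iterator, List, Tuple

MAX_PER_JOB = 100  # per-job limit stated in the Server Input section

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(text: str) -> List[Record]:
    """Parse FASTA text into (header, sequence) pairs.

    Sequence lines that appear before the first '>' header are ignored.
    """
    records: List[Record] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def batches(records: List[Record], size: int = MAX_PER_JOB) -> Iterator[List[Record]]:
    """Yield consecutive groups of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def to_fasta(records: List[Record]) -> str:
    """Serialise records back to FASTA text, one sequence line per record."""
    return "".join(f">{header}\n{sequence}\n" for header, sequence in records)
```

Each group yielded by batches can be written out with to_fasta and submitted as a separate job.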

Server Output

The is-pero webserver returns the aggregated results of three sequential prediction steps:

An optional step that predicts whether the input protein is peroxisomal.
A step that predicts whether the protein is located in the matrix or the membrane of the peroxisome.
A step that detects the specific targeting signals PTS1, PTS2 and mPTS. For the selected membrane peptides (mPTS), the predicted binding probabilities are returned.

Upon submission, is-pero processes the input and returns both a tabular output displayed with DataTables and a downloadable TSV file containing detailed results. The output includes sequence IDs, localisation probabilities, signal peptide matches, and predictions for mPTS regions when these are retrieved by regular expressions. If the optional initial step (predicting peroxisomal proteins) is not performed, the probability of each protein being peroxisomal is not displayed. In detail, the TSV output includes the following data:

Sequence_ID: sequence identifier provided in the FASTA file.
Peroxisomal: yes or no (optional).
Probability_Peroxisomal: probability that the sequence is peroxisomal (optional).
Matrix: whether the peroxisomal protein is localised in the matrix or in the membrane.
Probability_Matrix: probability that the sequence is located in the matrix of the peroxisome.
Peptide: membrane peroxisomal targeting signal (mPTS) peptide detected by regular expression.
Start: starting position of the mPTS peptide.
Score: mPTS pattern matching score.
Signal: type of targeting signal detected, either membrane (mPTS) or another PTS.
Probability_MPTS: probability that the peptide is an mPTS.

This structured output allows easy interpretation and integration into further proteomics workflows. An example of the web interface output is reported below.

JobID: a152f5d9-f8f0-11f0-8a0b-17fbffffffff
Output File: output_finalprediction.csv
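The downloaded results can be loaded into a script using the column names listed above. The sketch below is an illustration under stated assumptions rather than an official client: load_results and mpts_candidates are hypothetical helper names, the 0.5 threshold is an arbitrary example value, and the delimiter is left as a parameter because the page describes a TSV file while the example job names a .csv file.

```python
import csv
from typing import Dict, Iterable, List


def load_results(lines: Iterable[str], delimiter: str = "\t") -> List[Dict[str, str]]:
    """Read the results table into a list of dicts keyed by column name.

    `lines` can be an open file or any iterable of text lines.
    """
    return list(csv.DictReader(lines, delimiter=delimiter))


def mpts_candidates(rows: List[Dict[str, str]],
                    min_probability: float = 0.5) -> List[Dict[str, str]]:
    """Keep rows whose Probability_MPTS is a number >= min_probability.

    Rows where the column is missing, empty, or non-numeric are skipped.
    The threshold is an example value, not a recommendation from is-pero.
    """
    selected = []
    for row in rows:
        raw = (row.get("Probability_MPTS") or "").strip()
        try:
            probability = float(raw)
        except ValueError:
            continue
        if probability >= min_probability:
            selected.append(row)
    return selected


# Example usage (the file name is whatever the download was saved as):
# with open("results.tsv", newline="") as handle:
#     rows = load_results(handle)
# for row in mpts_candidates(rows):
#     print(row["Sequence_ID"], row["Peptide"], row["Probability_MPTS"])
```

Skipping unparsable values keeps the filter usable when the optional or mPTS columns are empty for some proteins.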

Usage Notes

Ensure that all input sequences are in correct FASTA format. Jobs may take longer to process when the maximum number of sequences (100) is submitted. For best results, review the mPTS predictions alongside the other localisation predictions.
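A quick local check can catch common formatting problems before a job is submitted. This is a rough sketch only: fasta_problems is a hypothetical helper, and the allowed alphabet (the 20 standard amino acids) is an assumption, since this page does not state which residue codes the server accepts.

```python
from typing import List

# Assumption: only the 20 standard amino-acid letters are expected.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY")
MAX_PER_JOB = 100  # per-job limit stated in the Server Input section


def fasta_problems(text: str) -> List[str]:
    """Return a list of human-readable problems found in FASTA text."""
    problems: List[str] = []
    header = None   # header of the record currently being read
    residues = 0    # residues seen so far for that record
    count = 0       # number of '>' headers seen
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None and residues == 0:
                problems.append(f"record '{header}' has no sequence")
            header, residues = line[1:].strip(), 0
            count += 1
            if not header:
                problems.append(f"line {number}: empty header")
        elif header is None:
            problems.append(f"line {number}: sequence data before any '>' header")
        else:
            bad = sorted(set(line.upper()) - ALLOWED)
            if bad:
                problems.append(f"line {number}: unexpected characters {''.join(bad)}")
            residues += len(line)
    if header is not None and residues == 0:
        problems.append(f"record '{header}' has no sequence")
    if count > MAX_PER_JOB:
        problems.append(f"{count} sequences exceed the limit of {MAX_PER_JOB} per job")
    return problems
```

An empty list means no problems were detected by these particular checks; it does not guarantee that the server will accept the input.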

Contact

For additional support or inquiries, please contact Marco Anteghini or Emidio Capriotti.