Filters protein identification engine results by different criteria.
This tool filters the identifications found by a peptide/protein identification tool such as Mascot. Different filters can be applied:
To enable a filter, change its parameter from the default value. All active filters are applied in order.
-
precursor:rt:
Precursor RT range for the peptide identification to be kept.
-
precursor:mz:
Precursor m/z range for the peptide identification to be kept.
-
score:pep:
The score a peptide hit must reach to be kept.
-
score:prot:
The score a protein hit must reach to be kept.
-
thresh:pep:
The fraction of the significance threshold that a peptide hit must reach to be kept. For example, if a peptide has a score of 30 and the significance threshold is 40, the peptide is kept only if the significance threshold fraction is set to 0.75 or lower (30 / 40 = 0.75).
-
thresh:prot:
This parameter behaves in the same way as thresh:pep, except that it is used to filter protein hits.
-
whitelist:proteins:
If you know which proteins are in the measured sample, you can specify a FASTA file that contains the sequences of those proteins. All peptides that do not belong to a protein contained in the FASTA file will be filtered out. The filtering is based on the protein identifiers attached to the peptide hits. Protein hits not matching any FASTA protein are also removed.
If you want to filter using the sequence alone (a peptide is kept only if it is a substring of a protein sequence in the FASTA file), use the flag whitelist:by_seq_only.
-
blacklist:peptides:
For this option you specify an idXML file. All peptides that are present in both files (the input file and the exclusion peptides file) will be dropped. Protein hits are not affected.
-
rt:
To filter identifications according to their predicted retention times, set 'rt:p_value' and/or 'rt:p_value_1st_dim' to a value larger than 0, depending on which RT dimension you want to filter. This filter can only be applied to idXML files produced by RTPredict.
-
best:n_peptide_hits:
Only the best n peptide hits of a spectrum are kept. If two hits have the same score, their order is random.
-
best:n_protein_hits:
Only the best n protein hits of a spectrum are kept. If two hits have the same score, their order is random.
-
best:strict:
Only the best hit of a spectrum is kept. This is similar to best:n_peptide_hits=1, except that if two or more hits of a spectrum share the maximum score, none of them are kept.
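The thresh:pep arithmetic and the best:n_peptide_hits / best:strict selection rules described above can be illustrated with a minimal Python sketch. It is not the tool's implementation: the function names and the (sequence, score) tuple layout are hypothetical, and it assumes that a higher score is better.

```python
from typing import List, Tuple

# Hypothetical hit representation for illustration: (peptide sequence, score).
Hit = Tuple[str, float]


def passes_threshold_fraction(score: float,
                              significance_threshold: float,
                              fraction: float) -> bool:
    """Keep a hit if its score reaches fraction * significance threshold.

    Assumes higher scores are better.
    """
    return score >= fraction * significance_threshold


def best_n_hits(hits: List[Hit], n: int) -> List[Hit]:
    """Keep the n highest-scoring hits of one spectrum.

    Python's sort is stable, so tied hits keep their input order here;
    the tool itself documents the order of tied hits as random.
    """
    return sorted(hits, key=lambda hit: hit[1], reverse=True)[:n]


def best_strict(hits: List[Hit]) -> List[Hit]:
    """Keep the single best hit, or nothing if the top score is shared."""
    if not hits:
        return []
    top_score = max(score for _, score in hits)
    top_hits = [hit for hit in hits if hit[1] == top_score]
    return top_hits if len(top_hits) == 1 else []
```

With a score of 30 and a significance threshold of 40, `passes_threshold_fraction(30, 40, 0.75)` is true, while a fraction of 0.8 would require a score of at least 32 and the hit would be dropped. For two hits that both score 30, `best_n_hits(hits, 1)` still returns one of them, whereas `best_strict(hits)` returns an empty list.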
Note:
Currently mzIdentML (mzid) is not directly supported as an input/output format of this tool. Convert mzid files to/from idXML using IDFileConverter if necessary.
The command line parameters of this tool are:
INI file documentation of this tool: