Installation

Dependencies

  • Proteome Discoverer version 2.0 or 2.1 is required

  • R: Some parts of Peakjuggler need an R installation. R is included in the installer.

Installation

Peakjuggler consists of two nodes, one for the processing workflow and one for the consensus workflow.

Usage

Parameters

Processing Node

[Figure: Node Parameters]

  1. Search Parameters. These parameters limit the candidates for area calculation and therefore the runtime. Suitable values depend on how much certainty you have in your data. For discovery-type experiments, the thresholds can be lowered in order to quantify weaker peptides.

    • Match between runs: Either calculate the area only for identified peptides, or also try to recover peptides in runs where they were not identified

    • Confidence Level: Only use PSMs with at least this confidence

    • Minimum Score: The minimum score a PSM needs in order to be used. This depends on the search engine you use. Typical values are:

      • MsAmanda:  120-150

      • Mascot:  20

      • Sequest:  2

    • Minimum Sequence Length: The minimum number of amino acids a peptide needs to have to be used. A value of 6 or 7 is recommended

    • Search Engine Rank: The highest search engine rank up to which PSMs are considered

    • Missing Peaks: The number of consecutive MS1 scans in which the peptide mass is not found. This parameter is used in the detection of peak boundaries. For very noisy data, increasing this parameter may improve your results.

    • Mass Tolerance: The maximum mass difference at which a signal in the MS1 scan is still considered to belong to the same peptide. This is usually the same value as the MS1 tolerance of the search engine.

    • Retention Time Tolerance: Used for the matching between runs. It is the maximum time difference allowed when looking for peaks in files without an identification.

  2. PhosphoRS/ptmRS Settings

    • Probability Threshold: Everything above this threshold is counted as high

    • PhosphoRS Column: The name of the phosphoRS/ptmRS column. If this name is not found, Peakjuggler (PJ) tries some standard names. If still no column is found, modifications are taken from the search engine.

  3. RT Correction Parameters

    • RT Correction: Peakjuggler can correct retention time shifts. This parameter activates/deactivates this feature.

  4. Performance Parameters

    • Workpackage size: This regulates the number of spectra that are read into RAM at once. With 16 GB of RAM, a workpackage size of 10 000 should pose no problem. If you see that the RAM is maxed out while PJ is running, decreasing this number is strongly advised! The trade-off is that the analysis takes longer with a smaller package size.

  5. Area Calculation Parameters

    • LOESS Smoothing: If set to yes, LOESS smoothing is used. If set to no, smoothing is performed with a second-degree polynomial. We suggest using LOESS.

    • LOESS window size (in min): This regulates the number of data points used in the smoothing algorithm. The more MS2 scans are allowed between two MS1 scans, the higher this parameter should be. On a QEx HF with a top 12 method, 0.1 minutes is a fair value.

    • Minimum Width for Peaks: The minimum time for a peak to be considered.

    • Noise Level: All signals below this intensity are considered as zero

  6. Confidence Parameters

    • High Confidence: The FDR cutoff for high-confidence intensity values

    • Low Confidence: The FDR cutoff for medium-confidence intensity values
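
The interplay of Missing Peaks and Noise Level in peak boundary detection can be illustrated with a short Python sketch. This is not Peakjuggler's actual implementation: the function name `find_peak_boundaries`, its arguments, and the rule "stop once more than `missing_peaks` consecutive scans lack a signal" are assumptions made for illustration only. The input is a list with one intensity per MS1 scan, where 0 means the peptide mass was not found within the mass tolerance.

```python
from typing import List, Tuple


def find_peak_boundaries(intensities: List[float], apex: int,
                         missing_peaks: int = 2,
                         noise_level: float = 0.0) -> Tuple[int, int]:
    """Return (start, end) scan indices of the peak around `apex`.

    Hypothetical rule: walk away from the apex in both directions and
    stop once more than `missing_peaks` consecutive scans have no signal.
    """
    # Signals below the noise level are treated as zero, i.e. "not found".
    found = [value > 0 and value >= noise_level for value in intensities]

    def extend(step: int) -> int:
        last_hit = apex   # last scan that still contained a signal
        gap = 0           # current run of consecutive missing scans
        pos = apex + step
        while 0 <= pos < len(found):
            if found[pos]:
                last_hit = pos
                gap = 0
            else:
                gap += 1
                if gap > missing_peaks:
                    break
            pos += step
        return last_hit

    return extend(-1), extend(+1)


# One intensity per MS1 scan; the apex is at index 7.
trace = [0, 0, 0, 5, 9, 0, 12, 20, 11, 0, 0, 0, 4, 0]
print(find_peak_boundaries(trace, apex=7, missing_peaks=0))
print(find_peak_boundaries(trace, apex=7, missing_peaks=1))
print(find_peak_boundaries(trace, apex=7, missing_peaks=3))
```

In this toy trace, a larger Missing Peaks value bridges gaps and widens the peak, while raising the noise level turns weak signals into gaps. This matches the advice above to increase Missing Peaks for noisy data.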

Consensus Node

  1. Protein Area

    • Usage of peptides: Which method should be used to combine peptide areas to the protein area. The possibilities are:

      • sum

      • average

      • median

    • Ignore peptides with no area: Deprecated. Will be removed in the next version

    • Peptides to use for Protein Area:

      • Top x per sample: Takes the top x peptides per sample and performs the chosen method on their areas.

      • Top x over all samples: This method first sums all areas of each peptide in all samples and then takes the top x to calculate the protein area. This ensures that the same peptides are selected in all samples.

    • Use shared peptides for:

      • All proteins: Shared peptides contribute to all proteins they appear in

      • Nothing: Shared peptides are ignored

      • Protein Groups: If a peptide is only shared within the same protein group use it

    • # peps for protein area: The maximum number of peptides to combine for the protein area. If a protein has fewer peptides than this number, all available peptides are used.

    • Confidence Level: Use only peptides with the selected Peakjuggler confidence value or higher for the protein area. Medium is recommended, as other tools such as MaxQuant also use a 5% cutoff.

  2. Peptide Area

    • Confidence to use: This is the rule that defines the confidence on the PeptideGroup level. The confidences of the areas in the individual runs are combined to one confidence according to this setting.

      • lowest: The lowest of all confidences is taken

      • highest: The highest of all confidences is taken

      • mode(lowest): The most common confidence is taken. In case of a tie, the lowest wins

      • mode(highest): The most common confidence is taken. In case of a tie, the highest wins

    • Minimum Area: Peptides with a smaller area are counted as not quantified
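
The difference between the two peptide selection modes and the four confidence rules is easiest to see on a small example. The following Python sketch is an illustration under assumptions, not Peakjuggler's code: all function names, the nested dictionary layout, the confidence labels "low", "medium" and "high", and the treatment of missing areas (simply skipped) are hypothetical.

```python
from collections import Counter
from statistics import mean, median
from typing import Dict, List

Areas = Dict[str, Dict[str, float]]  # peptide -> sample -> area (assumed layout)
COMBINE = {"sum": sum, "average": mean, "median": median}
RANK = {"low": 0, "medium": 1, "high": 2}  # assumed confidence labels


def _samples(areas: Areas) -> List[str]:
    return sorted({s for per_sample in areas.values() for s in per_sample})


def top_x_per_sample(areas: Areas, x: int, method: str = "sum") -> Dict[str, float]:
    """Pick the x largest peptide areas separately in every sample."""
    out = {}
    for s in _samples(areas):
        top = sorted((p[s] for p in areas.values() if s in p), reverse=True)[:x]
        out[s] = COMBINE[method](top) if top else 0.0
    return out


def top_x_over_all_samples(areas: Areas, x: int, method: str = "sum") -> Dict[str, float]:
    """Rank peptides by their summed area, then use the same x peptides everywhere."""
    totals = {pep: sum(p.values()) for pep, p in areas.items()}
    chosen = sorted(totals, key=totals.get, reverse=True)[:x]
    out = {}
    for s in _samples(areas):
        vals = [areas[pep][s] for pep in chosen if s in areas[pep]]
        out[s] = COMBINE[method](vals) if vals else 0.0
    return out


def combine_confidence(confidences: List[str], rule: str) -> str:
    """Collapse per-run confidences into one PeptideGroup confidence."""
    if rule == "lowest":
        return min(confidences, key=RANK.get)
    if rule == "highest":
        return max(confidences, key=RANK.get)
    if rule not in ("mode(lowest)", "mode(highest)"):
        raise ValueError(f"unknown rule: {rule}")
    counts = Counter(confidences)
    best = max(counts.values())
    tied = [c for c, n in counts.items() if n == best]
    pick = min if rule == "mode(lowest)" else max
    return pick(tied, key=RANK.get)


areas = {
    "PEPA": {"s1": 100.0, "s2": 10.0},
    "PEPB": {"s1": 50.0, "s2": 80.0},
    "PEPC": {"s1": 20.0, "s2": 70.0},
}
print(top_x_per_sample(areas, x=2))        # different peptides per sample
print(top_x_over_all_samples(areas, x=2))  # PEPB and PEPA in both samples
print(combine_confidence(["high", "medium", "medium", "high"], "mode(lowest)"))
```

With x = 2, the per-sample mode uses PEPA and PEPB in s1 but PEPB and PEPC in s2, whereas the over-all mode uses PEPB and PEPA in both samples, so the s2 protein area differs between the two modes.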

Workflow

[Figure: Sample Workflows]

Processing Node

The processing node needs two connections, one to the Spectrum Files node, and another one to the PSM Validation node (e.g. Percolator or Target Decoy PSM Validator).

Consensus Node

The consensus node needs to be connected to the Protein Grouping node.

Results

New columns

Protein tab

The protein table receives two new columns: Peakjuggler Area and Identified by. The areas are colour coded to give a quick overview of whether the areas are similar across the samples. The Identified by column has one box per sample and indicates the origin of the quantification (MS/MS, MBR or none). See figure [fig:protein_table].

[Figure: Protein Table]

There is also one hidden column per raw file, named “identifying_peptides_raw_file_name*”, that lists the peptides that were used for the protein area (see fig [fig:hidden]).

[Figure: Show what peptides were used for the protein area]

Peptide Groups tab

The same columns are added as in the protein table. See figure [fig:peptide_table].

[Figure: Peptide Groups table]

New tables

PjFeatures

This table is similar to the PSMs table, but it works on the PCM (peptide charge modification) level instead of the PSM level (see fig [fig:features]).

[Figure: Feature table]

It also provides information about the quantification, such as peak start and end times, and a button to show the extracted ion chromatogram (XIC) including the peak boundaries (see fig. [fig:feature_xic]).

[Figure: XIC of a single PjFeature]

PjQuanResult

The QuanResult table is the equivalent of the Peptide Groups table for Features (see fig. [fig:quan_results]).

[Figure: QuanResult table]

It also features an XIC view over all samples (see fig. [fig:quan_xic]).

[Figure: XIC of a Feature in different samples]

Scoring

Peakjuggler also assigns each quantification a score that indicates how well the integration went. The score is based on the DeMix-Q paper by Zhang et al. [1]. Peakjuggler also performs a target-decoy quantification and assigns confidences to the Features. This feature is still under development!
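
How a target-decoy approach can turn scores into confidence labels is sketched below in Python. The manual does not describe Peakjuggler's actual procedure, so this shows only the generic technique: features are sorted by score, the FDR at each position is estimated as decoys divided by targets, and the estimate is made monotonic (q-value). The function name `assign_confidence`, the input layout, the label names, and the default cutoff of 1% for high confidence are assumptions. The 5% default for medium follows the recommendation in the Consensus Node parameters.

```python
from typing import List, Tuple


def assign_confidence(features: List[Tuple[float, bool]],
                      high_fdr: float = 0.01,
                      medium_fdr: float = 0.05) -> List[str]:
    """Label each (score, is_decoy) feature as high, medium or low.

    Generic target-decoy estimate, not Peakjuggler's implementation.
    Labels are returned in the same order as the input.
    """
    # Best score first.
    order = sorted(range(len(features)),
                   key=lambda i: features[i][0], reverse=True)

    targets = decoys = 0
    fdr = []
    for i in order:
        if features[i][1]:
            decoys += 1
        else:
            targets += 1
        fdr.append(decoys / max(targets, 1))

    # q-value: the lowest FDR at which a feature would still be accepted.
    q = fdr[:]
    for k in range(len(q) - 2, -1, -1):
        q[k] = min(q[k], q[k + 1])

    labels = [""] * len(features)
    for k, i in enumerate(order):
        if q[k] <= high_fdr:
            labels[i] = "high"
        elif q[k] <= medium_fdr:
            labels[i] = "medium"
        else:
            labels[i] = "low"
    return labels


# (score, is_decoy) pairs; a higher score means a better integration.
example = [(0.9, False), (0.8, False), (0.7, True), (0.6, False), (0.5, True)]
print(assign_confidence(example))
```

In this sketch, the High Confidence and Low Confidence parameters of the processing node would correspond to `high_fdr` and `medium_fdr`.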

Troubleshooting

Peakjuggler crashes

I get no areas


  1. Bo Zhang, Lukas Käll, and Roman A. Zubarev.
    DeMix-Q: Quantification-centered Data Processing Workflow.
    Mol Cell Proteomics mcp.O115.055475. First published on January 4, 2016.
    doi:10.1074/mcp.O115.055475