Targeted Proteomics Data
Targeted proteomics analysis was conducted on cerebrospinal fluid (CSF) and blood plasma from both Parkinson's disease patients and healthy participants in the PPMI cohort. The analysis used Olink Explore, which measures protein levels with a Proximity Extension Assay (PEA) that uses two DNA-tagged antibodies per analyte, followed by amplification. The method requires data to be converted into Normalized Protein eXpression (NPX) values for downstream analysis. NPX data are then intensity normalized and undergo extensive quality control using the different assay controls before analysis proceeds.
AMP PD Release 2.5 contains data generated from 743 matched CSF and plasma samples from 212 participants. The data include a preview of Olink Explore results: 138 participants with 521 CSF samples and 521 matched plasma samples from PDBP, and 74 participants with 222 CSF samples and 222 matched plasma samples from PPMI. These data will be fully integrated with untargeted proteomics in the AMP PD 3.0 release, along with additional supporting materials.
We have an extensive quality control procedure that we follow when running an assay. This allows for full visibility and control over the technical performance of the assay at each step and ensures that reliable data are generated with customers' samples.
Three internal controls are added to each sample to monitor the quality of assay performance, as well as the quality of individual samples:
- An Incubation Control, which is a PEA assay for spiked-in non-human proteins.
- An Extension Control, which is an antibody molecule with both DNA-oligos attached and therefore always in proximity.
- An Amplification Control, which is a synthetic double-stranded amplicon.
The Extension Control is used to calculate the NPX, and the other two controls are used for quality control of the assay.
We also include external control samples on each plate to normalize the data and to monitor the assay’s performance:
- A Sample Control of pooled plasma to estimate intra- and inter-assay precision.
- A Negative Control to measure background levels, which is used to estimate the limit of detection (LOD) for all assays.
- A Plate Control of pooled plasma to calculate the NPX and normalize signal levels between different plates for each assay.
Converting Counts to NPX
The Explore system's raw data output consists of NGS counts, where each combination of an assay and a sample is given an integer value based on the number of DNA copies detected. These raw counts are converted into NPX values for use in downstream statistical analysis.
NPX generation and Normalization
NPX values are calculated in two main steps and then intensity normalized through between-plate normalization. First, the assay counts of a sample are divided by the Extension Control counts for that sample and block, and the ratio is log2 transformed.
In the steps that follow, i refers to a specific assay, j refers to a sample, and ExtNPX denotes an extension-normalized NPX value.
- Relate counts to a known standard (the Extension Control)
- For all assays and all samples, including Negative Controls, Plate Controls, and Sample Controls
- Log2 transformation gives more normally distributed data
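The first step can be sketched in a few lines of Python. This is a minimal illustration, not Olink's implementation: the function name `ext_npx`, the assay name `ASSAY_B`, and all count values are hypothetical, and the choice to reject non-positive counts is an assumption of this sketch (the source does not say how zero counts are handled).

```python
import math

def ext_npx(assay_counts, ext_control_counts):
    """Extension-normalized NPX for assay i in sample j:
    log2(assay counts / Extension Control counts)."""
    if assay_counts <= 0 or ext_control_counts <= 0:
        # Assumption of this sketch: a log2 ratio needs positive counts.
        raise ValueError("counts must be positive to take a log2 ratio")
    return math.log2(assay_counts / ext_control_counts)

# Hypothetical NGS counts for one sample: assay name -> counts
sample_counts = {"GHRL": 4000, "ASSAY_B": 250}
# Hypothetical Extension Control counts for the same sample and block
ext_counts = 1000

ext_npx_values = {
    assay: ext_npx(counts, ext_counts)
    for assay, counts in sample_counts.items()
}
print(ext_npx_values)
```

Because the values are on a log2 scale, a difference of 1 corresponds to a doubling of the count ratio.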
The result is a scale on which values increase with increasing protein concentration for each assay. The median of the Plate Controls is then subtracted from the extension-normalized data:
- Perform plate standardization
- For all assays and per plate of samples
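As a rough sketch of plate standardization for a single assay on a single plate, the median ExtNPX of the Plate Controls is subtracted from every value. The function name `plate_standardize`, the sample identifiers, and the numbers are hypothetical and chosen only for illustration.

```python
from statistics import median

def plate_standardize(ext_npx_by_sample, plate_control_ids):
    """For one assay on one plate: subtract the median ExtNPX of the
    Plate Controls from every sample's ExtNPX value."""
    pc_median = median(ext_npx_by_sample[s] for s in plate_control_ids)
    return {s: v - pc_median for s, v in ext_npx_by_sample.items()}

# Hypothetical ExtNPX values for one assay: three Plate Controls, two samples
ext_npx_one_assay = {"PC1": 1.0, "PC2": 1.2, "PC3": 1.1, "S1": 2.1, "S2": 0.6}

npx = plate_standardize(ext_npx_one_assay, ["PC1", "PC2", "PC3"])
print(npx)
```

After this step, the median Plate Control sits at 0 for the assay, so values from different plates are expressed relative to the same reference.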
Intensity normalization is the default for randomized studies run on multiple sample plates. In this setting, the median of a random selection of samples is more stable across plates than the three Plate Controls on each plate. Intensity normalization sets the median level of all assays to the same value for all plates:
- Between-plate normalization (optional, but the default for multi-plate projects)
- For each assay, for all plates in the project.
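One way to picture intensity normalization is to shift each plate so that its sample median lands on a common value. The sketch below does this for one assay. The function name `intensity_normalize`, the `target` parameter, and the choice of 0 as the common value are assumptions of this example; the source only states that the medians are set to the same value on all plates.

```python
from statistics import median

def intensity_normalize(npx_by_plate, target=0.0):
    """For one assay: shift each plate so its sample median equals `target`.

    npx_by_plate maps a plate id to the list of NPX values for the study
    samples on that plate.
    """
    normalized = {}
    for plate, values in npx_by_plate.items():
        shift = target - median(values)
        normalized[plate] = [v + shift for v in values]
    return normalized

# Hypothetical NPX values for one assay on two plates
npx_one_assay = {"plate_1": [0.5, 1.0, 1.5], "plate_2": [1.5, 2.0, 2.5]}

normalized = intensity_normalize(npx_one_assay)
print(normalized)
```

This only makes sense when samples are randomized across plates, since it assumes the true median is the same on every plate.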
Figure 1: Summary of the three steps involved in NPX generation and data normalization. Step 1 is illustrated by the red graph, step 2 by the yellow graph, and step 3 (data normalization) by the turquoise graph. GHRL refers to appetite-regulating hormone (Uniprot ID: Q9UBU3).
AMP PD Quality Controls
Each sample in a block is given the status pass or warning. For a complete Explore 1536 run, this means that each sample is evaluated 16 times. There are three sample QC criteria per block. Exceeding specifications in any one of them will result in a QC warning status for that sample. The three criteria are listed below:
- Incubation control deviation from the median may not exceed 0.3 NPX.
- Amplification control deviation from the median may not exceed 0.3 NPX.
- Average counts for a sample may not fall below 500 counts.
Data from samples with a warning should be treated with caution.
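The three sample-level criteria can be expressed as a small check per block. This is a sketch under stated assumptions: the function name `sample_qc` and the input layout are hypothetical, and the median is assumed to be taken across the samples in the block, which the text above does not spell out.

```python
from statistics import median

MAX_CONTROL_DEVIATION = 0.3  # NPX, from the criteria above
MIN_AVERAGE_COUNTS = 500     # counts, from the criteria above

def sample_qc(inc_ctrl, amp_ctrl, avg_counts):
    """Return {sample_id: "pass" or "warning"} for one block.

    Each argument maps sample id -> value: Incubation Control NPX,
    Amplification Control NPX, and average counts for the sample.
    Assumption: deviations are measured from the median across the
    samples passed in.
    """
    inc_median = median(inc_ctrl.values())
    amp_median = median(amp_ctrl.values())
    status = {}
    for s in inc_ctrl:
        ok = (
            abs(inc_ctrl[s] - inc_median) <= MAX_CONTROL_DEVIATION
            and abs(amp_ctrl[s] - amp_median) <= MAX_CONTROL_DEVIATION
            and avg_counts[s] >= MIN_AVERAGE_COUNTS
        )
        status[s] = "pass" if ok else "warning"
    return status

# Hypothetical values for four samples in one block
inc = {"S1": 0.0, "S2": 0.1, "S3": 0.5, "S4": 0.05}
amp = {"S1": 0.0, "S2": 0.0, "S3": 0.0, "S4": 0.0}
counts = {"S1": 1200, "S2": 450, "S3": 900, "S4": 800}

qc_status = sample_qc(inc, amp, counts)
print(qc_status)
```

In this example S2 is flagged for low average counts and S3 for its Incubation Control deviation.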
In addition to individual samples that exceed specifications, an entire block can be considered failed if it meets criteria 1 and 2 below. For practical reasons, this will most likely result in a re-run of the panel. If criterion 3 is met, the panel is failed:
1. The number of samples with a QC warning exceeds 16/88 or 1/6.
2. The MAD for the Incubation or Amplification Control across all samples exceeds 0.3.
3. The Plate Control and Negative Control exceed the criteria specified in the section Control strip QC.
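The first two block-level criteria could be evaluated along these lines. The sketch makes several assumptions that are not confirmed by the text: MAD is read as the median absolute deviation from the median, the warning limit is taken as the 1/6 figure, and the names `mad` and `block_qc` are hypothetical. The two criteria are reported separately so that the caller decides how to combine them, and the control strip criterion is not covered here.

```python
from statistics import median

def mad(values):
    """Median absolute deviation from the median (assumed meaning of MAD)."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)

def block_qc(statuses, inc_ctrl, amp_ctrl, max_warning_fraction=1 / 6, max_mad=0.3):
    """Evaluate the warning-fraction and control-MAD criteria for one block."""
    n_warn = sum(1 for s in statuses if s == "warning")
    return {
        "too_many_warnings": n_warn / len(statuses) > max_warning_fraction,
        "control_mad_too_high": mad(inc_ctrl) > max_mad or mad(amp_ctrl) > max_mad,
    }

# Hypothetical block: 3 of 12 samples carry a warning
statuses = ["warning"] * 3 + ["pass"] * 9
inc_values = [0.0, 0.1, -0.1, 0.05]
amp_values = [0.0, 0.02, -0.03, 0.01]

block_result = block_qc(statuses, inc_values, amp_values)
print(block_result)
```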
Control strip QC
The Plate Controls and negative controls have specific QC criteria. The median of the triplicates may not deviate more than 3 standard deviations from predefined tolerance levels for more than 10% of the assays on average for a panel. For the negative control, only positive deviations are considered.
LOD and CV Calculation
LOD is defined as three standard deviations above the median NPX of the negative controls. The median is computed per plate from all samples annotated as negative controls, and a predefined standard deviation (fixSD) is used. Detectability is calculated per assay and plate and is defined as the percentage of samples above the LOD threshold. The overall detectability of the project is generated and reported in the Certificate of Analysis (CoA).
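The LOD and detectability definitions translate directly into code. The following is a minimal sketch for one assay on one plate; the function names, the `fix_sd` value of 0.2, and the NPX values are hypothetical, since the actual predefined standard deviations are not given here.

```python
from statistics import median

def lod(negative_control_npx, fix_sd):
    """LOD = median NPX of the negative controls + 3 * predefined SD."""
    return median(negative_control_npx) + 3 * fix_sd

def detectability(sample_npx, lod_value):
    """Percentage of samples with NPX above the LOD."""
    above = sum(1 for v in sample_npx if v > lod_value)
    return 100.0 * above / len(sample_npx)

# Hypothetical values for one assay on one plate
negative_controls = [-0.2, 0.0, 0.1]
samples = [0.5, 0.7, 1.2, 0.3]

lod_value = lod(negative_controls, fix_sd=0.2)
detect_pct = detectability(samples, lod_value)
print(lod_value, detect_pct)
```

Here two of the four samples exceed the LOD, so detectability for this assay and plate is 50%.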
The CV is calculated per assay (i) using the assumption of a log-normal distribution. The average CV is then calculated across panels and included in the CoA output.
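The exact CV formula is not reproduced on this page. A standard relation for log-normally distributed data, assumed here, is CV = sqrt(exp(s^2) - 1), where s is the standard deviation on the natural-log scale. Since NPX is on a log2 scale, the NPX standard deviation is multiplied by ln(2) first. The function name `lognormal_cv`, the use of the sample standard deviation, and the replicate values are assumptions of this sketch.

```python
import math
from statistics import stdev

def lognormal_cv(npx_values):
    """Percent CV on the linear scale from replicate NPX (log2) values,
    assuming a log-normal distribution."""
    sd_ln = stdev(npx_values) * math.log(2)  # convert log2 SD to natural-log SD
    return 100.0 * math.sqrt(math.exp(sd_ln ** 2) - 1.0)

# Hypothetical replicate NPX values for one assay (e.g. Sample Control wells)
replicates = [1.0, 1.1, 0.9]

cv_percent = lognormal_cv(replicates)
print(round(cv_percent, 2))
```

An average across assays or panels can then be taken over the per-assay values returned by this function.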