News & Updates

AMP PD August 2019 Release Notes

Data Summary

Clinical Data
Participant records from the BioFIND, HBS, PDBP, and PPMI cohorts were compiled into a harmonized dataset. These records were then paired with RNA and WGS samples, and participants with no matching sample data were excluded. The exception is 4 participants who were retained even though their WGS samples were excluded, because those samples were excluded solely for duplicating samples in AMP PD cohorts.

RNA Data
RNA samples were sequenced and processed for BioFIND, PDBP, and PPMI cohort participants. RNA samples with no corresponding clinical data were excluded during QC rounds.
 
WGS Data

DNA samples were sequenced and processed for BioFIND, HBS, PDBP, and PPMI cohort participants. WGS samples were not excluded during QC rounds when there was no corresponding clinical data, or when the available clinical records required further investigation before exclusion could be warranted. In three (3) cases, the single sample and joint genotyping data include WGS samples whose corresponding clinical data is not included in the clinical data set.

Integrated Data
This release includes 2872 participants with fully integrated clinical records, WGS samples, and RNA samples. For an additional 353 participants, this release includes RNA samples with corresponding clinical records where WGS data is not yet available. Similarly, there are 1070 WGS samples with clinical records where RNA sample data is not yet available. No participant has RNA and WGS data without clinical records, because RNA QC required that clinical records exist.
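The three groups described above do not overlap, so plain addition gives the number of participants with each data type. The following is a minimal sketch of that arithmetic; the variable names are illustrative and are not part of the release.

```python
# Counts quoted in the Integrated Data paragraph.
fully_integrated = 2872   # clinical + WGS + RNA
rna_no_wgs = 353          # clinical + RNA, WGS not yet available
wgs_no_rna = 1070         # clinical + WGS, RNA not yet available

# Participants with clinical records and RNA samples.
with_rna = fully_integrated + rna_no_wgs

# Participants with clinical records and WGS samples.
with_wgs = fully_integrated + wgs_no_rna

# Participants with clinical records and at least one sample type.
with_any_sample = fully_integrated + rna_no_wgs + wgs_no_rna

print(with_rna)         # 3225
print(with_wgs)         # 3942
print(with_any_sample)  # 4295
```

The 3225 total agrees with the participant count listed for rna_seq_samples.csv, and 3942 agrees with the number of joint-genotyped participants with minimum diagnosis information listed for amp_pd_case_control.csv.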
 

Coming Soon

Composition by Cohort

BioFIND Data
Of 213 participants whose clinical records met AMP PD minimum clinical data criteria, 166 participants have corresponding samples in all three release data categories.

HBS Data
Of 877 participants whose clinical records met AMP PD minimum clinical data criteria, 873 have corresponding WGS sample data and are represented in joint genotyping data. HBS samples are not fully integrated as no RNA sequence data was processed for this release.
    
PDBP Data
Of 1599 participants whose clinical records met AMP PD minimum clinical data criteria, 1311 participants have corresponding samples in all three release data categories, 1465 have corresponding WGS samples, and 1445 have corresponding RNA samples.

PPMI Data
Of 1610 participants whose clinical records met AMP PD minimum clinical data criteria, 1395 participants have corresponding samples in all three release data categories, 1433 have corresponding WGS samples, and 1572 have corresponding RNA samples.
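The per-cohort figures can be cross-checked against each other. The sketch below assumes that every WGS and RNA count refers to participants who also have clinical records, so that "all three categories" is the overlap of the WGS and RNA groups. The dictionary layout and function name are illustrative only, and `None` marks a figure these notes do not state.

```python
# Per-cohort counts quoted above. HBS has no RNA data in this release,
# so its RNA and fully integrated counts are taken as zero.
cohorts = {
    "BioFIND": {"clinical": 213, "all_three": 166, "wgs": None, "rna": None},
    "HBS": {"clinical": 877, "all_three": 0, "wgs": 873, "rna": 0},
    "PDBP": {"clinical": 1599, "all_three": 1311, "wgs": 1465, "rna": 1445},
    "PPMI": {"clinical": 1610, "all_three": 1395, "wgs": 1433, "rna": 1572},
}


def with_any_sample(counts):
    """Inclusion-exclusion: participants with WGS or RNA samples (or both)."""
    return counts["wgs"] + counts["rna"] - counts["all_three"]


total_clinical = sum(c["clinical"] for c in cohorts.values())
total_all_three = sum(c["all_three"] for c in cohorts.values())

print(total_clinical)                    # 4299
print(total_all_three)                   # 2872
print(with_any_sample(cohorts["PDBP"]))  # 1599
print(with_any_sample(cohorts["PPMI"]))  # 1610
```

Under that assumption, the PDBP and PPMI results equal the cohorts' clinical record counts, and the fully integrated total equals the 2872 reported under Integrated Data.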
 

Google Cloud Storage

Participant Data Products

  • amp_pd_participants.csv:  a table of all participants in all release data (n=4302)
  • amp_pd_case_control.csv:  a table of all joint-genotyped participants (n=3945) with minimum diagnosis information (n=3942) and known mutations
  • Harmonized clinical data
    • harmonized clinical data for 27 clinical forms as csv
    • harmonized clinical per-form dictionary files as csv
    • aggregate dictionary as .xls
    • curation summary document as .pdf

Participant Data Locations

gs://amp-pd-data/releases/2019_v1release_0831/
    amp_pd_participants.csv
    amp_pd_case_control.csv
    clinical/*

 

WGS Data Products

  • wgs_samples.csv: a table of all participant samples (n=3945) and processed file locations
  • Single sample processed data:  CRAM, gVCF, and GATK processing metrics (n=3945)
  • Joint genotyping processed data:  annotated variant vcf data (n=3945)
  • Plink files:  aggregated plink bfiles from all processed vcf data (n=3945)

WGS Data Locations
gs://amp-pd-genomics/releases/2019_v1release_0831/wgs
    wgs_samples.csv
    gatk/
        metrics/*
        vcf/*
    plink/
        bfile/*

 

RNA Data Products

  • rna_seq_samples.csv: a table of all participant (n=3225) samples (n=8356) and processed file locations
  • Processed RNA sample data
    • picard metrics (n=8356)
    • salmon quantification (n=8356)
    • star bams (n=8356)
    • feature counts (n=8356)

RNA Data Locations
gs://amp-pd-transcriptomics/releases/2019_v1release_0831/rnaseq/
    rna_seq_samples.csv
    picard/
        metrics/*
    salmon/
        quantification/*
    star/
        align-reads/*
    subread/
        feature-counts/*
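All three storage locations share the same release directory name, so full object paths can be assembled from a bucket, an optional sub-directory, and a relative path. The sketch below only builds the path strings from the listings above; the `ROOTS` table and `release_uri` helper are hypothetical names introduced for illustration, and reading the objects would additionally require approved access and a Cloud Storage client.

```python
RELEASE = "2019_v1release_0831"

# Bucket and sub-directory for each data category, copied from the
# Participant, WGS, and RNA data location listings.
ROOTS = {
    "participants": ("amp-pd-data", ""),
    "wgs": ("amp-pd-genomics", "wgs"),
    "rna": ("amp-pd-transcriptomics", "rnaseq"),
}


def release_uri(category, relative_path):
    """Return the gs:// URI of a file in this release (illustrative helper)."""
    bucket, subdir = ROOTS[category]
    parts = ["releases", RELEASE, subdir, relative_path]
    # Skip the empty sub-directory used by the participant bucket.
    path = "/".join(part for part in parts if part)
    return f"gs://{bucket}/{path}"


print(release_uri("participants", "amp_pd_participants.csv"))
print(release_uri("wgs", "wgs_samples.csv"))
print(release_uri("rna", "rna_seq_samples.csv"))
```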

 

BigQuery Datasets
Participant BQ Dataset:  2019_v1release_0831
AMP PD Metadata Tables
amp_pd_participants
amp_pd_case_control

Clinical Participant Tables
Demographics, PD_Medical_History, Enrollment, Caffeine_history, Family_History_PD, Smoking_and_alcohol_history

Clinical Assessments Tables
Epworth_Sleepiness_Scale, MDS_UPDRS_Part_I, MDS_UPDRS_Part_II, MDS_UPDRS_Part_III, MDS_UPDRS_Part_IV, MMSE, MOCA, Modified_Schwab___England_ADL, PDQ_39, REM_Sleep_Behavior_Disorder_Questionnaire_Mayo, REM_Sleep_Behavior_Disorder_Questionnaire_Stiasny_Kolster, UPDRS, UPSIT

Clinical Bio Tables
Biospecimen_analyses_CSF_abeta_tau_ptau, Biospecimen_analyses_CSF_beta_glucocerebrosidase, Biospecimen_analyses_other, Biospecimen_analyses_SomaLogic_plasma, DaTSCAN_SBR, DaTSCAN_visual_interpretation, MRI, DTI

WGS BQ Dataset:  2019_v1release_0831_genomics
WGS Joint Genotyping Tables
passing_variants, variants_compact

WGS Joint Metrics Tables
variant_calling_detail_metrics, variant_calling_summary_metrics

WGS Single Sample Tables
raw_wgs_metrics, wgs_metrics, wgs_samples

RNA BQ Dataset:  2019_v1release_0831_transcriptomics
RNA Metadata Tables
rna_seq_samples

Picard Tables
alignment_summary_metrics, insert_size_metrics, rna_seq_metrics

Salmon Tables
quantification_genes, quantification_transcripts

Star Tables
star_metrics

FeatureCounts Tables
feature_counts
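In BigQuery standard SQL, a table is addressed as project.dataset.table. The sketch below builds such references and a simple preview query from the dataset names listed above. These notes do not name the hosting project, so `PROJECT` is a placeholder, and `table_ref` and `preview_query` are hypothetical helpers; running the query would also require approved access and a BigQuery client.

```python
PROJECT = "example-project"  # placeholder: the hosting project is not named here

# Dataset names copied from the BigQuery Datasets listing.
DATASETS = {
    "participants": "2019_v1release_0831",
    "wgs": "2019_v1release_0831_genomics",
    "rna": "2019_v1release_0831_transcriptomics",
}


def table_ref(category, table):
    """Return a fully qualified project.dataset.table reference."""
    return f"{PROJECT}.{DATASETS[category]}.{table}"


def preview_query(category, table, limit=10):
    """Return a standard SQL query that previews the first rows of a table."""
    # Backticks quote the reference, which contains a hyphen and a
    # dataset name that starts with a digit.
    return f"SELECT * FROM `{table_ref(category, table)}` LIMIT {int(limit)}"


print(preview_query("wgs", "wgs_samples"))
```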

 
