Terra Platform

Terra Application

Terra is a product of the Data Sciences Platform at the Broad Institute of MIT and Harvard, together with Verily Life Sciences.

Terra is a cloud-native platform where biomedical researchers can access data, run analysis tools, and collaborate. The technical components include a controlled-access AMP PD Knowledge Platform that covers data storage and management, standardized pipeline analyses that generate processed datasets, and collaborative, cloud-based project management. The Platform provides mechanisms for researchers to build, use, and share packaged analytical methods and query tools, including Docker-based pipelines (e.g., for variant calling) and Jupyter notebooks (e.g., for visualization).
 

Terra Components

The Data Explorer provides the following:

(1) A public UI that lets researchers explore datasets and quickly judge whether a particular dataset suits their needs.

(2) A protected UI that lets researchers dig deeper into datasets, better understand their content, and quickly build research queries.

The Workspace Service lets researchers:

Organize research by (1) building and saving queries and (2) building and saving notebooks.

Collaborate by sharing (1) with read-only or read-write access and (2) with individuals or groups.
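
The sharing options above can be pictured with a small data model. This is only an illustrative sketch, not the Workspace Service's actual implementation: the names Workspace, Access, share, and access_for are hypothetical, and it assumes that a user who holds several grants (directly and through groups) receives the strongest one.

```python
from dataclasses import dataclass, field
from enum import Enum


class Access(Enum):
    """Hypothetical access levels mirroring the two sharing modes."""
    READ_ONLY = "read-only"
    READ_WRITE = "read-write"


@dataclass
class Workspace:
    """Hypothetical workspace record; not the real Terra data model."""
    name: str
    # Maps an individual's email or a group name to the access granted.
    shares: dict = field(default_factory=dict)
    # Maps a group name to the set of member emails.
    groups: dict = field(default_factory=dict)

    def share(self, principal, access):
        """Grant access to an individual (email) or to a group (name)."""
        self.shares[principal] = access

    def access_for(self, email):
        """Return the strongest access a user holds, or None if unshared."""
        levels = []
        if email in self.shares:
            levels.append(self.shares[email])
        for group, members in self.groups.items():
            if email in members and group in self.shares:
                levels.append(self.shares[group])
        # Assumption: read-write outranks read-only when both apply.
        if Access.READ_WRITE in levels:
            return Access.READ_WRITE
        if Access.READ_ONLY in levels:
            return Access.READ_ONLY
        return None
```

For example, sharing a workspace read-only with a group and read-write with one of its members would leave that member with read-write access and the rest of the group with read-only access.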

The Notebook Service manages loading Jupyter notebooks and allocates backend compute resources for running large computations.

Notebooks are stored in a Cloud Storage bucket associated with a Workspace. The Jupyter server runs in a Compute Engine instance in an end-user Cloud project. The Notebook Service itself runs in a Broad/Verily-managed Cloud project.

The Methods Repository enables sharing workflows and tools (aka "methods") described with either the Common Workflow Language (CWL) or the Workflow Description Language (WDL).

These methods can then be selected from a list and applied to data in Google Cloud.

The Job Manager provides a user interface for monitoring data processing workflows run by the Broad’s Cromwell server. Cromwell is a workflow management system geared toward scientific workflows. The Job Manager service is a translation layer on top of a Cromwell service that lets users create Google Genomics pipelines and manage Google Genomics operations.
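
The idea of a translation layer can be sketched as follows. This is a minimal illustration rather than the Job Manager's actual code: it assumes Cromwell's REST status endpoint at /api/workflows/v1/{id}/status and a response body containing a "status" field, and the display labels and function names (CROMWELL_TO_UI, status_url, to_ui_status) are hypothetical.

```python
# Assumed Cromwell workflow states mapped to hypothetical display labels.
CROMWELL_TO_UI = {
    "Submitted": "pending",
    "On Hold": "pending",
    "Running": "running",
    "Aborting": "stopping",
    "Aborted": "stopped",
    "Failed": "failed",
    "Succeeded": "done",
}


def status_url(base_url, workflow_id):
    """Build the URL a translation layer would query for one workflow."""
    return f"{base_url.rstrip('/')}/api/workflows/v1/{workflow_id}/status"


def to_ui_status(cromwell_response):
    """Translate a decoded Cromwell status response into a display label.

    Unrecognized or missing states map to "unknown" instead of raising,
    so the monitoring view can still render the job.
    """
    return CROMWELL_TO_UI.get(cromwell_response.get("status"), "unknown")
```

A real service would also handle authentication, paging, and per-task metadata; the sketch covers only the state translation step.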