The Center for Translational Data Science at the University of Chicago is developing the discipline of data science and its applications to problems in biology, medicine, healthcare and the environment. We develop and operate large-scale data platforms to support research in topics of societal interest, including cancer, cardiovascular disease, inflammatory bowel disease (IBD), birth defects, veterans’ health, pain management, opioid use disorder, and environmental science. We also develop new machine learning and AI algorithms over the data in our platforms.
Our center has developed a number of important “firsts,” including one of the first large-scale data clouds (the NSF-supported Open Science Data Cloud (2010-2016)); the first data cloud designed to host biomedical data and approved as an NIH Trusted Partner (the Bionimbus Protected Data Cloud (2013-present)); the first large-scale data commons (the NCI Genomic Data Commons (2016-present)); and the first set of services to create data ecosystems for biomedical data (Data Commons Framework Services (2020-present)).
Today, with our partners, we operate a data ecosystem comprising over a dozen data commons that make over 10 PB of data about more than 600,000 patients available to the research community. We provide access to this data via secure and compliant workspaces, while protecting patient privacy. These are all based on the open source Gen3 data platform, which includes the Gen3 data commons, Gen3 Framework Services, and Gen3 Workspaces.
For more information, visit ctds.uchicago.edu.
The Open Commons Consortium (OCC) manages and operates cloud computing, data commons, and data ecosystems to advance scientific, medical, healthcare and environmental research for human and societal impact.
The OCC:
• manages and operates data commons for the research and development community;
• operates data commons framework services so that a research community can develop and operate a data ecosystem containing data commons, data platforms, and data resources, along with a collection of applications over them;
• contributes to the Data GUID community (dataguids.org), including a long-term commitment to the maintenance and persistence infrastructure of a resolution system for data GUIDs, which provide persistent identifiers for data;
• provides a governance framework so that a community of stakeholders can develop and operate a data commons or data ecosystem.
In summary, the OCC provides: i) consortium and project management; ii) legal agreements and a governance structure; iii) management and operation of the cloud computing infrastructure; and iv) management of the required security and compliance.
The OCC is a division of the Center for Computational Science Research, Inc. (CCSR), a Chicago-based 501(c)(3) not-for-profit that supports analytic, scientific, biomedical, and environmental research and applications.