Dorieh Data Platform
Contents
Introduction
  Introduction to Data Platform
  Building Blocks
  Working with NSAPH containerized apps
  Deployment
  Terms and Acronyms
  Building Platform documentation
What is Data Platform
  Why we need a Data Platform
  Architecture
  Supported Programming Languages and Tools
  Development Mode
  Where it can be deployed
Data Domains
  Health
  Climate
  Exposure (from Atmospheric Composition Analysis Group of Washington University in St. Louis)
  Environmental Protection Agency (EPA) data
  Demographics
Data Processing Pipelines
  Introduction
  Running Workflows
  Testing workflows
  Installing Python dependencies
  Published and tested workflows
  Developing your own workflows
  Example of a workflow
Python Packages
  General purpose utilities
  Data platform components
  GIS utilities
  Health data manipulation tools
  EPA tools
  Raster data tools (climate and exposure)
  Census data manipulation tools
Data Modelling for Dorieh Data Platform
  Introduction to data modelling for Dorieh Data Platform
  Domain
  Table
  Column
  Multi-column indices
  Generation of the database schema (DDL)
  Indexing policies
  Linking with nomenclature
  Ingesting data
Examples
  A CWL workflow example: aggregating a climate variable
  How to query the database
  Querying Medicaid Data
  Monitoring database activity using Dorieh tools
Data Platform Internals
  Dorieh Core Data Platform
  Dorieh Deployment
  Managing database connections
  Monitoring database activity
  Project (Directory) Loading Utility
Database Testing Framework
Adding more data
  What data are you adding?
  Data modelling vs data introspection
  Adding new data domain
  Adding data to existing table
  Creating new single table
  Automatically ingesting multiple files from a file system
Executing containerized apps
  Introduction
  Prerequisites
  Using pipeline generator
  Execute generated pipeline
  Appendix 1: Metadata description
Terms and Acronyms
Indices
  Documentation Indices
  General Index
  Python Module Index