Dorieh EPA Package Description

Documentation Home

Toolkit for downloading and preprocessing of data provided by EPA

Handles the following types of data:

  • EPA AQS Data hosted at https://www.epa.gov/aqs and EPA AirNow data from https://docs.airnowapi.org/webservices

  • AirNow contains real-time up-to-date pollution data but is less reliable than AQS

Package Contents

Querying Data

AirNow

Note: Negative values indicate missing data, therefore we need to specify a condition (value > 0)

Columns

COLUMN_NAME

TYPE_NAME

IS_NULLABLE

IS_AUTOINCREMENT

IS_GENERATEDCOLUMN

fullaqscode

text

YES

NO

NO

latitude

numeric

YES

NO

NO

longitude

numeric

YES

NO

NO

utc

timestamp

YES

NO

NO

parameter

varchar

YES

NO

NO

unit

varchar

YES

NO

NO

value

numeric

YES

NO

NO

aqi

numeric

YES

NO

NO

category

int4

YES

NO

NO

sitename

varchar

YES

NO

NO

agencyname

varchar

YES

NO

NO

intlaqscode

text

YES

NO

NO

zcta

int4

YES

NO

NO

state

int4

YES

NO

NO

fips5

int4

YES

NO

NO

statefp

int4

YES

NO

NO

countyfp

int4

YES

NO

NO

county

int4

YES

NO

NO

stusps

varchar

YES

NO

NO

record

varchar

NO

NO

NO

monitor

varchar

YES

NO

NO

Sample AirNow query by date

SELECT 
    utc::date,
    AVG(value) AS mean_value,
    MIN(value) AS minimum,
    MAX(value) AS maximum,
    COUNT(*)
FROM 
    epa.airnow_pm25_2022
WHERE value > 0    
GROUP BY 1
ORDER BY 1

Sample AirNow query, aggregating monthly

SELECT 
    EXTRACT(MONTH FROM utc),
    AVG(value) AS mean_value,
    MIN(value) AS minimum,
    MAX(value) AS maximum,
    COUNT(*)
FROM 
    epa.airnow_pm25_2022
WHERE value > 0    
GROUP BY 1
ORDER BY 1

Sample AirNow query, aggregating by State

SELECT 
    stusps,
    AVG(value) AS mean_value,
    MIN(value) AS minimum,
    MAX(value) AS maximum,
    COUNT(*)
FROM 
    epa.airnow_pm25_2022
WHERE value > 0    
GROUP BY 1
ORDER BY 2 desc

AQS

See AQS Data Dictionary for the description of the columns

Annual Columns

COLUMN_NAME

TYPE_NAME

IS_NULLABLE

IS_AUTOINCREMENT

IS_GENERATEDCOLUMN

state_code

varchar

YES

NO

NO

county_code

varchar

YES

NO

NO

site_num

varchar

YES

NO

NO

parameter_code

varchar

YES

NO

NO

poc

numeric

YES

NO

NO

latitude

numeric

YES

NO

NO

longitude

numeric

YES

NO

NO

datum

varchar

YES

NO

NO

parameter_name

varchar

YES

NO

NO

sample_duration

varchar

YES

NO

NO

pollutant_standard

varchar

YES

NO

NO

metric_used

varchar

YES

NO

NO

method_name

varchar

YES

NO

NO

year

numeric

YES

NO

NO

units_of_measure

varchar

YES

NO

NO

event_type

varchar

YES

NO

NO

observation_count

numeric

YES

NO

NO

observation_percent

numeric

YES

NO

NO

completeness_indicator

varchar

YES

NO

NO

valid_day_count

numeric

YES

NO

NO

required_day_count

numeric

YES

NO

NO

exceptional_data_count

numeric

YES

NO

NO

null_data_count

numeric

YES

NO

NO

primary_exceedance_count

varchar

YES

NO

NO

secondary_exceedance_count

varchar

YES

NO

NO

certification_indicator

varchar

YES

NO

NO

num_obs_below_mdl

numeric

YES

NO

NO

arithmetic_mean

numeric

YES

NO

NO

arithmetic_standard_dev

numeric

YES

NO

NO

c_1st_max_value

numeric

YES

NO

NO

c_1st_max_datetime

varchar

YES

NO

NO

c_2nd_max_value

varchar

YES

NO

NO

c_2nd_max_datetime

varchar

YES

NO

NO

c_3rd_max_value

varchar

YES

NO

NO

c_3rd_max_datetime

varchar

YES

NO

NO

c_4th_max_value

varchar

YES

NO

NO

c_4th_max_datetime

varchar

YES

NO

NO

c_1st_max_non_overlapping_value

varchar

YES

NO

NO

c_1st_no_max_datetime

varchar

YES

NO

NO

c_2nd_max_non_overlapping_value

varchar

YES

NO

NO

c_2nd_no_max_datetime

varchar

YES

NO

NO

c_99th_percentile

numeric

YES

NO

NO

c_98th_percentile

numeric

YES

NO

NO

c_95th_percentile

numeric

YES

NO

NO

c_90th_percentile

numeric

YES

NO

NO

c_75th_percentile

numeric

YES

NO

NO

c_50th_percentile

numeric

YES

NO

NO

c_10th_percentile

numeric

YES

NO

NO

local_site_name

varchar

YES

NO

NO

address

varchar

YES

NO

NO

state_name

varchar

YES

NO

NO

county_name

varchar

YES

NO

NO

city_name

varchar

YES

NO

NO

cbsa_name

varchar

YES

NO

NO

date_of_last_change

date

YES

NO

NO

monitor

varchar

YES

NO

NO

record

varchar

NO

NO

NO

Daily columns

COLUMN_NAME

TYPE_NAME

IS_NULLABLE

SCOPE_TABLE

IS_AUTOINCREMENT

IS_GENERATEDCOLUMN

state_code

varchar

YES

(null)

NO

NO

county_code

varchar

YES

(null)

NO

NO

site_num

varchar

YES

(null)

NO

NO

parameter_code

varchar

YES

(null)

NO

NO

poc

numeric

YES

(null)

NO

NO

latitude

numeric

YES

(null)

NO

NO

longitude

numeric

YES

(null)

NO

NO

datum

varchar

YES

(null)

NO

NO

parameter_name

varchar

YES

(null)

NO

NO

sample_duration

varchar

YES

(null)

NO

NO

pollutant_standard

varchar

YES

(null)

NO

NO

date_local

date

YES

(null)

NO

NO

units_of_measure

varchar

YES

(null)

NO

NO

event_type

varchar

YES

(null)

NO

NO

observation_count

numeric

YES

(null)

NO

NO

observation_percent

numeric

YES

(null)

NO

NO

arithmetic_mean

numeric

YES

(null)

NO

NO

c_1st_max_value

numeric

YES

(null)

NO

NO

c_1st_max_hour

numeric

YES

(null)

NO

NO

aqi

numeric

YES

(null)

NO

NO

method_code

varchar

YES

(null)

NO

NO

method_name

varchar

YES

(null)

NO

NO

local_site_name

varchar

YES

(null)

NO

NO

address

varchar

YES

(null)

NO

NO

state_name

varchar

YES

(null)

NO

NO

county_name

varchar

YES

(null)

NO

NO

city_name

varchar

YES

(null)

NO

NO

cbsa_name

varchar

YES

(null)

NO

NO

date_of_last_change

date

YES

(null)

NO

NO

monitor

varchar

YES

(null)

NO

NO

record

varchar

NO

(null)

NO

NO