The task Module

This module contains classes used to define, schedule and execute long-running computations used by gridMET pipelines.

count_lines(f)[source]
quote(s: str) → str[source]
class Parallel(value)[source]

An enumeration.

points = 'points'
bands = 'bands'
days = 'days'
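
As a minimal sketch, the enumeration above can be reproduced with the standard enum module. The member names and values are copied from the listing; the plain Enum base class is an assumption, and the real class may differ.

```python
from enum import Enum


class Parallel(Enum):
    """Sketch of the enumeration; values copied from the listing above."""
    points = "points"
    bands = "bands"
    days = "days"


# Look a member up by its string value, e.g. when parsing a configuration option.
mode = Parallel("days")
print(mode.name, mode.value)
```

Looking a member up by value raises ValueError for an unknown string, which gives a cheap validation step when the value comes from user input.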
class Collector[source]
abstract writerow(data: List)[source]
flush()[source]
class CSVWriter(out_stream)[source]
writerow(row: List)[source]
flush()[source]
class ListCollector[source]
writerow(data: List)[source]
get_result()[source]
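
The three collector classes above share one interface: rows are pushed with writerow() and buffered output is released with flush(). The following is a hedged sketch of how such a hierarchy could look, not the module's actual implementation; the attribute names (out, writer, collection) and the no-op default flush() are assumptions.

```python
import csv
import io
from abc import ABC, abstractmethod
from typing import List


class Collector(ABC):
    """Receives result rows one at a time."""

    @abstractmethod
    def writerow(self, data: List):
        ...

    def flush(self):
        # Assumed default: nothing is buffered, so there is nothing to flush.
        pass


class CSVWriter(Collector):
    """Writes each row to a text stream as a CSV record."""

    def __init__(self, out_stream):
        self.out = out_stream
        self.writer = csv.writer(out_stream)

    def writerow(self, row: List):
        self.writer.writerow(row)

    def flush(self):
        self.out.flush()


class ListCollector(Collector):
    """Keeps rows in memory so that the caller can retrieve them later."""

    def __init__(self):
        self.collection = []

    def writerow(self, data: List):
        self.collection.append(data)

    def get_result(self):
        return self.collection


# The same producer code works with either collector.
buffer = io.StringIO()
for collector in (CSVWriter(buffer), ListCollector()):
    collector.writerow(["2020-01-01", "01001", 12.5])
    collector.flush()
```

Because the producer only depends on writerow() and flush(), the same computation can write straight to a file or accumulate rows in memory, which is convenient for testing.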
class ComputeGridmetTask(year: int, variable: GridmetVariable, infile: str, outfile: str, date_filter=None, ram: int = 0)[source]

An abstract class for a computational task that processes data in Unidata netCDF (Version 4) format.

Parameters:
  • ram

  • date_filter

  • year – year

  • variable – gridMET band (variable)

  • infile – File with source data in NCDF4 format

  • outfile – Resulting CSV file

origin = datetime.date(1900, 1, 1)
classmethod get_variable(dataset: Dataset, variable: GridmetVariable)[source]
abstract get_key()[source]
get_days() → Dict[source]
prepare() → Dict[source]
on_prepare()[source]
execute(mode: str = 'wt')[source]

Executes computational task

Parameters:

mode (str) – mode to use opening result file

Returns:

collect_data(days: Dict, collector: Collector)[source]
abstract compute_one_day(writer: Collector, day, layer)[source]

Computes required statistics for a single day. This method is called by execute() and is implemented in specific subclasses

Parameters:
  • writer – CSV Writer to output the result

  • day – day

  • layer – layer, corresponding to the day

Returns:

Nothing
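
The division of labour described above (a base class iterates over days, subclasses implement the per-day computation) is a template-method pattern. The sketch below illustrates it with hypothetical stand-ins: DailyTask, run_all_days and MeanTask are invented for illustration, and a plain list replaces the Collector.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class DailyTask(ABC):
    """Hypothetical stand-in for the abstract task; not the real class."""

    def run_all_days(self, days: Dict, writer: List) -> None:
        # Visit days in order and delegate the per-day work to the subclass.
        for day in sorted(days):
            self.compute_one_day(writer, day, days[day])

    @abstractmethod
    def compute_one_day(self, writer: List, day, layer):
        ...


class MeanTask(DailyTask):
    """Example subclass: emits one row per day with the mean of the layer."""

    def compute_one_day(self, writer: List, day, layer):
        writer.append([day, sum(layer) / len(layer)])


rows: List = []
MeanTask().run_all_days({2: [4.0, 6.0], 1: [1.0, 3.0]}, rows)
print(rows)
```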

to_date(day) → date[source]
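
A minimal sketch of the conversion that to_date() presumably performs, assuming that day is a count of days elapsed since the origin attribute (1900-01-01). That interpretation is an assumption based on the attribute value and is not confirmed by this page.

```python
from datetime import date, timedelta

# Same value as the `origin` class attribute listed above.
ORIGIN = date(1900, 1, 1)


def to_date(day) -> date:
    """Assumed behaviour: `day` counts days elapsed since ORIGIN."""
    return ORIGIN + timedelta(days=int(day))


# Round trip: compute the offset of a known date, then convert it back.
offset = (date(2020, 1, 1) - ORIGIN).days
print(offset, to_date(offset))
```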
class ComputeShapesTask(year: int, variable: GridmetVariable, infile: str, outfile: str, strategy: RasterizationStrategy, shapefile: str, geography: Geography, date_filter=None, ram=0)[source]

Class describes a compute task to aggregate data over geography shapes

The data is expected in Unidata netCDF (Version 4) format: https://www.unidata.ucar.edu/software/netcdf/

Parameters:
  • ram

  • date_filter

  • year – year

  • variable – gridMET band (variable)

  • infile – File with source data in NCDF4 format

  • outfile – Resulting CSV file

  • strategy – Rasterization strategy to use

  • shapefile – Shapefile for the collection of geographies being used

  • geography – Type of geography, e.g. zip code or county

on_prepare()[source]
get_key()[source]
compute_one_day(writer: Collector, day, layer)[source]

Computes required statistics for a single day. This method is called by execute() and is implemented in specific subclasses

Parameters:
  • writer – CSV Writer to output the result

  • day – day

  • layer – layer, corresponding to the day

Returns:

Nothing

class ComputePointsTask(year: int, variable: GridmetVariable, infile: str, outfile: str, points_file: str, coordinates: List, metadata: List, date_filter=None, ram=0)[source]

Class describes a compute task to assign data to a collection of points

The data is expected in Unidata netCDF (Version 4) format: https://www.unidata.ucar.edu/software/netcdf/

Parameters:
  • ram

  • year – year

  • variable – gridMET band (variable)

  • infile – File with source data in NCDF4 format

  • outfile – Resulting CSV file

  • points_file – Path to a CSV file containing the coordinates of the points

  • coordinates – A two-element list of column names in the CSV file that correspond to the coordinates

  • metadata – A list of column names in the CSV file that should be interpreted as metadata (e.g. ZIP, site_id, etc.)

force_standard_api = False
get_key()[source]
on_prepare()[source]
make_point(row) → PointInRaster[source]
execute(mode: str = 'w') → None[source]

Executes computational task

Parameters:

mode (str) – mode to use opening result file

Returns:

compute_one_day(writer: Collector, day, layer)[source]

Computes required statistics for a single day. This method is called by execute() and is implemented in specific subclasses

Parameters:
  • writer – CSV Writer to output the result

  • day – day

  • layer – layer, corresponding to the day

Returns:

Nothing

class DownloadGridmetTask(year: int, variable: GridmetVariable, destination: str)[source]

Task to download source file in NCDF4 format

Parameters:
  • year – year

  • variable – gridMET band (variable)

  • destination – Destination directory for all downloads

BLOCK_SIZE = 65536
classmethod get_url(year: int, variable: GridmetVariable) → str[source]

Constructs URL given a year and band

Parameters:
  • year – year

  • variable – gridMET band (variable)

Returns:

URL for download

target()[source]
Returns:

File path for downloaded data

execute()[source]

Executes the task

Returns:

None
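
The BLOCK_SIZE attribute suggests that the download is copied in fixed-size chunks instead of being read into memory at once. A hedged sketch of such a loop follows; copy_in_blocks is a hypothetical helper, and in-memory streams stand in for the HTTP response and the destination file.

```python
import io

BLOCK_SIZE = 65536  # same value as the class attribute listed above


def copy_in_blocks(source, target, block_size: int = BLOCK_SIZE) -> int:
    """Copy a binary stream block by block; return the number of bytes copied."""
    total = 0
    while True:
        block = source.read(block_size)
        if not block:
            # An empty read signals the end of the stream.
            break
        target.write(block)
        total += len(block)
    return total


# In-memory streams stand in for the network response and the output file.
payload = bytes(200_000)
destination = io.BytesIO()
copied = copy_in_blocks(io.BytesIO(payload), destination)
print(copied)
```

Copying in blocks keeps memory use bounded by the block size regardless of how large the source file is.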

class GridmetTask(context: GridMETContext, year: int, variable: GridmetVariable)[source]

Defines a task to download and process data for a single year and variable. Instances of this class can be used to parallelize processing.

Parameters:
  • context – Configuration object for the pipeline

  • year – year

  • variable – gridMET band (variable)

classmethod destination_file_name(context: GridMETContext, year: int, variable: GridmetVariable)[source]

Constructs a file name for a given set of parameters

Parameters:
  • context – Configuration object for the pipeline

  • year – year

  • variable – gridMET band (variable)

Returns:

variable_geography_year.csv[.gz]
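
The naming pattern above can be illustrated with a small standalone function. This is a sketch under assumptions: the real method takes a GridMETContext, whereas here the geography and a hypothetical compress flag are passed directly, and the argument values are examples only.

```python
def destination_file_name(variable: str, geography: str, year: int,
                          compress: bool = True) -> str:
    """Build a name of the form variable_geography_year.csv[.gz]."""
    name = f"{variable}_{geography}_{year}.csv"
    # The .gz suffix is appended only when compression is requested.
    return name + ".gz" if compress else name


print(destination_file_name("tmmx", "county", 2020))
print(destination_file_name("tmmx", "county", 2020, compress=False))
```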

classmethod find_shape_file(context: GridMETContext, year: int, shape: Shape)[source]

Finds shapefile for a given type of geographies for the closest available year

Parameters:
  • context – Configuration object for the pipeline

  • year – year

  • shape – Shape type

Returns:

A shapefile for the given year if it exists; otherwise, the shapefile for the latest available year before the given one
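
The year-selection rule can be sketched independently of the file system. closest_year is a hypothetical helper; returning None when no earlier year exists is an assumption, since this page does not say what the real method does in that case.

```python
from typing import Iterable, Optional


def closest_year(available: Iterable[int], year: int) -> Optional[int]:
    """Return `year` if available, else the latest earlier year, else None."""
    candidates = [y for y in available if y <= year]
    return max(candidates) if candidates else None


years_with_shapefiles = [2010, 2015, 2020]  # example data
print(closest_year(years_with_shapefiles, 2017))
```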

execute()[source]

Executes the task. First, the download subtask is executed unless the corresponding file has already been downloaded. Then the compute tasks are executed.

Returns:

None
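
The control flow described for execute() (download only when the source file is missing, then run the compute tasks) can be sketched as follows. All names here (run_task, fake_download, fake_compute) are hypothetical, and the callables stand in for the real download and compute task objects.

```python
import os
import tempfile
from typing import Callable, Iterable, List


def run_task(target: str, download: Callable[[str], None],
             compute_tasks: Iterable[Callable[[str], None]]) -> None:
    # Skip the download when the source file is already present.
    if not os.path.isfile(target):
        download(target)
    for compute in compute_tasks:
        compute(target)


log: List[str] = []


def fake_download(path: str) -> None:
    log.append("download")
    with open(path, "wb") as f:
        f.write(b"data")


def fake_compute(path: str) -> None:
    log.append("compute")


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "source.nc")
    run_task(target, fake_download, [fake_compute])  # downloads, then computes
    run_task(target, fake_download, [fake_compute])  # file exists: computes only
print(log)
```

Making the download conditional on the file's presence lets a failed or interrupted pipeline be rerun without fetching data it already has.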