The loader_config Module

Domain Loader Configurator

Configures the loading of a single column-formatted file, or a set of such files, into the NSAPH PostgreSQL database. Input (source) files can be in either FST or CSV format.

The configurator assumes that the database schema is defined in a YAML or JSON file. A separate tool is available to introspect source files and infer a possible database schema.

class Parallelization(*values)

lines = 'lines'

files = 'files'

none = 'none'
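As an illustration of how these three values might be consumed, the sketch below re-declares the enumeration locally (rather than importing the module) and wires it to a hypothetical `--parallelization` command-line flag. The flag spelling and the default are assumptions made for the example, not taken from the module.

```python
import argparse
from enum import Enum


class Parallelization(Enum):
    # Local re-declaration mirroring the documented members
    lines = "lines"
    files = "files"
    none = "none"


# Hypothetical flag; the real option spelling and default may differ
parser = argparse.ArgumentParser()
parser.add_argument(
    "--parallelization",
    type=Parallelization,          # converts "files" to Parallelization.files
    choices=list(Parallelization),
    default=Parallelization.none,  # assumed default, for illustration only
)

args = parser.parse_args(["--parallelization", "files"])
selected = args.parallelization
```

Passing the enumeration itself as `type` lets argparse reject any string that is not one of the three documented values.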

class DataLoaderAction(*values)

drop = 'drop'

load = 'load'

insert = 'insert'

print = 'print'

classmethod new(value: str)
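The behaviour of `new` is not described here. A minimal sketch, assuming it converts a string into the matching member, could look like the following. The case-insensitive lookup and the pass-through of `None` are assumptions of this example, and the class is a local stand-in rather than the module's own.

```python
from enum import Enum
from typing import Optional


class DataLoaderAction(Enum):
    # Local stand-in mirroring the documented members
    drop = "drop"
    load = "load"
    insert = "insert"
    print = "print"

    @classmethod
    def new(cls, value: Optional[str]) -> Optional["DataLoaderAction"]:
        # Assumed behaviour: None means "no action requested"
        if value is None:
            return None
        # Assumed behaviour: match member values case-insensitively;
        # an unknown string raises ValueError from the Enum lookup
        return cls(value.lower())


action = DataLoaderAction.new("LOAD")
no_action = DataLoaderAction.new(None)
```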

class LoaderConfig(doc)

Configurator class for data loader

Creates a new object

Parameters:

subclass – A concrete class containing configuration information. Configuration options must be defined as class members whose names start with a single ‘_’ character and whose values are instances of the Argument class.
description – Optional text to use as the description. If not specified, it is extracted from the subclass documentation.

action: DataLoaderAction | None: If this option is given, then the whole domain schema will be dropped

data: Path to a data file or directory. Can be a single CSV, gzipped CSV or FST file or a directory recursively containing CSV files. Can also be a tar, tar.gz (or tgz) or zip archive containing CSV files

reset: Force recreating table(s) if it/they already exist

page: Explicit page size for the database

log: Explicit interval for logging

limit: Load at most specified number of records

buffer: Buffer size for converting fst files

threads: Number of threads writing into the database

parallelization: Type of parallelization, if any

pattern: pattern for files in a directory or an archive, e.g., “**/maxdata_*_ps_*.csv”

incremental: Commit every file and skip over files that have already been ingested

sloppy: Do not update existing tables and views

validate(attr, value)

Subclasses can override this method to implement custom handling of command-line arguments.

Parameters:

attr – Command-line argument name
value – Value returned by argparse

Returns:

value to use
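A minimal sketch of such an override follows. `BaseConfig` is a hypothetical stand-in for the documented configurator class, assumed here to return the value unchanged by default. The checks on `threads` and `pattern` are illustrative and not part of the module.

```python
class BaseConfig:
    """Hypothetical stand-in for the documented configurator class."""

    def validate(self, attr, value):
        # Assumed default behaviour: accept the value unchanged
        return value


class StrictConfig(BaseConfig):
    def validate(self, attr, value):
        # Reject a non-positive thread count
        if attr == "threads" and value is not None and value < 1:
            raise ValueError("threads must be a positive integer")
        # Normalize an empty pattern to None
        if attr == "pattern" and value == "":
            return None
        # Defer to the base class for every other argument
        return super().validate(attr, value)


config = StrictConfig()
threads = config.validate("threads", 4)
pattern = config.validate("pattern", "")
```

Returning the (possibly modified) value rather than mutating state keeps the override consistent with the documented contract that `validate` returns the value to use.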