Parser for FTS files that accompany CMS data from ResDAC

See also "What is FTS" for more details about FTS

The parser tries to recognize the type of a Medicare or Medicaid CMS file and to extract its metadata

Abstract class for CMS FTS file

class CMSFTS(type_of_data: str)[source]

Bases: object

Abstract class for Medicaid and Medicare files from dorieh.cms

Parameters:

type_of_data – Can be either ps for personal summary or ip for inpatient admissions data

common_indices = ['BENE_ID', 'FILE']
init(path: str)[source]
read_file(f)[source]
on_after_read_file(columns: List[FTSColumn])[source]

Callback function

Parameters:

columns – columns read from FTS file

Returns:

nothing

static add_record_column(columns: List[FTSColumn])[source]

Adds a RECORD column that uniquely identifies a record in the database. The column is of type SERIAL, i.e. auto-incremented

Parameters:

columns

Returns:

static add_file_column(columns: List[FTSColumn])[source]

Adds a column containing the name of the original file from which the data has been read

Parameters:

columns

Returns:

column_to_dict(c: FTSColumn) dict[source]

Returns a column as a dictionary object that can be added to YAML data model

Parameters:

c – a column as parsed from FTS

Returns:

dictionary

to_dict()[source]

Returns full metadata for the file as a dictionary to be included in the YAML data model used to generate DDL for the corresponding table

Returns:

dictionary

static v2i(v: str)[source]
to_fwf_meta(data_path: str) FWFMeta[source]

Returns metadata required to read the file if it is a fixed width file

Parameters:

data_path

Returns:

Metadata as required by FWF reader

print_yaml(root_dir: Optional[str] = None)[source]

Concrete subclass describing Medicare FTS file

class MedicareFTS(type_of_data: str)[source]

Bases: CMSFTS

Subclass describing Medicare data file (usually, FWF dat file)

Parameters:

type_of_data – Can be either ps for personal summary or ip for inpatient admissions data

init(fts_path: str)[source]
on_after_read_file(columns: List[FTSColumn])[source]

Callback function

Parameters:

columns – columns read from FTS file

Returns:

nothing

check_key_columns(columns: List[FTSColumn])[source]
add_indices(columns: List[FTSColumn])[source]

Concrete subclass describing Medicaid FTS file

class MedicaidFTS(type_of_data: str)[source]

Bases: CMSFTS

Subclass describing Medicaid data file (usually, CSV)

Parameters:

type_of_data – Can be either ps for personal summary or ip for inpatient admissions data

medicaid_indices = ['EL_DOB', 'EL_SEX_CD', 'EL_DOD', 'EL_RACE_ETHNCY_CD']
init(path: Optional[str] = None)[source]
on_after_read_file(columns: List[FTSColumn])[source]

Callback function

Parameters:

columns – columns read from FTS file

Returns:

nothing

Abstract class describing a column in a CMS data file

class FTSColumn(order, column, c_type, c_format, c_width, label)[source]

Bases: object

Metadata object for a column described in FTS file

A column can be either a CSV or fixed width (fwf) column

classmethod conv(i)[source]

Conversion function that should be applied to the i-th attribute of a column

Parameters:

i – attribute ordinal in the column description

Returns:

Callable function

analyze_format()[source]
to_sql_type()[source]

SQL type of the column

Returns:

SQL type of the column

get_description()[source]
to_dict()[source]
to_fwf_column(pos: int) FWFColumn[source]

Returns a description of a fixed width (fwf) column required to create a FWF reader

Parameters:

pos – starting position of the column in a record

Returns:

A descriptor for FWF column
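As a rough illustration of how a starting position relates to column widths in a fixed width record, the sketch below accumulates widths into (start, end) slices and cuts a line accordingly. The `Col` tuple, the helper functions and the sample record are hypothetical and are not part of dorieh; the actual FWFColumn descriptor may carry different fields and may count positions differently.

```python
from typing import Dict, List, NamedTuple, Tuple


class Col(NamedTuple):
    """Hypothetical stand-in for a column descriptor (not dorieh's FWFColumn)."""
    name: str
    width: int


def column_slices(columns: List[Col]) -> List[Tuple[str, int, int]]:
    """Return (name, start, end) per column, assuming 0-based start positions."""
    slices = []
    pos = 0
    for col in columns:
        # Each column starts where the previous one ended.
        slices.append((col.name, pos, pos + col.width))
        pos += col.width
    return slices


def split_record(line: str, columns: List[Col]) -> Dict[str, str]:
    """Cut one fixed width record into named, whitespace-trimmed values."""
    return {
        name: line[start:end].strip()
        for name, start, end in column_slices(columns)
    }


# Invented layout and record, for illustration only.
cols = [Col("BENE_ID", 6), Col("STATE", 2), Col("AGE", 3)]
record = split_record("A12345" + "MA" + " 67", cols)
```

Under these assumptions each column's starting position is simply the sum of the widths of the columns before it, which is why a single `pos` argument is enough to place a column in the record.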

Concrete subclass describing a column in a Medicare data file

class MedicareFTSColumn(order: int, long_name: str, short_name: str, type: str, start: int, width, desc: str)[source]

Bases: FTSColumn

Subclass for a column in Medicare files

nattrs = 7
classmethod conv(i)[source]

Conversion function that should be applied to the i-th attribute of a column

Parameters:

i – attribute ordinal in the column description

Returns:

Callable function

get_description()[source]

Concrete subclass describing a column in a Medicaid data file

class MedicaidFTSColumn(order, column, c_type, c_format, c_width, label)[source]

Bases: FTSColumn

Subclass for a column in Medicaid files

nattrs = 6

Concrete subclass describing a column not present in the original data but that should be generated in the database

class AliasColumn(alias: str, column: FTSColumn)[source]

Bases: FTSColumn

Subclass describing a column not present in the original data but that should be generated in the corresponding database table

to_dict()[source]

Helper Classes

class ColumnReader(constructor, pattern)[source]

Bases: object

Reads columns section of an FTS file

read(line)[source]
class ColumnAttribute(start: int, end: int, conv)[source]

Bases: object

Column attribute as read from FTS

arg(line: str)[source]
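The constructor signature suggests that an attribute is located by character positions in a line of the FTS columns section and then passed through a conversion function. A minimal sketch of that idea follows; `AttributeSketch`, the sample layout and the assumption that `end` is exclusive are all hypothetical and may not match the real ColumnAttribute.

```python
from typing import Any, Callable


class AttributeSketch:
    """Hypothetical analogue of a fixed-position attribute; not dorieh's class."""

    def __init__(self, start: int, end: int, conv: Callable[[str], Any]):
        self.start = start
        self.end = end  # assumed to be exclusive, as in Python slicing
        self.conv = conv

    def arg(self, line: str) -> Any:
        # Cut the attribute out of the line, trim padding, then convert it.
        return self.conv(line[self.start:self.end].strip())


# Invented layout: order in characters 0-3, name in 4-13, type in 14-18.
line = "  1 " + "BENE_ID   " + "CHAR "
attributes = [
    AttributeSketch(0, 4, int),
    AttributeSketch(4, 14, str),
    AttributeSketch(14, 19, str),
]
values = [a.arg(line) for a in attributes]
```

The resulting list of converted values could then be passed positionally to a column constructor, which is presumably the role of the `constructor` argument of ColumnReader.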

Helper Functions

mcr_type(file_name: str) str[source]

Tries to guess Medicare file type by its name

Parameters:

file_name – Name of the file

Returns:

string denoting file type

width(s: str)[source]

Parses width of a numeric column as described in FTS file

Parameters:

s – String from FTS file, describing column width

Returns:

A tuple, with the first element specifying the total column width and the second the number of digits after the decimal point
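The exact notation used for widths in FTS files is not shown here. Assuming a form such as "10" for a plain width or "10.2" for a width with two decimal digits, a parser could look like the sketch below. The name `parse_width` and the choice of `None` for a missing decimal part are assumptions, not the behavior of the actual function.

```python
from typing import Optional, Tuple


def parse_width(s: str) -> Tuple[int, Optional[int]]:
    """Parse a width such as "10" or "10.2" (assumed format) into (total, scale)."""
    text = s.strip()
    if "." in text:
        # The part before the dot is the total width, the part after it
        # the number of digits following the decimal point.
        total, scale = text.split(".", 1)
        return int(total), int(scale)
    # No decimal part given; this sketch reports it as None.
    return int(text), None
```

A pair like this maps naturally onto an SQL type such as NUMERIC(10, 2), which may be how the result is used when generating the column's SQL type.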