adios_db.data_sources package

Collection of file readers

Submodules

adios_db.data_sources.importer_base module

class adios_db.data_sources.importer_base.ImporterBase

Bases: object

Contains only the functionality that is common to all importer modules.

deep_get(obj, attr_path, default=None)
deep_set(obj, attr_path, value)

Navigate a period (‘.’) delimited path of attribute values into the oil data structure and set a value at that location in the structure.

Example paths:

sub_samples.0.metadata.sample_id (sub_samples is assumed to be a list, and we go to the zero index for that part)

physical_properties.densities.-1 (densities is assumed to be a list, goes to the last item)

physical_properties.densities.+ (appends an item to the densities list and goes to that part; the index value is assumed to be -1 in this case)
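The path rules above can be illustrated with a minimal sketch. It is not the package's implementation: it assumes the oil data structure is made of plain dicts and lists, it is written as free functions rather than methods, and the helper is_index is a hypothetical name introduced here.

```python
def is_index(part):
    """True for list-style path parts: '0', '-1', or the append marker '+'."""
    if part == "+":
        return True
    try:
        int(part)
        return True
    except ValueError:
        return False


def deep_set(obj, attr_path, value):
    """Set value at a '.'-delimited path inside nested dicts and lists."""
    parts = attr_path.split(".")
    current = obj
    for i, part in enumerate(parts):
        last = i == len(parts) - 1
        # What to store at this step: the final value, or a new container
        # whose type depends on the next path part (assumption of this sketch).
        if last:
            new_item = value
        else:
            new_item = [] if is_index(parts[i + 1]) else {}

        if isinstance(current, list):
            if part == "+":
                current.append(new_item)  # append, then continue at index -1
                index = -1
            else:
                index = int(part)
                if last:
                    current[index] = value
            current = current[index]
        else:
            if last:
                current[part] = value
            elif part not in current:
                current[part] = new_item
            current = current[part]


def deep_get(obj, attr_path, default=None):
    """Read the value at a '.'-delimited path, or default if it is missing."""
    current = obj
    for part in attr_path.split("."):
        try:
            if isinstance(current, list):
                current = current[int(part)]
            else:
                current = current[part]
        except (KeyError, IndexError, ValueError, TypeError):
            return default
    return current


oil = {}
deep_set(oil, "physical_properties.densities.+", {"value": 0.9})
deep_set(oil, "physical_properties.densities.-1.value", 0.95)
```

In this sketch, missing intermediate keys are created on the way down, while a missing list index raises IndexError; whether the real method behaves the same way is not stated in the description above.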

is_int(value)
slugify(label)

Generate a string that is suitable for use as an object attribute.

  • The strings will be snake-case, all lowercase words separated by underscores.

  • They will not start with a numeric digit. If the original label starts with a digit, the slug will be prepended with an underscore (‘_’).

Note: Some Unicode characters are handled in ways that may not be intuitive. Specifically, in German orthography, the grapheme ß is called Eszett or scharfes S (sharp S). It looks somewhat like a capital B to English readers, but converting it to ‘ss’ is a reasonable transliteration.
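The slug rules described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's own code: it relies on str.casefold() to map ß to ‘ss’, drops any remaining non-ASCII characters, and treats every run of non-alphanumeric characters as a single word break.

```python
import re
import unicodedata


def slugify(label):
    """Return a snake-case, attribute-safe version of label (sketch only)."""
    # casefold() lowercases the text and maps the German 'ß' to 'ss'.
    text = label.casefold()
    # Split accented characters into base letter + combining mark,
    # then drop everything that is not plain ASCII.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Collapse each run of non-alphanumeric characters into one underscore.
    text = re.sub(r"[^a-z0-9]+", "_", text).strip("_")
    # Attribute names cannot start with a digit, so prefix an underscore.
    if text and text[0].isdigit():
        text = "_" + text
    return text


print(slugify("Pour Point"))
print(slugify("2-Methylnaphthalene"))
print(slugify("Straße"))
```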

adios_db.data_sources.importer_base.date_only(func)
adios_db.data_sources.importer_base.join_with(separator)

Class method decorator to join a list of labels with a separator
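As a rough illustration of what such a decorator might look like, the sketch below wraps a method that returns a list of labels and joins them into one string. The wrapped method's contract, the ExampleParser class, and its location method are assumptions made for this example; they are not taken from the package.

```python
import functools


def join_with(separator):
    """Decorator factory: join the list returned by a method into a string."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            labels = func(self, *args, **kwargs)
            # Coerce each label to str so mixed-type lists can be joined.
            return separator.join(str(label) for label in labels)
        return wrapper
    return decorator


class ExampleParser:
    """Hypothetical class used only to demonstrate the decorator."""

    @join_with(", ")
    def location(self):
        return ["Alberta", "Canada"]


print(ExampleParser().location())
```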

adios_db.data_sources.importer_base.parse_single_datetime(date_str)
adios_db.data_sources.importer_base.parse_time(func)

Class method decorator to parse an attribute return value as a datetime

Note: Apparently there are a few records that just don’t have a sample date, so we can’t really enforce the presence of a date here.

Note: The April 2020 Env Canada datasheet has much more consistent date formats, but there are still some variations. Some formats that I have seen:

  • YYYY-MM-DD (most common)

  • YYYY (2 records)

  • YYYY-MM (5 records)

Fortunately, dateutil will handle these without problems.
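The behaviour described in these notes can be sketched as follows. The notes say the package relies on dateutil; to keep the example dependency-free, this sketch swaps in datetime.strptime with an explicit list of the three formats mentioned. The ExampleRecord class and the choice to return None for a missing date are assumptions made for illustration.

```python
import functools
from datetime import datetime

# The three formats mentioned in the note, most specific first.
DATE_FORMATS = ("%Y-%m-%d", "%Y-%m", "%Y")


def parse_single_datetime(date_str):
    """Parse one date string, trying each known format in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(date_str.strip(), fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {date_str!r}")


def parse_time(func):
    """Decorator: parse a method's return value as a datetime."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        raw = func(self, *args, **kwargs)
        # Some records have no sample date, so a missing value passes through
        # as None instead of raising (assumption of this sketch).
        if raw is None or not str(raw).strip():
            return None
        return parse_single_datetime(str(raw))
    return wrapper


class ExampleRecord:
    """Hypothetical record wrapper used only to demonstrate the decorator."""

    def __init__(self, sample_date):
        self._sample_date = sample_date

    @parse_time
    def sample_date(self):
        return self._sample_date


print(ExampleRecord("2020-04").sample_date())
```

With strptime, the missing month and day default to 1, so a year-only value resolves to January 1 of that year.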

adios_db.data_sources.mapper module

class adios_db.data_sources.mapper.MapperBase

Bases: ImporterBase

compound(name, measurement, method=None, groups=None, sparse=False)

Example of content:

{
    'name': "1-Methyl-2-Isopropylbenzene",
    'method': "ESTS 2002b",
    'groups': ["C4-C6 Alkyl Benzenes", ...],
    'measurement': {
        'value': 3.4,
        'unit': "ppm",
        'unit_type': "Mass Fraction",
        'replicates': 3,
        'standard_deviation': 0.1
    }
}
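A structure like the one above could be assembled by helpers along these lines. This is only a sketch: the functions are written as free functions rather than MapperBase methods, dropping unset optional fields is an assumption, and the sparse parameter is left out because its meaning is not described here.

```python
def measurement(value, unit, unit_type=None,
                standard_deviation=None, replicates=None):
    """Build a measurement dict, omitting optional fields that are unset."""
    fields = {
        "value": value,
        "unit": unit,
        "unit_type": unit_type,
        "standard_deviation": standard_deviation,
        "replicates": replicates,
    }
    return {key: val for key, val in fields.items() if val is not None}


def compound(name, measurement, method=None, groups=None):
    """Build a compound dict that wraps a measurement dict."""
    result = {"name": name, "measurement": measurement}
    if method is not None:
        result["method"] = method
    if groups is not None:
        result["groups"] = list(groups)
    return result


example = compound(
    "1-Methyl-2-Isopropylbenzene",
    measurement(3.4, "ppm", unit_type="Mass Fraction",
                standard_deviation=0.1, replicates=3),
    method="ESTS 2002b",
    groups=["C4-C6 Alkyl Benzenes"],
)
print(example)
```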
measurement(value, unit, unit_type=None, standard_deviation=None, replicates=None)
min_max(value)
classmethod slugify(label)

Generate a string that is suitable for use as an object attribute.

  • The strings will be snake-case, all lowercase words separated by underscores.

  • They will not start with a numeric digit. If the original label starts with a digit, the slug will be prepended with an underscore (‘_’).

Note: Some Unicode characters are handled in ways that may not be intuitive. Specifically, in German orthography, the grapheme ß is called Eszett or scharfes S (sharp S). It looks somewhat like a capital B to English readers, but converting it to ‘ss’ is a reasonable transliteration.

Note: This function is duplicated in the mapper. Perhaps it should live in a base class shared by all the importer types.

adios_db.data_sources.parser module

class adios_db.data_sources.parser.ParserBase(values)

Bases: ImporterBase

Contains only the functionality that is common to all parsers.

adios_db.data_sources.reader module

Base classes for various different readers that we may want to use.

class adios_db.data_sources.reader.CsvFile(name, field_delim=',', encoding='mac_roman')

Bases: object

A generalized file reader for comma-separated values (.csv) flat data files. Despite the name, this type of file can use separator characters other than commas for its rows and fields.
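To illustrate the point about separators and encodings, here is a minimal sketch that uses the standard csv module rather than CsvFile itself. The function read_delimited and the sample data are hypothetical; only the field_delim and encoding defaults are taken from the signature above.

```python
import csv
import io


def read_delimited(data, field_delim=",", encoding="mac_roman"):
    """Decode raw bytes and split them into rows of string fields."""
    text = io.StringIO(data.decode(encoding), newline="")
    return list(csv.reader(text, delimiter=field_delim))


# Tab-separated sample data, encoded the way the default encoding expects.
raw = "Name\tAPI\nPétrole Léger\t36.8\n".encode("mac_roman")
rows = read_delimited(raw, field_delim="\t")
print(rows)
```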

convert_field(field)

Convert data fields to numeric if possible
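One plausible way to do this conversion is sketched below, written as a free function: it tries int first, then float, and otherwise returns the field unchanged. The order of attempts and the handling of surrounding whitespace are assumptions, not the documented behaviour of CsvFile.

```python
def convert_field(field):
    """Return field as an int or float if it parses as one, else unchanged."""
    text = field.strip()
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    # Not numeric: keep the original string, whitespace included.
    return field


def convert_fields(row):
    """Apply convert_field to every field in a row."""
    return [convert_field(field) for field in row]


print(convert_fields(["Alberta Sweet", "36", "0.839", ""]))
```

Note that float() also accepts strings such as "nan" and "inf", so a stricter reader might want to exclude those explicitly.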

convert_fields(row)
export(filename)
init_field_names()
readline()
readlines()
rewind()