adios_db.data_sources.noaa_fm package

Submodules

adios_db.data_sources.noaa_fm.mapper module

class adios_db.data_sources.noaa_fm.mapper.OilLibraryAttributeMapper(record)

Bases: MapperBase

A translation/conversion layer for the NOAA FileMaker imported record object.

This is intended to be used interchangeably with either a NOAA FileMaker record or a record parser object. Its purpose is to generate named attributes that are suitable for creating a NOAA Oil Database record.

property adhesion

The parser exposes adhesion as a simple float, and we would like to reshape it as a value/unit pair without changing its name, so we redefine it in the mapper.

Note: We don’t really know what the adhesion units are for the NOAA FileMaker records. Need to ask JeffL.

Based on the range of numbers seen so far, the values appear to be in Pascals (N/m^2).
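As a rough illustration of the value/unit reshaping described above, the sketch below wraps a bare float in a value/unit mapping. The helper name adhesion_as_value_unit, the dictionary keys, and the 'N/m^2' default are assumptions made for this example; the real unit is still unconfirmed, and the mapper may build a different object.

```python
def adhesion_as_value_unit(adhesion, unit='N/m^2'):
    """Wrap a bare float in a value/unit mapping (hypothetical helper).

    The default unit is only a guess based on the note above.
    """
    if adhesion is None:
        return None
    return {'value': float(adhesion), 'unit': unit}


reshaped = adhesion_as_value_unit(0.035)
```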

property bulk_composition

Tentative Bulk Composition items:

  • Water Content Emulsion

  • Wax Content

  • Sulfur (unit=1 possibly, 0.0104 for Alaska North Slope)

  • Naphthenes (units=???, typical value=0.0004 for Jet A-1)

  • Paraffins (unit=???, 0.783 for Alberta 1992, 0.019 for Salmon Oil & Gas)

  • Nickel (unit=ppm most likely)

  • Vanadium (unit=ppm most likely)

  • Polars (unit=1 possibly, 0.0284 for Alberta 1992)

property compounds

Tentative Compound items:

  • benzene (units=???, typical value=0.05 for gasoline, just a fractional value maybe?)

property distillation_data
environmental_behavior(weathering)

Notes:

  • There is a dispersability_temp_k attribute, but it does not fit the oil model: sample.environmental_behavior.dispersibilities describes chemical dispersibility with a dispersant and makes no reference to a temperature.

filter_weathered_attr(attr, weathering)
fresh_sample_props = ('SARA', 'distillation_data', 'compounds', 'bulk_composition', 'industry_properties')
generate_sample_id_attrs(sample_id)
property industry_properties

Industry Property items:

  • Reid Vapor Pressure (min/max/avg = 0/0.81/0.295, probably bars)

  • Conradson Crude (min/max/avg = 0.0054/0.12/0.035, probably just a fractional value)

  • Conradson Residuum (one value, 0.0019, probably just a fractional value)

property metadata
oil_props = ('oil_id', 'metadata', 'status', 'sub_samples')
physical_properties(weathering)
py_json()
sample(weathering)
property sub_samples
weathered_sample_props = ('physical_properties', 'environmental_behavior')

adios_db.data_sources.noaa_fm.parser module

class adios_db.data_sources.noaa_fm.parser.OilLibraryRecordParser(values, file_props)

Bases: ParserBase

A record parsing class for the NOAA Oil Library spreadsheet.

  • We manage a list of properties extracted from an Excel row for an oil.

  • The raw data from the Excel file will be a flat list, even for multidimensional properties like densities, viscosities, and distillation cuts.

property API
property SARA
property conradson
property cut_units
property cuts
property densities
property dynamic_viscosities
property emulsions

Oil Library records have some attributes related to emulsions:

  • emuls_constant_min: Zero percent emulsion weathered amount

  • emuls_constant_max: Max percent emulsion weathered amount

  • water_content_emulsion: water content at max weathered

Notes:

  • Age will be set to the day of formation

  • Temperature will be set to 15C (288.15K)

property flash_point
property flash_point_max_k
property flash_point_min_k
get_property_sets(num_sets, obj_name, obj_argnames, required_obj_args)

Generalized method of getting lists of data sets out of our record.

Since our data source is a single fixed row of data, there will be a number of fixed subsets of object data attributes, but they may or may not be filled with data. For these property sets, the column names are organized with the following naming convention:

<attr><instance>_<sub_attr>

<attr>

The name of the attribute list.

<instance>

An index in the range [1…N+1] where N is the number of instances in the list.

<sub_attr>

The name of an attribute contained within an instance of the list.

Basically we will return a set of object properties for each instance that contains a defined set of required argument attributes.
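The naming convention above can be sketched as follows. This is a minimal illustration, not the parser's implementation: the helper name collect_property_sets, its signature, and the column names in the sample row are all assumptions made for the example.

```python
import re
from collections import defaultdict


def collect_property_sets(row, attr, required):
    """Group flat '<attr><instance>_<sub_attr>' columns by instance.

    Hypothetical helper: 'row' maps column names to values, 'attr' is the
    attribute list name, and 'required' names the sub-attributes that an
    instance must define in order to be returned.
    """
    pattern = re.compile(rf'^{re.escape(attr)}(\d+)_(\w+)$')
    instances = defaultdict(dict)

    for column, value in row.items():
        match = pattern.match(column)
        if match and value is not None:
            instances[int(match.group(1))][match.group(2)] = value

    # Keep only the instances that define every required sub-attribute.
    return [props for _, props in sorted(instances.items())
            if all(name in props for name in required)]


# Made-up column names, for illustration only.
row = {
    'density1_kg_m_3': 870.0, 'density1_ref_temp_k': 288.15,
    'density2_kg_m_3': None, 'density2_ref_temp_k': 273.15,
    'name': 'Example Oil',
}
density_sets = collect_property_sets(row, 'density', ('kg_m_3', 'ref_temp_k'))
```

Here the second instance is dropped because one of its required sub-attributes is empty, which mirrors the idea that a fixed subset of columns may or may not be filled with data.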

property interfacial_tension_air
property interfacial_tension_seawater
property interfacial_tension_water
property interfacial_tensions
property kinematic_viscosities
property name
property oil_class
property oil_id
property pour_point
property pour_point_max_k
property pour_point_min_k
property preferred_oils
property product_type
property reference

The reference content can have:

  • no content: In this case we take the created date of the .csv file header.

  • one year (YYYY): In this case we parse the year as an int and form a datetime with it.

  • multiple years (YYYY): In this case we use the highest numeric year (most recent) and form a datetime with it.
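The three cases above could be handled roughly as in this sketch. The helper name reference_date, the regular expression used to find four-digit years, and the choice of January 1st for the resulting datetime are assumptions for illustration; the parser may differ in each of these details.

```python
import re
from datetime import datetime


def reference_date(reference, file_created):
    """Pick a datetime for a reference string (hypothetical helper).

    'file_created' stands in for the created date taken from the file header.
    """
    years = [int(year) for year in re.findall(r'\b(\d{4})\b', reference or '')]

    if not years:
        return file_created  # no content: fall back to the file's date

    # One year or several: use the most recent one.
    return datetime(max(years), 1, 1)


created = datetime(2017, 4, 10)
no_year = reference_date('', created)
one_year = reference_date('Example Lab, 1992.', created)
many_years = reference_date('Example Lab, 1992; revised 1996.', created)
```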

property source_id
property sulfur
property synonyms

Synonyms is a single string field that contains a comma-separated list of substring names.
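A minimal sketch of splitting such a field, assuming that surrounding whitespace and empty entries should be discarded (the helper name split_synonyms is hypothetical):

```python
def split_synonyms(synonyms):
    """Split a comma-separated synonyms string into clean names (hypothetical)."""
    if not synonyms:
        return []
    return [name.strip() for name in synonyms.split(',') if name.strip()]


names = split_synonyms('ANS, Alaska North Slope ,')
```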

property toxicities
property weathering

A NOAA FileMaker record is a flat row of data, but some attributes have a weathering amount associated with their measured values. These attributes are:

  • Density

  • KVis

  • Dvis

In addition to these weathered attributes, the emulsion constant attributes are applied in the context of weathered samples.

  • The min emulsification constant is Emuls_Constant_Min. Its value is a weathered amount.

  • The max emulsification constant is Emuls_Constant_Max. Its value is a weathered amount.

All other attributes should be implicitly regarded as fresh oil measurements.
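One way to picture this is to group the weathered measurements by their weathering amount and treat anything without one as fresh. The sketch below assumes each measurement is a dict with an optional 'weathering' key; that key, the other field names, and the helper name group_by_weathering are illustrative assumptions, not the parser's actual data structures.

```python
from collections import defaultdict


def group_by_weathering(measurements):
    """Group measurements by weathering amount (hypothetical helper).

    A missing or empty weathering value is treated as fresh oil (0.0).
    """
    grouped = defaultdict(list)
    for measurement in measurements:
        grouped[measurement.get('weathering') or 0.0].append(measurement)
    return dict(sorted(grouped.items()))


# Made-up density measurements, for illustration only.
densities = [
    {'kg_m_3': 870.0, 'ref_temp_k': 288.15, 'weathering': 0.0},
    {'kg_m_3': 895.0, 'ref_temp_k': 288.15, 'weathering': 0.15},
    {'kg_m_3': 872.0, 'ref_temp_k': 273.15},
]
by_weathering = group_by_weathering(densities)
```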

adios_db.data_sources.noaa_fm.reader module

exception adios_db.data_sources.noaa_fm.reader.ImportFileHeaderContentError

Bases: Exception

exception adios_db.data_sources.noaa_fm.reader.ImportFileHeaderLengthError

Bases: Exception

class adios_db.data_sources.noaa_fm.reader.OilLibraryCsvFile(name, field_delim='\t', ignore_version=False, encoding='mac_roman')

Bases: object

A specialized file reader for the OilLib and CustLib flat datafiles.

  • We will use universal newline support to designate a line of text.

  • Additionally, each line contains a number of fields separated by a tab (‘\t’). In this way it attempts to represent tabular data.

  • The first line in the file contains a file format version number (‘N.N’), followed by a date (‘d/m/YY’), and finally the product (‘adios’).

  • The second line in the file contains a table header, where each field represents the “long” name of a tabular column.

  • The rest of the lines in the file contain table data.

TODO: I just noticed that we are not making use of the .csv utility class. We need to refactor this to use it.
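The file layout described above can be sketched as follows. This is an illustration of the layout only, operating on in-memory text instead of a file; the helper name parse_oil_library_text, the returned dictionary keys, and the sample column names and values are assumptions made for the example.

```python
def parse_oil_library_text(text, field_delim='\t'):
    """Split flat-file text into file properties, header, and rows.

    Hypothetical helper; it does no version or header validation.
    """
    lines = text.splitlines()  # accepts \n, \r, and \r\n line endings

    # First line: format version, date, product.
    version, created, product = lines[0].split(field_delim)[:3]

    # Second line: the "long" column names.
    header = lines[1].split(field_delim)

    # Remaining lines: table data, one record per line.
    rows = [dict(zip(header, line.split(field_delim)))
            for line in lines[2:] if line.strip()]

    props = {'version': version, 'created': created, 'product': product}
    return props, header, rows


# Made-up sample content, for illustration only.
sample = ('2.0\t4/10/17\tadios\r'
          'Oil Name\tADIOS Oil ID\tAPI\r'
          'Example Oil\tAD00001\t29.9\r')
file_props, header, rows = parse_oil_library_text(sample)
```

All field values come back as strings here; converting them to numbers is a separate step.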

convert_field(field)

Convert data fields to numeric if possible
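A minimal sketch of this kind of conversion, assuming that integers are tried before floats and that anything else is returned unchanged (the helper name to_numeric_if_possible is hypothetical, and the real method may treat empty or special values differently):

```python
def to_numeric_if_possible(field):
    """Return an int or float when the text parses as one, else the input."""
    for cast in (int, float):
        try:
            return cast(field)
        except (TypeError, ValueError):
            continue
    return field


converted = [to_numeric_if_possible(f) for f in ('42', '29.9', 'Example Oil', '')]
```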

convert_fields(row)
export(filename)
property file_props
get_record(oil_id)
get_records()

This is the API that the oil import processes expect

readline(cache=True)
readlines()
rewind()

adios_db.data_sources.noaa_fm.scoring module

This is the “Quality Score” code from the old OilLibrary.

We are not sure whether we will score records again; if we do, the approach will certainly be different. Some of this code may still be helpful.

class adios_db.data_sources.noaa_fm.scoring.ImportedRecordWithScore(imported_rec)

Bases: object

aggregate_score(Q_i, w_i=None)

General method for aggregating a number of sub-scores. We implement a weighted average for this.
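A weighted average of sub-scores can be sketched like this. The function name weighted_average and the choice of equal weights when w_i is omitted are assumptions for illustration; the method's actual default behavior is not documented here.

```python
def weighted_average(Q_i, w_i=None):
    """Combine sub-scores Q_i using weights w_i (hypothetical sketch)."""
    if w_i is None:
        w_i = [1.0] * len(Q_i)  # assumed default: all sub-scores count equally

    if len(Q_i) != len(w_i):
        raise ValueError('Q_i and w_i must have the same length')

    return sum(q * w for q, w in zip(Q_i, w_i)) / sum(w_i)


equal = weighted_average([1.0, 0.0])
weighted = weighted_average([1.0, 0.0], [3.0, 1.0])
```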

score()
score_api()
score_cuts()
score_demographics()
score_densities()
score_emulsion_constants()
score_flash_point()
score_interfacial_tensions()
score_pour_point()
score_sara_fractions()
score_toxicities()
score_viscosities()