adios_db.data_sources.noaa_fm package
Submodules
adios_db.data_sources.noaa_fm.mapper module
- class adios_db.data_sources.noaa_fm.mapper.OilLibraryAttributeMapper(record)
Bases:
MapperBase
A translation/conversion layer for the NOAA FileMaker imported record object.
This is intended to be used interchangeably with either a NOAA FileMaker record or a record parser object. Its purpose is to generate named attributes that are suitable for creating a NOAA Oil Database record.
- property adhesion
The parser has an adhesion attribute holding a simple float, and we would like to reshape it as a value/unit pair without changing its name. So it is redefined in the mapper.
- Note: We don’t really know what the adhesion units are for the NOAA FileMaker records, and need to ask JeffL. Based on the range of values seen so far, it looks like we are dealing with Pascals (N/m^2).
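As a rough illustration of the reshaping described for the adhesion property, a bare float can be wrapped in a value/unit mapping. The helper name `as_value_unit` and the default unit string are assumptions made for this sketch; the real units are unconfirmed, and this is not the mapper's actual code.

```python
def as_value_unit(value, unit="N/m^2"):
    """Wrap a bare number in a value/unit mapping (hypothetical helper).

    The default unit is only a guess, mirroring the note that the
    adhesion values look like Pascals.
    """
    if value is None:
        return None
    return {"value": float(value), "unit": unit}


# Example: a parser-style float becomes a value/unit pair under the same name.
adhesion = as_value_unit(0.28)
```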
- property bulk_composition
Tentative Bulk Composition items:
Water Content Emulsion
Wax Content
Sulfur (unit=1 possibly, 0.0104 for Alaska North Slope)
Naphthenes (units=???, typical value=0.0004 for Jet A-1)
Paraffins (unit=???, 0.783 for Alberta 1992, 0.019 for Salmon Oil & Gas)
Nickel (unit=ppm most likely)
Vanadium (unit=ppm most likely)
Polars (unit=1 possibly, 0.0284 for Alberta 1992)
- property compounds
Tentative Compound items:
benzene (units=???, typical value=0.05 for gasoline, just a fractional value maybe?)
- property distillation_data
- environmental_behavior(weathering)
Notes:
There is a dispersability_temp_k, but it does not fit the oil model: sample.environmental_behavior.dispersibilities is for chemical dispersibility with a dispersant, and makes no reference to a temperature.
- filter_weathered_attr(attr, weathering)
- fresh_sample_props = ('SARA', 'distillation_data', 'compounds', 'bulk_composition', 'industry_properties')
- generate_sample_id_attrs(sample_id)
- property industry_properties
Industry Property items:
Reid Vapor Pressure (min/max/avg = 0/0.81/0.295, probably bars)
Conradson Crude (min/max/avg = 0.0054/0.12/0.035, probably just a fractional value)
Conradson Residuum (one value, 0.0019, probably just a fractional value)
- property metadata
- oil_props = ('oil_id', 'metadata', 'status', 'sub_samples')
- physical_properties(weathering)
- py_json()
- sample(weathering)
- property sub_samples
- weathered_sample_props = ('physical_properties', 'environmental_behavior')
adios_db.data_sources.noaa_fm.parser module
- class adios_db.data_sources.noaa_fm.parser.OilLibraryRecordParser(values, file_props)
Bases:
ParserBase
A record parsing class for the NOAA Oil Library spreadsheet.
We manage a list of properties extracted from an Excel row for an oil.
The raw data from the Excel file will be a flat list, even for multidimensional properties like densities, viscosities, and distillation cuts.
- property API
- property SARA
- property conradson
- property cut_units
- property cuts
- property densities
- property dynamic_viscosities
- property emulsions
Oil Library records have some attributes related to emulsions:
- emuls_constant_min: Zero percent emulsion weathered amount
- emuls_constant_max: Max percent emulsion weathered amount
- water_content_emulsion: water content at max weathered
Age will be set to the day of formation.
Temperature will be set to 15 °C (288.15 K).
- property flash_point
- property flash_point_max_k
- property flash_point_min_k
- get_property_sets(num_sets, obj_name, obj_argnames, required_obj_args)
Generalized method of getting lists of data sets out of our record.
Since our data source is a single fixed row of data, there will be a number of fixed subsets of object data attributes, but they may or may not be filled with data. For these property sets, the column names are organized with the following naming convention:
<attr><instance>_<sub_attr>
- <attr>
The name of the attribute list.
- <instance>
An index in the range [1…N], where N is the number of instances in the list.
- <sub_attr>
The name of an attribute contained within an instance of the list.
Basically we will return a set of object properties for each instance that contains a defined set of required argument attributes.
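The naming convention above can be sketched as follows. This is a minimal illustration, not the parser's actual implementation: the function takes the flat row as an explicit dict argument (an assumption), and the column names in the example (`density1_kg_m_3` and so on) are hypothetical.

```python
def collect_property_sets(row, num_sets, obj_name, obj_argnames,
                          required_obj_args):
    """Group flat <attr><instance>_<sub_attr> columns into per-instance dicts."""
    result = []
    for i in range(1, num_sets + 1):
        instance = {}
        for arg in obj_argnames:
            value = row.get(f"{obj_name}{i}_{arg}")
            if value is not None:
                instance[arg] = value
        # Keep the instance only if every required sub-attribute has data.
        if all(arg in instance for arg in required_obj_args):
            result.append(instance)
    return result


# Hypothetical flat row: the second density slot lacks its required value.
row = {
    "density1_kg_m_3": 870.0, "density1_ref_temp_k": 288.15,
    "density2_kg_m_3": None, "density2_ref_temp_k": 273.15,
}
sets = collect_property_sets(row, 2, "density",
                             ("kg_m_3", "ref_temp_k"),
                             ("kg_m_3", "ref_temp_k"))
```

Only the first slot survives here, because the second one is missing a required sub-attribute.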
- property interfacial_tension_air
- property interfacial_tension_seawater
- property interfacial_tension_water
- property interfacial_tensions
- property kinematic_viscosities
- property name
- property oil_class
- property oil_id
- property pour_point
- property pour_point_max_k
- property pour_point_min_k
- property preferred_oils
- property product_type
- property reference
The reference content can have:
no content: In this case we take the created date of the .csv file header.
one year (YYYY): In this case we parse the year as an int and form a datetime with it.
multiple years (YYYY): In this case we use the highest numeric year (most recent) and form a datetime with it.
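The three cases above can be sketched like this. The helper name `reference_date`, the restriction to years beginning with 19 or 20, and the choice of January 1 for the resulting datetime are all assumptions of this sketch rather than documented behavior.

```python
import re
from datetime import datetime


def reference_date(reference, file_created):
    """Pick a datetime for a reference string (hypothetical helper).

    file_created stands in for the created date taken from the file header.
    """
    # Assumption: any standalone 19xx or 20xx token is a publication year.
    years = [int(y) for y in
             re.findall(r"\b((?:19|20)\d{2})\b", reference or "")]
    if not years:
        return file_created
    # With one or more years, use the most recent one.
    return datetime(max(years), 1, 1)


created = datetime(2017, 2, 1)
picked = reference_date("Smith 1992; Jones 1996", created)
```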
- property source_id
- property sulfur
- property synonyms
Synonyms is a single string field that contains a comma-separated list of substring names.
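Splitting such a field might look like the sketch below. The helper name `split_synonyms` is hypothetical, and trimming whitespace and dropping empty entries are assumptions about what a caller would want.

```python
def split_synonyms(synonyms):
    """Split a comma-separated synonyms string into clean names."""
    if not synonyms:
        return []
    return [name.strip() for name in synonyms.split(",") if name.strip()]


names = split_synonyms("ANS, Alaska North Slope ,")
```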
- property toxicities
- property weathering
A NOAA FileMaker record is a flat row of data, but there are some attributes that have weathering associated with their measured values. These attributes are:
Density
KVis
Dvis
In addition to these weathered attributes, the emulsion constant attributes are applied in the context of weathered samples.
The min emulsification constant is Emuls_Constant_Min. Its value is a weathered amount.
The max emulsification constant is Emuls_Constant_Max. Its value is a weathered amount.
All other attributes should be implicitly regarded as fresh oil measurements.
adios_db.data_sources.noaa_fm.reader module
- exception adios_db.data_sources.noaa_fm.reader.ImportFileHeaderContentError
Bases:
Exception
- exception adios_db.data_sources.noaa_fm.reader.ImportFileHeaderLengthError
Bases:
Exception
- class adios_db.data_sources.noaa_fm.reader.OilLibraryCsvFile(name, field_delim='\t', ignore_version=False, encoding='mac_roman')
Bases:
object
A specialized file reader for the OilLib and CustLib flat datafiles.
We will use universal newline support to designate a line of text.
Additionally, each line contains a number of fields separated by a tab (‘\t’). In this way it attempts to represent tabular data.
The first line in the file contains a file format version number (‘N.N’), followed by a date (‘d/m/YY’), and finally the product (‘adios’).
The second line in the file contains a table header, where each field represents the “long” name of a tabular column.
The rest of the lines in the file contain table data.
- TODO: I just noticed that we are not making use of the .csv utility class. We need to refactor this to use it.
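For reference, the file layout described above (a version line, a header line, then data rows) could be read with Python's standard `csv` module roughly as follows. This is only a sketch of the layout, not the refactoring the TODO calls for; the function name `read_oil_library` and the sample column names and values are hypothetical.

```python
import csv
import io


def read_oil_library(stream, field_delim="\t"):
    """Read the version line, the header line, and the data rows."""
    reader = csv.reader(stream, delimiter=field_delim)
    # First line: format version, creation date, product name.
    version, created, product = next(reader)[:3]
    # Second line: the "long" column names.
    field_names = next(reader)
    records = [dict(zip(field_names, row)) for row in reader]
    file_props = {"version": version, "created": created, "product": product}
    return file_props, records


# Made-up sample content in the documented layout.
sample = io.StringIO(
    "2.0\t1/2/17\tadios\n"
    "Oil Name\tADIOS Oil ID\tAPI\n"
    "ALASKA NORTH SLOPE\tAD00020\t30.9\n"
)
file_props, records = read_oil_library(sample)
```

For a file on disk, the stream could come from `open(name, encoding='mac_roman', newline='')`, matching the constructor defaults shown in the class signature.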
- convert_field(field)
Convert data fields to numeric if possible.
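A conversion of this kind might be sketched as below. The helper name `to_number` and the exact fallback behavior (try `int`, then `float`, otherwise return the stripped text) are assumptions; the actual method may treat empty or special values differently.

```python
def to_number(field):
    """Return an int or float when the text parses as one, else the text."""
    text = field.strip()
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


values = [to_number(f) for f in ("42", "30.9", " ALASKA ", "")]
```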
- convert_fields(row)
- export(filename)
- property file_props
- get_record(oil_id)
- get_records()
This is the API that the oil import processes expect.
- readline(cache=True)
- readlines()
- rewind()
adios_db.data_sources.noaa_fm.scoring module
This is the “Quality Score” code from the old OilLibrary.
We are not sure whether scoring will be done again, and if it is, it will certainly be different.
Still, some of this code may be helpful.
- class adios_db.data_sources.noaa_fm.scoring.ImportedRecordWithScore(imported_rec)
Bases:
object
- aggregate_score(Q_i, w_i=None)
General method for aggregating a number of sub-scores. We implement a weighted average for this.
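A weighted average of sub-scores can be sketched as follows. Treating a missing `w_i` as equal weights is an assumption about the default, and the function name `weighted_average` is hypothetical; this is not the class's actual method.

```python
def weighted_average(scores, weights=None):
    """Aggregate sub-scores Q_i with weights w_i: sum(Q_i * w_i) / sum(w_i)."""
    if weights is None:
        # Assumption: no weights means every sub-score counts equally.
        weights = [1.0] * len(scores)
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(q * w for q, w in zip(scores, weights)) / total


equal = weighted_average([1.0, 0.0])
skewed = weighted_average([1.0, 0.0], [3, 1])
```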
- score()
- score_api()
- score_cuts()
- score_demographics()
- score_densities()
- score_emulsion_constants()
- score_flash_point()
- score_interfacial_tensions()
- score_pour_point()
- score_sara_fractions()
- score_toxicities()
- score_viscosities()