Managing the Data

In order to manage the data, you’ll want to work with the Oil object.

The Oil object is essentially a Python class that mimics the base JSON format. It then provides attributes that let you “drill down” to find the data you want.

The Scripting Module

Most of the things you’ll need for “typical” work with the data can be found in the adios_db.scripting module. We recommend that you import it like so:

import adios_db.scripting as ads

Then you can access all the needed functions and objects through the ads name.

Working with JSON

For the most part, the data are stored as JSON: either in MongoDB, or as JSON files on disk.

JSON-compatible-Python

A subset of JSON can be mapped directly to Python builtin objects. We call this “JSON-compatible-Python”, or py_json for short. Essentially, it is Python that can be saved-to and loaded-from JSON losslessly: arbitrarily nested lists, dicts, strings, numbers, and bools.
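
As a rough illustration of the "lossless" property, the sketch below round-trips a small nested structure through the standard-library json module. The record contents are made up for the example and are not a complete oil record. It also shows why a tuple does not count as py_json: it is written as a JSON array and comes back as a list.

```python
import json

# A made-up nested structure using only JSON-compatible types.
# (Illustrative only: this is not a complete or real oil record.)
record = {
    "oil_id": "XXXXX",
    "metadata": {"name": "Example Crude"},
    "values": [1.5, 2.8, 12.4],
    "reviewed": False,
}

# Save to a JSON string and load it back: the result compares equal.
restored = json.loads(json.dumps(record))
print(restored == record)

# A tuple is not py_json: it is saved as a JSON array and returns as a list,
# so the round-tripped structure no longer compares equal to the original.
with_tuple = {"values": (1.5, 2.8)}
round_tripped = json.loads(json.dumps(with_tuple))
print(round_tripped == with_tuple)
```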

Creating an Oil object

The Oil object (and all the sub-objects) can be created from py_json with:

oil = ads.Oil.from_py_json(a_json_compatible_object)

or directly from a json file with:

oil = ads.Oil.from_file(a_file_path)

The file path can be a string or pathlib.Path object (you can also pass in an open file object).

Saving an Oil object

An Oil object (and all the sub-objects) can be saved to py_json with:

python_object = oil.py_json()

or directly to a json file with:

oil.to_file(a_file_path)

The file path can be a string or pathlib.Path object (you can also pass in an open file object).

Creating an Oil from scratch

You can create an empty Oil object from scratch – this is likely to be useful for creating data from other data sources: CSV files, databases, etc. It does require a fairly in-depth knowledge of the nested structure, however.

oil = ads.Oil('XXXXX')

The only required argument is an oil_id: it can be any moderate-length string. Your ID scheme should match the database you are working with.

For other arguments, see: adios_db.scripting.Oil

Note

Hopefully we will someday write complete documentation for how to create a full oil record from scratch. Below are a few pieces. In the meantime, you can look at the tests and at the included import scripts to see how the pieces are created and put together.
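
When the source is a CSV file, a common first step is to turn each row into py_json before handing it to the Oil class. The sketch below covers only that step, using the standard library. The column names (id, name), the sample values, and the helper row_to_py_json are all hypothetical, and the key names are assumed to mirror the Oil attribute names, so treat the result as a starting skeleton rather than a complete record.

```python
import csv
import io

# Hypothetical CSV export; the column names and values are invented.
CSV_TEXT = """id,name
XX00001,Example Crude A
XX00002,Example Crude B
"""


def row_to_py_json(row):
    """Map one CSV row to a minimal py_json skeleton (hypothetical helper)."""
    return {
        "oil_id": row["id"],
        "metadata": {"name": row["name"]},
        # Assumed key name; measurements would go here, one entry per subsample.
        "sub_samples": [],
    }


records = [row_to_py_json(row) for row in csv.DictReader(io.StringIO(CSV_TEXT))]

for record in records:
    print(record["oil_id"], record["metadata"]["name"])
```

With adios_db installed, each dict could then be passed to ads.Oil.from_py_json; whether such a sparse record is accepted as-is depends on the library, so check against the tests and import scripts mentioned above.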

The Oil attributes

See adios_db.models.oil.oil.Oil for the full details, but in short, a basic Oil object has:

  • oil_id: the ID for the record

  • metadata: where the metadata goes – name, product types, etc: adios_db.models.oil.metadata.Metadata

  • sub_samples: a list of data about the samples. This is where the actual data goes. Every record should have at least one subsample; the zeroth one should be the “fresh oil” as it arrived at a lab. Other subsamples will have been processed in some way.

Sample

The adios_db.models.oil.sample.Sample class holds all the measurements recorded in the record.

It is broken down into different categories of data; see the API docs for details.

Distillation

The adios_db.models.oil.distillation.Distillation class holds distillation cut data. It contains information about the distillation process, and the cut data itself. The distillation cuts are stored in a distillation cut list, as a set of fraction: temperature pairs. There is a utility constructor to generate the cut list from arrays of data. For example:

from adios_db.models.oil.distillation import Distillation, DistCutList
from adios_db.models.common.measurement import Temperature, Concentration

fractions = (1.5, 2.8, 12.4, 23.5, 44.3, 63.9, 82.7, 91.7, 96.2, 98.8)
temps = (36.0, 69.0, 119.0, 173.0, 283.0, 391.0, 513.0, 604.0, 672.0, 729.0)

dct = DistCutList.from_data_arrays(fractions=fractions,
                                   temps=temps,
                                   frac_unit='percent',
                                   temp_unit='C'
                                   )

# and now a Distillation object can be created

dist_data = Distillation(type="mass fraction",
                         method="some arbitrary method",
                         end_point=Temperature(value=15, unit="C"),
                         fraction_recovered=Concentration(value=0.8,
                                                          unit="fraction"),
                         cuts=dct
                        )

# this can be added to a Sample, for example the first
# subsample of an existing Oil object:
sample = oil.sub_samples[0]
sample.distillation_data = dist_data
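
Since the cuts are fraction: temperature pairs, the two input arrays have to line up. A minimal pre-check, in plain Python and independent of adios_db, might look like the following. Here check_cut_arrays is a hypothetical helper, and the rule that both sequences should be strictly increasing is an assumption about typical cumulative distillation data, not a requirement stated by the library.

```python
def check_cut_arrays(fractions, temps):
    """Return (fraction, temperature) pairs after basic consistency checks.

    Hypothetical helper, not part of adios_db.
    """
    if len(fractions) != len(temps):
        raise ValueError("fractions and temps must have the same length")
    pairs = list(zip(fractions, temps))
    # Assumption: cumulative fractions and cut temperatures both increase.
    for (f0, t0), (f1, t1) in zip(pairs, pairs[1:]):
        if f1 <= f0 or t1 <= t0:
            raise ValueError(f"cut ({f1}, {t1}) does not follow ({f0}, {t0})")
    return pairs


# The same arrays as in the example above.
fractions = (1.5, 2.8, 12.4, 23.5, 44.3, 63.9, 82.7, 91.7, 96.2, 98.8)
temps = (36.0, 69.0, 119.0, 173.0, 283.0, 391.0, 513.0, 604.0, 672.0, 729.0)

pairs = check_cut_arrays(fractions, temps)
print(len(pairs), pairs[0], pairs[-1])
```

Running a check like this before calling DistCutList.from_data_arrays keeps a mismatched or mis-ordered column from being silently turned into cut data.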

Example Scripts

There are a number of example scripts in the top-level scripts directory in the source code.

adios_db/scripts

These are various scripts used to do one-off cleanup or manipulation of the data. It is unlikely that you will want to run any of these directly, but they can be used as examples to follow.