Scripts for working with the data

A number of top-level scripts are installed along with the Python package. If the package is installed into a correctly configured Python environment, you should be able to invoke the scripts directly, e.g.:

adios_db_validate dir_to_validate

The script names all begin with adios_db_, so if your shell has command-line completion, typing adios_db_ followed by <tab><tab> should list them all.

The installed scripts

Scripts for working with the JSON files

adios_db_validate

This generates validation reports for all the JSON files in the provided directory. You must provide a directory to validate; the script searches that directory and any child directories for JSON files.

Example:

adios_db_validate oil/EC
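The recursive search described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code adios_db actually uses: the function name find_json_files is hypothetical, and matching the file extension case-insensitively is an assumption.

```python
from pathlib import Path


def find_json_files(root):
    """Return every JSON file under root, including child directories."""
    root = Path(root)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    # Assumption: treat .json and .JSON alike by comparing the suffix in lower case.
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".json"
    )
```

Calling find_json_files("oil/EC") would then return a sorted list of paths, which a validator could loop over one record at a time.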

adios_db_process_json

This processes a set of JSON files in the provided directory. It loads each one into an adios_db Oil object and saves it out again as JSON. This confirms that the JSON is valid and normalizes it to the same white space and data order, so that files that differ only in non-meaningful ways will compare equal.

You must provide a directory to process; the script searches that directory and any child directories for JSON files.

Example:

adios_db_process_json noaa_oil_data/data/oil

will process all the records in the noaa_oil_data repository.

NOTE: this should always be run before merging any new or edited data into the repository.

Optionally, “dry_run” can be passed on the command line to check that all the JSON is valid without making any changes to the data.
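To see why a load-and-save round trip makes files comparable, here is a minimal sketch using only the standard library json module. It does not go through the adios_db Oil object as the real script does; the function name normalize_json_file, the four-space indent, and the sorted key order are all assumptions standing in for whatever layout adios_db writes.

```python
import json
from pathlib import Path


def normalize_json_file(path, dry_run=False):
    """Rewrite a JSON file in one canonical layout; return True if it differs."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    # json.loads raises json.JSONDecodeError if the file is not valid JSON.
    data = json.loads(original)
    # Assumption: sorted keys and a fixed indent stand in for the real ordering.
    normalized = json.dumps(data, indent=4, sort_keys=True, ensure_ascii=False) + "\n"
    changed = normalized != original
    if changed and not dry_run:
        path.write_text(normalized, encoding="utf-8")
    return changed
```

Two files holding the same data with different key order or indentation end up byte-for-byte identical after this step. With dry_run=True the file is still parsed, so invalid JSON is still reported, but nothing is written.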

adios_db_read_noaa_csv

Reads a “NOAA standard CSV file” and generates an adios_db JSON file from the data.

adios_db_read_noaa_csv name_of_the_file.csv

NOAA standard CSV files are ones that conform to the NOAA Excel template format:

ADIOS data template

If the file will not load and the error is not obvious, you can pass the --debug flag to get a lot of additional information about what the script is trying to do.

adios_db_assign_ids

Assign IDs to a set of new records.

adios_db_assign_ids PREFIX [dry_run] data_dir file1 file2 file3 ...

PREFIX is the 2-letter prefix you want to use, e.g. AD. It can be a new prefix, or an existing one.

data_dir is a directory with the existing data – it will be scanned to determine which IDs are already in use, so new non-conflicting ones will be generated.

You can pass in one or more JSON files, e.g. *.json

If dry_run is on the command line, the script reports what it would do but does not save any changes.
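The idea of scanning existing IDs and generating non-conflicting new ones can be sketched as follows. This is an illustration under stated assumptions, not the script's implementation: the function name next_ids is hypothetical, and the ID layout (the 2-letter prefix followed by a five-digit number, continuing from the highest number already in use) is assumed for the example.

```python
import re


def next_ids(prefix, existing_ids, count):
    """Return `count` new IDs for `prefix` that do not clash with existing_ids."""
    if not re.fullmatch(r"[A-Z]{2}", prefix):
        raise ValueError("prefix must be two uppercase letters")
    # Assumption: an ID is the prefix followed by exactly five digits.
    pattern = re.compile(rf"{prefix}(\d{{5}})")
    used = [int(m.group(1)) for i in existing_ids if (m := pattern.fullmatch(i))]
    # Continue after the highest number in use; a new prefix starts at 1.
    start = max(used, default=0) + 1
    return [f"{prefix}{n:05d}" for n in range(start, start + count)]


# Example: IDs with other prefixes are ignored when picking the next number.
print(next_ids("AD", ["AD00001", "AD00007", "EC00003"], 2))
# → ['AD00008', 'AD00009']
```

Starting after the highest existing number, rather than filling gaps, keeps the sketch simple and avoids reusing an ID that may have been retired.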

adios_db_add_labels

This script will add likely labels to the records, based on a set of criteria developed in the code. It will not correctly label everything, but should give you a good start.

adios_db_add_labels data_dir [dry_run]

data_dir is the directory where the data are; the script will recursively search it for JSON files.

If replace is on the command line, existing labels will be replaced. Otherwise, new ones will be added, but none removed.

If dry_run is on the command line, the script reports what it would do but does not save any changes.
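The difference between the default behaviour and replace can be shown with a small sketch. The function name update_labels is hypothetical, and representing labels as a sorted list of strings is an assumption made for the example; the criteria that produce the suggested labels are not shown.

```python
def update_labels(existing, suggested, replace=False):
    """Combine a record's existing labels with newly suggested ones."""
    if replace:
        # replace: keep only the suggested labels, dropping the old ones.
        return sorted(set(suggested))
    # default: add new labels, but never remove one that is already present.
    return sorted(set(existing) | set(suggested))
```

For example, with existing labels ["Crude Oil"] and suggested labels ["Heavy Crude"], the default call keeps both, while replace=True leaves only "Heavy Crude".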

Scripts for working with the web application / MongoDB

adios_db_init

adios_db_import

adios_db_oil_query

adios_db_backup

adios_db_restore

Scripts for working with the code / tests

adios_db_update_test_data

This will update the test data with the latest version from NOAA oil data. This should only be run by people working on the adios_db codebase.