The OCDB command line client (ocdb-cli) and Python API

Automation and easy access are important. The OCDB database system therefore offers a command line interface as well as a Python API for accessing and managing submissions and users.

Both options offer the same functions. If you just want to apply one or more of these functions, the command line interface (CLI) will suit your needs. Otherwise, use the OCDB API in Python scripts to combine OCDB functions with other Python tasks.

Installation

It is possible to install the CLI and API via conda:

conda install -c ocdb -c conda-forge ocdb-client

Once that is done, you can check that the client runs by calling it without arguments, which prints the help text:

cli:

ocdb-cli


Usage: ocdb-cli [OPTIONS] COMMAND [ARGS]...

  EUMETSAT Ocean Color In-Situ Database Client.

Options:
  --version       Show the version and exit.
  --server <url>  OC-DB Server URL.
  --help          Show this message and exit.

Commands:
  conf     Configuration management.
  ds       Dataset management.
  lic      Show license and exit.
  sbm      Submission management.
  sbmfile  Submission management.
  user     User management.

Configure

To access the database, you need to configure the REST API server address. The default address is https://ocdb.eumetsat.int:

cli:

# Check whether the server url for the ocdb-cli is configured:
ocdb-cli conf 

https://ocdb.eumetsat.int

# Set the server url:
ocdb-cli conf server_url [some url]

python:

from ocdb.api.OCDBApi import new_api

api = new_api()

api.config

# Out: {'server_url': 'https://ocdb.eumetsat.int'}

api.set_config_param('server_url','[some server url]')

Search Database with find_datasets()

After login (see chapter “User Management” below), the method ‘find_datasets’ queries the database by dataset characteristics, using the following keywords:

  • expr: looks for any files containing any of the words passed. Lucene syntax can also be used (see below for more details)

  • region: looks for files containing measurements collected in the polygon defined by the specified coordinates (format: “[West],[South],[East],[North]”)

  • start_time: looks for any files containing measurements collected later than the selected date (format: “20160701”)

  • end_time: looks for any files containing measurements collected earlier than the selected date (format: “20190701”)

  • wdepth: looks for any files containing measurements collected within the defined range of water (bottom) depth (format: “[[min_depth],[max_depth]]”)

  • mtype: filters radiometric data by wavelength option. Can be ‘all’, ‘multispectral’ or ‘hyperspectral’

  • shallow: set to ‘yes’ to also include measurements flagged by the PIs as taken in shallow waters (default is ‘no’)

  • pmode: can be set to ‘contains’ (to filter results based on the selected pgroup or variables), to ‘same_cruise’ (to include measurements from cruises during which all the selected groups/variables were acquired), or to ‘do_not_filter’ (to not filter results at all)

  • pgroup: looks for files containing only certain geophysical variable types. Refer to the chapter “Search the OCDB using the GUI” for the complete list

  • pname: looks for files containing only the specified variables. A complete list of queryable variables is available in the chapter “OCDB standard field names and units”

  • status: set to ‘PUBLISHED’ to get only publicly available data or to ‘PROCESSED’ to get both public and unpublished data (available only for admin users and data owners)

  • submission_id: looks for data submitted under the specified submission label

  • geojson: (Default is True)

  • user_id: looks for data submitted by the specified user (by username)

The result is a dictionary describing the datasets, and the files they belong to, whose measurements satisfied the search criteria. The dictionary keys are:

  • locations: geometries representing the spatial extent of each dataset (point or rectangle)

  • total_count: number of datasets returned by the query

  • datasets: information about dataset files and the submissions they belong to

  • query: query parameterization

  • dataset_ids: ids of the returned datasets
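As a rough illustration of how these keys fit together, the sketch below summarises a result dictionary. The sample dictionary is hand-built from values shown in this chapter rather than real server output (the empty `query` entry is a placeholder), and `summarize_result` is a hypothetical helper, not part of the OCDB API.

```python
def summarize_result(result):
    """Return one line per dataset: '<id>  <path>/<filename>'."""
    lines = []
    for ds in result.get("datasets", []):
        # 'filename' is not present in every output, so fall back to the path alone.
        location = "/".join(p for p in (ds.get("path"), ds.get("filename")) if p)
        lines.append(f"{ds['id']}  {location}")
    return lines


# Hand-built sample mirroring the documented keys (not real server output).
sample = {
    "locations": {},
    "total_count": 1,
    "datasets": [
        {"id": "5d97112af9305e0001c6d6fc",
         "path": "LOG/IOPstudy/DS3",
         "filename": "DS3_IOPstudy.csv"}
    ],
    "query": {},  # placeholder; the real content is the query parameterization
    "dataset_ids": ["5d97112af9305e0001c6d6fc"],
}

summary = summarize_result(sample)
print(f"{sample['total_count']} dataset(s) found")
for line in summary:
    print(line)
```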

The syntax to search for SEABASS datasets via the command line interface is:

cli syntax:

ocdb-cli ds find --query [keyword]=[value or comma-separated list of values]

cli sample:

ocdb-cli ds find --query region=50,45,51,46
ocdb-cli ds find --query start_time=2021-01-01 --query end_time=2021-12-31
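When the CLI is driven from a script, the --query arguments can be assembled from keyword/value pairs. The sketch below shows one possible way; `build_find_command` is a hypothetical helper, not part of ocdb-client, and it only builds the argument list without running anything.

```python
import shlex


def build_find_command(**query):
    """Build an 'ocdb-cli ds find' argument list from keyword/value pairs."""
    args = ["ocdb-cli", "ds", "find"]
    for keyword, value in query.items():
        # Lists and tuples become comma-separated values, as in region=50,45,51,46.
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        args += ["--query", f"{keyword}={value}"]
    return args


cmd = build_find_command(region=[50, 45, 51, 46])
print(shlex.join(cmd))
```

On a machine where ocdb-cli is installed, the resulting list could be passed to subprocess.run(cmd).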

The signature of the Python method find_datasets( … ) is:

def find_datasets(ctx: WsContext,
                  expr: str = None,    # Lucene syntax can be used here!
                  region: List[float] = None,    # comma-separated list of floats
                  mtype: str = None,             # e. g. 'scan', 'above_water', ...
                  wlmode: str = None,            # Not implemented yet
                  shallow: str = 'no',           # as defined in metadata header "/data_use_warning= ..." 
                  pmode: str = 'contains',       # 'contains', 'same_cruise' or 'do_not_filter'
                  pgroup: List[str] = None,      # For details refer to: https://ocdb.readthedocs.io/en/latest/ocdb-search.html#product-groups
                  status: str = None,            # Not implemented yet
                  submission_id: str = None,     # equal to submission label as defined at submission
                  pname: List[str] = None,       # for wavelength-dependent parameters, e. g. Lsky412, please use Lucene syntax
                  geojson: bool = False,         # boolean, default is false
                  offset: int = 1,               # used by web user interface for paging
                  user_id: str = None,           # used by web user interface
                  count: int = 1000)             # used by web user interface for paging, default = 1000

Python samples:

data = api.find_datasets(region='50,45,51,46')
data['datasets']

  [{'id': '5d97112af9305e0001c6d6fc', 'path': 'LOG/IOPstudy/DS3', 'filename': 'DS3_IOPstudy.csv'}]

data = api.find_datasets(end_time='2021-12-31')

Search Database with Lucene syntax

The example below finds data files that include the name “Astrid” in the investigators metadata field.

cli:

ocdb-cli ds find --expr "investigators: *Astrid*"

results in the following output:

{
  "locations": {},
  "total_count": 4,
  "datasets": [
    {
      "id": "5d2433e81f59e20001aaae74",
      "path": "AWI/SO/SO235/archive/Bracher_2019_SO235_db.txt"
    },
    ...

python:

api.find_datasets(expr="investigators: *Astrid*")

results in the following output:

{
  "locations": {},
  "total_count": 4,
  "datasets": [
    {
      "id": "5d2433e81f59e20001aaae74",
      "path": "AWI/SO/SO235/archive/Bracher_2019_SO235_db.txt"
    },
    ...

A complete and up-to-date list of the fields that can be queried is available in the chapter “Search the OCDB using the GUI”.
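Search terms taken from user input may contain characters that have a special meaning in Lucene syntax. The sketch below builds a "field contains text" expression in the form used by the CLI example above and escapes such characters with a backslash. `lucene_contains` is a hypothetical helper, and whether the OCDB server handles escaped characters inside wildcard terms as expected is an assumption that should be checked against a real query.

```python
import re

# Characters and operators with a special meaning in Lucene query syntax.
_LUCENE_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def lucene_contains(field, text):
    """Build '<field>: *<text>*', escaping special characters in text."""
    escaped = _LUCENE_SPECIAL.sub(r"\\\1", text)
    return f"{field}: *{escaped}*"


expr = lucene_contains("investigators", "Astrid")
print(expr)
```

The resulting string can be passed as the expr keyword. Terms containing spaces are not handled by this sketch.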

Get Datasets

The search returns a list of datasets. To retrieve the actual data, use a dataset ID obtained in the previous step (via the CLI command ds find or the API method find_datasets), as in the example below:

cli:

ocdb-cli ds get --id 5d971154f9305e0001c6d700

python:

api.get_dataset(dataset_id='5d971154f9305e0001c6d700', fmt='pandas')

results in the following output:

       date      time       lat      lon  depth  ...  tot_chl_a
0  20140723  12:30:00  -19.9743  57.4493      0  ...    0.05280
1  20140723  14:00:00  -19.7216  57.6288      0  ...    0.04767
2  20140723  17:00:00  -19.2121  57.9908      0  ...    0.05028
3  20140723  20:00:00  -18.7211  58.3397      0  ...    0.04490
4  20140723  23:00:00  -18.2994  58.7023      0  ...    0.07901
:
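The two steps, searching and then retrieving, can be chained in a script. The sketch below loops over the datasets of a find_datasets() result and fetches each one with get_dataset(dataset_id=..., fmt='pandas') as shown above. `fetch_datasets` and the `limit` parameter are hypothetical, and `FakeApi` is only a stand-in so that the example runs without a server; in practice the object returned by new_api() would be passed instead.

```python
def fetch_datasets(api, result, limit=5):
    """Fetch up to `limit` datasets listed in a find_datasets() result."""
    frames = {}
    for ds in result.get("datasets", [])[:limit]:
        # Same call as in the example above, once per dataset ID.
        frames[ds["id"]] = api.get_dataset(dataset_id=ds["id"], fmt="pandas")
    return frames


class FakeApi:
    """Offline stand-in (assumption); replace with new_api() for real use."""

    def get_dataset(self, dataset_id, fmt):
        return f"<{fmt} table for {dataset_id}>"


result = {"datasets": [{"id": "5d971154f9305e0001c6d700"}]}
frames = fetch_datasets(FakeApi(), result)
print(frames)
```

Limiting the number of downloads keeps a broad query from triggering a large number of requests by accident.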

User Management

Commands:

  • add: Add a user (see below)

  • delete: Delete a user (see below)

  • get: Get a user (see below)

  • list: List all users (ocdb-cli user list)

  • login: Log in a user (see below)

  • logout: Log out the current user if logged in (ocdb-cli user logout)

  • pwd: Set the password for a user (see below)

  • update: Update an existing user (see below)

  • whoami: Get the current user (see below)

General remarks on CLI syntax

Command line arguments can be specified by single letters, e. g. -u for username or -p for password, or as words, such as --password or --username. For details see the help for the respective command or subcommand, e. g.:

ocdb-cli user --help

or

ocdb-cli user add --help

Login User:

The login procedure will ask for a user name and a password. You can specify the password as an option. However, under normal circumstances we advise specifying the username only and entering the password at the command line prompt.

The example below logs in a user with the user name ‘scott’. ‘scott’ is a ‘submitter’ user: after login, he can submit data to the system, but he does not have any administrative privileges.

cli:

ocdb-cli user login --username scott --password tiger

python:

api.login_user(username='scott', password='tiger')
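In Python scripts, the advice to avoid passing the password explicitly can be followed by prompting for it with the standard library module getpass. The sketch below is one possible way; `login_with_prompt` is a hypothetical helper, and `FakeApi` is an offline stand-in whose return value is invented for illustration (the real return value of login_user is not described here).

```python
import getpass


def login_with_prompt(api, username, ask=getpass.getpass):
    """Log in without placing the password in source code or shell history."""
    password = ask(f"Password for {username}: ")
    # login_user(username=..., password=...) is the call shown above.
    return api.login_user(username=username, password=password)


class FakeApi:
    """Offline stand-in (assumption) that records the call instead of contacting a server."""

    def login_user(self, username, password):
        self.logged_in = username
        return {"name": username}


fake = FakeApi()
# A lambda replaces the interactive prompt so the sketch runs unattended.
reply = login_with_prompt(fake, "scott", ask=lambda prompt: "tiger")
print(reply)
```

With a real API object, omit the ask argument so that getpass prompts on the terminal without echoing the input.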

Logout User:

cli:

ocdb-cli user logout

python:

api.logout_user()

Who am I?

To find out whether you are logged in or who is logged in, use:

cli:

ocdb-cli user whoami

python:

api.whoami_user()

Add User (admin only):

To add a user, specify the required user information. The arguments username, password, email and roles are mandatory. A role can be either ‘submit’ (for any user) or ‘admin’ (for admin users only).

cli:

ocdb-cli user add -u <user_name> -p <password> -fn <user's first name> -ln <user's family name> -em <user's email> -ph <user's phone number> -r <role1> -r <role2>

On the command line, the value of the roles argument shall be written as admin, submit or ['submit','admin'], e.g.:

ocdb-cli user add --username super_user --password super_secret --roles ['submit','admin'] --email super_user@eum.int

Add further arguments as needed.

python:

api.add_user(username='<user_name>', password='<passwd>', email='<email>', roles=['<role1>', '<role2>'])
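Before calling add_user, a script can check that the mandatory arguments are present and that each role is one of the documented values. The sketch below is a client-side check only; `check_new_user` is a hypothetical helper, and the server may apply further rules not covered here.

```python
MANDATORY = ("username", "password", "email", "roles")
VALID_ROLES = {"submit", "admin"}


def check_new_user(**fields):
    """Return a list of problems; an empty list means the fields look complete."""
    problems = [f"missing {name}" for name in MANDATORY if not fields.get(name)]
    for role in fields.get("roles") or []:
        # Each role must be its own list element, e.g. ['submit', 'admin'].
        if role not in VALID_ROLES:
            problems.append(f"unknown role {role!r}")
    return problems


ok = check_new_user(username="super_user", password="super_secret",
                    email="super_user@eum.int", roles=["submit", "admin"])
bad = check_new_user(username="scott", roles=["submit, admin"])
print(ok)
print(bad)
```

The second call shows a common slip: both roles written inside one string instead of as two list elements.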

Get User Information (admin only):

Get the details of a specific user (the password is not returned):

cli:

ocdb-cli user get scott

python:

api.get_user(username='scott')

Users can request their own information without restrictions.

Delete a User (admin only):

Delete a user by specifying the username.

cli (do not use key -u or --username!):

ocdb-cli user delete scott

python:

api.delete_user(username='scott')

Update an Existing User (admin or the respective user themselves only):

The following fields can be updated:

  • first_name

  • last_name

  • email

  • phone

  • roles (admin only)

Even an admin user cannot change the user id.

cli:

ocdb-cli user update --username scott --key <field to be updated> --value <your value>

python:

api.update_user(<user_name>, key=<key>, value=<value>)

Users can update their own user details without restrictions. However, they have to specify their user name.
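The rules about which fields may be changed, and by whom, can be mirrored in a small client-side check before calling update_user. This is a sketch based only on the list above; `can_update` is a hypothetical helper, and the server remains the authority on what is permitted.

```python
# Fields listed above as updatable; 'roles' is reserved for admins.
UPDATABLE = {"first_name", "last_name", "email", "phone"}
ADMIN_ONLY = {"roles"}


def can_update(key, is_admin=False):
    """Return True if a user of this kind may change the field `key`."""
    return key in UPDATABLE or (is_admin and key in ADMIN_ONLY)


for key in ("email", "roles"):
    print(key, can_update(key), can_update(key, is_admin=True))
```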

Update own password:

Any user can update their own password after login.

cli:

ocdb-cli user pwd

You will be asked to input your current password and the new password.

[Screenshot: changing your own password on the command line]

python:

api.change_user_login(username=<username>,password=<password>,new_password=<new_password>)

Forgotten password

Please contact ServiceDesk@eumetsat.int [Subject: OCDB, forgotten password (username = …)]

Update password of an existing user (admin only)

Admins can reset a forgotten password for any user with the pwd command:

cli:

ocdb-cli user pwd -u <username> -p <admin password> --new-password <new password>

python:

api.change_user_login(username='scott', password='tiger', new_password='lion')

Provide the new password to the user and ask them to change it immediately.

Managing Submissions

Upload a new submission: contribute data to the database through a new submission.

cli:

ocdb-cli sbm upload "<affiliation>/<experiment>/<cruise>" <data files list> -s <submission label> -ap -d <document files list>

<data files list> is a comma-separated list of full paths to measurement files.

<document files list> is a comma-separated list of full paths to document files.

-ap should be set only if the data may be made available to the general public.

python:

api.upload_submission('<affiliation>/<experiment>/<cruise>',dataset_files=('<file_path1>','<file_path2>',...),submission_id='<submission_label>', doc_files=('<file_path1>','<file_path2>',...),[allow_publication = <True/False>],[publication_date = '<yyyy/mm/dd>'])

allow_publication should be set to True only if the data may be made available to the general public.

publication_date should be set only if the data may be made available to the general public, but not before the specified date.
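The arguments of an upload can be prepared in a script before either the CLI or the API is called. The sketch below builds the '<affiliation>/<experiment>/<cruise>' path and the comma-separated file list expected by the CLI. Both helpers are hypothetical, the sample values are illustrative only, and rejecting a '/' inside a path component is an assumption made here to keep the three-part structure intact.

```python
def submission_path(affiliation, experiment, cruise):
    """Join the three parts into '<affiliation>/<experiment>/<cruise>'."""
    parts = (affiliation, experiment, cruise)
    for part in parts:
        # Assumption: an empty part or an embedded '/' would break the structure.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return "/".join(parts)


def cli_file_list(paths):
    """Return a comma-separated list of paths, as used on the command line."""
    return ",".join(str(p) for p in paths)


path = submission_path("AWI", "SO", "SO235")
files = cli_file_list(["/data/a.txt", "/data/b.txt"])  # illustrative paths
print(path)
print(files)
```

For the Python API, the file paths would instead be passed as a tuple via dataset_files and doc_files, as in the call above.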

Get Submission (admin only): to get information for a specific submission

cli:

ocdb-cli sbm get IOPstudy2

python:

api.get_submission('IOPstudy2')

Users can monitor their own submissions without restrictions.

Get Submissions for a specific User (admin only):

cli:

ocdb-cli sbm user scott

python:

api.get_submissions_for_user('scott')

Users can monitor their own submissions without restrictions.

Delete Submission (admin only):

cli:

ocdb-cli sbm delete <submission-id>

python:

api.delete_submission(<submission-id>)

Users can delete their own submissions without restrictions.

Update Submission Status (admin only):

This command changes the status assigned to a submission. Some status changes affect whether the data are searchable in the database.

The following list shows the different statuses and how each affects accessibility:

  • SUBMITTED: A dataset has been submitted. This usually also means that the data has issues. Setting this status will trigger the automated validation process

  • VALIDATED: The data has been submitted and passed the quality checks (even if warnings were raised)

  • PROCESSED: The data has been processed into the database and is searchable, but only by admin users and the user who submitted it

  • PUBLISHED: The data has been processed into the database and is publicly available

  • CANCELED: The data submission has been canceled. Setting this status removes the data from the database, so it can no longer be found. It can still be reprocessed into the database later

  • PAUSED: The user paused the submission. This indicates that the admin users shall not publish or process the data
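For scripts that set statuses, the list above can be captured in a small lookup table. The sketch below records who can find the data for each status, but only where the list says so explicitly; None marks statuses for which no statement about searchability is made. `check_status` is a hypothetical helper that normalises user input on the client side; it does not describe how the server parses status values.

```python
# Visibility per status, as far as the list above states it.
# None means the list makes no explicit statement about searchability.
VISIBILITY = {
    "SUBMITTED": None,
    "VALIDATED": None,
    "PROCESSED": "admins and submitter",
    "PUBLISHED": "public",
    "CANCELED": "not findable",
    "PAUSED": None,
}


def check_status(status):
    """Normalise a status string and reject values that are not in the list."""
    key = status.strip().upper()
    if key not in VISIBILITY:
        raise ValueError(f"unknown status: {status!r}")
    return key


status = check_status(" published ")
print(status, "->", VISIBILITY[status])
```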

cli:

ocdb-cli sbm status --submission-id <submission-id> --status <status>

python:

api.update_submission_status(<submission-id>, <status>)

Users can submit, cancel and pause their own submissions without restrictions.

Download Submission File:

This command downloads a single submission file. Please be aware that the file version you receive corresponds to the current status of the submission. Do not use this feature to download data; use the “get_dataset” function of the API instead.

cli:

ocdb-cli sbmfile download -s <submission_label> --index <index> [--out-file <file_name>]

By default files are downloaded as ‘download.zip’

python:

api.download_submission_file(<submission_label>,<index>, out_fn =  <file_name>)

Upload Submission File:

Both measurement and documentation files can be added to an existing submission. Existing files will be replaced with updated files.

cli:

ocdb-cli sbmfile add --submission-id <submission_label> --file <local_file_path>  -t <type>

python:

api.add_submission_file(<submission_label>,<local_file>,<type>)

where type could be ‘MEASUREMENT’ or ‘DOCUMENT’

Existing measurement and documentation files can also be updated, i.e. replaced with a new local file.

cli:

ocdb-cli sbmfile update --submission-id <submission_label> --file <local_file_path>  --index <index>

python:

api.update_submission_file(<submission_label>,<local_file>,<index>)

where index is the index of the file in the submission to be updated.

Users can update their own submission files without restrictions.

General

Get License

ocdb-cli lic