Search the OCDB using the GUI

All data that the submitters have agreed to publish are searchable by the public.

The OCDB WebUI offers a graphical search interface. The main feature of this interface is the search text field.

Lucene Syntax

The search field allows using the so-called Lucene syntax which enables you to search for strings and substrings as well as for ranges in specific metadata headers (see list below).

A concise description of the full Lucene query language syntax can be found here. Please note that the OCDB system does not support the complete syntax.

General syntax:

[metadata_header]: [search_term]

Example (Exact match):

investigators: Colleen

Returns all datasets where the field “investigators” exactly matches the term “Colleen”.

Wild Card:

Lucene syntax offers two wildcards; the “*” represents multiple characters, the “?” denotes a single character wildcard.

Note: You cannot use a * or ? symbol as the first character of a search.

The first example below returns all datasets whose investigators field contains a word starting with “Coll”, followed by any number of characters. The second returns datasets where “Coll” is followed by exactly two arbitrary characters and then ‘n’.

investigators: Coll*
investigators: Coll??n
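
The wildcard rules can be tried out locally with Python's standard fnmatch module, whose “*” and “?” behave like the two wildcards described here. This is only a sketch of the matching rules, not of how the OCDB evaluates a query, and the names in the list are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical investigator names, used only to illustrate the wildcard rules.
names = ["Colleen", "Collin", "Coll", "Collins", "Helge"]


def matching(pattern, candidates):
    """Return the candidates that match a pattern containing * and ? wildcards."""
    return [c for c in candidates if fnmatchcase(c, pattern)]


# "*" stands for any number of characters (including none).
star_matches = matching("Coll*", names)
# "?" stands for exactly one character.
question_matches = matching("Coll??n", names)

print(star_matches)
print(question_matches)
```

Only “Colleen” matches the second pattern, because “Coll??n” requires exactly seven characters ending in ‘n’.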

To search all metadata for any word starting with the character ‘a’ use:

a*

To search for all parameters starting with ‘a’ use:

fields: ,a*

Keep in mind that the value for the metadata header fields is a comma-separated list of parameter names.

Please note:

  • words starting with a digit must be written in quotes

  • words containing wildcards must be written without quotes

  • the following special characters must be escaped by a preceding backslash if not written in quotes:

+ - && || ! ( ) { } [ ] ^ " ~ * ? : \

Examples:

\-999.0
missing: "-999.0" 
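
When search terms are assembled by a script, the escaping rule can be applied automatically. The following is a minimal sketch; the helper name escape_term is hypothetical and not part of any OCDB tool. It assumes that escaping each single ‘&’ and ‘|’ is an acceptable way to cover “&&” and “||”.

```python
import re

# Special characters from the list above. Assumption: every single "&" and "|"
# is escaped, which also covers the two-character operators "&&" and "||".
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\])')


def escape_term(term):
    """Put a backslash in front of every special character of an unquoted term."""
    return _SPECIAL.sub(r"\\\1", term)


print(escape_term("-999.0"))
print(escape_term("My_cruise(2021)"))
```

Terms that contain wildcards should not be passed through such a helper, since the wildcards themselves would be escaped.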

Operators AND/OR:

These operators allow you to combine conditions. As expected, “AND” implements a logical AND and “OR” a logical OR.

Please note:

  • The operators AND and OR must be written in upper case.

investigators: Colleen* AND start_date: '2016-04-01'
investigators: Colleen* OR investigators: *Helge*
fields: ,chl_a* OR fields: ,sza*

Operator TO to search for ranges:

Range limits are compared as strings, not as numbers. Searches with numeric ranges therefore require that start and end values have the same length, which is always the case for dates.

received: ["20191104" TO "20191108"]
start_date: ["19900101" TO "20211231"] AND end_date: ["20210101" TO "20221231"]
water_depth: ["10" TO "20"]
north_latitude: ["50" TO "60"]

The examples above will, in this order, list all files which:

  • have been submitted between 2019.11.04 and 2019.11.08

  • contain data in the period between 2021.01.01 and 2021.12.31

  • contain data measured at water depths between 10 and 20 meters

  • contain data at latitudes between 50 and 60 degrees north

Please note:

  • The operator TO must be written in upper case.

  • All words are treated as strings, even if they represent numeric content.

When applying the operator ‘TO’, alphanumerical comparisons are used (i. e. ‘C’ > ‘B’ is TRUE and ‘20’ < ‘9’ is TRUE as well!).
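
The effect of string comparison on range limits can be reproduced in a few lines of Python. This sketch only mimics the comparison rule stated above and does not query the OCDB; the function name in_string_range is made up for illustration.

```python
# Strings are compared character by character, from left to right.
print("C" > "B")   # True
print("20" < "9")  # True, because "2" sorts before "9"


def in_string_range(value, lower, upper):
    """Inclusive range test based on string comparison."""
    return lower <= value <= upper


# Limits of equal length behave like numbers.
same_length = in_string_range("15", "10", "20")
# Limits of different length do not: 15 lies between 9 and 20 numerically,
# but "15" does not lie between "9" and "20" as strings.
different_length = in_string_range("15", "9", "20")
# Dates written as yyyymmdd always have the same length.
date_in_range = in_string_range("20191106", "20191104", "20191108")

print(same_length, different_length, date_in_range)
```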

The following fields can be considered:

  • path: Path where data files are stored

  • received: Date when data were received (optional)

  • identifier_product_doi: Product DOI (conditional, if available)

  • investigators: Primary Investigators (PIs) of the experiment

  • affiliations: Affiliations of the PIs (see path)

  • contact: Contact (email address) of the PIs

  • experiment: Identifier of the experiment (see path)

  • cruise: Identifier of the cruise (see path)

  • station: Name of the station where data were obtained (conditional, i. e. required if station does not appear in fields)

  • data_file_name: Data file name

  • data_type: Data type (e.g. scan, cast, above_water, …) (mandatory)

  • data_status: Can be preliminary, update or final (optional but recommended)

  • start_date, end_date: Start and end date

  • start_time, end_time: Start and end time

  • north_latitude, south_latitude, west_longitude, east_longitude: Bounding box coordinates

  • water_depth: Water bottom depth at measurement point (in meters) (mandatory)

  • measurement_depth: Measurement/Sample depth (in meters) (conditional)

  • secchi_depth: Secchi depth (in meters) (optional but recommended)

  • missing: Fill value for invalid data (non-zero, common choice -9999)

  • below_detection_limit: Numeric NULL value for values below detection limit (optional but recommended, common choice -8888)

  • above_detection_limit: Numeric NULL value for values above detection limit (optional but recommended, common choice -7777)

  • delimiter: Delimiter of the data file: ‘tab’, ‘comma’ or ‘space’ (e. g. ‘delimiter: comma’)

Examples:

path: "My_Affiliation/My_experiment/My_cruise"
station: "Blyth_NOAH"
start_date: "20160429"
start_time: "17:04:16 [GMT]"
north_latitude: "61.134032 [DEG]"
missing: "-999.0" 

Note that some of the metadata headers in the above list are not mandatory, so the search results for these headers may not be exhaustive.

Search examples

Products (Parameter)

  1. Products can be chosen from a select list within the advanced search dialog. However, valid search results can only be obtained this way for products without a postfix such as a wavelength.

  2. For postfixed products such as ‘Rrs400’ or ‘SZA1020’ the search text field should be used. Every product name has to be followed by ‘*’ or ‘?’:

fields: SZA* OR fields: Chl_a*
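
If several postfixed products are needed, the search string can be assembled programmatically. The following is a minimal sketch; product_query is a hypothetical helper, not part of the OCDB tools, and it simply appends ‘*’ to each name and joins the conditions with OR.

```python
def product_query(products):
    """Build a search string matching each product name plus any postfix."""
    return " OR ".join(f"fields: {name}*" for name in products)


query = product_query(["SZA", "Chl_a"])
print(query)  # fields: SZA* OR fields: Chl_a*
```

The resulting string can be pasted into the search text field.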

Product groups

The web-based search interface allows you to restrict result sets to certain geophysical variable types, organised by groups. They can be chosen from a select list. A list of groups and the variables covered is given in the table below. The individual product acronyms are fully described in OCDB standard field names and units.

Group         Description
------------  ------------------------------------------------------------
a             Spectral absorption coefficients: a, ap, aph, ad, ag, …
b             Spectral scattering coefficients: b, bp
bb            Spectral backscattering coefficients: bb, bbp, bbw, beta (VSF)
c             Spectral attenuation coefficients: c, cg, cp, cpg, cnw
kd            Diffuse attenuation coefficients: kd, kl, kpar, ku
AOP           Apparent Optical Properties: ed, es, lt, lw, rrs, …
AOT           Aerosol optical thickness: AOT, angstrom, water vapor (wvp), ozone (oz)
PAR           Photosynthetically Active Radiation: epar, eupar, par
DC            Dissolved carbon: DIC, DOC, pCO2, total_Alkalinity, CDOM
PC            Particulate carbon: PC, PIC, POC
SPM           Suspended Particulate Matter: spm
nutrients     Si (sio4), N (n2_fix, nh4, no2, no2_no3, no3, tdn, urea), P (pn, po4, pon), oxygen
CTD           Hydrography: wt, sal/cond, sigmat
Chl           Chlorophyll, fluorometrically/spectrophotometrically derived: Chl*, phaeo, tpg
fluorescence  Fluorescence: natf, stimf
HPLC          HPLC derived phytoplankton pigments: allo, alpha*, anth, asta, beta-*, …
productivity  NPP, NCP, GPP, PP

For a detailed list of parameter names see: https://seabass.gsfc.nasa.gov/wiki/stdfields.

Time range

To select a time period covered by the data files, use the metadata headers start_date and end_date. The following query searches for data that at least partly cover the period from 1 January to 31 December 2021:

start_date: ["19000101" TO "20211231"] AND end_date: ["20210101" TO "20990101"]
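
A file covers part of a period if it starts on or before the last day of the period and ends on or after its first day. The sketch below checks this overlap condition locally on yyyymmdd strings; the file names and dates are hypothetical, and the code does not contact the OCDB.

```python
def overlaps(start_date, end_date, period_start, period_end):
    """True if [start_date, end_date] shares at least one day with the period.

    All arguments are yyyymmdd strings of equal length, so string comparison
    orders them chronologically.
    """
    return start_date <= period_end and end_date >= period_start


# Hypothetical files given as (start_date, end_date) pairs.
files = {
    "before": ("20190101", "20201231"),
    "partly": ("20201201", "20210115"),
    "inside": ("20210601", "20210630"),
    "after": ("20220101", "20220201"),
}

hits = [
    name
    for name, (start, end) in files.items()
    if overlaps(start, end, "20210101", "20211231")
]
print(hits)
```

Both conditions have to hold at the same time; a file that satisfies only one of them lies entirely before or entirely after the period.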

Region

  1. You can use the interactive map to select a region by a rectangle or a polygon.

  2. You can use the Python API or the OCDB command line interface to search for datasets by defining a certain region (see OCDB Command Line Client and Python API).