IOOS QARTOD software (ioos_qc)

Created: 2020-02-14

Updated: 2022-05-23

This post demonstrates how to run ioos_qc on a time-series dataset. ioos_qc implements the tests from the Quality Assurance / Quality Control of Real-Time Oceanographic Data (QARTOD) project.

We will use bokeh for interactive plots, so let’s start by loading the interactive notebook output.

from bokeh.plotting import output_notebook

output_notebook()

We will be using the water level data from a fixed station in Kotzebue, AK.

Below we create a simple Quality Assurance/Quality Control (QA/QC) configuration that will be used as input for ioos_qc. All the interval values are in the same units as the data.

For more information on the tests and recommended QA/QC values, see the ioos_qc documentation for each test and its inputs.

qc_config = {
    "qartod": {
        "gross_range_test": {
            "fail_span": [-10, 10],
            "suspect_span": [-2, 3],
        },
        "flat_line_test": {
            "tolerance": 0.001,
            "suspect_threshold": 10800,
            "fail_threshold": 21600,
        },
        "spike_test": {
            "suspect_threshold": 0.8,
            "fail_threshold": 3,
        },
    }
}

Now we are ready to load the data, run tests and plot results!

We will get the data from the AOOS ERDDAP server.

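import cf_xarray  # registers the .cf accessor on xarray objects
from erddapy import ERDDAP

# Assumed AOOS ERDDAP endpoint and protocol; verify both before use.
e = ERDDAP(server="https://erddap.aoos.org/erddap/", protocol="tabledap")
e.dataset_id = "kotzebue-alaska-water-level"
e.constraints = {
    "time>=": "2018-09-05T21:00:00Z",
    "time<=": "2019-07-10T19:00:00Z",
}

data = e.to_xarray()
data.cf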
Coordinates:
- CF Axes:   X: ['longitude']
             Y: ['latitude']
             T: ['time']
             Z: n/a

- CF Coordinates:   longitude: ['longitude']
                    latitude: ['latitude']
                    time: ['time']
                    vertical: n/a

- Cell Measures:   area, volume: n/a

- Standard Names:   latitude: ['latitude']
                    longitude: ['longitude']
                    time: ['time']

- Bounds:   n/a

Data Variables:
- Cell Measures:   area, volume: n/a

- Standard Names:   aggregate_quality_flag: ['sea_surface_height_above_sea_level_geoid_mhhw_qc_agg']
                    altitude: ['z']
                    sea_surface_height_above_sea_level: ['sea_surface_height_above_sea_level_geoid_mhhw']
                    sea_surface_height_above_sea_level quality_flag: ['sea_surface_height_above_sea_level_geoid_mhhw_qc_tests']

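- Bounds:   n/a

from ioos_qc.config import QcConfig

qc = QcConfig(qc_config)

# The result is always a list, but we only want the first (and, in this case, only) variable.
variable_name = data.cf.standard_names["sea_surface_height_above_sea_level"][0]

# Run every configured test on the observations, using the time axis for the time-based tests.
qc_results = qc.run(
    inp=data[variable_name],
    tinp=data["time"].data,
)

qc_results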

defaultdict(collections.OrderedDict,
            {'qartod': OrderedDict([('gross_range_test',
                           array([1, 1, 1, ..., 1, 1, 1], dtype=uint8)),
                          ('flat_line_test',
                           array([1, 1, 1, ..., 1, 1, 1], dtype=uint8)),
                          ('spike_test',
                           array([2, 1, 1, ..., 1, 1, 2], dtype=uint8))])})

The results are returned as a dictionary that mirrors the input configuration, with one array of flags per test. Although such an array may be a masked array, it holds flag values rather than a boolean mask, so it should not be applied to the data as a mask directly. The flags range from 1 to 4, meaning:

  1. the data point passed the QA/QC test

  2. the test did not run on this data point

  3. the data point is flagged as suspect

  4. the data point is flagged as failed
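Before plotting, it can help to tally how many points fall under each flag. The sketch below is a minimal, standard-library-only illustration of that idea; `FLAG_NAMES`, `summarize_flags`, and the sample values are hypothetical and not part of ioos_qc.

```python
from collections import Counter

# Flag meanings as listed above (hypothetical helper, not an ioos_qc API).
FLAG_NAMES = {1: "pass", 2: "not run", 3: "suspect", 4: "fail"}


def summarize_flags(flags):
    """Count how many data points carry each flag value."""
    counts = Counter(int(flag) for flag in flags)
    return {name: counts.get(value, 0) for value, name in FLAG_NAMES.items()}


# Made-up flags for illustration only; real ones would come from
# qc_results["qartod"][test_name].
sample_flags = [2, 1, 1, 3, 4, 1, 2]
summary = summarize_flags(sample_flags)
print(summary)
```

The same helper could be called once per test name to compare how strict each test is on a given series.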

Now we can write a plotting function that will read these results and flag the data.

import numpy as np
import matplotlib.pyplot as plt


def plot_results(data, variable_name, results, title, test_name):
    time = data["time"]
    obs = data[variable_name]
    qc_test = results["qartod"][test_name]

    # Keep only the observations carrying each flag; everything else is masked.
    qc_pass = np.ma.masked_where(qc_test != 1, obs)
    qc_suspect = np.ma.masked_where(qc_test != 3, obs)
    qc_fail = np.ma.masked_where(qc_test != 4, obs)
    qc_notrun = np.ma.masked_where(qc_test != 2, obs)

    fig, ax = plt.subplots(figsize=(15, 3.75))
    ax.set_title(f"{test_name}: {title}")
    ax.set_ylabel("Observation Value")

    # Draw the raw series as a line and overlay each flag class as markers.
    kw = {"marker": "o", "linestyle": "none"}
    ax.plot(time, obs, label="obs", color="#A6CEE3")
    ax.plot(time, qc_notrun, markersize=2, label="qc not run", color="gray", alpha=0.2, **kw)
    ax.plot(time, qc_pass, markersize=4, label="qc pass", color="green", alpha=0.5, **kw)
    ax.plot(time, qc_suspect, markersize=4, label="qc suspect", color="orange", alpha=0.7, **kw)
    ax.plot(time, qc_fail, markersize=6, label="qc fail", color="red", alpha=1.0, **kw)
    ax.legend()
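A common way to keep only the points carrying a given flag is NumPy's `np.ma.masked_where`, which masks every element where the condition is true. The toy example below, with made-up observations and flags, shows the effect on a small array; plotting libraries such as matplotlib skip masked entries, which is why only the flagged points appear as markers.

```python
import numpy as np

# Made-up observations and QC flags, for illustration only.
obs = np.array([0.2, 0.3, 5.0, 0.4, 12.0])
flags = np.array([1, 1, 3, 1, 4], dtype="uint8")

# Mask everything that is NOT suspect (3) or NOT failed (4), respectively.
suspect_only = np.ma.masked_where(flags != 3, obs)
fail_only = np.ma.masked_where(flags != 4, obs)

print(suspect_only)
print(fail_only)
```

Calling `.compressed()` on either result returns just the unmasked values, which is handy for quick checks.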

title = "Water Level [MHHW] [m] : Kotzebue, AK"

The gross range test should fail data outside the ±10 range and flag as suspect any data below -2 or above 3. As one can easily see, all the major spikes are flagged as expected.

plot_results(data, variable_name, qc_results, title, "gross_range_test")
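The thresholding behind the gross range test can be sketched in a few lines of plain Python. This is a simplified illustration, not ioos_qc's implementation: `gross_range_flags` is a hypothetical helper, the sample water levels are made up, values exactly on a span boundary are assumed to pass, and missing values are not handled.

```python
def gross_range_flags(values, fail_span, suspect_span):
    """Return one flag per value: 4 = fail, 3 = suspect, 1 = pass."""
    fail_lo, fail_hi = fail_span
    suspect_lo, suspect_hi = suspect_span
    flags = []
    for value in values:
        if value < fail_lo or value > fail_hi:
            flags.append(4)  # outside the widest acceptable range
        elif value < suspect_lo or value > suspect_hi:
            flags.append(3)  # acceptable, but outside the expected range
        else:
            flags.append(1)
    return flags


# Made-up water levels, checked against the spans used in qc_config.
levels = [0.5, -2.5, 3.2, 12.0, -11.0, 3.0]
flags = gross_range_flags(levels, fail_span=(-10, 10), suspect_span=(-2, 3))
print(flags)
```

The fail check comes first so that a value outside both spans is reported as failed rather than suspect.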


An actual spike test, based on a data increase threshold, flags spikes similar to those caught by the gross range test but also identifies other unusual increases in the series as suspect.

plot_results(data, variable_name, qc_results, title, "spike_test")
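One common formulation of a spike test compares each point with the average of its two neighbours and flags it when the difference exceeds a threshold. The sketch below follows that formulation in plain Python; whether it matches ioos_qc's internals in every detail is an assumption, and `spike_flags` and the sample series are hypothetical.

```python
def spike_flags(values, suspect_threshold, fail_threshold):
    """Flag interior points that jump away from the mean of their two neighbours."""
    # 2 = not run: the first and last points have only one neighbour.
    flags = [2] * len(values)
    for i in range(1, len(values) - 1):
        reference = (values[i - 1] + values[i + 1]) / 2
        jump = abs(values[i] - reference)
        if jump > fail_threshold:
            flags[i] = 4
        elif jump > suspect_threshold:
            flags[i] = 3
        else:
            flags[i] = 1
    return flags


# Made-up series with a small jump and a large one, using the thresholds from qc_config.
series = [0.1, 0.2, 1.3, 0.2, 0.1, 4.0, 0.0]
flags = spike_flags(series, suspect_threshold=0.8, fail_threshold=3)
print(flags)
```

With this formulation the neighbour of a very large spike can itself be flagged as suspect, because the spike pulls its reference value away from it.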


The flat line test identifies issues with the data where values are “stuck.”

ioos_qc successfully identified a large portion of the data where that happens and flagged a smaller one as suspect. (Zoom in on the red point at the left to see this one.)

plot_results(data, variable_name, qc_results, title, "flat_line_test")
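The idea of "stuck" values can be sketched as follows: for each point, look back in time and measure how long the series has stayed within a small tolerance. This is a simplified plain-Python illustration, not ioos_qc's implementation; `flat_line_flags` is hypothetical, the sample data are made up, and the thresholds are assumed to be durations in seconds (so 10800 is three hours and 21600 is six).

```python
def flat_line_flags(times, values, tolerance, suspect_threshold, fail_threshold):
    """Flag points that end a run of near-constant values lasting too long.

    `times` are seconds since an arbitrary origin; thresholds are in seconds.
    """
    flags = []
    for i in range(len(values)):
        lo = hi = values[i]
        flat_seconds = 0
        # Walk backwards while the whole window stays within the tolerance.
        for j in range(i - 1, -1, -1):
            lo, hi = min(lo, values[j]), max(hi, values[j])
            if hi - lo >= tolerance:
                break
            flat_seconds = times[i] - times[j]
        if flat_seconds >= fail_threshold:
            flags.append(4)
        elif flat_seconds >= suspect_threshold:
            flags.append(3)
        else:
            flags.append(1)
    return flags


# Made-up hourly samples: the value sticks at 0.7 for several hours.
times = [hour * 3600 for hour in range(10)]
values = [0.5, 0.6, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.9]
flags = flat_line_flags(times, values, tolerance=0.001,
                        suspect_threshold=10800, fail_threshold=21600)
print(flags)
```

The nested loop keeps the sketch readable; a rolling-window approach would be preferable for long series.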


This notebook was adapted from Jessica Austin and Kyle Wilcox’s original ioos_qc examples. Please see the ioos_qc documentation for more examples.