Accessing data from SOS using NHC GIS files

This post is the part 1 of 4 of a notebook series on how to obtain IOOS/NOAA data starting from a Hurricane track GIS format.

We will download Sea Surface Height (SSH) data from the Sensor Observation Service (SOS) closest to where Hurricane Michael made landfall.

The first step is to obtain Hurricane Michael’s GIS data from the National Hurricane Center (NHC).

The function below downloads the “best track” and load all the information we will use into two GeoPandas dataframes, one for the radii shape and another one for the track points.

The inputs for the function based on the NHC storms code:

  • 2 char for region al → Atlantic

  • 2 char for number 14 is Michael

  • and 4 char for year, 2018

import os
from pathlib import Path

import geopandas
import pandas as pd


def load_best_track(code="al14", year="2018"):
    fname = Path(f"{code}{year}_best_track.zip")

    if not fname.is_file():
        import urllib.request

        url = f"https://www.nhc.noaa.gov/gis/best_track/{fname}"
        urllib.request.urlretrieve(url, fname)

    os.environ["CPL_ZIP_ENCODING"] = "UTF-8"

    radii = geopandas.read_file(
        f"zip://{fname}/{code.upper()}{year}_radii.shp"
    )

    pts = geopandas.read_file(f"zip://{fname}/{code.upper()}{year}_pts.shp")

    pts["str"] = pts["DTG"].astype(int).astype(str)

    pts.index = pd.to_datetime(pts["str"], format="%Y%m%d%H", errors="coerce").values

    radii.index = pd.to_datetime(
        radii["SYNOPTIME"], format="%Y%m%d%H", errors="coerce"
    ).values
    return radii, pts

Now we can easily load track data and obtain the information we need to start querying the SOS service.

radii, pts = load_best_track(code="al14", year="2018")

start = radii.index[0]
stop = radii.index[-1]
bbox = radii["geometry"].total_bounds

The only two pieces of information that will not come from the GIS file are the variable of interest and the buoy. The former is user defined and the latter could be auto-discovered. We will cover that in a future notebook.

variable = "water_surface_height_above_reference_datum"

buoy = "8728690"

For the sake of better understanding the SOS service we will create the data endpoint “by hand.” In the cell below we “fill” the URL with the information we got above.

url = (
    "https://opendap.co-ops.nos.noaa.gov/ioos-dif-sos/SOS?"
    "service=SOS"
    "&request=GetObservation"
    "&version=1.0.0"
    f"&observedProperty={variable}"
    f"&offering=urn:ioos:station:NOAA.NOS.CO-OPS:{buoy}"
    "&responseFormat=text/csv"
    f"&eventTime={start:%Y-%m-%dT%H:%M:%SZ}/{stop:%Y-%m-%dT%H:%M:%SZ}"
    "&result=VerticalDatum==urn:ogc:def:datum:epsg::5103"
    "&dataType=PreliminarySixMinute"
)

print(url)
https://opendap.co-ops.nos.noaa.gov/ioos-dif-sos/SOS?service=SOS&request=GetObservation&version=1.0.0&observedProperty=water_surface_height_above_reference_datum&offering=urn:ioos:station:NOAA.NOS.CO-OPS:8728690&responseFormat=text/csv&eventTime=2018-10-07T12:00:00Z/2018-10-15T06:00:00Z&result=VerticalDatum==urn:ogc:def:datum:epsg::5103&dataType=PreliminarySixMinute

Now we can easily load the SSH data as a pandas DataFrame.

df = pd.read_csv(url, index_col="date_time", parse_dates=True)

df.head()
station_id sensor_id latitude (degree) longitude (degree) water_surface_height_above_reference_datum (m) datum_id vertical_position (m) sigma quality_flags
date_time
2018-10-07 12:00:00+00:00 urn:ioos:station:NOAA.NOS.CO-OPS:8728690 urn:ioos:sensor:NOAA.NOS.CO-OPS:8728690:A1 29.7244 -84.9806 0.349 urn:ogc:def:datum:epsg::5103 1.539 0.001 0;0;0;0
2018-10-07 12:06:00+00:00 urn:ioos:station:NOAA.NOS.CO-OPS:8728690 urn:ioos:sensor:NOAA.NOS.CO-OPS:8728690:A1 29.7244 -84.9806 0.341 urn:ogc:def:datum:epsg::5103 1.539 0.003 0;0;0;0
2018-10-07 12:12:00+00:00 urn:ioos:station:NOAA.NOS.CO-OPS:8728690 urn:ioos:sensor:NOAA.NOS.CO-OPS:8728690:A1 29.7244 -84.9806 0.336 urn:ogc:def:datum:epsg::5103 1.539 0.003 0;0;0;0
2018-10-07 12:18:00+00:00 urn:ioos:station:NOAA.NOS.CO-OPS:8728690 urn:ioos:sensor:NOAA.NOS.CO-OPS:8728690:A1 29.7244 -84.9806 0.337 urn:ogc:def:datum:epsg::5103 1.539 0.001 0;0;0;0
2018-10-07 12:24:00+00:00 urn:ioos:station:NOAA.NOS.CO-OPS:8728690 urn:ioos:sensor:NOAA.NOS.CO-OPS:8728690:A1 29.7244 -84.9806 0.329 urn:ogc:def:datum:epsg::5103 1.539 0.003 0;0;0;0

To make it easy to navigate the DataFrame let’s create a helper function that extracts the metadata from the columns. (Pandas DataFrames are table formats so the metadata are columns with value repeated over the data length.)

def extract_metadata(col):
    value = col.unique()
    if len(value) > 1:
        raise ValueError(f"Expected a single value but got {len(value)}")
    return value.squeeze().tolist()
import hvplot.pandas  # noqa

col = df.columns[df.columns.str.startswith(variable)]
sensor_id = extract_metadata(df["sensor_id"])

df[col].hvplot.line(
    figsize=(9, 2.75), legend=False, grid=True, title=sensor_id,
)