system-test

IOOS DMAC System Integration Test project

Catalog based search for the IOOS Regional Associations acronyms

The goal of this post is to investigate whether the NGDC CSW catalog can be queried for records matching an IOOS RA acronym, such as SECOORA.

In the cell below we do the usual: instantiate a Catalogue Service for the Web (CSW) object using the NGDC catalog endpoint.

In [3]:
from owslib.csw import CatalogueServiceWeb

endpoint = 'http://www.ngdc.noaa.gov/geoportal/csw'
csw = CatalogueServiceWeb(endpoint, timeout=30)

We need a list of all the Regional Associations we know.

In [4]:
ioos_ras = ['AOOS',      # Alaska
            'CaRA',      # Caribbean
            'CeNCOOS',   # Central and Northern California
            'GCOOS',     # Gulf of Mexico
            'GLOS',      # Great Lakes
            'MARACOOS',  # Mid-Atlantic
            'NANOOS',    # Pacific Northwest 
            'NERACOOS',  # Northeast Atlantic 
            'PacIOOS',   # Pacific Islands 
            'SCCOOS',    # Southern California
            'SECOORA']   # Southeast Atlantic

To streamline the query we can create a function that instantiates the fes filter, runs the request, and returns the csw object with the matching records attached.

In [5]:
from owslib.fes import PropertyIsEqualTo

def query_ra(csw, ra='SECOORA'):
    q = PropertyIsEqualTo(propertyname='apiso:Keywords', literal=ra)
    csw.getrecords2(constraints=[q], maxrecords=100, esn='full')
    return csw

Here is what we got:

In [6]:
for ra in ioos_ras:
    csw = query_ra(csw, ra)
    ret = csw.results['returned']
    word = 'record' if ret == 1 else 'records'
    print("{0:>8} has {1:>3} {2}".format(ra, ret, word))
    csw.records.clear()
    AOOS has   1 record
    CaRA has   0 records
 CeNCOOS has   7 records
   GCOOS has   5 records
    GLOS has  15 records
MARACOOS has 100 records
  NANOOS has   1 record
NERACOOS has 100 records
 PacIOOS has   0 records
  SCCOOS has  23 records
 SECOORA has  71 records

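I would not trust those numbers completely. Surely some of the RAs listed above have more than 0 or 1 record, and the counts of exactly 100 for MARACOOS and NERACOOS most likely reflect the maxrecords=100 cap in query_ra rather than the true totals.

Note that csw.records holds more information. Let's inspect one of SECOORA's stations, for example.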
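
Because query_ra asks for maxrecords=100, the 'returned' value can never exceed 100 in a single request. Below is a minimal sketch of paging through the full result set. It assumes the catalog honors the startposition argument of getrecords2 and fills in results['matches'] and results['nextrecord'] the way OWSLib normally does. The names fetch_all and page_size are hypothetical and not part of the original notebook.

```python
def fetch_all(csw, constraints, page_size=100, esn='full'):
    """Collect every record matching `constraints`, one page at a time.

    `csw` is expected to behave like owslib.csw.CatalogueServiceWeb.
    This helper is hypothetical and not part of the original notebook.
    """
    records = {}
    start = 1  # CSW start positions are 1-based.
    while True:
        csw.getrecords2(constraints=constraints, startposition=start,
                        maxrecords=page_size, esn=esn)
        records.update(csw.records)
        nxt = csw.results.get('nextrecord', 0)
        matches = csw.results.get('matches', 0)
        # Stop when the server signals the end (nextrecord == 0), when the
        # next position is past the last match, or when it fails to advance.
        if not nxt or nxt > matches or nxt <= start:
            break
        start = nxt
    return records
```

With such a helper, len(fetch_all(csw, [q])) could stand in for csw.results['returned']. If only the total is needed, reading csw.results['matches'] after a single request should be cheaper.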

In [7]:
csw = query_ra(csw, 'SECOORA')
key = list(csw.records.keys())[0]  # first record identifier; list() is needed on Python 3

print(key)
id_usf.tas.ngwlms

We can verify the station type, title, and last date of modification.

In [8]:
station = csw.records[key]

station.type, station.title, station.modified
Out[8]:
('downloadableData', 'usf.tas.ngwlms', '2015-11-25T01:32:42-07:00')
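
The modified field comes back as a plain ISO 8601 string. Here is a small sketch, using only the standard library and the value shown above, of turning it into a timezone-aware datetime so that records could be compared or sorted by modification time.

```python
from datetime import datetime, timezone

modified = '2015-11-25T01:32:42-07:00'  # value of station.modified above

# fromisoformat (Python 3.7+) keeps the -07:00 offset.
stamp = datetime.fromisoformat(modified)
# Normalize to UTC so timestamps from different records are comparable.
stamp_utc = stamp.astimezone(timezone.utc)
print(stamp_utc.isoformat())  # 2015-11-25T08:32:42+00:00
```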

The subjects field contains the variables and some useful keywords.

In [9]:
station.subjects
Out[9]:
['air_pressure',
 'air_temperature',
 'water_surface_height_above_reference_datum',
 'wind_from_direction',
 'wind_speed_of_gust',
 'wind_speed',
 'SECOORA',
 'air_pressure',
 'air_temperature',
 'water_surface_height_above_reference_datum',
 'wind_from_direction',
 'wind_speed_of_gust',
 'wind_speed',
 'latitude',
 'longitude',
 'time',
 'climatologyMeteorologyAtmosphere']
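
Several keywords appear twice in the output above. If a unique list is needed, one order-preserving option is dict.fromkeys. The subjects list below is copied from the output so the snippet runs without a catalog connection.

```python
subjects = ['air_pressure',
            'air_temperature',
            'water_surface_height_above_reference_datum',
            'wind_from_direction',
            'wind_speed_of_gust',
            'wind_speed',
            'SECOORA',
            'air_pressure',
            'air_temperature',
            'water_surface_height_above_reference_datum',
            'wind_from_direction',
            'wind_speed_of_gust',
            'wind_speed',
            'latitude',
            'longitude',
            'time',
            'climatologyMeteorologyAtmosphere']

# dict keys are unique and keep insertion order (Python 3.7+),
# so this drops repeats while preserving the first occurrence.
unique_subjects = list(dict.fromkeys(subjects))
print(len(subjects), len(unique_subjects))  # 17 11
```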

And we can access the full XML description for the station.

In [10]:
print(station.xml)
<csw:Record xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcmiBox="http://dublincore.org/documents/2000/07/11/dcmi-box/" xmlns:dct="http://purl.org/dc/terms/" xmlns:gml="http://www.opengis.net/gml" xmlns:ows="http://www.opengis.net/ows" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<dc:identifier scheme="urn:x-esri:specification:ServiceType:ArcIMS:Metadata:FileID">id_usf.tas.ngwlms</dc:identifier>
<dc:identifier scheme="urn:x-esri:specification:ServiceType:ArcIMS:Metadata:DocID">{9DDE8E32-EB36-4E72-B2CC-A47D51151271}</dc:identifier>
<dc:title>usf.tas.ngwlms</dc:title>
<dc:type scheme="urn:x-esri:specification:ServiceType:ArcIMS:Metadata:ContentType">downloadableData</dc:type>
<dc:type scheme="urn:x-esri:specification:ServiceType:ArcIMS:Metadata:ContentType">liveData</dc:type>
<dc:subject>air_pressure</dc:subject>
<dc:subject>air_temperature</dc:subject>
<dc:subject>water_surface_height_above_reference_datum</dc:subject>
<dc:subject>wind_from_direction</dc:subject>
<dc:subject>wind_speed_of_gust</dc:subject>
<dc:subject>wind_speed</dc:subject>
<dc:subject>SECOORA</dc:subject>
<dc:subject>air_pressure</dc:subject>
<dc:subject>air_temperature</dc:subject>
<dc:subject>water_surface_height_above_reference_datum</dc:subject>
<dc:subject>wind_from_direction</dc:subject>
<dc:subject>wind_speed_of_gust</dc:subject>
<dc:subject>wind_speed</dc:subject>
<dc:subject>latitude</dc:subject>
<dc:subject>longitude</dc:subject>
<dc:subject>time</dc:subject>
<dc:subject>climatologyMeteorologyAtmosphere</dc:subject>
<dct:modified>2015-11-25T01:32:42-07:00</dct:modified>
<dct:references scheme="urn:x-esri:specification:ServiceType:distribution:url">http://tds.secoora.org/thredds/dodsC/usf.tas.ngwlms.nc.html</dct:references>
<dct:references scheme="urn:x-esri:specification:ServiceType:distribution:url">http://www.ncdc.noaa.gov/oa/wct/wct-jnlp-beta.php?singlefile=http://tds.secoora.org/thredds/dodsC/usf.tas.ngwlms.nc</dct:references>
<dct:references scheme="urn:x-esri:specification:ServiceType:sos:url">http://tds.secoora.org/thredds/sos/usf.tas.ngwlms.nc?service=SOS&amp;version=1.0.0&amp;request=GetCapabilities</dct:references>
<dct:references scheme="urn:x-esri:specification:ServiceType:odp:url">http://tds.secoora.org/thredds/dodsC/usf.tas.ngwlms.nc</dct:references>
<dct:references scheme="urn:x-esri:specification:ServiceType:download:url">http://tds.secoora.org/thredds/dodsC/usf.tas.ngwlms.nc.html</dct:references>
<ows:WGS84BoundingBox>
<ows:LowerCorner>-82.75800323486328 28.1560001373291</ows:LowerCorner>
<ows:UpperCorner>-82.75800323486328 28.1560001373291</ows:UpperCorner>
</ows:WGS84BoundingBox>
<ows:BoundingBox>
<ows:LowerCorner>-82.75800323486328 28.1560001373291</ows:LowerCorner>
<ows:UpperCorner>-82.75800323486328 28.1560001373291</ows:UpperCorner>
</ows:BoundingBox>
<dc:source>{B3EA8869-B726-4E39-898A-299E53ABBC98}</dc:source>
</csw:Record>
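
The dct:references elements are probably the most useful part of this record, since they hold the service URLs. Here is a sketch of pulling them out with the standard library's ElementTree, grouped by the service name embedded in the scheme attribute. The XML is a trimmed copy of the record above, and references_by_scheme is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

DCT = 'http://purl.org/dc/terms/'

# Trimmed copy of the record printed above (two references only).
record_xml = """<csw:Record xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:dct="http://purl.org/dc/terms/">
<dct:references scheme="urn:x-esri:specification:ServiceType:sos:url">http://tds.secoora.org/thredds/sos/usf.tas.ngwlms.nc?service=SOS&amp;version=1.0.0&amp;request=GetCapabilities</dct:references>
<dct:references scheme="urn:x-esri:specification:ServiceType:odp:url">http://tds.secoora.org/thredds/dodsC/usf.tas.ngwlms.nc</dct:references>
</csw:Record>"""


def references_by_scheme(xml_text):
    """Map a service name ('sos', 'odp', ...) to its list of URLs."""
    root = ET.fromstring(xml_text)  # accepts str or bytes
    refs = {}
    for el in root.iter('{%s}references' % DCT):
        scheme = el.get('scheme', '')
        parts = scheme.split(':')
        # 'urn:x-esri:specification:ServiceType:sos:url' -> 'sos'
        service = parts[-2] if len(parts) >= 2 else scheme
        refs.setdefault(service, []).append(el.text)
    return refs


refs = references_by_scheme(record_xml)
print(refs['odp'][0])
```

Passing station.xml instead of record_xml should work the same way, assuming the record has the structure shown above.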


This query is very simple, but also very powerful. We can quickly assess the data available for a given Regional Association with just a few lines of code.

You can see the original notebook here.

This post was written as an IPython notebook. It is available for download. You can also try an interactive version on binder.