Creating a CF-1.6 timeSeries using pocean
Created: 2018-02-27
IOOS recommends that data providers' netCDF files follow the CF-1.6 standard. In this notebook we will create a CF-1.6 compliant file that follows the Discrete Sampling Geometries (DSG) of a timeSeries from a pandas DataFrame.
The pocean module can handle all the DSGs described in the CF-1.6 document: point, timeSeries, trajectory, profile, timeSeriesProfile, and trajectoryProfile. These DSG arrays may be represented in the netCDF file as:
orthogonal multidimensional: when the coordinates along the element axis of the features are identical;
incomplete multidimensional: when the features within a collection do not all have the same number of elements, but storage space is not an issue and allocating the length of the longest feature to all features is convenient;
contiguous ragged: can be used if the size of each feature is known;
indexed ragged: stores the features interleaved along the sample dimension in the data variable.
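The difference between these layouts can be sketched with plain NumPy arrays. This is only an illustration of the storage ideas, not how pocean writes files; the two features, the fill value, and all variable names below are hypothetical. In the orthogonal case every feature shares the same coordinate values, so a plain 2-D array suffices and no padding or bookkeeping is needed.

```python
import numpy as np

# Two hypothetical features (e.g. two stations) with different lengths.
features = [np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0])]

# Incomplete multidimensional: pad every feature to the longest one.
fill = -9999.9  # hypothetical fill value
longest = max(len(f) for f in features)
incomplete = np.full((len(features), longest), fill)
for i, f in enumerate(features):
    incomplete[i, : len(f)] = f

# Contiguous ragged: concatenate the features and store each feature's size.
contiguous = np.concatenate(features)
row_size = np.array([len(f) for f in features])

# Indexed ragged: samples may be interleaved; an index maps each sample
# to the feature it belongs to.
samples = np.array([1.0, 4.0, 2.0, 5.0, 3.0])
feature_index = np.array([0, 1, 0, 1, 0])

# Recover the individual features from each ragged layout.
from_contiguous = np.split(contiguous, np.cumsum(row_size)[:-1])
from_indexed = [samples[feature_index == i] for i in range(len(features))]
```

Both ragged layouts recover the same two features; they differ only in whether the samples of one feature must be stored next to each other.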
Here we will use the orthogonal multidimensional array to represent time-series data from a hypothetical current meter. We will use fake data in this example for convenience.
Our fake data represent a current meter located at 10 meters depth, with a record that starts one week ago.
from datetime import datetime, timedelta
import numpy as np
import pandas as pd
x = np.arange(100, 110, 0.1)
start = datetime.now() - timedelta(days=7)
df = pd.DataFrame(
    {
        "time": [start + timedelta(days=n) for n in range(len(x))],
        "longitude": -48.6256,
        "latitude": -27.5717,
        "depth": 10,
        "u": np.sin(x),
        "v": np.cos(x),
        "station": "fake buoy",
    }
)
df.tail()
| | time | longitude | latitude | depth | u | v | station |
|---|---|---|---|---|---|---|---|
| 95 | 2022-05-10 15:18:50.687502 | -48.6256 | -27.5717 | 10 | 0.440129 | -0.897934 | fake buoy |
| 96 | 2022-05-11 15:18:50.687502 | -48.6256 | -27.5717 | 10 | 0.348287 | -0.937388 | fake buoy |
| 97 | 2022-05-12 15:18:50.687502 | -48.6256 | -27.5717 | 10 | 0.252964 | -0.967476 | fake buoy |
| 98 | 2022-05-13 15:18:50.687502 | -48.6256 | -27.5717 | 10 | 0.155114 | -0.987897 | fake buoy |
| 99 | 2022-05-14 15:18:50.687502 | -48.6256 | -27.5717 | 10 | 0.055714 | -0.998447 | fake buoy |
Let’s take a look at our fake data.
%matplotlib inline
import matplotlib.pyplot as plt
from oceans.plotting import stick_plot
q = stick_plot([t.to_pydatetime() for t in df["time"]], df["u"], df["v"])
ref = 1
# The doubled braces keep a literal {-1} in the f-string for the mathtext exponent.
qk = plt.quiverkey(
    q, 0.1, 0.85, ref, f"{ref} m s$^{{-1}}$", labelpos="N", coordinates="axes"
)
plt.xticks(rotation=70)
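As a quick sanity check on what the stick plot displays, the velocity components can be converted to speed and direction. The sketch below rebuilds the same u and v arrays; the direction convention (degrees clockwise from north, pointing where the current flows toward) is an assumption made for this illustration and is not stated in the notebook.

```python
import numpy as np

x = np.arange(100, 110, 0.1)
u, v = np.sin(x), np.cos(x)

# Magnitude of the velocity vector.
speed = np.hypot(u, v)

# Direction the current flows toward, in degrees clockwise from north
# (assumed convention for this sketch).
direction = np.degrees(np.arctan2(u, v)) % 360
```

Because u = sin(x) and v = cos(x), the speed is 1 m/s at every time step, which matches the reference value of 1 used for the quiver key.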
pocean.dsg is relatively simple to use. The user must provide a DataFrame, like the one above, and a dictionary of attributes that maps to the data and adheres to the desired DSG conventions.
Because we want the file to work seamlessly with ERDDAP, we also add some ERDDAP-specific attributes, such as cdm_timeseries_variables and subsetVariables.
attributes = {
    "global": {
        "title": "Fake mooring",
        "summary": "Vector current meter ADCP @ 10 m",
        "institution": "Restaurant at the end of the universe",
        "cdm_timeseries_variables": "station",
        "subsetVariables": "depth",
        # These are only the required attributions from
        # https://ioos.github.io/ioos-metadata/ioos-metadata-profile-v1-2.html#attribution
        "creator_country": "USA",
        "creator_email": "fake_email@somedomain.org",
        "creator_institution": "IOOS",
        "creator_sector": "academic",
        "creator_url": "https://ioos.github.io/ioos_code_lab/content/intro.html",
        "publisher_country": "USA",
        "publisher_email": "fake_email@somedomain.org",
        "publisher_institution": "IOOS",
        "publisher_url": "https://ioos.github.io/ioos_code_lab/content/intro.html",
    },
    "longitude": {
        "units": "degrees_east",
        "standard_name": "longitude",
    },
    "latitude": {
        "units": "degrees_north",
        "standard_name": "latitude",
    },
    "z": {
        "units": "m",
        "standard_name": "depth",
        "positive": "down",
    },
    "u": {
        "units": "m/s",
        "standard_name": "eastward_sea_water_velocity",
    },
    "v": {
        "units": "m/s",
        "standard_name": "northward_sea_water_velocity",
    },
    "station": {"cf_role": "timeseries_id"},
}
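Since the attributes dictionary is keyed by name, it can be worth checking that every non-global key actually corresponds to a DataFrame column before writing the file. The helper below is a hypothetical convenience, not part of pocean, and it assumes that variable attributes are matched by column name; that behavior is not verified here. The columns and attributes are abbreviated stand-ins for the ones used in this notebook.

```python
def unmatched_attribute_keys(attributes, columns):
    """Return attribute keys that are neither "global" nor a column name."""
    return sorted(k for k in attributes if k != "global" and k not in columns)


# Abbreviated stand-ins for the DataFrame columns and attributes above.
columns = ["time", "longitude", "latitude", "depth", "u", "v", "station"]
attributes = {
    "global": {"title": "Fake mooring"},
    "longitude": {"units": "degrees_east"},
    "z": {"units": "m", "positive": "down"},
    "u": {"units": "m/s"},
}

print(unmatched_attribute_keys(attributes, columns))  # prints ['z']
```

Here the key "z" is reported because the depth column is named "depth". In the file header shown later in this notebook, the depth variable carries no units or positive attribute, so this kind of check may help spot attributes that were not applied.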
We also need to map our data axes to pocean’s defaults. This step is not needed if the data axes are already named like the default ones.
axes = {"t": "time", "x": "longitude", "y": "latitude", "z": "depth"}
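An alternative to passing an axes mapping is to rename the DataFrame columns so that they already use the default axis names. The sketch below assumes the defaults are t, x, y, and z, which is what the round-tripped DataFrame at the end of this notebook suggests; the small DataFrame is a hypothetical stand-in for the real one.

```python
import pandas as pd

axes = {"t": "time", "x": "longitude", "y": "latitude", "z": "depth"}

# Small stand-in for the notebook's DataFrame.
small = pd.DataFrame(
    {
        "time": pd.to_datetime(["2022-02-04", "2022-02-05"]),
        "longitude": -48.6256,
        "latitude": -27.5717,
        "depth": 10,
        "u": [0.1, 0.2],
    }
)

# Invert the mapping (column name -> axis name) and rename the columns.
renamed = small.rename(columns={name: axis for axis, name in axes.items()})
print(list(renamed.columns))  # prints ['t', 'x', 'y', 'z', 'u']
```

Keeping the descriptive column names and passing axes, as done in this notebook, has the advantage that the variables in the netCDF file keep readable names such as depth and longitude.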
from pocean.dsg.timeseries.om import OrthogonalMultidimensionalTimeseries
from pocean.utils import downcast_dataframe
df = downcast_dataframe(df) # safely cast depth np.int64 to np.int32
dsg = OrthogonalMultidimensionalTimeseries.from_dataframe(
    df,
    output="fake_buoy.nc",
    attributes=attributes,
    axes=axes,
)
The OrthogonalMultidimensionalTimeseries saves the DataFrame into a CF-1.6 timeSeries DSG.
!ncdump -h fake_buoy.nc
netcdf fake_buoy {
dimensions:
station = 1 ;
time = 100 ;
variables:
int crs ;
double time(time) ;
time:units = "seconds since 1990-01-01 00:00:00Z" ;
time:standard_name = "time" ;
time:axis = "T" ;
string station(station) ;
station:cf_role = "timeseries_id" ;
station:long_name = "station identifier" ;
double latitude(station) ;
latitude:axis = "Y" ;
latitude:units = "degrees_north" ;
latitude:standard_name = "latitude" ;
double longitude(station) ;
longitude:axis = "X" ;
longitude:units = "degrees_east" ;
longitude:standard_name = "longitude" ;
int depth(station) ;
depth:_FillValue = -9999 ;
depth:axis = "Z" ;
double u(station, time) ;
u:_FillValue = -9999.9 ;
u:units = "m/s" ;
u:standard_name = "eastward_sea_water_velocity" ;
u:coordinates = "time depth longitude latitude" ;
double v(station, time) ;
v:_FillValue = -9999.9 ;
v:units = "m/s" ;
v:standard_name = "northward_sea_water_velocity" ;
v:coordinates = "time depth longitude latitude" ;
// global attributes:
:Conventions = "CF-1.6" ;
:date_created = "2022-02-11T18:18:00Z" ;
:featureType = "timeseries" ;
:cdm_data_type = "Timeseries" ;
:title = "Fake mooring" ;
:summary = "Vector current meter ADCP @ 10 m" ;
:institution = "Restaurant at the end of the universe" ;
:cdm_timeseries_variables = "station" ;
:subsetVariables = "depth" ;
:creator_country = "USA" ;
:creator_email = "fake_email@somedomain.org" ;
:creator_institution = "IOOS" ;
:creator_sector = "academic" ;
:creator_url = "https://ioos.github.io/ioos_code_lab/content/intro.html" ;
:publisher_country = "USA" ;
:publisher_email = "fake_email@somedomain.org" ;
:publisher_institution = "IOOS" ;
:publisher_url = "https://ioos.github.io/ioos_code_lab/content/intro.html" ;
}
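The time variable in the header above is stored as a number with units "seconds since 1990-01-01 00:00:00Z". A minimal sketch of that encoding using only the standard library is shown below. The helper names are hypothetical, and the example timestamp is treated as UTC, which is an assumption because the notebook builds its times from a naive datetime.now().

```python
from datetime import datetime, timedelta, timezone

# Reference epoch taken from the time:units attribute in the header.
EPOCH = datetime(1990, 1, 1, tzinfo=timezone.utc)


def encode(timestamp):
    """Convert an aware datetime to seconds since the epoch."""
    return (timestamp - EPOCH).total_seconds()


def decode(seconds):
    """Convert seconds since the epoch back to an aware datetime."""
    return EPOCH + timedelta(seconds=seconds)


# Example timestamp, assumed to be UTC for this sketch.
stamp = datetime(2022, 2, 4, 15, 18, 50, tzinfo=timezone.utc)
seconds = encode(stamp)
restored = decode(seconds)
```

Decoding the encoded value returns the original timestamp, which is the same conversion a netCDF reader performs when it interprets the units attribute.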
It also returns the dsg object for inspection. Let us check a few things to see if our object was created as expected. (Note that some of the metadata came for “free” thanks to the built-in defaults in pocean.)
dsg.getncattr("featureType")
'timeseries'
type(dsg)
pocean.dsg.timeseries.om.OrthogonalMultidimensionalTimeseries
In addition to the standard netCDF4-python .variables attribute, pocean’s DSG objects provide a “categorized” view of the variables through the data_vars, ancillary_vars, and axes methods.
[(v.standard_name) for v in dsg.data_vars()]
['eastward_sea_water_velocity', 'northward_sea_water_velocity']
dsg.axes("T")
[<class 'netCDF4._netCDF4.Variable'>
float64 time(time)
units: seconds since 1990-01-01 00:00:00Z
standard_name: time
axis: T
unlimited dimensions:
current shape = (100,)
filling on, default _FillValue of 9.969209968386869e+36 used]
dsg.axes("Z")
[<class 'netCDF4._netCDF4.Variable'>
int32 depth(station)
_FillValue: -9999
axis: Z
unlimited dimensions:
current shape = (1,)
filling on]
dsg.vatts("station")
{'cf_role': 'timeseries_id', 'long_name': 'station identifier'}
dsg["station"][:]
array(['fake buoy'], dtype=object)
dsg.vatts("u")
{'_FillValue': -9999.9,
'units': 'm/s',
'standard_name': 'eastward_sea_water_velocity',
'coordinates': 'time depth longitude latitude'}
We can easily round-trip back to the pandas DataFrame object.
dsg.to_dataframe().head()
| | t | x | y | z | station | u | v |
|---|---|---|---|---|---|---|---|
| 0 | 2022-02-04 15:18:50.687502 | -48.6256 | -27.5717 | 10 | fake buoy | -0.506366 | 0.862319 |
| 1 | 2022-02-05 15:18:50.687502 | -48.6256 | -27.5717 | 10 | fake buoy | -0.417748 | 0.908563 |
| 2 | 2022-02-06 15:18:50.687502 | -48.6256 | -27.5717 | 10 | fake buoy | -0.324956 | 0.945729 |
| 3 | 2022-02-07 15:18:50.687502 | -48.6256 | -27.5717 | 10 | fake buoy | -0.228917 | 0.973446 |
| 4 | 2022-02-08 15:18:50.687502 | -48.6256 | -27.5717 | 10 | fake buoy | -0.130591 | 0.991436 |
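The round-tripped DataFrame uses the axis names (t, x, y, z) and a different column order, so comparing it with the original takes a little alignment first. The sketch below shows one way to do that with pandas only. Both frames are small hypothetical stand-ins, with round_tripped mimicking what dsg.to_dataframe() returns, so the snippet runs without pocean.

```python
import pandas as pd

axes = {"t": "time", "x": "longitude", "y": "latitude", "z": "depth"}

# Stand-in for the original DataFrame.
original = pd.DataFrame(
    {
        "time": pd.to_datetime(["2022-02-04", "2022-02-05"]),
        "depth": [10, 10],
        "u": [-0.5, -0.4],
        "station": "fake buoy",
    }
)

# Stand-in for dsg.to_dataframe(): axis names, a different column order,
# and depth downcast to int32.
round_tripped = pd.DataFrame(
    {
        "t": original["time"],
        "z": original["depth"].astype("int32"),
        "station": original["station"],
        "u": original["u"],
    }
)

# Map the axis names back to the data names and restore the column order.
restored = round_tripped.rename(columns=axes)[original.columns]

# Ignore dtype differences such as int32 versus int64.
pd.testing.assert_frame_equal(restored, original, check_dtype=False)
```

With real data, floating-point values may not match exactly after a write and read cycle, so a tolerance-based comparison may be more appropriate than strict equality.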
For more information on pocean, please check the docs.