Gulf of Maine Wilkinson Basin Time Series Station Calanus Abundance Observations
Background
Gulf of Maine Wilkinson Basin Time Series Station (WBTS) Calanus Abundance Observations data collection was supported by multiple awards to NERACOOS, the University of New Hampshire and the University of Maine from various funding agencies, including NSF, NOAA and BOEM.
Projects: NERACOOS ISMN (Integrated Sentinel Monitoring Network) - MBON (Marine Biodiversity Observation Network) Gulf of Maine Plankton Observation Time Series: Wilkinson Basin Time Series Station (WBTS)
Creator: Jeffrey Runge, Ph.D. (School of Marine Sciences, University of Maine, Darling Marine Center)
Data processor: Dylan Pugh
Data Flow Diagram
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#007396',
'primaryTextColor': '#fff',
'primaryBorderColor': '#003087',
'lineColor': '#003087',
'secondaryColor': '#007396',
'tertiaryColor': '#CCD1D1'
},
'flowchart': { 'curve': 'basis' }
}
}%%
flowchart LR
A["Marine Life data
&
metadata"]
B[("Raw Data
Access Point
(e.g. RA ERDDAP)")]
C("Darwin Core
Alignment")
D[(NCEI)]
E[("OBIS node")]
G([OBIS])
H([GBIF])
I[("IOOS Data Catalog
(data.ioos.us)
(metadata only)")]
FC(["Federal Catalogs/
Products"])
A --> B
B ---> I
B --> C
B --> D
C --> E
E --> G
E --> H
I --> FC
G --> FC
H --> FC
D --> FC
click B "http://www.neracoos.org/erddap/tabledap/WBTS_CFIN_2004_2017.html" "NERACOOS ERDDAP" _blank
click C "https://github.com/ioos/bio_data_guide/issues/102" "Data standardization" _blank
click D "https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=gov.noaa.nodc:0250940" "NCEI" _blank
click E "https://www1.usgs.gov/obis-usa/ipt/resource?r=gom_wbts_mesozooplankton" "OBIS-USA IPT" _blank
click G "https://obis.org/dataset/5ef55cd8-05a1-4569-8e17-ceb224e40f59" "OBIS" _blank
click H "https://www.gbif.org/dataset/29651377-23c8-4f45-b439-693a1a23cee1" "GBIF" _blank
click I "https://data.ioos.us" "IOOS Catalog" _blank
click FC "https://data.noaa.gov/onestop/collections/details/55309a04-8383-42ff-b2fe-ab3497431756" "OneStop" _blank
Access Points
The sections below list the places where these data can be discovered and describe the process by which they were standardized to Darwin Core.
Order of activities:
- Serve data and metadata on RA ERDDAP
- Align ERDDAP data to Darwin Core
- Share to OBIS-USA
- OBIS-USA shares via IPT to OBIS and GBIF
Serving raw data via IOOS RA ERDDAP
Raw Gulf of Maine WBTS Calanus Abundance Observations are available in ERDDAP: http://www.neracoos.org/erddap/tabledap/WBTS_CFIN_2004_2017.html
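As a minimal sketch of how the raw data could be pulled from this access point on the command line, the function below builds a tabledap download URL by replacing the .html suffix with a data file type. The helper name erddap_url is hypothetical. The sketch assumes the standard ERDDAP tabledap convention that a request without a query returns all variables and rows; it has not been checked against this particular server.

```shell
#!/usr/bin/env bash
# Base URL and dataset ID are taken from the ERDDAP link above.
ERDDAP_BASE="http://www.neracoos.org/erddap/tabledap"
DATASET_ID="WBTS_CFIN_2004_2017"

# Hypothetical helper: build a tabledap request URL for a file type
# (csv, nc, json, ...). An optional second argument is a query string
# that the caller has already percent-encoded.
erddap_url() {
  local file_type="$1"
  local query="${2:-}"
  local url="${ERDDAP_BASE}/${DATASET_ID}.${file_type}"
  if [ -n "$query" ]; then
    url="${url}?${query}"
  fi
  printf '%s\n' "$url"
}

# Print the URL for the full dataset as CSV.
erddap_url csv

# To download the file (requires network access), uncomment:
# curl --fail --location --output "${DATASET_ID}.csv" "$(erddap_url csv)"
```

Keeping the URL construction in a small function makes it easy to request another file type, for example `erddap_url nc` for netCDF.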
Aligning raw data to Darwin Core
This dataset was processed by Dylan Pugh during the 2022 Marine BioData Mobilization Workshop in the notebook linked below:
WBTS MBON process: https://github.com/ioos/bio_data_guide/tree/main/datasets/WBTS_MBON
Sending data to OBIS-USA
Data were submitted to OBIS-USA by contributing the Darwin Core aligned files (and code) to the ioos/bio_data_guide repository. See the associated GitHub issue and subsequent pull requests in that repository for more information on the conversion process.
The processed files were uploaded to the repository and OBIS-USA downloaded them for loading in the OBIS-USA IPT.
Data were published via the OBIS-USA IPT at: https://www1.usgs.gov/obis-usa/ipt/resource?r=gom_wbts_mesozooplankton.
Data were shared to OBIS at: https://obis.org/dataset/5ef55cd8-05a1-4569-8e17-ceb224e40f59
Data were shared to GBIF at: https://www.gbif.org/dataset/29651377-23c8-4f45-b439-693a1a23cee1
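One way to confirm that the published records are visible at both aggregators is to request each dataset's metadata record as JSON. The sketch below only builds the request URLs. The helper name dataset_api_url is hypothetical, and the URL patterns assume the public OBIS v3 and GBIF v1 REST APIs; the identifiers are copied from the dataset links above.

```shell
#!/usr/bin/env bash
# Dataset identifiers copied from the OBIS and GBIF links above.
OBIS_DATASET_ID="5ef55cd8-05a1-4569-8e17-ceb224e40f59"
GBIF_DATASET_KEY="29651377-23c8-4f45-b439-693a1a23cee1"

# Hypothetical helper: build the dataset metadata URL for one
# aggregator ("obis" or "gbif"). Endpoint patterns are assumptions
# based on the public OBIS v3 and GBIF v1 APIs.
dataset_api_url() {
  case "$1" in
    obis) printf 'https://api.obis.org/v3/dataset/%s\n' "$OBIS_DATASET_ID" ;;
    gbif) printf 'https://api.gbif.org/v1/dataset/%s\n' "$GBIF_DATASET_KEY" ;;
    *)    echo "unknown aggregator: $1" >&2; return 1 ;;
  esac
}

dataset_api_url obis
dataset_api_url gbif

# To fetch a JSON record (requires network access), uncomment:
# curl --fail --silent "$(dataset_api_url gbif)"
```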
Sending data to NCEI
- Use ERDDAP’s ArchiveADataset.sh for the dataset http://www.neracoos.org/erddap/tabledap/WBTS_CFIN_2004_2017.html to create an archival package for submission to NCEI as a one-off via Send2NCEI.
  - To run it as a one-liner with a Docker-deployed ERDDAP, use:
    $ docker run --rm -it \
        -v "$(pwd)/datasets:/datasets" \
        -v "$(pwd)/logs:/erddapData/logs" \
        -v "$(pwd)/erddap/content:/usr/local/tomcat/content/erddap" \
        -v "$(pwd)/erddap/data:/erddapData" \
        axiom/docker-erddap:latest \
        bash -c "cd webapps/erddap/WEB-INF/ && bash ArchiveADataset.sh -verbose BagIt tar.gz default WBTS_CFIN_2004_2017 default '' '' .nc SHA-256"
  - This will create a .tar.gz package and an accompanying .tar.gz.sha256.txt manifest file for the dataset. The .tar.gz.sha256.txt file is the manifest for the .tar.gz file and can be used to verify the integrity of the package.
  - Inside the .tar.gz package is an appropriately formatted BagIt package, as defined by the Library of Congress.
  - The data file can be found in the data/ directory of the .tar.gz package as a netCDF file (the file type was specified when the ArchiveADataset.sh tool was run).
- Create an account in NCEI’s Send2NCEI system: https://www.ncei.noaa.gov/archive/send2ncei/
- Log in and Create a New Submission Package.
- Populate the requested metadata fields for the dataset. Most of the requested metadata are already available on the ERDDAP metadata page, so copy and paste content from the record http://www.neracoos.org/erddap/info/WBTS_CFIN_2004_2017/index.html into the submission form.
  - For example, Start Date: would map to time_coverage_start.
- Once the metadata are complete, upload the .tar.gz and .tar.gz.sha256.txt files and complete the package using the Upload and Submit button.
- Record the Reference ID for your submission in case you need to reach out with questions.
- Check the Send2NCEI My Submission Packages page to see the status of the submission.
- Iterate with NCEI to answer any remaining questions they might have.
- Once archived and published, NCEI will provide an accession number and URL to the dataset landing page.
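Before uploading, the integrity check mentioned in the archival-package step can be sketched as follows. This is a hedged example rather than part of the documented workflow: the helper name verify_sha256 is hypothetical, it assumes GNU sha256sum is available (on macOS, `shasum -a 256` is the usual substitute), and it assumes the first whitespace-delimited token in the .sha256.txt file is the hex digest. The file names in the usage comments are placeholders, since the names ArchiveADataset.sh generates may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare a package's SHA-256 digest with the digest
# recorded in its manifest file.
# Assumption: the first whitespace-delimited token on the first line of
# the manifest is the hex digest.
verify_sha256() {
  local package="$1"
  local manifest="$2"
  local expected actual
  expected="$(awk 'NR==1 {print tolower($1)}' "$manifest")"
  actual="$(sha256sum "$package" | awk '{print tolower($1)}')"
  if [ -n "$expected" ] && [ "$expected" = "$actual" ]; then
    echo "OK: $package"
  else
    echo "MISMATCH: $package" >&2
    return 1
  fi
}

# Usage (placeholder file names):
# verify_sha256 WBTS_CFIN_2004_2017.tar.gz WBTS_CFIN_2004_2017.tar.gz.sha256.txt
# List the BagIt contents, including the data/ directory:
# tar -tzf WBTS_CFIN_2004_2017.tar.gz
```

The function returns a non-zero status on a mismatch, so it can gate an upload step in a script.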
Loading data into MBON Portal
Describe the process for loading data into MBON portal, including lessons learned. Include link to MBON portal layer.
Overall lessons learned
Include any overarching lessons learned here.