Introduction to Darwin Core
- Darwin Core isn’t difficult to apply; it just takes a little bit of time.
- Using Darwin Core allows datasets from across projects, organizations, and countries to be integrated.
- Applying certain general principles to the data will make it easier to map to Darwin Core.
- Implementing Darwin Core makes data FAIR-er and means becoming part of a community of people working together to understand species no matter where they work or are based.
Social Break
Data Cleaning
- When doing conversions it’s best to break out your data into its component pieces.
- Dates are messy to deal with. Some packages have easy solutions; otherwise, use regular expressions to align date strings to ISO 8601.
- WoRMS LSIDs are a requirement for OBIS.
- Latitudes and longitudes are like dates: they can be messy to deal with. Take a similar approach.
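The cleaning steps above can be sketched with the Python standard library alone. This is a minimal sketch, not a definitive implementation: the list of date layouts, the helper names (`to_iso8601`, `dms_to_decimal`, `worms_lsid`), and the sample strings are assumptions made for illustration. Month/day order in particular has to be confirmed against your own data. The LSID helper only formats an AphiaID you have already looked up in WoRMS; it does not perform the lookup.

```python
import re
from datetime import datetime

# Assumed layouts found in the raw data; extend or reorder for your dataset.
# "%m/%d/%Y" assumes month-first dates, which must be checked with the data owner.
DATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d %B %Y", "%Y%m%d"]


def to_iso8601(raw):
    """Return the date as YYYY-MM-DD, or None if no known layout matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def dms_to_decimal(raw):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees.

    Returns None when the string has no hemisphere letter or an unexpected
    number of numeric parts, so problem values can be reviewed by hand.
    """
    hemisphere = re.search(r"[NSEW]", raw.upper())
    numbers = re.findall(r"\d+(?:\.\d+)?", raw)
    if hemisphere is None or not 1 <= len(numbers) <= 3:
        return None
    # Pad missing minutes/seconds with zero.
    degrees, minutes, seconds = (list(map(float, numbers)) + [0.0, 0.0])[:3]
    value = degrees + minutes / 60 + seconds / 3600
    sign = -1 if hemisphere.group() in "SW" else 1
    return round(sign * value, 6)


def worms_lsid(aphia_id):
    """Format an already-known WoRMS AphiaID as an LSID string."""
    return f"urn:lsid:marinespecies.org:taxname:{int(aphia_id)}"


print(to_iso8601("03/15/2021"))
print(dms_to_decimal("42° 21' 30\" N"))
print(dms_to_decimal("71 03.5 W"))
```

Returning `None` for anything unrecognized, rather than guessing, keeps unparseable values visible so they can be fixed at the source.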
Darwin Core and Extension Schemas
- Darwin Core uses cores and extensions to model the multitude of biological observation data that exists.
- OBIS uses the Event (or Occurrence) Core with the Extended Measurement or Fact extension to make sure no information is lost.
- When using a Core with the Extended Measurement or Fact extension, additional fields are required and the data are split across separate files.
- ID fields are important keys in your data and we recommend building them from the information in your data.
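As a rough illustration of the points above, the sketch below builds ID fields from information already in the data and splits one flat table into event, occurrence, and measurement records. The input column names (`cruise`, `station`, `date`, `species`, `count`), the ID pattern, and the sample rows are all hypothetical. The output keys are Darwin Core term names, but the set shown is far from complete, and it assumes one row per species per event.

```python
def build_event_id(row):
    """Join the fields that together identify one sampling event."""
    return "_".join([row["cruise"], row["station"], row["date"]])


def split_records(rows):
    """Split flat rows into event, occurrence, and measurement records."""
    events = {}
    occurrences = []
    measurements = []
    for row in rows:
        event_id = build_event_id(row)
        # Each event is stored once, however many species were observed.
        events.setdefault(event_id, {"eventID": event_id, "eventDate": row["date"]})
        # Assumes a species appears at most once per event.
        occurrence_id = event_id + "_" + row["species"].replace(" ", "_")
        occurrences.append({
            "occurrenceID": occurrence_id,
            "eventID": event_id,
            "scientificName": row["species"],
        })
        # Measurements go in a long-format table keyed back to the occurrence.
        measurements.append({
            "eventID": event_id,
            "occurrenceID": occurrence_id,
            "measurementType": "count",
            "measurementValue": row["count"],
            "measurementUnit": "individuals",
        })
    return list(events.values()), occurrences, measurements


# Invented sample rows, for illustration only.
rows = [
    {"cruise": "CR01", "station": "ST05", "date": "2021-03-15",
     "species": "Calanus finmarchicus", "count": 12},
    {"cruise": "CR01", "station": "ST05", "date": "2021-03-15",
     "species": "Oithona similis", "count": 40},
]
events, occurrences, measurements = split_records(rows)
print(len(events), len(occurrences), len(measurements))
```

Because the IDs are derived from the data, rerunning the conversion produces the same keys, which keeps the three files linked.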
Social Break
QA/QC
- Several packages (e.g. obistools, Hmisc, pandas) can be used to QA/QC data.
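To give a sense of what such checks look like, here is a minimal sketch in plain Python. It is not the API of obistools, Hmisc, or pandas, and it is not a substitute for them. The function name `check_record` and the list of terms are assumptions chosen for illustration, and the list is only a subset of what a publisher may require.

```python
# Illustrative subset of Darwin Core terms to check; not an official list.
REQUIRED_TERMS = [
    "occurrenceID",
    "eventDate",
    "decimalLatitude",
    "decimalLongitude",
    "scientificName",
]


def check_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for term in REQUIRED_TERMS:
        if record.get(term) in (None, ""):
            problems.append(f"missing {term}")

    # Range checks only run on numeric values; missing ones are reported above.
    lat = record.get("decimalLatitude")
    if isinstance(lat, (int, float)) and not -90 <= lat <= 90:
        problems.append("decimalLatitude out of range")

    lon = record.get("decimalLongitude")
    if isinstance(lon, (int, float)) and not -180 <= lon <= 180:
        problems.append("decimalLongitude out of range")

    return problems


# Invented record with a deliberately bad latitude and no scientificName.
record = {
    "occurrenceID": "CR01_ST05_2021-03-15_occ1",
    "eventDate": "2021-03-15",
    "decimalLatitude": 142.36,
    "decimalLongitude": -71.06,
}
print(check_record(record))
```

Collecting every problem per record, instead of stopping at the first, makes it easier to fix a dataset in one pass.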
Metadata and publishing
- The IPT is a well-documented and flexible system for publishing data to OBIS.
- Some Darwin Core and Ecological Metadata Language fields are required for publishing to OBIS.
- Strive to write more than the minimal metadata.
Continuing the Conversation
- The Standardizing Marine Bio Data (SMBD) group is available to help.
- The SMBD meets monthly and you are welcome to join.
- The SMBD GitHub issue tracker is the best place to reach out for help.