Introduction to Darwin Core


  • Darwin Core isn’t difficult to apply; it just takes a little time.
  • Using Darwin Core allows datasets from different projects, organizations, and countries to be integrated.
  • Applying certain general principles to the data will make it easier to map to Darwin Core.
  • Implementing Darwin Core makes data FAIR-er and means becoming part of a community of people working together to understand species, no matter where they work or are based.

Social Break


Data Cleaning


  • When doing conversions, it’s best to break out your data into its component pieces.
  • Dates are messy to deal with. Some packages have easy solutions; otherwise, use regular expressions to align date strings to ISO 8601.
  • WoRMS LSIDs are a requirement for OBIS.
  • Latitudes and longitudes, like dates, can be messy to deal with. Take a similar approach.
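
As a minimal sketch of the cleaning steps above, the Python below uses regular expressions to align a date string to ISO 8601, converts a degrees-minutes-seconds coordinate to decimal degrees, and builds a WoRMS LSID from an AphiaID. The function names and the two input layouts (month/day/year dates and coordinates such as 42° 21' 30" N) are assumptions for illustration; real datasets usually need more patterns than these.

```python
import re
from datetime import date

# Assumed input layout: month/day/year, e.g. "7/4/2021".
_US_DATE = re.compile(r"^\s*(\d{1,2})/(\d{1,2})/(\d{4})\s*$")

# Assumed input layout: degrees, minutes, seconds, hemisphere.
# Any non-digit characters may separate the parts.
_DMS = re.compile(
    r"^\s*(\d{1,3})\D+(\d{1,2})\D+(\d{1,2}(?:\.\d+)?)\D*?([NSEW])\s*$",
    re.IGNORECASE,
)


def us_date_to_iso(raw):
    """Convert a month/day/year string to an ISO 8601 date (YYYY-MM-DD)."""
    match = _US_DATE.match(raw)
    if match is None:
        raise ValueError(f"unrecognized date: {raw!r}")
    month, day, year = (int(part) for part in match.groups())
    # date() rejects impossible values such as month 13.
    return date(year, month, day).isoformat()


def dms_to_decimal(raw):
    """Convert degrees-minutes-seconds text to signed decimal degrees."""
    match = _DMS.match(raw)
    if match is None:
        raise ValueError(f"unrecognized coordinate: {raw!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    if int(minutes) >= 60 or float(seconds) >= 60:
        raise ValueError(f"minutes or seconds out of range: {raw!r}")
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and west are negative in decimal degrees.
    if hemisphere.upper() in ("S", "W"):
        value = -value
    return round(value, 6)


def worms_lsid(aphia_id):
    """Build a WoRMS LSID from a numeric AphiaID."""
    return f"urn:lsid:marinespecies.org:taxname:{int(aphia_id)}"
```

Each function raises ValueError on input it does not recognize rather than guessing, so unexpected formats surface during cleaning instead of slipping into the published data.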

Darwin Core and Extension Schemas


  • Darwin Core uses cores and extensions to model the multitude of biological observation data that exists.
  • OBIS uses the Event (or Occurrence) Core with the Extended Measurement or Fact extension to make sure no information is lost.
  • When using a core with the Extended Measurement or Fact extension, additional fields are required and the data are split across separate files.
  • ID fields are important keys in your data and we recommend building them from the information in your data.
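
One way to picture the points above is the sketch below, which builds ID fields from information already in a record and splits that record into rows for an event file, an occurrence file, and an Extended Measurement or Fact file. The input column names (cruise, station, date, species, count), the ID pattern, and the sample values are hypothetical; the output keys are Darwin Core term names.

```python
def build_event_id(cruise, station, iso_date):
    """Join identifying pieces of a record into one key (hypothetical pattern)."""
    parts = (cruise, station, iso_date)
    return "_".join(str(part).strip().replace(" ", "-") for part in parts)


def split_record(row, index):
    """Split one wide row into event, occurrence, and eMoF rows."""
    event_id = build_event_id(row["cruise"], row["station"], row["date"])
    # The occurrence ID extends the event ID, so the link stays readable.
    occurrence_id = f"{event_id}_occ{index:03d}"

    event = {"eventID": event_id, "eventDate": row["date"]}
    occurrence = {
        "eventID": event_id,
        "occurrenceID": occurrence_id,
        "scientificName": row["species"],
    }
    # The measurement goes to its own file, keyed back to the occurrence.
    emof = {
        "eventID": event_id,
        "occurrenceID": occurrence_id,
        "measurementType": "count",
        "measurementValue": row["count"],
        "measurementUnit": "individuals",
    }
    return event, occurrence, emof


# Made-up sample row, for illustration only.
sample = {
    "cruise": "CR01",
    "station": "ST 5",
    "date": "2021-07-04",
    "species": "Mytilus edulis",
    "count": 12,
}
event, occurrence, emof = split_record(sample, 1)
```

Because every ID is derived from the data itself, rerunning the conversion on the same input yields the same keys, which keeps the three files linked.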

Social Break


QA/QC


  • Several packages (e.g. obistools, Hmisc, pandas) can be used to QA/QC data.
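
The packages named above do this work at scale; as a rough, package-free sketch of the kind of checks involved, the Python below flags empty fields, non-ISO dates, and out-of-range coordinates in a single record. The CHECKED_FIELDS list and the check_record function are assumptions for illustration, not the OBIS requirement list or the API of any of those packages.

```python
from datetime import date

# Illustrative selection of Darwin Core terms to check; an assumption,
# not an authoritative list of required fields.
CHECKED_FIELDS = (
    "occurrenceID",
    "eventDate",
    "decimalLatitude",
    "decimalLongitude",
    "scientificName",
)


def _check_range(record, field, low, high, problems):
    """Append a problem if the field is non-numeric or outside [low, high]."""
    raw = str(record.get(field, "")).strip()
    if not raw:
        return  # already reported as missing
    try:
        value = float(raw)
    except ValueError:
        problems.append(f"{field} is not numeric: {raw!r}")
        return
    if not low <= value <= high:
        problems.append(f"{field} out of range: {value}")


def check_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for field in CHECKED_FIELDS:
        if not str(record.get(field, "")).strip():
            problems.append(f"missing {field}")

    raw_date = str(record.get("eventDate", "")).strip()
    if raw_date:
        try:
            # Only plain dates are checked here; ranges and times are not.
            date.fromisoformat(raw_date)
        except ValueError:
            problems.append(f"eventDate is not an ISO 8601 date: {raw_date!r}")

    _check_range(record, "decimalLatitude", -90, 90, problems)
    _check_range(record, "decimalLongitude", -180, 180, problems)
    return problems
```

Returning a list of problems, rather than stopping at the first one, lets you review everything wrong with a record in a single pass.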

Metadata and publishing


  • The IPT is a well-documented and flexible system for publishing data to OBIS.
  • Some Darwin Core and Ecological Metadata Language fields are required for publishing to OBIS.
  • Strive to write more than the minimal metadata.

Continuing the Conversation


  • The Standardizing Marine Bio Data (SMBD) group is available to help.
  • The SMBD meets monthly and you are welcome to join.
  • The SMBD GitHub issue tracker is the best place to reach out for help.