{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "remove-cell" ] }, "outputs": [], "source": [
"import subprocess\n",
"import sys\n",
"\n",
"COLAB = \"google.colab\" in sys.modules\n",
"\n",
"\n",
"def _install(package):\n",
"    if COLAB:\n",
"        ans = input(f\"Install { package }? [y/n]:\")\n",
"        if ans.lower() in [\"y\", \"yes\"]:\n",
"            subprocess.check_call(\n",
"                [sys.executable, \"-m\", \"pip\", \"install\", \"--quiet\", package]\n",
"            )\n",
"            print(f\"{ package } installed!\")\n",
"\n",
"\n",
"def _colab_install_missing_deps(deps):\n",
"    # `importlib.util` is a submodule and must be imported explicitly.\n",
"    import importlib.util\n",
"\n",
"    for dep in deps:\n",
"        if importlib.util.find_spec(dep) is None:\n",
"            if dep == \"iris\":\n",
"                dep = \"scitools-iris\"\n",
"            _install(dep)\n",
"\n",
"\n",
"deps = [\"palettable\"]\n",
"_colab_install_missing_deps(deps)"
] }, { "cell_type": "markdown", "metadata": {}, "source": [
"# Using r-obistools and r-obis to explore the OBIS database\n",
"\n",
"Created: 2018-02-20\n",
"\n",
"The [Ocean Biogeographic Information System (OBIS)](https://www.obis.org/) is an open-access data and information system on marine biodiversity for science, conservation, and sustainable development.\n",
"\n",
"In this example we will use the R libraries [`obistools`](https://iobis.github.io/obistools) and [`robis`](https://iobis.github.io/robis) to search for marine turtle occurrence data in the South Atlantic Ocean.\n",
"\n",
"Let's start by loading the R-to-Python extension and checking the database for the 7 known species of marine turtles found in the world's oceans."
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%load_ext rpy2.ipython" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "R[write to console]: 7 names, 0 without matches, 0 with multiple matches\n", "\n" ] } ], "source": [ "%%R -o matches\n", "\n", "library(obistools)\n", "\n", "\n", "species <- c(\n", " 'Caretta caretta',\n", " 'Chelonia mydas',\n", " 'Dermochelys coriacea',\n", " 'Eretmochelys imbricata',\n", " 'Lepidochelys kempii',\n", " 'Lepidochelys olivacea',\n", " 'Natator depressa'\n", ")\n", "\n", "matches = match_taxa(species, ask=FALSE)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>scientificName</th><th>scientificNameID</th><th>match_type</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>Caretta caretta</td><td>urn:lsid:marinespecies.org:taxname:137205</td><td>exact</td></tr>\n",
"    <tr><th>2</th><td>Chelonia mydas</td><td>urn:lsid:marinespecies.org:taxname:137206</td><td>exact</td></tr>\n",
"    <tr><th>3</th><td>Dermochelys coriacea</td><td>urn:lsid:marinespecies.org:taxname:137209</td><td>exact</td></tr>\n",
"    <tr><th>4</th><td>Eretmochelys imbricata</td><td>urn:lsid:marinespecies.org:taxname:137207</td><td>exact</td></tr>\n",
"    <tr><th>5</th><td>Lepidochelys kempii</td><td>urn:lsid:marinespecies.org:taxname:137208</td><td>exact</td></tr>\n",
"    <tr><th>6</th><td>Lepidochelys olivacea</td><td>urn:lsid:marinespecies.org:taxname:220293</td><td>exact</td></tr>\n",
"    <tr><th>7</th><td>Natator depressa</td><td>urn:lsid:marinespecies.org:taxname:344093</td><td>exact</td></tr>\n",
"  </tbody>\n",
"</table>
\n", "" ], "text/plain": [
"           scientificName                           scientificNameID  \\\n",
"1         Caretta caretta  urn:lsid:marinespecies.org:taxname:137205   \n",
"2          Chelonia mydas  urn:lsid:marinespecies.org:taxname:137206   \n",
"3    Dermochelys coriacea  urn:lsid:marinespecies.org:taxname:137209   \n",
"4  Eretmochelys imbricata  urn:lsid:marinespecies.org:taxname:137207   \n",
"5     Lepidochelys kempii  urn:lsid:marinespecies.org:taxname:137208   \n",
"6   Lepidochelys olivacea  urn:lsid:marinespecies.org:taxname:220293   \n",
"7        Natator depressa  urn:lsid:marinespecies.org:taxname:344093   \n",
"\n",
"  match_type  \n",
"1      exact  \n",
"2      exact  \n",
"3      exact  \n",
"4      exact  \n",
"5      exact  \n",
"6      exact  \n",
"7      exact  "
] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "matches" ] }, { "cell_type": "markdown", "metadata": {}, "source": [
"We got a nice DataFrame back with records for all 7 species of turtles and their corresponding `scientificNameID` in the database.\n",
"\n",
"Now let us try to obtain the occurrence data for the South Atlantic. We will need a vector geometry for the ocean basin in the [well-known text (WKT)](https://en.wikipedia.org/wiki/Well-known_text) format to feed into the `robis` `occurrence` function.\n",
"\n",
"In this example we convert a South Atlantic shapefile to WKT with geopandas, but one can also obtain geometries by simply drawing them on a map with the [iobis maptool](https://obis.org/maptool)."
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [
"from pathlib import Path\n",
"\n",
"import geopandas\n",
"\n",
"fname = Path(\"..\", \"data\", \"oceans.shp\")\n",
"\n",
"gdf = geopandas.read_file(fname)\n",
"\n",
"# Select the South Atlantic geometry by position rather than by index label.\n",
"sa = gdf.loc[gdf[\"Oceans\"] == \"South Atlantic Ocean\", \"geometry\"].iloc[0]\n",
"\n",
"atlantic = sa.wkt"
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [
"Retrieved 5000 records of approximately 5620 (88%)\n",
"Retrieved 5620 records of approximately 5620 (100%)\n"
] }, { "name": "stdout", "output_type": "stream", "text": [
" [1] \"date_year\" \"scientificNameID\" \n",
" [3] \"scientificName\" \"dynamicProperties\" \n",
" [5] \"superfamilyid\" \"individualCount\" \n",
" [7] \"associatedReferences\" \"dropped\" \n",
" [9] \"aphiaID\" \"decimalLatitude\" \n",
" [11] \"type\" \"taxonRemarks\" \n",
" [13] \"phylumid\" \"familyid\" \n",
" [15] \"catalogNumber\" \"occurrenceStatus\" \n",
" [17] \"basisOfRecord\" \"superclass\" \n",
" [19] \"modified\" \"id\" \n",
" [21] \"order\" \"recordNumber\" \n",
" [23] \"georeferencedDate\" \"superclassid\" \n",
" [25] \"verbatimEventDate\" \"dataset_id\" \n",
" [27] \"decimalLongitude\" \"collectionCode\" \n",
" [29] \"date_end\" \"speciesid\" \n",
" [31] \"occurrenceID\" \"superfamily\" \n",
" [33] \"suborderid\" \"license\" \n",
" [35] \"date_start\" \"organismID\" \n",
" [37] \"genus\" \"dateIdentified\" \n",
" [39] \"ownerInstitutionCode\" \"bibliographicCitation\" \n",
" [41] \"eventDate\" \"scientificNameAuthorship\" \n",
" [43] \"absence\" \"taxonRank\" \n",
" [45] \"genusid\" \"originalScientificName\" \n",
" [47] \"marine\" \"subphylumid\" \n",
" [49] \"vernacularName\" \"institutionCode\" \n",
" [51] \"date_mid\" \"identificationRemarks\" \n",
" [53] \"class\" \"suborder\" \n",
" [55] \"nomenclaturalCode\" \"orderid\" \n",
" [57] \"datasetName\" \"geodeticDatum\" \n",
" [59] \"taxonomicStatus\" \"kingdom\" \n",
" [61] \"waterBody\" \"specificEpithet\" \n",
" [63] \"classid\" \"phylum\" \n",
" [65] \"species\" \"coordinatePrecision\" \n",
" [67] \"organismRemarks\" \"subphylum\" \n",
" [69] \"datasetID\" \"occurrenceRemarks\" \n",
" [71] \"family\" \"category\" \n",
" [73] \"kingdomid\" \"node_id\" \n",
" [75] \"flags\" \"sss\" \n",
" [77] \"shoredistance\" \"sst\" \n",
" [79] \"bathymetry\" \"coordinateUncertaintyInMeters\"\n",
" [81] \"eventTime\" \"sex\" \n",
" [83] \"footprintWKT\" \"lifeStage\" \n",
" [85] \"wrims\" \"references\" \n",
" [87] \"year\" \"language\" \n",
" [89] \"day\" \"locality\" \n",
" [91] \"month\" \"samplingProtocol\" \n",
" [93] \"eventID\" \"startDayOfYear\" \n",
" [95] \"accessRights\" \"country\" \n",
" [97] \"habitat\" \"municipality\" \n",
" [99] \"stateProvince\" \"behavior\" \n",
"[101] \"recordedBy\" \"maximumDepthInMeters\" \n",
"[103] \"georeferenceRemarks\" \"minimumElevationInMeters\" \n",
"[105] \"maximumElevationInMeters\" \"minimumDepthInMeters\" \n",
"[107] \"depth\" \"continent\" \n",
"[109] \"fieldNotes\" \"rightsHolder\" \n",
"[111] \"associatedMedia\" \"taxonConceptID\" \n",
"[113] \"organismQuantity\" \"organismQuantityType\" \n",
"[115] \"fieldNumber\" \"eventRemarks\" \n",
"[117] \"preparations\" \"identifiedBy\" \n",
"[119] \"typeStatus\" \"otherCatalogNumbers\" \n",
"[121] \"locationID\" \n"
] } ], "source": [
"%%R -o turtles -i atlantic\n",
"library(robis)\n",
"\n",
"\n",
"turtles = occurrence(\n",
"  species,\n",
"  geometry=atlantic,\n",
")\n",
"\n",
"names(turtles)"
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [
"{'Caretta caretta',\n",
" 'Chelonia mydas',\n",
" 'Dermochelys coriacea',\n",
" 'Eretmochelys imbricata',\n",
" 'Lepidochelys kempii',\n",
" 'Lepidochelys olivacea'}"
] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [
"set(turtles[\"scientificName\"])"
] }, { "cell_type": "markdown", "metadata": {}, "source": [
"Note that there are no occurrences for *Natator depressa* (Flatback sea turtle) in the South Atlantic.\n",
"The Flatback sea turtle can only be found in the waters around the Australian continental shelf.\n",
"\n",
"With `ggplot2` we can quickly put together a histogram of occurrences over time."
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "image/png": "\n" }, "metadata": {}, "output_type": "display_data" } ], "source": [
"%%R\n",
"\n",
"turtles$year <- as.numeric(format(as.Date(turtles$eventDate), \"%Y\"))\n",
"table(turtles$year)\n",
"\n",
"library(ggplot2)\n",
"\n",
"ggplot() +\n",
"  geom_histogram(\n",
"    data=turtles,\n",
"    aes(x=year, fill=scientificName),\n",
"    binwidth=5) +\n",
"  scale_fill_brewer(palette='Paired')"
] }, { "cell_type": "markdown", "metadata": {}, "source": [
"One would guess that the 2010 count increase is due to an increase in sampling effort, but the drop around 2010 is troubling. It could signal a real threat to these species, or it could mean that the observation efforts were defunded.\n",
"\n",
"To explore this dataset further we can make use of the `obistools` R package, which has many visualization and quality control routines built in. Here is an example of how to use `plot_map` to quickly visualize the data in a geographic context."
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [
"R[write to console]: \n",
"Attaching package: ‘dplyr’\n",
"\n",
"\n",
"R[write to console]: The following objects are masked from ‘package:stats’:\n",
"\n",
" filter, lag\n",
"\n",
"\n",
"R[write to console]: The following objects are masked from ‘package:base’:\n",
"\n",
" intersect, setdiff, setequal, union\n",
"\n",
"\n"
] }, { "data": { "image/png": "\n" }, "metadata": {}, "output_type": "display_data" } ], "source": [
"%%R\n",
"\n",
"library(dplyr)\n",
"\n",
"coriacea <- turtles %>% filter(species=='Dermochelys coriacea')\n",
"plot_map(coriacea, zoom=TRUE)"
] }, { "cell_type": "markdown", "metadata": {}, "source": [
"However, if we want to create a slightly more elaborate map with clusters and informative pop-ups, we can use the Python library `folium` instead."
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [
"import folium\n",
"from pandas import DataFrame\n",
"\n",
"\n",
"def filter_df(df):\n",
"    # Keep only the fields we want to show in the pop-up.\n",
"    return df[[\"institutionCode\", \"individualCount\", \"sex\", \"eventDate\"]]\n",
"\n",
"\n",
"def make_popup(row):\n",
"    # Render the selected fields of one record as an HTML table.\n",
"    classes = \"table table-striped table-hover table-condensed table-responsive\"\n",
"    html = DataFrame(row).to_html(classes=classes)\n",
"    return folium.Popup(html)\n",
"\n",
"\n",
"def make_marker(row, popup=None):\n",
"    # folium expects (latitude, longitude).\n",
"    location = row[\"decimalLatitude\"], row[\"decimalLongitude\"]\n",
"    return folium.Marker(location=location, popup=popup)"
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [
"from folium.plugins import MarkerCluster\n",
"\n",
"species_found = sorted(set(turtles[\"scientificName\"]))\n",
"\n",
"clusters = {s: MarkerCluster() for s in species_found}\n",
"groups = {s: folium.FeatureGroup(name=s) for s in species_found}"
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": {
"text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>date_year</th><th>scientificNameID</th><th>scientificName</th><th>dynamicProperties</th><th>superfamilyid</th><th>individualCount</th><th>associatedReferences</th><th>dropped</th><th>aphiaID</th><th>decimalLatitude</th><th>...</th><th>taxonConceptID</th><th>organismQuantity</th><th>organismQuantityType</th><th>fieldNumber</th><th>eventRemarks</th><th>preparations</th><th>identifiedBy</th><th>typeStatus</th><th>otherCatalogNumbers</th><th>locationID</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>2012</td><td>urn:lsid:marinespecies.org:taxname:137209</td><td>Dermochelys coriacea</td><td>MachineObservation</td><td>987094</td><td>1</td><td>[{\"crossref\":{\"citeinfo\":{\"origin\":\"Robinson, ...</td><td>0</td><td>137209</td><td>-33.500000</td><td>...</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>\n",
"    <tr><th>2</th><td>1998</td><td>urn:lsid:marinespecies.org:taxname:137206</td><td>Chelonia mydas</td><td>None</td><td>987094</td><td>1</td><td>[{\"crossref\":{\"citeinfo\":{\"origin\":\"Luschi, P....</td><td>0</td><td>137206</td><td>-7.226000</td><td>...</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>\n",
"    <tr><th>3</th><td>2014</td><td>urn:lsid:marinespecies.org:taxname:137205</td><td>Caretta caretta</td><td>MachineObservation</td><td>987094</td><td>1</td><td>[{\"crossref\":{\"citeinfo\":{\"origin\":\"Coyne, M. ...</td><td>0</td><td>137205</td><td>-29.500000</td><td>...</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>\n",
"    <tr><th>4</th><td>2015</td><td>urn:lsid:marinespecies.org:taxname:220293</td><td>Lepidochelys olivacea</td><td>MachineObservation</td><td>987094</td><td>1</td><td>[{\"crossref\":{\"citeinfo\":{\"origin\":\"Coyne, M. ...</td><td>0</td><td>220293</td><td>-14.500000</td><td>...</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>\n",
"    <tr><th>5</th><td>-2147483648</td><td>urn:lsid:marinespecies.org:taxname:137206</td><td>Chelonia mydas</td><td>None</td><td>987094</td><td>None</td><td>None</td><td>0</td><td>137206</td><td>-3.883472</td><td>...</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>5616</th><td>2003</td><td>urn:lsid:marinespecies.org:taxname:137209</td><td>Dermochelys coriacea</td><td>None</td><td>987094</td><td>1</td><td>[{\"crossref\":{\"citeinfo\":{\"origin\":\"Luschi, P....</td><td>0</td><td>137209</td><td>-32.194000</td><td>...</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>\n",
"    <tr><th>5617</th><td>1998</td><td>urn:lsid:marinespecies.org:taxname:137206</td><td>Chelonia mydas</td><td>None</td><td>987094</td><td>1</td><td>[{\"crossref\":{\"citeinfo\":{\"origin\":\"Luschi, P....</td><td>0</td><td>137206</td><td>-8.895000</td><td>...</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>\n",
"    <tr><th>5618</th><td>2003</td><td>urn:lsid:marinespecies.org:taxname:137209</td><td>Dermochelys coriacea</td><td>None</td><td>987094</td><td>1</td><td>[{\"crossref\":{\"citeinfo\":{\"origin\":\"Luschi, P....</td><td>0</td><td>137209</td><td>-35.069000</td><td>...</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>\n",
"    <tr><th>5619</th><td>2006</td><td>urn:lsid:marinespecies.org:taxname:137209</td><td>Dermochelys coriacea</td><td>MachineObservation</td><td>987094</td><td>1</td><td>[{\"crossref\":{\"citeinfo\":{\"origin\":\"Coyne, M. ...</td><td>0</td><td>137209</td><td>-30.500000</td><td>...</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>\n",
"    <tr><th>5620</th><td>1996</td><td>urn:lsid:marinespecies.org:taxname:137209</td><td>Dermochelys coriacea</td><td>None</td><td>987094</td><td>1</td><td>[{\"crossref\":{\"citeinfo\":{\"origin\":\"Luschi, P....</td><td>0</td><td>137209</td><td>-39.724000</td><td>...</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>5620 rows × 121 columns</p>
\n", "
" ], "text/plain": [ " date_year scientificNameID \\\n", "1 2012 urn:lsid:marinespecies.org:taxname:137209 \n", "2 1998 urn:lsid:marinespecies.org:taxname:137206 \n", "3 2014 urn:lsid:marinespecies.org:taxname:137205 \n", "4 2015 urn:lsid:marinespecies.org:taxname:220293 \n", "5 -2147483648 urn:lsid:marinespecies.org:taxname:137206 \n", "... ... ... \n", "5616 2003 urn:lsid:marinespecies.org:taxname:137209 \n", "5617 1998 urn:lsid:marinespecies.org:taxname:137206 \n", "5618 2003 urn:lsid:marinespecies.org:taxname:137209 \n", "5619 2006 urn:lsid:marinespecies.org:taxname:137209 \n", "5620 1996 urn:lsid:marinespecies.org:taxname:137209 \n", "\n", " scientificName dynamicProperties superfamilyid \\\n", "1 Dermochelys coriacea MachineObservation 987094 \n", "2 Chelonia mydas None 987094 \n", "3 Caretta caretta MachineObservation 987094 \n", "4 Lepidochelys olivacea MachineObservation 987094 \n", "5 Chelonia mydas None 987094 \n", "... ... ... ... \n", "5616 Dermochelys coriacea None 987094 \n", "5617 Chelonia mydas None 987094 \n", "5618 Dermochelys coriacea None 987094 \n", "5619 Dermochelys coriacea MachineObservation 987094 \n", "5620 Dermochelys coriacea None 987094 \n", "\n", " individualCount associatedReferences \\\n", "1 1 [{\"crossref\":{\"citeinfo\":{\"origin\":\"Robinson, ... \n", "2 1 [{\"crossref\":{\"citeinfo\":{\"origin\":\"Luschi, P.... \n", "3 1 [{\"crossref\":{\"citeinfo\":{\"origin\":\"Coyne, M. ... \n", "4 1 [{\"crossref\":{\"citeinfo\":{\"origin\":\"Coyne, M. ... \n", "5 None None \n", "... ... ... \n", "5616 1 [{\"crossref\":{\"citeinfo\":{\"origin\":\"Luschi, P.... \n", "5617 1 [{\"crossref\":{\"citeinfo\":{\"origin\":\"Luschi, P.... \n", "5618 1 [{\"crossref\":{\"citeinfo\":{\"origin\":\"Luschi, P.... \n", "5619 1 [{\"crossref\":{\"citeinfo\":{\"origin\":\"Coyne, M. ... \n", "5620 1 [{\"crossref\":{\"citeinfo\":{\"origin\":\"Luschi, P.... \n", "\n", " dropped aphiaID decimalLatitude ... 
taxonConceptID organismQuantity \\\n", "1 0 137209 -33.500000 ... None None \n", "2 0 137206 -7.226000 ... None None \n", "3 0 137205 -29.500000 ... None None \n", "4 0 220293 -14.500000 ... None None \n", "5 0 137206 -3.883472 ... None None \n", "... ... ... ... ... ... ... \n", "5616 0 137209 -32.194000 ... None None \n", "5617 0 137206 -8.895000 ... None None \n", "5618 0 137209 -35.069000 ... None None \n", "5619 0 137209 -30.500000 ... None None \n", "5620 0 137209 -39.724000 ... None None \n", "\n", " organismQuantityType fieldNumber eventRemarks preparations \\\n", "1 None None None None \n", "2 None None None None \n", "3 None None None None \n", "4 None None None None \n", "5 None None None None \n", "... ... ... ... ... \n", "5616 None None None None \n", "5617 None None None None \n", "5618 None None None None \n", "5619 None None None None \n", "5620 None None None None \n", "\n", " identifiedBy typeStatus otherCatalogNumbers locationID \n", "1 None None None None \n", "2 None None None None \n", "3 None None None None \n", "4 None None None None \n", "5 None None None None \n", "... ... ... ... ... \n", "5616 None None None None \n", "5617 None None None None \n", "5618 None None None None \n", "5619 None None None None \n", "5620 None None None None \n", "\n", "[5620 rows x 121 columns]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "turtles" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Make this Notebook Trusted to load map: File -> Trust Notebook" ], "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [
"m = folium.Map()\n",
"\n",
"for turtle in species_found:\n",
"    df = turtles.loc[turtles[\"scientificName\"] == turtle]\n",
"    # One marker with a pop-up per occurrence, clustered by species.\n",
"    for k, row in df.iterrows():\n",
"        popup = make_popup(filter_df(row))\n",
"        make_marker(row, popup=popup).add_to(clusters[turtle])\n",
"    clusters[turtle].add_to(groups[turtle])\n",
"    groups[turtle].add_to(m)\n",
"\n",
"\n",
"m.fit_bounds(m.get_bounds())\n",
"folium.LayerControl().add_to(m)\n",
"\n",
"m"
] }, { "cell_type": "markdown", "metadata": {}, "source": [
"We can get fancy and use shapely to \"merge\" the points that are in the ocean and get an idea of migration routes. To do that we first need to know which records fall on land, which `check_onland` from `obistools` can tell us."
] }, { "cell_type": "code", "execution_count": 13, "metadata": { "scrolled": false }, "outputs": [ { "data": { "image/png": "\n" }, "metadata": {}, "output_type": "display_data" } ], "source": [
"%%R -o land\n",
"\n",
"land <- check_onland(turtles)\n",
"\n",
"plot_map(land, zoom=TRUE)"
] }, { "cell_type": "markdown", "metadata": {}, "source": [
"First let's remove the entries that are on land."
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [
"turtles.set_index(\"id\", inplace=True)\n",
"land.set_index(\"id\", inplace=True)\n",
"mask = turtles.index.isin(land.index)\n",
"ocean = turtles[~mask]"
] }, { "cell_type": "markdown", "metadata": {}, "source": [
"Now we can use shapely's `buffer` to \"connect\" the points that are close to each other and visualize a possible migration path."
] }, { "cell_type": "code", "execution_count": 15, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
Make this Notebook Trusted to load map: File -> Trust Notebook" ], "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [
"from palettable.cartocolors.qualitative import Bold_6\n",
"from shapely.geometry import MultiPoint\n",
"\n",
"colors = dict(zip(species_found, Bold_6.hex_colors))\n",
"\n",
"\n",
"def style_function(color):\n",
"    # Return a folium style callback that draws a feature in the given color.\n",
"    return lambda feature: dict(color=color, weight=2, opacity=0.6)\n",
"\n",
"\n",
"m = folium.Map()\n",
"\n",
"for turtle in species_found:\n",
"    df = ocean.loc[ocean[\"scientificName\"] == turtle]\n",
"    # Buffer every point (distance is in degrees) so nearby points merge.\n",
"    positions = MultiPoint(\n",
"        list(zip(df[\"decimalLongitude\"].values, df[\"decimalLatitude\"].values))\n",
"    ).buffer(distance=2)\n",
"    folium.GeoJson(\n",
"        positions.__geo_interface__,\n",
"        name=turtle,\n",
"        tooltip=turtle,\n",
"        style_function=style_function(color=colors[turtle]),\n",
"    ).add_to(m)\n",
"\n",
"m.fit_bounds(m.get_bounds())\n",
"folium.LayerControl().add_to(m)\n",
"\n",
"m"
] }, { "cell_type": "markdown", "metadata": {}, "source": [
"One interesting feature of this map is *Dermochelys coriacea*'s migration between Brazilian and African shores.\n",
"\n",
"More information on [*Dermochelys coriacea*](https://www.iucnredlist.org/species/6494/43526147) and the other sea turtles can be found on the [IUCN Red List](https://www.iucnredlist.org/)."
] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 2 }