Searching datasets
erddapy wraps ERDDAP's form-like search capabilities through the search_for keyword.
[1]:
from erddapy import ERDDAP
e = ERDDAP(
    server="https://pae-paha.pacioos.hawaii.edu/erddap",
    protocol="griddap",
)
Single word search.
[2]:
import pandas as pd
search_for = "etopo"
url = e.get_search_url(search_for=search_for, response="csv")
pd.read_csv(url)["Dataset ID"]
[2]:
0 etopo1_bedrock
1 etopo1_bedrock_lon360
2 etopo1_ice
3 etopo1_ice_lon360
4 etopo5
5 etopo5_lon180
Name: Dataset ID, dtype: object
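For orientation, the sketch below builds a search URL by hand with the standard library. It assumes ERDDAP's plain full-text search endpoint (search/index.<response> with the page, itemsPerPage, and searchFor parameters). The helper name build_search_url is hypothetical, and the URL that get_search_url actually returns may use a different endpoint and extra parameters.

```python
from urllib.parse import urlencode


def build_search_url(server, search_for, response="csv", page=1, items_per_page=1000):
    """Hypothetical helper: build an ERDDAP full-text search URL by hand."""
    query = urlencode(
        {"page": page, "itemsPerPage": items_per_page, "searchFor": search_for}
    )
    # Strip a trailing slash so the path is joined with exactly one "/".
    return f"{server.rstrip('/')}/search/index.{response}?{query}"


url = build_search_url("https://pae-paha.pacioos.hawaii.edu/erddap", "etopo")
print(url)
```

Like the URL from get_search_url, the resulting string can be handed to pd.read_csv.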
Narrowing the search with a more specific term.
[3]:
search_for = "etopo5"
url = e.get_search_url(search_for=search_for, response="csv")
pd.read_csv(url)["Dataset ID"]
[3]:
0 etopo5
1 etopo5_lon180
Name: Dataset ID, dtype: object
Filtering the search with words that should not be found, each prefixed with a minus sign.
[4]:
search_for = "etopo5 -lon360"
url = e.get_search_url(search_for=search_for, response="csv")
pd.read_csv(url)["Dataset ID"]
[4]:
0 etopo5
1 etopo5_lon180
Name: Dataset ID, dtype: object
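To make the include and exclude behaviour concrete, here is a simplified local model of the matching. It is only an assumption for illustration: it tests substrings of the dataset ID, whereas the server searches the dataset metadata, and the function name matches is hypothetical.

```python
import shlex


def matches(text, search_for):
    """Simplified model: every plain term must appear, no "-" term may appear."""
    text = text.lower()
    for term in shlex.split(search_for.lower()):
        if term.startswith("-") and len(term) > 1:
            # Excluded term: reject the text if the term is present.
            if term[1:] in text:
                return False
        elif term not in text:
            # Required term: reject the text if the term is missing.
            return False
    return True


# Dataset IDs taken from the single word search output above.
dataset_ids = [
    "etopo1_bedrock",
    "etopo1_bedrock_lon360",
    "etopo1_ice",
    "etopo1_ice_lon360",
    "etopo5",
    "etopo5_lon180",
]
kept = [d for d in dataset_ids if matches(d, "etopo -lon360")]
print(kept)
```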
Quoted search, or "phrase search". First, let us try the unquoted search.
[5]:
search_for = "ocean bathymetry"
url = e.get_search_url(search_for=search_for, response="csv")
len(pd.read_csv(url)["Dataset ID"])
[5]:
69
Too many datasets, because the unquoted query matches ocean and bathymetry as separate words as well as the phrase ocean bathymetry. Now let's use the quoted search to reduce the results to only those matching the phrase ocean bathymetry.
[6]:
search_for = '"ocean bathymetry"'
url = e.get_search_url(search_for=search_for, response="csv")
len(pd.read_csv(url)["Dataset ID"])
[6]:
6
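The three query styles used so far (plain words, excluded words, quoted phrases) can be assembled programmatically. The sketch below uses a hypothetical helper, build_search_for, which is not part of erddapy. It only produces the string that is passed as search_for.

```python
def build_search_for(words=(), phrases=(), exclude=()):
    """Hypothetical helper: combine plain words, quoted phrases, and excluded words."""
    parts = list(words)
    # Wrap each phrase in double quotes so it is searched as a unit.
    parts += [f'"{phrase}"' for phrase in phrases]
    # Prefix each excluded word with a minus sign.
    parts += [f"-{word}" for word in exclude]
    return " ".join(parts)


phrase_query = build_search_for(phrases=["ocean bathymetry"])
exclude_query = build_search_for(words=["etopo5"], exclude=["lon360"])
print(phrase_query)
print(exclude_query)
```

The two printed strings are the same values assigned to search_for in cells [6] and [4].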
Another common search operation is to search multiple servers instead of only one. In erddapy we can achieve that with search_servers:
[7]:
from erddapy.multiple_server_search import search_servers
df = search_servers(
    query="glider",
    servers_list=None,
    parallel=True,
    protocol="tabledap",
)
[8]:
print(f"There are {len(df)} entries in this search!")
There are 3849 entries in this search!
These are the servers that have glider data according to our query.
[9]:
set(df["Server url"])
[9]:
{'http://erddap.cencoos.org/erddap/',
'http://erddap.secoora.org/erddap/',
'https://cwcgom.aoml.noaa.gov/erddap/',
'https://erddap-goldcopy.dataexplorer.oceanobservatories.org/erddap/',
'https://erddap.griidc.org/erddap/',
'https://erddap.observations.voiceoftheocean.org/erddap/',
'https://erddap.sensors.ioos.us/erddap/',
'https://gliders.ioos.us/erddap/',
'https://www.smartatlantic.ca/erddap/'}
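Beyond listing the servers, it can help to see how many entries each one contributes. The sketch below counts entries per server with the standard library. The rows are made-up stand-ins for the search result, and only the column name "Server url" comes from the output above. With the real DataFrame, df["Server url"].value_counts() should give the same kind of breakdown.

```python
from collections import Counter

# Made-up rows standing in for the search result (the dataset IDs are invented).
rows = [
    {"Dataset ID": "example-glider-1", "Server url": "https://gliders.ioos.us/erddap/"},
    {"Dataset ID": "example-glider-2", "Server url": "https://gliders.ioos.us/erddap/"},
    {"Dataset ID": "example-glider-3", "Server url": "https://erddap.sensors.ioos.us/erddap/"},
]

# Count how many entries each server contributed.
counts = Counter(row["Server url"] for row in rows)
for server, n in counts.most_common():
    print(f"{n:>3}  {server}")
```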
One way to reduce the number of results is to search a subset of the servers with the servers_list argument. We can also use it to search servers that are not part of the awesome ERDDAP list (https://github.com/IrishMarineInstitute/awesome-erddap).
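A minimal sketch of a subset search follows. The helpers normalize_servers and search_subset are hypothetical. The text does not say whether search_servers needs a trailing slash on each URL, so the normalization here is only a convenience for removing duplicates. search_subset reuses the search_servers arguments from cell [7] and requires erddapy and network access, so it is defined but not called.

```python
def normalize_servers(urls):
    """Hypothetical helper: add a trailing slash and drop duplicates, keeping order."""
    seen = {}
    for url in urls:
        seen.setdefault(url.rstrip("/") + "/", None)
    return list(seen)


def search_subset(query, urls):
    """Run search_servers on a hand-picked subset (needs erddapy and network access)."""
    from erddapy.multiple_server_search import search_servers

    return search_servers(
        query=query,
        servers_list=normalize_servers(urls),
        parallel=True,
        protocol="tabledap",
    )


subset = normalize_servers(
    [
        "https://gliders.ioos.us/erddap",
        "https://gliders.ioos.us/erddap/",
        "https://erddap.sensors.ioos.us/erddap/",
    ]
)
print(subset)
```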
One can also perform an advanced search with ERDDAP constraints using advanced_search_servers.
[10]:
from erddapy.multiple_server_search import advanced_search_servers
min_time = "2017-07-01T00:00:00Z"
max_time = "2017-09-01T00:00:00Z"
min_lon, max_lon = -127, -123.75
min_lat, max_lat = 43, 48
standard_name = "sea_water_practical_salinity"
kw = {
    "standard_name": standard_name,
    "min_lon": min_lon,
    "max_lon": max_lon,
    "min_lat": min_lat,
    "max_lat": max_lat,
    "min_time": min_time,
    "max_time": max_time,
    "cdm_data_type": "timeseries",  # let's exclude AUV's tracks
}
servers = {
    "ooi": "https://erddap.dataexplorer.oceanobservatories.org/erddap/",
    "ioos": "https://erddap.sensors.ioos.us/erddap/",
}
df = advanced_search_servers(servers_list=servers.values(), **kw)
df.head()
[10]:
| | Title | Institution | Dataset ID | Server url |
|---|---|---|---|---|
| 0 | Coastal Endurance: Oregon Inshore Surface Moor... | Ocean Observatories Initiative (OOI) | ooi-ce01issm-rid16-02-flortd000 | https://erddap.dataexplorer.oceanobservatories... |
| 1 | Coastal Endurance: Oregon Inshore Surface Moor... | Ocean Observatories Initiative (OOI) | ooi-ce01issm-rid16-03-ctdbpc000 | https://erddap.dataexplorer.oceanobservatories... |
| 2 | Coastal Endurance: Oregon Inshore Surface Moor... | Ocean Observatories Initiative (OOI) | ooi-ce01issm-rid16-03-dostad000 | https://erddap.dataexplorer.oceanobservatories... |
| 3 | Coastal Endurance: Oregon Inshore Surface Moor... | Ocean Observatories Initiative (OOI) | ooi-ce01issm-rid16-07-nutnrb000 | https://erddap.dataexplorer.oceanobservatories... |
| 4 | Coastal Endurance: Oregon Inshore Surface Moor... | Ocean Observatories Initiative (OOI) | ooi-ce01issm-rid16-06-phsend000 | https://erddap.dataexplorer.oceanobservatories... |
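Because the advanced search takes several paired minimum and maximum constraints, a small validation step can catch swapped bounds before any request is sent. The sketch below uses a hypothetical helper, build_constraints, which is not part of erddapy. It returns a dictionary with the same keys as kw in cell [10].

```python
from datetime import datetime


def build_constraints(standard_name, bbox, time_range, cdm_data_type=None):
    """Hypothetical helper: validate inputs and return advanced-search keyword arguments."""
    min_lon, max_lon, min_lat, max_lat = bbox
    min_time, max_time = time_range
    if min_lon > max_lon or min_lat > max_lat:
        raise ValueError("bounding box minimum exceeds maximum")
    # Parse the ISO 8601 timestamps; "Z" is replaced so older Pythons accept it.
    start = datetime.fromisoformat(min_time.replace("Z", "+00:00"))
    end = datetime.fromisoformat(max_time.replace("Z", "+00:00"))
    if start > end:
        raise ValueError("min_time is after max_time")
    kw = {
        "standard_name": standard_name,
        "min_lon": min_lon,
        "max_lon": max_lon,
        "min_lat": min_lat,
        "max_lat": max_lat,
        "min_time": min_time,
        "max_time": max_time,
    }
    if cdm_data_type is not None:
        kw["cdm_data_type"] = cdm_data_type
    return kw


kw = build_constraints(
    "sea_water_practical_salinity",
    bbox=(-127, -123.75, 43, 48),
    time_range=("2017-07-01T00:00:00Z", "2017-09-01T00:00:00Z"),
    cdm_data_type="timeseries",
)
print(kw)
```

The returned dictionary can then be unpacked into the call, as in advanced_search_servers(servers_list=servers.values(), **kw).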