For researchers, getting data out of GBIF nodes is easy…
…but sharing your own data is hard.
galaxias (and friends)
galaxias: Build, check & publish Darwin Core Archives (DwC-As)
corella: Convert a tibble to Darwin Core
delma: Convert markdown to EML or xml
An archive is a .zip file containing three things:
data: csv format
metadata: eml format
schema: xml format
data | metadata | schema | archive | validate | submit
Load galaxias with library(galaxias); delma and corella are loaded automatically.
Load an example dataset
# A tibble: 2 × 5
latitude longitude date time species
<dbl> <dbl> <chr> <chr> <chr>
1 -35.3 149. 14-01-2023 10:23 Callocephalon fimbriatum
2 -35.3 149. 15-01-2023 11:25 Eolophus roseicapilla
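For readers without the original file, a minimal sketch that rebuilds the example dataset by hand. The coordinates are typed in as they appear in the rounded printout above, so they are placeholders rather than the true values, and the object name `df` is taken from the code that follows.

```r
library(tibble)

# Rebuild the two-row example by hand.
# Coordinates are the rounded values from the printout, not the originals.
df <- tribble(
  ~latitude, ~longitude, ~date,        ~time,   ~species,
  -35.3,     149,        "14-01-2023", "10:23", "Callocephalon fimbriatum",
  -35.3,     149,        "15-01-2023", "11:25", "Eolophus roseicapilla"
)

df
```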
How should we convert this dataset to Darwin Core?
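A rough way to get that advice is to compare the column names with Darwin Core terms. The sketch below uses a small, hand-picked subset of terms (an assumption, not the full standard) and base R only. corella's own suggest_workflow(), if available in your installed version, is likely the more complete route.

```r
# Column names of the example dataset
cols <- c("latitude", "longitude", "date", "time", "species")

# A hand-picked subset of Darwin Core terms, for illustration only
dwc_terms <- c(
  "occurrenceID", "basisOfRecord",
  "decimalLatitude", "decimalLongitude",
  "eventDate", "eventTime",
  "scientificName", "taxonRank"
)

# Columns already named as Darwin Core terms, and those still to convert
matched   <- intersect(cols, dwc_terms)
unmatched <- setdiff(cols, dwc_terms)

matched    # none of the columns match yet
unmatched  # every column needs renaming or converting
```

None of the five columns uses a Darwin Core name, so each one needs mapping to a term, and terms such as occurrenceID and basisOfRecord need adding.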
If we follow that advice:
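df_dwc <- df |>
  # add a unique identifier and the type of record
  set_occurrences(occurrenceID = sequential_id(),
                  basisOfRecord = "humanObservation") |>
  # rename coordinates to Darwin Core terms
  set_coordinates(decimalLatitude = latitude,
                  decimalLongitude = longitude) |>
  # parse day-month-year dates and hour:minute times
  set_datetime(eventDate = lubridate::dmy(date),
               eventTime = lubridate::hm(time)) |>
  # supply the taxon name and its rank
  set_scientific_name(scientificName = species,
                      taxonRank = "species")
df_dwc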
# A tibble: 2 × 8
basisOfRecord occurrenceID decimalLatitude decimalLongitude eventDate
<chr> <chr> <dbl> <dbl> <date>
1 humanObservation 01 -35.3 149. 2023-01-14
2 humanObservation 02 -35.3 149. 2023-01-15
# ℹ 3 more variables: eventTime <Period>, scientificName <chr>, taxonRank <chr>
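The date and time parsing used above can be tried on its own. A small sketch, assuming lubridate is installed; the last step is a base R alternative (not part of the original workflow) for cases where a plain "HH:MM:SS" string is preferred over a Period object.

```r
library(lubridate)

# Day-month-year strings become Date objects
event_date <- dmy(c("14-01-2023", "15-01-2023"))
format(event_date)

# Hour:minute strings become Period objects
event_time <- hm(c("10:23", "11:25"))
hour(event_time)
minute(event_time)

# Base R alternative: keep times as "HH:MM:SS" character strings
time_iso <- format(strptime(c("10:23", "11:25"), format = "%H:%M"), "%H:%M:%S")
time_iso
```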
Save as occurrences.csv
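One way to do this by hand, sketched with base R only. The data frame is a stand-in for df_dwc built from the printed values (eventTime is left out for brevity), and the temporary data-publish folder is a hypothetical location chosen so the example does not touch a real project. galaxias may offer its own helper for this step.

```r
# Stand-in for df_dwc, using the values printed above
df_dwc <- data.frame(
  basisOfRecord    = "humanObservation",
  occurrenceID     = c("01", "02"),
  decimalLatitude  = c(-35.3, -35.3),
  decimalLongitude = c(149, 149),
  eventDate        = as.Date(c("2023-01-14", "2023-01-15")),
  scientificName   = c("Callocephalon fimbriatum", "Eolophus roseicapilla"),
  taxonRank        = "species"
)

# Hypothetical output location: a data-publish folder in a temporary directory
out_dir <- file.path(tempdir(), "data-publish")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
out_file <- file.path(out_dir, "occurrences.csv")

# Write without row names so the file holds only Darwin Core columns
write.csv(df_dwc, out_file, row.names = FALSE)

# Read back, keeping occurrenceID as text so leading zeros survive
check <- read.csv(out_file, colClasses = c(occurrenceID = "character"))
check
```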
Generate a metadata file
Convert to EML
<?xml version="1.0" encoding="UTF-8"?>
<emlEml xmlns:d="eml://ecoinformatics.org/dataset-2.1.0" xmlns:eml="eml://ecoinformatics.org/eml-2.1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/terms/" xsi:schemaLocation="eml://ecoinformatics.org/eml-2.1.1 http://rs.gbif.org/schema/eml-gbif-profile/1.3/eml-gbif-profile.xsd" system="R-paperbark-package" scope="system" xml:lang="en">
<dataset>
<title>A Sentence Giving Your Dataset Title In Title Case</title>
<abstract>A paragraph outlining the content of the dataset</abstract>
<creator>
<individualName>
<surname>Person</surname>
<givenName>Steve</givenName>
<electronicMailAddress>example@email.com</electronicMailAddress>
</individualName>
<organisationName>Put your organisation name here</organisationName>
<address>
<deliveryPoint>215 Road Street</deliveryPoint>
<city>Canberra</city>
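To show the idea behind this conversion, here is a toy sketch that turns a nested list into XML text. It is not delma's implementation: `to_xml()` is a hypothetical helper, it does no escaping of special characters, and it ignores attributes and namespaces.

```r
# Toy converter: nested named list -> character vector of XML lines
to_xml <- function(x, name, indent = 0) {
  pad <- strrep("  ", indent)
  if (is.list(x)) {
    # Recurse into each child, indenting one level deeper
    children <- mapply(to_xml, x, names(x),
                       MoreArgs = list(indent = indent + 1),
                       SIMPLIFY = FALSE, USE.NAMES = FALSE)
    c(paste0(pad, "<", name, ">"),
      unlist(children),
      paste0(pad, "</", name, ">"))
  } else {
    # Leaf: wrap the text in opening and closing tags
    paste0(pad, "<", name, ">", x, "</", name, ">")
  }
}

# Metadata taken from the template text shown above
metadata <- list(
  title    = "A Sentence Giving Your Dataset Title In Title Case",
  abstract = "A paragraph outlining the content of the dataset"
)

xml_lines <- to_xml(metadata, "dataset")
writeLines(xml_lines)
```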
Automated process for zipping the /data-publish folder.
We can check that the correct files are present.
The schema file (meta.xml) has been built automatically.
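The file check can be sketched in base R. The helper `missing_archive_files()` and the demo folder are hypothetical. The three file names assume the usual Darwin Core Archive layout: occurrences.csv for data, eml.xml for metadata and meta.xml for the schema.

```r
# Files expected in the folder before zipping (assumed layout)
required <- c("occurrences.csv", "eml.xml", "meta.xml")

# Hypothetical helper: which required files are absent from `dir`?
missing_archive_files <- function(dir, required_files = required) {
  setdiff(required_files, list.files(dir))
}

# Demo: a temporary folder that lacks the schema file
demo_dir <- tempfile("data-publish-")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("occurrences.csv", "eml.xml")))

missing_archive_files(demo_dir)  # returns "meta.xml"
```

An empty result means all three files are in place and the folder is ready to zip.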
Run submit_archive() to create an issue on the data-publication repository.
Peggy Newman
Martin Westgate
Amanda Buyan
Dax Kellie
Shandiya Balasubramaniam
galaxias | corella | delma | galah