Improving data sharing & re-use
at the Atlas of Living Australia

Martin Westgate






I acknowledge the Traditional Owners of the lands on which I live and work, the Ngunnawal people, and pay my respects to Elders past and present.


/outline

Data Custodian → (Publish) → Data Store → (Download) → Data User

/outline


/galah getting data
/events survey data
/galaxias publishing data

/galah

Data from GBIF nodes in R & Python

Global gbif.org
UK nbn.org.uk
France openobs.mnhn.fr
Australia ala.org.au
Sweden biodiversitydata.se
Spain gbif.es
Brazil sibbr.gov.br
Portugal gbif.pt
Austria biodiversityatlas.at
Estonia elurikkus.ee
Guatemala snib.conap.gob.gt

Number of records


library(galah)

galah_call() |>
  filter(genus == "Perameles",
         basisOfRecord == "HumanObservation") |>
  count() |>
  collect()
# A tibble: 1 × 1
  count
  <int>
1 36610

Download records


galah_call() |>
  filter(genus == "Perameles",
         year == 2024,
         basisOfRecord == "HumanObservation") |>
  collect() |>
  slice_head(n = 3)
# A tibble: 3 × 8
  recordID        scientificName taxonConceptID decimalLatitude decimalLongitude
  <chr>           <chr>          <chr>                    <dbl>            <dbl>
1 061ecc74-8b63-… Perameles nas… https://biodi…           -28.4             154.
2 1012b15b-5275-… Perameles nas… https://biodi…           -28.2             153.
3 19fd1821-8ffd-… Perameles gun… https://biodi…           -42.9             147.
# ℹ 3 more variables: eventDate <dttm>, occurrenceStatus <chr>,
#   dataResourceName <chr>

Number of species


galah_call(type = "species") |>
  filter(genus == "Perameles",
         basisOfRecord == "HumanObservation") |>
  count() |>
  collect()
# A tibble: 1 × 1
  count
  <int>
1     6

Species lists


galah_call(type = "species") |>
  filter(genus == "Perameles",
         basisOfRecord == "HumanObservation") |>
  collect()
# A tibble: 6 × 10
  species_guid            species author kingdom phylum class order family genus
  <chr>                   <chr>   <chr>  <chr>   <chr>  <chr> <chr> <chr>  <chr>
1 https://biodiversity.o… Perame… Geoff… Animal… Chord… Mamm… Pera… Peram… Pera…
2 https://biodiversity.o… Perame… Thoma… Animal… Chord… Mamm… Pera… Peram… Pera…
3 https://biodiversity.o… Perame… Gray,… Animal… Chord… Mamm… Pera… Peram… Pera…
4 https://biodiversity.o… Perame… Quoy … Animal… Chord… Mamm… Pera… Peram… Pera…
5 https://biodiversity.o… Perame… Spenc… Animal… Chord… Mamm… Pera… Peram… Pera…
6 https://biodiversity.o… Perame… Gray,… Animal… Chord… Mamm… Pera… Peram… Pera…
# ℹ 1 more variable: vernacular_name <chr>

Comparing packages

/galah

galah_call() |>
  filter(species == "Eucalyptus gunnii",
         basisOfRecord == "LivingSpecimen") |>
  select(decimalLatitude, decimalLongitude) |>
  collect()

/ALA4R

occurrences(
  fq = "basis_of_record:LivingSpecimen",
  taxon = "taxon_name:\"Eucalyptus gunnii\"",
  fields = c("latitude","longitude"),
  qa = "none")

/rgbif

occ_download(
  pred(species, "Eucalyptus gunnii"),
  pred(basisOfRecord, "LivingSpecimen"))

/events

Survey data

Crinia signifera 32
Paracrinia haswelli 9
Uperoleia tyleri 12
Limnodynastes dumerilii 3
Limnodynastes peronii 5
Limnodynastes tasmaniensis 19
Litoria jervisiensis 1
Litoria freycineti 0
Litoria nudidigitus 0
Litoria peronii 5
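
Unlike presence-only occurrence records, a survey table like this one also records species that were looked for and not found. As a minimal sketch in base R (the column names borrow Darwin Core terms; the derived variables are illustrative and not part of galah), the counts can be turned into present/absent records like so:

```r
# Counts copied from the survey table above
survey <- data.frame(
  scientificName = c(
    "Crinia signifera", "Paracrinia haswelli", "Uperoleia tyleri",
    "Limnodynastes dumerilii", "Limnodynastes peronii",
    "Limnodynastes tasmaniensis", "Litoria jervisiensis",
    "Litoria freycineti", "Litoria nudidigitus", "Litoria peronii"
  ),
  individualCount = c(32, 9, 12, 3, 5, 19, 1, 0, 0, 5),
  stringsAsFactors = FALSE
)

# Derive a Darwin Core-style occurrenceStatus from the counts
survey$occurrenceStatus <- ifelse(survey$individualCount > 0,
                                  "present", "absent")

# Simple summaries of the survey
n_detected <- sum(survey$occurrenceStatus == "present")
n_absent <- sum(survey$occurrenceStatus == "absent")
total_individuals <- sum(survey$individualCount)

# Species that were surveyed for but not detected
print(survey[survey$occurrenceStatus == "absent", ])
```

The zero rows are the part that a plain occurrence download would usually lose, which is why survey (event) data need their own structure.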

Home

Time series

Survey details

API

GraphQL syntax:

{
  "predicate": {
    "type": "and",
    "predicates": [
      {
        "type": "and",
        "predicates": [
          {
            "type": "equals",
            "key": "year",
            "value": "2023"
          },
          {
            "type": "in",
            "key": "stateProvince",
            "values": [
              "Australian Capital Territory"
            ]
          }
        ]
      }
    ]
  },
  "limit": 50,
  "offset": 0
}

galah syntax:

galah_call(type = "events") |>
  filter(year == 2023,
         stateProvince == "Australian Capital Territory") |>
  collect()

Note: the galah syntax for events is not implemented yet.
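
To make the correspondence between the two syntaxes concrete, here is a rough sketch of how the predicate above could be assembled by hand in R and serialised with jsonlite. The helpers `equals_pred()`, `in_pred()` and `and_pred()` are hypothetical names invented for this example; this is not how galah builds its queries internally.

```r
library(jsonlite)

# Hypothetical helpers (not part of galah): each returns one predicate as a list
equals_pred <- function(key, value) {
  list(type = "equals", key = key, value = as.character(value))
}
in_pred <- function(key, values) {
  # as.list() keeps `values` a JSON array even when there is only one entry
  list(type = "in", key = key, values = as.list(values))
}
and_pred <- function(...) {
  list(type = "and", predicates = list(...))
}

# Same structure as the JSON shown above: an outer "and" wrapping an inner "and"
query <- list(
  predicate = and_pred(
    and_pred(
      equals_pred("year", 2023),
      in_pred("stateProvince", "Australian Capital Territory")
    )
  ),
  limit = 50,
  offset = 0
)

# auto_unbox = TRUE writes length-one vectors as scalars rather than arrays
query_json <- toJSON(query, auto_unbox = TRUE, pretty = TRUE)
cat(query_json)
```

Written this way, each argument to `filter()` maps onto one predicate, which is the translation a package would need to perform on the user's behalf.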

/galaxias

Data publication

occurrenceID 123456
eventDate 2024-02-07T00:00:00Z
decimalLatitude -35.25
decimalLongitude 145.25
basisOfRecord HumanObservation
scientificName Litoria peronii
taxonRank species
recordedBy Martin Westgate
occurrenceStatus present
individualCount 1
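
Before publishing a record like this, it helps to run a few basic sanity checks. The sketch below uses base R only. `check_record()` is a hypothetical helper written for this example, not a galaxias function, and the set of terms it treats as required is an assumption made for illustration.

```r
# One record using Darwin Core terms as column names
record <- data.frame(
  occurrenceID = "123456",
  eventDate = "2024-02-07T00:00:00Z",
  decimalLatitude = -35.25,
  decimalLongitude = 145.25,
  basisOfRecord = "HumanObservation",
  scientificName = "Litoria peronii",
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns a named logical vector, one entry per check
check_record <- function(df) {
  # Terms treated as required here (an assumption for this sketch)
  required <- c("occurrenceID", "eventDate", "basisOfRecord", "scientificName")
  c(
    has_required_terms = all(required %in% names(df)),
    latitude_in_range = all(abs(df$decimalLatitude) <= 90),
    longitude_in_range = all(abs(df$decimalLongitude) <= 180)
  )
}

check_record(record)

# Swapping latitude and longitude is a common mistake; the range check catches
# it here because 145.25 is not a valid latitude
swapped <- record
swapped$decimalLatitude <- record$decimalLongitude
swapped$decimalLongitude <- record$decimalLatitude
check_record(swapped)
```

A range check only catches swaps where one value falls outside ±90, so it complements a visual check on a map rather than replacing it.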

Building an archive

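library(readr)    # provides read_csv()
library(galaxias) # assumed here to provide read_md(), dwca() and add_*()

df <- read_csv("my_data.csv")
metadata <- read_md("my_metadata.qmd")

archive <- dwca() |>
  add_occurrences(df) |>
  add_metadata(metadata)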

then:

  • build() - construct a DwCA zip file
  • check() - run tests to check for DwC compliance
  • publish() - send your data to the ALA publication API

/summary

Data Custodian → (Publish) → Data Store → (Download) → Data User

/thanks

The ALA Science & Decision Support Team are:

  • Shandiya Balasubramaniam
  • Amanda Buyan
  • Dax Kellie
  • Juliet Seers
  • Olivia Torresan
  • Callum Waite
  • Martin Westgate

These slides were made with Quarto &:

  • dplyr
  • galah
  • ggiraph
  • ggplot2
  • ozmaps
  • sf
  • tibble