Lessons from the Atlas of Living Australia
Martin Westgate / Science & Decision Support Team / ALA
SORTEE Conference / 13 July 2022
@westgatecology
‘…we predict that the word “novel” will appear in every record by the year 2123.’
C Vinkers et al. (2015) Use of positive and negative words in scientific PubMed abstracts between 1974 and 2014: retrospective analysis. BMJ 351 https://doi.org/10.1136/bmj.h6467
Thesis #1:
Open science is in a transition to a more infrastructure-dependent model
Thread to give a sense of COS's direction in metascience.
— Brian Nosek (@BrianNosek) July 9, 2022
We are evolving from piecemeal project-to-project research to a programs model with areas of ongoing investment to support our mission and guide effective culture change in reward systems, rigor, and reproducibility https://t.co/3JlzE2lAdC
Region | Organisation | Records (Millions) |
---|---|---|
Australia | ALA | 112.3 |
Austria | BAO | 7.8 |
Brazil | SiBBr | 23.6 |
Canada | Canadensys | 6.3 |
Estonia | eElurikkus | 6.2 |
France | INPN | 87.4 |
Portugal | GBIF.pt | 17.5 |
Spain | GBIF.es | 36.3 |
Sweden | SBDI | 103.4 |
UK | NBN | 204.8 |
Vermont | VAL | 7.2 |
Global | GBIF | 2,204.6 |
Project stage | Academia | Infrastructure |
---|---|---|
Data collection | fieldwork | collaboration with institutions |
Data formatting | customizable | established standards |
Data management | spreadsheet, app | processing pipeline |
Data storage | single machine or online | database |
Data out | - | API |
Thesis #2:
In science, stability & innovation are co-dependent
Image source: https://whatson.melbourne.vic.gov.au/things-to-do/state-library-victoria
Image source: Gibney, E. How the revamped Large Hadron Collider will hunt for new physics. Nature 25-05-2022
Images:
Henry Oldenburg - Philosophical Transactions, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=36495651
Trisos, C.H., Auerbach, J. & Katti, M. Decoloniality and anti-oppressive practices for a more ethical ecology. Nat Ecol Evol 5, 1205–1212 (2021). https://doi.org/10.1038/s41559-021-01460-w
Thesis #3:
Working across the stability / innovation boundary is difficult
galah
galah / ALA4R / benefits
galah / ALA4R / problems
No function naming convention
aus(), ala_fields(), fieldguide(), occurrences(), images()
galah / ALA4R / problems
Confusing syntax
ala_list(), ala_lists(), specieslist()
wkt, fq, qa
solr queries passed as strings: "taxon_name:\"Alaba vibex\""
galah / ALA4R / problems
Inconsistent behaviour
data.frame
occurrences() returns a list
fieldguide() and plot.occurrences() output a PDF
galah / benefits
Lookup | Narrow a query | Run a query |
---|---|---|
show_all() | galah_filter() | atlas_counts() |
search_all() | galah_select() | atlas_occurrences() |
- | galah_group_by() | atlas_media() |
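In the "Narrow a query" column, galah_filter() takes R expressions such as year >= 2010, whereas ALA4R needed hand-written solr strings. As a rough illustration of that idea, the base R sketch below captures a single == comparison and turns it into a solr-style string. The helper name expr_to_query() is hypothetical and exists only for this example; it is not how galah is implemented.

```r
# Hypothetical helper (not part of galah or ALA4R): translate an
# expression of the form `field == "value"` into a solr-style string.
# Only a single `==` comparison is handled.
expr_to_query <- function(expr) {
  e <- substitute(expr)                    # capture the unevaluated expression
  stopifnot(is.call(e), identical(e[[1]], as.name("==")))
  field <- as.character(e[[2]])            # left-hand side: the field name
  value <- eval(e[[3]], parent.frame())    # right-hand side: the value
  paste0(field, ":\"", value, "\"")
}

query <- expr_to_query(taxon_name == "Alaba vibex")
cat(query, "\n")
```

The user writes ordinary R and never has to escape quotes by hand, which is the main ergonomic gain over building the string directly.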
galah / number of records
# A tibble: 1 × 1
   count
   <int>
1 992079
galah / number of records
galah_call() |>
  galah_identify("Eolophus roseicapilla") |> # galah
  galah_filter(year >= 2010,
               dataResourceName == "iNaturalist Australia") |>
  galah_group_by(year) |>
  atlas_counts()
# A tibble: 13 × 2
year count
<chr> <int>
1 2021 1933
2 2020 1571
3 2019 942
4 2022 917
5 2018 821
6 2017 537
7 2016 194
8 2015 110
9 2014 79
10 2013 62
11 2011 54
12 2012 41
13 2010 36
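The output above is sorted by count, and year is a character column, so a little tidying helps before plotting it as a time series. A minimal base R sketch, with the values copied from the table above rather than downloaded again:

```r
# Counts copied from the output above; `year` arrives as character
counts <- data.frame(
  year  = c("2021", "2020", "2019", "2022", "2018", "2017", "2016",
            "2015", "2014", "2013", "2011", "2012", "2010"),
  count = c(1933L, 1571L, 942L, 917L, 821L, 537L, 194L,
            110L, 79L, 62L, 54L, 41L, 36L)
)

# Convert year to integer and sort chronologically
counts$year <- as.integer(counts$year)
counts <- counts[order(counts$year), ]
rownames(counts) <- NULL

print(counts)
total <- sum(counts$count)   # total records across all years shown
```

Given the date of the talk, the 2022 figure presumably covers only part of the year, so it is not directly comparable with earlier years.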
galah / number of records
galah_call() |>
  galah_identify("Cacatuidae") |> # cockatoos
  galah_filter(year >= 2019) |>
  galah_group_by(year, dataResourceName) |>
  atlas_counts()
# A tibble: 80 × 3
year dataResourceName count
<chr> <chr> <int>
1 2021 eBird Australia 248142
2 2021 iNaturalist Australia 7621
3 2021 NSW BioNet Atlas 1490
4 2021 Earth Guardians Weekly Feed 927
5 2021 SA Fauna (BDBSA) 300
6 2021 NatureMapr 166
7 2021 WildNet - Queensland Wildlife Data 153
8 2021 ALA species sightings and OzAtlas 118
9 2021 Wildlife Watch NSC 105
10 2021 Port Adelaide Enfield Flora & Fauna Monitoring 37
# … with 70 more rows
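The queries above are assembled step by step and only run at the final atlas_counts() call. The sketch below imitates that pattern in base R so the design is easier to see. Every function name in it is hypothetical, it makes no web requests, and it is not galah's internal code.

```r
# Hypothetical stand-ins for the three stages; none are galah functions.
query_start <- function() list(filters = character(), group = NULL)

# "Narrow a query": each step returns a modified copy of the query object
add_filter <- function(query, text) {
  query$filters <- c(query$filters, text)
  query
}
add_group <- function(query, field) {
  query$group <- field
  query
}

# "Run a query": only this step assembles the request (here, just a string)
describe_query <- function(query) {
  paste0("filter: ", paste(query$filters, collapse = " AND "),
         "; group by: ", query$group)
}

result <- query_start() |>
  add_filter("year >= 2010") |>
  add_group("year") |>
  describe_query()
print(result)
```

Because each narrowing step only edits a plain object, steps can be added, removed, or reordered without triggering a download until the final call.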
galah / occurrences
library(galah)
library(ozmaps)
library(sf)
library(ggplot2)

# Enter email
galah_config(email = "martinjwestgate@gmail.com")

# Download species occurrences
obs <- galah_call() |>
  galah_identify("peramelidae") |>
  galah_filter(year == 2021) |>
  atlas_occurrences()

# Ensure map uses correct projection
oz_wgs84 <- ozmap_data(data = "country") |>
  st_transform(crs = st_crs("WGS84"))

# Map points
ggplot(data = obs) +
  geom_sf(data = oz_wgs84,
          fill = "white") +
  geom_point(aes(x = decimalLongitude,
                 y = decimalLatitude),
             color = "#78cccc") +
  theme_void()
galah / other atlases
# A tibble: 1 × 1
    count
    <int>
1 7786013
galah / ALA labs
galah
Summary:
Martin Westgate
Team Leader / Science & Decision Support / ALA
e: martin.westgate@csiro.au
t: @westgatecology
gh: @mjwestgate
These slides were made using Quarto & RStudio