R packages for data quality
What’s already out there
Packages
There are a bunch of R packages available for getting biodiversity data. It is worth saying, first of all, that the data aggregators themselves do a lot of work on data quality. GBIF and the living atlases have systems that flag records with various problems (and sometimes alter them). So galah and rgbif are relevant resources here, but that’s a topic for a different time. This page focuses on R packages that have cleaning biodiversity data as their central purpose, or that include a cleaning component as part of a wider workflow.
bdc
Widely used and well respected.
DOI (paper): 10.1111/2041-210X.13868
CoordinateCleaner
Solid package offering tools not available elsewhere. An rOpenSci package.
DOI (paper): 10.1111/2041-210X.13152
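To give a feel for what a coordinate check looks like in practice, here is a minimal sketch using three of CoordinateCleaner's individual tests. The data frame, its values, and the species name are made up for illustration, and the column names are passed explicitly because the defaults have differed between package versions. Treat it as a starting point rather than a recommended workflow.

```r
# Sketch only: `records` is invented example data, not a real download.
library(CoordinateCleaner)

records <- data.frame(
  species          = rep("Hypothetical species", 6),
  decimalLongitude = c(151.21,  0, 200, 35.5,  0.0, 144.96),
  decimalLatitude  = c(-33.87,  0,  10, 35.5, 12.3, -37.81)
)

# With value = "clean", each cc_*() function returns the data frame
# with the records it flags removed.

# 1. Drop coordinates that are non-numeric, missing, or outside the
#    valid latitude/longitude range.
valid <- cc_val(records,
                lon = "decimalLongitude", lat = "decimalLatitude",
                value = "clean", verbose = FALSE)

# 2. Drop records whose longitude and latitude are identical in
#    absolute value, which often indicates a data entry error.
not_equal <- cc_equ(valid,
                    lon = "decimalLongitude", lat = "decimalLatitude",
                    value = "clean", verbose = FALSE)

# 3. Drop records with a zero longitude or latitude, or that sit
#    very close to the point 0/0.
cleaned <- cc_zero(not_equal,
                   lon = "decimalLongitude", lat = "decimalLatitude",
                   value = "clean", verbose = FALSE)

cleaned
```

Passing value = "flagged" instead of "clean" returns a logical vector (TRUE for records that pass the test), which is handy if you would rather inspect suspect records than drop them straight away.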
Workflows
- eBird / the Cornell Lab of Ornithology have data cleaning workflows