R packages for data quality

What’s already out there

packages
Author

Martin Westgate

Published

April 17, 2026

There are a bunch of R packages available for getting biodiversity data. It is worth saying, first of all, that the data aggregators themselves do a lot of work on data quality: GBIF and the living atlases have systems that tag (and sometimes alter) records as having various problems. galah and rgbif are therefore relevant resources here, but that’s a topic for a different time. This page focusses on R packages that have cleaning biodiversity data as their central purpose, or that include a cleaning component as part of some wider workflow.

bdc

Widely used and well respected.

Docs | Code | CRAN

DOI (paper): 10.1111/2041-210X.13868

CoordinateCleaner

Solid package offering tools not available elsewhere. Part of rOpenSci.

Docs | Code | CRAN

DOI (paper): 10.1111/2041-210X.13152

Workflows

  • eBird / Cornell Lab of Ornithology have data cleaning workflows