
Predicting the Ecological Quality Status of Marine Environments from eDNA Metabarcoding Data Using Supervised Machine Learning






Environmental Science and Technology 51(16): 9118-9126



Monitoring biodiversity is essential to assess the impacts of increasing anthropogenic activities in marine environments. Traditionally, marine biomonitoring involves the sorting and morphological identification of benthic macro-invertebrates, which is time-consuming and demands taxonomic expertise. High-throughput amplicon sequencing of environmental DNA (eDNA metabarcoding) represents a promising alternative for benthic monitoring. However, a substantial fraction of eDNA sequences remains unassigned or belongs to taxa of unknown ecology, which prevents their use for assessing ecological quality status. Here, we show that supervised machine learning (SML) can be used to build robust predictive models for benthic monitoring, regardless of the taxonomic assignment of eDNA sequences. We tested three SML approaches to assess the environmental impact of marine aquaculture, using eDNA from benthic foraminifera, a group of unicellular eukaryotes known to be good bioindicators, as features to infer macro-invertebrate-based biotic indices. The ecological status we inferred was similar to that obtained from macro-invertebrate inventories. We argue that SML approaches could overcome, and even bypass, costly and time-consuming morpho-taxonomic approaches in future biomonitoring.
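
The core idea of the abstract, predicting a biotic index from sequence abundance profiles without assigning the sequences to taxa, can be illustrated with a minimal sketch. The abstract does not name the three SML approaches that were tested, so the k-nearest-neighbours regressor below is only a simple stand-in and not the authors' method. The read counts, the index values, and the status thresholds are all invented toy values, and every function name is hypothetical.

```python
from typing import List, Sequence


def relative_abundance(counts: Sequence[float]) -> List[float]:
    """Convert raw read counts for one sample into proportions."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [c / total for c in counts]


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two abundance profiles."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0


def knn_predict_index(
    train_counts: Sequence[Sequence[float]],
    train_index: Sequence[float],
    query_counts: Sequence[float],
    k: int = 2,
) -> float:
    """Predict a biotic index as the mean index of the k most similar samples.

    Features are anonymous sequence clusters; no taxonomy is needed.
    """
    query = relative_abundance(query_counts)
    ranked = sorted(
        (bray_curtis(relative_abundance(x), query), y)
        for x, y in zip(train_counts, train_index)
    )
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


def status_class(index_value: float) -> str:
    """Map an index value to a status class (hypothetical thresholds)."""
    if index_value < 2.0:
        return "good"
    if index_value < 4.0:
        return "moderate"
    return "poor"


# Toy training set (assumption): rows are samples, columns are read counts
# for four unassigned sequence clusters; labels are index values that would
# come from a macro-invertebrate inventory of the same samples.
TRAIN_COUNTS = [
    [90, 5, 5, 0],
    [80, 10, 10, 0],
    [40, 40, 15, 5],
    [30, 45, 20, 5],
    [5, 10, 25, 60],
    [0, 5, 30, 65],
]
TRAIN_INDEX = [1.0, 1.5, 3.0, 3.5, 5.5, 6.0]

if __name__ == "__main__":
    for sample in ([85, 10, 5, 0], [2, 8, 28, 62]):
        predicted = knn_predict_index(TRAIN_COUNTS, TRAIN_INDEX, sample)
        print(sample, round(predicted, 2), status_class(predicted))
```

In this sketch the labels come from the conventional inventory only at training time; a new sample needs nothing but its sequence counts to receive a predicted index and status class.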


Accession: 060116400


PMID: 28665601

DOI: 10.1021/acs.est.7b01518

