Publication

Bridging chemical structure and genetic similarity: A Bio-QSAR model for cross-species toxicity predictions

Forastiere, Mirko
Vijver, Martina
Posthuma, Leo
Visser, Marco
Series / Report no.
Open Access
Type
Journal Article
Article
Language
en
Date of publication
2026-03-23
Published in
Ecotoxicol Environ Saf 2026; 314:120057
Abstract
Adding species genetic traits to QSAR models improves the accuracy of cross-species toxicity predictions. Building on this idea, we developed a Bio-QSAR model that interprets genetic similarity as a proxy for common physiological responses and integrates chemical and genetic features within a neural network. The main advantage is that the species embedding is continuous and thus allows for a higher degree of generalization than taxonomy-based categories, improving predictions for new, untested species. We compared our model's performance with that of recently published Bio-QSAR models. Our model achieved a similar level of predictive accuracy, demonstrating that it is competitive with current state-of-the-art methods. Rather than presenting only the best-performing model as the flagship, we report the distribution of performance metrics, along with their average, across repeated cross-validation. Additionally, we report the range of these metrics to facilitate comparisons with other models. Finally, we analyzed the model outputs to identify potential sampling biases in the datasets, highlighting extremes in species and chemical toxic responses. Our results demonstrate that this species embedding is a highly effective approach for read-across scenarios in common ecotoxicological datasets with high data sparsity. In its current state, the model can be used to fill gaps in datasets, to improve environmental risk assessment with more data, and to direct the prioritization of new tests for different species or chemicals. These future improvements will allow more accurate predictions for completely new species, whether lab-reared, rare, or indigenous.
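The reporting practice described in the abstract (showing the distribution, average, and range of a metric across repeated cross-validation rather than a single best run) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, a one-variable least-squares line stands in for the neural network, RMSE is an arbitrary choice of metric, and the helper names (`repeated_kfold_indices`, `fit_line`, `summarize_cv`) are hypothetical. It is not the authors' pipeline.

```python
import random
import statistics


def repeated_kfold_indices(n_samples, n_splits, n_repeats, seed=0):
    """Yield (train_idx, test_idx) for every fold of every repeat."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh shuffle for each repeat
        for k in range(n_splits):
            test_idx = order[k::n_splits]
            test_set = set(test_idx)
            train_idx = [i for i in order if i not in test_set]
            yield train_idx, test_idx


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (stand-in for the real model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    n = len(y_true)
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n) ** 0.5


def summarize_cv(xs, ys, n_splits=5, n_repeats=10, seed=0):
    """Collect one RMSE per fold, then report mean and range."""
    scores = []
    for train, test in repeated_kfold_indices(len(xs), n_splits, n_repeats, seed):
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a * xs[i] + b for i in test]
        scores.append(rmse([ys[i] for i in test], preds))
    return {
        "scores": scores,  # full distribution, one value per fold
        "mean": statistics.fmean(scores),
        "min": min(scores),
        "max": max(scores),
    }


# Synthetic data (assumption): a noisy linear relationship.
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 10) for _ in range(100)]
ys = [0.5 * x + 1.0 + data_rng.gauss(0, 0.3) for x in xs]

summary = summarize_cv(xs, ys)
print(f"folds evaluated: {len(summary['scores'])}")
print(f"RMSE mean: {summary['mean']:.3f}")
print(f"RMSE range: {summary['min']:.3f} to {summary['max']:.3f}")
```

Reporting the per-fold scores together with their mean and range, instead of the single best split, makes the spread visible and gives other models a fairer basis for comparison.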