EUOS25 Challenge

Goal: The EUOS25 Challenge assesses computational methods for predicting absorption and fluorescence properties of compounds [1].
N.B.! Plate or well IDs must NOT be used to predict test set compounds!
Data: The dataset comprises approximately 100k compounds systematically tested under the EU-OPENSCREEN bioprofiling program. The program measured multiple endpoints, including solubility, which had previously been used in the Kaggle Solubility Prediction Challenge [2,3]. For the present EUOS25 Challenge, absorbance and fluorescence spectra were processed to create the prediction targets. The dataset is split 70% for training and 30% for testing (leaderboard + blind set).

Challenge 1: Absorption (Transmittance, %T)

Develop classification models to predict compounds with absorption (Transmittance <= 70%):
* Subtask 1a: at 340 nm
* Subtask 1b: averaged over the 450-679 nm range
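
The labelling rule above can be sketched in a few lines of Python. This is only an illustration, not the organisers' processing pipeline: the `spectrum` mapping, the function names and the example readings are hypothetical, and it assumes that for Subtask 1b the mean transmittance over the range is what gets compared with the 70% threshold.

```python
from statistics import mean

THRESHOLD_PCT = 70.0  # from the challenge text: absorbing if %T <= 70


def is_absorbing(transmittance_pct, threshold=THRESHOLD_PCT):
    """Return 1 (active) if transmittance is at or below the threshold, else 0."""
    return int(transmittance_pct <= threshold)


def label_subtask_1b(spectrum, lo_nm=450, hi_nm=679):
    """Label from the mean %T over lo_nm..hi_nm (inclusive).

    `spectrum` is a hypothetical mapping of wavelength in nm to %T.
    Assumption: the mean transmittance is the quantity that is thresholded.
    """
    in_range = [t for wl, t in spectrum.items() if lo_nm <= wl <= hi_nm]
    if not in_range:
        raise ValueError("no readings in the requested wavelength range")
    return is_absorbing(mean(in_range))


# Made-up readings for one compound; 700 nm lies outside the 1b range.
example = {340: 55.0, 450: 60.0, 500: 70.0, 679: 80.0, 700: 10.0}
label_1a = is_absorbing(example[340])
label_1b = label_subtask_1b(example)
print(label_1a, label_1b)
```

Note that the threshold is inclusive, so a compound with exactly 70% transmittance counts as active in this sketch.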

Challenge 2: Fluorescence

Develop classification models to predict compounds that fluoresce above predefined thresholds at:
* Subtask 2a: 340/450 nm (excitation/emission)
* Subtask 2b: any of 480/540, 525/598, or 560/610 nm (excitation/emission)

N.B.! Modelling the *b subtasks is much more difficult because only a few active compounds are available for them (0.23% and 1.5% for the Fluorescence and Transmittance tasks, respectively).

Additional Data: The use of any related data to increase model accuracy (e.g., within multi-task modelling, pretraining, etc.) is allowed, except for experimental data for test set compounds. These data should be made publicly available during model validation. Participants should use only structural information to predict compounds from the test sets. The information about plates and wells (in the extended files) can, however, be used to normalise training set data, if required. Note that we found statistically significantly lower average absorption for well N8 (>450 nm) and for plates C1046, C1070, C1099 (both wavelengths) and C1054, C1096, C1051, C1048, C1053 (450 nm only), which may indicate some problem with them. The data from these wells and plates were used exclusively for the training set.
Teams: Each participant can join only one team. One participant of each team can register for an account (or use an existing account) with his/her full first and last name on the OCHEM website. This account will be used to submit predictions for the test sets. This participant should provide a list of team members with their full names, emails and affiliations before January 15th, 2026 by e-mail to slas@ochem.eu. If no such list is provided, it will be assumed that the team consists of one participant.
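
One possible way to use plate information for normalising training data is to centre each readout on its plate median. The sketch below is an assumption, not a method prescribed by the organisers: the record layout, the function name, the plate ID `C0001` and all values are hypothetical (only `C1046` is taken from the text above).

```python
from collections import defaultdict
from statistics import median


def center_by_plate(records):
    """Subtract the plate median from every readout.

    `records` is a hypothetical list of (plate_id, value) pairs
    for training set compounds only.
    """
    by_plate = defaultdict(list)
    for plate, value in records:
        by_plate[plate].append(value)
    plate_median = {plate: median(values) for plate, values in by_plate.items()}
    return [value - plate_median[plate] for plate, value in records]


# Made-up %T readouts on two plates.
records = [
    ("C1046", 80.0), ("C1046", 90.0), ("C1046", 100.0),
    ("C0001", 95.0), ("C0001", 97.0),
]
centered = center_by_plate(records)
print(centered)
```

The median is used rather than the mean so that a handful of strongly absorbing compounds on a plate does not shift the baseline. Such a correction applies to training data only, since plate and well IDs must not be used when predicting test set compounds.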
Winning solution: The last prediction submitted by each team will be used to calculate the AUC over the blind test set for each subtask; the average AUC across both subtasks then determines the winner for each property. In case of equal results (AUC will be rounded to three digits, e.g., 0.617), the first eligible final result to be submitted will be considered the winning contribution. Very similar rules were also used in the Tox24 Challenge.
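
The scoring rule can be illustrated with a small, dependency-free sketch. It assumes that AUC means the area under the ROC curve, computed here through its pairwise-ranking definition, and that rounding is applied to the averaged value; the organisers' exact implementation is not given, and the function names and the toy data are hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (active, inactive) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the worst case,
    which is acceptable for a sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one active and one inactive compound")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def property_score(auc_a, auc_b, digits=3):
    """Average the two subtask AUCs and round (assumed order of operations)."""
    return round((auc_a + auc_b) / 2, digits)


# Toy predictions for subtasks a and b of one property.
auc_a = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
auc_b = roc_auc([0, 1, 0, 1], [0.2, 0.2, 0.7, 0.7])
score = property_score(auc_a, auc_b)
print(auc_a, auc_b, score)
```

Because AUC depends only on the ranking of the scores, submissions do not need calibrated probabilities for this metric.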
Validation of the winning model: To be eligible, the winning solution must be released as open source and/or be described in sufficient detail to be independently reproduced by the Challenge Organisers following the team members' instructions. If reproduction requires additional licences, they should be provided free of charge solely for model testing. Statistical variations with p > 0.05, as determined by bootstrap evaluation [4] and arising from, e.g., neural network weight initialisation, will not disqualify the model even if the reproduced accuracy differs slightly. Models that cannot be reproduced will be disqualified.
Prize: The winning teams will be announced at 12:00-12:30 on Tuesday, February 10th, in the Ignite Theater during the SLAS International Conference and Exhibition (February 7-11, 2026) in Boston and will be awarded a prize of 1,000€ sponsored by the organisers. Both winning teams will be invited to give a lecture during the conference, by Zoom or in person. Participants from all around the world are free to submit their models, but participants from countries currently under sanctions by European and USA authorities may not be eligible to receive prizes. The winning teams, as well as other teams whose models gave an AUC not significantly different from that of the winning solution, will be invited to publish their studies or protocols in an SLAS journal. Thus, by participating in this challenge, each team agrees to contribute such an article to SLAS if it develops a winning model.
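
A bootstrap comparison of submitted and reproduced predictions might look like the sketch below. It is a generic paired bootstrap on the AUC difference and not necessarily the exact procedure of ref. [4]; the function names, the number of resamples, the seed and the toy data are all assumptions.

```python
import random


def roc_auc(labels, scores):
    """Pairwise-ranking ROC AUC; returns None if one class is missing."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_p_value(labels, scores_a, scores_b, n_boot=1000, seed=0):
    """Two-sided bootstrap p-value for the AUC difference of two models.

    The same resampled compounds are scored by both models (paired design).
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        auc_a = roc_auc(y, [scores_a[i] for i in idx])
        auc_b = roc_auc(y, [scores_b[i] for i in idx])
        if auc_a is None:  # resample lacked one class; skip it
            continue
        diffs.append(auc_a - auc_b)
    if not diffs:
        raise ValueError("no usable bootstrap resamples")
    frac_le = sum(d <= 0 for d in diffs) / len(diffs)
    frac_ge = sum(d >= 0 for d in diffs) / len(diffs)
    return min(1.0, 2.0 * min(frac_le, frac_ge))


# Toy data: submitted vs. reproduced scores for six compounds.
labels = [0, 0, 1, 1, 0, 1]
submitted = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
reproduced = [0.15, 0.38, 0.30, 0.82, 0.25, 0.88]
p_same = bootstrap_p_value(labels, submitted, submitted, n_boot=200)
p_repro = bootstrap_p_value(labels, submitted, reproduced, n_boot=200)
print(p_same, p_repro)
```

Under this sketch, a reproduced model would be treated as consistent with the submitted one when the returned p-value exceeds 0.05.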
Organisers: The challenge is co-organised by EU-OPENSCREEN and SLAS as well as by Horizon Europe Marie Skłodowska-Curie Actions Doctoral Network grant agreement No. 101120466 “Explainable AI for Molecules” (AiChemist).
References

(1) Skopelitou, K.; Rossella, F.; Awuku Larbi, R.; Gribbon, P.; Cirino, T.; Tetko, I. V. 2nd EUOS/SLAS Joint Challenge: Prediction of Spectral Properties of Compounds. SLAS Technol. 2025 Nov 22, 100374. https://doi.org/10.1016/j.slast.2025.100374

(2) 1st EUOS/SLAS Joint Challenge: Compound Solubility

(3) Hunklinger, A.; Hartog, P.; Šícho, M.; Godin, G.; Tetko, I. V. The openOCHEM Consensus Model Is the Best-Performing Open-Source Predictive Model in the First EUOS/SLAS Joint Compound Solubility Challenge. SLAS Discov. 2024, 29 (2), 100144. https://doi.org/10.1016/j.slasd.2024.01.005

(4) Vorberg, S.; Tetko, I. V. Modeling the Biodegradability of Chemical Compounds Using the Online CHEmical Modeling Environment (OCHEM). Mol. Inform. 2014, 33 (1), 73–85. https://doi.org/10.1002/minf.201300030

