AI Assessment

Browse cancer datasets available through the platform.

The AI assessment framework in the EuCanImage project includes the Radiomics Quality Score 2.0 (RQS 2.0), which standardizes the evaluation of both deep learning and handcrafted radiomics studies. It incorporates the Radiomics Readiness Levels (RRLs) to provide a structured, stepwise approach to assessing research maturity. The In Silico Trial Platform supports simulated clinical studies to evaluate AI's impact in three settings: clinicians without AI, with AI, and with AI plus explainability. OpenEBench enables researchers to participate in benchmarking events and access public benchmarking results, promoting transparency and comparability of AI methods. Additionally, the project integrates cost-effectiveness analysis to assess the practical and economic value of implementing AI in clinical workflows.

AI Modeling

EuCanImage is creating a collection of radiomics methods and AI algorithms for building novel integrative AI models from large-scale imaging and non-imaging data. These are offered through the AI Virtual Research Environment, a portable computational environment that supports the development and validation of AI tools.

Tools for cancer image feature extraction and selection
Machine-learning pipeline for integrated predictive modelling

AI Development Platform

Run AI experiments

Run EuCanImage AI tools on a private execution environment.
Use the user-friendly web interface to upload your datasets or import them from any of the EuCanImage Data Repositories.
A pilot installation is hosted online at the Barcelona Supercomputing Center facilities.

Run Online Demo

Install Platform

Access to Services or Software

OpenEBench

ELIXIR Benchmarking Platform

Participate in benchmarking events organized by EuCanImage to assess your AI method.
Inspect and visualize public benchmarking results.

In silico Trial Platform

In silico Validation

The in silico platform lets researchers conduct studies that evaluate the added value of AI in a simulated clinical workflow. Three types of evaluation are possible:

Clinicians without AI
Clinicians with AI
Clinicians with AI and Explainability

Radiomics Quality Score 2.0

Benchmarking Radiomics Studies

Radiomics Quality Score 2.0 enables benchmarking deep learning and handcrafted radiomics research.
The Radiomics Readiness Levels (RRLs) framework is embedded within RQS 2.0 to establish a structured, step-by-step approach to radiomics research.

Collective Minds Research

Collective Segmentation for Medical Imaging

An advanced, collaborative platform for precise medical image segmentation and other clinical research workflows, designed to accelerate AI-driven oncology research. Built on a secure cloud infrastructure, it enables the collection, annotation, and benchmarking of multi-modal cancer imaging datasets, ensuring high-quality, GDPR-compliant workflows.

Collective Minds Connect

Automatic Imaging Data Collection, Pseudonymization, Tracking and Transfer

A seamless hospital edge gateway and data pipeline for automated imaging acquisition, anonymization, tracking and secure transfer. This service streamlines the entire lifecycle, from data ingestion through tracking to delivery, ensuring compliant, efficient, and traceable data transfers that facilitate federated AI development and multi-modal, multi-centric collaboration.

Benchmarking Challenges

Metrics and scores

Classification Metrics
True Positive Rate (TPR) / Sensitivity / Recall: Proportion of actual positives that are correctly identified.
True Negative Rate (TNR) / Specificity: Proportion of actual negatives that are correctly identified.
Positive Predictive Value (PPV) / Precision: Proportion of predicted positives that are true positives.
Negative Predictive Value (NPV): Proportion of predicted negatives that are true negatives.
F1 Score: Harmonic mean of precision and recall, balancing both in a single metric.
Accuracy: Overall proportion of correctly classified instances (both positives and negatives).
Balanced Accuracy: Average of sensitivity and specificity, useful for imbalanced datasets.
Cohen's Kappa: Agreement between predicted and true labels, adjusted for the agreement expected by chance.
Weighted Cohen's Kappa: Variant of Cohen's Kappa that weights each disagreement by its severity.
Matthews Correlation Coefficient: Correlation coefficient between observed and predicted classifications, suitable for imbalanced data.
Receiver Operating Characteristic Curve AUC: Area under the ROC curve, which plots TPR against the false positive rate (FPR) across decision thresholds.
Precision Recall Curve AUC: Area under the Precision-Recall curve; emphasizes performance on the positive class.
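
As an illustration of how the threshold-based metrics above relate to one another, the following minimal Python sketch derives them from the four confusion-matrix counts of a binary problem. It is not the code used by the benchmarking platform: the function names (classification_metrics, roc_auc), the 1/0 label coding, and the choice to return NaN for undefined ratios are all assumptions made for this example. The ROC AUC is computed with the pairwise (Mann-Whitney) formulation, which is simple but quadratic in the number of samples.

```python
from math import sqrt


def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def classification_metrics(y_true, y_pred):
    """Binary metrics from labels coded as 1 (positive) and 0 (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    accuracy = _ratio(tp + tn, n)
    # Agreement expected by chance, computed from the marginal totals.
    chance = _ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": _ratio(tn, tn + fn),
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
        "accuracy": accuracy,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "kappa": _ratio(accuracy - chance, 1 - chance),
        "mcc": _ratio(tp * tn - fp * fn,
                      sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))),
    }


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a positive outranks a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # A tie between a positive and a negative score counts as half a win.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return _ratio(wins, len(pos) * len(neg))


# Toy data, invented for this example: 4 positives and 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(metrics)
print(auc)
```

On this toy data the sensitivity is 0.75 and the accuracy is 0.8, while Cohen's Kappa and the Matthews Correlation Coefficient both come out at about 0.583, showing how the chance-corrected scores sit below the raw accuracy.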

Segmentation Metrics
Dice Index: Overlap between prediction and ground truth, computed as twice the intersection divided by the sum of the two mask sizes.
Jaccard Index: Intersection over Union of prediction and ground truth.
Surface Dice Index: Overlap of the two surfaces within a fixed tolerance.
Hausdorff Distance: Maximum surface distance between masks.
Hausdorff Distance 95th percentile: 95th percentile of surface distances, less sensitive to outliers than the maximum.
Average Symmetric Surface Distance: Mean of all shortest distances between surfaces of A and B, bidirectionally.
Modified Hausdorff Distance: Average of mean nearest distances between surfaces.
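
The overlap and distance metrics above can be sketched in a few lines of Python. This is an illustrative example only, not the platform's implementation. It assumes that masks are given as sets of voxel coordinates, that the surface points of each mask have already been extracted, and that distances are Euclidean in voxel units. The function names are hypothetical. The 95th percentile uses a nearest-rank rule on the pooled distances from both directions; other tools pool or interpolate differently, so the values may not match theirs exactly.

```python
from math import ceil, dist


def dice(a, b):
    """Dice index of two masks given as sets of voxel coordinates."""
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard index (Intersection over Union) of two masks."""
    return len(a & b) / len(a | b)


def nearest_distances(src, dst):
    """Distance from each point in src to its closest point in dst."""
    return [min(dist(p, q) for q in dst) for p in src]


def surface_distances(surf_a, surf_b):
    """Distance metrics from two non-empty lists of surface points."""
    pooled = sorted(nearest_distances(surf_a, surf_b)
                    + nearest_distances(surf_b, surf_a))
    # Nearest-rank 95th percentile of the pooled distances (an assumption).
    rank = ceil(0.95 * len(pooled))
    return {
        "hausdorff": pooled[-1],
        "hausdorff_95": pooled[rank - 1],
        "assd": sum(pooled) / len(pooled),
    }


# Toy 2D masks, invented for this example: two 2x2 squares sharing a column.
mask_a = {(0, 0), (0, 1), (1, 0), (1, 1)}
mask_b = {(0, 1), (1, 1), (0, 2), (1, 2)}
print(dice(mask_a, mask_b), jaccard(mask_a, mask_b))

# Toy surface point lists; one outlying point in surf_b sets the maximum.
surf_a = [(0, 0), (0, 1), (0, 2)]
surf_b = [(1, 0), (1, 1), (3, 2)]
distances = surface_distances(surf_a, surf_b)
print(distances)
```

For the two toy masks the Dice index is 0.5 while the Jaccard index is 1/3, which illustrates that the Jaccard index is never larger than the Dice index for the same pair of masks. In the surface example a single outlying point sets the Hausdorff distance to 3.0 while the average symmetric surface distance stays near 1.4.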

Other sections or suggestions?

Please let us know if there is any other information that would be valuable to share on the portal for AI data researchers who want to understand or use our AI assessment efforts.