AI Assessment


The AI assessment framework in the EuCanImage project includes the Radiomics Quality Score 2.0 (RQS 2.0), which standardizes the evaluation of both deep learning and handcrafted radiomics studies. It incorporates the Radiomics Readiness Levels (RRLs) to provide a structured, stepwise approach to assessing research maturity. The In Silico Trial Platform supports simulated clinical studies to evaluate AI's impact in three settings: clinicians without AI, with AI, and with AI plus explainability. OpenEBench enables researchers to participate in benchmarking events and access public benchmarking results, promoting transparency and comparability of AI methods. Additionally, the project integrates cost-effectiveness analysis to assess the practical and economic value of implementing AI in clinical workflows.

AI Modeling

EuCanImage is creating a collection of radiomics methods and AI algorithms for building novel integrative AI models from large-scale imaging and non-imaging data. These are offered through the AI Virtual Research Environment, a portable computational environment that supports the development and validation of AI tools.

  • Tools for cancer image feature extraction and selection
  • Machine-learning pipeline for integrated predictive modelling

AI Development Platform

Run AI experiments

  • Run EuCanImage AI tools on a private execution environment.
  • Use the web interface to upload your datasets or import them from any of the EuCanImage Data Repositories.
  • A pilot installation is online, hosted at the Barcelona Supercomputing Center.

Access to Services or Software


OpenEBench

ELIXIR Benchmarking Platform

  • Participate in benchmarking events organized by EuCanImage to assess your AI method.
  • Inspect and visualize public benchmarking results.

In silico Trial Platform

In silico Validation

The In Silico Trial Platform lets researchers conduct studies that evaluate the added value of AI in a simulated clinical workflow. Three types of evaluation are possible:

  • Clinicians without AI
  • Clinicians with AI
  • Clinicians with AI and Explainability

Radiomics Quality Score 2.0

Benchmarking Radiomics Studies

  • Radiomics Quality Score 2.0 enables benchmarking of both deep learning and handcrafted radiomics research.
  • The Radiomics Readiness Levels (RRLs) framework is embedded within RQS 2.0 to establish a structured, step-by-step approach to radiomics research.

Collective Minds Research

Collective Segmentation for Medical Imaging

A collaborative platform for precise medical image segmentation and other clinical research workflows, designed to accelerate AI-driven oncology research. Built on secure cloud infrastructure, it supports the collection, annotation, and benchmarking of multi-modal cancer imaging datasets in high-quality, GDPR-compliant workflows.


Collective Minds Connect

Automatic Imaging Data Collection, Pseudonymization, Tracking and Transfer

A seamless hospital edge gateway and data pipeline for automated imaging acquisition, anonymization, tracking and secure transfer. This service streamlines the entire lifecycle—from data ingestion through tracking to delivery—ensuring compliant, efficient, and traceable data transfers that facilitate federated AI development and multi-modal, multi-centric collaboration.


Benchmarking Challenges

Metrics and scores
Classification Metrics
True Positive Rate (TPR) / Sensitivity / Recall
Proportion of actual positives that are correctly identified: TP / (TP + FN).
True Negative Rate (TNR) / Specificity
Proportion of actual negatives that are correctly identified: TN / (TN + FP).
Positive Predictive Value (PPV) / Precision
Proportion of predicted positives that are true positives.
Negative Predictive Value (NPV)
Proportion of predicted negatives that are true negatives.
F1 Score
Harmonic mean of precision and recall, balancing both in a single metric.
Accuracy
Overall proportion of correctly classified instances (both positives and negatives).
Balanced Accuracy
Average of sensitivity and specificity, useful for imbalanced datasets.
Cohen's Kappa
Measures agreement between predicted and true labels adjusted for chance.
Weighted Cohen's Kappa
Cohen's Kappa that accounts for severity of misclassification via weights.
Matthews Correlation Coefficient
Correlation coefficient between observed and predicted classifications, suitable for imbalanced data.
Receiver Operating Characteristic Curve AUC
Area under the ROC curve; summarizes the trade-off between TPR and FPR across all decision thresholds.
Precision Recall Curve AUC
Area under the Precision-Recall curve; emphasizes performance on the positive class.
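
As an illustration of the definitions above, the following is a minimal Python sketch for the binary case. It is not the code used by the benchmarking platform; the function names (`confusion_counts`, `classification_metrics`, `roc_auc`) are hypothetical, labels are assumed to be 1 for positive and 0 for negative, and undefined ratios are returned as NaN. Weighted Cohen's Kappa and the Precision-Recall AUC are left out for brevity.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def _ratio(num, den):
    # Assumption: report NaN rather than raise when a metric is undefined.
    return num / den if den else float("nan")


def classification_metrics(y_true, y_pred):
    """Threshold-based metrics computed from the binary confusion matrix."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    n = tp + tn + fp + fn
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    accuracy = _ratio(tp + tn, n)
    # Chance agreement expected from the row and column totals.
    p_chance = _ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": accuracy,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "cohen_kappa": _ratio(accuracy - p_chance, 1 - p_chance),
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
    }


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    for name, value in classification_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.3f}")
```

The pairwise AUC is quadratic in the number of cases, which is fine for a small illustration; a sorting-based implementation would be preferable for large test sets.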
Segmentation Metrics
Dice Index
Overlap between prediction and ground truth.
Jaccard Index
Intersection over Union of prediction and ground truth.
Surface Dice Index
Overlap of surfaces within a fixed tolerance.
Hausdorff Distance
Maximum surface distance between masks.
Hausdorff Distance, 95th Percentile
95th percentile of surface distances.
Average Symmetric Surface Distance
Mean of all shortest distances between surfaces of A and B, bidirectionally.
Modified Hausdorff Distance
Average of mean nearest distances between surfaces.
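
The segmentation metrics above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the platform's implementation: masks are represented as sets of 2D pixel coordinates, the surface is taken as foreground pixels with a 4-neighbour outside the mask, distances are in pixel units (no voxel spacing), the 95th percentile uses the nearest-rank rule on the pooled distances from both directions, and the Surface Dice is approximated by counting surface pixels within the tolerance. All function names are hypothetical. Published definitions of these metrics differ in such details, so values may not match other toolkits exactly.

```python
import math

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def dice(a, b):
    """Dice index: 2 |A n B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else float("nan")


def jaccard(a, b):
    """Jaccard index: |A n B| / |A u B| (intersection over union)."""
    union = len(a | b)
    return len(a & b) / union if union else float("nan")


def surface(mask):
    """Foreground pixels with at least one 4-neighbour outside the mask."""
    return {(r, c) for (r, c) in mask
            if any((r + dr, c + dc) not in mask for dr, dc in NEIGHBOURS)}


def directed_distances(src, dst):
    """Euclidean distance from each point in src to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def surface_metrics(a, b, tolerance=1.0):
    """Surface-distance metrics between two non-empty masks."""
    sa, sb = surface(a), surface(b)
    if not sa or not sb:
        raise ValueError("both masks must contain at least one pixel")
    d_ab = directed_distances(sa, sb)
    d_ba = directed_distances(sb, sa)
    pooled = sorted(d_ab + d_ba)
    # Nearest-rank 95th percentile of the pooled distances (an assumption).
    k95 = max(0, math.ceil(0.95 * len(pooled)) - 1)
    mean_ab = sum(d_ab) / len(d_ab)
    mean_ba = sum(d_ba) / len(d_ba)
    return {
        "hausdorff": pooled[-1],
        "hd95": pooled[k95],
        "assd": sum(pooled) / len(pooled),
        # Follows the description above (average of the two directed means);
        # some definitions take the maximum of the two instead.
        "modified_hausdorff": (mean_ab + mean_ba) / 2,
        "surface_dice": sum(d <= tolerance for d in pooled) / len(pooled),
    }


def square(top, left, size):
    """Helper for the example: a filled square mask as a set of pixels."""
    return {(r, c) for r in range(top, top + size)
            for c in range(left, left + size)}


if __name__ == "__main__":
    truth = square(0, 0, 4)
    pred = square(0, 1, 4)  # same square shifted one pixel to the right
    print("dice:", dice(truth, pred))        # 0.75
    print("jaccard:", jaccard(truth, pred))  # 0.6
    print(surface_metrics(truth, pred, tolerance=0.5))
```

The brute-force nearest-point search is quadratic in the number of surface pixels; real 3D volumes would normally use a distance transform or a spatial index, and physical voxel spacing so that distances come out in millimetres.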

Other sections or suggestions?

Please let us know if there is any other information you think would be valuable to include in the portal for AI and data researchers who want to understand or use our AI assessment efforts.