Dataset distribution analysis for Machine Learning validation

Virtual
Pricing/Discount Options: Call #2
Unique Identifier: 00976675-bd7e-43b8-bf6e-887480b67e88

Service Description

This service provides a statistical analysis of datasets and dataset splits used in machine learning pipelines, with the objective of verifying their distributional consistency. It supports the validation of training, validation, and test splits by detecting statistically significant differences that could bias model training or invalidate performance evaluation. The analysis is conducted in a controlled and documented execution environment to ensure traceability and repeatability of results.
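To illustrate what a distributional consistency check between two splits can look like, here is a minimal sketch in Python. It is not the provider's method, which this listing does not describe. It assumes numeric features, uses a two-sample Kolmogorov-Smirnov statistic with a permutation-based p-value, and relies only on the standard library. The function names (`ks_statistic`, `permutation_p_value`, `compare_splits`), the `alpha` threshold, and the example data are all hypothetical.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def permutation_p_value(a, b, n_permutations=1000, seed=0):
    """Estimate how often a random relabelling of the pooled values
    yields a KS statistic at least as large as the observed one."""
    rng = random.Random(seed)  # fixed seed keeps the result repeatable
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # The +1 terms keep the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


def compare_splits(train, test, alpha=0.05, n_permutations=1000):
    """Compare two splits feature by feature.

    train and test map a feature name to a list of numeric values.
    Returns one entry per feature with the statistic, the p-value,
    and a flag indicating whether the difference is significant.
    """
    report = {}
    for feature in train:
        p = permutation_p_value(train[feature], test[feature], n_permutations)
        report[feature] = {
            "ks": ks_statistic(train[feature], test[feature]),
            "p_value": p,
            "differs": p < alpha,
        }
    return report


# Hypothetical example: 'age' is shifted between the splits, 'dose' is not.
train = {"age": list(range(20, 60)), "dose": [1, 2, 3, 4] * 10}
test = {"age": list(range(50, 90)), "dose": [1, 2, 3, 4] * 10}
for name, result in compare_splits(train, test).items():
    print(name, result)
```

When many features are tested at once, some will appear to differ by chance alone, so a real analysis would also apply a multiple-comparison correction; the sketch omits this for brevity.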

The service is particularly relevant for AI systems in the health domain, where dataset representativeness directly affects evaluation reliability, which is critical for regulatory compliance and clinical trustworthiness. The analysis is performed using robust statistical techniques and is delivered as a structured, interpretable report. The service execution follows ISO 9001 certified processes.

Provider & Contact

Provider Organisation: Multitel (MULTITEL)
Provider Country: Belgium
Organisation Website: https://www.multitel.be/
Published Email: cadji@multitel.be

Pricing is available to registered users. SMEs receive significant state-aid reductions (GBER) or, depending on the call, free services during the funded project.

Operational Details

Service Inputs: Client datasets or predefined dataset splits (e.g. training / validation / test).
Service Outputs: Report on the dataset analysis, giving the statistical differences measured on the client dataset and stating whether differences between datasets or splits were found.
Dependencies & Restrictions: Handling of personal data is subject to GDPR. A data processing agreement (DPA) must be signed between the parties before the data can be processed.
Certification Support
  • AI Act
  • MDR
Service Standards
  • AI Act Art. 10
  • MDR
Comments: This service supports data quality assessment and risk mitigation in machine learning workflows, contributing to the development and evaluation of trustworthy AI systems. The resulting analysis report can be used as supporting evidence within the technical documentation required under the Medical Device Regulation (MDR). The service addresses Article 10 (data quality obligations) of the EU AI Act by detecting distributional inconsistencies across dataset splits. It also aligns with key principles of the EU AI Act for high-risk AI systems, notably those related to performance evaluation, transparency, and technical documentation.