Our Approach
Foundation Models for Prenatal Genomics
We combine transformer-based foundation models with advanced synthetic data generation to deliver earlier, more accurate prenatal genetic screening.
Clinical Platform
Foundation Model Architecture
Our clinical prenatal testing platform is built on a transformer-based foundation model trained on large cfDNA datasets. Unlike traditional statistical approaches, which date to 2008, the model learns patterns jointly across multiple data modalities.
The model uses multi-head attention to capture long-range dependencies in genomic data, with the goal of detecting subtle signals at foetal fractions as low as 1-2%. The aim is a substantial improvement over z-score methods, which typically require a foetal fraction of at least 4% for reliable results.
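To illustrate why low foetal fractions are hard for count-based z-scores, here is a minimal sketch. It relies on the standard dosage argument: in a trisomic pregnancy the foetal cfDNA carries three copies of the affected chromosome instead of two, so that chromosome's read share rises by approximately f/2 for foetal fraction f. The function names and the 0.5% coefficient of variation are illustrative assumptions, not figures from our platform.

```python
def expected_relative_shift(foetal_fraction: float) -> float:
    """Approximate relative rise in a chromosome's read share under foetal trisomy.

    Maternal cfDNA (1 - f) carries 2 copies and foetal cfDNA (f) carries 3,
    so dosage relative to euploid is ((1 - f) * 2 + f * 3) / 2 = 1 + f / 2.
    """
    if not 0.0 <= foetal_fraction <= 1.0:
        raise ValueError("foetal_fraction must be a proportion between 0 and 1")
    return foetal_fraction / 2.0


def expected_z_score(foetal_fraction: float, reference_cv: float) -> float:
    """Expected z-score: relative shift divided by the reference coefficient of variation."""
    if reference_cv <= 0:
        raise ValueError("reference_cv must be positive")
    return expected_relative_shift(foetal_fraction) / reference_cv


if __name__ == "__main__":
    ASSUMED_CV = 0.005  # hypothetical 0.5% run-to-run variation in the reference set
    for ff in (0.01, 0.02, 0.04, 0.10):
        z = expected_z_score(ff, ASSUMED_CV)
        print(f"foetal fraction {ff:.0%}: expected z = {z:.1f}")
```

Under these assumptions, a 4% foetal fraction gives an expected z-score of about 4, while 1-2% gives roughly 1 to 2, which sits below the cutoff of 3 that z-score methods commonly use.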
Key technical innovations include:
- Multi-modal input processing: fragment size, methylation patterns, coverage depth, and clinical metadata
- Population-specific training for equitable performance across all ancestries
- Uncertainty quantification distinguishing biological from technical limitations
- Explainable AI providing confidence scores and clinical recommendations
Data Generation
Synthetic cfDNA Technology
Our synthetic data generation uses a conditional autoregressive model (AR v15) that produces biologically accurate cell-free DNA fragments. The model learns the complex statistical properties of real cfDNA and generates novel samples that pass standard NIPT detection pipelines.
The generation process provides precise control over:
- Foetal fraction: 1-25% with continuous control
- Karyotype: all common trisomies, sex chromosome aneuploidies, and microdeletions
- Fragment characteristics: size distribution, GC content, coverage patterns
- Sample depth: 1M to 16M fragments per sample
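The controls listed above could be captured in a small request object that rejects out-of-range values before any generation is attempted. This is a sketch only: `GenerationRequest`, its field names, and the karyotype labels are hypothetical and do not describe the actual product interface. Only the numeric ranges come from the list above.

```python
from dataclasses import dataclass

# Illustrative subset of labels; the real karyotype identifiers are an assumption.
EXAMPLE_KARYOTYPES = frozenset({"euploid", "T21", "T18", "T13"})


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical parameter bundle for one synthetic cfDNA sample."""

    foetal_fraction: float  # proportion, 0.01 to 0.25 (1-25%)
    karyotype: str          # one of EXAMPLE_KARYOTYPES in this sketch
    n_fragments: int        # 1,000,000 to 16,000,000 fragments

    def __post_init__(self) -> None:
        if not 0.01 <= self.foetal_fraction <= 0.25:
            raise ValueError("foetal_fraction must be between 0.01 and 0.25")
        if self.karyotype not in EXAMPLE_KARYOTYPES:
            raise ValueError(f"unknown karyotype label: {self.karyotype!r}")
        if not 1_000_000 <= self.n_fragments <= 16_000_000:
            raise ValueError("n_fragments must be between 1M and 16M")


if __name__ == "__main__":
    request = GenerationRequest(foetal_fraction=0.02, karyotype="T21", n_fragments=4_000_000)
    print(request)
```

Validating at construction time keeps invalid combinations from reaching the generator at all, which is one reasonable design among several.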
This technology lets us generate training data on demand for our clinical platform, and it also serves as a standalone product for researchers and NIPT developers worldwide.
Validation
4-Level Validation Framework
Our validation framework checks synthetic data at four levels, from raw distributions to downstream model performance.
Distributional Accuracy
Statistical validation ensuring synthetic fragments match real cfDNA distributions for GC content, fragment size, and genomic coverage.
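One simple way to compare a synthetic distribution with a real one is histogram overlap: bin both samples, normalise, and sum the smaller of the two proportions in each bin. The sketch below applies this to fragment sizes. It is an illustration of the general idea, not the metric behind the figure reported for this level, and the function name and 10 bp bin width are assumptions.

```python
from collections import Counter
from typing import Sequence


def histogram_overlap(real: Sequence[float], synthetic: Sequence[float], bin_width: float = 10) -> float:
    """Return the overlap of two normalised histograms, between 0 (disjoint) and 1 (identical)."""
    if not real or not synthetic:
        raise ValueError("both samples must be non-empty")
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")

    # Count how many values fall in each fixed-width bin.
    real_bins = Counter(int(v // bin_width) for v in real)
    synth_bins = Counter(int(v // bin_width) for v in synthetic)

    overlap = 0.0
    for bin_index in real_bins.keys() | synth_bins.keys():
        real_share = real_bins[bin_index] / len(real)          # Counter returns 0 for absent bins
        synth_share = synth_bins[bin_index] / len(synthetic)
        overlap += min(real_share, synth_share)
    return overlap


if __name__ == "__main__":
    # Hypothetical fragment lengths in base pairs, for demonstration only.
    real_sizes = [150, 160, 170, 180]
    synthetic_sizes = [150, 160, 300, 310]
    print(f"overlap = {histogram_overlap(real_sizes, synthetic_sizes):.2f}")
```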
92.9% match
Z-Score Detection
Functional validation confirming synthetic aneuploid samples are detected by standard NIPT z-score algorithms at clinical thresholds.
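For readers unfamiliar with z-score calling, the check works roughly as follows: the share of reads mapping to a chromosome in the test sample is compared with the mean and standard deviation of the same share in a euploid reference set. The sketch below assumes a cutoff of 3, which is commonly used but is not stated here as our clinical threshold. The function names and all numbers are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def chromosome_z_score(sample_fraction: float, reference_fractions: Sequence[float]) -> float:
    """Z-score of a sample's chromosome read share against a euploid reference set."""
    if len(reference_fractions) < 2:
        raise ValueError("need at least two reference samples")
    sigma = stdev(reference_fractions)  # sample standard deviation
    if sigma == 0:
        raise ValueError("reference set has zero variance")
    return (sample_fraction - mean(reference_fractions)) / sigma


def is_flagged(z_score: float, threshold: float = 3.0) -> bool:
    """Flag a sample as likely trisomic when its z-score exceeds the threshold."""
    return z_score > threshold


if __name__ == "__main__":
    # Hypothetical chr21 read shares from euploid reference samples.
    reference = [0.0130, 0.0131, 0.0129, 0.0130, 0.0132, 0.0128]
    for label, share in (("euploid-like", 0.0130), ("elevated", 0.0136)):
        z = chromosome_z_score(share, reference)
        print(f"{label}: z = {z:.2f}, flagged = {is_flagged(z)}")
```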
100% T21 sensitivity
Pipeline Compatibility
Integration testing with standard NIPT analysis pipelines to ensure synthetic data behaves as expected in downstream workflows.
Full compatibility
Downstream Utility
Validation that models trained on synthetic data perform equivalently when deployed on real patient samples.
+10% AUC improvement
Results
Synthetic Data Validation Results
Research
Publications
Research publications are forthcoming in 2026. We are preparing manuscripts detailing our foundation model architecture, synthetic data validation, and clinical performance results.
Preprint expected H2 2026.
Open Science
Commitment to Open Source
We believe in advancing prenatal genomics through open collaboration. Our synthetic data generation tools will be made available to the research community, enabling reproducible science and accelerating innovation worldwide.
View our GitHub
Explore Our Technology
Learn more about our clinical platform or start using our synthetic data today.