Available Now
Synthetic cfDNA That Works
Unlimited biologically accurate cfDNA data for NIPT algorithm development, validation, and research. Off-the-shelf datasets or fully customised generation.
At a depth of 1M fragments per sample, T18/T13 sensitivity is 75%, in line with clinical detection patterns at that depth. Higher depth improves sensitivity, as in real NIPT.
Data Products
Choose Your Data Package
Off-the-shelf datasets for immediate use, or custom generation to your exact specifications.
Standard Dataset
Ready-to-use cfDNA samples
£5,000 – £10,000
- 1M fragments per sample
- 31 samples included (Normal, T21, T18, T13)
- Fetal fractions: 8%, 10%, 15%
- HDF5 format with full sequences
- Ground truth labels for all samples
Custom Generation
Data to your exact specifications
£25,000 – £100,000
- 1M-16M fragments per sample
- Unlimited samples generated to spec
- Custom fetal fractions: 1%-25%
- 107 conditions available
- Custom condition prevalence
- Priority support included
Research Partnership
For academic institutions
£100,000 – £500,000
- Everything in Custom Generation
- Academic pricing available
- Co-authorship opportunities
- Technical collaboration
- Publication support
Validation
Proof That It Works
Our synthetic cfDNA has been validated through a 4-level framework covering distributional similarity, z-score detection, classifier training, and controllability.
Distributional Similarity
Fragment sizes, GC content, nucleotide patterns, and end motifs match real cfDNA.
Z-Score Detection
Clinical NIPT z-score method correctly detects trisomies in synthetic samples.
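For readers unfamiliar with the z-score method, here is a minimal sketch of the idea: compare a sample's chromosome fraction against the mean and standard deviation of a euploid reference set. The fragment counts, the two-bin `chr21`/`other` layout, the helper names, and the cutoff of 3 are illustrative assumptions, not values taken from our datasets or validation.

```python
from statistics import mean, stdev

def chromosome_fraction(counts, chrom):
    """Share of all counted fragments that map to `chrom`."""
    return counts[chrom] / sum(counts.values())

def z_score(test_counts, reference_counts, chrom):
    """z = (test fraction - reference mean) / reference standard deviation."""
    ref = [chromosome_fraction(c, chrom) for c in reference_counts]
    return (chromosome_fraction(test_counts, chrom) - mean(ref)) / stdev(ref)

def make_counts(chr21_fragments, total=1_000_000):
    """Toy two-bin count table (hypothetical layout, for illustration only)."""
    return {"chr21": chr21_fragments, "other": total - chr21_fragments}

# Made-up euploid reference set; real pipelines use many more samples.
REFERENCE = [make_counts(n) for n in (13_000, 13_110, 12_890, 13_060, 12_940)]
Z_THRESHOLD = 3.0  # assumed cutoff for this sketch

z_normal = z_score(make_counts(13_010), REFERENCE, "chr21")
z_t21 = z_score(make_counts(13_650), REFERENCE, "chr21")

print(f"normal-like sample: z = {z_normal:.2f}, flagged = {z_normal > Z_THRESHOLD}")
print(f"T21-like sample:    z = {z_t21:.2f}, flagged = {z_t21 > Z_THRESHOLD}")
```

Because every synthetic sample ships with a ground truth label, calls made this way can be scored directly against the known condition.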
Classifier Training
Adding synthetic data to real training data improves model performance.
Controllability
Generation parameters produce expected outputs (fetal fraction, conditions).
Applications
What You Can Build
Algorithm Development
Train and validate NIPT detection algorithms with unlimited labelled data. Test edge cases like low fetal fraction that are rare in real datasets.
Method Validation
Validate new NIPT methods against known ground truth. Test sensitivity at different fetal fractions and read depths.
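Why depth and fetal fraction matter can be shown with a back-of-the-envelope model. The sketch below assumes pure binomial counting noise and a trisomy that raises the affected chromosome's share by roughly half the fetal fraction. It ignores GC bias, mapping effects, and the small renormalisation of the share, and the chromosome share of 0.015 is a rough illustrative figure rather than a measured one.

```python
import math

def expected_z(n_fragments, fetal_fraction, chrom_share):
    """Idealised expected z-score for a trisomy under binomial counting noise."""
    # A trisomic fetus adds about half a copy, scaled by the fetal fraction.
    shift = chrom_share * fetal_fraction / 2
    # Standard deviation of an observed share estimated from n_fragments draws.
    noise = math.sqrt(chrom_share * (1 - chrom_share) / n_fragments)
    return shift / noise

CHROM_SHARE = 0.015  # assumed share of fragments on the chromosome of interest

for depth in (1_000_000, 4_000_000, 16_000_000):
    for ff in (0.04, 0.10):
        z = expected_z(depth, ff, CHROM_SHARE)
        print(f"depth={depth:>10,}  FF={ff:.0%}  expected z = {z:.1f}")
```

In this model the expected z-score grows with the square root of depth and linearly with fetal fraction, which is why low fetal fraction samples are the hardest to call at 1M fragments.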
ML Model Training
Augment limited real data with synthetic samples. Our validation shows a 10% AUC improvement in low-data regimes.
Privacy-Compliant Research
Conduct research without patient data concerns. Synthetic data contains no identifiable information.
Education & Training
Train clinical scientists and bioinformaticians with realistic data. Perfect for courses and workshops.
Benchmark Creation
Create standardised benchmarks with known ground truth for comparing NIPT methods across laboratories.
Specifications
What's Included
| Parameter | Standard Dataset | Custom Generation |
|---|---|---|
| Fragments per sample | 1,000,000 | 1M - 16M (configurable) |
| Fragment length range | 50-250 bp | 50-250 bp |
| Fetal fraction | 8%, 10%, 15% | 2% - 25% |
| Conditions | Normal, T21, T18, T13 | 107 including SCAs, microdeletions, microduplications, monogenic, oncology |
| Output format | HDF5 with sequences, positions, chromosomes | HDF5, FASTQ, or custom |
| Metadata | JSON per sample (FF, condition, params) | Full provenance tracking |
| Ground truth labels | Yes | Yes |
| Delivery | Secure download link | Secure download or cloud storage |
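As an illustration of how the per-sample JSON metadata might be used, the sketch below filters samples by condition and fetal fraction. The field names (`sample_id`, `condition`, `fetal_fraction`) and the records themselves are hypothetical. The delivered schema may differ, so treat this as a pattern rather than a description of the actual files.

```python
import json

# Hypothetical metadata records, one JSON document per sample.
RAW_METADATA = [
    '{"sample_id": "S001", "condition": "Normal", "fetal_fraction": 0.10}',
    '{"sample_id": "S002", "condition": "T21", "fetal_fraction": 0.08}',
    '{"sample_id": "S003", "condition": "T18", "fetal_fraction": 0.15}',
    '{"sample_id": "S004", "condition": "T21", "fetal_fraction": 0.15}',
]

def load_metadata(raw_records):
    """Parse each JSON string into a dictionary."""
    return [json.loads(record) for record in raw_records]

def select_samples(records, condition=None, min_fetal_fraction=0.0):
    """Keep samples matching the condition (if given) and the fetal fraction floor."""
    return [
        r for r in records
        if (condition is None or r["condition"] == condition)
        and r["fetal_fraction"] >= min_fetal_fraction
    ]

records = load_metadata(RAW_METADATA)
t21_high_ff = select_samples(records, condition="T21", min_fetal_fraction=0.10)
print([r["sample_id"] for r in t21_high_ff])
```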
Book a 15-min Demo
See the data, ask questions, and find the right package for your needs.