AI training data · procurement-grade

Training data your risk team will approve.

A marketplace for AI training datasets — every one ships with an Ed25519-signed quality certificate. The cert goes directly into your SR 11-7, EU AI Act, and §1557 paperwork. No PDFs to chase. No quality you can't verify.

NEW · The LQS Index: a public benchmark of training data quality. See it →
Dimensions: 19
Oracles: 7
Signature: Ed25519
Coverage: 95% CI · 90% conformal
Verify: ~200ms offline
Methodology: Open + paper →
Live · refreshes every minute

Real numbers. Pulled from production.

Signed certs in production
marketplace + proxy
Datasets in calibration corpus
growing toward 1,000 by Q3 2026
Marketplace datasets
all carry signed LQS certs
Proxy certs scored · last 24h
in the last hour
Public key fingerprint aa4c070af907e2ea · Methodology v3.1 · Sources: HuggingFace · Zenodo
Pilot program · accepting design partners

5–10 regulated-industry teams. 6 months of LQS Enterprise free.

The first risk teams to put LQS into a model package help us tune the standard. No demo, no sales call, two-line application. Free in exchange for a logo and a short case study after rollout.

Apply for a spot →
Banking · Health · LLM labs · Public sector · Other regulated industries
Who this is for

Built for the four people who decide whether your AI ships.

If your team is fine-tuning models on bought or licensed data, one of these is you. The cert in your hand is the deliverable, not a marketing artifact.

ML / Data Science Lead

Buying training data of unknown quality

Pain: No way to compare datasets at procurement time. You read the README and hope.
Fix: Every listing has a 19-dimension LQS score with 95% CIs and contamination check.
Compliance / Risk

No audit trail for the data inside the model

Pain: When auditors ask "where did the training data come from?", you have a Slack thread.
Fix: An Ed25519-signed cert per dataset — verifiable offline by anyone with the public key.
Procurement

Vendor due diligence on AI training data is undefined

Pain: No vendor questionnaire fits "labeled dataset for fine-tuning." Legal review takes weeks.
Fix: Pre-vetted commercial licenses, instant download, one-page cert in the procurement file.
Model Risk Manager

SR 11-7 documentation gaps when models go to MRM

Pain: Training data section of the model package is "see attached spreadsheets." MRM kicks it back.
Fix: Cite the cert hash in the model package. Auditor verifies the signature in 3ms, offline.
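
As a rough illustration of what "cite the cert hash" could look like in practice, the sketch below derives a stable digest from a cert so the same string can be pasted into a model package and recomputed later by a reviewer. It assumes the cert is available as a JSON object and that SHA-256 over a canonical JSON encoding is an acceptable hash; this page does not specify the actual cert format or hash function, and the field names in `example_cert` are hypothetical.

```python
import hashlib
import json


def cert_hash(cert: dict) -> str:
    """Return a hex SHA-256 digest over a canonical JSON encoding of the cert.

    Canonical here means: keys sorted, no extra whitespace, UTF-8 bytes.
    This encoding is an assumption, not the documented LQS cert format.
    """
    canonical = json.dumps(
        cert, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Hypothetical cert payload; real certs will carry different fields.
example_cert = {
    "dataset": "urban-traffic-detection",
    "methodology": "v3.1",
    "signature_alg": "Ed25519",
}

digest = cert_hash(example_cert)

# Key order does not change the digest, so two reviewers who load the
# same cert independently arrive at the same citation string.
reordered = dict(reversed(list(example_cert.items())))
assert cert_hash(reordered) == digest
```

Hashing a canonical encoding rather than the raw file bytes keeps the citation stable if the cert is re-serialized along the way. Any change to a field value produces a different digest.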
Try the LQS scorer

Score any public dataset, live.

Paste a URL from HuggingFace or Zenodo. We fetch metadata, run the proxy LQS scorer, and sign the cert with our production Ed25519 key. Verify it offline against our public key.

Proxy cert · confidence 0.4. Derived from public HuggingFace metadata only. For full file-based 19-dim LQS scoring (confidence > 0.85), upload your dataset →
Marketplace · flagship datasets

Ready for procurement today.

Hand-curated for regulated domains. Every dataset ships with a signed LQS cert verifiable against our public key.

For sellers: List your dataset. Keep 85% of every sale. Weekly Stripe payouts, $0 scoring fee. Your data gets the same signed cert as the flagships above. Start listing →
Every domain covered

12 dataset categories. One quality standard.

The LQS cert is domain-agnostic. Whether you're building a vision model for traffic cameras or a clinical NLP system, the same signed quality proof ships with every dataset.

See it in action

Upload. Score. Sell.

A real dataset flowing through the pipeline — raw upload, multi-oracle scoring, signed cert, first sale. End-to-end in minutes.

Live pipeline · Object Detection Dataset
Original · Eval-clean · Datasheet · Valued
Step 1 · Dataset uploaded
Urban Traffic Detection
YOLO · 4,820 images
Upload complete
Step 2 · Quality scored
LQS score
Structural: 88
Annotation: 79
Training Fit: 83
Step 3 · First sale
$549
One-time · Commercial license
Purchase confirmed
Seller receives $466.65 · 85% share · weekly Stripe payout.
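
The payout figure above follows directly from the 85% seller share. A minimal sketch of that arithmetic, using decimal values to avoid floating-point drift on currency; the function name and the half-up rounding rule are assumptions for illustration, not a description of how payouts are actually computed:

```python
from decimal import Decimal, ROUND_HALF_UP

SELLER_SHARE = Decimal("0.85")  # 85% share stated on this page


def seller_payout(price: Decimal, share: Decimal = SELLER_SHARE) -> Decimal:
    """Seller's portion of a sale, rounded to whole cents.

    The rounding mode is an assumption; the page does not state one.
    """
    return (price * share).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


payout = seller_payout(Decimal("549"))
print(payout)  # 466.65
```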
Cert fields map to the compliance paperwork your risk team already files
SOC 2: Type II · audit
HIPAA: BAA ready
EU AI Act: Art. 10 aligned
SR 11-7: Fed · model risk
21 CFR 11: FDA · e-records
GDPR · DPA: Art. 28 + SCCs

Start with one signed dataset.

Self-serve, end-to-end. Browse the marketplace, pay by card, verify the cert. Or upload your own data and start listing in under 10 minutes.

Frameworks · vendor-review ready
  • EU AI Act: Article 10 alignment
  • NIST AI RMF: crosswalks maintained
  • HIPAA + §1557: BAA available · subgroup-equity scored
  • 21 CFR 11.10(e): FDA audit-trail satisfied via signed certs
  • SOC 2: in progress · Type I letters available
  • ISO/IEC 23053: crosswalk on request
Full mappings + drop-in procurement clauses at /compliance. Anything not yet attested is labeled in scope or in progress — never falsely claimed.