💬 Curated Catalog · NLP / Text

SQuAD 2.0 — Stanford Question Answering Dataset

150K crowdsourced question-answer pairs on Wikipedia passages, including unanswerable questions.

LQS 85 (gold) · Commercial use OK · 150K Q&A pairs · 40 MB JSON · Released 2018
Source: rajpurkar.github.io · maintained by Stanford NLP Group

About this dataset

SQuAD 2.0 combines 100K questions from SQuAD 1.1 with 50K unanswerable questions adversarially written by crowdworkers. Answers are spans of text from Wikipedia passages. It's the canonical benchmark for extractive question answering and reading comprehension.

Maintainer: Stanford NLP Group
License: CC BY-SA 4.0
Formats: JSON

LabelSets Quality Score

LQS is our 7-dimension quality score, computed from the dataset's published statistics. See methodology →

85 out of 100 · gold tier

High-quality dataset across most dimensions

Composite score computed from the 7 dimensions below: completeness, uniqueness, validation health, size adequacy, format compliance, label density, and class balance.

Completeness 95
No public completeness metric; using the prior for 'expert_curated' datasets.
Uniqueness 93
Exact-hash deduplication documented by maintainer.
Validation 82
Crowdsourced labels with quality-control protocol (redundancy, golden tests).
Size adequacy 91
150,000 pairs — exceeds 100,000 adequacy target for NLP / Text.
Format compliance 95
Industry-standard format — drop-in compatible with mainstream tooling.
Label density 52
Average 1.0 labels per item (sparse).
Class balance 75
Moderate class skew — realistic production distribution.
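The per-dimension weights behind the composite are not stated on this page, but the methodology's tier mapping is published (platinum ≥90, gold ≥75, silver ≥60, bronze <60). A minimal sketch of an LQS-style rollup, assuming equal weights purely for illustration:

```python
# Illustrative sketch of an LQS-style composite. LabelSets' actual
# weighting is not published here; equal weights are an assumption.
DIMENSIONS = {
    "completeness": 95,
    "uniqueness": 93,
    "validation": 82,
    "size_adequacy": 91,
    "format_compliance": 95,
    "label_density": 52,
    "class_balance": 75,
}

def composite(scores: dict) -> float:
    """Equal-weight mean of the seven dimension scores (assumed weighting)."""
    return sum(scores.values()) / len(scores)

def tier(score: float) -> str:
    """Map a composite score to a tier using the published thresholds."""
    if score >= 90:
        return "platinum"
    if score >= 75:
        return "gold"
    if score >= 60:
        return "silver"
    return "bronze"

score = composite(DIMENSIONS)
print(f"{score:.1f} -> {tier(score)}")
```

Note that an equal-weight mean of these seven scores comes out near 83.3, slightly below the published composite of 85, which suggests the real weighting is non-uniform; either value lands in the gold tier.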

What it's used for

Common tasks and benchmarks where SQuAD 2.0 is the default or competitive choice.

Sample statistics

What's actually in the dataset — from the maintainer's published stats.

150,000 questions total: 100K answerable + 50K unanswerable. ~23.2 words avg per question, ~3.2 words avg per answer span.
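Counts like these can be reproduced mechanically from the files' JSON layout: SQuAD 2.0's published format nests data → paragraphs → qas, where each question carries an `is_impossible` flag and, when answerable, character-offset answer spans into the passage. A minimal parsing sketch over an inline sample in that shape (the passage text and IDs below are invented for illustration):

```python
import json

# Tiny inline sample in the same shape as the published SQuAD 2.0 files:
# data -> paragraphs -> qas, with is_impossible marking unanswerable questions.
sample = json.loads("""
{
  "version": "v2.0",
  "data": [{
    "title": "Normans",
    "paragraphs": [{
      "context": "The Normans were a people descended from Norse raiders.",
      "qas": [
        {"id": "q1", "question": "Who were the Normans descended from?",
         "is_impossible": false,
         "answers": [{"text": "Norse raiders", "answer_start": 41}]},
        {"id": "q2", "question": "When did the Normans reach Sicily?",
         "is_impossible": true, "answers": []}
      ]
    }]
  }]
}
""")

answerable, unanswerable = 0, 0
for article in sample["data"]:
    for para in article["paragraphs"]:
        for qa in para["qas"]:
            if qa.get("is_impossible", False):
                unanswerable += 1  # no gold answer span exists in the context
            else:
                answerable += 1
                # answer_start is a character offset into the passage text
                a = qa["answers"][0]
                span = para["context"][a["answer_start"]:a["answer_start"] + len(a["text"])]
                assert span == a["text"]

print(answerable, unanswerable)
```

Run over the real train and dev files, the same loop yields the answerable/unanswerable split quoted above.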

License

SQuAD 2.0 — Stanford Question Answering Dataset is distributed under CC BY-SA 4.0. This is a third-party public dataset; LabelSets indexes and scores it but does not host or redistribute the data. Always verify current license terms with the maintainer before commercial use.

Need commercial-licensed NLP / Text data?

LabelSets sellers offer paid NLP / Text datasets with guarantees that public datasets often can't give you.


Similar public datasets

Other entries in the NLP / Text catalog.

Frequently Asked Questions

Can I use SQuAD 2.0 commercially?
SQuAD 2.0 is distributed under CC BY-SA 4.0, which generally permits commercial use. Always verify the current license terms with the maintainer (Stanford NLP Group) before using it in a commercial product.

How large is SQuAD 2.0?
It contains 150,000 Q&A pairs: 100K answerable + 50K unanswerable questions, averaging ~23.2 words per question and ~3.2 words per answer span.

Who maintains SQuAD 2.0, and where is it hosted?
SQuAD 2.0 is maintained by the Stanford NLP Group and is available at https://rajpurkar.github.io/SQuAD-explorer/. LabelSets indexes and scores this dataset for discoverability but does not redistribute it.

What is LQS?
LQS is a 7-dimension quality score (completeness, uniqueness, validation, size adequacy, format compliance, label density, class balance) computed from the dataset's published statistics. Composite scores map to tiers: platinum (≥90), gold (≥75), silver (≥60), bronze (<60). Read the full methodology.