📈 Curated Catalog · Financial / Crypto

SEC EDGAR Filings

Every US public company filing since 1993 — 20M+ documents, free and public domain.

LQS 86 (gold) · ✓ Commercial OK · 20M filings · 2 TB · HTML, XBRL · Released 1993
Source: sec.gov · maintained by US Securities and Exchange Commission

About this dataset

EDGAR (Electronic Data Gathering, Analysis, and Retrieval) is the SEC's public repository of every filing made by US public companies, mutual funds, and investment advisors since 1993. It holds more than 20 million documents, including 10-K, 10-Q, 8-K, and S-1 filings, proxy statements, insider transactions, and more. All filings are US Government works and are in the public domain.

Formats: HTML · XBRL · TXT

LabelSets Quality Score

LQS is our 7-dimension quality score, computed from the dataset's published statistics.

86 out of 100 · gold tier

High-quality dataset across most dimensions

Composite score computed from the 7 dimensions below: completeness, uniqueness, validation health, size adequacy, format compliance, label density, and class balance.

Completeness: 90. No public completeness metric; uses the prior for 'governmental' datasets.
Uniqueness: 93. Exact-hash deduplication documented by maintainer.
Validation: 93. Ground truth is the official record itself (filings, medical charts, etc.).
Size adequacy: 100. 20,000,000 filings, exceeding the 10,000 adequacy target for Financial / Crypto.
Format compliance: 82. Custom format, documented but non-standard.
Label density: 52. Average of 1.0 labels per item (sparse).
Class balance: 75. Moderate class skew, reflecting a realistic production distribution.
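
This page does not state how the seven dimension scores are combined into the composite. As a rough sanity check only, the sketch below computes their plain unweighted mean; that choice is an assumption made for illustration, not the published LQS formula, and the names `scores` and `unweighted_mean` are hypothetical.

```python
# Dimension scores copied from the list above.
scores = {
    "completeness": 90,
    "uniqueness": 93,
    "validation": 93,
    "size_adequacy": 100,
    "format_compliance": 82,
    "label_density": 52,
    "class_balance": 75,
}


def unweighted_mean(values):
    """Return the arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# ASSUMPTION: equal weights. The real LQS weighting is not given on this page.
mean_score = unweighted_mean(scores.values())
print(round(mean_score, 1))  # 83.6
```

The unweighted mean (about 83.6) is lower than the published composite of 86, which suggests the actual formula weights the dimensions unequally or applies some other adjustment. Either way, both values fall in the same gold tier.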

Sample statistics

What's actually in the dataset — from the maintainer's published stats.

20M+ filings since 1993, covering every SEC-registered entity. Structured XBRL is available for financials post-2009. Full-text search is available through the SEC's own search tool.
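
Since LabelSets does not host the data, filings have to be pulled from the SEC directly. The sketch below shows one possible way to build request URLs for the SEC's JSON endpoints on `data.sec.gov` (filing history and XBRL company facts), which key on a Central Index Key (CIK) zero-padded to 10 digits. The helper names (`pad_cik`, `submissions_url`, `company_facts_url`, `fetch_json`) are hypothetical, and the CIK 320193 (Apple Inc.) is used only as an example; check the SEC's current access guidelines before running anything at scale.

```python
import json
import urllib.request

SUBMISSIONS_URL = "https://data.sec.gov/submissions/CIK{cik}.json"
COMPANY_FACTS_URL = "https://data.sec.gov/api/xbrl/companyfacts/CIK{cik}.json"


def pad_cik(cik):
    """Zero-pad a Central Index Key to the 10 digits used in data.sec.gov URLs."""
    return str(int(cik)).zfill(10)


def submissions_url(cik):
    """URL of the filing-history JSON for one registrant."""
    return SUBMISSIONS_URL.format(cik=pad_cik(cik))


def company_facts_url(cik):
    """URL of the structured XBRL facts JSON for one registrant."""
    return COMPANY_FACTS_URL.format(cik=pad_cik(cik))


def fetch_json(url, user_agent):
    """Download and parse a JSON document.

    The SEC asks automated clients to identify themselves, so the caller
    supplies a descriptive User-Agent (for example a name and contact email).
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Building the URLs needs no network access.
print(submissions_url(320193))
print(company_facts_url(320193))
```

To fetch data, call something like `fetch_json(submissions_url(320193), "Your Name your.email@example.com")` with your own contact details, and keep the request rate modest.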

License

SEC EDGAR Filings is distributed under Public Domain (US Government work). This is a third-party public dataset; LabelSets indexes and scores it but does not host or redistribute the data. Always verify current license terms with the maintainer before commercial use.

Frequently Asked Questions

SEC EDGAR Filings is distributed under Public Domain (US Government work), which generally permits commercial use. Always verify the current license terms with the maintainer (US Securities and Exchange Commission) before using in a commercial product.
SEC EDGAR Filings contains more than 20,000,000 filings dating back to 1993, covering every SEC-registered entity. Structured XBRL is available for financials post-2009. Full-text search is available through the SEC's own search tool.
SEC EDGAR Filings is maintained by the US Securities and Exchange Commission and is available at https://www.sec.gov/edgar/searchedgar/fulltextsearch.html. LabelSets indexes and scores this dataset for discoverability but does not redistribute it.
LQS is a 7-dimension quality score (completeness, uniqueness, validation, size adequacy, format compliance, label density, class balance) computed from the dataset's published statistics. Composite scores map to tiers: platinum (≥90), gold (≥75), silver (≥60), bronze (<60). Read the full methodology.
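
The tier thresholds above can be expressed as a small lookup. This is a minimal sketch of that mapping only; the function name `lqs_tier` is hypothetical and not part of any LabelSets API.

```python
def lqs_tier(score):
    """Map a composite LQS score (0-100) to its tier name.

    Thresholds follow the text above: platinum (>=90), gold (>=75),
    silver (>=60), bronze (<60).
    """
    if score >= 90:
        return "platinum"
    if score >= 75:
        return "gold"
    if score >= 60:
        return "silver"
    return "bronze"


print(lqs_tier(86))  # gold
```

With the composite of 86 reported for SEC EDGAR Filings, the mapping gives the gold tier, matching the listing.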