Scene parsing benchmark with 25K images and pixel-level masks across 3,500+ object classes.
ADE20K is MIT CSAIL's scene parsing benchmark. It contains 25,574 images with dense pixel-level segmentation masks covering 3,500+ object classes and parts (for example, 'ceiling lamp' or 'microwave door'), along with 150 stuff/thing categories in the standard evaluation subset.
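To make "dense pixel-level mask" concrete: every pixel carries a class index, so per-class coverage can be tallied directly. The sketch below does this on a toy 3x4 mask. The class indices and the index-to-name mapping are made up for illustration and are not ADE20K's actual label mapping; a real mask would be loaded from the dataset's files rather than written inline.

```python
from collections import Counter

# Toy 3x4 mask: one class index per pixel (hypothetical indices).
mask = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 1],
]
# Illustrative names only, not ADE20K's real index mapping.
names = {0: "wall", 1: "ceiling lamp", 2: "microwave door"}


def class_pixel_share(mask):
    """Return {class_index: fraction of all pixels} for a 2-D mask."""
    counts = Counter(idx for row in mask for idx in row)
    total = sum(counts.values())
    return {idx: n / total for idx, n in counts.items()}


shares = class_pixel_share(mask)
for idx, share in sorted(shares.items()):
    print(f"{names[idx]}: {share:.2f}")
```

The same tally, run over a whole dataset, is one simple way to see how evenly classes are represented.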
LQS is our 7-dimension quality score, computed from the dataset's published statistics.
Composite score computed from the 7 dimensions below: completeness, uniqueness, validation health, size adequacy, format compliance, label density, and class balance.
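As a rough illustration of how a composite could be formed from the seven dimensions, the sketch below takes an unweighted mean of per-dimension scores. The equal weighting, the 0 to 100 scale, the function name `composite_score`, and the sample values are all assumptions made for illustration; the actual LQS weighting is not stated here.

```python
from statistics import fmean

# The seven LQS dimensions named above.
DIMENSIONS = (
    "completeness",
    "uniqueness",
    "validation_health",
    "size_adequacy",
    "format_compliance",
    "label_density",
    "class_balance",
)


def composite_score(scores: dict[str, float]) -> float:
    """Unweighted mean of the seven dimension scores (assumed weighting)."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for d in DIMENSIONS:
        # Assumed scale: each dimension is scored from 0 to 100.
        if not 0.0 <= scores[d] <= 100.0:
            raise ValueError(f"{d} must be in [0, 100], got {scores[d]}")
    return fmean(scores[d] for d in DIMENSIONS)


# Hypothetical values, not ADE20K's real scores.
example = dict.fromkeys(DIMENSIONS, 80.0)
example["class_balance"] = 45.0
print(round(composite_score(example), 1))
```

With equal weights, one weak dimension pulls the composite down only modestly; a weighted or minimum-based scheme would penalize it more.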
Common tasks and benchmarks where ADE20K is the default or competitive choice.
What's actually in the dataset — from the maintainer's published stats.
ADE20K is distributed under BSD-3-Clause. This is a third-party public dataset; LabelSets indexes and scores it but does not host or redistribute the data. Always verify current license terms with the maintainer before commercial use.