Computer Vision Datasets
Datasets who is the best at X ?
Computer Vision Datasets
- website: http://clickdamage.com/sourcecode/index.html
 - code: http://clickdamage.com/sourcecode/cv_datasets.php
 - mirror: http://pan.baidu.com/s/1pJmqD4n
 
Introducing the Open Images Dataset
- blog: https://research.googleblog.com/2016/09/introducing-open-images-dataset.html
 - github: https://github.com/openimages/dataset
 - Academic Torrents: http://academictorrents.com/details/9e9194e21ce045deee8d811481b4cd676b20b06b
 
A parallel download util for Google’s open image dataset
Image & Vision Group - Datasets
- intro: Image & Vision , Clothing & Fashion, Computer Graphics, Video Sequences
 - homepage: http://caiivg.weebly.com/dataset.html
 
Huizhong Chen - Datasets
- intro: Google I/O Dataset, Names 100 Dataset, Clothing Attributes Dataset, Stanford Mobile Visual Search Dataset, CNN 2-Hours Videos Dataset
 - homepage: http://huizhongchen.github.io/datasets.html#clothingattributedataset
 
Classification / Recognition
A Large-Scale Car Dataset for Fine-Grained Categorization and Verification

- project page: http://mmlab.ie.cuhk.edu.hk/datasets/comp_cars/index.html
 - arxiv: http://arxiv.org/abs/1506.08959
 
CIFAR-10 / CIFAR-100
- intro: The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
 - homepage: http://www.cs.toronto.edu/~kriz/cifar.html
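
The counts and image shape quoted in the intro can be sanity-checked with a short sketch. It assumes the row layout described on the dataset homepage for the Python version (3072 values per image: 1024 red, then 1024 green, then 1024 blue, each plane in row-major order). `pixel_at` and `fake_row` are hypothetical names used only for illustration, and the row is synthetic, not real data.

```python
# Sketch only: the counts come from the intro above; the channel-plane layout
# is an assumption based on the dataset homepage's description.
NUM_CLASSES = 10
IMAGES_PER_CLASS = 6000
TRAIN_IMAGES, TEST_IMAGES = 50000, 10000
SIDE = 32
PLANE = SIDE * SIDE  # 1024 values per colour channel


def pixel_at(row, y, x):
    """Return the (r, g, b) triple at image row y, column x of one flat row."""
    if len(row) != 3 * PLANE:
        raise ValueError(f"expected {3 * PLANE} values, got {len(row)}")
    if not (0 <= y < SIDE and 0 <= x < SIDE):
        raise IndexError("pixel coordinates out of range")
    offset = y * SIDE + x
    return row[offset], row[PLANE + offset], row[2 * PLANE + offset]


# Synthetic row (not real data): constant red=1, green=2, blue=3.
fake_row = bytes([1] * PLANE + [2] * PLANE + [3] * PLANE)

# 10 classes x 6000 images should equal the train + test split.
total_images = NUM_CLASSES * IMAGES_PER_CLASS
print(total_images == TRAIN_IMAGES + TEST_IMAGES)
print(pixel_at(fake_row, 31, 31))
```

Indexing the flat row directly avoids any dependency on an array library; with NumPy available, a reshape to (3, 32, 32) followed by a transpose would do the same job for a whole batch.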
 
Tencent ML-Images
- intro: Largest multi-label image database; ResNet-101 model; 80.73% top-1 acc on ImageNet
 - github: https://github.com/Tencent/tencent-ml-images
 
Face
The MegaFace Benchmark: 1 Million Faces for Recognition at Scale
- homepage: http://megaface.cs.washington.edu/
 - arxiv: http://arxiv.org/abs/1512.00596
 
MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition
MSR Image Recognition Challenge (IRC)
UMDFaces: An Annotated Face Dataset for Training Deep Networks
Vehicle
The Comprehensive Cars (CompCars) dataset

http://mmlab.ie.cuhk.edu.hk/datasets/comp_cars/
BoxCars: Improving Fine-Grained Recognition of Vehicles Using 3-D Bounding Boxes in Traffic Surveillance [IEEE T-ITS]

Vehicle Make and Model Recognition Dataset (VMMRdb)
- intro: contains 9,170 classes comprising 291,752 images, covering models manufactured between 1950 and 2016
 - homepage: http://vmmrdb.cecsresearch.org/
 
Cars Dataset

- intro: contains 16,185 images of 196 classes of cars.
 - homepage: http://ai.stanford.edu/~jkrause/cars/car_dataset.html
 
Scene Recognition
Places: An Image Database for Deep Scene Understanding
- project page: http://places.csail.mit.edu/index.html
 - arxiv: https://arxiv.org/abs/1610.02055
 
Places2
- intro: Places2 contains more than 10 million images comprising 400+ unique scene categories
 - homepage: http://places2.csail.mit.edu/
 
The Places365-CNNs for Scene Classification
MNIST
EMNIST: an extension of MNIST to handwritten letters
Fashion-MNIST
- arxiv: https://arxiv.org/abs/1708.07747
 - github: https://github.com/zalandoresearch/fashion-mnist
 - benchmark: http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/
 
Food
3 Million Instacart Orders, Open Sourced
https://tech.instacart.com/3-million-instacart-orders-open-sourced-d40d29ead6f2
Detection
YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video
- intro: YouTube-BoundingBoxes (YT-BB)
 - homepage: https://research.google.com/youtubebb/
 - arxiv: https://arxiv.org/abs/1702.00824
 
DeepScores – A Dataset for Segmentation, Detection and Classification of Tiny Objects
https://arxiv.org/abs/1804.00525
Exclusively Dark (ExDark) Image Dataset
- intro: To the best of the authors' knowledge, Exclusively Dark (ExDark) is the largest collection to date of low-light images, taken in conditions ranging from very low light to twilight (10 different conditions), with image-class and object-level annotations.
 - github: https://github.com/cs-chan/Exclusively-Dark-Image-Dataset
 
Face Detection
FDDB: Face Detection Data Set and Benchmark
- homepage: http://vis-www.cs.umass.edu/fddb/index.html
 - results: http://vis-www.cs.umass.edu/fddb/results.html
 
WIDER FACE: A Face Detection Benchmark

Pedestrian Detection

Caltech Pedestrian Detection Benchmark
Caltech Pedestrian Dataset Converter
https://github.com/mitmul/caltech-pedestrian-dataset-converter
CityPersons: A Diverse Dataset for Pedestrian Detection
- arxiv: https://arxiv.org/abs/1702.05693
 - bitbucket: https://bitbucket.org/shanshanzhang/citypersons
 - supplemental: http://openaccess.thecvf.com/content_cvpr_2017/supplemental/Zhang_CityPersons_A_Diverse_2017_CVPR_supplemental.pdf
 
CrowdHuman: A Benchmark for Detecting Human in a Crowd
- intro: CrowdHuman contains 15,000, 4,370, and 5,000 images for training, validation, and testing, respectively. The train and validation subsets contain a total of 470K human instances, about 23 persons per image, with various kinds of occlusion.
 - homepage: https://sshao0516.github.io/CrowdHuman/
 
EuroCity Persons Dataset
- intro: collected on board a moving vehicle in 31 cities across 12 European countries; over 238,200 person instances manually labeled in over 47,300 images, including a large number of person orientation annotations (over 211,200)
 - homepage: https://eurocity-dataset.tudelft.nl/
 - arxiv: https://arxiv.org/abs/1805.07193
 
WiderPerson: A Diverse Dataset for Dense Pedestrian Detection in the Wild

- project page: http://www.cbsr.ia.ac.cn/users/sfzhang/WiderPerson/
 
Full-Body Annotations
COCO-WholeBody
https://github.com/jin-s13/COCO-WholeBody
Halpe Full-Body Human Keypoints and HOI-Det dataset
- intro: Halpe: full body human pose estimation and human-object interaction detection dataset
 - github: https://github.com/Fang-Haoshu/Halpe-FullBody
 
Vehicle Detection
Toyota Motor Europe (TME) Motorway Dataset
- intro: composed of 28 clips totaling approximately 27 minutes (30,000+ frames), with vehicle annotations
 - homepage: http://cmp.felk.cvut.cz/data/motorway/
 
BIT-Vehicle Dataset

- intro: 9,850 vehicle images at 1600×1200 and 1920×1080, captured by two cameras at different times and places
 - homepage: http://iitlab.bit.edu.cn/mcislab/vehicledb/
 
Vehicle Re-ID
A Large-Scale Dataset for Vehicle Re-Identification in the Wild
Logo Detection
QMUL-OpenLogo: Open Logo Detection Challenge
- intro: QMUL-OpenLogo contains 27,083 images from 352 logo classes, built by aggregating and refining 7 existing datasets and establishing an open logo detection evaluation protocol
 - homepage: https://qmul-openlogo.github.io/
 
Head Detection
SCUT-HEAD
- intro: SCUT-HEAD is a large-scale head detection dataset, including 4,405 images labeled with 111,251 heads.
 - github: https://github.com/HCIILAB/SCUT-HEAD-Dataset-Release
 
HollywoodHeads dataset
http://www.di.ens.fr/willow/research/headdetection/
Brainwash dataset
https://exhibits.stanford.edu/data/catalog/sx925dc9385
Detection From Video
YouTube-Objects dataset v2.2
ILSVRC2015: Object detection from video (VID)
Segmentation
Mapillary Vistas Dataset
Mapillary Vistas Dataset
- intro: 25,000 high-resolution images, 100 object categories, 60 of those instance-specific
 - homepage: https://www.mapillary.com/dataset/
 
Releasing the World’s Largest Street-level Imagery Dataset for Teaching Machines to See
http://blog.mapillary.com/product/2017/05/03/mapillary-vistas-dataset.html
Multi-Human Parsing

PASCAL VOC
Augmented Pascal VOC
http://home.bharathh.info/pubs/codes/SBD/download.html
Supervisely Person
- homepage: https://supervise.ly/
 - blog: https://hackernoon.com/releasing-supervisely-person-dataset-for-teaching-machines-to-segment-humans-1f1fc1f28469
 
Microsoft COCO
- homepage: http://mscoco.org/
 - github: https://github.com/pdollar/coco
 
The Oxford-IIIT Pet Dataset
- intro: a 37 category pet dataset with roughly 200 images for each class. All images have an associated ground truth annotation of breed, head ROI, and pixel level trimap segmentation
 - homepage: http://www.robots.ox.ac.uk/~vgg/data/pets/
 
COCO-Stuff
COCO-Stuff: Thing and Stuff Classes in Context
COCO-Stuff 10K dataset v1.1
- arxiv: https://arxiv.org/abs/1612.03716
 - github: https://github.com/nightrome/cocostuff
Scene Parsing
MIT Scene Parsing Benchmark
http://sceneparsing.csail.mit.edu/
ADE20K
- intro: train: 20,120 images, val: 2,000 images. Contains 150 stuff/object category labels (e.g., wall, sky, and tree) and 1,038 image-level scene descriptors (e.g., airport terminal, bedroom, and street).
 - homepage: http://groups.csail.mit.edu/vision/datasets/ADE20K/
 
Semantic Understanding of Scenes through the ADE20K Dataset
https://arxiv.org/abs/1608.05442
ImageNet
ImageNet-Utils
- intro: Utils to help download images by id, crop bounding box, label images, etc.
 - github: https://github.com/tzutalin/ImageNet_Utils
 
Captioning / Description
TGIF: A New Dataset and Benchmark on Animated GIF Description
Collecting Multilingual Parallel Video Descriptions Using Mechanical Turk
- intro: 1970 YouTube video snippets: 1200 training, 100 validation, 670 test
 - homepage: http://www.cs.utexas.edu/users/ml/clamp/videoDescription/
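
A minimal sketch of how the 1200/100/670 split quoted above could be materialized. The source does not say how snippets are assigned to each subset, so this assumes simple contiguous slicing over an ordered list; `split_snippets` and the `snippet_XXXX` identifiers are hypothetical placeholders.

```python
# Split sizes taken from the intro above; the assignment rule is an assumption.
SPLIT_SIZES = {"train": 1200, "validation": 100, "test": 670}


def split_snippets(snippet_ids, sizes=SPLIT_SIZES):
    """Slice an ordered list of snippet ids into contiguous named subsets."""
    expected = sum(sizes.values())
    if len(snippet_ids) != expected:
        raise ValueError(f"expected {expected} snippets, got {len(snippet_ids)}")
    splits, start = {}, 0
    for name, count in sizes.items():
        splits[name] = snippet_ids[start:start + count]
        start += count
    return splits


# Placeholder ids standing in for the 1970 YouTube snippets.
ids = [f"snippet_{i:04d}" for i in range(1970)]
splits = split_snippets(ids)
for name, subset in splits.items():
    print(name, len(subset))
```

The length check makes a mismatch between the advertised total and the files on disk fail loudly instead of silently shrinking the test set.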
 
Video
| Dataset | # Videos | # Classes | Year | Manually Labeled? |
|---|---|---|---|---|
| Kodak | 1,358 | 25 | 2007 | ✓ |
| HMDB51 | 7,000 | 51 | | |
| Charades | 9,848 | 157 | | |
| MCG-WEBV | 234,414 | 15 | 2009 | ✓ | 
| CCV | 9,317 | 20 | 2011 | ✓ | 
| UCF-101 | 13,320 | 101 | 2012 | ✓ | 
| THUMOS-2 | 18,394 | 101 | 2014 | ✓ | 
| MED-2014 | ≈28,000 | 20 | 2014 | ✓ | 
| Sports-1M | 1M | 487 | 2014 | ✗ | 
| ActivityNet | 27,801 | 203 | 2015 | ✓ | 
| FCVID | 91,223 | 239 | 2015 | ✓ | 
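
The table mixes plain, comma-separated, approximate ("≈28,000"), and abbreviated ("1M") counts, so comparing rows needs a small normalization step. The sketch below copies the dataset names, video counts, and class counts from the table and ranks the datasets by size; `parse_count` is a hypothetical helper, and treating "≈" values as exact is a simplifying assumption.

```python
def parse_count(text):
    """Convert a '# Videos' cell such as '13,320', '≈28,000' or '1M' to an int."""
    cleaned = text.replace("≈", "").replace(",", "").strip()
    if cleaned.endswith("M"):
        return int(float(cleaned[:-1]) * 1_000_000)
    return int(cleaned)


# (dataset, '# Videos' cell, '# Classes') copied from the table above.
VIDEO_DATASETS = [
    ("Kodak", "1,358", 25),
    ("HMDB51", "7000", 51),
    ("Charades", "9848", 157),
    ("MCG-WEBV", "234,414", 15),
    ("CCV", "9,317", 20),
    ("UCF-101", "13,320", 101),
    ("THUMOS-2", "18,394", 101),
    ("MED-2014", "≈28,000", 20),
    ("Sports-1M", "1M", 487),
    ("ActivityNet", "27,801", 203),
    ("FCVID", "91,223", 239),
]

# Largest dataset first.
ranked = sorted(VIDEO_DATASETS, key=lambda row: parse_count(row[1]), reverse=True)
for name, videos, classes in ranked:
    print(f"{name:12s} {parse_count(videos):>9,d} videos  {classes:>3d} classes")
```
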
UCF101 - Action Recognition Data Set

- homepage: http://crcv.ucf.edu/data/UCF101.php
 
HMDB51: A Large Video Database for Human Motion Recognition
ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding
- homepage: http://activity-net.org/
 - download: http://activity-net.org/download.html
 - github: https://github.com/activitynet
 
Sports-1M
- homepage: https://github.com/gtoderici/sports-1m-dataset/blob/wiki/ProjectHome.md
 - github: https://github.com/gtoderici/sports-1m-dataset/
 - thumbnails: http://cs.stanford.edu/people/karpathy/deepvideo/classes.html
 
Charades Dataset
- intro: This dataset guides our research into unstructured video activity recognition and commonsense reasoning for daily human activities.
 - intro: The dataset contains 66,500 temporal annotations for 157 action classes, 41,104 labels for 46 object classes, and 27,847 textual descriptions of the videos.
 - homepage: http://allenai.org/plato/charades/
 
FCVID: Fudan-Columbia Video Dataset
- homepage: http://bigvid.fudan.edu.cn/FCVID/
 
YouTube-8M: A Large-Scale Video Classification Benchmark
- homepage: http://research.google.com/youtube8m/
 - arxiv: http://arxiv.org/abs/1609.08675
 
stabilized video frames
- intro: 9 TB, 35,000,000 clips, 32 frames
 - intro: Generating Videos with Scene Dynamics
 - homepage: http://web.mit.edu/vondrick/tinyvideo/#data
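
A back-of-the-envelope sketch of what the figures in the intro imply for storage planning. It assumes that "32 frames" means 32 frames per clip and that "9 TB" is decimal terabytes (10^12 bytes); neither is stated in the entry, so treat the results as rough estimates.

```python
# Figures from the intro above; the interpretations are assumptions.
TOTAL_BYTES = 9 * 10**12   # assumes decimal terabytes
NUM_CLIPS = 35_000_000
FRAMES_PER_CLIP = 32       # assumes "32 frames" is a per-clip length

total_frames = NUM_CLIPS * FRAMES_PER_CLIP
bytes_per_clip = TOTAL_BYTES / NUM_CLIPS
bytes_per_frame = TOTAL_BYTES / total_frames

print(f"total frames:    {total_frames:,d}")
print(f"bytes per clip:  {bytes_per_clip:,.0f}")
print(f"bytes per frame: {bytes_per_frame:,.0f}")
```
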
 
The Kinetics Human Action Video Dataset
- intro: Google
 - homepage: https://deepmind.com/research/open-source/open-source-datasets/kinetics/
 - arxiv: https://arxiv.org/abs/1705.06950
 
e-Lab Video Data Set(s)
- intro: “Currently, e-VDS35 has 35 classes and a total of 2050 videos of roughly 10 seconds each (see histogram below). We are aiming to collect overall 1750 (50 × 35) videos with your help.”
 - homepage: https://engineering.purdue.edu/elab/eVDS
 
Video Dataset Overview
- intro: Sortable and searchable compilation of video datasets
 - homepage: https://www.di.ens.fr/~miech/datasetviz/
 
Scene
SceneNet RGB-D: 5M Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth
- intro: Imperial College London
 - project page: https://robotvault.bitbucket.org/scenenet-rgbd.html
 - arxiv: https://arxiv.org/abs/1612.05079
 - github: https://github.com/jmccormac/pySceneNetRGBD
 
Autonomous Driving
BDD: Berkeley Deep Drive
- intro: 100,000 HD video sequences of over 1,100-hour driving experience across many different times in the day, weather conditions, and driving scenarios
 - homepage: http://bdd-data.berkeley.edu/
 - github: https://github.com/ucbdrive/bdd-data
 
OCR
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images
- homepage: http://vision.cornell.edu/se3/coco-text/
 - arxiv: http://arxiv.org/abs/1601.07140
 
Chinese Text in the Wild
- intro: 32,285 high resolution images, 1,018,402 character instances, 3,850 character categories, 6 kinds of attributes
 - homepage: https://ctwdataset.github.io/
 - arxiv: https://arxiv.org/abs/1803.00085
 
ShopSign: a Diverse Scene Text Dataset of Chinese Shop Signs in Street Views
Retrieval
Oxford5k
Paris6k
Oxford105k
UKB
NUS-WIDE
ImageNet-YahooQA
University-1652
- Dataset and Baseline Code: https://github.com/layumi/University1652-Baseline
 
DeepFashion: In-shop Clothes Retrieval
- intro: 7,982 clothing items; 52,712 in-shop clothes images; ~200,000 cross-pose/scale pairs. Each image is annotated with bounding box, clothing type, and pose type.
 - homepage: http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion/InShopRetrieval.html
 
Person Re-ID
| Dataset | Description | 
|---|---|
| CUHK01 | 971 identities, 3884 images, manually cropped | 
| CUHK02 | 1816 identities, 7264 images, manually cropped | 
| CUHK03 | 1360 identities, 13164 images, manually cropped + automatically detected | 
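
The identity and image counts in the table give a quick measure of how densely each dataset samples a person. The sketch below computes images per identity from the table's numbers only; the dictionary name `CUHK` is just an illustrative container.

```python
# (identities, images) copied from the table above.
CUHK = {
    "CUHK01": (971, 3884),
    "CUHK02": (1816, 7264),
    "CUHK03": (1360, 13164),
}

# Average number of images available for each identity.
images_per_identity = {
    name: images / identities for name, (identities, images) in CUHK.items()
}

for name, ratio in images_per_identity.items():
    print(f"{name}: {ratio:.2f} images per identity")
```

CUHK01 and CUHK02 work out to exactly 4 images per identity, while CUHK03 averages about 9.68.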
Person Re-identification Datasets
- homepage: http://robustsystems.coe.neu.edu/sites/robustsystems.coe.neu.edu/files/systems/projectpages/reiddataset.html
 - github: https://github.com/RSL-NEU/person-reid-benchmark
 
CUHK Person Re-identification Datasets
http://www.ee.cuhk.edu.hk/~xgwang/CUHK_identification.html
PRW (Person Re-identification in the Wild) Dataset

- homepage: http://www.liangzheng.com.cn/Project/project_prw.html
 - github: https://github.com/liangzheng06/PRW-baseline
 
Person Re-identification in the Wild
- intro: CVPR 2017 spotlight
 - arxiv: https://arxiv.org/abs/1604.02531
 
DukeMTMC-reID
- intro: DukeMTMC-reID is a subset of the DukeMTMC for image-based re-identification, in the format of the Market-1501 dataset
 - intro: 16,522 training images of 702 identities, 2,228 query images of the other 702 identities and 17,661 gallery images
 - github: https://github.com/layumi/DukeMTMC-reID_evaluation
 
DukeMTMC4ReID
- intro: DukeMTMC4ReID dataset
 - github: https://github.com/NEU-Gou/DukeReID
 
Person Re-ID (PRID) Dataset 2011
https://www.tugraz.at/institute/icg/research/team-bischof/lrs/downloads/PRID11/
MARS (Motion Analysis and Re-identification Set) Dataset
- intro: an extension of the Market-1501 dataset
 - homepage: http://www.liangzheng.com.cn/Project/project_mars.html
 - github: https://github.com/liangzheng06/MARS-evaluation
 
X-MARS Reordering of the MARS Dataset for Image to Video Evaluation
- intro: This repository provides the X-MARS dataset splits for image to video/tracklet evaluation
 - github: https://github.com/andreas-eberle/x-mars
 
MSMT17
- intro: 15-camera (12 outdoor cameras, 3 indoor cameras), 4,101 Identities, 126,441 BBoxes
 - homepage: http://www.pkuvmc.com/publications/longhui.html
 - state of the art: http://www.pkuvmc.com/publications/state_of_the_art.html
 
Labeled Pedestrian in the Wild
- intro: train/test identities: 1,975/756
 - homepage: http://liuyu.us/dataset/lpw/
 
SenseReID
https://drive.google.com/file/d/0B56OfSrVI8hubVJLTzkwV2VaOWM/view
3DPeS
http://www.openvisor.org/3dpes.asp
iQIYI-VID: A Large Dataset for Multi-modal Person Identification
https://arxiv.org/abs/1811.07548
Fashion
Large-scale Fashion (DeepFashion) Database
- intro: Attribute Prediction, Consumer-to-shop Clothes Retrieval, In-shop Clothes Retrieval, and Landmark Detection
 - homepage: http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion.html
 
Apparel classification with Style

- intro: 15 clothing classes, 88951 images
 - homepage: http://people.ee.ethz.ch/~lbossard/projects/accv12/index.html
 
Attribute Datasets
Attribute Datasets
- intro: in total 41,585 pedestrian samples, each of which is annotated with 72 attributes as well as viewpoints, occlusions, body parts information
 - homepage: https://www.ecse.rpi.edu/homepages/cvrl/database/AttributeDataset.htm
 
Pedestrian Attribute Recognition
A Richly Annotated Dataset for Pedestrian Attribute Recognition
- homepage: http://rap.idealtest.org/
 - arxiv: https://arxiv.org/abs/1603.07054
 
Pedestrian Attribute Recognition At Far Distance
- intro: PEdesTrian Attribute (PETA)
 - homepage: http://mmlab.ie.cuhk.edu.hk/projects/PETA.html
 - paper: http://personal.ie.cuhk.edu.hk/~pluo/pdf/mm14.pdf
 
Market-1501_Attribute
DukeMTMC-attribute
Parse27k
- intro: Pedestrian Attribute Recognition in Sequences
 - intro: >27,000 annotated pedestrians, 10 attributes
 - homepage: https://www.vision.rwth-aachen.de/page/parse27k
 - tools: https://github.com/psudowe/parse27k_tools
 
Tracking
UA-DETRAC: A New Benchmark and Protocol for Multi-Object Detection and Tracking
- homepage: http://detrac-db.rit.albany.edu/
 - arxiv: https://arxiv.org/abs/1511.04136
 
DukeMTMC: Duke Multi-Target, Multi-Camera Tracking Project
- intro: DukeMTMC aims to accelerate advances in multi-target multi-camera tracking. It provides a tracking system that works within and across cameras, a new large scale HD video data set recorded by 8 synchronized cameras with more than 7,000 single camera trajectories and over 2,000 unique identities
 - homepage: http://vision.cs.duke.edu/DukeMTMC/
 
The WILDTRACK Seven-Camera HD Dataset
https://cvlab.epfl.ch/data/wildtrack
GOT-10k: Generic Object Tracking Benchmark
- intro: A large, high-diversity, one-shot database for generic object tracking in the wild
 - project page: http://got-10k.aitestunion.com/
 - github: https://github.com/got-10k/toolkit
 
Color Classification
Vehicle Color Recognition on an Urban Road by Feature Context
http://mclab.eic.hust.edu.cn/~pchen/project.html
License Plate Detection and Recognition
Application-Oriented License Plate (AOLP) Database
http://aolpr.ntust.edu.tw/lab/download.html
CCPD: Chinese City Parking Dataset
- paper: http://openaccess.thecvf.com/content_ECCV_2018/papers/Zhenbo_Xu_Towards_End-to-End_License_ECCV_2018_paper.pdf
 - github: https://github.com/detectRecog/CCPD
 - dataset: https://drive.google.com/file/d/1fFqCXjhk7vE9yLklpJurEwP9vdLZmrJd/view
 
Face Anti-Spoofing
CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations
- intro: ECCV 2020
 - arxiv: https://arxiv.org/abs/2007.12342
 - github: https://github.com/Davidzhangyuanhan/CelebA-Spoof
 
Tools
VoTT: Visual Object Tagging Tool 1.5
- intro: Visual Object Tagging Tool: an Electron app for building end-to-end object detection models from images and videos
 - github: https://github.com/Microsoft/VoTT
 
LabelImg: a graphical image annotation tool for labeling object bounding boxes in images

Pychet Labeller
- intro: A python based annotation/labelling toolbox for images. The program allows the user to annotate individual objects in images.
 - github: https://github.com/sbargoti/pychetlabeller
 
ml-pyxis: Tool for reading and writing datasets of tensors (numpy.ndarray) with MessagePack and Lightning Memory-Mapped Database (LMDB).
- intro: Tool for reading and writing datasets of tensors in a Lightning Memory-Mapped Database (LMDB). Designed to manage machine learning datasets with fast reading speeds.
 - github: https://github.com/vicolab/ml-pyxis
 
Open Image Dataset downloader
BBox-Label-Tool
- intro: A simple tool for labeling object bounding boxes in images
 - github: https://github.com/puzzledqs/BBox-Label-Tool
 
Data Labeler for Video
- intro: A GUI tool for conveniently labeling objects in video, using object tracking.
 - github: https://github.com/hahnyuan/video_labeler
 
Computer Vision Annotation Tool (CVAT)

- intro: Computer Vision Annotation Tool (CVAT) is a web-based tool which helps to annotate video and images for Computer Vision algorithms
 - github: https://github.com/opencv/cvat
 
Artist
BAM! The Behance Artistic Media Dataset
- intro: 2.5M artwork urls, 393K attribute labels, 74K short image descriptions/captions
 - project page: https://bam-dataset.org/
 - arxiv: https://arxiv.org/abs/1704.08614
 
Resources
CV Datasets on the web
http://www.cvpapers.com/datasets.html
Awesome Public Datasets
- intro: An awesome list of high-quality open datasets in public domains (on-going). By everyone, for everyone!
 - github: https://github.com/caesar0301/awesome-public-datasets
 
Machine Learning Repository
https://archive.ics.uci.edu/ml/datasets.html