Smoke100k: A Database for Smoke Detection

Hsiang-Yin Cheng, Jia-Li Yin, Bo-Hao Chen, and Zhi-Min Yu

Department of Computer Science and Engineering, Yuan Ze University




Due to complex scenarios and the limited feature information in a single image, precise smoke detection is challenging in practice. Most previous smoke detection methods either extract textural and spatiotemporal characteristics of smoke or separate the smoke and background components of an image. However, these methods often fail to localize smoke because of the limited feature information within a single image. The task of smoke detection can be better achieved when extra information from a collected training dataset is available. A key issue is how to build a training dataset of paired smoke images and ground-truth bounding box positions for end-to-end learning. This paper proposes a large-scale benchmark image dataset for training smoke detectors. With the built dataset, experimental results demonstrate that discriminative models can be effectively trained as smoke detectors to detect smoldering fires precisely.
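The pairing of smoke images with ground-truth bounding boxes described above can be sketched as follows. This is a minimal illustration, not the dataset's actual annotation format: the "x y w h" line layout, the function names, and the sample coordinates are all assumptions made for the example; consult the Smoke100k annotation files for the real layout. The IoU computation shows how a detector's predicted box is typically scored against the ground truth.

```python
# Minimal sketch of using paired images and ground-truth boxes for
# detector training/evaluation. The one-line "x y w h" annotation
# format here is an ASSUMPTION for illustration only.

def parse_box(line):
    """Parse an 'x y w h' annotation line into corner format (x1, y1, x2, y2)."""
    x, y, w, h = map(int, line.split())
    return (x, y, x + w, y + h)

def box_iou(a, b):
    """Intersection-over-union of two corner-format boxes, commonly used
    to score a predicted bounding box against the ground truth."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

gt = parse_box("40 60 120 80")   # hypothetical ground-truth smoke region
pred = (50, 70, 150, 130)        # a detector's hypothetical prediction
print(box_iou(gt, pred))         # → 0.625
```

A real training pipeline would iterate over every image/annotation pair in the dataset and feed the boxes as regression targets to an end-to-end detector.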


We contribute the Smoke100k database, a large-scale smoke detection database with several appealing properties.

For more details of the dataset, please refer to the paper Smoke 100k: A Database for Smoke Detection.

Sample Images





If you find Smoke100k useful for your research, please cite our paper:

@inproceedings{cheng2019smoke100k,
  author={H. {Cheng} and J. {Yin} and B. {Chen} and Z. {Yu}},
  booktitle={IEEE 8th Global Conference on Consumer Electronics (GCCE)},
  title={Smoke 100k: A Database for Smoke Detection},
  year={2019},
}




Please contact Hsiang-Yin Cheng, Jia-Li Yin, or Bo-Hao Chen for questions about the dataset.