# Sentinel-2 Cloud Cover Segmentation Dataset

In many uses of multispectral satellite imagery, clouds obscure what we really care about, for example tracking wildfires, mapping deforestation, or monitoring crop health. Removing clouds from satellite images more accurately filters out this interference and unlocks a wide range of use cases. With this goal in mind, this training dataset was generated as part of a [crowdsourcing competition](https://www.drivendata.org/competitions/83/cloud-cover/) and later validated by a team of expert annotators. The dataset consists of Sentinel-2 satellite imagery and corresponding cloud labels stored as GeoTiffs. There are 22,728 chips in the training data, collected between 2018 and 2020.

## Documentation

* [Link](https://radiantearth.blob.core.windows.net/mlhub/ref_cloud_cover_detection_challenge_v1/documentation.pdf)

## Tutorials

* [How to use deep learning, PyTorch Lightning, and the Planetary Computer to predict cloud cover in satellite imagery](https://www.drivendata.co/blog/cloud-cover-benchmark/) by [Katie Westone](https://www.drivendata.co/)

## Creator & Contact

* [Radiant Earth Foundation](https://radiant.earth)
* ml@radiant.earth

## License

* [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/)

## Citation & DOI

Radiant Earth Foundation. (2022). Sentinel-2 Cloud Cover Segmentation Dataset (Version 1). Radiant MLHub. [Date Accessed] https://doi.org/10.34911/rdnt.hfq6m7
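
## Usage Sketch

Since the cloud labels are distributed as GeoTiffs, a common first step is to load one label and check how much of the chip is cloudy. The snippet below is a minimal sketch, not part of the official dataset tooling. It assumes each label is a single-band raster in which `1` marks cloud and `0` marks clear sky (check the documentation PDF for the actual encoding), and the helper names `cloud_fraction` and `read_label` are hypothetical. Reading a real file requires the third-party `rasterio` package; the toy mask at the end runs without it.

```python
def cloud_fraction(mask):
    """Return the share of pixels labeled as cloud in a 2D mask.

    Assumption: a pixel value of 1 means cloud, anything else means clear.
    """
    total = 0
    cloudy = 0
    for row in mask:
        for value in row:
            total += 1
            if value == 1:
                cloudy += 1
    if total == 0:
        raise ValueError("mask is empty")
    return cloudy / total


def read_label(path):
    """Read band 1 of a label GeoTIFF as nested lists (hypothetical helper)."""
    # Imported lazily so cloud_fraction can be used without rasterio installed.
    import rasterio

    with rasterio.open(path) as src:
        return src.read(1).tolist()


# Toy 4x4 mask standing in for a real label chip: 6 of 16 pixels are cloud.
toy_mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(cloud_fraction(toy_mask))  # 0.375
```

With a downloaded label, the same check would be `cloud_fraction(read_label(path))`, where `path` points to a label file on disk. For large batches, operating on the NumPy array returned by `src.read(1)` directly would be faster than converting to lists.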