Tropical cyclone subset of the Global 3D Cloud Reconstruction Dataset. It contains colocated pairs of geostationary imagery (GOES-16 ABI and Himawari-8/9 AHI) and CloudSat radar profiles for tropical cyclones identified via the IBTrACS (International Best Track Archive for Climate Stewardship) database. Each sample includes multispectral geostationary imagery (16 spectral channels plus satellite and solar angles), CloudSat vertical profiles as ground truth, a colocation mask, and cyclone metadata (name, category, basin, distance to center). GOES covers the Atlantic and Eastern Pacific basins; Himawari covers the Western Pacific, the basin with the highest tropical cyclone frequency globally. Patches are 256x256 pixels in Cloud-Optimized GeoTIFF format.
Hierarchical structure showing representative samples across levels. The "..." notation indicates additional samples following the same pattern. All samples at the same level share identical structure (RSUT constraint).
| Level | Types | Total Samples | Sample IDs (preview) |
|---|---|---|---|
| Level 0 | All FOLDER | 482 | Root level samples |
| Level 1 | FILE + FILE | 964 | geo_patch, cloudsat_aligned |
These fields are available for querying with SQL when using TacoReader.
| Field Name | Type | Description |
|---|---|---|
| id | string | Unique sample identifier within parent scope. Must be unique among siblings. |
| type | string | Sample type discriminator (FILE or FOLDER). |
| stac:crs | string | Coordinate reference system (WKT2, EPSG, or PROJ) |
| stac:tensor_shape | list<item: int64> | Raster dimensions [bands, height, width] |
| stac:geotransform | list<item: double> | GDAL affine transform |
| stac:time_start | timestamp[us] | Start timestamp (μs since Unix epoch, UTC) |
| stac:centroid | binary | Center point in EPSG:4326 (WKB) |
| stac:time_end | timestamp[us] | End timestamp (μs since Unix epoch, UTC) |
| stac:time_middle | timestamp[us] | Middle timestamp (μs since Unix epoch, UTC) |
| cloud3d:satellite | string | Geostationary satellite source (GOES, Himawari, MSG) |
| cloud3d:geostationary_id | string | Original geostationary satellite file identifier |
| cloud3d:cloudsat_id | string | CloudSat granule/profile identifier |
| cloud3d:has_flxhr | bool | Whether 2B-FLXHR radiative flux/heating rate data is available |
| cyclone:storm_id | string | IBTrACS storm identifier (SID) |
| cyclone:center_lat | double | Cyclone center latitude from IBTrACS |
| cyclone:center_lon | double | Cyclone center longitude from IBTrACS |
| cyclone:dist_km | double | Distance from patch center to cyclone center in kilometers |
| cyclone:delta_t_seconds | double | Temporal offset between geostationary and CloudSat observations (seconds) |
| majortom:code | string | MajorTOM spherical grid cell identifier (e.g., 0100km_0003U_0005R); the prefix gives the approximate grid spacing in km |
| geoenrich:elevation | float | Mean elevation in meters (GLO-30 DEM) |
| geoenrich:precipitation | float | Mean annual precipitation in mm estimated from GPM data |
| geoenrich:temperature | float | Mean annual temperature in °C estimated from MODIS LST data |
| geoenrich:admin_countries | string | Country name at centroid location |
| internal:current_id | int64 | Current sample position at this level (0-indexed). Enables O(1) random access and relational JOINs (ZIP, FOLDER, TACOCAT). |
| internal:parent_id | int64 | Foreign key referencing parent sample position in previous level (ZIP, FOLDER, TACOCAT). |
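The spatial fields above can be turned into coordinates with a few lines of standard-library Python. The following is a minimal sketch, not part of TacoReader: the helper names `pixel_to_coords` and `decode_wkb_point` are hypothetical, the input values are made up for illustration, and the centroid is assumed to be a plain 2D WKB Point (no SRID or Z extensions).

```python
import struct


def pixel_to_coords(geotransform, col, row):
    """Map a pixel offset (col, row) to CRS coordinates.

    Uses the six-element GDAL affine convention:
    [x_origin, pixel_width, row_rotation, y_origin, col_rotation, pixel_height].
    """
    x0, dx, rx, y0, ry, dy = geotransform
    x = x0 + col * dx + row * rx
    y = y0 + col * ry + row * dy
    return x, y


def decode_wkb_point(wkb):
    """Decode a plain 2D WKB Point into (x, y), i.e. (lon, lat) in EPSG:4326."""
    # First byte is the byte-order flag: 1 = little endian, 0 = big endian.
    byte_order = "<" if wkb[0] == 1 else ">"
    (geom_type,) = struct.unpack(byte_order + "I", wkb[1:5])
    if geom_type != 1:
        raise ValueError(f"expected WKB Point (type 1), got type {geom_type}")
    return struct.unpack(byte_order + "dd", wkb[5:21])


# Hypothetical field values, for illustration only.
geotransform = [500000.0, 2000.0, 0.0, 2500000.0, 0.0, -2000.0]  # stac:geotransform
tensor_shape = [16, 256, 256]  # stac:tensor_shape; band count here is illustrative

_, height, width = tensor_shape
center_xy = pixel_to_coords(geotransform, width / 2, height / 2)

# Build a synthetic centroid the same way a WKB writer would, then decode it.
centroid_wkb = struct.pack("<BIdd", 1, 1, -75.5, 24.25)  # stac:centroid
lon, lat = decode_wkb_point(centroid_wkb)

print(center_xy, lon, lat)
```

Note that `pixel_to_coords` returns coordinates in the CRS named by `stac:crs`, whereas `stac:centroid` is always expressed in EPSG:4326, so the two results are only directly comparable when the patch CRS is geographic.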
| Field Name | Type | Description |
|---|---|---|
| id | string | Unique sample identifier within parent scope. Must be unique among siblings. |
| type | string | Sample type discriminator (FILE or FOLDER). |
| geotiff:stats | list<item: list<item: float>> | Per-band statistics (List[List[Float32]]): categorical mode returns class probabilities, continuous mode returns [min, max, mean, std, valid%, p25, p50, p75, p95] |
| taco:header | binary | Binary TACOTIFF header (35 bytes + tile counts) for fast reading without IFD parsing |
| internal:current_id | int64 | Current sample position at this level (0-indexed). Enables O(1) random access and relational JOINs (ZIP, FOLDER, TACOCAT). |
| internal:parent_id | int64 | Foreign key referencing parent sample position in previous level (ZIP, FOLDER, TACOCAT). |
| internal:relative_path | string | Relative path from DATA/ directory. Format: {parent_path}/{id} or {id} for level0 (ZIP, FOLDER, TACOCAT). |
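The `geotiff:stats` field stores plain lists, so the position of each value matters. The sketch below labels the continuous-mode layout described in the table and uses it for a per-band z-score. It is an illustration under assumptions: the helper names and the example numbers are hypothetical, and it only handles continuous mode (categorical mode returns class probabilities instead).

```python
# Order taken from the geotiff:stats description for continuous mode.
CONTINUOUS_STAT_NAMES = [
    "min", "max", "mean", "std", "valid_pct", "p25", "p50", "p75", "p95",
]


def label_continuous_stats(stats):
    """Convert per-band stat lists into one dict per band."""
    labeled = []
    for band_index, values in enumerate(stats):
        if len(values) != len(CONTINUOUS_STAT_NAMES):
            raise ValueError(
                f"band {band_index}: expected {len(CONTINUOUS_STAT_NAMES)} "
                f"values, got {len(values)}"
            )
        labeled.append(dict(zip(CONTINUOUS_STAT_NAMES, values)))
    return labeled


def zscore(value, band_stats):
    """Standardize a pixel value with the band's stored mean and std."""
    return (value - band_stats["mean"]) / band_stats["std"]


# Made-up statistics for two bands, for illustration only.
example_stats = [
    [0.0, 1.0, 0.5, 0.25, 100.0, 0.25, 0.5, 0.75, 0.95],
    [200.0, 300.0, 250.0, 20.0, 98.5, 235.0, 250.0, 265.0, 290.0],
]

bands = label_continuous_stats(example_stats)
normalized = zscore(275.0, bands[1])
print(bands[1]["valid_pct"], normalized)
```

Reading statistics from the metadata this way avoids opening every GeoTIFF just to compute normalization constants.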
# pip install tacoreader
import tacoreader
# Load dataset
ds = tacoreader.load("https://data.source.coop/taco/3dclouds/cyclones/")
# Basic info
print(f"ID: {ds.id}")
print(f"Version: {ds.version}")
print(f"Samples: {len(ds.data)}")
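Once the dataset is loaded, the metadata fields can be filtered with SQL. This card does not show TacoReader's own query call, so the sketch below uses Python's built-in `sqlite3` as a stand-in to illustrate two points: field names containing a colon must be double-quoted, and Level 1 files link to their Level 0 folder through `internal:parent_id`. The table names `level0` and `level1`, the sample IDs, and all row values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE level0 (
        "id" TEXT,
        "cloud3d:satellite" TEXT,
        "cyclone:dist_km" REAL,
        "cyclone:delta_t_seconds" REAL,
        "internal:current_id" INTEGER
    );
    CREATE TABLE level1 (
        "id" TEXT,
        "internal:parent_id" INTEGER,
        "internal:relative_path" TEXT
    );
    """
)

# Made-up rows: two folder samples, each holding the two files from the hierarchy table.
conn.executemany(
    "INSERT INTO level0 VALUES (?, ?, ?, ?, ?)",
    [
        ("sample_a", "GOES", 42.0, 120.0, 0),
        ("sample_b", "Himawari", 310.0, -45.0, 1),
    ],
)
conn.executemany(
    "INSERT INTO level1 VALUES (?, ?, ?)",
    [
        ("geo_patch", 0, "sample_a/geo_patch"),
        ("cloudsat_aligned", 0, "sample_a/cloudsat_aligned"),
        ("geo_patch", 1, "sample_b/geo_patch"),
        ("cloudsat_aligned", 1, "sample_b/cloudsat_aligned"),
    ],
)

# Files whose parent patch lies within 100 km of the cyclone center and whose
# geostationary/CloudSat time offset is under 5 minutes in either direction.
query = """
    SELECT c."internal:relative_path"
    FROM level1 AS c
    JOIN level0 AS p
      ON c."internal:parent_id" = p."internal:current_id"
    WHERE p."cyclone:dist_km" < ?
      AND ABS(p."cyclone:delta_t_seconds") < ?
    ORDER BY c."internal:relative_path"
"""
paths = [row[0] for row in conn.execute(query, (100.0, 300.0))]
print(paths)
conn.close()
```

The same `WHERE` and `JOIN` clauses should carry over to any SQL engine that accepts double-quoted identifiers; the thresholds are arbitrary and should be chosen per use case.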
| Name | Organization | Email |
|---|---|---|
| Cesar Aybar | Universitat de València | cesar.aybar@uv.es |
| Shirin Ermis | University of Oxford | — |
| Lilli Freischem | University of Oxford | — |
| Stella Girtsou | National Observatory of Athens | — |
| Kyriaki-Margarita Bintsi | Harvard Medical School | — |
| Emiliano Diaz Salas-Porras | Universitat de València | — |
| Michael Eisinger | European Space Agency | — |
| William Jones | University of Oxford | — |
| Anna Jungbluth | European Space Agency | — |
| Benoit Tremblay | Environment and Climate Change Canada | — |
If you use this dataset in your research, please cite:
@dataset{cloud3d-finetune-cyclones0,
title = {Cloud 3D - Tropical Cyclones Dataset},
author = {Cesar Aybar and Shirin Ermis and Lilli Freischem and Stella Girtsou and Kyriaki-Margarita Bintsi and Emiliano Diaz Salas-Porras and Michael Eisinger and William Jones and Anna Jungbluth and Benoit Tremblay},
year = {2015},
version = {0.1.0},
publisher = {Universitat de València}
}