§6.1 · The bucket
plantir.garden/data/
├── latest.json # 5-min snapshot — current state
├── recent-24h.json # rolling window — last 24h
└── archive/
    ├── index.json # listing — keys + sizes + sha256
    ├── 2026-04-23.ndjson.gz # day archive — immutable once written
    ├── 2026-04-24.ndjson.gz
    └── ...
latest.json and recent-24h.json are fully replaced every 5 minutes.
archive/<YYYY-MM-DD>.ndjson.gz is finalised once, 30 minutes after the
UTC midnight that ends that day, and is never rewritten afterwards.
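The finalisation rule can be sketched as a small check that a client might run before treating a day archive as immutable. This is a minimal sketch, not the exporter's logic: the function names are hypothetical, and it assumes "that day's UTC midnight" means the midnight that ends the day.

```python
from datetime import date, datetime, time, timedelta, timezone

GRACE = timedelta(minutes=30)  # grace period stated in the text


def archive_key(day: date) -> str:
    """Key of the day archive, following the archive/<YYYY-MM-DD>.ndjson.gz pattern."""
    return f"archive/{day.isoformat()}.ndjson.gz"


def finalised_at(day: date) -> datetime:
    """Earliest instant at which the archive for `day` should be final.

    Assumption: the grace period starts at the UTC midnight that ends `day`.
    """
    end_of_day = datetime.combine(day + timedelta(days=1), time.min, tzinfo=timezone.utc)
    return end_of_day + GRACE


def is_finalised(day: date, now: datetime) -> bool:
    """True once `now` (timezone-aware) is at or past the finalisation instant."""
    return now >= finalised_at(day)
```

A consumer would typically skip any day for which `is_finalised(day, datetime.now(timezone.utc))` is still false and fall back to recent-24h.json for live values.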
DRAFT — exporter implementation lives at PLN-003; deployment gated on LIVE- prod cutover (Phase 5.9). Schema below is what will land.
§6.2 · Per-row schema (NDJSON)
One JSON object per line. Compressed with gzip (Content-Encoding: gzip,
served as-gzipped from S3 via Cloudflare).
{"recorded_at":"2026-04-27T00:00:12Z","node_id":"esp32-greenhouse","temp_c":21.4,"humidity_pct":51.2,"moisture_pct":40.8}
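Because the archive is served with Content-Encoding: gzip, some HTTP clients decode it transparently and others hand back the raw gzip bytes. The sketch below reads rows in either case by checking for the gzip magic number. The function name `iter_rows` is hypothetical, and fetching the bytes is left to the caller.

```python
import gzip
import json
from typing import Iterator

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def iter_rows(payload: bytes) -> Iterator[dict]:
    """Yield one dict per NDJSON line from gzipped or already-decoded bytes."""
    if payload[:2] == GZIP_MAGIC:
        payload = gzip.decompress(payload)
    # Split on "\n" only, so unusual Unicode separators inside strings are left alone.
    for line in payload.split(b"\n"):
        if line.strip():
            yield json.loads(line)
```

Each yielded dict carries the keys shown in the sample row above; nothing in this sketch validates them.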
Fields:
| Field | Type | Notes |
|---|---|---|
| recorded_at | ISO-8601 UTC | Aurora server-side timestamp; canonical. |
| node_id | string | Stable across renames via stable_id map. |
| temp_c | number \| null | BME280 reading. NULL means sensor absent. |
| humidity_pct | number \| null | BME280 reading. NULL means sensor absent. |
| moisture_pct | number \| null | Capacitive ADC. NULL means sensor absent. |
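The field table maps naturally onto a typed record. The following is a minimal sketch of one way to parse a row while keeping null as "sensor absent" rather than coercing it to zero. The `Reading` class and `parse_reading` function are hypothetical names, not part of the exporter.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Reading:
    recorded_at: datetime            # timezone-aware, UTC
    node_id: str
    temp_c: Optional[float]          # None means sensor absent
    humidity_pct: Optional[float]    # None means sensor absent
    moisture_pct: Optional[float]    # None means sensor absent


def _num(value) -> Optional[float]:
    # JSON null arrives as None; keep it as None instead of inventing a number.
    return None if value is None else float(value)


def parse_reading(line: str) -> Reading:
    """Parse one NDJSON line into a Reading."""
    obj = json.loads(line)
    # Replace the trailing "Z" so fromisoformat also works on Python < 3.11.
    ts = datetime.fromisoformat(obj["recorded_at"].replace("Z", "+00:00"))
    return Reading(
        recorded_at=ts.astimezone(timezone.utc),
        node_id=obj["node_id"],
        temp_c=_num(obj["temp_c"]),
        humidity_pct=_num(obj["humidity_pct"]),
        moisture_pct=_num(obj["moisture_pct"]),
    )
```

Keeping the measurement fields as `Optional[float]` forces downstream code to handle an absent sensor explicitly.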
raw::jsonb is intentionally excluded per ADR-010 § Privacy. If a
future audit needs raw, that’s a separate authenticated tier — out of
v1 scope.
DRAFT — air-quality columns (lux, co2_ppm, pressure_hpa, pm25) proposal: nest under a separate
air object. Decision pending.
§6.3 · Gap treatment
The dataset is gap-honest. A node that has been silent past its threshold produces no rows in the archive for that period. The absence is the data point.
Programmatic detection: an analysis script that joins by recorded_at
will see the gap as missing rows, not as zeros and not as forward-filled
last-known-good values. Don’t forward-fill — the gap is the signal.
INC-001 (2026-04-23 four-day gap) is in the archive as a gap.
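Gap detection on the consumer side can be sketched as a scan over consecutive timestamps for a single node. The threshold is an assumption here: the text refers to a per-node silence threshold without giving its value, so the caller supplies one. `find_gaps` is a hypothetical helper name.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def find_gaps(
    timestamps: Iterable[datetime], threshold: timedelta
) -> List[Tuple[datetime, datetime]]:
    """Return (last_seen, next_seen) pairs where the silence exceeds `threshold`.

    No rows are synthesised and nothing is forward-filled; the function only
    reports where rows are missing.
    """
    ordered = sorted(timestamps)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > threshold:
            gaps.append((prev, cur))
    return gaps
```

Run it per node_id; a gap that spans whole days shows up as one pair whose endpoints sit in different day archives.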
§6.4 · Reproducibility
Any analysis published alongside the thesis should be reproducible from the public bucket alone, with no Aurora access. Specifically:
- index.json lists every archive key with size + sha256. Verify the hashes before analysis.
- recent-24h.json should NOT be used for back-analysis; it’s a live snapshot, not an immutable record.
- Cite the URL to the specific archive day, not to latest.json.
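Hash verification before analysis might look like the sketch below. The layout of an index.json entry is not specified in this section, so the `"size"` and `"sha256"` field names are assumptions, as is the choice to hash the object exactly as stored (still gzipped). `verify_archive` is a hypothetical helper name.

```python
import hashlib


def verify_archive(payload: bytes, entry: dict) -> bool:
    """Check downloaded bytes against one index.json entry.

    Assumed entry shape (not confirmed by the spec):
        {"key": "archive/<YYYY-MM-DD>.ndjson.gz", "size": <int>, "sha256": "<hex>"}
    Assumption: size and sha256 describe the stored, still-compressed object.
    """
    if len(payload) != entry["size"]:
        return False
    digest = hashlib.sha256(payload).hexdigest()
    return digest == entry["sha256"].lower()
```

If the HTTP client decodes Content-Encoding: gzip transparently, the bytes it returns will not match a hash of the stored object, so under these assumptions the raw response body should be hashed.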
DRAFT — an example analysis notebook and a README in the bucket are pending, due by 2026-08.
§6.5 · License
Public Domain (CC0) for the data. Code (analysis, exporters) is whatever
the source repo declares — see the GitHub link in /about.
DRAFT — confirm CC0 with the supervisor before defence.
Status: structural draft v0.1, 2026-05-07. Citation:
https://plantir.garden/thesis/2026/dataset is locked per ADR-011. Related ADRs: docs/adr/010-public-sensor-data-policy.md, docs/adr/011-thesis-url-schema.md.