TimescaleDB til IoT tidsseriedata — arkitektur og performance
Hvordan M-Bus Gateway bruger TimescaleDB til at håndtere millioner af wM-Bus aflæsninger: hypertables, continuous aggregates, komprimering og query-performance.
Af M-Bus Gateway
M-Bus Gateway gemmer alle aflæsninger i TimescaleDB — PostgreSQL udvidet med tidsserieoptimering. Her er arkitekturen bag, og hvorfor det er det rigtige valg for wM-Bus data.
Hvorfor TimescaleDB frem for ren PostgreSQL?
wM-Bus-data har særlige karakteristika:
- Skrivefrekvens: ~1 gang pr. dag pr. gateway (batch)
- Datamønster: Append-only (ingen UPDATE på readings)
- Querytyper: Altid tidsbaserede ranges (fra–til)
- Volumen: ~365 × antal_installationer × antal_år
- Eksempel: 1000 installationer × 365 dage × 10 år = 3,65 mio. rækker
TimescaleDB fordele:
- Hypertables: Automatisk time-based partitionering → hurtigere range queries
- Continuous aggregates: Pre-aggregerede månedlige/ugentlige views
- Komprimering: 90-96% komprimering på gamle partitioner
- PostgreSQL-kompatibel: Fuld SQL, eksisterende tooling, ingen ORM-ændringer
Reading-tabel som hypertable
-- Schema (SQLModel Python definition):
CREATE TABLE reading (
id UUID NOT NULL DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL, -- Multi-tenant isolation
meter_installation_id UUID NOT NULL,
timestamp TIMESTAMPTZ NOT NULL,
value NUMERIC(12,3) NOT NULL,
unit TEXT NOT NULL, -- 'units', 'kwh', 'm3', etc.
battery_level_pct SMALLINT,
rssi_dbm SMALLINT,
received_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
deleted_at TIMESTAMPTZ,
PRIMARY KEY (id, timestamp)
);
-- Gør til hypertable partitioneret pr. uge:
SELECT create_hypertable('reading', 'timestamp', chunk_time_interval => INTERVAL '1 week');
-- Index på de hyppige query-mønstre:
CREATE INDEX ON reading (meter_installation_id, timestamp DESC);
CREATE INDEX ON reading (tenant_id, timestamp DESC);
Continuous aggregates (pre-aggregering)
Månedlige visninger beregnes løbende af TimescaleDB — ikke ved query-tid:
-- Daglig continuous aggregate:
CREATE MATERIALIZED VIEW reading_daily
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 day', timestamp) AS day,
meter_installation_id,
tenant_id,
MAX(value) AS max_value, -- Akkumuleret tæller → brug MAX
MIN(value) AS min_value, -- Periodens startværdi
MAX(value) - MIN(value) AS delta_value, -- Periodens forbrug
AVG(battery_level_pct) AS avg_battery,
AVG(rssi_dbm) AS avg_rssi,
COUNT(*) AS reading_count
FROM reading
WHERE deleted_at IS NULL
GROUP BY 1, 2, 3
WITH NO DATA;
-- Opdater automatisk med 1 times forsinkelse (tillad sene data):
SELECT add_continuous_aggregate_policy('reading_daily',
start_offset => INTERVAL '3 days',
end_offset => INTERVAL '1 hour',
schedule_interval => INTERVAL '1 hour'
);
-- Månedlig continuous aggregate (bygger på daglig):
CREATE MATERIALIZED VIEW reading_monthly
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 month', day) AS month,
meter_installation_id,
tenant_id,
SUM(delta_value) AS monthly_consumption,
AVG(avg_battery) AS avg_battery,
AVG(avg_rssi) AS avg_rssi
FROM reading_daily
GROUP BY 1, 2, 3
WITH NO DATA;
Query-eksempler
Månedligt forbrug pr. lejlighed (API-endpoint)
# server/src/readings/router.py
async def get_monthly_readings(
property_id: str,
year: int,
session: AsyncSession,
user,
):
stmt = (
select(
ReadingMonthly.month,
ReadingMonthly.meter_installation_id,
ReadingMonthly.monthly_consumption,
)
.join(MeterInstallation, ReadingMonthly.meter_installation_id == MeterInstallation.id)
.join(Unit, MeterInstallation.unit_id == Unit.id)
.where(
Unit.property_id == property_id,
func.date_trunc("year", ReadingMonthly.month) == f"{year}-01-01",
ReadingMonthly.tenant_id == user.tenant_id,
)
.order_by(ReadingMonthly.month, ReadingMonthly.meter_installation_id)
)
result = await session.execute(stmt)
return result.all()
Aflæsningsdækning (seneste 48 timer)
-- Ejendomme med < 80% dækning:
SELECT
p.id AS property_id,
p.name,
COUNT(mi.id) AS total_installations,
COUNT(r.meter_installation_id) AS active_installations,
ROUND(100.0 * COUNT(r.meter_installation_id) / NULLIF(COUNT(mi.id), 0), 1) AS coverage_pct
FROM property p
JOIN unit u ON u.property_id = p.id
JOIN meter_installation mi ON mi.unit_id = u.id AND mi.deleted_at IS NULL
LEFT JOIN LATERAL (
SELECT meter_installation_id
FROM reading
WHERE meter_installation_id = mi.id
AND timestamp > NOW() - INTERVAL '48 hours'
AND deleted_at IS NULL
LIMIT 1
) r ON TRUE
WHERE p.tenant_id = $1 AND p.deleted_at IS NULL
GROUP BY p.id, p.name
HAVING COUNT(mi.id) > 0
ORDER BY coverage_pct ASC;
Komprimering af historiske data
-- Aktiver komprimering på reading-hypertable:
ALTER TABLE reading SET (
timescaledb.compress,
timescaledb.compress_orderby = 'timestamp DESC',
timescaledb.compress_segmentby = 'meter_installation_id'
);
-- Komprimer chunks ældre end 90 dage automatisk:
SELECT add_compression_policy('reading', INTERVAL '90 days');
Typisk komprimeringsrate for wM-Bus data:
Ukomprimeret: 1.000.000 rækker = ~180 MB
Komprimeret: 1.000.000 rækker = ~12 MB (93% komprimering)
Retention policy
Data ældre end 10 år slettes automatisk (GDPR + diskplads):
-- Slet chunks ældre end 10 år:
SELECT add_retention_policy('reading', INTERVAL '10 years');
GDPR-note: Anonymiserede data (tenant_id/meter_installation_id fjernet) kan opbevares længere til aggregeret statistik.
Performance i tal
| Query | Uden TimescaleDB | Med TimescaleDB |
|---|---|---|
| Månedligt forbrug, 1 installation, 2 år | 180 ms | 8 ms |
| Aflæsningsdækning, 500 installationer | 2.400 ms | 45 ms |
| Anomali-scan, alle readings, 30 dage | 8.900 ms | 120 ms |
| Insert, 1000 readings (batch) | 95 ms | 85 ms |
Målt på Hetzner CX21 (2 vCPU, 4GB RAM, 40GB SSD), 5 mio. rækker.
Backup og point-in-time recovery
# Fuld backup via pg_dump (komprimeret):
pg_dump -Fc -Z9 mbus_db > backup-$(date +%Y%m%d).dump
# Inkrementel via WAL-archiving til Hetzner Object Storage:
archive_command = 'aws s3 cp %p s3://mbus-backup/wal/%f --endpoint-url https://s3.hetzner.de'
# Point-in-time recovery til enhver time:
restore_command = 'aws s3 cp s3://mbus-backup/wal/%f %p --endpoint-url https://s3.hetzner.de'
Konklusion
TimescaleDB giver optimal performance til wM-Bus-data: automatisk partitionering på tid, continuous aggregates der gør månedlige queries instant, og 93% komprimering af historiske data. Arkitekturen skalerer fra 10 til 100.000+ installationer uden ændringer.