· PostgreSQL· partitionering· IoT· TimescaleDB· performance· database· Python· asyncpg
PostgreSQL table partitionering til IoT — range og list partitioning
PostgreSQL table partitioning til IoT-data: range partitioning pr. måned, list partitioning pr. tenant, performance fordele, maintenance og TimescaleDB sammenligning.
Af M-Bus Gateway
PostgreSQL native partitionering komplementerer TimescaleDB hypertables. Her er hvornår og hvordan man bruger det.
Native partitionering vs TimescaleDB
TimescaleDB hypertables:
→ Automatisk time-baseret partitionering (chunks)
→ Komprimering og continuous aggregates
→ Bruges til: readings (hypertable)
PostgreSQL native partitionering:
→ Manuel opsætning, fuld SQL-kontrol
→ Bruges til: audit_log, large tenant tables
→ Fordel: Simpelt, ingen extension krævet
Tommelfingerregel:
Tidsseriedata → TimescaleDB hypertable
Store ikke-tidsserie tabeller → PostgreSQL range/list partitioning
Range partitioning (tid)
-- audit_log partitioneret pr. kvartal
CREATE TABLE audit_log (
id UUID NOT NULL DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
action TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
before JSONB,
after JSONB
) PARTITION BY RANGE (created_at);
-- Opret partitioner manuelt (eller via script):
CREATE TABLE audit_log_2025_q1 PARTITION OF audit_log
FOR VALUES FROM ('2025-01-01') TO ('2025-04-01');
CREATE TABLE audit_log_2025_q2 PARTITION OF audit_log
FOR VALUES FROM ('2025-04-01') TO ('2025-07-01');
CREATE TABLE audit_log_2025_q3 PARTITION OF audit_log
FOR VALUES FROM ('2025-07-01') TO ('2025-10-01');
CREATE TABLE audit_log_2025_q4 PARTITION OF audit_log
FOR VALUES FROM ('2025-10-01') TO ('2026-01-01');
-- Indeks oprettes pr. partition automatisk:
CREATE INDEX ON audit_log (tenant_id, created_at);
-- PostgreSQL bruger kun den relevante partition (partition pruning):
-- SELECT * FROM audit_log WHERE created_at >= '2025-07-01'
-- → Scaner KUN audit_log_2025_q3 og Q4
Automatisk partition oprettelse
# server/src/workers/tasks/maintenance.py
import asyncpg
from datetime import datetime, date
async def ensure_partition_exists(
conn: asyncpg.Connection,
table: str,
for_date: date,
) -> None:
"""Opret kvartal-partition hvis den ikke eksisterer."""
q = (for_date.month - 1) // 3 + 1 # Kvartal 1-4
year = for_date.year
start = date(year, (q - 1) * 3 + 1, 1)
if q < 4:
end = date(year, q * 3 + 1, 1)
else:
end = date(year + 1, 1, 1)
partition_name = f"{table}_{year}_q{q}"
await conn.execute(f"""
CREATE TABLE IF NOT EXISTS {partition_name}
PARTITION OF {table}
FOR VALUES FROM ('{start}') TO ('{end}')
""")
# Celery beat: Opret næste kvartal 1 måned inden start:
@celery.task
def create_next_quarter_partitions() -> None:
"""Kør månedligt — sikrer partitioner eksisterer fremad."""
import asyncio
asyncio.run(_create_partitions())
async def _create_partitions() -> None:
from datetime import timedelta
next_period = date.today() + timedelta(days=90)
async with asyncpg.connect(settings.DATABASE_URL) as conn:
for table in ["audit_log", "meter_events"]:
await ensure_partition_exists(conn, table, next_period)
List partitioning (multi-tenant)
-- Stor tabel partitioneret pr. tenant (eksempel — normalt for store porteføljer)
-- Bruges KUN hvis tenant-data er meget asymmetrisk (én stor tenant)
CREATE TABLE property_documents (
id UUID NOT NULL DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
property_id UUID NOT NULL,
filename TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
s3_key TEXT NOT NULL
) PARTITION BY LIST (tenant_id::text);
-- Default partition fanger alle andre:
CREATE TABLE property_documents_default
PARTITION OF property_documents DEFAULT;
-- Stor tenant får dedikeret partition:
CREATE TABLE property_documents_bigcorp
PARTITION OF property_documents
FOR VALUES IN ('550e8400-e29b-41d4-a716-446655440000');
-- Fordel: Queries for BigCorp scaner KUN deres partition
-- SELECT * FROM property_documents WHERE tenant_id = 'BigCorp...'
-- → Fuld partition pruning
Partition maintenance
-- Drop gamle partitioner (GDPR/retention):
-- Data ældre end 7 år slettes
-- Sikkert drop (verificér indhold inden):
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))
FROM pg_tables
WHERE tablename LIKE 'audit_log_2018%';
-- Drop partition (øjeblikkelig — ingen vacuum nødvendig):
DROP TABLE IF EXISTS audit_log_2018_q1;
-- Alternativt: Detach og arkivér:
ALTER TABLE audit_log DETACH PARTITION audit_log_2018_q1;
-- Nu kan audit_log_2018_q1 dumpes til S3 og droppes:
-- pg_dump -t audit_log_2018_q1 > archive_2018_q1.sql
DROP TABLE audit_log_2018_q1;
Performance eksempel
Audit log: 500 million rows (7 år historik)
Uden partitionering:
SELECT COUNT(*) WHERE tenant_id='...' AND created_at > '2025-01-01'
→ Sequential scan: 8,2 sekunder
Med kvartal-partitionering:
→ PostgreSQL pruner til 3 partitioner (2025 Q1-Q3)
→ Parallel scan af 3 partitioner: 0,4 sekunder (20× speedup)
Tilgang for M-Bus Gateway:
audit_log: Kvartal-partitionering (7-årig retention policy)
readings: TimescaleDB hypertable (bedre til tidsseriedata)
documents: Default (lille tabel — ingen partitionering)
SQLModel integration
# server/src/db/models.py
from sqlmodel import SQLModel, Field
from datetime import datetime, timezone
from uuid import UUID, uuid4
class AuditLog(SQLModel, table=True):
"""
Partitioneret tabel — SQLModel ser den som én tabel.
PostgreSQL håndterer partition-routing transparant.
"""
__tablename__ = "audit_log"
id: UUID = Field(default_factory=uuid4, primary_key=True)
tenant_id: UUID
action: str
created_at: datetime = Field(
default_factory=lambda: datetime.now(timezone.utc)
)
before: dict | None = Field(default=None, sa_column_kwargs={"type_": "jsonb"})
after: dict | None = Field(default=None, sa_column_kwargs={"type_": "jsonb"})
# NOTE: created_at skal altid sættes eksplicit ved INSERT
# ellers kan data ende i wrong partition (edge case ved partition grænse)
Konklusion
PostgreSQL native range partitionering reducerer query-tid med 20× for store historiske tabeller. Kvartal-partitioner for audit_log giver hurtig retention-håndtering (DROP TABLE er øjeblikkelig vs. DELETE som tager timer). Opret partitioner automatisk via Celery beat 90 dage frem. TimescaleDB er stadig overlegen til tidsseriedata — brug begge.