M-Bus Gateway
← Tilbage til blog
· PostgreSQL· partitionering· IoT· TimescaleDB· performance· database· Python· asyncpg

PostgreSQL table partitionering til IoT — range og list partitioning

PostgreSQL table partitioning til IoT-data: range partitioning pr. måned, list partitioning pr. tenant, performance fordele, maintenance og TimescaleDB sammenligning.

Af M-Bus Gateway

PostgreSQL native partitionering komplementerer TimescaleDB hypertables. Her er hvornår og hvordan man bruger det.


Native partitionering vs TimescaleDB

TimescaleDB hypertables:
  → Automatisk time-baseret partitionering (chunks)
  → Komprimering og continuous aggregates
  → Bruges til: readings (hypertable)

PostgreSQL native partitionering:
  → Manuel opsætning, fuld SQL-kontrol
  → Bruges til: audit_log, large tenant tables
  → Fordel: Simpelt, ingen extension krævet

Tommelfingerregel:
  Tidsseriedata → TimescaleDB hypertable
  Store ikke-tidsserie tabeller → PostgreSQL range/list partitioning

Range partitioning (tid)

-- audit_log partitioneret pr. kvartal

CREATE TABLE audit_log (
    id          UUID NOT NULL DEFAULT gen_random_uuid(),
    tenant_id   UUID NOT NULL,
    action      TEXT NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    before      JSONB,
    after       JSONB
) PARTITION BY RANGE (created_at);

-- Opret partitioner manuelt (eller via script):
CREATE TABLE audit_log_2025_q1 PARTITION OF audit_log
    FOR VALUES FROM ('2025-01-01') TO ('2025-04-01');

CREATE TABLE audit_log_2025_q2 PARTITION OF audit_log
    FOR VALUES FROM ('2025-04-01') TO ('2025-07-01');

CREATE TABLE audit_log_2025_q3 PARTITION OF audit_log
    FOR VALUES FROM ('2025-07-01') TO ('2025-10-01');

CREATE TABLE audit_log_2025_q4 PARTITION OF audit_log
    FOR VALUES FROM ('2025-10-01') TO ('2026-01-01');

-- Indeks oprettes pr. partition automatisk:
CREATE INDEX ON audit_log (tenant_id, created_at);

-- PostgreSQL bruger kun den relevante partition (partition pruning):
-- SELECT * FROM audit_log WHERE created_at >= '2025-07-01'
-- → Scaner KUN audit_log_2025_q3 og Q4

Automatisk partition oprettelse

# server/src/workers/tasks/maintenance.py

import asyncpg
from datetime import datetime, date

async def ensure_partition_exists(
    conn: asyncpg.Connection,
    table: str,
    for_date: date,
) -> None:
    """Opret kvartal-partition hvis den ikke eksisterer."""
    q = (for_date.month - 1) // 3 + 1  # Kvartal 1-4
    year = for_date.year
    start = date(year, (q - 1) * 3 + 1, 1)
    if q < 4:
        end = date(year, q * 3 + 1, 1)
    else:
        end = date(year + 1, 1, 1)

    partition_name = f"{table}_{year}_q{q}"

    await conn.execute(f"""
        CREATE TABLE IF NOT EXISTS {partition_name}
        PARTITION OF {table}
        FOR VALUES FROM ('{start}') TO ('{end}')
    """)

# Celery beat: Opret næste kvartal 1 måned inden start:
@celery.task
def create_next_quarter_partitions() -> None:
    """Kør månedligt — sikrer partitioner eksisterer fremad."""
    import asyncio
    asyncio.run(_create_partitions())

async def _create_partitions() -> None:
    from datetime import timedelta
    next_period = date.today() + timedelta(days=90)
    async with asyncpg.connect(settings.DATABASE_URL) as conn:
        for table in ["audit_log", "meter_events"]:
            await ensure_partition_exists(conn, table, next_period)

List partitioning (multi-tenant)

-- Stor tabel partitioneret pr. tenant (eksempel — normalt for store porteføljer)
-- Bruges KUN hvis tenant-data er meget asymmetrisk (én stor tenant)

CREATE TABLE property_documents (
    id          UUID NOT NULL DEFAULT gen_random_uuid(),
    tenant_id   UUID NOT NULL,
    property_id UUID NOT NULL,
    filename    TEXT NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    s3_key      TEXT NOT NULL
) PARTITION BY LIST (tenant_id::text);

-- Default partition fanger alle andre:
CREATE TABLE property_documents_default
    PARTITION OF property_documents DEFAULT;

-- Stor tenant får dedikeret partition:
CREATE TABLE property_documents_bigcorp
    PARTITION OF property_documents
    FOR VALUES IN ('550e8400-e29b-41d4-a716-446655440000');

-- Fordel: Queries for BigCorp scaner KUN deres partition
-- SELECT * FROM property_documents WHERE tenant_id = 'BigCorp...'
-- → Fuld partition pruning

Partition maintenance

-- Drop gamle partitioner (GDPR/retention):
-- Data ældre end 7 år slettes

-- Sikkert drop (verificér indhold inden):
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))
FROM pg_tables
WHERE tablename LIKE 'audit_log_2018%';

-- Drop partition (øjeblikkelig — ingen vacuum nødvendig):
DROP TABLE IF EXISTS audit_log_2018_q1;

-- Alternativt: Detach og arkivér:
ALTER TABLE audit_log DETACH PARTITION audit_log_2018_q1;
-- Nu kan audit_log_2018_q1 dumpes til S3 og droppes:
-- pg_dump -t audit_log_2018_q1 > archive_2018_q1.sql
DROP TABLE audit_log_2018_q1;

Performance eksempel

Audit log: 500 million rows (7 år historik)

Uden partitionering:
  SELECT COUNT(*) WHERE tenant_id='...' AND created_at > '2025-01-01'
  → Sequential scan: 8,2 sekunder

Med kvartal-partitionering:
  → PostgreSQL pruner til 3 partitioner (2025 Q1-Q3)
  → Parallel scan af 3 partitioner: 0,4 sekunder (20× speedup)

Tilgang for M-Bus Gateway:
  audit_log: Kvartal-partitionering (7-årig retention policy)
  readings:  TimescaleDB hypertable (bedre til tidsseriedata)
  documents: Default (lille tabel — ingen partitionering)

SQLModel integration

# server/src/db/models.py

from sqlmodel import SQLModel, Field
from datetime import datetime, timezone
from uuid import UUID, uuid4

class AuditLog(SQLModel, table=True):
    """
    Partitioneret tabel — SQLModel ser den som én tabel.
    PostgreSQL håndterer partition-routing transparant.
    """
    __tablename__ = "audit_log"

    id: UUID = Field(default_factory=uuid4, primary_key=True)
    tenant_id: UUID
    action: str
    created_at: datetime = Field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    before: dict | None = Field(default=None, sa_column_kwargs={"type_": "jsonb"})
    after: dict | None = Field(default=None, sa_column_kwargs={"type_": "jsonb"})

    # NOTE: created_at skal altid sættes eksplicit ved INSERT
    # ellers kan data ende i wrong partition (edge case ved partition grænse)

Konklusion

PostgreSQL native range partitionering reducerer query-tid med 20× for store historiske tabeller. Kvartal-partitioner for audit_log giver hurtig retention-håndtering (DROP TABLE er øjeblikkelig vs. DELETE som tager timer). Opret partitioner automatisk via Celery beat 90 dage frem. TimescaleDB er stadig overlegen til tidsseriedata — brug begge.

Se TimescaleDB hypertable guide eller asyncpg guide.