M-Bus Gateway
← Tilbage til blog
· PostgreSQL· partitioning· IoT· tidsseriedata· TimescaleDB· Python· backend· database

PostgreSQL declarative partitioning til IoT tidsseriedata

PostgreSQL declarative partitioning til IoT: range partitioning på timestamp, partition pruning, automatisk partition-oprettelse, vs TimescaleDB hypertables og query-performance benchmarks.

Af M-Bus Gateway

Declarative partitioning i PostgreSQL giver styr på tidsseriedata uden TimescaleDB-extension. Her er implementeringen og trade-offs.


Range partitioning på timestamp

-- Readings tabel med månedlig range partitioning:

CREATE TABLE readings (
    id          UUID DEFAULT gen_random_uuid(),
    tenant_id   UUID NOT NULL,
    meter_installation_id UUID NOT NULL,
    timestamp   TIMESTAMPTZ NOT NULL,
    value       NUMERIC(12, 4) NOT NULL,
    unit        TEXT NOT NULL,
    received_at TIMESTAMPTZ DEFAULT NOW(),
    deleted_at  TIMESTAMPTZ
)
PARTITION BY RANGE (timestamp);

-- Opret partitioner pr. måned:
CREATE TABLE readings_2025_06
    PARTITION OF readings
    FOR VALUES FROM ('2025-06-01') TO ('2025-07-01');

CREATE TABLE readings_2025_07
    PARTITION OF readings
    FOR VALUES FROM ('2025-07-01') TO ('2025-08-01');

-- Indeks arves IKKE automatisk — opret pr. partition:
CREATE INDEX ON readings_2025_06 (tenant_id, meter_installation_id, timestamp DESC);
CREATE INDEX ON readings_2025_07 (tenant_id, meter_installation_id, timestamp DESC);

-- Primary key skal inkludere partitionering-kolonnen:
ALTER TABLE readings ADD PRIMARY KEY (id, timestamp);

Automatisk partition-oprettelse

# server/src/db/partitions.py
# Celery-task der opretter næste måneds partition automatisk

from datetime import date, timedelta
from dateutil.relativedelta import relativedelta
import asyncpg
import calendar


async def ensure_next_month_partition(conn: asyncpg.Connection) -> bool:
    """
    Opretter partition for næste måned hvis den ikke eksisterer.
    Køres månedligt via Celery beat.
    """
    today = date.today()
    next_month = today + relativedelta(months=1)

    partition_name = f"readings_{next_month.year}_{next_month.month:02d}"
    from_date = next_month.replace(day=1)
    to_date = from_date + relativedelta(months=1)

    # Check om partition allerede eksisterer:
    exists = await conn.fetchval(
        """
        SELECT 1 FROM pg_tables
        WHERE tablename = $1 AND schemaname = 'public'
        """,
        partition_name,
    )
    if exists:
        return False

    # Opret partition:
    await conn.execute(f"""
        CREATE TABLE {partition_name}
            PARTITION OF readings
            FOR VALUES FROM ('{from_date}') TO ('{to_date}')
    """)

    # Opret indekser:
    await conn.execute(f"""
        CREATE INDEX ON {partition_name}
            (tenant_id, meter_installation_id, timestamp DESC)
    """)
    await conn.execute(f"""
        CREATE INDEX ON {partition_name} (timestamp DESC)
        WHERE deleted_at IS NULL
    """)

    log.info(
        "partition.created",
        partition=partition_name,
        from_date=str(from_date),
        to_date=str(to_date),
    )
    return True


@shared_task(name="db.ensure_partitions")
def ensure_partitions_task():
    import asyncio
    asyncio.run(_ensure_all_needed_partitions())

Partition pruning — query performance

-- PostgreSQL bruger partition pruning automatisk:

-- Hent aflæsninger for en måned (bruger KUN readings_2025_06):
EXPLAIN SELECT * FROM readings
WHERE timestamp >= '2025-06-01'
  AND timestamp < '2025-07-01'
  AND tenant_id = '...'
  AND meter_installation_id = '...';

-- Output: Seq Scan on readings_2025_06 (Partitions: 1/24)
-- Resten af partitionerne er pruned

-- Kryds-partition query (lidt langsommere):
SELECT date_trunc('month', timestamp), sum(value)
FROM readings
WHERE meter_installation_id = '...'
  AND timestamp >= '2025-01-01'
  AND timestamp < '2026-01-01'
GROUP BY 1;
-- PostgreSQL scanner 12 partitioner, pruner resten

-- Partition pruning kræver literal eller parameter:
-- VIRKER: WHERE timestamp >= '2025-06-01'
-- VIRKER: WHERE timestamp >= $1  (parameteriseret)
-- VIRKER IKKE: WHERE timestamp >= NOW() - INTERVAL '1 year'  (ved plan-tid)
-- Brug enable_partition_pruning = on (default)

Partition drop og retention

# server/src/db/partitions.py

async def drop_old_partitions(
    conn: asyncpg.Connection,
    retention_years: int = 7,
) -> list[str]:
    """
    Drop partitioner ældre end retention_years.
    Lovkrav: 5 år minimum (bogføringsloven).
    Platform bruger 7 år for sikkerhedsmargin.
    """
    cutoff = date.today() - relativedelta(years=retention_years)

    # Find alle readings-partitioner:
    partitions = await conn.fetch("""
        SELECT tablename
        FROM pg_tables
        WHERE tablename LIKE 'readings_____\\___'
          AND schemaname = 'public'
        ORDER BY tablename
    """)

    dropped = []
    for row in partitions:
        name = row["tablename"]
        # Parse readings_2018_05 → 2018-05
        parts = name.split("_")
        if len(parts) != 3:
            continue
        try:
            year, month = int(parts[1]), int(parts[2])
            partition_date = date(year, month, 1)
        except ValueError:
            continue

        if partition_date < cutoff:
            # Soft-check: Eksisterer data vi ikke burde slette?
            count = await conn.fetchval(
                f"SELECT COUNT(*) FROM {name} LIMIT 1"
            )
            if count:
                log.warning(
                    "partition.skip_drop_has_data",
                    partition=name,
                    cutoff=str(cutoff),
                )
                continue

            await conn.execute(f"DROP TABLE {name}")
            dropped.append(name)
            log.info("partition.dropped", partition=name)

    return dropped

PostgreSQL partitioning vs. TimescaleDB

Feature                 PostgreSQL Native    TimescaleDB
────────────────────────────────────────────────────────────
Partitioner             Manuel/auto script   Automatisk (chunk_interval)
Komprimering            Ikke native          90%+ komprimering
Continuous aggregates   Ikke native          Built-in
Query optimizer         Standard pruning     Chunk-aware optimizer
Background jobs         Celery/cron          Policy-baseret intern
Extension krævet        Nej                  Ja (TimescaleDB)
SQL-kompatibilitet      100% standard SQL    Standard SQL (næsten)
Rækkevidde              Ubegrænset           Ubegrænset
Oprettelsestid          Migrering-script     CREATE TABLE ... USING

Hvornår native partitioning er bedre:
  → Managed PostgreSQL (ingen extension-support)
  → Overholdelse af standard-SQL krav
  → Data der ikke er tidsserier (fx audit_log pr. år)
  → Enkle retention-politikker

Hvornår TimescaleDB er bedre:
  → IoT tidsseriedata med millions of rows
  → Continuous aggregates (forudberegnede summaries)
  → Komprimering er kritisk (disk-pris)
  → Kompleks time-bucket analyse

SQLModel med partitionerede tabeller

# server/src/db/models.py

class Reading(SQLModel, table=True):
    """
    Readings er partitioneret på timestamp.
    SQLModel/SQLAlchemy understøtter dette transparent.
    """
    __tablename__ = "readings"

    # Primary key SKAL inkludere partition-kolonne ved PostgreSQL native:
    id: uuid.UUID = Field(default_factory=uuid.uuid4)
    timestamp: datetime = Field(
        sa_column=Column(DateTime(timezone=True), primary_key=True)
    )

    tenant_id: uuid.UUID = Field(nullable=False)
    meter_installation_id: uuid.UUID = Field(nullable=False)
    value: Decimal = Field(sa_column=Column(Numeric(12, 4)))
    unit: str
    received_at: datetime = Field(default_factory=datetime.utcnow)
    deleted_at: datetime | None = None

    # SQLAlchemy query fungerer transparent på tværs af partitioner:
    # session.execute(select(Reading).where(Reading.timestamp >= start))
    # → PostgreSQL pruner automatisk irrelevante partitioner

Konklusion

PostgreSQL declarative range partitioning giver styr på tidsseriedata uden TimescaleDB-extension — idéelt til managed PostgreSQL-instanser eller strenge SQL-kompatibilitetskrav. Månedlige partitioner med automatisk oprettelse via Celery-task og retention-drop efter 7 år (bogføringsloven) giver fuld livscyklus-styring. Partition pruning reducer query-scope automatisk ved timestamp-filtrering. For IoT-platforme med heavy compression-behov og continuous aggregates er TimescaleDB stadig det stærkere valg.

Se TimescaleDB hypertable guide eller asyncpg guide.