Data Modeling Guidelines
Logical Model
Align the model with domain language (DDD): aggregates, entities, value objects.
An aggregate is a consistency boundary — enforce invariants within it, never across aggregates.
Value objects are immutable and identified by their value, not an ID.
Entities have identity that persists across state changes.
Model behavior, not just data: place domain logic inside aggregates, not in services.
Aggregates are pure domain objects — no I/O, no persistence, no external calls. Repositories (infra layer) handle load/save.
Avoid anemic models (plain data bags with no behavior).
Name types, fields, and relations using the ubiquitous language of the domain — no technical abbreviations.
Physical Model
Relational (SQL)
Every table has a surrogate primary key (id); use UUIDs for distributed systems, auto-increment otherwise.
Name tables as singular nouns (order, customer), columns as snake_case.
Add a created_at / updated_at timestamp to every mutable table.
Normalize to 3NF by default; denormalize only when read performance requires it and you can justify it.
Declare foreign keys and let the DB enforce referential integrity.
Index every foreign key and every column used in WHERE, ORDER BY, or JOIN predicates.
Avoid NULL where possible — NULL means "unknown", not "empty" or "zero".
Never store multiple values in a single column (no comma-separated lists).
Migrations are forward-only; never modify existing migration files.
Document Stores
Embed data that is always read together and only written by one aggregate.
Reference (by ID) data that is shared across aggregates or grows unbounded.
Design documents around read patterns, not write patterns.
Keep documents small — avoid unbounded arrays inside a document.
Include a schema_version field for forward-compatible migrations.
Key/Value Stores
Design keys to be self-describing and hierarchical: <namespace>:<entity>:<id> (e.g., session:user:42).
Set a TTL for all ephemeral data (sessions, caches, rate-limit counters).
Never store relational data in a key/value store; use it for caching, sessions, and simple counters.
Document the key schema in the codebase alongside the access code.
Directives
Align model with DDD: aggregates, entities, value objects; aggregates are pure domain objects (no I/O, no persistence)
Place domain logic inside aggregates; avoid anemic models; use ubiquitous language for all names
SQL: surrogate PK (UUID for distributed); created_at/updated_at on every mutable table; declare foreign keys; index FK and WHERE/ORDER BY columns; forward-only migrations
Never store multiple values in one column; avoid NULL where possible
Document stores: embed data read together; include schema_version; design documents around read patterns
Key/value: keys are <namespace>:<entity>:<id>; always set TTL for ephemeral data
Back to top