Sharding DB with multiple tables

Database Partitioning Discussion

Question:

When we talk about database partitioning - is it every table got its own partition?

Answer:

Not necessarily. In database partitioning, not every table needs to be partitioned, and the granularity of partitioning depends on the database design, performance needs, and access patterns.

Types of Partitioning:

  1. Horizontal Partitioning (Sharding) – Rows of a table are divided across multiple partitions based on some criteria (e.g., customer ID, date range).

  2. Vertical Partitioning – Columns of a table are split into different partitions (e.g., frequently accessed columns vs. rarely accessed columns).

  3. Range Partitioning – Data is split into partitions based on a value range (e.g., partitioning by year).

  4. Hash Partitioning – A hash function is applied to determine which partition a row belongs to.

  5. List Partitioning – Data is partitioned based on a predefined list of values.

Question:

I mean if I am doing DB partitioning - and my DB got two tables. Do both tables get partitioned?

Answer:

No, if you are partitioning your database, not all tables need to be partitioned by default. Partitioning is typically applied to large tables where performance, scalability, and manageability benefits outweigh the complexity it introduces.

Key Considerations:

- You can partition only one table if that’s the bottleneck.

- If two tables are frequently joined on a partitioned column, you might want to partition both similarly (partition alignment).

- A large transaction table (e.g., Orders) may be partitioned by date.

- A small reference table (e.g., Countries) typically does not need partitioning.

Example:

1. Orders table (millions of rows, growing fast)

2. Users table (few thousand rows, relatively static)

- Orders: Partitioned by order_date (Range partitioning).

- Users: No need for partitioning as it's small.

Question:

So if Orders get partitioned into multiple shards - will every shard have its own copy of Users table too?

Answer:

Not necessarily. When partitioning (or sharding) a large table like Orders, the handling of the Users table depends on the chosen sharding strategy:

Possible Approaches for the Users Table in a Sharded System:

1. Replicated (Denormalized) Users Table (Common for Reference Data)

2. Unsharded (Single Centralized Users Table)

3. Co-located (Sharded Alongside Orders by User ID)

Which Approach to Choose?