Snowflake ID Generator

A complete guide to the Snowflake ID Generator for distributed unique ID generation in MStore Backend.

🎯 Overview

Snowflake ID is a distributed ID generation algorithm that produces 64-bit unique IDs with the following characteristics:
  • Globally Unique - Unique across the entire system, provided each region/worker pair is used by only one instance
  • Time-Ordered - Sortable by creation time
  • High Performance - Up to 4096 IDs per millisecond per worker
  • Decentralized - No coordination needed while generating IDs
Package: pkg/utils/snowflake/snowflake.go

📊 ID Structure

64-bit Breakdown

┌─────────────────────────────────────────────────────────────────┐
│ 1 bit │    41 bits     │  5 bits │  5 bits │     12 bits       │
│ Sign  │   Timestamp    │ Region  │ Worker  │    Sequence       │
└─────────────────────────────────────────────────────────────────┘
   0          Milliseconds    0-31     0-31        0-4095
Components:
  • Sign bit (1 bit): Always 0 (positive number)
  • Timestamp (41 bits): Milliseconds since epoch (2020-01-01)
  • Region ID (5 bits): Data center/region (0-31)
  • Worker ID (5 bits): Machine/worker (0-31)
  • Sequence (12 bits): Counter per millisecond (0-4095)
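
The layout above can be sketched with plain shifts and masks. This is a minimal illustration derived only from the bit widths listed here; `compose` and `decompose` are hypothetical helper names and are not claimed to match the functions in `pkg/utils/snowflake`.

```go
package main

import "fmt"

// Bit widths come from the layout above; shifts and masks follow from them.
const (
	sequenceBits = 12
	workerBits   = 5
	regionBits   = 5

	workerShift    = sequenceBits                           // 12
	regionShift    = sequenceBits + workerBits              // 17
	timestampShift = sequenceBits + workerBits + regionBits // 22

	maxSequence = 1<<sequenceBits - 1 // 4095
	maxWorker   = 1<<workerBits - 1   // 31
	maxRegion   = 1<<regionBits - 1   // 31
)

// compose packs the four fields into one 64-bit ID (hypothetical helper).
// ms is milliseconds since the custom epoch and must fit in 41 bits.
func compose(ms, region, worker, seq uint64) uint64 {
	return ms<<timestampShift | region<<regionShift | worker<<workerShift | seq
}

// decompose reverses compose by shifting and masking each field.
func decompose(id uint64) (ms, region, worker, seq uint64) {
	ms = id >> timestampShift
	region = (id >> regionShift) & maxRegion
	worker = (id >> workerShift) & maxWorker
	seq = id & maxSequence
	return
}

func main() {
	id := compose(1000, 3, 7, 42)
	fmt.Println(id)            // 4194725930
	fmt.Println(decompose(id)) // 1000 3 7 42
}
```

Because the timestamp occupies the highest bits below the sign bit, comparing two IDs numerically compares their timestamps first, which is what makes the IDs sortable by creation time.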

Capacity

Component | Bits | Max Value | Description
Timestamp | 41 | ~69 years | From 2020-01-01 to 2089
Region | 5 | 31 | 32 regions/data centers
Worker | 5 | 31 | 32 workers per region
Sequence | 12 | 4095 | 4096 IDs per ms per worker
Total Capacity: 1024 workers × 4096 IDs/ms = ~4.2 million IDs per millisecond (~4.2 billion IDs/second)
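
The capacity figures follow directly from the bit widths. The sketch below recomputes them; the function names are illustrative only, and the lifespan assumes 365.25-day years.

```go
package main

import "fmt"

const (
	regionBits    = 5
	workerBits    = 5
	sequenceBits  = 12
	timestampBits = 41
)

// capacity derives the cluster-wide figures from the bit widths.
func capacity() (workers, idsPerMs, idsPerSecond uint64) {
	workers = 1 << (regionBits + workerBits) // 32 regions × 32 workers = 1024
	idsPerMs = workers << sequenceBits       // 1024 × 4096 = 4,194,304
	idsPerSecond = idsPerMs * 1000           // 1000 milliseconds per second
	return
}

// lifespanYears converts 2^41 milliseconds into years.
func lifespanYears() float64 {
	const msPerYear = 365.25 * 24 * 60 * 60 * 1000
	return float64(uint64(1)<<timestampBits) / msPerYear
}

func main() {
	workers, perMs, perSec := capacity()
	fmt.Println(workers, perMs, perSec)         // 1024 4194304 4194304000
	fmt.Printf("%.1f years\n", lifespanYears()) // 69.7 years
}
```

Note that the per-second figure is a theoretical ceiling that assumes every one of the 1024 region/worker slots is in use and saturated.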

🚀 Quick Start

1. Initialize Generator

package main

import (
    "fmt"
    "gitlab.com/mushola-store/mstore_backend/pkg/utils/snowflake"
)

func main() {
    // Create generator
    // region=0, worker=1
    gen, err := snowflake.New(0, 1)
    if err != nil {
        panic(err)
    }
    
    // Generate ID
    id := gen.Next()
    fmt.Printf("Generated ID: %d\n", id)
    // Output: Generated ID: 1234567890123456789
}

2. Environment-Based Initialization

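package main

import (
    "fmt"
    "os"
    "strconv"

    "gitlab.com/mushola-store/mstore_backend/pkg/utils/snowflake"
)

// envID reads a 5-bit ID (0-31) from the environment.
// It returns def when the variable is unset or empty.
func envID(key string, def uint32) (uint32, error) {
    raw := os.Getenv(key)
    if raw == "" {
        return def, nil
    }
    v, err := strconv.ParseUint(raw, 10, 32)
    if err != nil || v > 31 {
        return 0, fmt.Errorf("%s must be an integer in 0-31, got %q", key, raw)
    }
    return uint32(v), nil
}

func initSnowflake() (*snowflake.Generator, error) {
    // Defaults: region=0, worker=1
    region, err := envID("REGION_ID", 0)
    if err != nil {
        return nil, err
    }
    worker, err := envID("WORKER_ID", 1)
    if err != nil {
        return nil, err
    }
    return snowflake.New(region, worker)
}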

📝 Usage Examples

Generate Single ID

gen, _ := snowflake.New(0, 1)

id := gen.Next()
fmt.Printf("ID: %d\n", id)
// Output: ID: 1234567890123456789

Generate Multiple IDs

gen, _ := snowflake.New(0, 1)

for i := 0; i < 10; i++ {
    id := gen.Next()
    fmt.Printf("ID %d: %d\n", i+1, id)
}

// Output:
// ID 1: 1234567890123456789
// ID 2: 1234567890123456790
// ID 3: 1234567890123456791
// ...

Concurrent Generation

gen, _ := snowflake.New(0, 1)

// Safe for concurrent use
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done()
        id := gen.Next()
        fmt.Printf("Generated: %d\n", id)
    }()
}
wg.Wait()

Parse ID Components

gen, _ := snowflake.New(0, 1)
id := gen.Next()

// Parse components
timestamp, region, worker, seq := snowflake.Parse(id)

fmt.Printf("Timestamp: %d\n", timestamp)
fmt.Printf("Region: %d\n", region)
fmt.Printf("Worker: %d\n", worker)
fmt.Printf("Sequence: %d\n", seq)

// Convert timestamp to time
t := snowflake.ToTime(timestamp)
fmt.Printf("Created at: %s\n", t.Format(time.RFC3339))

🎨 Use Cases

1. Transaction IDs

package transaction_service

type TransactionService struct {
    snowflake *snowflake.Generator
}

func (s *TransactionService) CreateTransaction(
    ctx context.Context,
    payload PayloadCreateTransaction,
) (*Transaction, error) {
    // Generate unique transaction ID
    txID := s.snowflake.Next()
    
    transaction := &Transaction{
        ID: txID,
        TransactionCode: fmt.Sprintf("TRX-%d", txID),
        // ... other fields
    }
    
    return transaction, s.repo.Create(ctx, transaction)
}

2. Order Numbers

func (s *OrderService) CreateOrder(
    ctx context.Context,
    payload PayloadCreateOrder,
) (*Order, error) {
    orderID := s.snowflake.Next()
    
    order := &Order{
        ID: orderID,
        OrderNumber: fmt.Sprintf("ORD-%d", orderID),
        CreatedAt: time.Now(),
    }
    
    return order, s.repo.Create(ctx, order)
}

3. Distributed Event IDs

func (s *EventService) PublishEvent(
    ctx context.Context,
    eventType string,
    payload interface{},
) error {
    eventID := s.snowflake.Next()
    
    event := &Event{
        ID: eventID,
        EventType: eventType,
        Payload: payload,
        Timestamp: time.Now(),
    }
    
    return s.messageQueue.Publish(ctx, event)
}

4. Idempotency Keys

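// The idempotency key must be generated once per payment attempt (for example
// with s.snowflake.Next() when the payment is created) and reused on every
// retry. A key generated inside ProcessPayment would be new on each call, so
// the duplicate check below could never match.
func (s *PaymentService) ProcessPayment(
    ctx context.Context,
    payment Payment,
    idempotencyKey uint64,
) error {
    // Check if already processed
    if exists := s.cache.Exists(ctx, idempotencyKey); exists {
        return ErrDuplicatePayment
    }
    
    // Process payment
    result, err := s.gateway.Charge(ctx, payment, idempotencyKey)
    if err != nil {
        return err
    }
    
    // Cache result
    s.cache.Set(ctx, idempotencyKey, result, 24*time.Hour)
    return nil
}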

🔧 Configuration

Region & Worker Assignment

Strategy 1: Environment Variables
# Docker Compose
services:
  api-1:
    environment:
      REGION_ID: 0
      WORKER_ID: 1
  
  api-2:
    environment:
      REGION_ID: 0
      WORKER_ID: 2
Strategy 2: Kubernetes Labels
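# Pods created by a Deployment all share the template's labels, so a label
# cannot give each replica a distinct WORKER_ID. A StatefulSet sets the
# apps.kubernetes.io/pod-index label per pod (Kubernetes 1.28+).
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: pos-api
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: api
        env:
        - name: REGION_ID
          value: "0"
        - name: WORKER_ID
          valueFrom:
            fieldRef:
              fieldPath: metadata.labels['apps.kubernetes.io/pod-index']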
Strategy 3: Auto-Assignment
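func assignWorkerID() uint32 {
    // Hash the hostname into the 5-bit worker range.
    // Note: two hostnames can hash to the same value, so this does not
    // guarantee unique worker IDs on its own.
    hostname, err := os.Hostname()
    if err != nil {
        panic(err) // without a hostname, every instance would get the same ID
    }
    hash := fnv.New32a()
    hash.Write([]byte(hostname))
    return hash.Sum32() % 32 // 0-31
}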

gen, _ := snowflake.New(0, assignWorkerID())
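
Since a hash modulo 32 has only 32 possible outputs, it can be worth checking a known list of hostnames for clashes before relying on auto-assignment. The sketch below is one way to do that; `workerIDFor` and `findCollisions` are hypothetical helpers, and the hostnames are made up.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// workerIDFor maps a hostname to 0-31 with FNV-1a, mirroring the snippet above.
func workerIDFor(hostname string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(hostname)) // Write on a hash.Hash never returns an error
	return h.Sum32() % 32
}

// findCollisions groups hostnames by worker ID and keeps only the IDs
// that are claimed by more than one hostname.
func findCollisions(hostnames []string) map[uint32][]string {
	byID := make(map[uint32][]string)
	for _, h := range hostnames {
		id := workerIDFor(h)
		byID[id] = append(byID[id], h)
	}
	for id, hosts := range byID {
		if len(hosts) < 2 {
			delete(byID, id) // deleting during range is safe in Go
		}
	}
	return byID
}

func main() {
	hosts := []string{"api-1", "api-2", "api-3"} // example hostnames
	for id, clash := range findCollisions(hosts) {
		fmt.Printf("worker ID %d is shared by %v\n", id, clash)
	}
}
```

With more than 32 hosts in one region a collision is guaranteed, and with fewer it is still likely enough that an explicit assignment or registry is the safer choice.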

Custom Epoch

Default epoch: 2020-01-01 00:00:00 UTC
// Change epoch (in snowflake.go)
var epoch = time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
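Why custom epoch?
  • Extends ID lifespan (41 bits = ~69 years from epoch)
  • Smaller IDs if epoch is recent
Set the epoch before the first ID is issued and never change it afterward: moving it later makes new IDs smaller than existing ones, which breaks time ordering and can reproduce IDs that were already issued.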

📊 Performance

Benchmarks

BenchmarkNext-8         20000000        50 ns/op        0 B/op      0 allocs/op
BenchmarkParse-8        100000000       10 ns/op        0 B/op      0 allocs/op
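Throughput (the sustained rate is capped by the 12-bit sequence at 4096 IDs per millisecond per worker, whatever the per-call cost):
  • Single worker: ~4.1 million IDs/second
  • 4 workers: ~16.4 million IDs/second
  • 32 workers: ~131 million IDs/second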

Comparison

Method | Performance | Uniqueness | Sortable | Distributed
Snowflake | ⭐⭐⭐⭐⭐ | ✅ Global | ✅ Yes | ✅ Yes
UUID v4 | ⭐⭐⭐⭐ | ✅ Global | ❌ No | ✅ Yes
Auto Increment | ⭐⭐⭐⭐⭐ | ✅ Local | ✅ Yes | ❌ No
ULID | ⭐⭐⭐⭐ | ✅ Global | ✅ Yes | ✅ Yes

🎯 Best Practices

1. Singleton Pattern

// ✅ GOOD - Single instance per application
var globalSnowflake *snowflake.Generator

func init() {
    var err error
    globalSnowflake, err = snowflake.New(0, 1)
    if err != nil {
        panic(err)
    }
}

func GenerateID() uint64 {
    return globalSnowflake.Next()
}

2. Unique Worker IDs

// ✅ GOOD - Each instance has unique worker ID
// Instance 1: region=0, worker=1
// Instance 2: region=0, worker=2
// Instance 3: region=0, worker=3

// ❌ BAD - Same worker ID across instances
// Instance 1: region=0, worker=1
// Instance 2: region=0, worker=1  // COLLISION!

3. Store as BIGINT

-- ✅ GOOD
CREATE TABLE transactions (
    id BIGINT UNSIGNED PRIMARY KEY,
    transaction_code VARCHAR(64),
    created_at TIMESTAMP
);

-- ❌ BAD - INT too small
CREATE TABLE transactions (
    id INT PRIMARY KEY,  -- Max 2^31 - 1 ≈ 2.1 billion
    ...
);

4. Index Strategy

-- ✅ GOOD - Snowflake ID is naturally sorted
CREATE TABLE orders (
    id BIGINT UNSIGNED PRIMARY KEY,
    customer_id BIGINT,
    created_at TIMESTAMP,
    INDEX idx_customer (customer_id)
);

-- ❌ BAD - Redundant created_at index
CREATE TABLE orders (
    id BIGINT UNSIGNED PRIMARY KEY,
    created_at TIMESTAMP,
    INDEX idx_created (created_at)  -- Redundant!
);

5. JSON Serialization

// ✅ GOOD - Serialize as string to avoid precision loss
type Transaction struct {
    ID uint64 `json:"id,string"`  // "1234567890123456789"
}

// ❌ BAD - JavaScript loses precision for large numbers
type Transaction struct {
    ID uint64 `json:"id"`  // 1234567890123456768 (rounded!)
}

🔍 Troubleshooting

Clock Skew

Problem: Server clock goes backward
Solution: Generator waits until time catches up
// Built-in protection
func (g *Generator) Next() uint64 {
    g.mu.Lock()
    defer g.mu.Unlock()
    
    now := time.Now().UnixMilli()
    if now < g.lastMS {
        // Wait until clock catches up
        g.waitFun(g.lastMS + 1)
        now = time.Now().UnixMilli()
    }
    // ...
}

Sequence Overflow

Problem: More than 4096 IDs in the same millisecond
Solution: Wait for next millisecond
if g.seq > maxSeq {
    // Wait for next millisecond
    g.waitFun(g.lastMS + 1)
    g.seq = 0
}
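
To show how the clock-skew and sequence-overflow guards fit together, here is a self-contained sketch of a generator. It is an illustration under assumptions (default 2020 epoch, the bit layout from the ID structure, a simple sleep-based wait); `sketchGenerator` and `waitUntil` are hypothetical and are not the real `snowflake.Generator`.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const (
	workerShift    = 12
	regionShift    = 17
	timestampShift = 22
	maxSeq         = 1<<12 - 1 // 4095
)

var epoch = time.Date(2020, 1, 1, 0, 0, 0, 0, time.UTC)

// sketchGenerator is a hypothetical stand-in for the real generator.
type sketchGenerator struct {
	mu     sync.Mutex
	region uint64
	worker uint64
	lastMS int64 // Unix milliseconds of the last issued ID
	seq    uint64
}

// waitUntil blocks until the wall clock reaches target (Unix milliseconds)
// and returns the current time in milliseconds.
func waitUntil(target int64) int64 {
	now := time.Now().UnixMilli()
	for now < target {
		time.Sleep(100 * time.Microsecond)
		now = time.Now().UnixMilli()
	}
	return now
}

func (g *sketchGenerator) Next() uint64 {
	g.mu.Lock()
	defer g.mu.Unlock()

	now := time.Now().UnixMilli()
	if now < g.lastMS {
		// Clock moved backward: wait until it reaches the last issued millisecond.
		now = waitUntil(g.lastMS)
	}
	if now == g.lastMS {
		g.seq++
		if g.seq > maxSeq {
			// Sequence exhausted: move to the next millisecond and restart at 0.
			now = waitUntil(g.lastMS + 1)
			g.seq = 0
		}
	} else {
		g.seq = 0 // new millisecond, so the sequence restarts
	}
	g.lastMS = now

	ms := uint64(now - epoch.UnixMilli())
	return ms<<timestampShift | g.region<<regionShift | g.worker<<workerShift | g.seq
}

func main() {
	g := &sketchGenerator{region: 0, worker: 1}
	fmt.Println(g.Next() < g.Next()) // true
}
```

Holding the mutex for the whole call keeps `lastMS` and `seq` consistent, at the cost of blocking other callers while the generator waits for the clock.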

Worker ID Collision

Problem: Two instances use the same worker ID
Detection:
// Monitor duplicate IDs in logs
if existingID := cache.Get(id); existingID != nil {
    log.Error("Duplicate Snowflake ID detected",
        "id", id,
        "worker", workerID,
        "region", regionID,
    )
}
Prevention:
  • Use unique worker IDs per instance
  • Implement worker ID registry
  • Use hostname-based assignment only where it cannot collide (a hostname hash modulo 32 can map two hosts to the same ID)
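
A worker ID registry can be as simple as a table of 32 slots that hands out the lowest free one. The sketch below keeps the table in memory purely for illustration; a real deployment would need shared storage and some form of lease expiry, and `workerRegistry` is a hypothetical type, not part of the package.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

const maxWorkers = 32 // 5-bit worker ID

var errNoFreeWorkerID = errors.New("all 32 worker IDs are in use")

// workerRegistry tracks which worker IDs are currently claimed (hypothetical).
type workerRegistry struct {
	mu    sync.Mutex
	inUse [maxWorkers]bool
}

// Acquire claims and returns the lowest free worker ID.
func (r *workerRegistry) Acquire() (uint32, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for id := range r.inUse {
		if !r.inUse[id] {
			r.inUse[id] = true
			return uint32(id), nil
		}
	}
	return 0, errNoFreeWorkerID
}

// Release frees a worker ID so another instance can claim it.
func (r *workerRegistry) Release(id uint32) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if id < maxWorkers {
		r.inUse[id] = false
	}
}

func main() {
	reg := &workerRegistry{}
	a, _ := reg.Acquire()
	b, _ := reg.Acquire()
	fmt.Println(a, b) // 0 1
}
```

An instance would call `Acquire` at startup, pass the result to `snowflake.New`, and call `Release` on shutdown.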

📚 Advanced Usage

Custom Wait Function

gen, _ := snowflake.New(0, 1)

// Custom wait with metrics
gen.SetWaitFunc(func(until int64) {
    metrics.IncrementWaitCount()
    for time.Now().UnixMilli() < until {
        time.Sleep(100 * time.Microsecond)
    }
})

ID Parsing Utility

func ParseSnowflakeID(id uint64) map[string]interface{} {
    timestamp, region, worker, seq := snowflake.Parse(id)
    
    return map[string]interface{}{
        "id": id,
        "timestamp": timestamp,
        "region": region,
        "worker": worker,
        "sequence": seq,
        "created_at": snowflake.ToTime(timestamp),
    }
}

// Usage
info := ParseSnowflakeID(1234567890123456789)
fmt.Printf("Created at: %v\n", info["created_at"])
fmt.Printf("Worker: %v\n", info["worker"])

Migration from Auto-Increment

-- Step 1: Add new column
ALTER TABLE transactions ADD COLUMN snowflake_id BIGINT UNSIGNED;

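-- Step 2: Generate IDs for existing rows
-- GENERATE_SNOWFLAKE_ID() is a placeholder, not a built-in SQL function;
-- in practice, backfill the column from application code that calls gen.Next().
UPDATE transactions SET snowflake_id = GENERATE_SNOWFLAKE_ID();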

-- Step 3: Make it primary key
ALTER TABLE transactions 
    DROP PRIMARY KEY,
    ADD PRIMARY KEY (snowflake_id),
    DROP COLUMN id;

See also:
  • Transaction Code Generator - Human-readable transaction codes
  • Database Schema - Database schema & indexes
  • System Design - Distributed system architecture

Snowflake IDs: Used for all main entities (transactions, orders, payments) to ensure uniqueness across the distributed system.