Snowflake ID Generator
A complete guide to the Snowflake ID generator for distributed unique ID generation in the MStore Backend.
🎯 Overview
Snowflake ID is a distributed ID generation algorithm that produces 64-bit unique IDs with the following characteristics:
✅ Globally Unique - Unik di seluruh sistem
✅ Time-Ordered - Sortable by creation time
✅ High Performance - 4096 IDs per millisecond per worker
✅ Decentralized - No coordination needed
Package: `pkg/utils/snowflake/snowflake.go`
📊 ID Structure
64-bit Breakdown
```
┌────────┬───────────────┬──────────┬──────────┬────────────┐
│ 1 bit  │    41 bits    │  5 bits  │  5 bits  │  12 bits   │
│  Sign  │   Timestamp   │  Region  │  Worker  │  Sequence  │
└────────┴───────────────┴──────────┴──────────┴────────────┘
    0      Milliseconds      0-31       0-31      0-4095
```
Components:
Sign bit (1 bit): Always 0 (positive number)
Timestamp (41 bits): Milliseconds since epoch (2020-01-01)
Region ID (5 bits): Data center/region (0-31)
Worker ID (5 bits): Machine/worker (0-31)
Sequence (12 bits): Counter per millisecond (0-4095)
Capacity
| Component | Bits | Max Value | Description |
|-----------|------|-----------|-------------|
| Timestamp | 41 | ~69 years | From 2020-01-01 to ~2089 |
| Region | 5 | 31 | 32 regions/data centers |
| Worker | 5 | 31 | 32 workers per region |
| Sequence | 12 | 4095 | 4096 IDs per ms per worker |
Total Capacity: 1,024 workers × 4,096 IDs/ms ≈ 4.2 million IDs per millisecond (~4.2 billion IDs/second)
🚀 Quick Start
1. Initialize Generator
```go
package main

import (
	"fmt"

	"gitlab.com/mushola-store/mstore_backend/pkg/utils/snowflake"
)

func main() {
	// Create generator: region=0, worker=1
	gen, err := snowflake.New(0, 1)
	if err != nil {
		panic(err)
	}

	// Generate ID
	id := gen.Next()
	fmt.Printf("Generated ID: %d\n", id)
	// Output: Generated ID: 1234567890123456789
}
```
2. Environment-Based Initialization
```go
package main

import (
	"os"
	"strconv"

	"gitlab.com/mushola-store/mstore_backend/pkg/utils/snowflake"
)

func initSnowflake() (*snowflake.Generator, error) {
	// Get region & worker from the environment, falling back to
	// defaults when a variable is unset or not a number
	region, err := strconv.Atoi(os.Getenv("REGION_ID"))
	if err != nil {
		region = 0 // default region
	}
	worker, err := strconv.Atoi(os.Getenv("WORKER_ID"))
	if err != nil {
		worker = 1 // default worker
	}

	return snowflake.New(uint32(region), uint32(worker))
}
```
📝 Usage Examples
Generate Single ID
```go
gen, _ := snowflake.New(0, 1)

id := gen.Next()
fmt.Printf("ID: %d\n", id)
// Output: ID: 1234567890123456789
```
Generate Multiple IDs
```go
gen, _ := snowflake.New(0, 1)

for i := 0; i < 10; i++ {
	id := gen.Next()
	fmt.Printf("ID %d: %d\n", i+1, id)
}
// Output:
// ID 1: 1234567890123456789
// ID 2: 1234567890123456790
// ID 3: 1234567890123456791
// ...
```
Concurrent Generation
```go
gen, _ := snowflake.New(0, 1)

// Safe for concurrent use
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
	wg.Add(1)
	go func() {
		defer wg.Done()
		id := gen.Next()
		fmt.Printf("Generated: %d\n", id)
	}()
}
wg.Wait()
```
Parse ID Components
```go
gen, _ := snowflake.New(0, 1)
id := gen.Next()

// Parse components
timestamp, region, worker, seq := snowflake.Parse(id)
fmt.Printf("Timestamp: %d\n", timestamp)
fmt.Printf("Region: %d\n", region)
fmt.Printf("Worker: %d\n", worker)
fmt.Printf("Sequence: %d\n", seq)

// Convert timestamp to time
t := snowflake.ToTime(timestamp)
fmt.Printf("Created at: %s\n", t.Format(time.RFC3339))
```
🎨 Use Cases
1. Transaction IDs
```go
package transaction_service

type TransactionService struct {
	snowflake *snowflake.Generator
}

func (s *TransactionService) CreateTransaction(
	ctx context.Context,
	payload PayloadCreateTransaction,
) (*Transaction, error) {
	// Generate unique transaction ID
	txID := s.snowflake.Next()

	transaction := &Transaction{
		ID:              txID,
		TransactionCode: fmt.Sprintf("TRX-%d", txID),
		// ... other fields
	}

	return transaction, s.repo.Create(ctx, transaction)
}
```
2. Order Numbers
```go
func (s *OrderService) CreateOrder(
	ctx context.Context,
	payload PayloadCreateOrder,
) (*Order, error) {
	orderID := s.snowflake.Next()

	order := &Order{
		ID:          orderID,
		OrderNumber: fmt.Sprintf("ORD-%d", orderID),
		CreatedAt:   time.Now(),
	}

	return order, s.repo.Create(ctx, order)
}
```
3. Distributed Event IDs
```go
func (s *EventService) PublishEvent(
	ctx context.Context,
	eventType string,
	payload interface{},
) error {
	eventID := s.snowflake.Next()

	event := &Event{
		ID:        eventID,
		EventType: eventType,
		Payload:   payload,
		Timestamp: time.Now(),
	}

	return s.messageQueue.Publish(ctx, event)
}
```
4. Idempotency Keys
```go
func (s *PaymentService) ProcessPayment(
	ctx context.Context,
	payment Payment,
) error {
	// Generate idempotency key
	idempotencyKey := s.snowflake.Next()

	// Check if already processed
	if exists := s.cache.Exists(ctx, idempotencyKey); exists {
		return ErrDuplicatePayment
	}

	// Process payment
	result, err := s.gateway.Charge(ctx, payment, idempotencyKey)
	if err != nil {
		return err
	}

	// Cache result
	s.cache.Set(ctx, idempotencyKey, result, 24*time.Hour)
	return nil
}
```
🔧 Configuration
Region & Worker Assignment
Strategy 1: Environment Variables
```yaml
# Docker Compose
services:
  api-1:
    environment:
      REGION_ID: 0
      WORKER_ID: 1
  api-2:
    environment:
      REGION_ID: 0
      WORKER_ID: 2
```
Strategy 2: Kubernetes Labels
```yaml
apiVersion: apps/v1
kind: StatefulSet # the pod-index label is set by StatefulSets (Kubernetes >= 1.28)
metadata:
  name: pos-api
spec:
  serviceName: pos-api
  replicas: 3
  template:
    spec:
      containers:
        - name: api
          env:
            - name: REGION_ID
              value: "0"
            - name: WORKER_ID
              valueFrom:
                fieldRef:
                  fieldPath: metadata.labels['apps.kubernetes.io/pod-index']
```
Strategy 3: Auto-Assignment
```go
func assignWorkerID() uint32 {
	// Use hostname hash; note that two hostnames can hash to the
	// same value, so this does not guarantee uniqueness
	hostname, _ := os.Hostname()
	h := fnv.New32a()
	h.Write([]byte(hostname))
	return h.Sum32() % 32 // 0-31
}

gen, _ := snowflake.New(0, assignWorkerID())
```
Custom Epoch
Default epoch: 2020-01-01 00:00:00 UTC
```go
// Change epoch (in snowflake.go)
var epoch = time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
```
Why custom epoch?
Extends ID lifespan (41 bits = ~69 years from epoch)
Smaller IDs if epoch is recent
Benchmarks
```
BenchmarkNext-8      20000000    50 ns/op    0 B/op    0 allocs/op
BenchmarkParse-8    100000000    10 ns/op    0 B/op    0 allocs/op
```
Throughput:
Benchmark rate: ~20 million `Next()` calls/second per worker
Sustained rate: capped by the 12-bit sequence at 4,096 IDs/ms (~4.1 million IDs/second) per worker
32 workers: ~131 million IDs/second sustained
Comparison
| Method | Performance | Uniqueness | Sortable | Distributed |
|--------|-------------|------------|----------|-------------|
| Snowflake | ⭐⭐⭐⭐⭐ | ✅ Global | ✅ Yes | ✅ Yes |
| UUID v4 | ⭐⭐⭐⭐ | ✅ Global | ❌ No | ✅ Yes |
| Auto Increment | ⭐⭐⭐⭐⭐ | ✅ Local | ✅ Yes | ❌ No |
| ULID | ⭐⭐⭐⭐ | ✅ Global | ✅ Yes | ✅ Yes |
🎯 Best Practices
1. Singleton Pattern
```go
// ✅ GOOD - Single instance per application
var globalSnowflake *snowflake.Generator

func init() {
	var err error
	globalSnowflake, err = snowflake.New(0, 1)
	if err != nil {
		panic(err)
	}
}

func GenerateID() uint64 {
	return globalSnowflake.Next()
}
```
2. Unique Worker IDs
```go
// ✅ GOOD - Each instance has a unique worker ID
// Instance 1: region=0, worker=1
// Instance 2: region=0, worker=2
// Instance 3: region=0, worker=3

// ❌ BAD - Same worker ID across instances
// Instance 1: region=0, worker=1
// Instance 2: region=0, worker=1 // COLLISION!
```
3. Store as BIGINT
```sql
-- ✅ GOOD
CREATE TABLE transactions (
    id BIGINT UNSIGNED PRIMARY KEY,
    transaction_code VARCHAR(64),
    created_at TIMESTAMP
);

-- ❌ BAD - INT too small
CREATE TABLE transactions (
    id INT PRIMARY KEY, -- Max 2^31 - 1 ≈ 2.1 billion
    ...
);
```
4. Index Strategy
```sql
-- ✅ GOOD - Snowflake IDs are naturally time-sorted
CREATE TABLE orders (
    id BIGINT UNSIGNED PRIMARY KEY,
    customer_id BIGINT,
    created_at TIMESTAMP,
    INDEX idx_customer (customer_id)
);

-- ❌ BAD - Redundant created_at index
CREATE TABLE orders (
    id BIGINT UNSIGNED PRIMARY KEY,
    created_at TIMESTAMP,
    INDEX idx_created (created_at) -- Redundant!
);
```
5. JSON Serialization
```go
// ✅ GOOD - Serialize as string to avoid precision loss
type Transaction struct {
	ID uint64 `json:"id,string"` // "1234567890123456789"
}

// ❌ BAD - JavaScript loses precision for large numbers
type Transaction struct {
	ID uint64 `json:"id"` // 1234567890123456768 (rounded!)
}
```
🔍 Troubleshooting
Clock Skew
Problem: The server clock moves backward
Solution: The generator waits until the clock catches up
```go
// Built-in protection
func (g *Generator) Next() uint64 {
	g.mu.Lock()
	defer g.mu.Unlock()

	now := time.Now().UnixMilli()
	if now < g.lastMS {
		// Wait until clock catches up
		g.waitFun(g.lastMS + 1)
		now = time.Now().UnixMilli()
	}
	// ...
}
```
Sequence Overflow
Problem: More than 4,096 IDs requested in the same millisecond
Solution: Wait for the next millisecond
```go
if g.seq > maxSeq {
	// Wait for next millisecond
	g.waitFun(g.lastMS + 1)
	g.seq = 0
}
```
Worker ID Collision
Problem: Two instances use the same worker ID
Detection:
```go
// Monitor duplicate IDs in logs
if existingID := cache.Get(id); existingID != nil {
	log.Error("Duplicate Snowflake ID detected",
		"id", id,
		"worker", workerID,
		"region", regionID,
	)
}
```
Prevention:
Use unique worker IDs per instance
Implement worker ID registry
Use hostname-based assignment
📚 Advanced Usage
Custom Wait Function
```go
gen, _ := snowflake.New(0, 1)

// Custom wait with metrics
gen.SetWaitFunc(func(until int64) {
	metrics.IncrementWaitCount()
	for time.Now().UnixMilli() < until {
		time.Sleep(100 * time.Microsecond)
	}
})
```
ID Parsing Utility
```go
func ParseSnowflakeID(id uint64) map[string]interface{} {
	timestamp, region, worker, seq := snowflake.Parse(id)
	return map[string]interface{}{
		"id":         id,
		"timestamp":  timestamp,
		"region":     region,
		"worker":     worker,
		"sequence":   seq,
		"created_at": snowflake.ToTime(timestamp),
	}
}

// Usage
info := ParseSnowflakeID(1234567890123456789)
fmt.Printf("Created at: %v\n", info["created_at"])
fmt.Printf("Worker: %v\n", info["worker"])
```
Migration from Auto-Increment
```sql
-- Step 1: Add new column
ALTER TABLE transactions ADD COLUMN snowflake_id BIGINT UNSIGNED;

-- Step 2: Backfill existing rows
-- (GENERATE_SNOWFLAKE_ID() is a placeholder; generate the IDs in
-- application code, since MySQL has no built-in Snowflake function)
UPDATE transactions SET snowflake_id = GENERATE_SNOWFLAKE_ID();

-- Step 3: Make it the primary key
ALTER TABLE transactions
    DROP PRIMARY KEY,
    ADD PRIMARY KEY (snowflake_id),
    DROP COLUMN id;
```
Transaction Code Generator - Human-readable transaction codes
Database Schema - Database schema & indexes
System Design - Distributed system architecture
Snowflake IDs: Used for all core entities (transactions, orders, payments) to guarantee uniqueness across the distributed system.