Category Archives: design

Some useful Kafka command

Get the max partition:offset for a topic ./ –broker-list broker-ip:port –topic topic-name Get current partition:offset for a consumer group ./ –bootstrap-server broker-ip:port –group consumer-group-name –describe Get a single record by partition:offset ./ –bootstrap-server broker-ip:port –topic topic-name –max-messages 1 –partition 54 –offset 90012464 Tailing Kafka topic in real time ./ –bootstrap-server broker-ip:port –topic topic-name Set a ConsumerGroup offset to latest ./ –bootstrap-server server:port… Read More »

Design Elevators

电梯调度 停车场 Elevator max people capacity max weight airstairs[only to living area], goods stair[to storage area] Building Each elevator can each level or certain levels? Inside building the button controls all elevator or just one elevator. Scheduler same direction > static > different direction when elevator is running, can we press the button for different… Read More »

Token Bucket

public class TokenBucket { private long maxBucketSize; private long refillRate; // per second private long currentBucketSize; private long lastRefillTimestamp; synchronized public boolean allowRequest(int token) { refill(); if (currentBucketSize >= token) { currentBucketSize -= token; return true; } return false; } private void refill() { long now = System.nanoTime(); double add = (now – lastRefillTimestamp) /… Read More »

groupon design

rule: { rule_id, type: discount/one-time value: 30%/50$ limited_user: [usr1, usr2] limited_user_grp: [grp1, grp2] number_of_usage: 1, service: [LYFT, CHIPOTLE] } groupon: { g_id, code: XMASCODE rule_id: xxx expire_date: 2021/12/31 } usage: { user_id, groupon_id, time }

Thought about leaderboard

First of all, we can have below schema for the players and its score: contest_id(partitionKey), user_id1(sortKey), problem1_finish_time, problem2_finish_time, score(secondaryIndex) For the score and user_id, we can build a BST. Let’s say we want to find the players who ranks between 50-60. We can find the 50th player in O(logN) time. Check this leetcode problem. Then, we… Read More »

How to calculate percentile metrics

Suppose we want to calculate latency in p90 percentile. Use count min sketch to count each latency times. For example, 5ms happened 10 times, 9ms happened 20 times etc Because latency has long tail. That means most of latency is around 7ms, but there could be cases for 100ms, 110ms etc. For that long tail latency,… Read More »

Cache TTL consideration

When there is no eviction mechanism, we can use ttl to invalidate. But this is based on that cache might have stale data. TTL doesn’t lead 100% cache correctness. If we want to guarantee cache correctness, we need evection mechanism, write-through etc strategies. In theory, TTL is just a way of data replacement. Others are LRU, LFU.


To solve non-repeatable read: 1. lock 2. mvcc As long as there is an update, then generate a new record with current tx_id. undo log points to previous one. When read, it should have: ReadView { m_ids_list, min_trx_id, max_trx_id, creator_trx_id } When read or doing a transaction, it should have current tx_id and its ReadView.… Read More »

payment system keywowrd

double payment sync payment async payment 3rd party interaction different state sensitive data, accuracy, relational database. websocket, long polling, stateful

consistency model

Strictly Serializable support transaction, Spanner Linearizable Doesn’t support transaction. There is global timer. Things in different machine happens with the global timer. Sequential consistency Above example satisfies sequential consistency. P3, P4, P5 all observe that W(x)1 happens before W(x)3. They are globally making sense. So it is sequential consistent. But considering the time when W(x)3… Read More »