How to create a unique identifier in a distributed system

using Redis's atomic increment command (incr)

If you run a distributed system with multiple server instances and need unique identifiers that follow a specific pattern, for example the prefix "MA" followed by 12 numerical digits, such as MA000000000001, you can use Redis to generate a Global ID.

Redis Use-cases

Redis covers many use cases, such as:

  1. Global ID
  2. Cache
  3. Session
  4. Distributed Lock
  5. Rate Limiter
  6. Counter
  7. Ranking/Leaderboard
  8. Message Queue

This article covers the first of these: using Redis to create a globally unique identifier in a distributed system.

Why use Redis?

There are several reasons why Redis is commonly used in many applications:

  1. High Performance: Redis is an in-memory data store, which offers exceptionally fast read and write speeds, making it suitable for applications that require low-latency data access.
  2. Persistence Options: Although Redis primarily stores data in memory, it also offers options for persistence. Redis can periodically save data to disk or log changes to a file, ensuring that data is not lost in the event of a system restart or failure.
  3. Scalability and Replication: Redis supports master-replica replication, allowing data to be replicated across multiple Redis instances. This provides read scalability, fault tolerance, and high availability, because read replicas can take over read-intensive operations while all writes still go to the master.

In our case we want a single source of truth with high write speed, scalability and persistence.

Implementation

First, we need a unique numerical identifier that can be incremented concurrently. A common approach is to use Redis's atomic increment command (incr) to generate unique numerical IDs. If you are using Java and Jedis, it looks like this:

Long nextId = jedis.incr("MA_COUNTER");
String id = String.format("MA%012d", nextId); 
// id: "MA000000000001"

By using the incr command, Redis guarantees that the increment operation is atomic, avoiding any conflicts when generating unique IDs concurrently.

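The format string "MA%012d" breaks down as follows:

MA: the literal prefix.
%: denotes the start of the format specifier.
0: pad with leading zeros instead of spaces.
12: the minimum width of the formatted value, here twelve characters. Larger numbers are not truncated, so the ID becomes longer than 14 characters once the counter exceeds 999999999999.
d: indicates that the argument is an integer and should be displayed as a decimal number.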
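To make the width behaviour of the format string above concrete, here is a minimal sketch of a formatting helper. The class name GlobalIdFormatter and the decision to reject out-of-range values are assumptions for illustration, not part of the original example. Locale.ROOT is passed so that the digits do not depend on the server's default locale.

```java
import java.util.Locale;

/** Hypothetical helper: turns a counter value into an ID such as MA000000000001. */
final class GlobalIdFormatter {
    static final String PREFIX = "MA";
    // Largest value that still fits into 12 decimal digits.
    static final long MAX_SEQUENCE = 999_999_999_999L;

    private GlobalIdFormatter() {}

    static String format(long sequence) {
        // INCR on a fresh key starts at 1, so anything below 1 is unexpected.
        if (sequence < 1 || sequence > MAX_SEQUENCE) {
            throw new IllegalArgumentException("sequence out of range: " + sequence);
        }
        // %012d pads to at least 12 digits; the range check keeps it at exactly 12.
        return String.format(Locale.ROOT, "%s%012d", PREFIX, sequence);
    }
}
```

With this helper, the value returned by incr would be passed straight to GlobalIdFormatter.format(nextId).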

Important facts about Redis

Redis is an in-memory store, which means you have to make sure you don't lose data in case of a system failure. One way to do this is master-replica replication, which is simple to use and configure, combined with Redis Sentinel for automatic failover. Replication allows replica Redis instances to be exact copies of master instances. The replica will automatically reconnect to the master every time the link breaks, and will attempt to be an exact copy of it regardless of what happens to the master.

You probably also want some kind of retry mechanism at the application level, so that after a failover your application tries again to create a new ID. You may even want a fallback mechanism that can still generate IDs during a system failure, so that your application is not blocked.
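The retry idea could look roughly like the following sketch. The class RetryingIdGenerator, the attempt count and the linear backoff are all assumptions for illustration. The Redis call is passed in as a LongSupplier so the sketch stays independent of any client library. In real code you would probably catch only your client's connection exceptions rather than every RuntimeException.

```java
import java.util.Locale;
import java.util.function.LongSupplier;

/** Hypothetical wrapper that retries the counter call a few times before giving up. */
final class RetryingIdGenerator {
    private final LongSupplier nextSequence; // e.g. () -> jedis.incr("MA_COUNTER")
    private final int maxAttempts;
    private final long backoffMillis;

    RetryingIdGenerator(LongSupplier nextSequence, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.nextSequence = nextSequence;
        this.maxAttempts = maxAttempts;
        this.backoffMillis = backoffMillis;
    }

    String nextId() {
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return String.format(Locale.ROOT, "MA%012d", nextSequence.getAsLong());
            } catch (RuntimeException e) {
                lastFailure = e; // remember the failure and try again
            }
            if (attempt < maxAttempts) {
                sleep(backoffMillis * attempt); // simple linear backoff
            }
        }
        throw new IllegalStateException(
                "Could not generate an ID after " + maxAttempts + " attempts", lastFailure);
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }
}
```

Retrying an increment is safe with respect to uniqueness: if the first attempt did reach Redis but the reply was lost, the retry simply returns a higher number, so the worst case is a gap in the sequence, not a duplicate.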

By default, Redis uses asynchronous replication, which, being low latency and high performance, is the natural replication mode for the vast majority of Redis use cases. However, if we want to maximize real-world data safety, we can also use Redis's WAIT command, which provides optional synchronous replication.

WAIT numreplicas timeout

This command blocks the current client until all the previous write commands are successfully transferred and acknowledged by at least the specified number of replicas, or until the timeout (in milliseconds) is reached. It returns the number of replicas that acknowledged the writes. Note that WAIT does not make Redis a strongly consistent store: while synchronous replication is part of a replicated state machine, it is not the only thing needed. However, in the context of Sentinel or Redis Cluster failover, WAIT improves the real-world data safety. Specifically, if a given write is transferred to one or more replicas, it is more likely (but not guaranteed) that if the master fails, we'll be able to promote, during a failover, a replica that received the write: both Sentinel and Redis Cluster will do a best-effort attempt to promote the best replica among the set of available replicas.

However, this is just a best-effort attempt, so it is still possible to lose a write that was synchronously replicated to multiple replicas. So if your organization does not already have a resilient, battle-tested and ready-to-go caching system in place, you have to think through those scenarios before integrating it into your system.
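Combining incr with WAIT could be sketched as follows. The RedisCommands interface is a hypothetical stand-in for whatever client you use and is not a real library API. It only mirrors the two commands discussed above, and the replica count and timeout are placeholder values you would have to choose yourself.

```java
import java.util.Locale;

/** Hypothetical minimal view of a Redis client; not part of any real library. */
interface RedisCommands {
    long incr(String key);                                  // INCR key
    long waitReplicas(int numReplicas, long timeoutMillis); // WAIT numreplicas timeout
}

/** Hands out an ID only after enough replicas acknowledged the increment. */
final class ReplicatedIdGenerator {
    private final RedisCommands redis;
    private final int minReplicas;
    private final long timeoutMillis;

    ReplicatedIdGenerator(RedisCommands redis, int minReplicas, long timeoutMillis) {
        this.redis = redis;
        this.minReplicas = minReplicas;
        this.timeoutMillis = timeoutMillis;
    }

    String nextId() {
        long sequence = redis.incr("MA_COUNTER");
        // WAIT reports how many replicas acknowledged the writes sent so far
        // on this connection, or fewer if the timeout expired first.
        long acknowledged = redis.waitReplicas(minReplicas, timeoutMillis);
        if (acknowledged < minReplicas) {
            // The increment is not rolled back; this number is simply skipped.
            throw new IllegalStateException(
                    "Only " + acknowledged + " of " + minReplicas + " replicas acknowledged");
        }
        return String.format(Locale.ROOT, "MA%012d", sequence);
    }
}
```

Both calls have to go through the same connection, because WAIT only covers writes issued by the current client. The price of this extra safety is at least one additional round trip per generated ID.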

As a second safety net, I would also recommend adding a unique key constraint on your database column if you are using a relational database such as MySQL. This way, if everything goes wrong and all Redis instances fail (which is very unlikely), you won't insert duplicate IDs into your main database, and you can easily set the incr counter to a proper value once your Redis master is back online.
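One way to find "a proper value" for the counter is to derive it from the highest ID already stored in the database. The sketch below shows only that calculation. The class CounterRecovery and the safety margin are assumptions for illustration. The highest ID would come from a query on your own table, and the result would then be written back to the MA_COUNTER key with a Redis SET.

```java
/** Hypothetical helper for re-seeding the Redis counter from the database. */
final class CounterRecovery {
    private CounterRecovery() {}

    /** Extracts the numeric part of an ID such as MA000000000042. */
    static long sequenceOf(String id) {
        if (id == null || !id.matches("MA\\d{12}")) {
            throw new IllegalArgumentException("not a valid ID: " + id);
        }
        return Long.parseLong(id.substring(2));
    }

    /**
     * Counter value to restore. The margin skips IDs that may have been
     * handed out by Redis but were never written to the database.
     */
    static long counterValueAfterRecovery(String highestStoredId, long safetyMargin) {
        if (safetyMargin < 0) {
            throw new IllegalArgumentException("safetyMargin must not be negative");
        }
        return Math.addExact(sequenceOf(highestStoredId), safetyMargin);
    }
}
```

Because the IDs are zero-padded to a fixed width, their string order matches their numeric order, so the largest ID string in the column is also the one with the highest sequence number.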

Conclusion

Redis is a powerful tool for generating unique identifiers in a distributed system. If configured properly, it offers high performance, scalability, and persistence. By utilizing Redis's atomic increment command incr, we can ensure the generation of unique numerical IDs concurrently, making it an ideal solution for creating Global IDs with specific patterns, such as the "MA" prefix followed by 12 numerical digits.