Both JSON and Redis need no introduction; the former is the standard data interchange format between modern applications, whereas the latter is ubiquitous wherever performant data management is needed by them. That being the case, I was shocked when a couple of years ago I learned that the two don’t get along.
Redis isn’t a one-trick pony–it is, in fact, quite the opposite. Unlike general purpose one-size-fits-all databases, Redis (a.k.a the “Swiss Army Knife of Databases”, “Super Glue of Microservices” and “Execution context of Functions-as-a-Service”) provides specialized tools for specific tasks. Developers use these tools, which are exposed as abstract data structures and their accompanying operations, to model optimal solutions for problems. And that is exactly the reason why using Redis for managing JSON data is unnatural.
Fact: despite its multitude of core data structures, Redis has none that fit the requirements of a JSON value. Sure, you can work around that by using other data types: Strings are great for storing raw serialized JSON, and you can represent flat JSON objects with Hashes. But these workaround patterns impose limitations that make them useful only in a handful of use cases, and even then the experience leaves an un-Redis-ish aftertaste. Their awkwardness clashes sharply with the simplicity and elegance of using Redis normally.
But all that changed during the last year after Salvatore Sanfilippo’s @antirez visit to the Tel Aviv office, and with Redis modules becoming a reality. Suddenly the sky wasn’t the limit anymore. Now that modules let anyone do anything, it turned out that I could be that particular anyone. Picking up on C development after more than a two decades hiatus proved to be less of a nightmare than I had anticipated, and with Dvir Volk’s @dvirsky loving guidance we birthed ReJSON.
While you may not be thrilled about its name (I know that I’m not – suggestions are welcome), ReJSON itself should make any Redis user giddy with JSON joy. The module provides a new data type that is tailored for fast and efficient manipulation of JSON documents. Like any Redis data type, ReJSON’s values are stored in keys that can be accessed with a specialized subset of commands. These commands, or the API that the module exposes, are designed to be intuitive to users coming to Redis from the JSON world and vice versa. Consider this example that shows how to set and get values:
127.0.0.1:6379> JSON.SET scalar . '"Hello JSON!"' OK 127.0.0.1:6379> JSON.SET object . '{"foo": "bar", "ans": 42}' OK 127.0.0.1:6379> JSON.GET object "{\"foo\":\"bar",\"ans\":42}" 127.0.0.1:6379> JSON.GET object .ans "42" 127.0.0.1:6379> ^C ~$
Like any well-behaved module, ReJSON’s commands come prefixed. Both JSON.SET
and JSON.GET
expect the key’s name as their first argument. In the first line we set the root (denoted by a period character: “.”) of the key named scalar to a string value. Next, a different key named object is set with a JSON object (which is first read whole) and then a single sub-element by path.
What happens under the hood is that whenever you call JSON.SET
, the module takes the value through a streaming lexer that parses the input JSON and builds tree data structure from it:
ReJSON stores the data in binary format in the tree’s nodes, and supports a subset of JSONPath for easy referencing of subelements. It boasts an arsenal of atomic commands that are tailored for every JSON value type, including: JSON.STRAPPEND
for appending strings; JSON.NUMMULTBY
for multiplying numbers; and JSON.ARRTRIM
for trimming arrays… and making pirates happy.
Because ReJSON is implemented as a Redis module, you can use it with any Redis client that: a) supports modules (ATM none) or b) allows sending raw commands (ATM most). For example, you can use a ReJSON-enabled Redis server from your Python code with redis-py like so:
import redis
import json
data = {
'foo': 'bar',
'ans': 42
}
r = redis.StrictRedis()
r.execute_command('JSON.SET', 'object', '.', json.dumps(data))
reply = json.loads(r.execute_command('JSON.GET', 'object'))
But that’s just half of it. ReJSON isn’t only a pretty API – it also a powerhouse in terms of performance. Initial performance benchmarks already demonstrate that, for example:
The above graphs compare the rate (operations/sec) and average latency of read and write operations performed on a 3.4KB JSON payload that has three nested levels. ReJSON is pitted against two variants that store the data in Strings. Both variants are implemented as Redis server-side Lua scripts with the json.lua variant storing the raw serialized JSON, and msgpack.lua using MessagePack encoding.
If you have 21 minutes to spare, here’s the ReJSON presentation from Redis Day TLV:
You can start playing with ReJSON today! Get it from the GitHub repository or read the docs online. There are still many features that we want to add to it, but it’s pretty neat as it is. If you have feature requests or have spotted an issue, feel free to use the repo’s issue tracker. You can always email or tweet at me – I’m highly-available 🙂