The first v4.x version of Cachex, v4.0.0, has been released. This version includes many features and improvements designed to optimize cache management and internal efficiency. In this post, I'll briefly cover some of the main changes, including new features and optimizations. If that isn't for you, feel free to jump straight over to the release notes!

Signature Cleanup

We'll start off with some of the simpler improvements: the cleanup of various functions that were either misnamed or redundant!

Cachex has been around a long time now, and some of the naming hasn't been the best. A few names were either vague or the result of arbitrarily separated functions in the past. Cachex v4.0 renames or removes several functions in the main Cachex API; the full list of changes is available in the release notes.

These changes allowed for a lot of cleanup internally, and the API surface area has been reduced as a result. Alongside these changes you may also see performance improvements in these functions due to various optimizations made along the way.

Janitor Optimizations

Next up are some optimizations inside the Cachex Janitor. For those unfamiliar, the Janitor is the process responsible for periodically handling expired entries inside a cache. There are two major changes to the Janitor in Cachex v4.x, each with a small but positive impact on the average cache.

First off, some internal refactoring now allows Cachex's internal services to implement Cachex.Provision. The Janitor needs access to the cache's state when checking expirations, meaning that in the v3.x line it had to retrieve this state on each run. With the change to Cachex.Provision, the state of the cache is now pushed to the Janitor whenever it changes. This is a very minor performance increase, but as we all know, these small savings do add up over time.
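As a rough illustration of the model, a provisioned service declares what it's interested in and then receives updates as they happen. This is only a sketch: the provisions/0 and handle_provision/2 callbacks follow the v4 documentation, but treat the exact signatures as assumptions and verify against the real docs.

# a sketch of a provisioned service; callback names are illustrative
defmodule MyProvisionedService do
  use Cachex.Provision

  # declare an interest in the cache state
  def provisions,
    do: [:cache]

  # receive the latest cache state whenever it changes,
  # rather than fetching it on every run
  def handle_provision({:cache, cache}, state),
    do: {:ok, Map.put(state, :cache, cache)}
end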

Secondly, due to the way the Janitor process works under the hood, each execution of the Janitor's purge would create a transaction lock. This meant that any operations relying on a transaction would have to wait for the Janitor to complete. This locking is required to avoid inconsistencies within ongoing transactions during a purge. Although this is unavoidable, the main concern is that the Janitor would create the lock even if there were no expired entries in the cache.

To improve this in Cachex v4.x, the Janitor will now run an extremely quick check (via Cachex.stream/3) to see if it can locate at least one expired entry in the cache table before starting a purge. This means it can be much more clever; there's no need to create the lock if no entries have expired. As a result the Janitor runs much, much quicker in cases where there are no expired entries to clean up.
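To give a feel for the technique, here's a minimal sketch of such a pre-check written as user code. The :where filter is purely illustrative (the entry field names and the Janitor's actual clause are assumptions here); the shape is the important part: take at most one matching entry and stop.

# assume millisecond timestamps for this illustrative check
now = :os.system_time(:millisecond)

# match entries whose expiration has lapsed; this filter is an
# illustration, not the Janitor's actual internal clause
expired_query =
  Cachex.Query.build(
    where: {:andalso,
      {:"/=", :expiration, nil},
      {:<, {:+, :modified, :expiration}, now}},
    output: :key
  )

# taking at most one element means the scan can stop at the first match
expired? =
  :cache
  |> Cachex.stream!(expired_query)
  |> Enum.take(1)
  |> then(&(&1 != []))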

As a net result of these two changes, there is one more side effect: for developers who are not using expiration in their cache, the Janitor now has effectively zero overhead. One of Cachex's goals is to minimize overhead unless a feature is explicitly enabled, which makes this an excellent improvement!

Streaming and Queries

You might have noticed that both of the previous sections have mentioned new use of Cachex.stream/3 internally, so now is the perfect time to cover some of the changes made in this area!

Both the Cachex.stream/3 function and the Cachex.Query module underwent various changes in this release. Cachex.stream/3 no longer uses the Erlang qlc module under the hood, instead opting for vanilla ETS selection utilities. The use of QLC had been causing problems for some users, and when looking into fixing them it became clear that QLC was simply unnecessary for what the Cachex API needed. The new implementation on top of ETS is both quicker and more reliable.
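For those unfamiliar with the primitive involved, ETS match specifications let you filter and reshape rows in a single pass over the table. This tiny standalone example (not Cachex's actual internals) shows the idea:

# create a throwaway ETS table and add a few rows
table = :ets.new(:demo, [:set, :public])
:ets.insert(table, [{"one", 1}, {"two", 2}, {"three", 3}])

# a match spec is {pattern, guards, output}: here we keep rows
# whose value is greater than 1, and emit only the value
:ets.select(table, [{{:"$1", :"$2"}, [{:>, :"$2", 1}], [:"$2"]}])
# => [2, 3] (in no guaranteed order)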

Changes have also been made to Cachex.Query to make it more extensible, via Cachex.Query.build/1. This function supports two options: :where, which filters the entries matched inside a streaming call, and :output, which controls the shape of each entry that is emitted. The best way to demonstrate how they work is to show a few examples, as it's a little hard to explain:

# for the entry() record
import Cachex.Spec

# create a cache
Cachex.start(:cache)

# put some values in the cache
Cachex.put(:cache, "one", 1)
Cachex.put(:cache, "two", 2)
Cachex.put(:cache, "three", 3)

# 1. Fetching with no args...
entries =
  :cache
  |> Cachex.stream!()
  |> Enum.to_list()

# ...gives us full entry records (in no guaranteed order), along the lines of:
#
#   [entry(key: "one", value: 1, ...),
#    entry(key: "two", value: 2, ...),
#    entry(key: "three", value: 3, ...)]

# 2. Fetching only the :key entry field...
key_query = Cachex.Query.build(output: :key)
key_stream = Cachex.stream!(:cache, key_query)

# ... gives us ["one", "two", "three"] (in some order)
key_results = Enum.to_list(key_stream)

# 3. Summing only the odd values...
odd_filt = {:==, {:rem, :value, 2}, 1}
odd_query = Cachex.Query.build(where: odd_filt, output: :value)
odd_stream = Cachex.stream!(:cache, odd_query)

# ... gives us a total of 4
odd_results = Enum.sum(odd_stream)

We can use the changes in Cachex.Query to augment Cachex.stream/3, predefining options to make our streaming pipelines both more readable and more efficient. A lot of the changes in Cachex v4.x are a result of these improvements (such as those in the two previous sections of this blog post!). Many other cache actions now use these two modules under the hood, which lessens the surface area required for testing and builds on a known foundation to ensure we're getting the best implementation possible.

This was all a fairly minor rewrite and refactor, but the result is much easier to work with, so it felt worth mentioning. If you're interested, you can read more about these changes in the relevant documentation.

Warming Controls

If you've used Cachex before, you're probably pretty familiar with the concept of reactive vs. proactive cache warming. I'm generally happy with the state of reactive warming (via Cachex.fetch/4), but proactive warming in Cachex v3.x needed some minor improvements.

One of the things missing from the Cachex v3.x implementation of proactive warming was the ability for the developer to intervene if necessary; the model was that you attach a Cachex.Warmer implementation to a cache at startup and trust that everything's okay. A major selling point of both Elixir and Erlang is the ability to attach to a running application via iex and poke around, so what if someone knew that their cached data was invalid and wanted to re-run a warmer?

In Cachex v4.0, this is now possible. The Cachex interface provides the new Cachex.warm/2, which lets a developer execute the warmers attached to a cache on demand. You can use this function to run all warmers attached to a cache, or only a subset as needed. It also supports running in the background or foreground via the :wait option:

# warm the cache
Cachex.warm(:cache)

# warm the cache and block until complete
Cachex.warm(:cache, wait: true)

# warm the cache, but only with specific warmers
Cachex.warm(:cache, only: [MyWarmer])

This is extremely helpful for things like evented cache invalidation (i.e. you can trigger a warming if you know something has changed) and debugging. Cachex's internal warming was also rewritten to use this exact same function under the hood, so there should be no surprising inconsistencies between automated warming and warming triggered this way.
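As a hypothetical sketch of the evented case (MyApp.PubSub, the topic name, and MyWarmer are all stand-ins, and Phoenix.PubSub is just one possible event source), a process can listen for changes and re-run a warmer in response:

# hypothetical: re-warm the cache whenever upstream data changes
defmodule MyApp.CacheRefresher do
  use GenServer

  def start_link(opts),
    do: GenServer.start_link(__MODULE__, opts)

  def init(opts) do
    # subscribe to whichever event source you use
    Phoenix.PubSub.subscribe(MyApp.PubSub, "catalog:updated")
    {:ok, opts}
  end

  def handle_info(:catalog_updated, state) do
    # re-run only the relevant warmer, blocking until it completes
    Cachex.warm(:cache, only: [MyWarmer], wait: true)
    {:noreply, state}
  end
end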

Sizing Restrictions

One area of Cachex I have been a little dissatisfied with has been the concept of a cache "policy", in particular the way I handled the Cachex.LRW module in the Cachex v3.x line. Fortunately I was able to revisit this and rework it inside Cachex v4.x without having to worry (too much) about maintaining the existing interface and API contract. There are a number of changes to go through here:

Cache Pruning

Cachex v4.x introduces the new Cachex.prune/3 function, which allows a developer to effectively "shrink" a cache to a given size. This function uses a Least Recently Written (LRW) style of pruning (where "written" includes "modified"), such that the oldest records are removed first. Let's take a look at pruning a cache down to a smaller size:

# start a new cache named :cache
{:ok, _pid} = Cachex.start(:cache)

# add 500 entries
for x <- 1..500 do
  # verify the entry is added properly
  {:ok, true} = Cachex.put(:cache, x, x)

  # wait 1 ms
  :timer.sleep(1)
end

# verify there are 500 entries
{:ok, 500} = Cachex.size(:cache)

# prune the cache down to 100 entries
{:ok, true} = Cachex.prune(:cache, 100)

# verify the cache size after
{:ok, 90} = Cachex.size(:cache)

You can see from the example above that the cache shrinks to 90 records instead of 100. What gives? Well, Cachex is trying to delay the next time the cache has to be pruned; if we trimmed to exactly 100 entries, the very next write would push us back over the limit! This constant need for pruning can cause heavy resource contention, so Cachex tries to strike a healthier balance.

The :reclaim option defines an extra portion of the key space to "reclaim" as an additional buffer. This setting defaults to 0.1 (10%), which is why we see an extra 10 entries pruned in the example above. Of course this is an optional behaviour, so setting this to 0 will cause the cache to prune to the exact size specified by the developer.
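If you do want an exact trim, the same prune run against the original 500-entry cache with the reclaim buffer disabled behaves as you'd expect:

# prune to the requested size with no additional buffer...
{:ok, true} = Cachex.prune(:cache, 100, reclaim: 0)

# ...which trims the cache to exactly 100 entries
{:ok, 100} = Cachex.size(:cache)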

Automated Pruning

In the v3.x line of Cachex, a cache supported a :limit option. This limit would be used to automatically restrict the size of a cache based on a "policy". This option has been removed in Cachex v4.0, replaced with some more general hooks you can add to your cache to call Cachex.prune/3 automatically. The hook you choose depends on your use case:

  • Cachex.Limit.Scheduled
    • Runs periodically to apply prune/3 on a cache
    • Will briefly allow your cache to grow beyond the max size
    • Very low overhead and resource usage when applying limits
    • Effective in most cases; a good default if you need one
  • Cachex.Limit.Evented
    • Runs in response to a cache modification action
    • Will never allow your cache to grow beyond the max size
    • Higher overhead and resource usage when applying limits
    • Very effective for caches with low write rates and/or tight memory constraints

Looking back at the manual pruning calls we ran in the previous section, we can now convert them over to an automated approach. This is done by registering our chosen hook (we'll use Cachex.Limit.Scheduled) at cache startup:

# include records
import Cachex.Spec

# start our cache
Cachex.start(:cache,
  hooks: [
    # maximum 100 entries, scheduled eviction, 10% reclaim
    hook(module: Cachex.Limit.Scheduled, args: {
      100,  # setting cache max size
      [],   # options for `Cachex.prune/3`
      []    # options for `Cachex.Limit.Scheduled`
    })
  ]
)

This operates in exactly the same way as calling Cachex.prune/3 periodically, so you can rest assured knowing your cache will continually trim itself down to size. By default this runs once every three seconds, but you can control it via the :frequency option in the third element of the arguments tuple.
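For example, slowing the schedule down to once every ten seconds looks something like this (the tuple layout mirrors the example above):

# as above, but pruning every ten seconds instead of the default three
hook(module: Cachex.Limit.Scheduled, args: {
  100,                               # setting cache max size
  [],                                # options for `Cachex.prune/3`
  [frequency: :timer.seconds(10)]    # options for `Cachex.Limit.Scheduled`
})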

Least Recently Used (LRU)

One of the features people have asked for over and over again is the ability to support Least Recently Used (LRU) limits, instead of the default LRW tooling. Cachex v4.0 finally introduces a very naive LRU implementation. This is done by providing an additional hook to work in tandem with the existing LRW hooks, which I feel is a great balance between reusing existing components and enabling new approaches.

To use the new LRU pruning in a cache, a developer defines an LRW hook (as shown above) and also attaches the new Cachex.Limit.Accessed hook. This hook updates the internal modification time of a cache entry on read access. The combination of these two hooks therefore provides us with basic LRU pruning:

# include records
import Cachex.Spec

# start our cache
Cachex.start(:my_cache,
  hooks: [
    hook(module: Cachex.Limit.Accessed),
    hook(module: Cachex.Limit.Scheduled, args: {
      500,  # setting cache max size
      [],   # options for `Cachex.prune/3`
      []    # options for `Cachex.Limit.Scheduled`
    })
  ]
)

For a long time, I was resistant to introducing LRU limits on a cache, as I believed that the hidden overhead in entry management during read operations would catch people off guard. With that being said, sometimes you simply need this approach, so I decided to stop being stubborn and include it in Cachex v4.0.

Although I caved on supporting LRU, I attempted to do so in a way that leads people to the best choice for their situation. The example above demonstrates that LRU support is something additional you layer on top of the LRW tooling. I'm not sure how this'll go, but my hope is that it will guide users away from LRU unless it's something they really need in their application.

Configurable Routing

Another major version of Cachex means another attempt at distributed caching, but this time I think I got it right (or at least close!).

The new v4.x strips out the previous implementation of the :nodes option, in favour of what I've chosen to refer to as "routers". A cache router gives the developer the ability to choose how keys are allocated to nodes inside an OTP application. The new Cachex.Router module aims to provide more flexibility to the developer, enabling them to choose the routing algorithm which best matches their use case.

Cachex ships with several routers included, in an attempt to handle the most common use cases out of the box. A cache router is a module which implements the Cachex.Router behaviour. The current set of included routers is as follows (at the time of writing):

  • Cachex.Router.Local
    • Routes keys to the local node only (the default)
  • Cachex.Router.Mod
    • Routes keys to a node using basic modulo hashing (i.e. hash(key) % len(nodes))
  • Cachex.Router.Jump
    • Routes keys to a node using the Jump Consistent hash algorithm
  • Cachex.Router.Ring
    • Routes keys to a node using Discord's hash ring implementation

Each of these routers has different strengths and weaknesses, so it's up to you to choose which best fits your use case. As a rule of thumb, Cachex.Router.Local is all you need for a single-node cache, Cachex.Router.Mod and Cachex.Router.Jump suit statically sized clusters, and Cachex.Router.Ring is the best fit when nodes may join and leave over time.

Once you know which router you want, you can specify it in your cache's options. Each router has its own options and configuration, which can be found in the appropriate module documentation. As an example, let's look at creating a Cachex.Router.Ring router:

# for records
import Cachex.Spec

# create a cache with a Ring router
Cachex.start(:my_cache, [
  router: router(module: Cachex.Router.Ring, options: [
    monitor: true
  ])
])

This will create a router based on ex_hash_ring, which (via :monitor) will automatically track the addition and removal of nodes in your cluster. This means that if you're using another library or tool to handle node registration in your application, this router will automatically (de)register those nodes for caching.

There is a lot to write about this topic, but rather than repeat it all here I recommend that those interested read through the documentation on Cache Routers and Distributed Caches. These pages include an example implementation of a custom router, as well as a more in-depth summary of how to use a router within a multi-node cache.

Documentation Overhaul

Last but definitely not least, the official Cachex documentation has had a much needed refresh. The ExDoc library that Hex uses for documentation has changed a lot since I first wrote Cachex!

The following changes and improvements have been made to the Cachex documentation:

  • Created a default "getting started" type landing page
  • Formatted pages into subheadings for different topics
  • Hidden all internal-only modules from the module listing
  • Linked features and modules more consistently between pages
  • Rewritten all documentation pages for Cachex v4.x (from scratch)

These changes should hopefully make things much clearer and easier to navigate for both new and existing users of Cachex. If you stumble across any documentation that is difficult to understand, incorrect, or even missing, please file an issue in the Cachex repository and it'll be addressed!

Thoughts and Feedback

Cachex v4.x is the result of only a few months of programming, but a couple of years of thought finally put into place. Complexity has been reduced both for the developers using Cachex and inside the library itself. Many old implementations have been revisited and brought forward, surfacing very old bugs and inefficiencies which have now been resolved. Going forward I'll be writing some more in-depth posts on several aspects of Cachex and how they have evolved over time; mainly for fun, but it might be worth checking out if you're interested in that type of content.

Cachex has always been my favourite personal project, even if it is a pretty small project in the grand scheme of things. I always look forward to improving it more and more, so if you find any bugs or have any feedback/suggestions (however small!), please do let me know. You can do this either in the repository or by contacting me directly, whatever works for you!