Our big project these days is a major overhaul of our map tile servers for our Weather Overlays API. We’re moving the entire codebase over to Node.js and using mapnik to generate tile images.
Performance and resource usage are major concerns. We need to generate images for more than 20 different weather data sets, some of which update as often as every 2 minutes. To keep our maps snappy and not break the bank, we need to do a really good job caching at every level of resource creation.
There are several great caching libraries out there, but we had trouble finding one that matched all of our requirements. Nothing we found quite fit the bill, so my colleague Seth Miller decided to roll his own instead, called CrispCache.
CrispCache can be installed via `npm`:

```shell
npm install --save crisp-cache
```
Let’s see how we could use CrispCache to get the current temperature using the Aeris Observations API. First, we’ll start with our business logic of fetching the data:
```javascript
// lib/services/current-temp.js
const request = require('request');

/**
 * @param {string} placeName
 * @param {bool} isMetric
 * @param {function(err:Error?, temp:number)} callback
 */
function currentTemp(placeName, isMetric, callback) {
  // Execute the HTTP request against the observations API
  request({
    url: `http://api.aerisapi.com/observations/${placeName}`,
    // Aeris API keys
    // Sign up for a free developer account: http://www.aerisweather.com/signup/
    qs: {
      client_id: 'my_client_id',
      client_secret: 'my_client_secret'
    },
    // Parse response as JSON
    json: true
  }, (err, response, json) => {
    if (err) {
      return callback(err);
    }

    // Resolve with temperature
    callback(null, isMetric ? json.response.ob.tempC : json.response.ob.tempF);
  });
}

module.exports = currentTemp;
```
Next, create a cached version of `currentTemp` using CrispCache. I'll often put these cached wrappers in a `services.js` module so I can easily swap out which implementation my app is using (e.g. for test mocks).

```javascript
// lib/services.js
const CrispCache = require('crisp-cache');
const MINUTE = 1000 * 60;

module.exports = {
  currentTemp: CrispCache.wrap(require('./services/current-temp'), {
    // Tell CrispCache how to generate a cache key from the
    // arguments provided to currentTemp()
    createKey: (placeName, isMetric) =>
      [placeName, isMetric ? 'C' : 'F'].join(','),
    parseKey: (key) => {
      const args = key.split(',');
      // Convert back to [placeName, isMetric]
      return [args[0], args[1] === 'C'];
    },
    // Our cache will be stale after 15 minutes, and expire in 30 minutes
    defaultStaleTtl: 15 * MINUTE,
    defaultExpiresTtl: 30 * MINUTE
  })
};
```

The great thing about `CrispCache.wrap()` is that it allows us to use our cached function just like we would the original function:
```javascript
// lib/index.js
const currentTemp = require('./services').currentTemp;

currentTemp('minneapolis,mn', false, (err, tempF) => {
  console.log(`It is currently ${tempF}°F in Minneapolis, MN`);
});
```

This means our implementation code can be entirely agnostic to the caching layer. It wouldn't be too much work to override our `currentTemp` service with a test mock, or even to disable caching entirely in a development environment.
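To make that concrete, here is a minimal, hypothetical sketch of swapping the service for a test stub. The `services` object below stands in for the `lib/services.js` module, and the canned temperatures are made up for illustration:

```javascript
// `services` stands in for the lib/services.js module object
const services = {
  // Normally this would be the CrispCache-wrapped currentTemp
  currentTemp: null
};

// In a test (or a caching-disabled dev environment), swap in a stub
// with the same (placeName, isMetric, callback) signature:
services.currentTemp = (placeName, isMetric, callback) =>
  callback(null, isMetric ? 22 : 72);

// Calling code is unchanged -- it neither knows nor cares that the
// cache (and the real API call) is gone:
services.currentTemp('minneapolis,mn', false, (err, temp) => {
  console.log(`It is currently ${temp}°F in Minneapolis, MN`);
});
```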
Stale vs. Expired Cache
“But wait a minute,” I hear you say, “why would I want to serve my users stale data? What happened to ‘crispy fresh’?” Let me explain what happens when a cache entry is stale, and you will see how this makes our application as crispy fresh as a head of iceberg lettuce in the springtime.
A cache entry may exist in one of four states: empty, valid, stale, or expired. When a value is requested from the cache, the cache’s behavior depends on the state of the matching cache entry:
| State | Cache Behavior |
|---|---|
| empty | Invokes the underlying function, resolves with the result, and saves the result to the cache. |
| valid | Resolves immediately with the cached value. The underlying function is not invoked. |
| stale | Resolves immediately with the cached value. Invokes the underlying function in the background, and saves the result for the next request. |
| expired | Removes the expired cached value. Invokes the underlying function, resolves with the result, and saves the result to the cache. |
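The four states above can be sketched in a few lines of plain JavaScript. This is not CrispCache's actual implementation — `getState`, `cachedGet`, and `makeEntry` are hypothetical names, and the TTLs are borrowed from the earlier config — just a minimal illustration of the decision table:

```javascript
// Each entry stores a value plus staleAt/expiresAt timestamps.
function makeEntry(value, staleTtl = 15 * 60000, expiresTtl = 30 * 60000) {
  const now = Date.now();
  return { value, staleAt: now + staleTtl, expiresAt: now + expiresTtl };
}

// Classify an entry into one of the four states from the table
function getState(entry, now = Date.now()) {
  if (!entry) return 'empty';
  if (now >= entry.expiresAt) return 'expired';
  if (now >= entry.staleAt) return 'stale';
  return 'valid';
}

function cachedGet(cache, key, fetchFn, callback) {
  switch (getState(cache[key])) {
    case 'valid':
      // Serve from cache; the underlying function is not invoked
      return callback(null, cache[key].value);
    case 'stale':
      // Serve the stale value immediately...
      callback(null, cache[key].value);
      // ...and refresh in the background for the next request
      return fetchFn(key, (err, value) => {
        if (!err) cache[key] = makeEntry(value);
      });
    case 'expired':
      delete cache[key];
      // fall through to a fresh fetch, just like an empty entry
    case 'empty':
      return fetchFn(key, (err, value) => {
        if (err) return callback(err);
        cache[key] = makeEntry(value);
        callback(null, value);
      });
  }
}
```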
Consider this example, in which we invoke `currentTemp()` at 3pm:
```javascript
// 3:00pm
currentTemp('minneapolis,mn', false, (err, temp) => {
  // Cache is empty
  // API request is executed, responding with ob.tempF=7°F
  // currentTemp() resolves with temp=7°F
});
```
Because this is our first request, this is a cache miss, which means we will have to wait a moment for the request to the Aeris Observations API to resolve. But if we try again at 3:01pm:
```javascript
// 3:01pm
currentTemp('minneapolis,mn', false, (err, temp) => {
  // Cache has a valid entry (temp=7°F)
  // So no API request is executed
  // currentTemp() resolves immediately with temp=7°F
});
```
The cache response is instantaneous because the API response was cached in memory. But what happens if we make another request at 3:16pm, one minute after our cache entry has gone “stale”:
```javascript
// 3:16pm
currentTemp('minneapolis,mn', false, (err, temp) => {
  // Cache has a stale entry (temp=7°F)
  // API request is executed in the background, responding with ob.tempF=12°F
  // but cache resolves immediately with temp=7°F
});
```
So even though our cache entry is stale, we still get an immediate response. But at the same time, we are firing off another request to the Aeris API in the background so that new data will be ready for our next request:
```javascript
// 3:17pm
currentTemp('minneapolis,mn', false, (err, temp) => {
  // Cache entry is valid, from our previous request (temp=12°F)
  // So no API request is executed
  // resolves immediately with temp=12°F
});
```
The result is that we are always providing the freshest data available, without making anyone wait for the data to be fetched.
On our map image servers, all of our map tile images are cached in memory. Configuring the `ttl` values for the image cache can be a little tricky. If I set it for an hour, will it devour all of the memory on my server? Can I squeeze more out of my server and cache for a little longer?
Wouldn’t it be nice if you could define the max memory usage of your caches and forget about it? Well here’s a config file pulled right out of our map image server:
```javascript
// config/cache.js
const KB = 1024;
const MB = KB * 1024;
const MINUTE = 1000 * 60;

module.exports = {
  tiles: {
    staleTtl: MINUTE * 30,
    expiresTtl: MINUTE * 60,
    maxSize: MB * 100
  }
};
```
As you can see, we've set a memory limit of 100 MB on our tile image cache. When we create our cache wrapper, we just need to reference the configured `maxSize`, and tell CrispCache how to determine an entry's size:
```javascript
// lib/services.js
const CrispCache = require('crisp-cache');
const cacheConfig = require('../config/cache.js');

module.exports = {
  getTile: CrispCache.wrap(require('./tile-source/get-tile'), {
    // Convert multi-arg function signature into a single cache key
    createKey: (z, x, y) => [z, x, y].join(','),
    parseKey: key => key.split(',').map(Number),
    // Set default ttls
    defaultStaleTtl: cacheConfig.tiles.staleTtl,
    defaultExpiresTtl: cacheConfig.tiles.expiresTtl,
    // Configure LRU cache
    maxSize: cacheConfig.tiles.maxSize,
    getOptions: tileImageBuffer => ({ size: tileImageBuffer.length })
  })
};
```
So what is an LRU cache? I'm glad you asked. LRU stands for Least Recently Used, and what it means is that the cache automatically removes entries as it approaches its `maxSize`, prioritizing the least popular entries for deletion.
Take the following example:
```javascript
// Create a function that resolves with whatever you pass it (for easy testing)
const identity = (val, cb) => cb(null, val);

const cachedFn = CrispCache.wrap(identity, {
  maxSize: 10,
  // Tell CrispCache to use `val.length` to determine
  // the cache entry's `size`
  getOptions: val => ({
    size: val.length
  })
});

cachedFn('foo', () => {});     // add 3 characters to cache (cache size = 3)
cachedFn('foo', () => {});     // retrieve 'foo' from cache (cache size = 3)
cachedFn('foo', () => {});     // retrieve 'foo' from cache (cache size = 3)
cachedFn('bar', () => {});     // add 3 characters to cache (cache size = 6)
cachedFn('shazaam', () => {}); // add 7 characters to cache (cache size = 13)
```

When we add `'shazaam'` to the cache, we have exceeded the cache's configured `maxSize`. As a result, CrispCache finds the entry which was least used (in this case `'bar'`) and removes it from the cache.

With the `maxSize` and `getOptions: () => ({ size })` configurations, we can rely on CrispCache to manage our memory usage for us.
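If you are curious what size-bounded eviction looks like under the hood, here is a minimal, hypothetical sketch. This is not CrispCache's internal implementation — in particular, it evicts strictly by recency of access, so its choice of victim can differ from CrispCache's in the example above — but it shows the core mechanics of tracking per-entry sizes and evicting until the cache fits:

```javascript
// Minimal size-bounded LRU sketch. A Map iterates in insertion order,
// so re-inserting an entry on access keeps the most recently used
// entries at the end and the eviction candidates at the front.
class SizeBoundedLru {
  constructor(maxSize) {
    this.maxSize = maxSize;
    this.size = 0;
    this.entries = new Map(); // key -> { value, size }
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    // Move the entry to the "most recently used" end
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value, size) {
    // Replacing an existing entry? Remove its old size first.
    if (this.entries.has(key)) {
      this.size -= this.entries.get(key).size;
      this.entries.delete(key);
    }
    this.entries.set(key, { value, size });
    this.size += size;
    // Evict least recently used entries until we fit under maxSize
    while (this.size > this.maxSize) {
      const [oldestKey, oldest] = this.entries.entries().next().value;
      this.entries.delete(oldestKey);
      this.size -= oldest.size;
    }
  }
}
```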
As a web developer, I am keenly aware of how much of my work is just a thin layer on top of existing open-source tools and platforms. So, I really enjoy the chance for our team to put something back out there for the community.
Give CrispCache a try the next time you're in the mood for some crispy fresh caching. It does a lot more than I've been able to cover here, so check out the docs. And while you're playing with it, open an issue, send a pull request, and we'll keep building on it.