Learning points from writing my own JS/CSS caching system
This post covers what I learned from writing my own caching system, Cache-N-Crunch.
Why write your own caching system?
A while ago I wanted to minify the JavaScript and CSS on a website I worked on. The obvious solution would be to use one of the many caching libraries.
I wanted to take advantage of merging all the files together and delivering a minified asset set.
However, there was one problem: the second developer didn’t want to use anything that wouldn’t automatically flush the cache. They typically developed directly on the main server, which doubled as their testing server, so updating assets while testing had to be effortless.
Many of the caching layers I looked at required a manual cache clear, a step that generated the production-ready assets. This wouldn’t be acceptable to them.
Many of them also offered a development mode, which continually regenerates the assets during development. But again, having to run a command before they could work was not acceptable.
So eventually I decided that I would learn the intricacies of caching and write one myself.
Compressing the assets
I wrote two small PHP wrappers around UglifyCSS and UglifyJS, which allowed my caching library to minify the assets. This reduced the remaining logic to deciding whether the cache needed to be invalidated and regenerated.
It also meant I benefitted from a pretty thorough compression program that was already used by many thousands of people.
Fulfilling the requirement of automatic compression
The primary requirement was that if a file has changed, it needs to be automatically compressed and minified for production. It also needs to be immediately reflected when devving. To achieve this I had two sets of logic, one for development and one for production use.
Logic for production mode
- If a minified asset is available, serve that.
- If no minified asset is available, serve all assets that make up the minified asset directly.
This logic means that by default we will serve the minified assets. However if it isn’t available then it will serve the full set of assets that go into the minified final asset.
No check for whether the files had changed was performed in production mode: if a minified file existed, it was assumed to be the latest version. This kept production fast, as deciding what to serve was a single O(1) existence check. Keeping the minified asset up to date is the job of the dev mode logic.
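The production rules can be sketched in a few lines. This is a minimal illustration, not the actual Cache-N-Crunch code: the function name `resolveProductionAssets`, the bundle shape and the file paths are all assumptions made for the example.

```javascript
const fs = require('fs');

// Hypothetical bundle shape (an assumption for this sketch):
//   { minified: 'cache/site.min.js', components: ['js/a.js', 'js/b.js'] }
// `exists` is injectable so the decision can be exercised without touching disk.
function resolveProductionAssets(bundle, exists = fs.existsSync) {
  // One existence check, with no hashing and no comparison against the sources.
  if (exists(bundle.minified)) {
    return [bundle.minified];
  }
  // No minified asset yet: fall back to every component file, unminified.
  return bundle.components.slice();
}
```

The returned list is what the page would link to: either the single minified file or the full set of component files.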
Logic for dev mode
- When loading an asset, if one of its components has changed, minify the asset again.
- Once the asset has been fully minified, link to all component assets directly.
In dev mode, every time the page is loaded the assets are checked and minified if they have changed. This also means that the production assets are kept up to date as they are being worked on.
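A rough sketch of the dev mode flow is below. The names `resolveDevAssets`, `hasChanged` and `rebuild` are hypothetical; the change check and the minification step are passed in as functions so the sketch makes no claim about how the real library implements them.

```javascript
// Hypothetical dev-mode resolver (illustrative only).
//   hasChanged(path) -> boolean : stand-in for the real change check
//   rebuild(bundle)             : stand-in for re-running the minifier
function resolveDevAssets(bundle, hasChanged, rebuild) {
  // Regenerate the minified asset if any component differs from last time,
  // so the production copy stays current while the files are being edited.
  if (bundle.components.some(hasChanged)) {
    rebuild(bundle);
  }
  // Always link the individual source files, so edits show up immediately.
  return bundle.components.slice();
}
```

The developer always sees the unminified sources, and the minified asset is refreshed as a side effect of loading the page.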
Determining if files have changed
To determine whether the files had changed, I kept track of the hash of every file that went into a minified asset and compared it against the current hash of that file when running in development mode.
This kept the checks simple and relatively fast, as running a simple hashing function over the files was quick. It did, however, mean that formatting changes with no effect on the code would still cause the asset to be regenerated.
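As an illustration of the idea, here is a minimal sketch of hash-based change detection. The choice of SHA-256, the manifest shape (path mapped to hash) and all function names are assumptions for the example; the post does not say which hash function or storage format was used.

```javascript
const crypto = require('crypto');
const fs = require('fs');

// SHA-256 is an assumption; any stable content hash would do here.
function hashContent(content) {
  return crypto.createHash('sha256').update(content).digest('hex');
}

function hashFile(path) {
  return hashContent(fs.readFileSync(path));
}

// Record the hash of every component at the moment the asset is minified.
// `hash` is injectable so the logic can be exercised without real files.
function buildManifest(paths, hash = hashFile) {
  const manifest = {};
  for (const p of paths) {
    manifest[p] = hash(p);
  }
  return manifest;
}

// Return the components whose current hash differs from the recorded one.
// A path missing from the manifest counts as changed.
function changedFiles(paths, manifest, hash = hashFile) {
  return paths.filter((p) => manifest[p] !== hash(p));
}
```

Because the hash covers the raw bytes, adding a single space to a file is enough to make it show up in `changedFiles`, which matches the formatting caveat above.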
Side effects of merging files
Currently the files are merged together and minified as a group; however, this has some side effects. The main one I noticed was that, since the files are merged, “use strict” caused issues.
While some files were intended to run in strict mode, not all of them worked in it. A “use strict” directive at the top of a script applies to the whole script, so once a strict mode file sat at the start of the merged output, every file merged after it was run in strict mode too.
This actually caused me to rework a number of the files running in strict mode. Amusingly, this problem has also been found on major websites. Blindly merging files together therefore carries some risk, depending on how they are written!
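The strict mode problem is easy to reproduce. The sketch below uses Node's built-in `vm` module to run two made-up files as one merged script; the file contents and the helper names `runMerged` and `wrapInFunction` are hypothetical. Wrapping each file in its own function is a common mitigation, shown here for comparison; it is not what I did, as I reworked the files instead.

```javascript
const vm = require('vm');

// Two made-up source files for the demonstration.
const strictFile = '"use strict";\nvar total = 1;';
const sloppyFile = 'counter = 1;'; // implicit global: only legal outside strict mode

// Merge the files by plain concatenation and run them as a single script.
// Returns the name of the thrown error, or null if the script ran cleanly.
function runMerged(files) {
  try {
    vm.runInNewContext(files.join('\n'));
    return null;
  } catch (err) {
    return err.name;
  }
}

// Wrap a file in a function so its directive only covers its own code.
function wrapInFunction(source) {
  return '(function () {\n' + source + '\n})();';
}

console.log(runMerged([strictFile, sloppyFile])); // ReferenceError
console.log(runMerged([strictFile, sloppyFile].map(wrapInFunction))); // null
```

The wrapper has a cost of its own: top-level `var` and function declarations become local to the wrapper, so files that rely on sharing globals would need to export them explicitly.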
Smarter optimisation operations
While minifying does reduce the size somewhat, I found something more interesting while researching.
Webpack performs a technique called tree shaking on the source. Here the source is statically analysed to work out what functions and variables are needed by each asset. During the minification process all unnecessary elements are removed from the final source.
This has significant advantages when used with ES2015 modules, and it is something my caching system did not implement. It could have been an improvement point; however, the code was not ES2015 compatible, so it likely would not have brought any benefit to that website.
A conclusion on what I have learnt
Caching is a hard problem and you will run into unexpected problems when trying to roll your own. There are people much smarter than I who have written much more useful systems so always use a pre-built one if you can!
If you do want to write your own, it’s great fun, although it can be a little mentally taxing! Testing with lots of different files helped when writing it, as did having a codebase with a lot of different JavaScript files to test against.
This was certainly something that I enjoyed doing, but I still prefer using a pre-built minification system like Webpack.