JavaScript concurrency and locking the HTML5 localStorage

March 05, 2012

The specification says there is no problem

Concurrency in JavaScript is not an issue. According to the HTML5 specification,

There must be [...] at most one event loop per unit of related similar-origin browsing contexts.

Most simply, this means that in the JavaScript that's executing in a single page, no two functions are executing at the same time. If a function foo() is running for five seconds, I can click on the page as often as I want during those five seconds; the click handler will not be fired until after foo() has returned.

Even better, this also holds for the JavaScript in all pages that have access to each other's data, e.g. an IFRAME, its parent, and another window that this parent opened via window.open() – they still share an event loop.

That's why the following code doesn't have to worry about a change that another thread would make to elem.innerHTML between the test and the assignment:

var elem = document.getElementById("foo");
if (/good/.test(elem.innerHTML))
    elem.innerHTML = "better";

Even though the DOM may be accessed by lots of different pieces of JavaScript, the event loop model guarantees that this shared memory isn't suddenly changed by someone else while you're in the middle of working with it.

If two pieces of JavaScript run in different event loops, there are two possibilities:

All good! No threading issues in JavaScript. Happy end of story. … Right?

Enter localStorage.

It's awesome. If you're carful.

Supported in all modern browsers (even in IE8), the HTML5 localStorage is an awesome way to store per-domain data in the user's browser, for example for caching, draft-saving, and lots of other useful stuff. The localStorage is shared by all pages coming from the same domain, so your web app only has to load all the user's settings from the server once, namely when they open a page for the first time, and store them locally. On subsequent pageviews, you can just pull the data from the localStorage.

All good! Store data locally, without the need for cookie hacks. Happy end of story. … Well, you probably know where I'm going.

The localStorage of course is a piece of memory that is shared among different event loops. Several browser windows that do not know about each other, and thus may have different event loops, still use the same localStorage if they come from the same domain.

The following is not safe:

function storeAnotherNumber(n) {
    var value = localStorage["myKey"];  // (1)
    if (value)
        value += "," + n;
    else
        value = n;
    localStorage["myKey"] = value;      // (2)
}

Between retrieving the value at (1) and storing the new value at (2), another piece of JavaScript could have changed the content of localStorage["myKey"]. These changes will be overwritten.

In my testing, this issue actually occured in Chrome and in Opera, while (at least in my simple tests) Firefox, Internet Explorer and Safari did not show this problem. I don't know whether this is because the test pages were using the same event loop, or the browser was holding the storage mutex for a longer time, or for different reasons.

Now, if you're careful, the chance of this happening is pretty low. On the Stack Exchange Chat, we have for a long time been using the localStorage as a way for several chat room tabs the user may have opened to communicate with each other, so that only one of the tabs has to talk to the server, and can pass the received data on to the other tabs. This communication is also used for a few small other niceties; e.g. when a user closes a notification in one tab, this is communicated to the other tabs, so they can close this notification as well.

Know what the other ones are up to

This cross-tab communication happens via a JSON-serialized array of message objects that is stored in the localStorage under a fixed key. When one tab wants to broadcast a new message, it pulls the current content from the storage, deserializes it into an array, appends the new message object, re-serializes it, and puts it back into the storage (that last step causes a storage event to be fired in all open tabs from that domain, so they know to look for new data).

Sound familiar? Yep, that's precisely the storeAnotherNumber thing from above.

So, how do we make sure that two tabs broadcasting a message at the same time don't overwrite each other's changes?

Most importantly, by making this strictly fire-and-forget. There's only broadcasting of messages. We never ever send responses or confirmations. If one browser tab would send a message saying "Hey, anyone else there?", and three other tabs immediately answer "Yes, I'm here", these three responses happen so close to each other that the chance for overwriting messages increases.

Another thing is that the messages are only broadcast when some event happens; usually a click or an AJAX request. The chance of two of these happening in such a short succession (and we're talking really short; it's not like storing a new message takes very long) is extremely low.

However, this cross-tab communication has been working great, and in order to improve it and utilize it in even more places, a back-and-forth communication (i.e. the responses I ruled out above) would be nice to have.

That's why today, I took a look at locking algorithms, and whether I could have a chance to make this retrieve-and-resave loop not be dangerous, and I actually managed to get something working. You can find the (yet very undocumented) code in this Bitbucket repository.

Finally some code!

It uses Algorithm 1 from this 1985 (!) paper by Leslie Lamport in order to make an atomic check-and-set on a mutex.

This algorithm has the huge advantage that it doesn't require any process to know how many other processes there are accessing the same data, as long as each one has its own unique ID. I create these IDs from both a time stamp and a random number; since this happens on page load (or rather, when the JS file is executed), the chances of collisions are purely theoretical.

The disadvantage of this algorithm is that it has a timing requirement. In case of a lock contention, it makes the assumption that the time spent in the critical section has a fixed upper bound. But since the only thing we do in the critical section is check and set the mutex lock owner, I considered 40ms to be a very generous bound.

The usage is pretty simple:

LockableStorage.lock("myKey", function () { storeAnotherNumber(42); });

Assuming that all the code (on this domain) writing to localStorage["myKey"] uses this locking functionality, there should be no overwriting of changes. If you're only reading, you don't need to lock (the specification guarantess the atomicity of both reads and writes).

The function LockableStorage.lock takes a string (the localStorage key), and a callback function that is called asynchronously once the lock is aquired (it will be released again when this function exits).

There is a third (optional) argument maxDuration. This is the time, in milliseconds, that the callback function guarantees not to take longer than. It defaults to 5000 (you're hopefully not executing blocking functions that take longer than 5 seconds, are you?). This is necessary because a browser may close the page at any time. If this happens while there's a lock on a localStorage key, the lock will never be released. That's why locking has a maximum duration, so it can't block forever. Think of the classic action movie quote “If I'm not back in ten minutes, assume I'm dead” and its variants.

Once again, here's the link: bitbucket.org/balpha/lockablestorage. Just for the record: This is very experimental, I was just playing around, the code is very undocumented so far, and I have no idea what I'm talking about when it comes to locking algorithms. But I hope you still find it interesting.


previous post: jQuery script insertion and its consequences for debugging

next post: An unexcited look at browser sniffing

blog comments powered by Disqus