A bit of history
Ever since I joined Stack Exchange in 2010, I've been the developer tasked with implementing the April 1st happenings on Stack Overflow and the other Stack Exchange sites. In fact, even before I was hired, I had a large part in the 2010 April Fools' gag, where for a day all users' avatars were replaced with random unicorns, created by my unicorn generator Unicornify, which itself has its roots on the Stack Exchange network.
My first April Fools' project as an employee, in 2011, provided unicorn animations whenever you upvoted or downvoted a post. Those unicorns were again created with Unicornify's rendering engine, which I had tweaked to allow me to animate the unicorns.
In 2013 we introduced Chat with an expert, a chatterbot based on the principles of ELIZA, but with responses that were customized to problem-solving situations instead of psychiatry clichés, and with some additional more sophisticated features. The “expert” bot, codenamed Adviza (both a play on the word “advisor” and a contraction of “advanzed Eliza”), was not really very helpful, though it seemed convincing to some.
For 2014, we offered our own virtual currency called Unicoins, allowing users to mine coins and buy fairly useless power-ups.
And now we're in 2015, and this is the story of this year's April 1st feature, StackEgg. I say “feature” because it wasn't really a gag or prank with an even remote chance of being considered a serious new feature; rather, it was a game that offered a little bit of fun for the two days it existed.
Every year, around the beginning of March, somebody announces that “it's time to start thinking about April 1st”. This year, if I recall correctly, Laura was the first to bring it up. We had a Trello card on the Core Team board to drop ideas on. One of the first comments was from Jon, who suggested
MMO Pacman - regular users gain fake rep as they eat dots on the map. Fruits are badges. Mods/high-rep users can be ghosts. Stackman! Pacoverflow! Websockets!
Well, good thing we didn't end up going there, because Google did almost the same thing this year. I then dropped a comment with an idea that I had, although I couldn't yet see turning it into an April 1st thing:
I had a similar idea to Jon: A Tamagotchi that you have to keep alive for the day. (Probably not going to happen, but we're just brainstorming here.)
Turns out that the idea stuck anyway. During the next weekly team meeting (this was on March 9th), we created a Google doc that contained the basic idea at the top and then a brainstorming section for everyone to drop ideas in. Here's how the doc started:
Every site gets one Tamagotchi which represents the site. Your goal is to keep your ‘site’ alive & healthy together. Mechanics don’t have any effect on the site, but rather metaphorically represent a site (feed it with upvotes, clean up the poop via review, etc.)
Actions are named after site actions and roughly correspond to the things that make a site healthy,
e.g. upvotes = feeding (make it happier but fatter), close votes (or review actions) = clean up the poop (if you don’t it gets sick)
Note: these are not actual actions, just “themed” options in the UI. It’s all a metaphor, man.
Users vote together on the next action to take. Every X seconds (30?) the top-voted action is taken
Here are some examples of brainstorming ideas that did not actually make it into the final game (I only had three weeks after all):
Tamagotchis have a branching evolution path, based on the care they receive
So more Digimon than Tamagotchi
E.g. evolve from a pony to a unicorn or maybe to a narwhal
Give hints on the console
different looks depending on the reputation level. high rep users see a much cooler version. Eg: LED -> grayscale -> 8bit color -> true color
Every X minutes (30?) two site Tamagotchis are paired up and fight (“A challenger appears”)
Research and prototype
And so I started working on it. The first thing I did was installing a Tamagotchi app on my phone as a refresher of how the real thing worked. I did own an actual first-generation Tamagotchi back in the 90s, but that's been a while.
Then I began trying to figure out how stats and actions would work concretely. I attempted to model it after the real evolution of various Stack Exchange sites, looking at site analytics and creating tons of Excel spreadsheets that simulated various different models (including one in which I invented three numerical indicators – Engagement, Maturity, and Relevance – based on actual sites, which was not really helpful at all).
It still took a while until I ended up with game dynamics that I felt made sense, including a breakdown into game phases of increasing difficulty and increasing amounts of available stats and actions. These phases, based on the actual Stack Exchange site “life cycles”, were private beta, public beta, and launched (later renamed to graduated), with the final winning state dubbed “winning the internet”. On March 18th, I gave this playable prototype to my coworkers to try, and the feedback was very positive, and thus the core game logic was born. Except for some very small tweaks in numbers, the game at this point was identical to what actually ended up being played on April 1st.
And so I ported the prototype to C#. You can check out the details yourself; I published the core game on GitHub after April 1st was over. At the same time (we're on March 23rd now), Marc was extremely helpful in creating the backend implementation of recording and storing the voting data (the core game is a single-player game, but what really happened on April 1st was that many users at once played the game by voting on the next action to be taken), and also of creating the infrastructure for repeatedly evaluating the votes after each voting round. We piggybacked on the heartbeat functionality in one of our backend services called StackServer. This “heart” beats every ten seconds, and on every third heartbeat it would then evaluate the past voting round, for a round length of thirty seconds (we later reduced this to twenty).
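As a rough illustration of the heartbeat-driven evaluation described above (not the actual StackServer code), here is a minimal Python sketch. The class and method names are hypothetical, and the tie-breaking rule (the first action voted for wins a tie) is an assumption.

```python
from collections import Counter

HEARTBEAT_SECONDS = 10   # from the post: the "heart" beats every ten seconds
BEATS_PER_ROUND = 3      # every third beat closes a round (30 seconds)


class RoundScheduler:
    """Hypothetical helper: counts heartbeats and evaluates votes once per round."""

    def __init__(self, beats_per_round=BEATS_PER_ROUND):
        self.beats_per_round = beats_per_round
        self.beat_count = 0
        self.votes = Counter()  # action name -> votes in the current round

    def vote(self, action):
        self.votes[action] += 1

    def on_heartbeat(self):
        """Return the top-voted action when a round ends, otherwise None."""
        self.beat_count += 1
        if self.beat_count % self.beats_per_round != 0:
            return None  # round still running
        # most_common keeps insertion order for ties (an assumed tie-break)
        winner = self.votes.most_common(1)[0][0] if self.votes else None
        self.votes.clear()  # start the next round with an empty tally
        return winner
```

Piggybacking on an existing heartbeat means no extra timer is needed: the round length is simply the heartbeat interval times the number of beats per round.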
I created a wrapper class (StackEggGameWrapper) that held a core game object (the StackEggGame), but that also stored and handled things that were not part of the core game, like the time the current voting round will end, the quorum required during this voting round, handling phases in which the actual game was no longer active (like after winning), and the current message that would be displayed in the UI.
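The split between core game and wrapper can be pictured roughly as follows. This is a Python sketch and not the published C# code: every field and method name is made up for illustration, and the behaviour when the quorum is missed (nothing happens that round) is an assumption.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CoreGame:
    """Stand-in for the single-player core game; it only records actions."""
    history: list = field(default_factory=list)

    def apply(self, action: str) -> None:
        self.history.append(action)


@dataclass
class GameWrapper:
    """Hypothetical wrapper holding everything that is not core game state."""
    game: CoreGame
    round_ends_at: float   # timestamp at which the current voting round ends
    quorum: int            # minimum number of votes needed this round
    message: str = ""      # text the UI would display
    active: bool = True    # False once the game is over, e.g. after winning

    def evaluate_round(self, votes: Counter, now: float,
                       round_length: float) -> Optional[str]:
        self.round_ends_at = now + round_length  # schedule the next round
        if not self.active:
            return None
        if sum(votes.values()) < self.quorum:
            # Assumption: a round without quorum is simply skipped.
            self.message = "Not enough votes this round."
            return None
        action = votes.most_common(1)[0][0]
        self.game.apply(action)
        self.message = f"The community chose: {action}"
        return action
```

Keeping the quorum, timing, and UI message outside the core game is what lets the core stay a plain single-player game.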
We stored all the game state in Redis, serialized via ProtoBuf. That way we didn't need to make any database changes for data that we only needed for a few days, not to mention the fact that it's also blazingly fast – because Redis itself is very fast, and also because our Redis infrastructure has built-in webserver-local caching so that most of the time it didn't even have to call out to the Redis server.
Finally, I made sure that the current state of the game would both be available via an AJAX route, and be regularly broadcast over websockets, so the UI has something to display. I was able to easily use our existing websocket infrastructure (which we use to give realtime updates about new questions and similar things), so this required very little work.
Turning the clock back a few days, on March 20th I also started working on the client, that is, the popup in which you played the game. I usually don't get to design any major new features (we have an awesome design team that does these things a lot better than I ever could), so here was a welcome opportunity to create something new. I started by making this mockup in Inkscape, which looks pretty close to what the final thing ended up like:
You'll notice that the primary color here was orange, and it actually stayed orange until a few hours before StackEgg launched, when I decided to go with something that looked a little less Stack Overflow-y.
The image of the keychain toy with the LCD in it was a quick Blender hack. I'm an amateur in Blender, and if you look closely (and especially if you're a Blender pro) you'll notice a lot of issues, but I think it turned out fine for the purpose.
On March 25th the client was in a state where I could give it to my coworkers to try it out. It worked well, and the most important feedback was that 30-second voting rounds were just too long, and so we reduced the time to 20 seconds.
From the beginning it was clear that we needed tiny pixelated LCD animations as a nod to the real Tamagotchi. I decided to use a 32x16 pixel screen with three colors, “light” (the greenish LCD background), “dark” (the blueish “LCD on” color), and “medium”, in the middle between the two. Having a third color in addition to the classic on-and-off-only made it a bit easier to draw somewhat recognizable very-low-resolution images. It also came down to a nice two bits per pixel, with the fourth value being transparent.
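To make the two-bits-per-pixel arithmetic concrete: a 32x16 frame has 512 pixels, so at four pixels per byte it fits into 128 bytes. The Python sketch below shows one way to pack and unpack such a frame. The numeric values assigned to the colors and the bit order within a byte are assumptions, not the format StackEgg actually used.

```python
WIDTH, HEIGHT = 32, 16

# The post names three colors plus transparent; this numbering is assumed.
LIGHT, MEDIUM, DARK, TRANSPARENT = 0, 1, 2, 3


def pack_frame(pixels):
    """Pack WIDTH*HEIGHT two-bit pixel values into bytes, four per byte."""
    if len(pixels) != WIDTH * HEIGHT:
        raise ValueError("expected exactly WIDTH * HEIGHT pixels")
    out = bytearray()
    for i in range(0, len(pixels), 4):
        a, b, c, d = pixels[i:i + 4]
        # First pixel goes into the two highest bits (assumed bit order).
        out.append((a << 6) | (b << 4) | (c << 2) | d)
    return bytes(out)


def unpack_frame(data):
    """Inverse of pack_frame: expand each byte back into four pixel values."""
    pixels = []
    for byte in data:
        pixels.extend((byte >> shift) & 0b11 for shift in (6, 4, 2, 0))
    return pixels
```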
Because I wanted to! Creating a programming language is fun, after all.
The animations were smaller that way, although admittedly chances are that gzip would've eaten up a lot of the disadvantage.
I needed some way to abstract away pauses. In my little language I could just write wait 300 to pause for 0.3 seconds until the next animation step. But of course this had to be translated into asynchronous execution when the animation was playing. And since
Being constrained by this simple language prevented me from being too fancy with the animations, which wouldn't have been fitting for the idea of a tiny keychain LCD.
I tried hard not to make the language Turing-complete, but I failed in the end when I added the GotoTimesVar instruction, which can be abused to create a conditional. The source code of the simplest animation, the one where the StackEgg just idled back and forth on the screen when nothing else was there to be displayed, looked like this:
    var eggx 4
    var eggy 2
    setxy eggx eggy
    label walking
    label loopright
    clear
    inc eggx
    setx eggx
    picxy egg
    picxy eggeyes_right
    wait 300
    gototimes 11 loopright
    picxy eggeyes_down
    wait 300
    label loopleft
    clear
    dec eggx
    setx eggx
    picxy egg
    picxy eggeyes_left
    wait 300
    gototimes 11 loopleft
    picxy eggeyes_down
    wait 300
    goto walking
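For readers curious how such a program might be executed, here is a small interpreter sketch in Python. It is not the original implementation, and the instruction semantics are guessed from the listing: in particular, it assumes gototimes N label jumps back N times and then falls through, and that picxy draws at the position last set by setxy or setx. Writing it as a generator is one way to get the asynchronous behavior mentioned above: the interpreter suspends at every wait, and whoever drives it decides how to pause.

```python
def run(source):
    """Yield ('clear',), ('draw', pic, x, y) and ('wait', ms) events."""
    program = [line.split() for line in source.splitlines() if line.strip()]
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    variables = {}
    counters = {}  # gototimes instruction index -> jumps still to make
    x = y = 0
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "var":
            variables[args[0]] = int(args[1])
        elif op == "inc":
            variables[args[0]] += 1
        elif op == "dec":
            variables[args[0]] -= 1
        elif op == "setxy":
            x, y = variables[args[0]], variables[args[1]]
        elif op == "setx":
            x = variables[args[0]]
        elif op == "clear":
            yield ("clear",)
        elif op == "picxy":
            yield ("draw", args[0], x, y)
        elif op == "wait":
            yield ("wait", int(args[0]))  # the caller pauses, then resumes us
        elif op == "goto":
            pc = labels[args[0]]
            continue
        elif op == "gototimes":
            remaining = counters.get(pc, int(args[0]))
            if remaining > 0:
                counters[pc] = remaining - 1
                pc = labels[args[1]]
                continue
            counters.pop(pc, None)  # reset so the loop works again next time
        elif op != "label":
            raise ValueError(f"unknown instruction: {op}")
        pc += 1
```

A player would iterate over run(source), draw each event, and schedule the next step with a timer whenever it receives a wait event. Because the walking animation ends in goto walking, its event stream never ends, so the consumer has to decide when to stop.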
I have created a page on which you can see all animations that existed.
Approaching the deadline
The last few days before the launch consisted mostly of finishing up small but important details like
user settings, so fun-haters could disable the StackEgg, and also so that we could remember whether the user has ever interacted with the popup (before that, we would not animate the StackEgg widget in the sidebar; our very strict “no animated ads” policy counts even for things like this),
copywriting – there needed to be a help text,
compiling a list of example places for all full-hour timezones, in order to display “it is currently such-and-such time on April 1st in such-and-such place”, pre-empting user complaints that it's not April 1st yet (or anymore) where they live – because of the collaborative nature of the game, we had decided to enable StackEgg for everybody for the whole time it was April 1st anywhere in the world, and not, like in the previous years, only while it was April 1st for the particular user,
adding my library ByTheWay to enable multiple browser tabs that a user has open on a site to share data that comes in, reducing the amount of necessary server communication,
creating the leaderboard page that ranks all Stack Exchange sites by their StackEgg performance,
creating missing animations,
and general polishing, tweaks, and fixes.
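The "April 1st anywhere in the world" rule from the list above can be checked with a few lines of date arithmetic. The Python sketch below assumes that the relevant full-hour offsets run from UTC-12 to UTC+14 (the extremes currently in use), which gives a 50-hour window. The function names are hypothetical, and the place names shown to users are left out.

```python
from datetime import date, datetime, timedelta, timezone

# Assumed range of full-hour UTC offsets: UTC-12 up to and including UTC+14.
OFFSET_HOURS = range(-12, 15)


def offsets_on_april_first(now_utc):
    """Return the UTC offsets (in hours) where the local date is April 1st."""
    target = date(now_utc.year, 4, 1)
    return [
        hours for hours in OFFSET_HOURS
        if (now_utc + timedelta(hours=hours)).date() == target
    ]


def is_april_first_somewhere(now_utc):
    """True while at least one full-hour time zone is still on April 1st."""
    return bool(offsets_on_april_first(now_utc))
```

For example, passing datetime(2015, 3, 31, 10, 23, tzinfo=timezone.utc) yields only the UTC+14 offset, which matches the idea of showing users one place where it is already April 1st.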
Finally, and a little bit too late (it was already 23 minutes after midnight on April 1st in Samoa), I enabled StackEgg for everyone. And it was noticed quickly: Within minutes, there were over 60 users playing the game on Stack Overflow.
And then Stack Overflow went down.
It didn't go down with a bang; rather we noticed that page loads were becoming slower and slower until the site was unresponsive most of the time. You can read a bunch of details in the post-mortem on our status blog, but it's missing one detail (and deliberately so, because we believe in blameless post-mortems): the fact that the outage was completely my fault.
In my quest to add as little work as possible to the page request (because we want our pages to render as quickly as possible), I achieved quite the opposite. My idea was to load the initial data that StackEgg needed after the page itself had been rendered (because the page itself is what the user actually came for), and thus I loaded that initial data via an AJAX request.
What I didn't consider is that when you make an AJAX request so close after the page load, the connection between the client and the server is kept alive, because it looks like there may be more requests coming. An AJAX request shortly after page load also didn't strike me as unusual, because I see them all the time on Stack Overflow. But of course that's because I'm a developer, and those AJAX requests are for things like MiniProfiler; page views by normal users don't initiate any immediate AJAX requests.
And so all those persistent connections quickly exhausted the maximum that was configured in our load balancer, and things went south.
In addition, I made two other mistakes: For one, there was no way to tell existing clients to stop polling the server for data (most updates came over websockets, but there was still some AJAX polling). Lesson learned here: If you have recurring AJAX requests in the page, have a kill switch so the server can tell the client to stop, because people keep pages open a lot (which, of course, they are not to blame for at all) even when not actively looking at them anymore.
Funny thing: There actually was a way we could have done it, and we utilized it later, but in the stress of the immediate situation it didn't come to my mind at first. Remember how I
to not make the request if the URL matched a certain pattern?
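The kill-switch lesson can be sketched in a few lines. This Python version only models the client-side logic; the response field name stop_polling and the Poller class are invented for illustration, and a real page would call poll_once from a timer.

```python
class Poller:
    """Hypothetical polling client that obeys a server-side kill switch."""

    def __init__(self, fetch):
        self.fetch = fetch    # callable that performs the request, returns a dict
        self.enabled = True   # flips to False for good once the server says stop
        self.updates = []

    def poll_once(self):
        """Make one request unless disabled; return True if polling continues."""
        if not self.enabled:
            return False      # no request is sent at all anymore
        response = self.fetch()
        if response.get("stop_polling"):
            self.enabled = False
            return False
        self.updates.append(response.get("data"))
        return True
```

The important property is that once the flag has been seen, pages that stay open in a forgotten tab stop generating requests without needing a reload.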
Thanks to the awesome work of our sysadmin team (special thanks to Nick and Kyle), and through a few quickfixes and then later more thorough fixes in both StackEgg's server and client code, the crisis was averted quickly, and people could finally enjoy two days of playing StackEgg. In total, almost 460,000 votes were cast by more than 15,000 users, and the internet was won 454 times across the network.
And I could finally relax after three weeks. Don't get me wrong, this was a very cool project to work on. But it was also nice when it was finally done.