Within the next few days, some applications and mashups based on the Twitter API may behave unpredictably or even crash – at least that’s the warning given by Canadian software company WhereCloud’s Twitpocalypse website. This impending “Twitpocalypse,” much like the famous Y2K bug of 2000, is based on a data processing limitation.
Every tweet in Twitter’s system is uniquely identified by an integer value. For example, the system’s very first public tweet, “just setting up my twttr,” by Twitter founder Jack Dorsey, is tweet number 20 (presumably tweets 0 through 19 were used for testing). In most database applications, the maximum value of a signed 32-bit integer is 2,147,483,647. This is a huge value, but the accelerating popularity of Twitter means that the number of tweets is rapidly approaching this limit. If third-party application developers haven’t designed their Twitter clients to store tweet IDs using something like the less restrictive unsigned 64-bit integer data type, users might start seeing strange errors, such as tweets listed in the wrong order – or worse, applications not working at all.
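To make the failure mode concrete, here is a minimal Python sketch of what can happen when a tweet ID one past the signed 32-bit maximum is forced into a 32-bit field. The helper name store_as_int32 is hypothetical and only simulates a wrapping 32-bit column or variable; real clients may fail differently (for example, by raising an error or truncating the value).

```python
import ctypes

INT32_MAX = 2**31 - 1  # 2,147,483,647, the largest signed 32-bit value


def store_as_int32(tweet_id: int) -> int:
    """Hypothetical helper: simulate a signed 32-bit field that wraps on overflow."""
    return ctypes.c_int32(tweet_id).value


# Two consecutive tweet IDs straddling the limit.
ids = [INT32_MAX, INT32_MAX + 1]

# The second ID wraps around to a large negative number.
stored = [store_as_int32(i) for i in ids]
print(stored)

# Sorting by the stored value now puts the newer tweet before the older one.
stored_order = sorted(ids, key=store_as_int32)
print(stored_order)
```

A client that sorts its timeline by the wrapped value would show the newer tweet first, which is one way the "wrong order" symptom could arise.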
Current.com blogger James O’Malley recently posted this message explaining the Twitpocalypse’s potential for programmatic mayhem:
“Twitter will probably have thought about this in advance, but the worry is the twitter clients: the tweetdecks, the twhirls, the tweeties and the twitterfoxes of this world: have they prepared? Can they count to 2,147,483,648? If not, things could get messy.”
Is the Twitpocalypse a real problem for third party applications using the Twitter API, or is WhereCloud’s warning a simple publicity stunt that will fizzle out with the same whimper as the Y2K bug? A real Twitpocalypse seems unlikely, as it would mean that a large number of Twitter client developers have carelessly designed their systems with insufficient data limits. If anything, the Twitpocalypse is a lesson in implementing programming best practices for developers who use third party APIs. Standards-based web services like the Twitter API may be robust and reliable, but no API can guarantee that the data structures used in an application have been well designed.
How does the Twitter API team feel about an impending Twitpocalypse? Twitter API developer Doug Williams recently posted this tweet to the API’s official feed:
“A friendly reminder: we’re nearing the http://www.twitpocalypse.com/. Ensure you are storing status_ids as unsigned integers.”
and followed it up less than an hour later with this tweet:
“…we’re using a unsigned 64-bit bigint internally to store status_ids. You should, too.”
Meanwhile, in the span between Doug’s two posts, over 450,000 other tweets were added to the system, bringing users that much closer to microblogging doom.
Even if most developers are prepared to handle the ever-growing number of tweets, they may have to deal with another Twitpocalypse in the future – the very distant future. MySQL’s unsigned BIGINT tops out at 18,446,744,073,709,551,615. To put that value into perspective, every single one of the 6.8 billion people on Earth would have to post 2.7 billion tweets each to reach this limit. Twitter is exploding in popularity, but at this point, it looks like there won’t be another Twitpocalypse for quite some time.
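The per-person figure can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the variable names are illustrative, and the population figure is the 6.8 billion estimate quoted above.

```python
UINT64_MAX = 2**64 - 1            # 18,446,744,073,709,551,615
WORLD_POPULATION = 6_800_000_000  # estimate quoted in the article

# Tweets each person would need to post to exhaust unsigned 64-bit IDs.
tweets_per_person = UINT64_MAX // WORLD_POPULATION

print(f"{UINT64_MAX:,}")
print(f"{tweets_per_person:,} tweets per person")
```

The result lands at roughly 2.7 billion tweets per person, which matches the estimate above.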