Roblox experienced the worst kind of Halloween trick this weekend, but kids everywhere are now in for a treat: Roblox is back online. The game-creating platform was restored on Sunday afternoon, after being dark for more than two days.
In a blog post, Roblox founder/CEO David Baszucki apologized for the lengthy delay in restoring the service, blaming several factors for the outage.
Roblox users began to face problems on Thursday. According to Baszucki, a core system in Roblox’s infrastructure became overwhelmed, “prompted by a subtle bug in our backend service communications while under heavy load.” He made sure to clarify that the problem was “not due to any peak in external traffic or any particular experience. Rather the failure was caused by the growth in the number of servers in our datacenters. The result was that most services at Roblox were unable to effectively communicate and deploy.”
Roblox needed to make that distinction after users speculated that the shutdown was caused by an overload of users taking part in a Chipotle promotion. The restaurant chain had begun a Halloween event on Roblox involving a giveaway of $1 million worth of free burritos shortly before the blackout began.
“Due to the difficulty in diagnosing the actual bug, recovery took longer than any of us would have liked,” Baszucki wrote. “Upon successfully identifying this root cause, we were able to resolve the issue through performance tuning, re-configuration, and scaling back of some load.”
Baszucki said Roblox will share a post-mortem with more details as the company completes its analysis of what happened. For now, the company doesn’t believe any users have lost any data. “Your Roblox experience should now be fully back to normal,” he wrote.
Roblox attracts more than 200 million active users monthly, on devices including iOS, Android, tablets, computers, Xbox, Oculus Rift and HTC Vive. Daily active users spend an average of 156 minutes per day with the game.
Here is Baszucki’s full blog post:
As most of the Roblox community is aware, we recently experienced an extended outage across our platform. We are sorry for the length of time it took us to restore service. A key value at Roblox is “Respect the Community”, and in this case we apologize for the inconvenience to our community.
On Thursday afternoon October 28th, users began having trouble connecting with our platform. This immediately became our highest priority. Teams began working around the clock to identify the source of the problem and get things back to normal.
This was an especially difficult outage in that it involved a combination of several factors. A core system in our infrastructure became overwhelmed, prompted by a subtle bug in our backend service communications while under heavy load. This was not due to any peak in external traffic or any particular experience. Rather the failure was caused by the growth in the number of servers in our datacenters. The result was that most services at Roblox were unable to effectively communicate and deploy.
Due to the difficulty in diagnosing the actual bug, recovery took longer than any of us would have liked. Upon successfully identifying this root cause, we were able to resolve the issue through performance tuning, re-configuration, and scaling back of some load. We were able to fully restore service as of this afternoon.
We will publish a post-mortem with more details once we’ve completed our analysis, along with the actions we’ll be taking to avoid such issues in the future. In addition, we will implement a policy to make our creator community economically whole as a result of this outage. There are more details on this to come. As part of our “Respect the Community” value we will continue to be transparent in our post-mortem.
To the best of our knowledge there has been no loss of player persistence data, and your Roblox experience should now be fully back to normal. You can always contact our support team if you experience any hiccups using Roblox now or in the future.
We are grateful for the patience and support of our players, developers, and partners during this time.