Inside the secret lab where Facebook tries to save your battery life


Deep inside Facebook's very first data center, located in a sprawling facility in the hills of Prineville, OR, lies a series of about 60 server racks. Each one houses 32 smartphones, all of which are running a version of one of Facebook's many mobile apps.
The company calls this setup the Mobile Device Lab, and it's designed to test Facebook's software on older phones to discover whether any bit of new code, no matter how minor, results in a dip in performance or poorer battery life. For smartphone owners who tote around a two- or even three-year-old device, and for users in developing countries purchasing lower-cost devices for the first time, the lab is the very reason Facebook is still a viable home screen staple.


Antoine Reversat, part of Facebook's production engineering team, opens one of the racks for a group of reporters this week during the company's first Oregon data center tour in almost three years. Behind a large black metal door sit 32 iPhone 5C devices, all in the process of either scrolling through the News Feed, measuring how various operations affect lag and battery consumption, or rebooting to resume an identical state before running yet another test. Each server row holds more than a dozen racks, some with devices as old as the iPhone 4 and others housing newer Google Nexus 5s. In total, Facebook has almost 2,000 handsets used to tell developers when they've screwed something up, and whether that degradation is only noticeable on an older phone.

With Facebook serving more than 1.65 billion users around the world, taking into account every variation in device type, mobile operating system, and network condition has become an increasingly complex operation. Entire companies have built robust operations around testing mobile software in similar fashion, and some of those startups have been scooped up by big-name competitors. In 2014, Google bought San Francisco-based mobile app tester Appurify. Facebook's engineers, on the other hand, figured the company could do the job itself, especially considering it had the computing power and server rack space at its expansive Oregon data center.

The concept started at Facebook as a program called CT-Scan. The service, developed internally last year, tests any app changes to see how they affect products like Facebook Messenger and Instagram. It looks over any new code submitted by engineers and analyzes whether it has a negative impact on how the app utilizes phone memory, how fast users can scroll through a feed, and battery consumption.

CT-Scan worked great for engineers who kept a single device at their desks, but it could not cover an ever-growing number of mobile OS versions and device types. So Facebook's production engineering team, led in part by Reversat, decided to move away from single-device tests and even opted against relying on software simulation. "For example, we wouldn’t be able to track down a 1 percent performance regression in a simulator," Reversat explains. "So we opted for on-device testing."
It took Facebook quite a few iterations to arrive at the final design of its testing racks. First, it put together what it called the "sled," a metal rack that ended up interfering with Wi-Fi connections and rendering testing quite difficult. From there, the team moved to the "gondola," which was a 100-phone plastic rack that gave the devices' Wi-Fi connection some breathing room, but resulted in a convoluted nest of USB cables and extension cords.

After a few more tweaks, the Mobile Device Lab group had constructed the "slatwall," which could hold 240 phones and took up an entire room in the company's Menlo Park headquarters. But to get the device diversity it was looking for, Reversat says Facebook would have needed to replicate the slatwall across nine rooms, with space it couldn't necessarily afford. So starting in March 2015, the lab was moved to the Prineville data center, where it now sits in Facebook's Cold Storage facility.



To continuously push each new Facebook-owned app update to every phone, the team now uses eight Mac Minis for iOS racks and four Linux-based OCP Leopard servers for Android handsets. Each of the servers connects to an equal number of iPhones and Android devices, and each rack has its very own Wi-Fi network. The phones rest on a pegboard, so mounted cameras can record on-screen activity and give developers a remote recording of what's happening with each new build of an app that is installed, tested, and then uninstalled.

"When a developer makes a change, we build a new version of the app," Reversat says. "We then install it here on one of the devices and check that the change didn't introduce regressions." In software, a regression is a change that makes something that previously worked perform worse, and Facebook engineers worry that even the tiniest change in code could have unpredictable results in something as major as battery consumption.
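To make the idea of a regression check concrete, here is a minimal sketch in Python. It is not Facebook's code: the metric names, the sample numbers, and the 1 percent threshold (borrowed from Reversat's example) are all assumptions chosen for illustration. It compares repeated measurements from a baseline build against a candidate build and flags any metric that got worse by more than the threshold.

```python
from statistics import mean

# Assumed threshold: flag any metric that gets more than 1 percent worse.
REGRESSION_THRESHOLD = 0.01


def find_regressions(baseline, candidate, threshold=REGRESSION_THRESHOLD):
    """Return {metric: relative_change} for metrics that worsened past the threshold.

    Both arguments map a metric name to a list of repeated measurements.
    Every metric here is assumed to be "higher is worse".
    """
    regressions = {}
    for metric, base_samples in baseline.items():
        base = mean(base_samples)
        new = mean(candidate[metric])
        change = (new - base) / base  # relative change versus the baseline
        if change > threshold:
            regressions[metric] = change
    return regressions


# Hypothetical measurements from three test runs of each build.
baseline = {
    "memory_mb": [200.0, 200.0, 200.0],
    "battery_mah": [50.0, 50.0, 50.0],
    "dropped_frames": [10.0, 10.0, 10.0],
}
candidate = {
    "memory_mb": [200.0, 201.0, 199.0],   # same average as the baseline
    "battery_mah": [51.0, 52.0, 53.0],    # average is 4 percent higher
    "dropped_frames": [10.0, 10.0, 10.0],
}

print(find_regressions(baseline, candidate))
```

With these made-up numbers, only the battery metric is flagged. Averaging several runs is one simple way to keep run-to-run noise from hiding or faking a small change; a production system would likely need something more statistically careful.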

Notably, the Mobile Device Lab does not test for performance under various network speeds, which might have involved working with Facebook's separate Internet.org initiative. Internet.org focuses on connecting parts of the world with poor or nonexistent internet access, potentially using solar-powered drones. Reversat is quick to point out that although users in developing countries may be using older phones Facebook has tested in its lab, the tests are not designed to check and optimize Facebook software for slower internet connections. There aren't any plans to do so at the moment, he says.

Down the line, Reversat says the Mobile Device Lab will expand from 32 phones per rack to 64. There is also a slew of other hardware and software improvements planned to accommodate phones with larger screens, none of which Facebook has tested. There are also areas of inefficiency. For instance, putting an iOS device in a reliable testing state requires a 20-step manual procedure, Reversat says. His team wants to reduce that process to one step.

Facebook also plans to open source the hardware design of its mobile device testing racks and the software it uses to put each phone back into its testing state. The entire process remains part of the Open Compute Project, the open-source community Facebook founded in 2011 to share data center equipment and software design. That way, Reversat says, any new app, whether it's made by Facebook or not, can be tested to make sure owners of older phones are not left behind.

