If you pay attention to phone launches, you may have noticed an unusual trend with the latest bumper crop from Apple, Samsung, and other OEMs. According to many reviewers, battery life for new devices has regressed overall. The exact figures vary depending on what kind of tests you run and which devices you compare, but in many cases, 2018 devices aren’t matching the performance of earlier models.
The Washington Post, which recently covered the issue, published the results of its own Wi-Fi browsing tests.
In its tests, the iPhone 8 Plus beat the iPhone XS by an hour, and the Pixel 2 outperformed the Pixel 3 by nearly 90 minutes. This flies in the face of the marketing spiel from these various companies and contradicts the general perception that each new generation of products is more power-efficient than the last. But the Post also notes that the trend isn’t universal and that Consumer Reports, which uses a very different battery testing methodology, has seen exactly the opposite results. According to CR, the XS and XS Max are both huge improvements over the iPhone X. Why the huge disparity? The Post doesn’t really address it, but we’re going to.
Different Methodologies, Different Results
Most reviews test battery life by running a looped test (typically either a series of webpage loads or video playback) until the battery dies. Review sites use different mixes of web pages and video codecs, so you’ll still see variation from site to site as a result. The majority of sites also calibrate brightness to a specific value, typically between 150 and 200 nits.
Consumer Reports does something very different. It writes:
To find out how long a phone’s battery can go, Consumer Reports uses a robotic finger programmed to put the phone through a range of tasks designed to simulate a consumer’s average day.
The robot browses the internet, takes pictures, uses GPS navigation, and, of course, makes phone calls. For the sake of consistency, we run all smartphone battery tests with the display set to 100 percent brightness. But if you want to extend battery life on your phone, it’s useful to turn on the auto or adaptive brightness function, which independently adjusts the display to suit the lighting environment. (Emphasis added.)
I emphasized the 100 percent brightness setting because I want to illustrate that there are two significant points of difference in play rather than just one: the mix of tasks and the display brightness. Just for the heck of it, I charged my iPhone SE to 100 percent and attached it to the same Kill-A-Watt meters we use for system power consumption measurements. With the iPhone SE set to minimum brightness, the amount of power drawn at the wall was a steady 1.3-1.4W. With the brightness set to maximum, power consumption hit 1.8-1.9W. That’s roughly a 1.35x increase. I don’t claim to know how the OLED panels on newer iPhones respond to shifts in brightness, but we can safely assume that this difference in configuration also matters to overall results. Nonetheless, CR’s results are quite different from the Post’s, with the iPhone XS lasting 24.5 hours and the XS Max 26 hours, compared with the iPhone X’s 19.5-hour run time.
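For readers who want to check the arithmetic, here is a minimal Python sketch using only the figures quoted above. The helper names (`ratio_bounds`, `percent_longer`) are made up for this illustration, and the wall-power readings are a rough proxy that includes charger losses, so treat the output as a ballpark rather than a measurement of the display itself.

```python
# Wall-power readings quoted above for the iPhone SE, in watts (low, high).
MIN_BRIGHTNESS_W = (1.3, 1.4)
MAX_BRIGHTNESS_W = (1.8, 1.9)

# Consumer Reports run times quoted above, in hours.
CR_RUNTIME_H = {"iPhone X": 19.5, "iPhone XS": 24.5, "iPhone XS Max": 26.0}


def ratio_bounds(low_range, high_range):
    """Return the (smallest, midpoint, largest) ratio of high_range to low_range."""
    smallest = high_range[0] / low_range[1]
    largest = high_range[1] / low_range[0]
    midpoint = (sum(high_range) / 2) / (sum(low_range) / 2)
    return smallest, midpoint, largest


def percent_longer(new_hours, old_hours):
    """Return how much longer new_hours is than old_hours, as a percentage."""
    return (new_hours / old_hours - 1) * 100


smallest, midpoint, largest = ratio_bounds(MIN_BRIGHTNESS_W, MAX_BRIGHTNESS_W)
print(f"Max vs. min brightness: {smallest:.2f}x to {largest:.2f}x "
      f"(midpoint {midpoint:.2f}x)")

baseline = CR_RUNTIME_H["iPhone X"]
for phone in ("iPhone XS", "iPhone XS Max"):
    gain = percent_longer(CR_RUNTIME_H[phone], baseline)
    print(f"{phone}: {gain:.1f}% longer than the iPhone X in CR's test")
```

The spread between the smallest and largest ratio is a reminder that a meter reading to one decimal place leaves a fair amount of uncertainty in a figure like 1.35x.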
Why Phone Battery Tests Look the Way They Do
There are some specific reasons why review sites tend to test batteries the way they do. First, using a single test like playing a video or loading the same suite of web pages establishes a standard metric that can be used across phones over a period of several years. While it’s true that very few people will literally run a device from 100 percent to 0 percent doing just one activity, batteries also don’t discharge at a constant speed. Testing from 100 percent to 0 percent allows the reviewer to avoid any errors caused by extrapolating total discharge time from a partial measurement.
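To see why a partial measurement can mislead, consider a toy model in Python. The drain rates below are invented purely for illustration and do not describe any real phone; the function names are likewise hypothetical. The sketch compares a full 100-to-0 run against an estimate scaled up linearly from only the first 20 percent of the discharge.

```python
# Hypothetical drain rates for each band of reported charge:
# (start percent, end percent, percent of charge lost per hour).
# These numbers are invented for illustration only.
DRAIN_BANDS = [
    (100, 80, 8.0),
    (80, 20, 11.0),
    (20, 0, 16.0),
]


def full_runtime_hours(bands):
    """Total time to go from 100 percent to 0 percent under the model."""
    return sum((start - end) / rate for start, end, rate in bands)


def extrapolated_runtime_hours(bands, stop_at):
    """Time the drop from 100 percent to stop_at, then scale linearly to 0."""
    elapsed = 0.0
    for start, end, rate in bands:
        if stop_at >= start:
            break  # the partial test ended before reaching this band
        lower = max(end, stop_at)
        elapsed += (start - lower) / rate
    return elapsed * 100 / (100 - stop_at)


actual = full_runtime_hours(DRAIN_BANDS)
estimate = extrapolated_runtime_hours(DRAIN_BANDS, stop_at=80)
print(f"Full run: {actual:.1f} h, estimate from first 20 percent: {estimate:.1f} h")
```

Under these made-up rates the shortcut overstates run time by roughly a third, which is the kind of error a full discharge run avoids.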
Second, phones are sandboxed in ways that PCs and laptops aren’t. On the PC, you can use a benchmark harness to run a suite of applications to model real-world usage, with pauses and idle periods to simulate the time spent waiting on the end user. This kind of cross-application scripting isn’t available on phones, at least not under normal operating conditions. Consumer Reports’ robotic finger gets around that problem, though the company presumably has to calibrate it for each and every test device.
Third, there’s an astonishing variation in how people use phones and the conditions they use them in. Do you use Wi-Fi or primarily rely on LTE? Do you use an ad blocker? Do you use apps or prefer mobile browsers for access to sites like Facebook? What brightness level do you set? What sites do you visit? During our own conversation on this topic, my colleague Ryan Whitwam pointed out that nobody actually just sits and browses the web from 100 percent down to 0 percent. I agree. On the other hand, iOS reports that Safari accounts for 51 percent of my personal battery power consumption over the past 7 days, followed by Netflix (17 percent) and Amazon Prime Video (7 percent).
Fourth, the companies with the best data on how customers use their products (Google, Qualcomm, Samsung, etc.) do not share this information with the press beyond the most general terms. When a company launches a new SoC, it’ll sometimes show improvements in two or three specific areas like video playback, standby time, and web browsing, but it never shares all of its data. Companies that make broad statements about improved all-day battery life will tell you that their measurements take a variety of usage patterns into account, but they won’t unpack exactly what those usage patterns are, what assumptions they make, or where the improvements come from.
Fifth, keep in mind that draining the battery of a device can take hours, and no reviewer can sit for 8-10 hours performing exactly the same set of motions on 5-6 devices in a row. Modern reviewing does not lend itself to such leisurely evaluation schedules, even if you could find someone willing to sit through it and capable of staying on task. Battery testing must be automated, and if you don’t have access to a nifty robot arm, that’s going to mean relying on scripted or looped tests.
Finally, there are a number of subtle factors that could sway the final results without a simple “right” configuration option. Should you configure the phone for push notifications and frequent email downloads, or minimal notifications and only manual data loads? Should Bluetooth and Wi-Fi be enabled or disabled? Should you allow low battery mode to kick in when the manufacturer offers to activate it or run the device in full-power mode until it dies? That’s just a handful of the relevant potential options and there’s no “wrong” choice — just a different usage model.
Why Battery Life Improvements Are Difficult to Predict
Once upon a time (the late 1990s), CPUs ran at one constant clock speed and new process nodes always delivered lower-power CPUs and higher clocks. The world has moved on. A modern smartphone performs a power-management juggling act orders of magnitude more complex than that of a laptop circa 2001. Because node shrinks can no longer be counted on to deliver regular improvements, ARM and Intel have poured huge amounts of money into developing sophisticated silicon capabilities that turn off sections of a chip when not in use and lower clocks to the minimum values needed for responsive operation, and they have even deployed specialized low-power, high-efficiency CPU cores to balance performance and power efficiency.
Instead of performing most functions in software, SoCs have integrated multiple accelerators to handle the same tasks in a fraction of the power envelope. On the software side of the equation, Google and Apple have established best practices to minimize power consumption as well and improved their OS schedulers to be aware of which workloads ought to run on which cores for maximum efficiency.
But even as companies have made these improvements, they’ve boosted core counts, clock speeds, GPU performance, screen resolutions, screen size, cellular performance, and Wi-Fi speeds. The push and pull between the higher power consumption required to drive increased performance and technology-driven efficiency improvements plays out differently in different areas. One reason I like the iPhone SE so much is that it married a still relatively high-end SoC with a much more power-efficient display. The result was a phone with better battery life in browsing tests than any other Apple device in its class.
This new mélange of hardware acceleration, software best practices, scheduler improvements, low-power CPU cores, and improved power gating has unquestionably yielded benefits, but it hasn’t yielded them equally across the entire playing field. That’s why CR’s director of electronics testing, Maria Rerecich, says “You can’t make a straight trend,” when asked if battery lives are increasing or decreasing. It depends, overwhelmingly, on where you look. It depends on how you test. It depends on what you care about and what your use case actually is. And unfortunately, that means the trend lines are likely to continue pointing in both directions.