SMT Solver on iPhone
- Transfer
Why buy an expensive PC if your iPhone solves SMT faster?
The problem of the satisfiability of formulas in theories (satisfiability modulo theories, SMT) is the problem of solvability for logical formulas taking into account the theories underlying them. - Wikipedia
A few days ago I tweeted : “A curious experiment: on the new iPhone, the Z3 prover is faster than on my (rather expensive) Intel desktop. It's time to transfer all the formal research methods to the phone. ”
I read about the incredible progress that Apple developers have made , and that Macs will soon be transferred to Apple 's own ARM processors . These reports usually refer to some cross-platform tests, such as Geekbench, to demonstrate that Apple's mobile processors are not inferior to Intel's mobile and desktop processors. But I was always a bit skeptical about these cross-platform tests (as well as others ) - do they really reflect the speed of the actual tasks for which I use my Macs?
As a formal method researcher, I regularly have to run an SMT solver, usually the Z3 prover . I spent a lot of time studying the performance characteristics of the Z3. It has some features that are not counted in tests (Z3 is usually single-threaded). I recently bought a new iPhone XS with the latest A12 processor from Apple. And somehow, having nothing to do, I decided to compile Z3 in iOS and see how fast the new phone works (or a hypothetical future Mac).
The cross-compilation of Z3 was surprisingly simple, you only need to change a few lines of code. I posted the sources for running Z3 on your own iOS device . For the test, I took a few queries from my recent work on profiling symbolic calculations : for each case, the SMT output generated by Rosette is extracted .
In the first test, I compared the iPhone XS with one of the desktops that runs on the Intel Core i7-7700K - the best Intel chip for the consumer market at the time when I collected the car 18 months ago. Intel was supposed to win without any problems, but it turned out otherwise:
In this 23-second test, the iPhone XS was about 11% faster! I tweeted about this, but twitter does not leave much room for details, so I’ll present them here:
How is this possible? Core i7-7700K is the same desktop processor. On a single-threaded task, it consumes about 45 watts and operates at a frequency of 4.5 GHz. On the other hand, an unplugged iPhone. It probably does not consume 10% of this power and works (we hope) somewhere in the 2 GHz band. Moreover, after a comparative test, I checked the report on the use of the iPhone battery: it said that Slack used 4 times more energy than the Z3 application, despite the shorter time on the screen.
Apple does not provide enough information to understand the performance of the Z3 on the iPhone, but fortunately, Intel provides this information for its processor. I rummaged in VTune for a while to find performance bottlenecks when running Z3 on the desktop. As noted by Mait Soos, most of the time the SAT solver spends on distribution , which is very cache-sensitive . VTune agrees and says that Z3 spends a lot of time waiting in memory when iterating through the observed literals. Thus, the key to performance seems to be cache size and memory latency. This effect may explain why the iPhone is so strong in this test: the A12 chip has a giant L2 cache with low latency , and also seems to have better memory latency after a cache miss compared to 7700K.
To confirm the results, I conducted a more extensive experiment, collecting all the Apple devices I could get. I also chose about 10 times longer benchmark (i.e., 4 minutes on the desktop) to allay concerns about the spikes in mobile CPU performance.
Here are the results for these devices (with their release dates) regarding the A7, Apple's first 64-bit user processor:
Immediately it should be noted that the i7-7700K desktop processor is superior to the iPhone XS on this longer test. But the iPhone is incredibly competitive, showing results between the i7-7700K and its predecessor, the i7-6700K, which was the fastest consumer desktop processor a little less than two years ago.
As a joke, I added another processor Core m7-6Y75 from my MacBook 2016. In the Z3 test, my phone is about 50% faster than a laptop.
The really remarkable thing here is the trend: a fairly consistent improvement of 30% per year for this benchmark Z3. Obviously, one should not draw far-reaching conclusions from one stupid test, but it seems that after a couple of iterations, Apple processors will become quite suitable for workloads. 2. I honestly did not expect that we are already so close: the modern architecture of smartphones are incredible!
Thanks to Megan Cowan , Max Wills and Eddie Jan for helping us test on other devices.
1 . Max noticed that the iPhone is waterproof, so the theory can be tested by immersing it in an ice bath. But I paid a lot of money for the phone and do not want to voluntarily conduct such an experience. ↑
2 . I bet that the A12X in the new iPad Pro is even faster due to the larger thermal envelope that the tablet gives. ↑
The problem of the satisfiability of formulas in theories (satisfiability modulo theories, SMT) is the problem of solvability for logical formulas taking into account the theories underlying them. - Wikipedia
A few days ago I tweeted : “A curious experiment: on the new iPhone, the Z3 prover is faster than on my (rather expensive) Intel desktop. It's time to transfer all the formal research methods to the phone. ”
I read about the incredible progress that Apple developers have made , and that Macs will soon be transferred to Apple 's own ARM processors . These reports usually refer to some cross-platform tests, such as Geekbench, to demonstrate that Apple's mobile processors are not inferior to Intel's mobile and desktop processors. But I was always a bit skeptical about these cross-platform tests (as well as others ) - do they really reflect the speed of the actual tasks for which I use my Macs?
As a formal method researcher, I regularly have to run an SMT solver, usually the Z3 prover . I spent a lot of time studying the performance characteristics of the Z3. It has some features that are not counted in tests (Z3 is usually single-threaded). I recently bought a new iPhone XS with the latest A12 processor from Apple. And somehow, having nothing to do, I decided to compile Z3 in iOS and see how fast the new phone works (or a hypothetical future Mac).
First test
The cross-compilation of Z3 was surprisingly simple, you only need to change a few lines of code. I posted the sources for running Z3 on your own iOS device . For the test, I took a few queries from my recent work on profiling symbolic calculations : for each case, the SMT output generated by Rosette is extracted .
In the first test, I compared the iPhone XS with one of the desktops that runs on the Intel Core i7-7700K - the best Intel chip for the consumer market at the time when I collected the car 18 months ago. Intel was supposed to win without any problems, but it turned out otherwise:
In this 23-second test, the iPhone XS was about 11% faster! I tweeted about this, but twitter does not leave much room for details, so I’ll present them here:
- This benchmark is a fragment
QF_BV
from SMT, so Z3 solves this part with the help of bit-blasting and SAT-solver. - The result is quite stable, even if you run the cycle ten times: the iPhone supports this performance and does not seem to start to slow down due to overheating. one. Nevertheless, the benchmark is still quite fleeting.
- Several people asked if this was due to non-determinism. Perhaps on different platforms the solver goes a different way due to the use of random numbers or for another reason? But I rather thoroughly checked the detailed issue of the Z3, and the results can hardly be explained by this.
- Both systems ran Z3 4.8.1, which I compiled using Clang with the same optimization settings. I also ran tests on the i7-7700K with ready Z3 binaries (which are compiled by GCC), but they are even slower.
What's happening?
How is this possible? Core i7-7700K is the same desktop processor. On a single-threaded task, it consumes about 45 watts and operates at a frequency of 4.5 GHz. On the other hand, an unplugged iPhone. It probably does not consume 10% of this power and works (we hope) somewhere in the 2 GHz band. Moreover, after a comparative test, I checked the report on the use of the iPhone battery: it said that Slack used 4 times more energy than the Z3 application, despite the shorter time on the screen.
Apple does not provide enough information to understand the performance of the Z3 on the iPhone, but fortunately, Intel provides this information for its processor. I rummaged in VTune for a while to find performance bottlenecks when running Z3 on the desktop. As noted by Mait Soos, most of the time the SAT solver spends on distribution , which is very cache-sensitive . VTune agrees and says that Z3 spends a lot of time waiting in memory when iterating through the observed literals. Thus, the key to performance seems to be cache size and memory latency. This effect may explain why the iPhone is so strong in this test: the A12 chip has a giant L2 cache with low latency , and also seems to have better memory latency after a cache miss compared to 7700K.
The rapid progress of Apple processors
To confirm the results, I conducted a more extensive experiment, collecting all the Apple devices I could get. I also chose about 10 times longer benchmark (i.e., 4 minutes on the desktop) to allay concerns about the spikes in mobile CPU performance.
Here are the results for these devices (with their release dates) regarding the A7, Apple's first 64-bit user processor:
Immediately it should be noted that the i7-7700K desktop processor is superior to the iPhone XS on this longer test. But the iPhone is incredibly competitive, showing results between the i7-7700K and its predecessor, the i7-6700K, which was the fastest consumer desktop processor a little less than two years ago.
As a joke, I added another processor Core m7-6Y75 from my MacBook 2016. In the Z3 test, my phone is about 50% faster than a laptop.
The really remarkable thing here is the trend: a fairly consistent improvement of 30% per year for this benchmark Z3. Obviously, one should not draw far-reaching conclusions from one stupid test, but it seems that after a couple of iterations, Apple processors will become quite suitable for workloads. 2. I honestly did not expect that we are already so close: the modern architecture of smartphones are incredible!
Thanks to Megan Cowan , Max Wills and Eddie Jan for helping us test on other devices.
1 . Max noticed that the iPhone is waterproof, so the theory can be tested by immersing it in an ice bath. But I paid a lot of money for the phone and do not want to voluntarily conduct such an experience. ↑
2 . I bet that the A12X in the new iPad Pro is even faster due to the larger thermal envelope that the tablet gives. ↑