How eBay Built a Barcode Scanner on WebAssembly
Ever since it was announced, WebAssembly has attracted the attention of front-end developers. The web community enthusiastically embraced the idea of running code in the browser that was written in languages other than JavaScript. Above all, WebAssembly promises consistent, near-native speed, well beyond what JavaScript can reliably deliver.
Our engineers closely followed the development of the standard. As soon as WebAssembly 1.0 support was implemented in all major browsers, the developers immediately wanted to try it out.
But there was a problem. Although many applications benefit from WebAssembly, its use in e-commerce is still in its infancy. We could not immediately find a suitable use case. There were a few suggestions, but in every one of them JavaScript turned out to be the better choice. When we evaluate new technologies at eBay, the first question is: “What is the potential benefit for our customers?” If the answer is not clear, we do not proceed to the next step. It is very easy to get carried away with a fashionable new technology even if it makes no difference to customers and only complicates the existing workflow. User experience always matters more than developer experience. WebAssembly was different, though. The technology has huge potential; we just could not find the right use case. In the end, we did.
Barcode Scanner
The native eBay apps on iOS and Android have a UPC barcode scanning feature that fills in the form automatically. It works only in the apps and requires intensive on-device image processing to recognize the barcode digits in the image stream from the camera. The resulting code is then sent to a server-side service, which in turn fills out the form. This means that the image processing logic on the device must be very efficient. For the native apps, we compiled our own C++ library into native code for iOS and Android. It recognizes barcodes exceptionally well. We are gradually moving to the native APIs in iOS and Android, but our C++ library is still reliable.
The barcode scanner is an intuitive feature for sellers, and it makes filling out the form much easier. Unfortunately, this feature was not available on the mobile version of the site, so sellers had to enter the UPC manually, which is inconvenient.
Web Barcode Scanner
We had looked for a way to scan barcodes on the web before. Two years ago, we even released a prototype based on the open source JavaScript library BarcodeReader. The problem was that it worked well only in 20% of cases. The remaining 80% of the time the scanner was extremely slow or did not work at all; in most cases it timed out. This is to be expected: JavaScript can match native code in speed only when it is on a “hot path”, that is, when the code has been heavily optimized by the engine’s JIT compilers. The catch is that JavaScript engines use numerous heuristics to decide whether a path is “hot”, with no guarantee of the outcome. This inconsistency obviously frustrated users, and we had to disable the feature. But now things are different. With the rapid development of the web platform, the question arose again: “Is it possible to implement a reliable barcode scanner on the web?”
One option is to wait for the Shape Detection API, which provides built-in image detection features, including barcode detection. But these interfaces are still at a very early stage of development and far from cross-browser compatibility. Even then, support on all platforms is not guaranteed. So we had to consider other options.
This is where WebAssembly comes into play. If the barcode scanner is implemented in WebAssembly, we can count on it to perform consistently. WebAssembly’s strong typing and bytecode structure allow execution to stay on the “hot path” at all times. In addition, we already had a C++ library for the native apps, and C++ libraries are ideal candidates for compilation to WebAssembly. We thought the problem was solved. It turned out that it was not quite.
Architecture
The architecture of the working prototype of the WebAssembly barcode scanner was pretty simple:
- Compile the C++ library with Emscripten. This produces the JavaScript glue code and the .wasm file.
- Start a worker thread from the main thread. The worker’s JavaScript imports the generated glue code, which in turn instantiates the .wasm file.
- The main thread sends a snapshot from the camera stream to the worker, which calls the corresponding Wasm API through the glue code. The API response is passed back to the main thread. The response is either a UPC string (which is passed to the backend) or an empty string if no barcode was detected.
- For an empty response, the previous step is repeated until a barcode is detected. This loop runs until a timeout, configured in seconds, is reached. Once the threshold is reached, we display the warning message “Invalid product code. Try a different barcode or text search.” Either the user did not point the camera at a real barcode, or the scanner is not effective enough. We track timeout statistics as an indicator of scanner quality.
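The retry-until-timeout logic in the last two steps can be sketched as a small piece of main-thread state. This is only an illustration under assumptions: the names `ScanOutcome` and `ScanSession` are hypothetical, the clock is passed in by the caller, and the real implementation may be organized quite differently.

```typescript
// Hypothetical names throughout: ScanOutcome and ScanSession are
// illustrative and not taken from eBay's code.
type ScanOutcome =
  | { kind: "retry" }                // empty response: send the next frame
  | { kind: "found"; upc: string }   // barcode decoded: pass it to the backend
  | { kind: "timeout"; message: string };

const TIMEOUT_MESSAGE =
  "Invalid product code. Try a different barcode or text search.";

class ScanSession {
  private readonly deadlineMs: number;

  // startMs and timeoutSeconds come from the caller, so any clock can drive
  // the loop (for example Date.now() in a browser).
  constructor(startMs: number, timeoutSeconds: number) {
    this.deadlineMs = startMs + timeoutSeconds * 1000;
  }

  // Called on the main thread each time the worker answers for one frame.
  // An empty string means no barcode was detected in that frame.
  onWorkerReply(upc: string, nowMs: number): ScanOutcome {
    if (upc !== "") {
      return { kind: "found", upc };
    }
    if (nowMs >= this.deadlineMs) {
      return { kind: "timeout", message: TIMEOUT_MESSAGE };
    }
    return { kind: "retry" };
  }
}
```

On `retry` the main thread would post the next camera frame to the worker; on `timeout` it would show the message and record the event for the timeout statistics.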
WebAssembly Workflow
Compilation
The first step in any WebAssembly project is to define a clear compilation pipeline. Emscripten has become the de facto standard for compiling to WebAssembly, but it is important to have a consistent environment that produces deterministic results. Our frontend is based on Node.js, so we needed a solution compatible with the npm workflow. Fortunately, around that time Surma Das published an article called “Emscripten and npm”. The Docker-based approach to compiling WebAssembly makes sense because it eliminates a great deal of overhead. As recommended in the article, we used the Emscripten Docker image from trzeci. To get it to compile to WebAssembly, the native C++ library had to be tweaked a bit, mostly by trial and error. In the end, we managed to compile it and also set up a neat WebAssembly workflow within the existing build pipeline.
It works fast, but...
Scanner performance is measured as the number of frames the Wasm API processes per second (FPS). The Wasm API takes a frame from the camera’s video stream, performs its computation, and returns a response. This repeats continuously until a barcode is detected.
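One way to measure this figure is to timestamp each processed frame and divide the number of frame intervals by the elapsed time. The sketch below assumes a hypothetical helper, `FpsMeter`, with timestamps supplied in milliseconds by the caller; it is not taken from the original implementation.

```typescript
// Hypothetical helper: FpsMeter is not part of any library named in the article.
class FpsMeter {
  private frames = 0;
  private firstMs: number | null = null;
  private lastMs = 0;

  // Record that the decoder finished one frame at time nowMs (milliseconds).
  frameProcessed(nowMs: number): void {
    if (this.firstMs === null) {
      this.firstMs = nowMs;
    }
    this.lastMs = nowMs;
    this.frames += 1;
  }

  // Frames per second over the observed window; 0 until time has elapsed
  // between the first and the latest frame.
  fps(): number {
    if (this.firstMs === null || this.lastMs === this.firstMs) {
      return 0;
    }
    const elapsedSeconds = (this.lastMs - this.firstMs) / 1000;
    // frames - 1 intervals fit between the first and last timestamps.
    return (this.frames - 1) / elapsedSeconds;
  }
}
```

A decoder that answers every 20 ms would register as 50 FPS with this meter, and one that answers once per second as 1 FPS.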
Our test WebAssembly implementation showed an amazing speed of 50 FPS. However, it succeeded in only 60% of cases and timed out in the rest. Even at such a high FPS, it could not detect the barcode quickly enough in the remaining 40% of scans and ended with the warning message. For comparison, the previous JavaScript implementation usually ran at 1 FPS. Yes, WebAssembly is much faster (50 times), but for some reason it failed in almost half of the cases. It is also worth noting that in some situations JavaScript worked very well and found the barcode immediately. One obvious option was to increase the timeout, but that would only add to user frustration without solving the real problem, so we abandoned the idea.
At first, we could not understand why the native C++ library, which worked perfectly in the native apps, did not show the same results on the web. After lengthy testing and debugging, we found that how quickly a barcode is recognized depends on the angle at which the camera is pointed at the object and on background shadows. But then how does everything work in the native apps? The answer is that in the native apps we use the built-in autofocus APIs and also let the user focus manually by tapping the barcode. As a result, the native apps always feed the library sharp, high-quality images.
Once we understood what was going on, we decided to try another native library: the fairly popular and stable open source barcode scanner ZBar. More importantly, it works well with blurry and grainy images. Why not give it a try? Since we already had the WebAssembly workflow in place, compiling ZBar to WebAssembly and deploying it went smoothly. Performance turned out to be decent, around 15 FPS, although not as good as that of our own C++ library. But the success rate was close to 80% with the same timeout. That was a clear improvement over our C++ library, but still not 100%.
The result still did not satisfy us, but we noticed something unexpected. Where ZBar timed out, our own C++ library did the job very quickly. It was a pleasant surprise. Apparently the two libraries handled images of different quality differently. This gave us an idea.
Multithreading and speed racing
You have probably guessed it already. Why not create two worker threads, one for ZBar and one for our C++ library, and run them in parallel? Whichever wins (that is, whichever first sends a valid barcode) sends the result to the main thread, and both workers are stopped. We implemented this setup and started testing it ourselves, trying to simulate as many scenarios as possible. This setup achieved 95% successful scans. That was much better than the previous results, but still not 100%.
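The "first valid barcode wins" rule can be sketched as a small coordinator on the main thread. The `ScannerWorker` interface and the `ScannerRace` class below are hypothetical; the sketch also assumes that an empty string means "nothing detected", as in the prototype architecture described earlier.

```typescript
// Hypothetical interface: it only captures what the race needs from a worker.
interface ScannerWorker {
  readonly name: string;
  stop(): void;
}

class ScannerRace {
  private readonly workers: ScannerWorker[];
  private readonly onWin: (name: string, upc: string) => void;
  private won = false;

  constructor(
    workers: ScannerWorker[],
    onWin: (name: string, upc: string) => void
  ) {
    this.workers = workers;
    this.onWin = onWin;
  }

  // Every worker reply is routed here. Returns true if this reply won.
  report(name: string, upc: string): boolean {
    if (this.won || upc === "") {
      return false; // late reply, or no barcode in this frame
    }
    this.won = true;
    for (const worker of this.workers) {
      worker.stop(); // stop every worker, including the winner
    }
    this.onWin(name, upc);
    return true;
  }
}
```

Because the race depends only on a name and a `stop()` method, adding another scanner means adding one more entry to the `workers` array.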
One of the odder suggestions was to add the original JavaScript library to the race, which makes three threads. Frankly, we did not think it would change anything. But the test took almost no effort, because we had standardized the worker interface. To our surprise, with three threads the success rate really did come close to 100%. Again, this was completely unexpected. As mentioned earlier, JavaScript worked very well in some situations, and apparently it closed the gap. So the popular saying holds: “JavaScript always wins.” Joking aside, the following illustration provides an overview of the final architecture that we implemented.
Barcode Scanner Web Architecture
The following figure shows a high-level functional diagram:
Functional diagram of a barcode scanner
Resource Loading Note
The resources the scanner needs are preloaded after the main page has rendered. This way, the landing page loads quickly and is ready for interaction. The WebAssembly resources (the wasm files and glue scripts) and the JavaScript scanner library are preloaded and cached using XMLHttpRequest after the main page has loaded. The important point is that they are not executed right away, which leaves the main thread free for user interaction with the page. Execution happens only when the user taps the barcode icon. If the user taps the icon before the resources have loaded, they are loaded on demand and executed immediately. The barcode scanner event handler and the worker controller are loaded with the page, but they are very small.
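The "download early, execute late" idea can be sketched as a cache of pending downloads. Everything here is an assumption made for illustration: `Fetcher` and `PreloadCache` are hypothetical names, and the download function is injected by the caller. In a browser it could wrap XMLHttpRequest with `responseType` set to `"arraybuffer"`.

```typescript
// Hypothetical names: Fetcher and PreloadCache are illustrative only.
type Fetcher = (url: string) => Promise<ArrayBuffer>;

class PreloadCache {
  private readonly fetcher: Fetcher;
  private readonly pending = new Map<string, Promise<ArrayBuffer>>();

  constructor(fetcher: Fetcher) {
    this.fetcher = fetcher;
  }

  // Call after the main page has rendered. This starts the downloads but
  // does not compile or execute anything, so the main thread stays free.
  preload(urls: string[]): void {
    for (const url of urls) {
      this.load(url);
    }
  }

  // Call when the user taps the barcode icon. Reuses a download that is
  // in flight or finished; otherwise starts one on demand.
  load(url: string): Promise<ArrayBuffer> {
    let download = this.pending.get(url);
    if (download === undefined) {
      download = this.fetcher(url);
      this.pending.set(url, download);
    }
    return download;
  }
}
```

Storing the promise rather than the bytes means a tap that arrives mid-download waits for the request already in flight instead of issuing a second one. A production version would also need to evict failed downloads so they can be retried.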
Results
After rigorous testing and internal use by employees, we launched an A/B test with real users. The scanner icon (screenshot below) was shown to the test group, but not to the control group.
End product
To measure success, we introduced a metric called Draft Completion Rate, measured as the time between starting to edit a draft and submitting the form. The metric should show how much the barcode scanner helps people fill out forms. The test ran for several weeks, and the results were very pleasing and fully consistent with our original hypothesis: draft completion time decreased by 30% for the flow with the barcode scanner.
A/B Test Results
We also added profiling to evaluate how much each type of scanner contributed. As expected, ZBar made the largest contribution (53% of successful scans), followed by our C++ library (34%) and, finally, the JavaScript library with 13%.
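Contribution figures of this kind can be derived from a simple tally of which scanner won each successful scan. The helper below is a hypothetical sketch, not the profiling code used in production; the scanner labels are whatever keys the caller chooses.

```typescript
// Hypothetical helper: converts per-scanner win counts into rounded
// percentages of all successful scans.
function winShares(wins: Record<string, number>): Record<string, number> {
  const names = Object.keys(wins);
  let total = 0;
  for (const name of names) {
    total += wins[name];
  }
  const shares: Record<string, number> = {};
  for (const name of names) {
    // Guard against division by zero when nothing has been recorded yet.
    shares[name] = total === 0 ? 0 : Math.round((wins[name] / total) * 100);
  }
  return shares;
}
```

Since each share is rounded independently, the percentages may not always add up to exactly 100.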
Conclusion
Implementing WebAssembly has been a very instructive experience for us. Engineers get excited about new technologies and immediately want to try them out. If the technology is also useful to customers, that is a double win. Let us repeat the point made at the beginning of the article. Technology is evolving at a very fast pace, and something new appears every day. But only a few technologies matter to customers, and WebAssembly is one of them. Our biggest takeaway from this exercise is to say “no” in 99 situations and “yes” in the one case where it really matters to customers.
In the future, we plan to expand the use of the barcode scanner and introduce it on the buyer side, so that buyers can scan the codes of products they come across offline to search for and buy them on eBay. We will also consider extending the feature with the Shape Detection API and other browser capabilities. For now, we are pleased to have found the right use case for WebAssembly at eBay and to have applied the technology successfully in e-commerce.
Special thanks to Surma Das and Lin Clark for their numerous articles on WebAssembly. They really helped us get unstuck several times.