Multithreading in the browser with Emscripten and Boost ASIO

Sep 26, 2023

JavaScript’s single-threaded nature has long raised doubts about its suitability for environments where high performance is crucial. However, with technologies such as Emscripten and Boost ASIO, developers can add multithreading capabilities, allowing JavaScript to keep up with its counterparts. In this article, we’ll present a straightforward setup for enabling it.

As mentioned in my previous article, we’ve been using WebAssembly for some of our calculations at CodeComply.ai. At one point, we encountered a performance challenge while handling large plans; the limitations of single-threaded WebAssembly became evident, resulting in slow response times and an unsatisfactory user experience. Our C++ engineer modified the script for multithreading, and our responsibility was to successfully launch it in the browser.

Navigating Emscripten’s documentation on multithreading proved to be less than ideal. Implementing a final solution involved a substantial amount of trial and error, with the help of some equally lost people in GitHub issues. Ultimately, we found ourselves needing to debug the output JavaScript file, despite its size of a few thousand lines. This article aims to assist its readers in avoiding the need for such extensive measures in their own projects.

emcc commands

To achieve multithreading with ASIO, the initial step is to enable the feature in your C++ code, though the specifics won’t be covered in this article. Once that’s set, we need to add the following parameters to the emcc invocation (please refer to the previously linked article since we will build upon the emcc command from it):

-pthread \

The first and most important flag: without it, implementing multithreading in WASM is not possible. Read more here.

-sENVIRONMENT=web,worker \

Here I extended the environment variable flag from the previous article with “worker”, since I personally use a web worker to launch WASM.

-s PTHREAD_POOL_SIZE=32 \

Setting the thread pool size led to a tricky bug that was difficult to resolve. Initially, I had set the value to 8, matching the number of cores on my laptop; later, however, this caused machines with more cores to enter infinite loops during code execution. To mitigate this issue and make our build future-proof, we’ll later replace the fixed value with one obtained from navigator.hardwareConcurrency, ensuring we don't spawn unnecessary workers on lower-end machines and at the same time are prepared for even more powerful CPUs in the future.

-sEXPORT_NAME=yourWasmModule \

The clearest explanation of why you should use this flag comes from the error:

pthreads + MODULARIZE currently require you to set -sEXPORT_NAME=Something (see settings.js) to Something != Module, so that the .worker.js file can work.

Since we have to use MODULARIZE as Vite has problems working with files that aren’t modules, we’re forced to keep this one :)

Our final emcc invocation looks as follows:

emcc -lembind \
  yourBinding.cpp \
  -O2 \
  -s MODULARIZE=1 \
  -o yourWasmModule.js  \
  -pthread \
  -sENVIRONMENT=web,worker \
  -s PTHREAD_POOL_SIZE=32 \
  -sEXPORT_NAME=yourWasmModule \
  -s TOTAL_MEMORY=256MB \
  -I ../eigen/  \
  -I ../spatial/src/  \
  -I ../boost_1_78_0/  \
  -std=c++14 \
  -Wno-enum-constexpr-conversion \
  --no-entry

Adjusting command

In the previous article, we presented some commands that customise the final JS files to suit our project’s specific requirements.

sed -i '' 's|locateFile("yourWasmModule.worker.js")|locateFile("/wasm/yourWasmModule.worker.js")|' yourWasmModule.js && \

As we introduce the new file yourWasmModule.worker.js, we must handle its location in our project, similarly to how we did it before.

sed -i '' 's/||_scriptDir/||".\/yourWasmModule.js"||_scriptDir/g' yourWasmModule.js && \

This one took me a day or two :) Basically, we need to pass the JS script used for loading WASM to the workers that simulate the threads. However, the default way of doing it, using _scriptDir, was malfunctioning, so I replaced it with a direct path.

sed -i '' 's|self.location.href|""|' yourWasmModule.js && \

Now that we are deeply involved in the worker logic, we need to be on the lookout for potential traps. One such obstacle arose when the script attempted to load the workers from a path relative to the script’s launch location, and we store the file launching the WASM files in another folder. A quick replacement fixed this hurdle.

sed -i '' 's/pthreadPoolSize=32/pthreadPoolSize=yourWasmModule.maxThreads-4||32/' yourWasmModule.js && \

As mentioned earlier, we replace the build-time thread pool size with one that depends on window.navigator.hardwareConcurrency. In my experience, it’s better to set a value slightly below the maximum, so that a really heavy workload doesn’t freeze your computer. To avoid potential errors when this script is called in a worker, we pass it as a parameter when invoking the script.
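One thing to watch with the expression in the sed command: in JavaScript, maxThreads-4||32 evaluates to 32 when maxThreads is exactly 4 (because 0 is falsy) and to a negative number when it is below 4. A minimal sketch of a clamped alternative follows; computePoolSize is a hypothetical helper that is not part of the original setup, and its defaults simply mirror the -4 and 32 used above.

```javascript
// Hypothetical helper (not part of the original build): derive a worker pool
// size from the reported core count. `reserve` and `fallback` mirror the
// "-4" and "32" used in the sed command.
function computePoolSize(hardwareConcurrency, reserve = 4, fallback = 32) {
  // Missing or invalid input (e.g. the option was not passed) uses the fallback.
  if (!Number.isInteger(hardwareConcurrency) || hardwareConcurrency < 1) {
    return fallback;
  }
  // Leave `reserve` cores free, but never go below one worker.
  return Math.max(1, hardwareConcurrency - reserve);
}

console.log(computePoolSize(16)); // 12
console.log(computePoolSize(4)); // 1
console.log(computePoolSize(undefined)); // 32
```

If you adopt something like this, the value passed as maxThreads could already be the clamped result, so the patched expression in the generated file stays simple.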

sed -i '' "s|if (typeof exports === 'object' && typeof module === 'object')|if(0)typeof await/2//2; export default yourWasmModule;|g" yourWasmModule.js && \

This one is admittedly a hack, and a dirty one. As I’ve mentioned above, we need a version of this script to load in the worker. Since the worker cannot interpret syntax like default export or module.exports by default, and I wanted to avoid creating two files that differed only by one line (export), I resorted to using a trick from StackOverflow to handle the situation. If only Vite was better with importing non-modules :)
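A drawback of chaining sed commands is that a pattern which no longer matches (for example after an Emscripten upgrade changes the generated code) is silently skipped. As a rough sketch of an alternative, the same literal replacements could be applied from a small Node script that fails loudly when a target string is missing. applyPatches and the sample input below are hypothetical and only illustrate the idea; note that this version replaces every occurrence, whereas some of the sed commands above touch only the first match on each line.

```javascript
// Hypothetical helper: apply literal find/replace patches and throw when a
// pattern is missing (sed would leave the file unchanged without complaint).
function applyPatches(source, patches) {
  let result = source;
  for (const { find, replace } of patches) {
    if (!result.includes(find)) {
      throw new Error(`Patch target not found: ${find}`);
    }
    // split/join replaces every occurrence and needs no regex escaping.
    result = result.split(find).join(replace);
  }
  return result;
}

// Illustrative input only; real Emscripten output will look different.
const sample = 'var n=locateFile("yourWasmModule.worker.js");';
const patched = applyPatches(sample, [
  {
    find: 'locateFile("yourWasmModule.worker.js")',
    replace: 'locateFile("/wasm/yourWasmModule.worker.js")',
  },
]);
console.log(patched);
```

In a real build step you would read yourWasmModule.js from disk, run it through the helper, and write it back.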

Enabling necessary headers in your project

Since you’ll need to enable the functionality of SharedArrayBuffer to make multithreading work, you’ll also have to enable certain headers. The Emscripten documentation actually mentions a way to achieve this, but the included link doesn’t provide any specific tips about the implementation.
For Vite, the following configuration helped with enabling it locally:

import { defineConfig } from "vite";

export default defineConfig(({ command }) => {
  return {
    // ...
    plugins: [
      // ...
      {
        name: "configure-response-headers",
        configureServer: (server) => {
          server.middlewares.use((req, res, next) => {
            // Both headers are required for SharedArrayBuffer to be available.
            res.setHeader("Cross-Origin-Embedder-Policy", "require-corp");
            res.setHeader("Cross-Origin-Opener-Policy", "same-origin");
            // The Host header may be absent, hence the optional chaining.
            if (req.headers.host?.includes("localhost")) {
              res.setHeader(
                "Access-Control-Allow-Origin",
                `http://${req.headers.host}`
              );
            }
            next();
          });
        },
      },
    ],
  };
});

For the AWS side, this Reddit link helped with enabling it.
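To confirm that the headers actually took effect, you can check the crossOriginIsolated flag that browsers expose on both window and worker scopes. The sketch below wraps that check in a hypothetical canUseThreads helper; the scope is passed in as an argument so the same function can run in a window, a worker, or a test.

```javascript
// Hypothetical helper: returns true when the given global scope is
// cross-origin isolated and exposes SharedArrayBuffer.
function canUseThreads(scope) {
  return (
    typeof scope.SharedArrayBuffer === "function" &&
    scope.crossOriginIsolated === true
  );
}

// In the browser or inside a worker you would call:
//   canUseThreads(globalThis)
```

If this returns false after deployment, the COOP/COEP headers are most likely missing from the response that served the page.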

The invocation itself

    const wasmMethods = await yourWasmModule({
      maxThreads,
    });

In the browser environment, we created the variable maxThreads, which holds the value of window.navigator.hardwareConcurrency and is later passed to the worker calling our WASM module. As demonstrated above, we access this value using yourWasmModule.maxThreads. This mechanism works thanks to the output JS file accepting a parameter, enabling seamless communication with the WASM handler.
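The hand-off described above could look roughly like the following sketch. The message shape ({ type, maxThreads }) and both function names are assumptions made for illustration; only the maxThreads option and the yourWasmModule factory come from the setup in this article.

```javascript
// Main thread side: build the message that carries the core count to the worker.
// The { type, maxThreads } shape is a hypothetical convention.
function buildInitMessage(nav) {
  return { type: "init", maxThreads: nav.hardwareConcurrency };
}

// Worker side: forward maxThreads to the module factory generated by emcc.
// `factory` stands in for yourWasmModule.
async function handleInitMessage(message, factory) {
  if (message.type !== "init") {
    throw new Error(`Unexpected message type: ${message.type}`);
  }
  return factory({ maxThreads: message.maxThreads });
}

// Browser wiring (not executed here):
//   worker.postMessage(buildInitMessage(window.navigator));
//   self.onmessage = (e) => handleInitMessage(e.data, yourWasmModule);
```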

wasmMethods.PThread?.terminateAllThreads();

After completing your calculations, be sure to terminate any hanging threads. Garbage collection appears to be unreliable when the script is launched repeatedly.
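To make sure the cleanup also runs when a calculation throws, the termination call can be placed in a finally block. The wrapper below is a minimal sketch; withWasmModule is a hypothetical name, while PThread?.terminateAllThreads() is the same call shown above.

```javascript
// Hypothetical wrapper: instantiate the module, run a task, and always
// terminate the worker threads afterwards, even if the task throws.
async function withWasmModule(factory, options, task) {
  const wasmMethods = await factory(options);
  try {
    return await task(wasmMethods);
  } finally {
    // Optional chaining keeps this safe for single-threaded builds.
    wasmMethods.PThread?.terminateAllThreads();
  }
}

// Usage (not executed here); `calculate` is a placeholder for your own export:
//   const result = await withWasmModule(yourWasmModule, { maxThreads }, (m) =>
//     m.calculate(input)
//   );
```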

Once you’ve successfully deployed the WebAssembly script with multithreading, you should observe multiple instances of yourWasmModule.worker.js in the Memory tab.

Results

Results may vary based on your specific usage. It’s important to keep in mind that every architectural decision involves trade-offs. If you identify a consistent pattern where a single-threaded instance of WebAssembly produces better results, it might be a wise approach to use conditional logic to switch between single and multithreaded solutions accordingly.

With smaller plans, we have observed slightly worse results, likely due to the overhead of orchestrating threads. However, as the plan size increased, the performance improved significantly. For plans with a few hundred rooms, we measured a speedup of more than four times (from 2 minutes to under 30 seconds).
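The conditional switch suggested above could key on plan size. The sketch below is purely illustrative: pickBuild and the threshold of 100 rooms are assumptions, not values we measured, so benchmark your own workload before picking a cutoff.

```javascript
// Hypothetical threshold; measure your own workload before choosing a value.
const MULTITHREAD_MIN_ROOMS = 100;

// Returns which build to load: "multi" only when threads are available and
// the plan is large enough to amortize the thread orchestration overhead.
function pickBuild(roomCount, threadsAvailable, minRooms = MULTITHREAD_MIN_ROOMS) {
  return threadsAvailable && roomCount >= minRooms ? "multi" : "single";
}

console.log(pickBuild(300, true)); // "multi"
console.log(pickBuild(20, true)); // "single"
console.log(pickBuild(300, false)); // "single"
```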

Nevertheless, with certain plans we haven’t observed any noticeable performance advantages, and later on decided to move the calculations away from the frontend.

Hopefully this article helped you understand how to configure and utilize multi-threaded WebAssembly. Please leave a follow if you found it valuable, and feel free to comment if you had any problems with setting it up. Maybe we can work on it together!