C++ in the browser with WebAssembly via Emscripten, Vite and React
At CodeComply.ai, by building upon the functionality of Togal.ai, we’re creating a product which performs a complex fire safety review of floor plans. Since that requires a lot of computation, we decided to give WebAssembly a try.
This will be the first article in a series, in which I’ll share tips and experience gained during the process of porting scripts from our internal C++ library to the browser, where we handle them using React Hooks and TypeScript.
Most of the articles I’ve found online describe fairly simple examples which couldn’t be applied when connecting a real-world set of functions.
There was no mention of linking libraries (we’re using Boost, Eigen and Spatial) or errors resulting from more advanced use.
Therefore, this one might seem to have a bit of a steep learning curve. I’ll link the simpler examples and documentation along the way and at the end, hoping it will help you comprehend the whole topic more quickly.
Before we start
There is a reason why I’m mentioning Vite in the title — examples in most of the articles are built with Webpack, which seems to be handling the output much more gracefully. With Vite, many solutions wouldn’t work, even when the same files were used. The reason will be explained later on.
Also, this article won’t be mentioning standalone WASM (launched without the use of a JS helper). As of December 2022, there’s no good polyfill for WASI that we could use in the browser (although Node seems to be doing pretty well).
Installing
Personally, I consider installing emsdk to be the most reliable way of setting up the emcc compiler. For Mac users, brew install emscripten is an even simpler option. If you’re planning to integrate some CI/CD scripts, there should already be a launcher for your platform.
Writing our first binding
Lots of articles were compiling functions that were placed in the same file where the binding was taking place.
In the real world, this approach would tie our hands: our functions are usually long and depend on other modules, so what we want is a thin interface file that just includes the functions we want to expose and creates the bindings.
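// embinding.cpp
#include <emscripten/bind.h>
// Header that declares the functions we want to expose.
#include "../Scripts/some_module.h"

EMSCRIPTEN_BINDINGS(my_module) {
  // JS-visible name first, then a pointer to the C++ function.
  emscripten::function("testFunction", &testFunction);
}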
Not surprisingly, we need to include the bind header from Emscripten and the header of the function we want to bind. Next, we use the EMSCRIPTEN_BINDINGS macro. The my_module argument is just a label that keeps separate binding blocks from colliding by name.
In a simple case like the one above, we invoke a function from the emscripten namespace, giving it two arguments: the name that we’ll use to call the function in the browser, and a pointer to the function.
Gotchas in this step
As I’ve mentioned, the case above is a very optimistic one. For example, if your function uses raw pointers, the emscripten::function will require a third parameter: emscripten::allow_raw_pointers().
But this might not be the end of your trouble. The compilation might succeed, yet in the browser you’ll now see binding errors such as:
Cannot call testFunction due to unbound types: PKc, PPc
If your function expects parameters of a type that has no counterpart on the JS side (such as char*), you’ll need to provide a custom wrapper for calling it.
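The type names in that error are compiler-mangled codes, which makes them hard to read at first. As a rough aid, here is a minimal sketch of a decoder for the simplest codes (pointers, references, const and a handful of builtin types). The function name demangleSimpleType and the lookup table are made up for this illustration, and the sketch deliberately ignores classes, templates and everything else a real demangler handles.

```typescript
// Subset of the builtin type codes used by the Itanium C++ ABI mangling
// scheme (the one clang uses). Intentionally incomplete.
const BUILTIN_TYPES: Record<string, string> = {
  v: "void", b: "bool", c: "char", a: "signed char", h: "unsigned char",
  s: "short", t: "unsigned short", i: "int", j: "unsigned int",
  l: "long", m: "unsigned long", x: "long long", y: "unsigned long long",
  f: "float", d: "double",
};

// Hypothetical helper: decodes codes made of P (pointer), R (reference),
// K (const) prefixes followed by one builtin type code.
function demangleSimpleType(code: string): string {
  if (code.length === 0) {
    throw new Error("empty type code");
  }
  const head = code[0];
  const rest = code.slice(1);
  if (head === "P") return demangleSimpleType(rest) + "*";
  if (head === "R") return demangleSimpleType(rest) + "&";
  if (head === "K") return demangleSimpleType(rest) + " const";
  const builtin = BUILTIN_TYPES[code];
  if (builtin === undefined) {
    throw new Error(`unsupported type code: ${code}`);
  }
  return builtin;
}

// The two codes from the error message above:
console.log(demangleSimpleType("PKc")); // char const*
console.log(demangleSimpleType("PPc")); // char**
```

So the error is complaining about a const char* and a char** parameter, neither of which can be bound directly.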
And here Emscripten’s optional_override comes to the rescue. For example, a function expecting a pointer to a char would be handled like this:
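EMSCRIPTEN_BINDINGS(my_module) {
  emscripten::function("functionWithPointer",
    emscripten::optional_override(
      // Accept a std::string (which embind maps to a JS string)
      // and pass its underlying buffer on as a const char*.
      [](std::string firstArg) {
        return functionWithPointer(firstArg.c_str());
      })
  );
}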
Writing the compilation command
Once we’re done with the previous step, we can begin implementing the emcc compilation command. Since it accepts parameters that gcc would accept as well, this should mostly be familiar to you.
As it can get long with all the necessary parameters, we’ll put it in a script that we can launch every time we want to compile.
emcc -lembind \
embinding.cpp \
-O2 \
-s MODULARIZE=1 \
-o yourOutput.js \
-sENVIRONMENT="web" \
-I ../eigen/ \
-I ../spatial/src/ \
-I ../boost_1_78_0/ \
-std=c++14 \
-sNO_DISABLE_EXCEPTION_CATCHING \
-Wno-enum-constexpr-conversion \
--no-entry
- First, we’re launching emcc with the -lembind flag (--bind works too, although it’s deprecated), which links against the embind library.
- Next, we pass the file containing the bindings (embinding.cpp).
- -O2 selects the optimization level, which also has a big effect on the size of the output files. To show you what kind of optimization we’re talking about, let’s look at a table I’ve made based on compiling some of our scripts (percentages are relative to the O0 build):
| Level | Output WASM size | Output JS size | Output overall |
| ----- | ---------------- | ---------------- | ------------------ |
| O0 | 901 kB | 242 kB | 1143 kB |
| O1 | 360 kB (40%) | 200 kB (82%) | 560 kB (49%) |
| O2 | **319 kB (35%)** | 106 kB (44%) | **425 kB (37.1%)** |
| O3 | 393 kB (43%) | **102 kB (42%)** | 495 kB (43%) |
| Os/Oz | 328 kB (36%) | **102 kB (42%)** | 430 kB (37.6%) |
The sudden drop in JS size between O1 and O2 comes from minification: all comments are removed and the code is compacted, so the final file has around 22 lines (compared to 3000+ originally).
The WASM size drops substantially too, but we’re already getting most of that with O1.
- -s MODULARIZE=1: without this parameter, the output would be placed in the global scope. This doesn’t fix the Vite problem yet, but it’s a good starting point.
- -o yourOutput.js: specifies the output name.
- -sENVIRONMENT="web": for cases where you only want to run your code in the browser.
- -I ../eigen/: adds a directory to the header search path; repeat it for each library whose headers you include.
- -std=c++14: if your code uses features that are no longer available in a newer standard such as C++17, you can pin the language version with this flag.
- -sNO_DISABLE_EXCEPTION_CATCHING: enables exception catching.
- -Wno-enum-constexpr-conversion: silences a clang diagnostic that gave us false positives. Might not be needed in your case.
- --no-entry: we have to specify that we’re not creating a main() function.
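If you would rather drive the build from a Node script than from a shell file, the flags listed above can be assembled programmatically. The following is only a sketch: the EmccBuildConfig shape and the buildEmccArgs name are invented for this example, and the flags are exactly the ones from the command above.

```typescript
// Hypothetical config shape for this sketch; not part of Emscripten.
interface EmccBuildConfig {
  bindingFile: string;
  outputFile: string;
  includeDirs: string[];
  cppStandard: string;
  optimization: "O0" | "O1" | "O2" | "O3" | "Os" | "Oz";
}

// Builds the argument list for emcc from the config above.
function buildEmccArgs(config: EmccBuildConfig): string[] {
  const args: string[] = [
    "-lembind",
    config.bindingFile,
    `-${config.optimization}`,
    "-sMODULARIZE=1",
    "-o",
    config.outputFile,
    "-sENVIRONMENT=web",
  ];
  // One "-I <dir>" pair per library header directory.
  for (const dir of config.includeDirs) {
    args.push("-I", dir);
  }
  args.push(
    `-std=${config.cppStandard}`,
    "-sNO_DISABLE_EXCEPTION_CATCHING",
    "-Wno-enum-constexpr-conversion",
    "--no-entry"
  );
  return args;
}

const emccArgs = buildEmccArgs({
  bindingFile: "embinding.cpp",
  outputFile: "yourOutput.js",
  includeDirs: ["../eigen/", "../spatial/src/", "../boost_1_78_0/"],
  cppStandard: "c++14",
  optimization: "O2",
});

// Print the equivalent command line. To actually run it, the array could be
// handed to Node's child_process.spawnSync("emcc", emccArgs).
console.log(["emcc", ...emccArgs].join(" "));
```

Keeping the flags in an array avoids shell quoting issues and makes it easy to switch the optimization level per environment.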
Running this script should create a JS and a WASM file in the current directory.
Improving the output
First, we add an /* eslint-disable */ comment at the beginning of the file, since some parts of the output code conflict with our ESLint config (inspired by this article).
sed -i '1s;^;\/* eslint-disable *\/;' yourOutput.js
# for Mac add '' after -i
sed -i '' '1s;^;\/* eslint-disable *\/;' yourOutput.js
Second, by default the path to the WASM file is less than ideal. We place the files in the public/wasm folder, so let’s also add the following line:
sed -i 's|yourOutput.wasm|/wasm/yourOutput.wasm|' yourOutput.js
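As an alternative to patching the path with sed, Emscripten lets you pass a locateFile callback in the object handed to the module factory, and uses its return value as the URL of the WASM file. The sketch below assumes the same /wasm/ public folder as above; the ModuleOverrides interface and the wasmOverrides name are made up for this example.

```typescript
// Only the option this sketch needs; the real Module object accepts more.
interface ModuleOverrides {
  locateFile: (path: string, scriptDirectory: string) => string;
}

// Serve .wasm files from /wasm/ and leave any other file where
// Emscripten would look for it by default.
const wasmOverrides: ModuleOverrides = {
  locateFile: (path, scriptDirectory) =>
    path.endsWith(".wasm") ? `/wasm/${path}` : scriptDirectory + path,
};

console.log(wasmOverrides.locateFile("yourOutput.wasm", "/assets/"));
```

With MODULARIZE=1 the object would be passed when calling the exported factory, for example Module(wasmOverrides), so the generated file itself stays untouched.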
Let’s also look at how the module is currently exported.
if (typeof exports === 'object' && typeof module === 'object')
module.exports = Module;
else if (typeof define === 'function' && define['amd'])
define([], function() { return Module; });
else if (typeof exports === 'object')
exports["Module"] = Module;
This way of exporting just doesn’t seem to sit well with Vite, although it works fine with Webpack. 🤷
Let’s replace the whole block of code using a Perl one-liner (inspired by this comment) with export default Module;
perl -i -p0e "s/(if \(typeof exports === 'object' && typeof module === 'object'\))[\s\S]*/export default Module;/g" yourOutput.js
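If sed and Perl are not available everywhere you build (or you simply prefer to keep everything in one language), the same three patches can be expressed as a pure string transformation. This is a sketch, not a drop-in tool: patchEmscriptenGlue is a made-up name, it assumes the export block starts exactly like the one shown above, and reading and writing the file is left to the caller.

```typescript
// Matches from the first line of the export block to the end of the file,
// mirroring the Perl one-liner.
const EXPORT_BLOCK =
  /if \(typeof exports === 'object' && typeof module === 'object'\)[\s\S]*/;

// Hypothetical helper applying all three patches to the generated JS source.
function patchEmscriptenGlue(
  source: string,
  wasmFile: string,
  publicPath: string
): string {
  // 1. Point every reference to the WASM file at the public folder.
  const withWasmPath = source.split(wasmFile).join(publicPath + wasmFile);
  // 2. Swap the CommonJS/AMD export block for an ES module default export.
  const withEsmExport = withWasmPath.replace(
    EXPORT_BLOCK,
    "export default Module;"
  );
  // 3. Keep ESLint away from the generated code (on its own line here,
  //    unlike the sed version, which prepends to the first line).
  return "/* eslint-disable */\n" + withEsmExport;
}

const sample = [
  "var Module = load('yourOutput.wasm');",
  "if (typeof exports === 'object' && typeof module === 'object')",
  "  module.exports = Module;",
].join("\n");

console.log(patchEmscriptenGlue(sample, "yourOutput.wasm", "/wasm/"));
```

In a build script, the source would typically come from fs.readFileSync and the result would be written back with fs.writeFileSync.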
For convenience, I created a typed useWASM hook:
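import { useEffect, useState } from "react";

export interface BaseWASMModule {
  _malloc: (size: number) => number;
  HEAPU8: Uint8Array;
}

const useWASM = <T>(
  helperOutput: (
    Module?: unknown,
    ...args: unknown[]
  ) => Promise<BaseWASMModule & T>
) => {
  const [methods, setMethods] = useState<(BaseWASMModule & T) | null>(null);

  useEffect(() => {
    // Instantiate the module once per factory instead of on every render.
    let cancelled = false;
    helperOutput().then((m) => {
      if (!cancelled) setMethods(m);
    });
    // Ignore the result if the component unmounts before loading finishes.
    return () => {
      cancelled = true;
    };
  }, [helperOutput]);

  return methods;
};

export default useWASM;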
I’m passing the exported module as a parameter, but you could also pass the path/filename and dynamically import it.
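The dynamic-import variant could look roughly like the sketch below. The names ModuleFactory and loadWasmModule are invented here, and the helper takes an importer callback rather than a bare path so that the import() call with its literal path stays at the call site, where the bundler can see it.

```typescript
// With MODULARIZE=1 the default export is a factory returning a promise.
type ModuleFactory<T> = (overrides?: Record<string, unknown>) => Promise<T>;

// Hypothetical helper: imports the generated JS lazily, then instantiates it.
async function loadWasmModule<T>(
  importer: () => Promise<{ default: ModuleFactory<T> }>,
  overrides?: Record<string, unknown>
): Promise<T> {
  const { default: factory } = await importer();
  return factory(overrides);
}

// Intended usage in the app (not executed in this sketch):
//   const wasm = await loadWasmModule(() => import("./yourOutput"));
```

The returned promise can be consumed in a hook the same way as the statically imported factory.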
import { useEffect } from "react";
import useWasm from "./useWASM";
import Module from "./yourOutput";
interface YourOutputMethods {
functionWithPointer: (
firstArg: string,
) => number;
}
export const useYourOutput = () => {
const module = useWasm<YourOutputMethods>(Module);
useEffect(() => {
if (module) {
// do something, probably assign the output to some state and return it
// or perform some operation and return the result
}
}, [module]);
// return something
};
Now you can call the above hook in your React component and see the magic happen.
If you notice any problems with how your function works, remember — the std::cout calls are actually propagated to the console. It might not be the best debugging experience, but it’s something 😅
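If the console gets too noisy, Emscripten also accepts print and printErr callbacks in the object passed to the module factory; they receive what the C++ side writes to stdout and stderr. A small sketch, where createOutputCollector is a made-up helper that stores the lines instead of logging them:

```typescript
// Hypothetical helper: collects stdout/stderr lines coming from the module.
function createOutputCollector() {
  const stdout: string[] = [];
  const stderr: string[] = [];
  return {
    stdout,
    stderr,
    // Pass this object to the module factory, e.g. Module(collector.overrides).
    overrides: {
      print: (text: string): void => {
        stdout.push(text);
      },
      printErr: (text: string): void => {
        stderr.push(text);
      },
    },
  };
}

const collector = createOutputCollector();
// Simulate what the module would do on std::cout / std::cerr output.
collector.overrides.print("value computed");
collector.overrides.printErr("something went wrong");
console.log(collector.stdout, collector.stderr);
```

The collected lines can then be shown in the UI or attached to error reports.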
In the next article, I’ll show you how to test the JS/WASM output, how to create TypeScript interfaces from the bindings and share some tips related to moving the process to CI/CD.
Hope you’ve enjoyed this article. If you happen to have any problems along the way, make sure to comment and I’ll try to help as best I can.
I’d like to thank my colleagues Olek and Dmitry for helping me with certain C++ and CI/CD challenges, and Olek and Piotr for their comments 🙌
Articles and repositories worth visiting:
- Repository which describes how certain emcc params work really well
- Tool reference (although there’s no styling so you’ll have to use CTRL + F to make anything out of it)
- OpenGL with WebAssembly (in case you’d wish to port some WebGL stuff)
- Helpful article when you don’t know how to handle input parameters with multiple pointers