Extending the capabilities of web applications using WebAssembly and Python

In this article, we’ll show you how to run a Python program inside another program (the host) that embeds a Wasm runtime, and how to have the Python program and the host call each other.

A few months ago we added Python to the Wasm Language Runtimes. We published a compiled binary, python.wasm, which can be used to execute Python scripts in WebAssembly for increased security and portability.

After this release, we received a lot of feedback on how to make it even more useful for developers. One frequently mentioned topic was the need for bidirectional communication between the Wasm host and the Python code running in python.wasm.

We worked on this together with the Suborbital team and built an application that demonstrates bidirectional communication by implementing the SE2 Plugin ABI. This work was later adopted in Suborbital SE2.

An example program can be found in WLR/python/examples/bindings/se2-bindings. It’s easy to get started and will walk you through how to embed Python in a Wasm application and implement bindings for bidirectional communication.


WebAssembly is an excellent technology for extending the capabilities of applications. On the one hand, it provides a sandboxed environment where extensions can safely run in the same process as the main application. On the other hand, it allows you to write extensions in any language that can be compiled to a Wasm module or interpreted by one, e.g. python.wasm.

More and more web applications and platforms offer extensibility built on WebAssembly. It is used by serverless platforms such as Cosmonic wasmCloud, Cloudflare Workers, Fastly Compute, Fermyon Cloud, and our own Wasm Workers Server, as well as by solutions for extending existing applications, such as Dylibso Extism, Loophole Labs Scale, and Suborbital Extension Engine.

Expanding the capabilities of applications

Extending the capabilities of programs with the help of external code provides greater flexibility in software development processes. A company may have a community of developers writing extensions for a single application, or a platform may have great core functionality that developers can reuse by building extensions on top of it.

Traditionally, this is implemented with plugins, which are loaded into the main application and provide their own functionality. However, this usually restricts plugin authors to a specific programming language (or a small set of languages), and it also poses a threat to the stability and security of the core application.

When creating web applications, developers can use WebHooks, which let multiple web applications work as one while each is implemented in its own language. However, this approach slows down communication between the applications and complicates deployment.

Extending with WebAssembly

To extend your application with WebAssembly, you need to embed a Wasm runtime so that it can execute code from a Wasm module. We call such a program a Wasm host. Communication between a Wasm host and a Wasm module is explicitly specified by the exported and imported functions declared in the Wasm module. Exported functions are implemented by the module and can be called by the host, while imported functions (also called host functions) are implemented by the host and can be called by the module.

However, when a module embeds the Python runtime to execute a Python script, there is no well-defined way for the Wasm host to call a function from the Python script, or vice versa. To simplify this task, we need to add code to the Wasm module that exposes host functions to the Python script and Python functions to the host. We call such code bindings, as it translates the API of the Wasm module into Python code.

During our first experiments with creating such bindings, we had a discussion with Suborbital, which was starting work on adding Python support to its SE2 engine. We decided to cooperate. After experimenting with libpython and the Python C API, we arrived at a working solution, which we describe in this article.

The Suborbital team joined our efforts and rolled out Python plugin support in SE2 quite quickly.

Why work on it?

The main reason for simplifying communication between the Wasm host and Python is code reusability. There is a lot of Python code and there are many Python developers in the world. Although we can run Python scripts in python.wasm, there is still no established way of communicating with the Wasm host. Developers who need it have to find their way through trial and error.

One of the main goals of Wasm Labs is to accelerate the adoption of WebAssembly, so we decided that working on a demo example of how to implement this two-way communication could help developers bridge the gap between Wasm and their off-the-shelf Python applications.

We decided to join forces with Suborbital, who needed this functionality within their platform.

Prior work

All this is based on the work on python.wasm by the CPython team. They had already prepared the wasm32-wasi build target, which we used earlier to publish reusable Python binaries. All we had to do was add libpython (which that team's build also produces) to the released assets.

The Suborbital team already had a well-defined ABI for communication between the Wasm host and JavaScript plugins interpreted by the Wasm module. This let us build on the existing ABI and implement it for another language.

Our application consists of three components:

  • se2-mock-runtime: the WebAssembly host.
  • py-plugin: a Python application.
  • wasm-wrapper-c: a Wasm application that provides the bindings for communication between the other two components.

The diagram below shows the program execution flow:

  • se2-mock-runtime calls run_e, defined in the module plugin.py.
  • plugin.py calls return_result, defined in se2-mock-runtime.

Running the code

Docker is enough to get started. Running this example then boils down to the following:

export TMP_WORKDIR=$(mktemp -d)
git clone --depth 1 https://github.com/vmware-labs/webassembly-language-runtimes ${TMP_WORKDIR}/wlr
(cd ${TMP_WORKDIR}/wlr/python/examples/bindings/se2-bindings; ./run_me.sh)

After that, you will get a detailed logging output that looks like this:

Next to the methods used for explicit memory management, you can see some additional logs. These are explained in the README.md file that accompanies the example; to keep things simple, we will not cover them here.

For ease of reading, the logs are organized as follows:

  • Logs from se2-mock-runtime start at the beginning of the line, and the file name is shown in dark yellow
  • Logs from wasm-wrapper-c are indented by one tab, and the file name is shown in green
  • Logs from plugin.py are indented by two tabs, and the file name is shown in purple

For a better understanding of the code and the entire demo application, take a look at it on GitHub. There you will find instructions on how to run it, as well as a more detailed description of the components.

Overview

Let’s take a closer look at each of the components.

  • se2-mock-runtime – a Node.js application that emulates the Suborbital SE2 runtime. It
    • loads the Wasm module (equivalent to the SE2 plugin)
    • calls its run_e method to execute the plugin logic with an example script
    • provides return_result and return_error, which the plugin uses to return the execution result
  • wasm-wrapper-c – a Wasm module (written in C) which
    • mimics an SE2 plugin by exporting a run_e method and using the imported return_result or return_error
    • uses libpython to embed the interpreter, thus delegating the implementation of run_e to a Python script
  • py-plugin – a Python application which
    • is executed by wasm-wrapper-c
    • provides the implementation of run_e in pure Python: “reverse strings that contain only characters (no ', . and so on)”

Using libpython.a and the Python C API

Embedding the Python interpreter was quite easy. The Python C API is well documented, and there are many examples in open source software.

The difficulty with WASI is its lack of support for dynamic libraries. Because of this, we have to use libpython as a static library and link it into our Wasm module. For convenience, we’ve added a libpython tarball to our Python release. It contains all the necessary headers, and the file libpython.a is a fat library that bundles all other dependencies (e.g. zlib, libuuid, sqlite3, and so on).

Also, complex Python programs are likely to require an increased stack size. To ensure a sufficient size and a correct configuration, we added a pkg-config file with the linker options to the above tarball.

// Linker configuration in lib/wasm32-wasi/pkg-config/libpython3.11.pc.
-Wl,-z,stack-size=524288 -Wl,--stack-first -Wl

This provides a fairly large stack of half a megabyte (524288 bytes = 512 KiB). The stack is also placed before the global data, so that any stack overflow makes the Wasm module trap immediately instead of silently corrupting the global data (which could otherwise happen in some cases).

To learn more about building a C program with libpython from the Wasm Language Runtimes, see the files build-wasm.sh and CMakeLists.txt in WLR/python/examples/bindings/se2-bindings/wasm-wrapper-c/.

Calling a Python function from the host

Suppose we have a function in plugin.py. How do we call it from the Wasm host?

def run_e(payload, id):
    """Processes UTF-8 encoded `payload`. Execution is identified by an `id`.
    """
    log(f'Received payload "{payload}"', id)

To make it callable from the Wasm host, we first need to export a function from the Wasm module that embeds the Python interpreter. Since it’s better to stick to simple types, we’ll represent the string as a pointer and a length.

__attribute__((export_name("run_e"))) void run_e(u8 *ptr, i32 len, i32 id);
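To make the pointer-and-length convention concrete, here is a minimal sketch in plain Python that models the module's linear memory as a bytearray. The names memory, write_string and read_string are hypothetical and exist only for this illustration. A real host would write into the memory exported by the Wasm module, at an address it obtained from the module.

```python
# Toy model of Wasm linear memory: a flat, byte-addressable buffer.
# Assumption: 64 KiB is more than enough for this illustration.
memory = bytearray(64 * 1024)


def write_string(mem: bytearray, ptr: int, text: str) -> int:
    """Copy `text` as UTF-8 into `mem` at offset `ptr`; return the byte length."""
    data = text.encode("utf-8")
    mem[ptr:ptr + len(data)] = data
    return len(data)


def read_string(mem: bytearray, ptr: int, length: int) -> str:
    """Decode `length` bytes starting at offset `ptr` as UTF-8."""
    return bytes(mem[ptr:ptr + length]).decode("utf-8")


# The "host" side writes the payload and passes only two integers across.
ptr = 1024  # hypothetical address, chosen arbitrarily for the sketch
length = write_string(memory, ptr, "héllo")

# The "module" side reconstructs the string from (ptr, length).
payload = read_string(memory, ptr, length)
```

Note that the length counts bytes, not characters: "héllo" has five characters but occupies six bytes in UTF-8, which is why the encoding has to be agreed on by both sides.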

You can then easily translate this call into a call to the Python function using the Python C API. If we skip error handling and memory management, it all boils down to something like this:

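void run_e(u8 *ptr, i32 len, i32 id) {
  // Import the Python module "plugin" (i.e. plugin.py).
  PyObject *module_name = PyUnicode_DecodeFSDefault("plugin");
  PyObject *plugin_module = PyImport_Import(module_name);
  // Look up the Python function plugin.run_e.
  PyObject *py_run_e = PyObject_GetAttrString(plugin_module, "run_e");
  // Build the (str, int) argument tuple: "s#" takes a pointer and a length, "i" an int.
  PyObject *run_e_args = Py_BuildValue("s#i", ptr, (Py_ssize_t)len, id);
  // Call plugin.run_e(payload, id).
  PyObject *result = PyObject_CallObject(py_run_e, run_e_args);
}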

The main function of interest here is Py_BuildValue; see the Python documentation on building values.

Calling a host function from Python code

Let’s say we have a host function like this:

/** Can be called to return UTF-8 encoded result for an execution `id`
  @param ptr Pointer to the returned result
  @param len Length of the returned result.
  @param id Execution id
*/
void env_return_result(u8 *ptr, i32 len, i32 id)
    __attribute__((__import_module__("env"),
                   __import_name__("return_result")));

To enable Python code in plugin.py to access it, you need to create a Python module in C that translates a call to something like def return_result(result, id) into a call to the above function.

An example implementation (without error handling) of an SDK module with such a function would look like this:

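static PyObject *sdk_return_result(PyObject *self, PyObject *args) {
    char *ptr;
    Py_ssize_t len;
    i32 id;
    // Unpack (str, int) into a UTF-8 pointer, its byte length, and the execution id.
    PyArg_ParseTuple(args, "s#i", &ptr, &len, &id);
    // Forward the parsed values to the imported host function.
    env_return_result((u8 *)ptr, (i32)len, id);
    Py_RETURN_NONE;
}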

static PyMethodDef SdkMethods[] = {
    {"return_result", sdk_return_result, METH_VARARGS, "Returns result"},
    {NULL, NULL, 0, NULL}};

static PyModuleDef SdkModule = {
    PyModuleDef_HEAD_INIT, "sdk", NULL, -1, SdkMethods,
    NULL, NULL, NULL, NULL};

static PyObject *PyInit_SdkModule(void) {
    return PyModule_Create(&SdkModule);
}

The key function here is PyArg_ParseTuple, which is well documented in the Python documentation on parsing arguments.

Finally, before we initialize the Python interpreter, we need to add the ‘sdk’ module to the list of built-in modules. It then becomes available via import sdk in interpreted Python modules.

PyImport_AppendInittab("sdk", &PyInit_SdkModule);
Py_Initialize();
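On the Python side, using the module is then an ordinary import. The sketch below is an illustration under stated assumptions, not the plugin from the example repository. The sdk module only exists inside the embedded interpreter, so a hypothetical stand-in is registered when the import fails, which lets the snippet run under a regular Python interpreter. The reversal logic is a placeholder as well.

```python
import sys
import types

try:
    import sdk  # built-in module registered via PyImport_AppendInittab
except ImportError:
    # Hypothetical stand-in so the sketch runs outside the Wasm module.
    sdk = types.ModuleType("sdk")
    sdk.returned = []  # records (result, id) pairs instead of calling the host
    sdk.return_result = lambda result, ident: sdk.returned.append((result, ident))
    sys.modules["sdk"] = sdk


def run_e(payload, ident):
    """Process `payload` and hand the result back for execution `ident`."""
    result = payload[::-1]  # placeholder logic: reverse the whole string
    sdk.return_result(result, ident)


run_e("hello", 7)
```

Inside the real Wasm module the import succeeds, and sdk.return_result ends up in the C function sdk_return_result shown above.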

Putting it all together

Here is how it all works together in the example program.

  1. When the Wasm host calls _start in the Wasm module, it internally calls _initialize in order to
    • add the sdk plugin module as a built-in Python module
    • initialize the Python interpreter
    • load the plugin module (which imports the built-in sdk)
  2. When the Wasm host calls run_e in the Wasm module, it
    • looks up the run_e function in the plugin module
    • translates the arguments using Py_BuildValue
    • calls the Python function with these arguments
  3. When run_e in plugin.py calls sdk.return_result, the implementation in sdk_module
    • translates the arguments using PyArg_ParseTuple
    • calls the imported function env:return_result with the translated arguments
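Steps 2 and 3 can be modelled in a few lines of pure Python. This is a toy simulation with no Wasm runtime involved, and interpreter initialization (step 1) is left out. The bytearray stands in for linear memory, and all names (memory, RESULT_ADDR, host_return_result, exported_run_e and so on) are hypothetical and chosen only to mirror the roles of the three components.

```python
memory = bytearray(64 * 1024)  # stand-in for the module's linear memory
results = {}                   # host side: execution id -> returned string


# --- Host (plays the role of se2-mock-runtime) ---
def host_return_result(ptr, length, ident):
    """Models the imported function env:return_result implemented by the host."""
    results[ident] = bytes(memory[ptr:ptr + length]).decode("utf-8")


# --- Python plugin (plays the role of plugin.py) ---
def plugin_run_e(payload, ident):
    sdk_return_result(payload.upper(), ident)  # placeholder plugin logic


# --- Bindings (play the role of wasm-wrapper-c) ---
RESULT_ADDR = 4096  # hypothetical fixed address used for results


def sdk_return_result(result, ident):
    """Python -> host: str becomes (ptr, len) before the host function is called."""
    data = result.encode("utf-8")
    memory[RESULT_ADDR:RESULT_ADDR + len(data)] = data
    host_return_result(RESULT_ADDR, len(data), ident)


def exported_run_e(ptr, length, ident):
    """Host -> Python: (ptr, len) becomes str before the plugin is called."""
    payload = bytes(memory[ptr:ptr + length]).decode("utf-8")
    plugin_run_e(payload, ident)


# --- The host drives one execution ---
data = "hello wasm".encode("utf-8")
memory[0:len(data)] = data
exported_run_e(0, len(data), 42)
```

The point of the model is that only integers cross the host boundary in either direction. Strings exist as Python objects only between the two binding functions.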

Our sample application contains a lot of manual work. In an ideal scenario, we should be able to provide a .wit file that declares the API and have the binding code generated automatically.

Developers at many companies are already working on a more general approach to running Python on server-side Wasm and making it interoperable. You can track progress in the Python guest runtime and bindings stream of the Bytecode Alliance Zulip space.

You can try the demo application yourself here.

If you want to build something from scratch, you can find a prebuilt libpython as part of our recent Python release. Don’t forget the linker options mentioned above.
