Solving The Async Context Challenge In Node.js

What is the best way to manage context between asynchronous flows? Is there a way to achieve thread-like local storage?

Israel Zablianov

Published in Wix Engineering · 8 min read · Jan 9, 2024

room filled with servers — Credits: Nightcafe

What am I looking for?

Maintain context throughout an async flow
Ensure accessibility of the context only within the same async flow

The problem

Consider for example a standard web server that handles a couple of I/O operations:

// server.ts
router.get(
  path.join('/', 'user-details'), async (req, res) => {
      const userDetailsRequest = extractRequestParams(req);
      const xRequestId = generateXRequestId();
      const userDetailsResponse = await getUserDetails(xRequestId, userDetailsRequest);
      res.send(userDetailsResponse);
  },
);

// getUserDetails.ts
import { logger } from "./logger";
export async function getUserDetails(xRequestId: string, userDetailsRequest: UserDetailsRequest) {
  const isPermitted = await checkUserReqPermissions(xRequestId, userDetailsRequest);
  if (!isPermitted) {
    throw new NotPermittedError('Not permitted');
  }

  const userDetails = await queryUserDetails(xRequestId, userDetailsRequest);
  await logger.report(xRequestId, `getUserDetails response ${userDetails} for request ${userDetailsRequest}`);

  return userDetails;
}

// logger.ts
export const logger = {
  report(xRequestId: string, msg: string) {
    return remoteLogger.log(`${xRequestId ?? '-'}: ${msg}`);
  }
}

For each call to the ‘user-details’ endpoint, we have:

1. Backend API call (permission check)
2. DB query (query user details)
3. Backend API call (logging to a remote server)

standard web server that handles a couple of I/O operations on each request

This chain of operations, from the moment the request reaches the backend until the response is sent back to the client, throughout all callbacks and promise chains, is what I will refer to from now on as an asynchronous flow.

This endpoint uses two params throughout its async flow:

userDetailsRequest - holds information about the requested user
xRequestId - a distinct identifier passed to every asynchronous operation within the same flow, which makes it easier to find the corresponding log entries without relying on timestamps and IP addresses

These params are the contextual data associated with this request, and they are passed to every function invoked within the endpoint. While it makes sense to pass userDetailsRequest explicitly, xRequestId is unrelated to the business logic of those functions.

The preferred solution would be to access xRequestId without having to pass it explicitly in every function call.

In traditional multi-threaded servers, each request is typically handled on its own thread, so we can use Thread-Local Storage (TLS) to store the context associated with the request.

This is not the case with Node.js, which runs the JavaScript of all requests on a single thread, a.k.a. the main thread.

To better understand the challenge, let’s refactor our code to use a global object as the context:

// store.ts
export const store = new Map<string, string>();

// server.ts
import { store } from "./store";
router.get(
  path.join('/', 'user-details'), async (req, res) => {
      const userDetailsRequest = extractRequestParams(req);
      store.set('xRequestId', generateXRequestId());
      const userDetailsResponse = await getUserDetails(userDetailsRequest);
      res.send(userDetailsResponse);
  },
);

// getUserDetails.ts
import { logger } from "./logger";
export async function getUserDetails(userDetailsRequest: UserDetailsRequest) {
  const isPermitted = await checkUserReqPermissions(userDetailsRequest);
  if (!isPermitted) {
    throw new NotPermittedError('Not permitted');
  }

  const userDetails = await queryUserDetails(userDetailsRequest);
  await logger.report(`getUserDetails response ${userDetails} for request ${userDetailsRequest}`);

  return userDetails;
}

// logger.ts
import { store } from "./store";
export const logger = {
  report(msg: string) {
    const xRequestId = store.get("xRequestId");
    return remoteLogger.log(`${xRequestId ?? '-'}: ${msg}`);
  }
}

The problem with this code shows up when request #2 reaches the backend while request #1 is still in flight. Request #2 overwrites the value of xRequestId, so when request #1 finally makes its last backend API call, its log entry carries the xRequestId of request #2.
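The race can be reproduced without a web server. The snippet below is a minimal sketch, not the server code itself: `handleRequest`, `sleep` and `simulateGlobalStoreRace` are hypothetical helpers, and the timer delays stand in for real I/O.

```typescript
// A module-level store shared by every request, as in store.ts above.
const store = new Map<string, string>();

// Hypothetical helper: stands in for an I/O call such as a DB query.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical handler: writes its id to the shared store, waits for "I/O",
// then reads the id back the way logger.report would.
async function handleRequest(xRequestId: string, ioDelayMs: number): Promise<string> {
  store.set("xRequestId", xRequestId);
  await sleep(ioDelayMs);
  return `${store.get("xRequestId") ?? "-"}: handled ${xRequestId}`;
}

// Request #1 is still waiting on I/O when request #2 arrives.
export function simulateGlobalStoreRace(): Promise<string[]> {
  return Promise.all([handleRequest("req-1", 20), handleRequest("req-2", 5)]);
}

simulateGlobalStoreRace().then((lines) => lines.forEach((line) => console.log(line)));
// req-2: handled req-1   <- request #1 is logged with the id of request #2
// req-2: handled req-2
```

Both handlers start before either one finishes waiting, so the second `store.set` wins and every later read sees `req-2`.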

So a global object is not an option as a context. We need a way to associate a context with an async flow, similar to TLS, but this time on a single thread.

Multithreading

“If you can’t solve a problem, then there is an easier problem you can solve: find it.” - George Polya

We have a solution for a multi-threaded environment, TLS, so maybe we should solve a different problem: making Node a multi-threaded environment. Then we could use TLS as well.

Despite its single-threaded nature, Node provides ways to run code in parallel: worker threads, child processes, and clusters. How would it work if we used a multi-threaded approach in Node? Say, for example, that every request, along with its async operations, is handled by its own worker thread. Since worker threads do not share memory by default, a global object could serve as our TLS implementation in this case.

“Unlike child_process or cluster, worker_threads can share memory. They do so by transferring ArrayBuffer instances or sharing SharedArrayBuffer instances.”

This means that child processes and clusters never share memory, and neither do worker threads unless you explicitly transfer ArrayBuffer instances or share SharedArrayBuffer instances.

Although it sounds like we have found a solution in theory, it is not practical. Unlike traditional multi-threaded environments, Node runs JavaScript on a single thread and relies on asynchronous, non-blocking I/O, which avoids the cost of switching between per-request threads; this is why it performs well on I/O-bound work. Worker threads were designed to handle CPU-intensive tasks, not I/O tasks, so if you need multiple threads to handle I/O, you probably shouldn’t use Node.

“Workers (threads) are useful for performing CPU-intensive JavaScript operations. They do not help much with I/O-intensive work. The Node.js built-in asynchronous I/O operations are more efficient than Workers can be.”

The solution - AsyncLocalStorage

AsyncLocalStorage is a built-in Node.js API that provides a way of propagating the context of the current async operation through the call chain without the need to explicitly pass it as a function parameter. It is similar to thread-local storage in other languages.

The main idea of AsyncLocalStorage is that we can wrap a function call with AsyncLocalStorage#run. All code invoked within the wrapped call, including code that runs later in its callbacks and promise chains, gets access to the same store, and that store is unique to each call chain.

Let's refactor our code to use the AsyncLocalStorage API:

// store.ts
import { AsyncLocalStorage } from 'node:async_hooks';
export const asyncLocalStorage = new AsyncLocalStorage<Map<string, string>>();

// server.ts
import { asyncLocalStorage } from "./store";
router.get(
  path.join('/', 'user-details'), async (req, res) => {
    const store = new Map<string, string>();
    store.set("xRequestId", generateXRequestId());

    // run() returns the callback's promise; await it so the handler does not leave it floating
    await asyncLocalStorage.run(store, async () => {
      const userDetailsRequest = extractRequestParams(req);
      const userDetailsResponse = await getUserDetails(userDetailsRequest);
      res.send(userDetailsResponse);
    });
  }
);

// getUserDetails.ts
import { logger } from "./logger";
export async function getUserDetails(userDetailsRequest: UserDetailsRequest) {
  const isPermitted = await checkUserReqPermissions(userDetailsRequest);
  if (!isPermitted) {
    throw new NotPermittedError('Not permitted');
  }

  const userDetails = await queryUserDetails(userDetailsRequest);
  await logger.report(`getUserDetails response ${userDetails} for request ${userDetailsRequest}`);

  return userDetails;
}

// logger.ts
import { asyncLocalStorage } from "./store";
export const logger = {
  report(msg: string) {
    const store = asyncLocalStorage.getStore();
    // getStore() returns undefined when called outside asyncLocalStorage.run()
    const xRequestId = store?.get("xRequestId");
    return remoteLogger.log(`${xRequestId ?? '-'}: ${msg}`);
  }
}

Now the logger has access to the xRequestId value, and AsyncLocalStorage handles the isolation between requests for me.
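To see the isolation in action without a web server, here is a minimal sketch that starts two overlapping flows. `handleRequest`, `sleep` and `simulateIsolatedFlows` are hypothetical helpers, and the timer delays stand in for real I/O.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

const asyncLocalStorage = new AsyncLocalStorage<Map<string, string>>();

// Hypothetical helper: stands in for an I/O call such as a DB query.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical handler: waits for "I/O", then reads the id from the current
// store the way logger.report does. The id is never passed in for logging.
async function handleRequest(xRequestId: string, ioDelayMs: number): Promise<string> {
  await sleep(ioDelayMs);
  const seen = asyncLocalStorage.getStore()?.get("xRequestId");
  return `${seen ?? "-"}: handled ${xRequestId}`;
}

// Each flow gets its own Map, so the overlapping requests cannot clash.
export function simulateIsolatedFlows(): Promise<string[]> {
  const start = (xRequestId: string, ioDelayMs: number) =>
    asyncLocalStorage.run(new Map<string, string>([["xRequestId", xRequestId]]), () =>
      handleRequest(xRequestId, ioDelayMs),
    );
  return Promise.all([start("req-1", 20), start("req-2", 5)]);
}

simulateIsolatedFlows().then((lines) => lines.forEach((line) => console.log(line)));
// req-1: handled req-1
// req-2: handled req-2
```

Even though request #2 starts and finishes while request #1 is still waiting, each flow reads back its own id.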

What about performance?

While you can create your own implementation on top of the node:async_hooks module, AsyncLocalStorage should be preferred as it is a performant and memory safe implementation that involves significant optimizations that are non-obvious to implement.

Disclaimer: The performance test was done on a Mac-M1 machine with Node version 16.14.0

To test the performance impact of AsyncLocalStorage, I created four functions that simulate different use cases.

1. Regular (no hooks)

function simpleAsyncOperation() {
  return new Promise((resolve) => {
    setTimeout(() => {
        resolve("done");
    });
  });
}

measure(simpleAsyncOperation);

2. With AsyncLocalStorage

import {AsyncLocalStorage} from "node:async_hooks";
export const asyncLocalStorage = new AsyncLocalStorage();

function withAsyncLocalStorage() {
    return asyncLocalStorage.run({}, () => {
        return simpleAsyncOperation();
    });
}

measure(withAsyncLocalStorage);

3. With a single async hook

import async_hooks from "node:async_hooks";

function setupWithAsyncHook() {
    return async_hooks.createHook({
        init(asyncId, type, triggerAsyncId, resource) { },
        before(asyncId) { },
        after(asyncId) { },
        destroy(asyncId) { },
        promiseResolve(asyncId) { },
    }).enable();
}

setupWithAsyncHook();
measure(simpleAsyncOperation);

Note: A custom implementation of AsyncLocalStorage can be built on top of a single async hook like this one.

4. With an async hook for every async operation

function withAsyncHooks() {
  setupWithAsyncHook();
  return simpleAsyncOperation();
}

measure(withAsyncHooks);

You can see that both async hooks and AsyncLocalStorage impact performance. For async hooks, this is stated pretty clearly in the async_hooks docs:

“Please migrate away from this API, if you can. We do not recommend using the createHook, AsyncHook, and executionAsyncResource APIs as they have usability issues, safety risks, and performance implications.”

I left the last use case (an async hook per async operation) out of the chart because I couldn’t think of a real use case for it. Creating an async hook on demand does not really make sense, and the numbers were too extreme to compare.

The numbers were:
1,000 requests - 180ms
10,000 requests - 14,000ms
100,000 requests - I stopped the program after 2 min

Here is the code of the measure function:

import { PerformanceObserver, performance } from "node:perf_hooks";

export async function measure(callback: () => Promise<unknown>, label = "hooks-performance-test") {
  const perfObserver = new PerformanceObserver((items) => {
    items.getEntries().forEach((entry) => {
      console.log(entry)
    })
  });
  
  perfObserver.observe({ entryTypes: ["measure"], buffered: true });

  performance.mark("performance-test-start");
  await callback();
  performance.mark("performance-test-end");
  performance.measure(label, "performance-test-start", "performance-test-end");
}
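The listings above do not show how the request counts were generated. One plausible way, sketched here as an assumption rather than the original benchmark, is to start N operations concurrently and time the whole batch. `runBatch` and `timeBatch` are hypothetical helpers; `simpleAsyncOperation` and `withAsyncLocalStorage` are redefined so that the snippet is self-contained.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { performance } from "node:perf_hooks";

const asyncLocalStorage = new AsyncLocalStorage<object>();

// Same shape as the baseline above: resolves from a timer callback.
function simpleAsyncOperation(): Promise<string> {
  return new Promise((resolve) => {
    setTimeout(() => resolve("done"));
  });
}

// The same operation, started inside an AsyncLocalStorage context.
function withAsyncLocalStorage(): Promise<string> {
  return asyncLocalStorage.run({}, () => simpleAsyncOperation());
}

// Hypothetical helper: starts `count` operations at once and waits for all of them.
async function runBatch(operation: () => Promise<unknown>, count: number): Promise<void> {
  await Promise.all(Array.from({ length: count }, () => operation()));
}

// Hypothetical helper: elapsed wall-clock time of one batch, in milliseconds.
export async function timeBatch(operation: () => Promise<unknown>, count: number): Promise<number> {
  const start = performance.now();
  await runBatch(operation, count);
  return performance.now() - start;
}

async function main(): Promise<void> {
  for (const count of [1_000, 10_000]) {
    const plain = await timeBatch(simpleAsyncOperation, count);
    const withAls = await timeBatch(withAsyncLocalStorage, count);
    console.log(`${count} ops: plain ${plain.toFixed(1)}ms, AsyncLocalStorage ${withAls.toFixed(1)}ms`);
  }
}

main().catch((error) => console.error(error));
```

Timings from a sketch like this depend heavily on the machine and the Node version, so treat them as relative comparisons only.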

Performance Summary

Async hooks and AsyncLocalStorage have a significant impact on the application performance
It is better to use the built-in AsyncLocalStorage than a custom one, as stated in the docs

Context Loss

In most cases, AsyncLocalStorage works without issues. In rare situations, the current store is lost in one of the asynchronous operations.

If your code is promise-based, there should be no problem. If it is callback-based, it is enough to promisify it with util.promisify() so that it starts working with native promises.
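Here is a minimal sketch of one way the store can get lost, and how promisifying helps. It assumes a hypothetical callback-based API (`legacyFetch` with a manual `flush`) that queues callbacks and later invokes them from code running outside the original flow, roughly the way a pooled resource might. All names in the snippet are illustrative.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { promisify } from "node:util";

const asyncLocalStorage = new AsyncLocalStorage<string>();

type Callback = (err: Error | null, value: string) => void;

// Hypothetical callback-based API: it queues callbacks and invokes them later
// from whichever code calls flush(), not from the flow that registered them.
const pending: Callback[] = [];
function legacyFetch(callback: Callback): void {
  pending.push(callback);
}
function flush(): void {
  for (const callback of pending.splice(0)) callback(null, "payload");
}

// Callback style: the callback body runs in flush()'s context, so the store is gone.
function readStoreInCallback(): Promise<string | undefined> {
  return new Promise((resolve) => {
    asyncLocalStorage.run("req-1", () => {
      legacyFetch(() => resolve(asyncLocalStorage.getStore()));
    });
  });
}

// Promisified: the code after `await` resumes in the context that awaited.
const legacyFetchAsync = promisify(legacyFetch);
function readStoreAfterAwait(): Promise<string | undefined> {
  return asyncLocalStorage.run("req-1", async () => {
    await legacyFetchAsync();
    return asyncLocalStorage.getStore();
  });
}

export async function compareStyles() {
  const viaCallback = readStoreInCallback();
  const viaPromise = readStoreAfterAwait();
  flush(); // called outside any asyncLocalStorage.run()
  return { viaCallback: await viaCallback, viaPromise: await viaPromise };
}

compareStyles().then((result) => console.log(result));
// { viaCallback: undefined, viaPromise: 'req-1' }
```

The raw callback executes wherever `flush()` happens to be called, while the promise continuation is tied to the flow that created and awaited the promise.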

Conclusion

AsyncLocalStorage is the best option for a TLS-like solution that keeps a local context throughout an async flow.

Before using it, you should be aware of its pitfalls:

possible context loss if you have a callback-based API
performance impact