I was recently tasked with finding a NodeJS memory leak, and of course fixing it. Diving into the deep inner workings of a managed language is always tricky because we’re used to not having to deal with memory ourselves. So here’s how I approached it.

Run Locally

I always stress that teams should be able to run their applications locally. I would think this is a no-brainer these days, but I still encounter teams where running locally is difficult or impossible. Luckily, our application could be run locally. It’s a Serverless Framework application that runs on AWS Lambda, and with the serverless-offline plugin we could start it with the following (redacted) command:

node --inspect ./node_modules/serverless/bin/serverless offline start -r eu-west-1 --stage dev

The important part is the --inspect flag: it starts Node’s inspector so we can attach to the process and look at its memory.
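
If attaching a browser is inconvenient (on a remote box, for example), Node can also write snapshots from code. This is not what I did, just a minimal sketch of the built-in alternative using the v8 module; the file it writes can be loaded into the same DevTools Memory tab:

// snapshot.js: write a .heapsnapshot file from inside the process (Node 12+).
const v8 = require('v8');
const file = v8.writeHeapSnapshot(); // returns the generated file name
console.log(`Heap snapshot written to ${file}`);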

Inspect

Next I opened a Chromium-based browser (Edge in my case) and went to the inspection screen by navigating to edge://inspect (chrome://inspect for Chrome). Clicking “inspect” on the Node target opens DevTools, which has a “Memory” tab, and that’s what I needed:

In this tab, you can see the current memory usage, but it’s the “Take Snapshot” button at the bottom that is useful. I took a snapshot and then exercised the application heavily (using a Postman collection to hammer it with HTTP requests).
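
If you don’t have a Postman collection handy, a small script works just as well. This is only a sketch with a placeholder URL, route, and request count (serverless-offline listens on http://localhost:3000 by default), not the collection I actually used:

// hammer.js: fire a batch of requests at the locally running service.
// Requires Node 18+ for the global fetch; endpoint and count are placeholders.
const ENDPOINT = 'http://localhost:3000/dev/example';
async function hammer(count) {
  for (let i = 1; i <= count; i++) {
    const res = await fetch(ENDPOINT);
    await res.text(); // drain the body so the connection can be reused
    if (i % 100 === 0) console.log(`${i} requests sent`);
  }
}
hammer(1000).catch(console.error);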

After a while, I took a second snapshot.

Compare

You can select the second snapshot and then switch the view dropdown from “Summary” to “Comparison” to compare it with the first one:

I saw all sorts of numbers, but what stood out was a lot of identical strings being allocated (I sorted by allocated size, clicked the “Show all” button and then scrolled through the list):

I could see that it was the require-from-string package loading these strings. I also saw a lot of other duplicate strings that were referenced by GC roots, which I believe isn’t an issue because those are held by the garbage collector itself (let me know if I’m wrong and it is a problem).
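
For what it’s worth, you can check this kind of leak in isolation. The sketch below is something I’m adding for illustration rather than part of the original investigation: it compiles the same source string over and over with require-from-string and prints the heap usage, so a leaky version should show steadily climbing numbers while a fixed one levels off after garbage collection. Run it with node --expose-gc leak-check.js so the explicit GC calls work:

// leak-check.js: compile a string as a module repeatedly and watch the heap.
const requireFromString = require('require-from-string');
const source = 'module.exports = { answer: 42 };';
for (let i = 1; i <= 50000; i++) {
  requireFromString(source, `fake-module-${i}.js`);
  if (i % 10000 === 0) {
    global.gc?.(); // only defined when Node is started with --expose-gc
    const usedMb = process.memoryUsage().heapUsed / 1024 / 1024;
    console.log(`${i} iterations: ${usedMb.toFixed(1)} MB heap used`);
  }
}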

Fixing The Issue

We were using an older version of this package and the newer one contains a fix for this issue. So this time I was lucky and just had to update the package.
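
For a direct dependency that’s a one-liner; if the package only comes in transitively, updating whichever dependency pulls it in (or an npm override) achieves the same thing. Roughly, with the exact version constraint depending on your setup:

npm install require-from-string@latest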

Verifying The Results

The real test is to see how this affects the production environment. Of course, we deployed to non-production environments first and things looked good there, without any regression bugs.

This is what the memory usage looked like in production:

The significant drop might be attributed to cold starts, but the fact that the memory usage stays lower is a good sign.

We also saw a drop in our P99 latency (the orange line):

Key Takeaways

So what have I learned?

Analyzing NodeJS (or V8) memory dumps is complex. But if you can set up your application to run locally and have a suite of requests you can fire at it, you should be able to create useful heap snapshots. Then it takes a little digging and scrolling, but you should be able to find the objects that keep building up without being collected by the garbage collector.

Another takeaway is that this shows why we need to keep our dependencies up to date: in this case, the update already contained the fix. What’s more, the longer we wait to update dependencies, the harder it becomes because breaking changes pile up.

Good luck finding those memory leaks!
