I was recently tasked with finding a NodeJS memory leak, and fixing it of course. Diving into the inner workings of a managed language is always difficult because we’re used to not having to think about memory management. So here’s how I approached it.
I always stress the fact that teams should be able to run their applications locally. I would think this is a no-brainer these days, but I still encounter teams where running locally is difficult or impossible. Luckily, our application could be run locally. It’s a Serverless application that we run on AWS Lambda, and with the serverless-offline package, we could run it with the following (redacted) command:
node --inspect ./node_modules/serverless/bin/serverless offline start -r eu-west-1 --stage dev
The important part is the --inspect flag. It means we can attach an inspector to look at our memory.
Next I opened a Chromium-based browser (Edge in my case) and went to the Inspection screen by navigating to edge://inspect (chrome://inspect for Chrome). It has a “Memory” tab which is what I needed:
In this tab, you can see the current memory usage, but it’s the “Take Snapshot” button at the bottom that is useful. I took a first snapshot as a baseline and then used my application heavily (using a Postman collection to hammer it with HTTP requests).
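If you don’t have a Postman collection at hand, a small script can generate the load as well. The following is only a minimal sketch, assuming Node.js 18 or later (for the global fetch). The hammer function, its parameters and the URL in the example comment are hypothetical and not part of our project:

```javascript
// Minimal load generator: sends `total` GET requests, `concurrency` at a time.
// Assumes Node.js 18+ (global fetch). All names here are illustrative.
async function hammer(url, total = 1000, concurrency = 20) {
  let sent = 0;
  let failed = 0;

  // Each worker keeps requesting until the shared request budget is used up.
  async function worker() {
    while (sent < total) {
      sent += 1;
      try {
        const res = await fetch(url);
        await res.arrayBuffer(); // drain the body so the connection is released
        if (!res.ok) failed += 1;
      } catch {
        failed += 1;
      }
    }
  }

  await Promise.all(Array.from({ length: concurrency }, worker));
  return { sent, failed };
}

// Example with a hypothetical local endpoint:
// hammer('http://localhost:3000/dev/health').then(console.log);
```

The exact requests matter less than being able to repeat the same load between two snapshots.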
After a while, I took a second snapshot.
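Snapshots can also be written from code instead of clicking in the browser, which helps when you want to script the before and after. This is a minimal sketch using Node’s built-in v8.writeHeapSnapshot. The takeSnapshot helper and its file naming are my own assumptions:

```javascript
const v8 = require('node:v8');
const os = require('node:os');
const path = require('node:path');

// Writes a .heapsnapshot file that can be loaded into the DevTools Memory tab.
// Note: the call is synchronous and blocks the event loop while it runs.
function takeSnapshot(label, dir = os.tmpdir()) {
  const file = path.join(dir, `${label}-${Date.now()}.heapsnapshot`);
  return v8.writeHeapSnapshot(file); // returns the path that was written
}

// Example: one snapshot before the load test, one after.
// const before = takeSnapshot('before');
// ... hammer the application ...
// const after = takeSnapshot('after');
```

Both files can then be loaded into the Memory tab and compared in the same way as snapshots taken in the browser.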
You can select the second snapshot and then use the dropdown to compare it with the first one:
I could see that it was the require-from-string package loading these strings. I saw a lot of other duplicate strings that were referenced by GC roots, which I believe isn’t an issue because GC roots are the starting points the garbage collector traces from (let me know if I’m wrong and it is a problem).
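To illustrate the general pattern, here is a made-up example of how strings can pile up like this. It is not the actual require-from-string code; loadLeaky, loadClean and the loaded array are hypothetical names. The idea is simply that anything still referenced from a long-lived object can never be collected:

```javascript
// Hypothetical leak, NOT the actual require-from-string code:
// every call stores its source string in a long-lived array, so the
// strings stay reachable and the garbage collector can never free them.
const loaded = []; // module-level, lives as long as the process

function loadLeaky(source) {
  const record = { source, loadedAt: Date.now() };
  loaded.push(record); // reference kept forever
  return record;
}

// Same work without the leak: nothing outlives the call unless the caller keeps it.
function loadClean(source) {
  return { source, loadedAt: Date.now() };
}

// Simulate 1,000 "requests" that each load one of ten templates.
for (let i = 0; i < 1000; i += 1) {
  loadLeaky(`module.exports = "template ${i % 10}";`);
  loadClean(`module.exports = "template ${i % 10}";`);
}

console.log(loaded.length); // 1000 retained records, many with duplicate content
```

In a snapshot comparison, a pattern like this shows up as a growing number of near-identical strings that are all retained through the same object.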
Fixing The Issue
We were using an older version of this package and the newer one contains a fix for this issue. So this time I was lucky and just had to update the package.
Verifying The Results
The real test is to see how this affects the production environment. Of course, we deployed to non-production environments first and things looked good there, without any regression bugs.
This is what the memory usage looked like in production:
The significant drop might be attributed to cold starts, but the fact that the memory usage stays lower is a good sign.
We also saw a drop in our P99 latency (the orange line):
So what have I learned?
Analyzing NodeJS (or V8) memory dumps is complex. But if you can set up your application to run locally and have a suite of requests you can launch, you should be able to create useful heap dumps. Then it takes a little digging and scrolling, but you should be able to find the variables that are building up without being collected by the garbage collector.
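Before digging through snapshots, it can help to confirm that the heap really keeps growing during a test run. Below is a minimal sketch that logs heap usage at an interval, based on process.memoryUsage(). The helper names and the default interval are assumptions of mine:

```javascript
// Samples heap usage so growth over a test run is visible without DevTools.
function heapUsedMb() {
  return process.memoryUsage().heapUsed / 1024 / 1024;
}

// Logs heap usage every `intervalMs` milliseconds; returns a function that stops it.
function startHeapLog(intervalMs = 5000, log = console.log) {
  const startedAt = Date.now();
  const timer = setInterval(() => {
    const seconds = Math.round((Date.now() - startedAt) / 1000);
    log(`t+${seconds}s heapUsed=${heapUsedMb().toFixed(1)} MB`);
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for logging
  return () => clearInterval(timer);
}

// Example: call once at startup of the local process, stop when done.
// const stopHeapLog = startHeapLog();
// ... run the load test ...
// stopHeapLog();
```

A number that keeps climbing under a steady load, and never comes back down, is a good reason to take snapshots and compare them.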
Another takeaway is that this shows why we need to keep our dependencies up to date. The update contained a fix. But what’s more: the longer we wait to update dependencies, the harder it becomes because of breaking changes.
Good luck finding those memory leaks!