“Simple things bring infinite pleasure. Yet, it takes us a while to realize that. But once simple is in, complex is out – forever.” ― Joan F. Marques
Now that I have your attention, let me clear up the word “flames.” The flames that I’m referring to have nothing to do with fire. I am talking about performance tools in Node.js. When it comes to performance, everyone thinks of fighting fires, because many consider performance optimization a nightmare. Most of us also assume that only a few individuals are masters of profiling.
Anyone can become a master of profiling when given simple tools. At eBay, we strive to make things simple and easy for our developers to use. In the course of Node.js development and while handling production issues, we soon realized that profiling in Node.js is not an easy thing to do.
Before jumping to the CPU profiling tool that simplified our lives, let me walk you through our journey that ended up in seeing flame charts from a completely different angle.
Flame graphs using kernel tools
With Brendan Gregg’s flame graph generation, it was much easier to visualize CPU bottlenecks. However, we need to run a small number of tools and scripts to generate these graphs.
Yunong Xiao has posted an excellent blog post on how to generate flame graphs using the perf command based on Gregg’s tools. Kernel tools like DTrace (BSD and Solaris) and perf (Linux) are very useful for generating stack traces from the core level and transforming the stack calls into flame graphs. This approach gives us a complete picture, from Node internals and the V8 engine all the way to JS code.
However, running tools like these requires a good understanding of the tools themselves, and sometimes even a different OS. In most cases your production box and your profiling box are set up completely differently. This makes it hard to investigate an issue occurring in production, because one has to attempt to reproduce it in a completely different environment.
After managing to run the tools, you will end up with flame charts like this.
Image source from Yunong Xiao’s blog
Here are some pros and cons for this approach.
Pros:
- Easy to find CPU bottleneck
- Graphical view
- Complete profile graph for native and JS frames.
Cons:
- Complexity in generating graphs.
- Limited DTrace support across platforms, which makes it harder to profile on DEV boxes
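To make the "stack traces to flame graphs" step a little more concrete, here is a minimal sketch of the collapsing idea behind it: identical call stacks are counted and written out as one `root;child;leaf count` line each, which is the kind of folded input that flame graph scripts work from. The function name `foldStacks` and the sample stacks are hypothetical and invented for illustration; real perf or DTrace output needs considerably more parsing than this.

```javascript
// Collapse sampled call stacks into "folded" lines: "root;child;leaf count".
// Each sample is an array of frame names ordered from root to leaf.
function foldStacks(samples) {
  const counts = new Map();
  for (const stack of samples) {
    const key = stack.join(';');
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Sort by stack string for stable output; one line per distinct stack.
  return Array.from(counts.entries())
    .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0))
    .map(([stack, count]) => stack + ' ' + count);
}

// Hypothetical samples, invented purely for illustration.
const stackSamples = [
  ['main', 'handleRequest', 'render'],
  ['main', 'handleRequest', 'render'],
  ['main', 'handleRequest', 'parseQuery'],
  ['main', 'gc'],
];

const foldedLines = foldStacks(stackSamples);
console.log(foldedLines.join('\n'));
```

The width of each box in a flame graph corresponds to these counts, so a stack that shows up in many samples is drawn wide.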
Chrome profiling tool
The Chrome browser is just amazing. It is famous not only for its speed but also for its V8 engine, which is core to Node.js. In addition to these features, one tool that web developers love about Chrome is Developer Tools.
There is one tool inside Developer Tools that is used to profile browser-side JS. The v8-profiler enables us to use server-side profile data in the Chrome Profile tool.
Let us see how we can use this for profiling our Node.js application. Before using Profiles in Chrome, we have to generate some profiling data from our running Node.js application. We will use v8-profiler for creating CPU profile data.
In the following code, I have created a route /cpuprofile for generating CPU profile data for a given number of seconds and then streaming the dump to a browser to open in Chrome.
This sample code creates a CPU dump using v8-profiler.
//file index.js
var express = require('express');
var profiler = require('v8-profiler');
var app = express();

app.get('/', function(req, res){
  res.send('Hello World!!');
});

app.get('/cpuprofile', function(req, res){
  //Duration in seconds; fall back to 2 if missing or not a number
  var duration = Number(req.query.duration) || 2;
  res.header('Content-type', 'application/octet-stream');
  res.header('Content-Disposition', 'attachment; filename=cpu-profile' + Date.now() + '.cpuprofile');
  //Start Profiling
  profiler.startProfiling('CPU Profile', true);
  setTimeout(function(){
    //Stop Profiling after duration
    var profile = profiler.stopProfiling();
    //Pipe profile dump to browser, then free the profile
    profile.export().pipe(res).on('finish', function() {
      profile.delete();
    });
  }, duration * 1000); //Convert to millisec
});

app.listen(8080);
To generate CPU profile data, use these steps:
- Start your app.
  node index.js
  It’s a good idea to run ab to put some load on the page.
- Access the CPU profile dump using http://localhost:8080/cpuprofile?duration=2. A cpu-profile.cpuprofile file will be downloaded from the server.
- Load the downloaded file cpu-profile.cpuprofile in Chrome using Developer Tools > Profiles > Load. Upon loading, you should see in your Profiles tab something like the following.
Now that you have opened the profile data, you can drill down the tree and analyze which piece of code is taking the most CPU time. With this tool, anyone can generate profile data anytime with just one click, but just imagine how hard it is to drill down through this tree structure when you have a big application.
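As a rough illustration of what drilling down the tree involves, the sketch below walks a profile tree and totals the self sample counts per function, so the hottest functions surface without manual clicking. It assumes the dump has a root node under `head` and that each node carries `functionName`, `url`, `lineNumber`, `hitCount`, and `children`; that layout is an assumption about the file and may differ between profiler versions. The helper name `selfTimeByFunction` and the sample profile are hypothetical.

```javascript
// Walk a .cpuprofile-style call tree and total self samples per function.
// ASSUMPTION: nodes look like
//   { functionName, url, lineNumber, hitCount, children }.
function selfTimeByFunction(root) {
  const totals = new Map();
  const pending = [root];
  while (pending.length > 0) {
    const node = pending.pop();
    const key = node.functionName + ' (' + node.url + ':' + node.lineNumber + ')';
    totals.set(key, (totals.get(key) || 0) + (node.hitCount || 0));
    for (const child of node.children || []) pending.push(child);
  }
  // Highest self sample count first.
  return Array.from(totals, ([name, hits]) => ({ name, hits }))
    .sort((a, b) => b.hits - a.hits);
}

// Hypothetical profile, invented purely for illustration.
const sampleProfile = {
  head: {
    functionName: '(root)', url: '', lineNumber: 0, hitCount: 0,
    children: [
      {
        functionName: 'handler', url: '/app/index.js', lineNumber: 10, hitCount: 2,
        children: [
          { functionName: 'render', url: '/app/view.js', lineNumber: 5, hitCount: 7, children: [] },
        ],
      },
      { functionName: 'render', url: '/app/view.js', lineNumber: 5, hitCount: 3, children: [] },
    ],
  },
};

const hottest = selfTimeByFunction(sampleProfile.head);
console.log(hottest.slice(0, 3));
```

A flat list like this loses the caller context, which is exactly what a flame graph keeps.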
In comparison to Flame Graphs using Kernel Tools, here are some pros and cons.
Pros
- Easy generation of a profile dump
- Platform independent
- Profiling available during live traffic
Cons
- Chrome provides a graphical view for profile data, but the data is not aggregated and navigation is limited.
Flame graphs @ eBay
Now that we have seen two different approaches for generating CPU profile data, let us see how we can bring in a nice graphical view like flame graphs to V8-profiler data.
At eBay, we have taken a different approach to make this a very simple, easy-to-use tool for our Node.js developers. We took the V8-profiler data, applied the aggregation algorithm, and rendered the data as flame charts using the d3-flame-graphs module.
If you look at the .cpuprofile file closely (created above), it is basically a JSON file. We came across a generic d3-flame-graphs library that can draw flame graphs in a browser using input JSON data. Thanks to “cimi” for his d3-flame-graphs module.
After we made some modifications to the chrome2calltree aggregation algorithm and aggregated the profile data (removing core-level CPU profile data), we could convert the .cpuprofile data file to JSON that can be read by d3-flame-graphs, and the final outcome is simply amazing.
Three-step process
- Generate .cpuprofile on demand using v8-profiler as shown in Chrome Profiling Tool.
- Convert .cpuprofile into aggregated JSON format (source code).
- Load the JSON using d3-flame-graphs to render the flame graph on browser.
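The conversion step can be sketched roughly as follows. This is not the modified chrome2calltree algorithm described above, only a simplified stand-in under stated assumptions: input nodes carry `functionName`, `url`, `hitCount`, and `children`; a frame counts as application JS when its `url` is an absolute path; and the renderer accepts nested `{ name, value, children }` objects. The helper names `isAppFrame`, `mergeChildren`, and `toFlameNode`, and the sample tree, are hypothetical.

```javascript
// Convert a .cpuprofile-style call tree into { name, value, children } nodes.
// ASSUMPTION: frames whose url is an absolute path are application JS;
// everything else (for example 'timers.js') is treated as a core frame.
function isAppFrame(node) {
  return typeof node.url === 'string' && node.url.startsWith('/');
}

// Merge sibling nodes that share a name, summing values recursively.
function mergeChildren(list) {
  const byName = new Map();
  for (const node of list) {
    const existing = byName.get(node.name);
    if (existing) {
      existing.value += node.value;
      existing.children = mergeChildren(existing.children.concat(node.children));
    } else {
      byName.set(node.name, { name: node.name, value: node.value, children: node.children });
    }
  }
  return Array.from(byName.values());
}

function toFlameNode(node) {
  let value = node.hitCount || 0; // self samples
  let children = [];
  for (const child of node.children || []) {
    const flame = toFlameNode(child);
    value += flame.value; // total = self + all descendants
    // Keep app frames; drop core frames but hoist their children.
    children = children.concat(isAppFrame(child) ? [flame] : flame.children);
  }
  return { name: node.functionName, value: value, children: mergeChildren(children) };
}

// Hypothetical call tree, invented purely for illustration.
const callTree = {
  functionName: '(root)', url: '', hitCount: 0, children: [
    { functionName: 'listOnTimeout', url: 'timers.js', hitCount: 1, children: [
      { functionName: 'handler', url: '/app/index.js', hitCount: 2, children: [
        { functionName: 'render', url: '/app/view.js', hitCount: 4, children: [] } ] } ] },
    { functionName: 'emit', url: 'events.js', hitCount: 0, children: [
      { functionName: 'handler', url: '/app/index.js', hitCount: 1, children: [
        { functionName: 'parse', url: '/app/parse.js', hitCount: 2, children: [] } ] } ] },
  ],
};

const flameData = toFlameNode(callTree);
console.log(JSON.stringify(flameData, null, 2));
```

In this sketch the two `handler` frames reached through different core callers collapse into a single box, which is the kind of aggregation that keeps the chart readable for application developers.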
Output
This time, access the CPU flame graph in a browser using the same URL (http://localhost:8080/cpuprofile?duration=2) from Chrome Profiling Tool.
The above flame chart shows only JS frames, which is what most Node application developers are interested in.
Third-party packages used
- expressjs (npm)
- v8-profiler (npm)
- d3-flame-graph (bower)
- Aggregation algorithm (modified)
Pros
- Easy and simple to generate flame graphs
- Doesn’t need special setup
- Platform independent
- Early performance analysis during development
- Graphical view integrated into every application
Cons
- Imposes 10% overhead during profiling
Summary
To summarize, we have seen three different ways of profiling CPU in Node.js, starting from using OS-level tools to rendering flame graphs on a browser using simple open source JS code. Simple and easy-to-use tools help anyone master profiling and performance tuning. At eBay, we always strive to make some difference in our developers’ lives.




Great work! It’s good to have another option for Node.js profiling, and I’m sure this will be useful. Thanks for posting the instructions!
I’d add another Con to the v8-profiler approach: it’s blind to CPU time in v8 internals, system libraries, and the kernel. Most of the time that’s not an issue, as you’re debugging issues in your app, but sometimes the issue is elsewhere.
The current state at Netflix: we’re using Linux perf to profile Node.js at Netflix (Yunong’s link had the instructions), using v8’s --perf-basic-prof-only-functions option, although sometimes we get “[unknown]” symbols due to a buffering bug, which has been filed as https://bugs.chromium.org/p/v8/issues/detail?id=5015. Looks like that bug has now been fixed in the latest v8; we have a workaround anyway.
Nice. Good work Mahesh
I am utterly confused – what is the difference between `d3-flame-graphs` and the flame charts that are already provided by Chrome Developer Tools (http://i.imgur.com/B7hH5EL.png)?
Hi Gajus,
With Chrome, you will see a lot of profile data from Node & V8. In this approach we are showing only what a JS developer needs.
Also this is browser independent, you can draw the flame graphs on the fly on any browser.
-Mahesh
Flames of failure! 🔥
Did you investigate the debugging improvements in node 6.3? It essentially exposes v8-inspector via the inspector protocol so you can connect to it with the Chrome DevTools.
PR: https://github.com/nodejs/node/pull/6792
If so, what did you think?
This looks great! It’d be awesome if you guys could open source the amends you made to https://github.com/jlfwong/chrome2calltree, even better if there was the full express example.
Thanks!
Great stuff.
On demand cpu profiling is what I was looking for.
For flamegraph method I’m having some trouble. Can you please help me on my question on stackoverflow http://stackoverflow.com/questions/40112106/unable-to-create-flamegraph-for-nodejs-process/ .
Was there a link to source step 2 of your final/preferred method that has since been removed?
2. Convert .cpuprofile into aggregated JSON format (source code). <==??
This seems quite cool, but without some direction from a source perspective, it seems like a fair amount of research to take advantage of the information.