For the most part, eBay runs on a Java-based tech stack. Our entire workflow centers around Java and the JVM. Considering the scale of traffic and the stability required by a site like ebay.com, using a proven technology was an obvious choice. But we have always been open to new technologies, and Node.js has been topping the list of candidates for quite some time. This post highlights a few aspects of how we developed eBay's first Node.js application.
It all started when a bunch of eBay engineers (Steven, Venkat, and Senthil) wanted to bring an eBay Hackathon-winning project called "Talk" to production. When we found that Java did not seem to fit the project requirements (no offense), we began exploring the world of Node.js. Today, we have a full Node.js production stack ready to rock.
We had two primary requirements for the project. First was to make the application as real time as possible--i.e., maintain live connections with the server. Second was to orchestrate a huge number of eBay-specific services that display information on the page--i.e., handle I/O-bound operations. We started with the basic Java infrastructure, but it consumed many more resources than expected, raising questions about scalability for production. These concerns led us to build a new mid-tier orchestrator from scratch, and Node.js seemed to be a perfect fit.
Within a couple of days, we had an exhaustive list to work on. As expected, the most common questions centered around the reliability of the stack and the efficiency of Node.js in handling eBay-specific functionality previously implemented in Java. We answered each one of the questions, providing details with real-world examples. At times this exercise was eye-opening even for us, as we had never considered the angle that some of the questions presented. By the end of the exercise, people understood the core value of Node.js; indeed, some of the con arguments proved to be part of the beauty of the language.
Once we had passed the test of our peers' scrutiny, we were all clear to roll.
We started from a clean slate. Our idea was to build a bare minimum boilerplate Node.js server that scales; we did not want to bloat the application by introducing a proprietary framework. The first four node modules we added as dependencies were express, cluster, request, and async. For data persistence, we decided on MongoDB, to leverage its ease of use as well as its existing infrastructure at eBay. With this basic setup, we were able to get the server up and running on our developer boxes. The server accepted requests, orchestrated a few eBay APIs, and persisted some data.
For end-to-end testing, we configured our frontend servers to point to the Node.js server, and things seemed to work fine. Now it was time to get more serious. We started white-boarding all of our use cases, nailed down the REST end points, designed the data model and schema, identified the best node modules for the job, and started implementing each end point. The next few weeks we were heads down--coding, coding, and coding.
Once the application reached a stable point, it was time to move from a developer instance to a staging environment. This is when we started looking into deployment of the Node.js stack. Our objectives for deployment were simple: Automate the process, build once, and deploy everywhere. This is how Java deployment works, and we wanted Node.js deployment to be as seamless and easy as possible.
We were able to leverage our existing cloud-based deployment system. All we needed to do was write a shell script and run it through our Hudson CI job. Whenever code is checked in to the master branch, the Hudson CI job kicks off. Using the shell script, this job builds and packages the Node.js bundle, then pushes it to the deployment cloud. The cloud portal provides an easy user interface to choose the environment (QA, staging, or pre-production) and activate the application on the associated machines.
Now we had our Node.js web service running in various stable environments. This whole deployment setup was quicker and simpler than we had expected.
At eBay, we have logging APIs that are well integrated with the Java thread model as well as at the JVM level. An excellent monitoring dashboard built on top of the log data can generate reports, along with real-time alerts if anything goes wrong. We achieved similar monitoring for the Node.js stack by hooking into the centralized logging system. Fortunately for us, we had logging APIs to consume. We developed a logger module and implemented three different logging APIs:
- Code-level logging. This level includes logging of errors/exceptions, DB queries, HTTP service calls, transaction metadata, etc.
- Machine-level logging. This level includes heartbeat data about CPU/memory and other OS statistics. Machine-level logging occurs at the cluster module level; we extended the npm cluster module and created an eBay-specific version.
- Logging at the load balancer level. All Node.js production machines are behind a load balancer, which sends periodic signals to the machines and ensures they are in good health. In the case of a machine going down, the load balancer fails-over to a backup machine and sends alerts to the operations and engineering teams.
We made sure the log data formats exactly matched the Java-based logs, thus generating the same dashboards and reports that everyone is familiar with.
One particular logging challenge we faced was due to the asynchronous nature of the Node.js event loop. The result was that the logging of transactions was completely crossed. To understand the problem, let's consider the following use case: The Node process starts a URL transaction and issues a DB query with an async callback. The process will now proceed with the next request, before the DB transaction finishes. This being a normal scenario in any event loop-based model like Node.js, the logs are crossed between multiple URL transactions, and the reporting tool shows scrambled output. We have worked out both short-term and long-term resolutions for this issue.
With all of the above work completed, we are ready to go live with our Hackathon project. This is indeed the first eBay application to have a backend service running on Node.js. We've already had an internal employee-only launch, and the feedback was very positive--particularly on the performance side. Exciting times are ahead!