eBay Tech Blog

Problem statement

In eBay’s existing CI model, each developer gets a personal CI/Jenkins Master instance. This Jenkins instance runs within a dedicated VM, and over time the result has been VM sprawl and poor resource utilization. We started looking at solutions to maximize our resource utilization and reduce the VM footprint while still preserving the individual CI instance model. After much deliberation, we chose Apache Mesos for a POC. This post shares the journey of how we approached this challenge and accomplished our goal.

Jenkins framework’s Mesos plugin

The Mesos plugin is Jenkins’ gateway into the world of Mesos, so it made perfect sense to bring the plugin code in sync with our requirements. This video explains the plugin. The eBay PaaS team made several pull requests to the plugin code, adding both new features and bug fixes. We are grateful to the Twitter engineering team (especially Vinod) for their input and cooperation in quickly getting these features validated and rolled out. Here are all the contributions that we made to the recently released version 0.2 of the Jenkins Mesos plugin. We are adding more features as we proceed.

Mesos cluster setup

Our new Mesos cluster is set up on top of our existing OpenStack deployment. In the model we are pursuing, we would necessarily have lots of Jenkins Mesos frameworks (each Jenkins Master is essentially a Mesos framework), and we did not want to run those outside of the Mesos cluster so that we would not have to separately provision and manage them. We therefore decided to use the Marathon framework as the Mesos meta framework; we launched the Jenkins master (and the Mesos framework) in Mesos itself. We additionally wanted to collocate the Jenkins masters in a special set of VMs in the cluster, using the placement constraint feature of Marathon that leverages slave attributes. Thus we separated Mesos slave nodes into a group of Jenkins masters and another group of Jenkins slave nodes. For backup purposes, we associated special block storage with the VMs running the CI master. Special thanks to the Mesosphere.io team for quickly resolving all of our queries related to Marathon and Mesos in general.
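
To make the setup concrete, here is a minimal sketch of what submitting a Jenkins master to Marathon’s REST API with a placement constraint might look like. The app id, attribute name (jenkins_role), endpoint URL, and resource sizes are illustrative placeholders, not our actual configuration.

```python
# Illustrative sketch -- app id, attribute name, and URLs are hypothetical.
import requests

MARATHON = "http://marathon.example.com:8080"

jenkins_master_app = {
    "id": "/ci/jenkins-master-dev1",
    "cmd": "java -jar jenkins.war --httpPort=$PORT0",
    "cpus": 1.0,
    "mem": 2048,
    "instances": 1,
    # Placement constraint: only run on Mesos slaves tagged with the
    # attribute jenkins_role=master (set via --attributes on the slave).
    "constraints": [["jenkins_role", "CLUSTER", "master"]],
}

resp = requests.post(MARATHON + "/v2/apps", json=jenkins_master_app)
resp.raise_for_status()
print(resp.json()["id"], "submitted to Marathon")
```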

Basic testing succeeded as expected. The Jenkins master would launch through Marathon with a preconfigured Jenkins config.xml, and it would automatically register as a Mesos framework without needing any manual configuration. Builds were correctly launched in a Jenkins slave within one of the distributed Mesos slave nodes. Configuring slave attributes allowed the Mesos plugin to pick the nodes on which slave jobs could be scheduled. Checkpointing was enabled so that build jobs were not lost if the slave process temporarily disconnected and later reconnected to the Mesos master (the slave recovery feature). In case a new Mesos master leader was elected, the plugin used Zookeeper endpoints to locate the new master (more on this a little later).

We decided to simulate a large deployment and wrote a Jenkins load test driver (mammoth). As time progressed we started uncovering use cases that were unsuccessful. Here is a discussion of each problem and how we addressed it.

Frameworks stopped receiving offers after a while

One of the first things we noticed occurred after we used Marathon to create the initial set of CI masters. As those CI masters started registering themselves as frameworks, Marathon stopped receiving any offers from Mesos; essentially, no new CI masters could be launched. The other thing we noticed was that, of the Jenkins Frameworks that were registered, only a few would receive offers. At that point, it was evident that we needed a very thorough understanding of the resource allocation algorithm of Mesos – we had to read the code. Here is an overview on the code’s setup and the dominant resource fairness algorithm.

Let’s start with Marathon. In the DRF model, it was unfair to treat Marathon in the same bucket/role alongside hundreds of connected Jenkins frameworks. After launching all these Jenkins frameworks, Marathon had a large resource share and Mesos would aggressively offer resources to frameworks that were using little or no resources. Marathon was placed last in priority and got starved out.

We decided to define a dedicated Mesos role for Marathon and to have all of the Mesos slaves that were reserved for Jenkins master instances support that Mesos role. Jenkins frameworks were left with the default role “*”. This solved the problem – Mesos offered resources per role and hence Marathon never got starved out. A framework with a special role will get resource offers from both slaves supporting that special role and also from the default role “*”. However, since we were using placement constraints, Marathon accepted resource offers only from slaves that supported both the role and the placement constraints.

Certain Jenkins frameworks were still getting starved

Our next task was to find out why certain Jenkins frameworks were getting zero offers even when they were in the same role and not running any jobs. Also, certain Jenkins frameworks always received offers. Mesos makes offers to frameworks and frameworks have to decline them if they don’t use them.

The important point to remember was that the offer was made to the framework. Frameworks that did not receive offers, but that had a resource share equal to frameworks that received and declined offers, should receive offers from Mesos. Basically, past history has to be accounted for. This situation can also arise when fewer resources are available in the cluster. Ben Hindman quickly proposed and implemented the fix for this issue, so that fair sharing happens among all of the Jenkins frameworks.

Mesos delayed offers to frameworks

We uncovered two more situations where frameworks would get starved out for some time but not indefinitely. No developer wants to wait that long for a build to get scheduled. In the first situation, the allocation counter to remember past resource offers (refer to the fix in the previous paragraph) for older frameworks (frameworks that joined earlier) would be much greater than for the new frameworks that just joined. New frameworks would continue to receive more offers, even if they were not running jobs; when their allocation counter reached the level of older frameworks, they would be treated equally. We addressed this issue by specifying that whenever a new framework joins, the allocation counter is reset for all frameworks, thereby bringing them to a level playing field to compete for resources. We are exploring an alternative that would normalize the counter values instead of setting the counter to zero. (See this commit.)

Secondly, we found that once a framework finished running jobs/Mesos tasks, the user share – representing the current active resources used – never came down to zero. Double-precision floating-point arithmetic left a ridiculously small residual value (e.g., 4.44089e-16), which unfairly put frameworks that had just finished builds behind frameworks whose user share was exactly 0. As a quick fix, we used a precision of 0.000001 so that the comparator treats those small values as 0. Ben Hindman suggested an alternative: once a framework has no tasks and executors, it’s safe to set the resource share explicitly to zero. We are exploring that alternative as well in the Mesos bug fix review process.
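
The actual fix lives in the Mesos allocator (C++), but the idea behind the quick fix is easy to sketch. Here is an illustrative Python version of the epsilon-based comparison; the constant and function names are ours, not Mesos’s:

```python
EPSILON = 1e-6  # same order of magnitude as the precision mentioned above

def effectively_zero(share: float) -> bool:
    """Treat tiny floating-point residue (e.g. 4.44089e-16) as a zero share."""
    return abs(share) < EPSILON

def compare_shares(a: float, b: float) -> int:
    """Comparator that does not let floating-point dust decide the ordering."""
    a = 0.0 if effectively_zero(a) else a
    b = 0.0 if effectively_zero(b) else b
    return (a > b) - (a < b)
```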

Final hurdle

After making all of the above changes, we were in reasonably good shape. However, we discussed scenarios in which frameworks that were actively running and finishing jobs would always stay ahead of inactive frameworks (because of the allocation counter), so a build started on one of those inactive frameworks would waste some time in scheduling. It didn’t make sense for a bunch of connected frameworks to be sitting idle and competing for resource offers when they had nothing to build. So we came up with a new enhancement to the Jenkins Mesos plugin: to register as a Mesos framework only when there was something in the build queue. The Jenkins framework would unregister as a Mesos framework as soon as there were no active or pending builds (see this pull request). This is an optional feature, not required in a shared CI master that’s running jobs for all developers. We also no longer needed to use the slave attribute feature in the plugin, as it was getting resource offers from slaves with the default role.
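
The plugin change itself is Java, but the decision logic is simple enough to sketch. The following Python pseudologic illustrates the register-on-demand idea; the register/unregister helpers are hypothetical stand-ins for what the plugin does:

```python
# Illustrative pseudologic only -- the real implementation lives in the Jenkins Mesos plugin (Java).
def reconcile_framework(build_queue, active_builds, framework):
    has_work = bool(build_queue) or bool(active_builds)
    if has_work and not framework.registered:
        framework.register()      # hypothetical helper: connect to the Mesos master
    elif not has_work and framework.registered:
        framework.unregister()    # hypothetical helper: free the framework slot
```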

Our load tests were finally running with predictable results! No starvation, and quick launch of builds.

Cluster management

When we say “cluster” we mean a group of servers running the PaaS core framework services. Our cluster is built on virtual servers in our OpenStack environment. Specifically, the cluster consists of virtual servers running Apache Zookeeper, Apache Mesos (masters and slaves), and Marathon services. This combination of services was chosen to provide a fault-tolerant and high-availability (HA) redundant solution. Our cluster consists of at least three servers running each of the above services.

By design, the Mesos and Marathon services do not operate in the traditional redundant HA mode with active-active or active-passive servers. Instead, they are separate, independent servers that are all active, although for HA implementation they use the Zookeeper service as a communication bus among the servers. This bus provides a leader election mechanism to determine a single leader.

Zookeeper performs this function by keeping track of the leader within each group of servers. For example, we currently provision three Mesos masters running independently. On startup, each master connects to the Zookeeper service to first register itself and then to elect a leader among themselves. Once a leader is elected, the other two Mesos masters will redirect all client requests to the leader. In this way, we can have any number of Mesos masters for an HA setup and still have only a single leader at any one time.
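
Mesos and Marathon implement this election in their own code, but the same ZooKeeper recipe can be demonstrated with a few lines of Python using the kazoo client. This is only an illustration of the pattern; hostnames and paths are placeholders:

```python
# Minimal illustration of ZooKeeper-based leader election using kazoo.
import time
from kazoo.client import KazooClient

zk = KazooClient(hosts="zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181")
zk.start()

def lead():
    # Only the elected leader reaches this function; the other contenders block
    # inside election.run() and take over automatically if the leader disappears.
    print("elected leader; performing master duties")
    while True:
        time.sleep(10)  # stand-in for real leader work

election = zk.Election("/demo/masters", identifier="master-1")
election.run(lead)  # blocks until this process wins the election
```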

The Mesos slaves do not require HA, because they are treated as workers that provide resources (CPU, memory, disk space) enabling the Mesos masters to execute tasks. Slaves can be added and removed dynamically. The Marathon servers use a similar leader election mechanism to allow any number of Marathon servers in an HA cluster. In our case we also chose to deploy three Marathon servers for our cluster. The Zookeeper service does have a built-in mechanism for HA, so we chose to deploy three Zookeeper servers in our cluster to take advantage of its HA capabilities.

Building the Zookeeper-Mesos-Marathon cluster

Of course we always have the option of building each server in the cluster manually, but that quickly becomes time-intensive and prone to inconsistencies. Instead of provisioning manually, we created a single base image with all the necessary software. We utilize the OpenStack cloud-init post-install to convert the base image into either a Zookeeper, Mesos Master, Mesos Slave, or Marathon server.

We maintain the cloud-init scripts and customization in GitHub. Instead of using the nova client directly or the web GUI to provision, we added another automation feature and wrote Python scripts that call python-novaclient and pass in the necessary cloud-init and post-install instructions to build a new server. This combines all the necessary steps into a single command. The command provisions the VM, instructs the VM to download the cloud-init post-install script from GitHub, activates the selected service, and joins the new VM to a cluster. As a result, we can easily add servers to an existing cluster as well as create new clusters.
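
As a rough illustration of this provisioning flow (not our actual scripts), the sketch below uses python-novaclient to boot a VM and hand it cloud-init user data. Credentials, image/flavor IDs, and the post-install URL are placeholders, and the exact client signature varies by novaclient version:

```python
# Illustrative sketch -- credentials, IDs, and URLs are placeholders.
from novaclient import client as nova_client

nova = nova_client.Client("2", "ci-admin", "secret", "paas-project",
                          "http://openstack.example.com:5000/v2.0")

# cloud-init user data that turns the base image into, e.g., a Mesos slave.
user_data = """#cloud-config
runcmd:
  - curl -sL https://github.example.com/paas/cloud-init/mesos-slave.sh | bash
"""

server = nova.servers.create(
    name="mesos-slave-042",
    image="<base-image-uuid>",
    flavor="<flavor-id>",
    userdata=user_data,
)
print("provisioning", server.id)
```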

Cluster management with Ansible

Ansible is a distributed systems management tool that helps to ease the management of many servers. It is not too difficult or time-consuming to make changes on one, two, or even a dozen servers, but making changes to hundreds or thousands of servers becomes a non-trivial task. Not only do such changes take a lot of time, but they have a high chance of introducing an inconsistency or error that would cause unforeseen problems.

Ansible is similar to CFEngine, Puppet, Chef, Salt, and many other systems management tools. Each tool has its own strengths and weaknesses. One of the reasons we decided to use Ansible is its ability to execute remote commands over SSH without needing any Ansible client to run on the servers.

Ansible can be used as a configuration management tool, a software deployment tool, and a do-anything-you-want kind of tool. It employs a plug-and-play concept, where existing modules have already been written for many functions. For example, there are modules for connecting to hosts with a shell, for AWS EC2 automation, for networking, for user management, and so on.

Since we have a large Mesos cluster and several of the servers are for experimentation, we use Ansible extensively to manage the cluster and make consistent changes when necessary across all of the servers.
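
For a flavor of what this looks like in practice, here is a small sketch that drives ad-hoc Ansible commands from Python by shelling out to the ansible CLI; the group names and inventory path are placeholders for our actual setup:

```python
# Hedged sketch: ad-hoc Ansible runs via the CLI; group names and inventory are placeholders.
import subprocess

def ansible(group, module, args=None, inventory="hosts"):
    cmd = ["ansible", group, "-i", inventory, "-m", module]
    if args:
        cmd += ["-a", args]
    subprocess.run(cmd, check=True)

ansible("mesos_slaves", "ping")             # verify SSH connectivity to every slave
ansible("mesos_slaves", "shell", "uptime")  # run a one-off command across the group
```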

Conclusion

Depending on the situation and use case, many different models for running CI in Mesos can be tried out. The model that we outlined above is only one. Another variation is a shared master using the plugin and temporary masters running the build directly.  In Part II of this blog post, we will introduce an advanced use case of running builds in Docker containers on Mesos.

In the era of cloud and XaaS (everything as a service), REST/SOAP-based web services have become ubiquitous within eBay’s platform. We dynamically monitor and manage a large and rapidly growing number of web servers deployed on our infrastructure and systems. However, existing tools present major challenges when making REST/SOAP calls with server-specific requests to a large number of web servers, and then performing aggregated analysis on the responses.

We therefore developed REST Commander, a parallel asynchronous HTTP client as a service to monitor and manage web servers. REST Commander on a single server can send requests to thousands of servers with response aggregation in a matter of seconds. And yes, it is open-sourced at http://www.restcommander.com.

Feature highlights

REST Commander is Postman at scale: a fast, parallel asynchronous HTTP client as a service with response aggregation and string extraction based on generic regular expressions. Built in Java with Akka, Async HTTP Client, and the Play Framework, REST Commander is packed with features beyond speed and scalability:

  • Click-to-run with zero installation
  • Generic HTTP request template supporting variable-based replacement for sending server-specific requests
  • Ability to send the same request to different servers, different requests to different servers, and different requests to the same server
  • Maximum concurrency control (throttling) to accommodate server capacity

Commander itself is also “as a service”: with its powerful REST API, you can define ad-hoc target servers, an HTTP request template, variable replacement, and a regular expression all in a single call. In addition, intuitive step-by-step wizards help you achieve the same functionality through a GUI.
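
REST Commander’s own template syntax is documented on its site; purely to illustrate the idea of variable-based replacement, here is a generic Python sketch (the placeholder variables and request body are made up):

```python
from string import Template

# One request template, many target servers -- each server gets its own body.
request_template = Template(
    '{"host": "$TARGET_HOST", "action": "set_topology", "region": "$REGION"}'
)

targets = [
    {"TARGET_HOST": "app01.example.com", "REGION": "us-west"},
    {"TARGET_HOST": "app02.example.com", "REGION": "us-east"},
]

for variables in targets:
    body = request_template.substitute(variables)
    # ... send the HTTP request with this server-specific body ...
    print(body)
```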

Usage at eBay

With REST Commander, we have enabled cost-effective monitoring and management automation for tens of thousands of web servers in production, boosting operational efficiency by at least 500%. We use REST Commander for large-scale web server updates, software deployment, config pushes, and discovery of outliers. All can be executed by both on-demand self-service wizards/APIs and scheduled auto-remediation. With a single instance of REST Commander, we can push server-specific topology configurations to 10,000 web servers within a minute (see the note about performance below). Thanks to its request template with support for target-aware variable replacement, REST Commander can also perform pool-level software deployment (e.g., deploy version 2.0 to QA pools and 1.0 to production pools).

Basic workflow

Figure 1 presents the basic REST Commander workflow. Given target servers as a “node group” and an HTTP command as the REST/SOAP API to hit, REST Commander sends the requests to the node group in parallel. The response and request for each server become a pair that is saved into an in-memory hash map. This hash map is also dumped to disk, with the timestamp, as a JSON file. From the request/response pair for each server, a regular expression is used to extract any substring from the response content.

Figure 1. REST Commander Workflow.

Concurrency and throttling model with Akka

REST Commander leverages Akka and the actor model to simplify concurrent workflows for high performance and scalability. First of all, Akka provides built-in thread pools and encapsulates low-level implementation details, so that we can focus fully on task-level development rather than on thread-level programming. Secondly, Akka’s simple model of actors and messages eliminates global state, shared variables, and locks. When you need multiple threads/jobs to update the same field, simply send these results as messages to a single actor and let the actor handle the task.

Figure 2 is a simplified illustration of the concurrent HTTP request and response workflow with throttling in Akka. Throttling (concurrency control) indicates the maximum concurrent requests that REST Commander will perform. For example, if the throttling value is 100, REST Commander will not send the nth request until it gets the (n-100)th response back; so the 500th request will not be sent until the response from the 400th request has been received.

Figure 2. Concurrency Design with Throttling in Akka (see code)

Suppose one uniform GET /index.html HTTP request is to be sent to 10,000 target servers. The process starts with the Director, whose job is to send the requests. The Director is not an Akka actor, but rather a Java object that initializes the actor system and the whole job. It creates an actor called the Manager, and passes to it the 10,000 server names and the HTTP call. When the Manager receives the data, it creates one Assistant Manager and 10,000 Operation Workers. The Manager also embeds in each Operation Worker a task consisting of a server name and the “GET /index.html” HTTP request. The Manager does not send the “go ahead” message that triggers task execution on the workers. Instead, the Assistant Manager is responsible for this part, exercising throttling control by asking only some workers to execute tasks.

To better decouple the code based on functionality, the Manager is only in charge of receiving responses from the workers, and the Assistant Manager is responsible for sending the “go ahead” message that triggers workers to work. The Manager initially sends the Assistant Manager a message to send out the throttling number of messages; we’ll use 1500, the default throttling number, for this example. The Assistant Manager starts sending a “go ahead” message to each of 1500 workers. To control throttling, the Assistant Manager maintains a sliding window of [response_received_count, request_sent_count]. The request_sent_count is the number of “go ahead” messages the Assistant Manager has sent to the workers. The response_received_count comes from the Manager; when the Manager receives a response, it communicates the updated count to the Assistant Manager. Every half-second, the Assistant Manager sends itself a message to trigger a check of response_received_count and request_sent_count to determine whether the sliding window has room for sending additional messages. If so, the Assistant Manager sends additional “go ahead” messages until the number of outstanding requests (request_sent_count minus response_received_count) again reaches the throttling number (1500).

Each Operation Worker creates an HTTP Worker, which wraps Ning’s async HTTP client functions. When the Manager receives a response from an Operation Worker, it updates the response part of the in-memory hash map entry for the associated server. If the response cannot be obtained or the request times out, the worker returns exception details (e.g., a connection exception) to the Manager. When the Manager has received all of the responses, it returns the whole hash map back to the Director. As the job successfully completes, the Director dumps the hash map to disk as a JSON file, then returns.
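
REST Commander implements this flow with Akka actors and Ning’s client on the JVM. As a rough analogue only (not the actual implementation), here is the same fan-out, throttling, and aggregation pattern sketched in Python with asyncio and aiohttp, where a semaphore plays the role of the Assistant Manager’s sliding window:

```python
# A rough Python/asyncio analogue of the fan-out -- not REST Commander's actual implementation.
import asyncio
import json
import aiohttp

THROTTLE = 1500  # maximum requests in flight, like the default throttling number above

async def fetch(session, semaphore, host, results):
    async with semaphore:  # blocks until the "sliding window" has room
        try:
            timeout = aiohttp.ClientTimeout(total=30)
            async with session.get(f"http://{host}/index.html", timeout=timeout) as resp:
                results[host] = {"status": resp.status, "body": await resp.text()}
        except Exception as exc:  # timeouts, connection errors, etc.
            results[host] = {"error": repr(exc)}

async def run(hosts):
    results = {}  # request/response pairs keyed by server
    semaphore = asyncio.Semaphore(THROTTLE)
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*(fetch(session, semaphore, h, results) for h in hosts))
    with open("responses.json", "w") as f:  # dump the aggregated map to disk
        json.dump(results, f)
    return results

# asyncio.run(run(["app%04d.example.com" % i for i in range(10000)]))
```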

Beyond web server management – generic HTTP workflows

When modeling and abstracting today’s cloud operations and workflows – e.g., provisioning, file distribution, and software deployment – we find that most of them are similar: each step is a certain form of HTTP call with certain responses, which trigger various operations in the next step. Using the example of monitoring cluster server health, the workflow goes like this:

  1. A single HTTP call to query data storage (such as database as a service) and retrieve the host names and health records of the target servers (1 call to 1 server)
  2. Massive uniform HTTP calls to check the current health of target servers (1 call to N servers); aggregating these N responses; and conducting simple analysis and extractions
  3. Data storage updates for those M servers with changed status (M calls to 1 server)

REST Commander flawlessly supports such use cases with its generic and powerful request models. It is therefore used to automate many tasks involving interactions and workflows (orchestrations) with DBaaS, LBaaS (load balancer as a service), IaaS, and PaaS.

Related work review

Of course, HTTP is a fundamental protocol for the World Wide Web, SOAP/REST-based web services, cloud computing, and many distributed systems. Efficient HTTP/REST/SOAP clients are thus critical in today’s platform and infrastructure services. Although many tools have been developed in this area, we are not aware of any existing HTTP client tools or libraries that combine the following three features:

  • High efficiency and scalability with built-in throttling control for parallel requests
  • Generic response aggregation and analysis
  • Generic (i.e., template-based) heterogeneous request generation to the same or different target servers

Postman is a popular and user-friendly REST client tool; however, it does not support efficient parallel requests or response aggregation. Apache JMeter, ApacheBench (ab), and Gatling can send parallel HTTP requests with concurrency control. However, they are designed for load/stress testing on a single target server rather than on multiple servers. They do not support generating different requests to different servers. ApacheBench and JMeter cannot conduct response aggregation or analysis, while Gatling focuses on response verification of each simulation step.

ql.io is a great Node.js-based aggregation gateway for quickly consuming HTTP APIs. However, having a different design goal, it does not offer throttling or generic response extraction (e.g., regular expressions). Also, its own language, table construction, and join query result in a higher learning curve. Furthermore, single-threaded Node.js might not effectively leverage multiple CPU cores unless running multiple instances and splitting traffic between them. 

Typhoeus is a wrapper on libcurl for parallel HTTP requests with throttling. However, it does not offer response aggregation. More critically, its synchronous HTTP library supports limited scalability. Writing a simple shell script with “for” loops of “curl” or “wget” enables sending multiple HTTP requests, but the process is sequential and not scalable.

Ning’s Async-http-client library in Java provides high-performance, asynchronous request and response capabilities compared to the synchronous Apache HTTPClient library. A similar library in Scala is Stackmob’s (PayPal’s) Newman HTTP client with additional response caching and (de)serialization capabilities. However, these HTTP clients are designed as raw libraries without features such as parallel requests with templates, throttling, response aggregation, or analysis.

Performance note

Actual REST Commander performance varies based on network speed, the slowest servers, and Commander throttling and time-out settings. In our testing with single-instance REST Commander, for 10,000 servers across regions, 99.8% of responses were received within 33 seconds, and 100% within 48 seconds. For 20,000 servers, 100% of responses were received within 70 seconds. For a smaller scale of 1,000 servers, 100% of responses were received within 7 seconds.

Conclusion and future work

“Speaking HTTP at scale” is instrumental in today’s platform with XaaS (everything as a service).  Each step in the solution for many of our problems can be abstracted and modeled by parallel HTTP requests (to a single or multiple servers), response aggregation with simple (if/else) logic, and extracted data that feeds into the next step. Taking scalability and agility to heart, we (Yuanteng (Jeff) Pei, Bin Yu, and Yang (Bruce) Li) designed and built REST Commander, a generic parallel async HTTP client as a service. We will continue to add more orchestration, clustering, security, and response analysis features to it. For more details and the video demo of REST Commander, please visit http://www.restcommander.com.  

Yuanteng (Jeff) Pei

Cloud Engineering, eBay Inc.

References

Postman

http://www.getpostman.com

Akka

http://akka.io

Async HTTP Client

https://github.com/AsyncHttpClient/async-http-client 

Play Framework

http://www.playframework.com

Apache JMeter

https://jmeter.apache.org

ApacheBench (ab)

http://httpd.apache.org/docs/2.2/programs/ab.html

Gatling

http://gatling-tool.org

ql.io

http://ql.io 

Typhoeus

https://github.com/typhoeus/typhoeus

Apache HttpClient

http://hc.apache.org/httpclient-3.x

Stackmob’s Newman

https://github.com/stackmob/newman

Yet Another Responsive vs. Adaptive Story

by Senthil Padmanabhan on 03/05/2014

in Software Engineering

Yes, like everyone else in web development, eBay has become immersed in the mystical world of Responsive Web Design. In fact, our top priority for last year was to make key eBay pages ready for multi-screen. Engineers across the organization started brainstorming ideas and coming up with variations on implementing a multi-screen experience. We even organized a “Responsive vs. Adaptive Design” debate meetup to discuss the pros and cons of various techniques. This post summarizes some of the learnings in our multi-screen journey.

There is no one-size-fits-all solution

This is probably one of the most talked-about points in the responsive world, and we want to reiterate it. Every web page is different, and every use case is different. A solution that works for one page might not work for another – in fact, sometimes it even backfires. Considering this, we put together some general guidelines for building a page. For read-only pages or web pages where users only consume information, a purely responsive design (layout controlled by CSS) would suffice. For highly interactive pages or single-page applications, an adaptive design (different views dependent on the device type) might be the right choice. But for most cases, the RESS (Responsive Design + Server Side Components) approach would be the ideal solution. With RESS we get the best of both worlds, along with easier code maintenance and enhancements. Here the server plays a smart role by not switching to a completely new template or wireframe per device; instead, the server helps deliver the best experience by choosing the right modules and providing hints to the client.

User interaction is as important as screen size

Knowing how a user interacts with the device (keyboard, mouse, touch, pointer, TV remote, etc.) is crucial to delivering the optimal experience. Screen size is required for deciding the layout, but is not in itself sufficient. This point resonates with the previous point about RESS: the server plays a role. The first hint that our servers provide to the browser is an interaction type class (touch, no-touch, pointer, etc.) added to the root HTML or module element. This class helps CSS and JavaScript enhance features accordingly. For instance, the CSS :hover pseudo-class is applied only to elements that have a no-touch class on an ancestor; and in JavaScript, certain events are attached only when the touch class is present. In addition to providing hints, the server can include/exclude modules and JavaScript plugins (e.g., Fastclick) based on interaction type.

Keeping the importance of user interaction in mind, we created a lightweight jQuery Plugin called tactile just to handle gesture-based events:  tap, drag (including dragStart, dragEnd), and swipe. Instead of downloading an entire touch library, we felt that tactile was sufficient for our use cases. By including this plugin for all touch-based devices, we enhance the user interaction to a whole new level, bringing in a native feel. These results would not be possible in a purely responsive design.

Understanding the viewport is essential

At a glance the term ‘viewport’ sounds simple, referring to the section of the page that is in view. But when you dig a little deeper, you will realize that the old idiom ‘The devil is in the detail’ is indeed true. For starters, the viewport itself can have three different perspectives:  visual viewport, layout viewport, and ideal viewport. And just adding the default viewport meta tag <meta name="viewport" content="width=device-width, initial-scale=1"/> alone may not always be sufficient (for example, in a mobile-optimized web app like m.ebay.com, the user-scalable=no property should also be used). In order to deliver the right experience, a deeper understanding of the viewport is needed. Hence before implementing a page, our engineers revisit the concept of viewport and make sure they’re taking the right approach.

To get a good understanding of viewports, see these documents in order: introduction, viewport 1, viewport 2, and viewport 3.

Responsive components vs. responsive pages is a work in progress

Another idea that has been floating around is to build components that are responsive, instead of page layouts that are responsive. However, until element queries become a reality, there is no clear technical solution for this. So for now, we have settled on two options:

  • The first option is to use media queries at a component level, meaning each component will have its own media queries. When included in a page, a component responds to the browser’s width and optimizes itself (based on touch/no-touch) to the current viewport and device. This approach, though, has a caveat:  It will fail if the component container has a restricted width, since media queries work only at a page level and not at a container level.
  • The second approach was suggested by some engineers in the eBay London office, where they came up with the idea of components always being 100% in width and all their children being sized in percentages. The components are agnostic of the container size; when dropped into a page, they just fit into whatever the container size is. A detailed blog about this technique can be found here.

We try to implement our components using one of the above approaches.  But the ultimate goal is to abstract the multi-screen factor from the page to the component itself. 

We can at least remove the annoyance

Finally, even if we are not able to provide the best-in-class experience on a device, at minimum we do not want to annoy our users. This means following a set of dos and don’ts.

Dos

  • Always include the viewport meta tag <meta name="viewport" content="width=device-width, initial-scale=1"/>
  • Add the interaction type class (touch, no-touch, etc.) to the root HTML or module element
  • Work closely with design to get an answer on how the page looks across various devices before even starting the project

Don’ts

  • Tiny click areas (less than 40px)
  • Hover state functionality on touch devices
  • Tightly cluttered design
  • Media queries based on orientation due to this issue

This post provides a quick summary of the direction in which eBay is heading to tackle ever-increasing device diversity. There is no silver bullet yet, but we are getting there.

Senthil
Engineer @ eBay

When I started writing this blog post, my original goal was to provide (as alluded to in the title) some insights into my first year as a presentation engineer at eBay – such as my day-to-day role, some of the things we build here, and how we build them. However, before I can do that, I feel I first need to step back and talk about the renaissance. “The renaissance!?”, I hear you say.

The web dev renaissance

Unless you’ve been living under a rock for the past few years, you can’t help but have noticed a renaissance of sorts in the world of web development – propelled in large part, of course, by Node.js, NPM, GitHub, and PaaS, all of which are enabling and empowering developers like never before. Combined with the rapid innovations in the browser space – and within HTML, CSS, and JavaScript – what you have is an incredibly exciting and fun time to be a web developer! And I’m glad to say that the renaissance is truly alive and well here at eBay!

Node.js

Of course the darling of this renaissance is Node.js. Discovering JavaScript on the server for me was just as exciting and liberating as the day I discovered it in the browser – and I’m sure many, many others will share that sentiment. Spinning up an HTTP server in the blink of an eye with just a few lines of JavaScript is simply audacious, and to this day it still makes me grin with delight – especially when I think of all the hours I’ve wasted in my life waiting for the likes of Apache or IIS! But it’s not just the speed and simplicity that enthralls; it’s also the feeling of utmost transparency and control.

CubeJS

But I digress. I hear you say, “What does this so-called renaissance have to do with eBay?” and “Isn’t eBay just a tired, old Java shop?” That might have been true in the past. But these days, in addition to an excellent new Java stack (we call it Raptor and, as the name correctly implies, it is anything but tired!), we now also have our very own Node.js stack (we call it CubeJS), which already powers several of our existing sites and applications. Yes, the wait is over; Node.js in the enterprise is finally a reality for developers. Since joining eBay in the spring of 2013, I have barely touched a line of Java or JSP code.

JavaScript everywhere

Why is this a big deal? Well, a common pattern for us web developers is that every time we change jobs more often than not we also have to change server-side languages. Over the years I’ve used Perl/CGI, ASP Classic, JSP, ColdFusion/CFML, PHP, and ASP.NET. Now as much as I do enjoy learning new skills (except the circus trapeze – that was ill-advised), I’d be stretching the truth if I said I knew all of those languages and their various intricacies inside out. Most of the time I will learn what I need to learn, but rarely do I feel the need or desire to specialize. It would be fair to say I wasn’t always getting the best out of the language and the language wasn’t always getting the best out of me. Really, deep down, I wanted to be using JavaScript everywhere. And now of course that pipe dream is true.

Polyglotism

Adoption of Node.js is a win-win situation for eBay as we seek to embrace the flourishing worldwide community of JavaScript developers like myself, as well as to leverage our excellent open-source ecosystem. Node.js might only be the beginning; as eBay further adopts and advocates for such polyglotism, we increasingly welcome developers from different tribes – Python, PHP, Ruby on Rails, and beyond – and eagerly anticipate the day they become integrated with our PaaS (Platform as a Service). You see, it’s all about choice and removing barriers – which empowers our developers to delight our users.

Vive la Renaissance

In this post I’ve mainly focused my attention on Node.js but, as mentioned, the renaissance at eBay doesn’t stop there. We also embrace NPM. We embrace GitHub. We embrace PaaS. We embrace modern principles, tools, and workflows (Modular JavaScript, Grunt, JSHint, Mocha, LESS, and Jenkins – to name but a few!). Yes, we embrace open source – and it’s not all take, take, take either; be sure to check out KrakenJS (a web application framework built on Node.js by our good friends over at PayPal), RaptorJS (eBay’s end-to-end JavaScript toolkit for building adaptive modules and UI components), and Skin (CSS modules designed by eBay to build an online store). And be sure to keep your eyes open for more contributions from us in the near future!

—-

Do you share our passion for JavaScript, Node.js, and the crazy, fast-paced world of front-end web development? Interested in finding out more about joining our presentation team? Please visit http://www.ebaycareers.com for current openings.

Deployment to the cloud is an evolving area. While many tools are available that deploy applications to nodes (machines) in the cloud, zero deployment downtime is rare or nonexistent. In this post, we’ll take a look at this problem and propose a solution. The focus of this post is on web applications—specifically, the server-side applications that run on a port (or a shared resource).

In traditional deployment environments, when switching a node in the cloud from the current version to a new version, there is a window of time when the node is unusable in terms of serving traffic. During that window, the node is taken out of traffic, and after the switch it is brought back into traffic.

In a production environment, this downtime is not trivial. Capacity planning in advance usually accommodates the loss of nodes by adding a few more machines. However, the problem becomes magnified where principles like continuous delivery and deployment are adopted.

To provide effective and non-disruptive deployment and rollback, a Platform as a Service (PaaS) should possess these two characteristics:

  • Best utilization of resources to minimize deployment downtime as much as possible
  • Instantaneous deployment and rollback

Problem analysis

Suppose we have a node running Version1 and we are deploying Version2 to that node. This is how the lifecycle would look:

  typical deployment workflow

Every machine in the pool undergoes this lifecycle. The machine stops serving traffic right after the first step and cannot resume serving traffic until the very last step. During this time, the node is effectively offline.

At eBay, the deployment lifecycle takes a reasonably sized application about 9 minutes. For an organization of any size, many days of availability can be lost if every node must go through an offline phase during deployment.

So, the more we minimize the off-traffic time, the closer we get to instant/zero-downtime deployment/rollback.

Proposed solutions

Now let’s look into a few options for achieving this goal.

A/B switch

In this approach, we have a set of nodes standing by. We deploy the new version to those nodes and switch the traffic to them instantly. If we keep the old nodes in their original state, we could do instant rollback as well. A load balancer fronts the application and is responsible for this switch upon request.

The disadvantage to this approach is that some nodes will be idle, and unless you have true elasticity, it will amplify the node wastage. When a lot of deployments are occurring at the same time, you may end up needing to double the capacity to handle the load.

Software load balancers

In this approach, we configure the software load balancer fronting the application with more than one end point so that it can effectively route the traffic to one or another. This solution is elegant and offers much more control at the software level. However, applications will have to be designed with this approach in mind. In particular, the load balancer’s contract with the application will be very critical to successful implementation.

From a resource standpoint, both this and the previous approach are similar; both use additional resources, like memory and CPU. The first approach needs the whole node, whereas the other one is accommodated inside the same node.

Zero downtime

With this approach, we don’t keep a standby set of machines; rather, we delay the port binding. Shared resource acquisition is delayed until the application starts up. The ports are switched after the application starts, and the old version is also kept running (without an access point) to roll back instantly if needed.

Similar solutions exist already for common servers.

Parallel deployment – Apache Tomcat

Apache Tomcat added the parallel deployment feature in its version 7 release. It lets two versions of the application run at the same time and takes the latest version as the default. This capability is achieved through Tomcat’s context container. The versioning is pretty simple and straightforward, appending ‘##’ to the war name. For example, webapp##1.war and webapp##2.war can coexist within the same context; and for rolling back to webapp##1, all that is required is to delete webapp##2.

Although this feature might appear to be a trivial solution, apps need to take special care with shared files, caches (as much write-through as possible), and lower-layer socket usage.

Delayed port binding

This solution is not available in web servers currently. A typical server first binds to the port, then starts the services. Apache Tomcat lets you delay binding to some extent via the connector’s bindOnInit setting, but the binding still occurs once the connector is started.

What we propose here is the ability to start the server without binding the port and essentially without starting the connector. Later, a separate command will start and bind the connector. Version 2 of the software can be deployed while version 1 is running and already bound. When version 2 is started later, we can unbind version 1 and bind version 2. With this approach, the node is effectively offline only for a few seconds.
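
As a simplified illustration of the idea (not a production server), the sketch below initializes the application first and touches the shared port only when told to switch:

```python
# Simplified sketch: start the application first, bind the shared port only on command.
import socket

def start_application():
    # Load caches, warm up connection pools, run health checks, etc.
    print("version 2 initialized and ready, but not yet serving traffic")

def bind_and_serve(port=8080):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("0.0.0.0", port))  # only now do we take over the shared resource
    sock.listen(128)
    print("version 2 now bound to port", port)
    return sock

start_application()               # step 1: slow, but safe while version 1 still owns the port
# ... wait for the "switch" command, unbind version 1 ...
server_socket = bind_and_serve()  # step 2: takes effect in well under a second
```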

The lifecycle for delayed port binding would look like this:

delayed port binding workflow

However, there is still a few-second glitch, so we will look at the next solution.

Advanced port binding

Now that we have minimized the window of unavailability to a few seconds, we will see if we can reduce it to zero. The only way to do that would be to bring version 2 up before version 1 goes down. But first:

Breaking the myth:  ‘Address already in use’

If you’ve used a server to run an application, I am sure you’ve seen this exception at least once. Let’s consider this scenario: We start the server and bind to the port. If we try to start another instance (or another server with the same port), the process fails with the error ‘Address already in use’. We kill the old server and start it again, and it works.

But have you ever given a thought as to why we cannot have two processes listening to the same port? What could be preventing it? The answer is “nothing”! It is indeed possible to have two processes listening to the same port.

SO_REUSEPORT

The reason we see this error in typical environments is that most servers bind with the SO_REUSEPORT option off. This option lets two (or more) processes bind to the same port, provided that the first process to bind the port also set this option while binding. If this option is off, the OS interprets the setting to mean that the port is not to be shared, and it blocks subsequent processes from binding to that port.

The SO_REUSEPORT option also provides fair distribution of incoming requests across processes (important because thread-based distribution suffers from bottlenecks on multi-core machines). Both of the usual threading approaches (one thread listening and then dispatching, or multiple threads all listening) suffer from under- or over-utilization of cycles. An additional advantage of SO_REUSEPORT is that it takes care of sending datagrams from the same client to the same server process. However, it has a shortcoming: packets might be dropped if new processes are added or removed on the fly. This shortcoming is being addressed.

You can find a good article about SO_REUSEPORT at this link on LWN.net. If you want to try this out yourself, see this post on the Free-Programmer’s Blog.
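
If you just want to see the option in action, here is a minimal Python sketch; SO_REUSEPORT is exposed by Python’s socket module where the operating system supports it (Linux 3.9+). Run the script in two terminals and both processes will bind port 8080:

```python
# Minimal demonstration: run this script twice and both processes bind port 8080.
import os
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)  # must be set before bind()
sock.bind(("0.0.0.0", 8080))
sock.listen(128)
print("process", os.getpid(), "listening on 8080")

while True:
    conn, addr = sock.accept()
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    conn.close()
```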

The SO_REUSEPORT option addresses two issues:

  • The small glitch during application version switching: the node can serve traffic the entire time, effectively giving us zero downtime.
  • Improved scheduling: data indicates (see this article on LWN.net) that thread scheduling is not fair; the ratio between the busiest thread and the one with the fewest connections can be 3:1.

zero downtime workflow

Please note that SO_REUSEPORT is not the same as SO_REUSEADDR, and that it is not available in Java, as not all operating systems support it.

Conclusion

Applications can successfully serve traffic during deployment, if we carefully design and manage those applications to do so. Combining both late binding and port reuse, we can effectively achieve zero downtime. And if we keep the standby process around, we will be able to do an instant rollback as well.

eBay is experiencing phenomenal growth in the transactional demands on our databases, in no small part due to our being at the forefront of mobile application development. To keep up with such trends, we continually assess the design of our schemas.

Schema design is a logical representation of the structures used to store the data that applications produce and consume.  Given that database resources are finite, execution times for transactions can vary wildly as those transactions compete for the resources they require. As a result, schema design is the most essential part of any application development life cycle. This blog post covers schema design for online transaction processing applications and recommends a specific approach.

Unfortunately, there is no predefined set of rules for designing databases in an efficient manner, but there can certainly be a defined design process that achieves that outcome. Such a design process includes, but is not limited to, the following activities:

  1. determining the purpose of the database
  2. gathering the information to be recorded
  3. dividing the information items into major entities
  4. deciding what information needs to be stored
  5. setting up relationships between entities
  6. refining the design further

Historically, OLTP systems have relied on a systematic way of ensuring that a database structure is suitable for general-purpose querying and is free of certain undesirable characteristics – insertion, update, and deletion anomalies – that could lead to loss of data integrity. A highly normalized database offers benefits such as minimized redundancy, freedom from undesired insertion, update, and deletion dependencies, data consistency within the database, and a much more flexible database design.

But as they say, there are no free lunches. A normalized database exacts the price of inserting into multiple tables and reading by way of joining multiple tables. Normalization involves design decisions that are likely to cause reduced database performance. Schema design requires keeping in mind that when a query or transaction request is sent to the database, multiple factors are involved, such as CPU usage, memory usage, and input/output (I/O). Depending on the use case, a normalized database may require more CPU, memory, and I/O to process transactions and database queries than does a denormalized database.

Recent developments further compound schema design challenges. With increasing competition and technological advances such as mobile web applications, transactional workload on the database has increased exponentially. As the competitor is only a click away, an online application’s valuable users must be ensured consistently good performance via QoS and transaction prioritization. Schema design for such applications cannot be merely focused on normalization; performance and scalability are no less important.

For example, at eBay we tried a denormalized approach to improve our Core Cart Service’s DB access performance specifically for writes. We switched to using a BLOB-based cart database representation, combining 14 table updates into a single BLOB column. Here are the results:

  • The response time for an “add to cart” operation improved by 30% on average. And in use cases where this same call is made against a cart that already contains several items (>20), performance at the 95th percentile improved by 40%.
  • For the “create cart” operation, total DB call time for the worst case was improved by approximately 50% due to significant reduction in SQL counts.
  • Parallel transaction DB call times improved measurably for an average use case.
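
Our actual cart schema and persistence code are considerably more involved, but the gist of the denormalization is easy to sketch: instead of touching many normalized tables per write, the whole cart is serialized into a single BLOB column. Table and column names below are illustrative only:

```python
# Illustrative only -- the real cart schema, table names, and persistence code differ.
import json

cart = {
    "cart_id": 12345,
    "buyer_id": 67890,
    "items": [
        {"item_id": 111, "qty": 2, "price_cents": 1999},
        {"item_id": 222, "qty": 1, "price_cents": 4500},
    ],
    "coupons": ["SUMMER10"],
}

# Normalized design: one INSERT/UPDATE per related table (items, coupons, totals, ...).
# Denormalized design: a single write of the serialized cart.
blob = json.dumps(cart).encode("utf-8")
sql = "UPDATE cart_blob SET payload = %s WHERE cart_id = %s"
params = (blob, cart["cart_id"])
# cursor.execute(sql, params)  # one statement instead of many table updates
```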

These realities do not imply that denormalization is in any way a blessing. There are costs to denormalization. Data redundancy is increased in a denormalized database. This redundancy can improve performance, but it also requires extra effort to keep track of related data. Application coding can create further complications, because the data is spread across various tables and may be more difficult to locate. In addition, referential integrity is more of a chore, because related data is divided among a number of tables.

There is a happy medium between normalization and denormalization, but finding it requires a thorough knowledge of the actual data and the specific business requirements. This happy medium is what I call “the medium approach.” Denormalizing a database is the process of taking the level of normalization within the database down a notch or two. Remember, normalization can provide data integrity, which is the assurance of consistent and accurate data within a database, but at the same time it can slow performance because of its frequently occurring table join operations.

There is an old proverb:  “normalize until it hurts, denormalize until it works.” One has to land in the middle ground to get all of the goodies of these two different worlds.

The series of interconnected tents is buzzing with activity. Small groups in animated discussion huddle around laptops and monitors, while some people are lost in private discovery as they interact with new apps or prototypes on their smartphones. Similar scenes repeat themselves from row to row throughout the space.

Sound like the typical software industry expo?  In this case, the venue is one of multiple showcases held on eBay campuses over the summer, and the presenters are not product vendors or seasoned exhibitors, but rather college interns demonstrating their work to peers, managers, and executives. With rare exception, the interns’ already-enthusiastic faces light up when asked if their summer at eBay had been a positive experience.

eBay’s global internship program brought more than 500 undergraduate, master’s, and PhD students to eBay campuses across the U.S. as well as to eBay Centers of Excellence in India and China. About 100 universities were represented in this year’s program. The vast majority of the interns are computer science, software engineering, applied science, or related majors. Their work covers the gamut of engineering challenges at eBay: from unsupervised machine learning techniques and predictive forecasting models, to big data analysis and visualization; from personalization and localization, to new front-end features and development for multi-screen users; and from site security and fraud detection, to end-to-end testing and internal developer tools.

Lincoln J. Race, Computer Science graduate student at University of California at San Diego, says his internship “has been a tremendous learning experience for me, learning about how to work with a larger team to meet a project deadline.” His summer work has focused on big data. “I’ve loved working with the people around me to ‘GSD’, as David Marcus would say” (referring to the PayPal president’s shorthand for “get stuff done”), “and I can’t wait to continue doing that in the future.”

Oregon State University Computer Science major Marty Ulrich has been interning with eBay’s mobile core apps group in Portland. “The internship has so far exceeded my expectations,” he says. “I like that I’m given the freedom to work on the parts of the app that interest me. And if I have any questions, there are always knowledgeable people ready to help.” He adds, “My experience here this summer has made me want to work at eBay full time when I graduate.”

Ranjith Tellakula, a graduate student in Computer Science at San Jose State University, says he was excited when eBay offered him an internship that would combine his interests in data mining and back-end application development. Throughout the summer, he worked on developing internal metrics applications that identify gaps in the information available to eBay’s own engineers. Visual dashboards and export tools are now enabling support groups to prioritize and close those information gaps.

Like Marty, Ranjith says his internship has exceeded his expectations. “My work is actually going to make engineers’ lives better,” he says, “and I’ve gotten to learn new technologies all along the way. For example, I had no experience with shell scripting, but now shell scripts I created are running every day. And I’ve been amazed that I’m treated as a peer, even by people with years of experience.” His colleagues say he offered expertise in MongoDB that has benefited the entire team. Ranjith continues to work with eBay on a half-time internship while he completes his master’s degree.

Here is a sampling of other intern projects:

  • 3D structure estimation by augmenting a single 2D image with its depth metadata — Mohammed Haris Baig, Computer Science PhD candidate, Dartmouth
  • Write-once run-anywhere by-anyone integration testing – Greg Blaszczuk, Computer Science and Engineering undergraduate student, University of Illinois at Urbana-Champaign
  • Extraction algorithm to cluster eBay search engine output into groups that are meaningful to users – Chao Chang, Statistics PhD candidate, Washington University in St. Louis
  • Buyer recommendations for similar products that are more economical and environmentally friendly – University of California at Santa Cruz undergraduate students Trieste Devlin (Robotic Engineering), Navneet Kaur (Bioengineering), Anh Dung Phan (Bioengineering), and Alisa Prusa (Computer Science)
  • Examination of how we treat money obtained through programs like eBay Bucks differently from money we earn through normal means – Darrell Hoy, Computer Science PhD candidate, Northwestern University
  • Prototype for providing price guidance to eBay bidders—Isabella Li, Information Technology master’s student, Carnegie Mellon University
  • Use of intelligent caching in eBay API calls – Bharad Raghavan, Computer Science undergraduate student, Stanford University
  • Emulation and testing of various types of DDOS attack tools – Sree Varadharajan, Computer Science master’s student, University of Southern California
  • Dashboard framework enabling data-driven decisions without requiring coding – Jie Zha, Software Engineering master’s student, UC Irvine

Interns received onboarding orientations, goal-setting sessions with their managers, deliverables, and performance reviews, much like regular new-hires. In addition, they attended a three-day conference featuring presentations specifically tailored to the internship experience, including talks by eBay President and CEO John Donahoe and eBay Global Marketplaces President Devin Wenig.

Of course, interns had all of the fun experiences typical at a high-tech company (casino and bowling nights, sports leagues, various other competitions, barbecues, and more). But according to the interns’ feedback, what made them want to come back to eBay are the opportunities they saw for innovative research and product development, in a self-driven manner, using cutting-edge technology.

“Infusing young talent into the eBay culture is really the future of our company,” says eBay’s university engagement partner Jill Ripper. Adds her colleague Joy Osborne, “Providing access for our interns to connect with both fellow interns and the broader eBay Inc. family in a meaningful way was a key component to the success of our summer internship program.”

To learn more about eBay’s internship program, visit http://students.ebaycareers.com/jobs/interns.

Want to hear what some of the world’s most talented testers think about the changing face of software and software testing? Then come to the Conference of the Association for Software Testing (CAST), where you’ll also get a chance to talk with these testers and explore your own thoughts and ideas.

CAST is put together by testers, for testers (and those interested in testing). This year, it takes place in Madison, Wisconsin, August 26-28. The presenters are among the best practitioners from around the globe, and many attendees travel thousands of miles specifically for this conference. eBay will have a strong presence, including a keynote by Quality Engineering Director Jon Bach and a track presentation by Ilari Henrik Aegerter, manager of quality engineering in Europe.

Unlike many testing conferences, at CAST a third of each presentation is reserved for a “threaded” question-and-answer session, in which the audience uses colored cards to indicate new questions or questions related to the current thread. With this format, you can satisfy your curiosity, raise doubts, and make presenters defend their positions. That includes the keynote speakers. The conference also includes a test lab where you can get hands-on and try out ideas you might have heard about, see how other testers test, and share your own experience. You’ll find testers hanging out in the hallways having in-depth discussions and challenging each other to demonstrate their testing skills. Everything about the environment is designed to support testers who want to excel at their craft.

The theme for CAST 2013 is “Old Lessons applied and new lessons learned: advancing the practice and building a foundation for the future.”  The technology we work with changes at a rapid pace. Some testing practices stand the test of time; others become obsolete and irrelevant as the technology changes around them.

If this conference sounds like something you’d like to be a part of, then I urge you to register.

http://www.associationforsoftwaretesting.org/conference/cast-2013

I’d love to see you there.

- Ben Kelly, eBay EUQE team and a content chair for CAST 2013


As discussed in a previous post, earlier this year we unveiled the Digital Service Efficiency (DSE) methodology, our miles-per-gallon (MPG) equivalent for viewing the productivity and efficiency of our technical infrastructure across four key areas: performance, cost, environmental impact, and revenue. The goal of releasing DSE was to provide a transparent view of how the eBay engine is performing, as well as spark an industry dialogue between companies, big and small, about how they use their technical infrastructure and the impact it’s having on their business and the planet. In the past month, we’ve been excited to see – and participate in – the resulting dialogue and conversation.

When we shared DSE externally for the first time, we also set a number of performance goals for 2013 and committed to disclosing updates on our progress on a quarterly basis. Today, we’re pleased to share our first such quarterly update with year-over-year comparisons and analysis. This post provides highlights of our findings as well as links to the DSE dashboard and other details.

[Figure: DSE Q1 2013 dashboard]

Q1 metrics summary

Here’s where we stand on the progress toward 2013 performance and efficiency goals:

  • Our transactions per kWh increased 18% year over year. The growth of eBay Marketplaces and continued focus on driving efficiencies have contributed to this increase.
  • Our cost per transaction decreased by 23% in Q1 alone, already exceeding our initial goal.
  • Our carbon per transaction showed a net decrease of 7% for the quarter. As we’re still on track for our Utah Bloom fuel cell installation to go live this summer, we expect this number to continue to decrease and contribute significantly to our 10% carbon reduction goal for the year, even with our projected business growth.

Recognizing that our dynamic infrastructure changes each quarter, we’re confident that we’re on track for our 10% net gain across performance, cost, and environmental impact for the year.

Trends

We’ve seen a few other interesting trends as well:

The New eBay: Over the past year we’ve added numerous new features to our site. Last fall, we rolled out our feed technology, which makes the shopping experience on eBay.com more personal to each of our individual users. On the backend, with this curated site personalization, we’ve seen a jump in our transaction URLs as the eBay feed technology increases infrastructure utilization. This is one example of our customers’ site use driving a more productive work engine.

Continued Growth and Efficiency: As you can see on the dashboard, we had a significant spike in the number of physical servers powering eBay.com – a 37% increase year over year. The rise is in direct response to increased customer demand and personalization for our users. And while we added a significant number of new servers, we were able to limit our increase in power consumption to just 16% (2.69 MW). Compared to the first quarter of last year, we’ve reduced the power consumed by an average of 63 watts per server; this is a direct result of our efforts to run more energy-efficient servers and facilities.

Running a Cleaner Engine: As eBay aspires to be the leading global engine for greener commerce, we’re continually focused on integrating cleaner and renewable energy into our portfolio. In March, our Salt Lake City data center’s solar array came online and though it’s relatively small, it increased our owned clean energy powering eBay.com by 0.17%. Our active projects with fuel cells and renewable energy sourcing will continue to increase this value through the year.

Continuing to Refine Our Approach

When we first announced DSE, one of our top priorities was continued transparency into our metrics, calculations, and lessons learned. Along those lines, the greater transparency has also sparked internal conversations at eBay about how our server pools are categorized.

We organize each server into one of three categories: “buy,” “sell,” or “shared.” “Buy” and “sell” serve the pages and APIs directly for customer traffic, which count as our transactions (URLs); “shared” is the backend support equipment, which does not receive external traffic.

When we released the 2012 DSE numbers, we reported 7.3 trillion URLs, or transactions, in the “buy” and “sell” groups. As we rolled up the Q1 2013 numbers, we found that some internally facing servers had been grouped into “buy” and “sell.” We moved them to the appropriate “shared” group. While this did not change overall server counts or power consumption, it did decrease the 2012 transaction count coming from the external servers to 4.3 trillion. We also moved some server groups from “sell” to “buy” to be more consistent with our business model (previously, “buy” and “sell” had been split more evenly). Because we felt it was important to stay strict with our methodology, we retroactively updated the 2012 baseline numbers so that our year-over-year results are consistent.

Based on lessons learned in the Q1 2013 work, we’ve fine-tuned our methodology as follows:

  1. We’ll look only at those server pools that receive external web traffic in order to ensure that we’re accurately speaking to the year-over-year comparisons for “buy” and “sell” – all other server pools not receiving external web traffic will be considered “shared.”
  2. We’re now measuring Revenue per MW hour (as opposed to Revenue per MW), as this metric represents total consumption per quarter and year instead of quarterly averages. Likewise, we’ve decided to measure CO2e per MW hour instead of CO2e per MW. (The quick sketch below illustrates the difference.)
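
Here is a rough sketch of why the MWh framing captures total consumption; the numbers are purely hypothetical and are not eBay figures:

    // Hypothetical numbers for illustration only -- not actual eBay figures.
    var avgPowerMW = 2.5;                          // average draw over the quarter, in MW
    var hoursInQuarter = 24 * 91;                  // roughly 2,184 hours in a quarter
    var energyMWh = avgPowerMW * hoursInQuarter;   // total energy consumed: ~5,460 MWh
    var revenue = 100000000;                       // hypothetical quarterly revenue in dollars

    // Revenue per MW reflects only the average power draw for the period...
    var revenuePerMW = revenue / avgPowerMW;
    // ...while revenue per MWh reflects the total energy actually consumed.
    var revenuePerMWh = revenue / energyMWh;
    console.log(revenuePerMW, revenuePerMWh);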

Conclusion

With the Q1 2013 results under our belt, we’re happy with the progress we’ve made to fine-tune our technical infrastructure and the DSE metric. We still have more to do, though, and we’ll be sure to keep you updated along the way.

You can find the full Q1 2013 DSE results at http://dse.ebay.com, and click through the dashboard yourself to see the full cost, performance, and environmental impact of our customer “buy” and “sell” transactions.

Have a question or comment? Leave a note in the comments below or email dse@ebay.com and we’ll be sure to get back to you.

- Sri Shivananda, Vice President of Platform, Infrastructure and Engineering Systems, eBay Inc.

 

 


For the most part, eBay runs on a Java-based tech stack. Our entire workflow centers around Java and the JVM. Considering the scale of traffic and the stability required by a site like ebay.com, using a proven technology was an obvious choice. But we have always been open to new technologies, and Node.js has been topping the list of candidates for quite some time. This post highlights a few aspects of how we developed eBay’s first Node.js application.

Scalability

It all started when a bunch of eBay engineers (Steven, Venkat, and Senthil) wanted to bring an eBay Hackathon-winning project called “Talk” to production. When we found that Java did not seem to fit the project requirements (no offense), we began exploring the world of Node.js. Today, we have a full Node.js production stack ready to rock. 

We had two primary requirements for the project. First was to make the application as real time as possible–i.e., maintain live connections with the server. Second was to orchestrate a huge number of eBay-specific services that display information on the page–i.e., handle I/O-bound operations. We started with the basic Java infrastructure, but it consumed many more resources than expected, raising questions about scalability for production. These concerns led us to build a new mid-tier orchestrator from scratch, and Node.js seemed to be a perfect fit.

Mindset

Since eBay revolves around Java, and since Java is a strongly typed static language, it was initially very difficult to convince folks to use JavaScript on the backend. The questions revolved around type safety, error handling, scaling, and so on. In addition, JavaScript itself (being the world’s most misunderstood language) further fueled the debate. To address these concerns, we created an internal wiki and invited engineers to post their questions, doubts, or anything else about Node.js.

Within a couple of days, we had an exhaustive list to work on. As expected, the most common questions centered around the reliability of the stack and the efficiency of Node.js in handling eBay-specific functionality previously implemented in Java. We answered each one of the questions, providing details with real-world examples. At times this exercise was eye-opening even for us, as we had never considered the angle that some of the questions presented. By the end of the exercise, people understood the core value of Node.js; indeed, some of the con arguments proved to be part of the beauty of the language.

Once we had passed the test of our peers’ scrutiny, we were all clear to roll.

Startup

We started from a clean slate. Our idea was to build a bare-minimum boilerplate Node.js server that scales; we did not want to bloat the application by introducing a proprietary framework. The first four node modules we added as dependencies were express, cluster, request, and async. For data persistence, we decided on MongoDB, to leverage its ease of use as well as its existing infrastructure at eBay. With this basic setup, we were able to get the server up and running on our developer boxes. The server accepted requests, orchestrated a few eBay APIs, and persisted some data.
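
To give a feel for that kind of boilerplate, here is a minimal sketch along those lines. The routes, service URLs, and database names are assumptions for illustration, not the actual implementation:

    // Illustrative sketch only -- routes, service URLs, and DB names are hypothetical.
    var cluster = require('cluster');
    var express = require('express');
    var request = require('request');
    var async   = require('async');
    var MongoClient = require('mongodb').MongoClient;

    if (cluster.isMaster) {
      // Fork one worker per CPU so every core runs its own event loop.
      var cpus = require('os').cpus().length;
      for (var i = 0; i < cpus; i++) cluster.fork();
      cluster.on('exit', function () { cluster.fork(); });   // naive restart on crash
    } else {
      var app = express();

      app.get('/api/item/:id', function (req, res) {
        // Orchestrate several I/O-bound service calls in parallel.
        async.parallel({
          details: function (cb) {
            request({ url: 'http://internal-svc/item/' + req.params.id, json: true },
                    function (err, resp, body) { cb(err, body); });
          },
          shipping: function (cb) {
            request({ url: 'http://internal-svc/shipping/' + req.params.id, json: true },
                    function (err, resp, body) { cb(err, body); });
          }
        }, function (err, results) {
          if (err) { return res.status(500).json({ error: err.message }); }
          // Persist a little data, then respond.
          MongoClient.connect('mongodb://localhost/talk', function (err, db) {
            if (!err) {
              db.collection('views').insert({ itemId: req.params.id, ts: new Date() },
                                            function () { db.close(); });
            }
            res.json(results);
          });
        });
      });

      app.listen(8080);
    }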

For end-to-end testing, we configured our frontend servers to point to the Node.js server, and things seemed to work fine. Now it was time to get more serious. We started white-boarding all of our use cases, nailed down the REST end points, designed the data model and schema, identified the best node modules for the job, and started implementing each end point. The next few weeks we were heads down–coding, coding, and coding.   

Deployment

Once the application reached a stable point, it was time to move from a developer instance to a staging environment. This is when we started looking into deployment of the Node.js stack. Our objectives for deployment were simple: Automate the process, build once, and deploy everywhere. This is how Java deployment works, and we wanted Node.js deployment to be as seamless and easy as possible.

We were able to leverage our existing cloud-based deployment system. All we needed to do was write a shell script and run it through our Hudson CI job. Whenever code is checked in to the master branch, the Hudson CI job kicks off. Using the shell script, this job builds and packages the Node.js bundle, then pushes it to the deployment cloud. The cloud portal provides an easy user interface to choose the environment (QA, staging, or pre-production) and activate the application on the associated machines.

Now we had our Node.js web service running in various stable environments. This whole deployment setup was quicker and simpler than we had expected.  

Monitoring

At eBay, we have logging APIs that are well integrated with the Java thread model as well as at the JVM level. An excellent monitoring dashboard built on top of the log data can generate reports, along with real-time alerts if anything goes wrong. We achieved similar monitoring for the Node.js stack by hooking into the centralized logging system. Fortunately for us, we had logging APIs to consume. We developed a logger module and implemented three different logging APIs:

  1. Code-level logging. This level includes logging of errors/exceptions, DB queries, HTTP service calls, transaction metadata, etc.
  2. Machine-level logging. This level includes heartbeat data about CPU/memory and other OS statistics. Machine-level logging occurs at the cluster module level; we extended the npm cluster module and created an eBay-specific version.
  3. Logging at the load balancer level. All Node.js production machines are behind a load balancer, which sends periodic signals to the machines and ensures they are in good health. If a machine goes down, the load balancer fails over to a backup machine and alerts the operations and engineering teams.

We made sure the log data formats exactly matched the Java-based logs, thus generating the same dashboards and reports that everyone is familiar with.
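
As a rough illustration of the first two levels, here is a hypothetical logger sketch; the transport, field names, and heartbeat interval are assumptions, not eBay’s internal APIs:

    // Hypothetical logger sketch -- transport, field names, and interval are assumptions.
    var os = require('os');
    var cluster = require('cluster');

    function emit(record) {
      // The sketch just prints; a real module would ship records to the
      // centralized logging system in the same format as the Java-based logs.
      console.log(JSON.stringify(record));
    }

    // 1. Code-level logging: errors, DB queries, service calls, transaction metadata.
    exports.logEvent = function (type, data) {
      emit({ ts: new Date().toISOString(), host: os.hostname(), pid: process.pid,
             type: type, data: data });
    };

    // 2. Machine-level logging: a periodic heartbeat with CPU/memory statistics,
    //    emitted here from the cluster master process.
    if (cluster.isMaster) {
      setInterval(function () {
        emit({ ts: new Date().toISOString(), host: os.hostname(), type: 'heartbeat',
               load: os.loadavg(), freeMem: os.freemem(), totalMem: os.totalmem() });
      }, 60 * 1000);
    }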

One particular logging challenge stemmed from the asynchronous nature of the Node.js event loop: log lines from concurrent transactions ended up interleaved, or “crossed.” To understand the problem, consider the following use case: the Node process starts a URL transaction and issues a DB query with an async callback. Before the DB call finishes, the process moves on to the next request. This is normal behavior in any event-loop-based model like Node.js, but it means the logs from multiple URL transactions are crossed, and the reporting tool shows scrambled output. We have worked out both short-term and long-term resolutions for this issue.
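
We won’t detail those resolutions here, but one common mitigation (shown below as a general sketch, not necessarily the approach we took) is to assign each URL transaction an ID and tag every log line with it, so the reporting tool can regroup interleaved lines. The db.query call and the ID scheme are hypothetical:

    // Sketch of correlation-ID logging -- db.query and the ID scheme are hypothetical.
    var nextTxnId = 0;

    function log(txnId, message) {
      // Tagging each line with its transaction ID lets the reporting tool
      // regroup lines even when concurrent transactions interleave.
      console.log('[txn ' + txnId + '] ' + message);
    }

    function handleRequest(req, res, db) {
      var txnId = ++nextTxnId;                        // assigned when the URL transaction starts
      log(txnId, 'start ' + req.url);

      db.query('SELECT ...', function (err, rows) {   // other requests run while this is pending
        log(txnId, 'db query finished');
        res.end('ok');
        log(txnId, 'end ' + req.url);
      });
    }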

Conclusion

With all of the above work completed, we are ready to go live with our Hackathon project. This is indeed the first eBay application to have a backend service running on Node.js. We’ve already had an internal employee-only launch, and the feedback was very positive–particularly on the performance side. Exciting times are ahead!

A big shout-out to our in-house Node.js expert Cylus Penkar, for his guidance and contributions throughout the project. With the success of the Node.js backend stack, eBay’s platform team is now developing a full-fledged frontend stack running on Node.js. The stack will leverage most of our implementation, in addition to frontend-specific features like L10N, management of resources (JS/CSS/images), and tracking. For frontend engineers, this is a dream come true; and we can proudly say, “JavaScript is EVERYWHERE.”

Senthil Padmanabhan & Steven Luan
Engineers @ eBay

