eBay Tech Blog

The Case Against Logic-less Templates

by Sathish Pottavathini on 10/01/2012

in Software Engineering

At a conference I recently attended, two separate topics related to templates sparked a debate off-stage about Logic vs No-logic View Templates. Folks were very passionate about the side they represented. Those discussions were the high point of the conference, since I learned much more from them than from the sessions that were presented.

While everyone involved in the debates had different opinions about the ideal templating solution, we all agreed that templates should…

  • not include business logic
  • not include a lot of logic
  • be easy to read
  • be easy to maintain

In this post I’ll describe what I prefer, setting aside rules, best practices, performance considerations, and the like. My preferences come from my experience with templates for medium to complex applications.

Full-of-logic, logic-less, and less-logic solutions

Before I go into the details, I would like to make one thing very clear. Although I’m trying to make a case against logic-less templates, that does not mean that I am advocating the other extreme–i.e., a templating language that allows a lot of logic. I find such templating languages, especially those that allow the host programming languages to be used inside the template, to be hard to read, hard to maintain, and simply a bad choice. A JSP template with Java code in it and an Underscore template with JavaScript in it both fall into the category of being a full-of-logic template. JSP and Underscore are not necessarily at fault here; rather, developers often abuse the additional freedom such solutions offer.

What I am advocating is “less-logic” templates in place of “logic-less” templates (thanks, Veena Basavaraj (@vybs), for the term “less-logic templates”!).

Good and great templating languages

A good templating language should offer, at a minimum, the following features:

  1. Clean and easy-to-read syntax (including the freedom to use whitespace that will not show up in output)
  2. Structural logic like conditionals, iterations/loops, recursions, etc.
  3. Text/Token replacement
  4. Includes

A great templating language should also offer the following features:

  1. Ability to be rendered on the server and the client
  2. Easy learning curve with as few new concepts as possible
  3. Simple template inheritance
  4. Easy debugging
  5. Great error reporting (line numbers, friendly messages, etc.)
  6. Extensibility and customizability
  7. Localization support
  8. Resource URL management for images, scripts, and CSS

The case against logic-less templates

I consider a logic-less template to be too rigid and hard to work with due to the restrictions it imposes. More specifically, here are the reasons I am not a big fan of logic-less templates:

  • Writing a logic-less template requires a bloated view model with comprehensive getters for the raw data. As a result, a messy and difficult-to-maintain view model usually accompanies logic-less templates.
  • Every new variation to the view requires updating both the view model and the template. This holds true even for simple variations.
  • Getting the data ready for the template requires too much data “massaging.”
  • Offsetting the rules of a logic-less template requires a lot of helper methods.
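To make the first and third points concrete, here is a minimal Python sketch of the preprocessing a logic-less template tends to force on the caller. The `build_view_model` helper and all of its keys (`has_items`, `price_display`, and so on) are hypothetical. They only illustrate how each comparison, count, or formatting decision moves out of the template and into the view model.

```python
def build_view_model(items):
    """Precompute everything a logic-less template cannot decide for itself.

    All keys below are hypothetical; each one exists only because the
    template may not contain a comparison, a count, or a format call.
    """
    return {
        "has_items": len(items) > 0,          # replaces "if items.length > 0"
        "is_empty": len(items) == 0,          # replaces the "else" branch
        "item_count": len(items),
        "items": [
            {
                "title": item["title"],
                "price_display": f"${item['price']:.2f}",   # formatting
                "is_free_shipping": item["shipping"] == 0,  # comparison
                "is_last": index == len(items) - 1,         # loop position
            }
            for index, item in enumerate(items)
        ],
    }


raw = [
    {"title": "Camera", "price": 199.5, "shipping": 0},
    {"title": "Tripod", "price": 25, "shipping": 4.99},
]
view_model = build_view_model(raw)
```

With a less-logic template, most of these derived fields could stay in the template as a simple conditional or loop check, and the view model would carry little more than the raw items.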

Regarding the argument of better performance with logic-less templates: While they might have a simpler implementation, they require additional preprocessing/massaging of the data. You therefore have to pay a price before the data gets to the template renderer. Granted, a templating language that allows more logic might have a more complex implementation; however, if the compiler is implemented correctly, then it can still produce very efficient compiled code. For these reasons, I would argue that a solution involving templates with more logic will often outperform a similar solution based on logic-less templates.

Conclusion

No doubt, there are advantages like simplicity and readability associated with logic-less templates such as Mustache. And arguably, there could be some performance gains. However, in practice I do not think these advantages are enough to justify logic-less templates.

Logic in a template isn’t really a bad thing, as long as it doesn’t compromise the template’s readability and maintainability.

And as an aside, I’m definitely in favor of more debates than sessions in future conferences, since we actually learn more by hearing multiple viewpoints.

I welcome your thoughts!

Credits: Many thanks to my colleague Patrick Steele-Idem (@psteeleidem) for helping me write this post. He is working on some cool solutions like a new templating language; be sure to check them out when they are open-sourced.


ql.io APIs for eBay Marketplaces

by Madhura Tipnis on 09/24/2012

in Software Engineering

We are pleased to announce the release of eBay Marketplaces APIs based on ql.io:

https://github.com/ql-io/ql.io-ebay-mp-apis

We have created tables for most of the calls for the eBay Finding, Shopping, and Trading APIs. The fields required in the API calls are mapped using either EJS (Embedded JavaScript) or Mustache templates. You can use the tables as they are; or you can create your own tables and routes for the specific fields and calls that your application will need.

To use the tables that we have provided, clone the above repository. Then run

make clean install

You need to have the API key and authentication credentials for the particular API. You can get the API key at https://www.x.com/developers/ebay. Simply enter the appropriate values in the dev.json file in the config directory.

The test directory includes examples of various API calls. As described at http://ql.io/ we support different SQL-like statements – select, insert, delete, etc. We also support various HTTP verbs, such as GET, PUT, POST, and DELETE, with the SQL-like statements.

Usage scenarios

As examples of how the APIs can be used, here are three different scenarios. You can run each of the statements shown individually, and the call will return a “Success” message. Alternatively, you can use the UI to test if the calls have been successful.

SCENARIO 1:  WatchList add/remove (buyer’s perspective)

Here we use a keyword to search for items. We take the item ID of the first matching item and use it to add that item to the watchlist.

itemid = select searchResult.item[0].itemId from finding.findItemsByKeywords where keywords = 'camera';

insert into trading.AddToWatchList (ItemID) values ("{itemid}");

We can also remove an item from the watchlist:

return delete from trading.RemoveFromWatchList where ItemID in ("{itemid}");

SCENARIO 2:  AddItem + PlaceOffer (buyer’s + seller’s perspective)

This scenario uses the POST method with the AddItem call of the trading API. Here we pass the entire XML as the input string, which the ql.io engine will directly pass to the API gateway. This method is referred to as “passing opaque parameters” in the documentation at http://www.ql.io/docs/insert.

insert into trading.AddItem values ('<?xml version="1.0" encoding="utf-8"?>
   <AddItemRequest xmlns="urn:ebay:apis:eBLBaseComponents">
      <ErrorLanguage>en_US</ErrorLanguage>
      <WarningLevel>High</WarningLevel>
      <Item><Title>Best book</Title>
         <Description>This is the best book. Super condition!</Description>
         <PrimaryCategory>
            <CategoryID>377</CategoryID>
         </PrimaryCategory>
         <StartPrice>1.00</StartPrice>
         <ConditionID>3000</ConditionID>
         <CategoryMappingAllowed>true</CategoryMappingAllowed>
         <Country>US</Country>
         <Currency>USD</Currency>
         <DispatchTimeMax>3</DispatchTimeMax>
         <ListingDuration>Days_7</ListingDuration>
         <ListingType>Chinese</ListingType>
         <PaymentMethods>PayPal</PaymentMethods>
         <PayPalEmailAddress>magicalbookseller@yahoo.com</PayPalEmailAddress>
         <PictureDetails>
            <PictureURL>http://i.ebayimg.sandbox.ebay.com/00/s/MTAwMFg2NjA=/$(KGrHqJHJEsE-js(zPJ)BP)cWCLLSQ~~60_1.JPG?set_id=8800005007</PictureURL>
         </PictureDetails>
         <PostalCode>95125</PostalCode>
         <Quantity>1</Quantity>
         <ReturnPolicy>
            <ReturnsAcceptedOption>ReturnsAccepted</ReturnsAcceptedOption>
            <RefundOption>MoneyBack</RefundOption>
            <ReturnsWithinOption>Days_30</ReturnsWithinOption>
            <Description>
               If you are not satisfied, return the book for a refund.
            </Description>
            <ShippingCostPaidByOption>Buyer</ShippingCostPaidByOption>
         </ReturnPolicy>
         <ShippingDetails>
            <ShippingType>Flat</ShippingType>
            <ShippingServiceOptions>
               <ShippingServicePriority>1</ShippingServicePriority>
               <ShippingService>USPSMedia</ShippingService>
               <ShippingServiceCost>2.50</ShippingServiceCost>
            </ShippingServiceOptions>
         </ShippingDetails>
         <Site>US</Site>
      </Item>
      <RequesterCredentials>
         <Username>ql.io-test1</Username>
         <Password>ebay</Password>
      </RequesterCredentials>
   </AddItemRequest>');

return insert into trading.PlaceOffer (ErrorLanguage, EndUserIP, ItemID, Offer.Action, Offer.MaxBid, Offer.Quantity) values ("en_US", "192.168.255.255", "200002581483", "Bid", "20.00", "1");

The Item ID is obtained after the item is added. This Item ID can later be used to revise the item listing (such as to add to the item description).

SCENARIO 3:  SellingManagerProduct add/remove (seller’s perspective)

In this final example, we use Selling Manager product calls of the ql.io-based Trading API. Here, we use columns and name-value pairs (the first of the insert methods described in http://www.ql.io/docs/insert) to add a product:

result = insert into trading.AddSellingManagerProduct
         (Version,
          RequesterCredentials.eBayAuthToken,
          SellingManagerProductDetails.ProductName,
          SellingManagerProductDetails.QuantityAvailable,
          SellingManagerProductDetails.FolderID)
         values
         ("737",
          "MyAuthToken",
          "Harry Potter Book4",
          "50",
          "4651612545");

prodID = "{result.SellingManagerProductDetails.ProductID}";

And here, we delete a product:

return delete from trading.DeleteSellingManagerProduct where ProductID = "{prodID}";

Try it yourself!

Go ahead and fork https://github.com/ql-io/ql.io-ebay-mp-apis, and play with the tables and routes. For an overview and FAQs, see the readme. You can also ask questions and join the community discussion in the ql.io group.

 


The actor model of computation has gained in reputation over the past decade as engineers have learned to grapple with concurrent and distributed systems to achieve scalability and availability.

I was in the zone about a year ago developing a job scheduling framework for an application called Nebula, which we use to manage keywords and SEM campaigns. A simple actor library with durable mailboxes implemented over Apache ZooKeeper seemed like a good fit at the time. Looking back, I can say that the solution has worked well. At the time, we had some good reasons (e.g., special throttling requirements) to develop our own framework instead of customizing an existing one. Faced with the same decision today, I would seriously consider building something around Akka.

We developed two primary patterns to make actors highly reliable. The first is to model critical actor state and mailboxes in ZooKeeper. The second is to use the ZooKeeper watch mechanism for actor supervision and recovery.

Example:  job type limiter

To explain how these patterns work, let’s look at the design of a specific actor, the job type limiter. Each job is of a given type, and type limits are set by an administrator to prevent too many jobs of a certain type from running simultaneously in the grid. The type limiter is a singleton in the compute grid and ensures that job type limits are enforced.

For example, let’s assume we have jobs that invoke third-party services to update the maximum bid for keyword ads. The call capacity is limited, so while the grid has enough slots to run hundreds of jobs, let’s say we can run only 10 concurrent jobs against a partner API. The type limiter ensures that the 11th job waits until one of the 10 that are running has completed.

For each job type, the type limiter keeps track of the running count and the limit. When the running count hits the limit, the gate is closed until a job of the corresponding type terminates. So how do we ensure this critical state is maintained correctly even if you cut the power cord to the system on which the type limiter is running? More importantly, how do we eliminate drift to prevent our running counts from being off by one? If a job host crashes, how do we guarantee that the type limiter receives a message indicating the job has ended?
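Leaving durability aside for a moment, the bookkeeping itself is small. The following Python sketch keeps the counts in memory only. The class name, method names, and the "partner-bid-update" job type are made up for illustration and are not taken from Nebula.

```python
class TypeLimiterCounts:
    """In-memory running counts per job type (no durability, for illustration)."""

    def __init__(self, limits):
        self.limits = dict(limits)                 # job type -> max concurrent jobs
        self.running = {t: 0 for t in self.limits}

    def try_start(self, job_type):
        """Let a job out of the gate if its type is under the limit."""
        if self.running[job_type] >= self.limits[job_type]:
            return False                           # gate closed, job must wait
        self.running[job_type] += 1
        return True

    def job_ended(self, job_type):
        """Reopen the gate when a job of this type terminates."""
        if self.running[job_type] > 0:
            self.running[job_type] -= 1


limiter = TypeLimiterCounts({"partner-bid-update": 10})
started = [limiter.try_start("partner-bid-update") for _ in range(11)]
```

The first ten calls succeed and the eleventh is refused until `job_ended` is called. Everything that follows in this post is about keeping exactly this state correct when hosts fail.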

Actors, mailboxes, queues, and watches

An actor receives and acts upon messages sent to its mailbox. Implementing an actor is quite easy because there are no locks to contend with – you get a message, handle it, and optionally send new messages or start new actors as a side-effect. But the mailbox, the limits, and the counter values must be durable and highly available for the type limiter to accurately and efficiently keep track of running counts in a fault-tolerant manner.

To model actor mailboxes, I started with a simple queue in ZooKeeper. A set of persistent sequential znodes represent messages, and their parent represents the queue. In order to wake up an actor when new messages arrive in the queue, I prepare another znode called a touch-pad. A sending actor creates the message znode, and then invokes setData() on the touch-pad to trigger a watch. The receiving actor establishes a data watch on the touch-pad and, when the watch is triggered, invokes getChildren() on the queue to collect messages.
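The send and receive steps can be simulated without a ZooKeeper ensemble. The sketch below uses a tiny in-memory stand-in, `FakeZooKeeper`, which is not the ZooKeeper client API. It only mimics sequential child creation, setData(), one-shot data watches, and getChildren(). The znode paths are hypothetical.

```python
class FakeZooKeeper:
    """Tiny in-memory stand-in for the few ZooKeeper operations used here.

    This is NOT the ZooKeeper client API; it only mimics sequential child
    creation, setData() on a znode, one-shot data watches, and getChildren().
    """

    def __init__(self):
        self.children = {}      # parent path -> list of child names
        self.counters = {}      # parent path -> next sequence number
        self.data = {}          # path -> bytes
        self.watches = {}       # path -> list of one-shot callbacks

    def create_sequential(self, parent, prefix, payload=b""):
        seq = self.counters.get(parent, 0)
        self.counters[parent] = seq + 1
        name = f"{prefix}{seq:010d}"            # 10-digit suffix, as in ZooKeeper
        self.children.setdefault(parent, []).append(name)
        self.data[f"{parent}/{name}"] = payload
        return name

    def set_data(self, path, value):
        self.data[path] = value
        callbacks, self.watches[path] = self.watches.get(path, []), []
        for callback in callbacks:              # watches fire once, then are gone
            callback(path)

    def watch_data(self, path, callback):
        self.watches.setdefault(path, []).append(callback)

    def get_children(self, parent):
        return list(self.children.get(parent, []))


# Hypothetical paths for one actor's mailbox and its touch-pad.
QUEUE, TOUCH_PAD = "/actors/limiter/queue", "/actors/limiter/touch-pad"


def send(zk, payload):
    """Sender: create the message znode, then poke the touch-pad."""
    zk.create_sequential(QUEUE, "msg-", payload)
    zk.set_data(TOUCH_PAD, b"")


def start_receiver(zk, inbox):
    """Receiver: on each touch, re-arm the watch and collect new messages in order."""
    def on_touch(_path):
        zk.watch_data(TOUCH_PAD, on_touch)      # re-arm first; watches are one-shot
        for name in sorted(zk.get_children(QUEUE)):
            if name not in inbox:
                inbox.append(name)
    zk.watch_data(TOUCH_PAD, on_touch)


zk, inbox = FakeZooKeeper(), []
start_receiver(zk, inbox)
send(zk, b"job-started")
send(zk, b"job-ended")
```

The receiver re-arms the watch before reading the children because ZooKeeper watches fire only once. A message that arrives between the trigger and the re-arm would otherwise go unnoticed until the next touch.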

After retrieval from ZooKeeper, the messages are ordered by sequence number and entered into a local queue for processing by the target actor. The actor keeps track of messages it has received by writing the message sequence number back to the queue znode. It does so by calling the asynchronous setData() before actually handling the message. The reason for writing ahead is that messages are assumed to be idempotent; if processing led to an error that prevented the actor from writing the current message sequence number back to the queue, the actor may end up in an infinite recovery cycle: get-message, handle-message-fail, read-recover, get-message, handle-message-fail, and so on.
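Here is a small Python simulation of why the sequence number is recorded before the handler runs. It is a sketch under simplifying assumptions: `state["last_seq"]` stands in for the value written to the queue znode, and a "poison" message that always fails stands in for a processing error.

```python
def drain(messages, state, handle):
    """Process messages in order, recording each sequence number first.

    `state["last_seq"]` stands in for the value written to the queue znode.
    """
    for seq in sorted(messages):
        if seq <= state["last_seq"]:
            continue                      # already recorded before a crash
        state["last_seq"] = seq           # write ahead, then handle
        handle(messages[seq])


def flaky_handler(payload):
    if payload == "poison":
        raise RuntimeError("handler failed")
    handled.append(payload)


messages = {1: "a", 2: "poison", 3: "b"}
state = {"last_seq": 0}
handled = []
attempts = 0

# Simulate crash-and-recover: each failure restarts the actor, which resumes
# after the last recorded sequence number instead of retrying the poison message.
while state["last_seq"] < max(messages):
    attempts += 1
    try:
        drain(messages, state, flaky_handler)
    except RuntimeError:
        pass
```

The loop finishes after two attempts with messages "a" and "b" handled. Had the sequence number been written only after a successful handler call, every restart would have hit message 2 again and the loop would never end.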

Though writing back the sequence number ahead of processing prevents the infinite cycle, it’s possible that the message isn’t properly handled at all. What if the actor crashes right after the sequence number has been written, but before the message has been processed? There are different ways to recover from a situation like this. In our case the scheduling framework holds enough extra state in ZooKeeper to allow actors to recover and eventually self-heal. In some cases, an actor will rely on senders to repeat a message; in others, it will check its current state against expected values and take corrective measures if necessary. If an actor hasn’t received any messages for some time, it might run an admin cycle where it checks its state against expected values in its environment, or publishes its assumptions (sends a message) so other actors can confirm.

More concretely, the type limiter doesn’t just keep a running count for each job type. It also records the job handle for each job that it lets out of the gate. In fact, it records the handle in ZooKeeper before it lets the job out of the gate. The job handle encodes a reference to a job’s state (e.g., command) and a run ID. Each time a job is scheduled, its run ID is incremented, so the handle for each run is unique. The framework also serializes runs for the same job (in case of reruns or retries).

Other actors, called job monitors – one per host – are responsible for monitoring jobs running locally. They know whether a job process is still alive and progressing. Job monitors can confirm whether a job handle recorded by the type limiter refers to a job that is actually running. They also emit messages when they’re idle to indicate no jobs are running so actors like the type limiter can validate their state. If the whole cluster is idle, the type limiter verifies that there are no job handles in its records. In this manner, the type limiter may from time to time suffer a temporary glitch where its count might be off by one, but it quickly recovers and makes adjustments as it verifies its assumptions with help from other actors.

Maintaining state

There are aspects to programming actors that turn out to be non-trivial, especially if the application relies on consistent state to make decisions. Asynchronous message exchanges among many actors across a grid can easily lead to inconsistent views (or views that are eventually consistent, but consistent too late). While ZooKeeper guarantees ordered processing, and messages are delivered to actors in sequence by the library (you could say we are deviating from the pure actor model here), an actor can easily fall out of synch if it doesn’t carefully maintain extra, durable state for recovery and healing, as the type limiter demonstrates.

An actor may expect an acknowledgment after sending a message. But if it doesn’t receive any, it gives up after a timeout, and assumes the receiver or the network is unavailable. In such circumstances it is tempting to retry immediately, but better to back off for a while before retrying. In a larger cluster with a lot of asynchronous communication between actors, congestion or delays in processing are quickly ameliorated if actors back off immediately and try again after a random period of time. Throughput suffers a little, but overall stability and fault tolerance are improved – not to mention the quality of uninterrupted sleep for certain humans.
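A randomized back-off of this kind can be sketched in a few lines of Python. The function name, the parameter values, and the doubling of the delay cap per attempt are my own assumptions for illustration; the text above only calls for a random waiting period.

```python
import random
import time


def send_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call `send()` until it returns True (acknowledged) or attempts run out.

    After each timeout the sender waits a random time in [0, cap), where the
    cap doubles per attempt up to `max_delay`. All values are illustrative.
    """
    for attempt in range(max_attempts):
        if send():
            return True
        if attempt < max_attempts - 1:
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * cap)            # random delay spreads out the retries
    return False


outcomes = iter([False, False, True])     # two timeouts, then an acknowledgment
delays = []
ok = send_with_backoff(lambda: next(outcomes), sleep=delays.append)
```

Passing `sleep=delays.append` records the delays instead of actually waiting, which makes the behavior easy to inspect. Because each sender draws its own random delay, senders that timed out together are unlikely to retry together.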

What happens when the host on which the type limiter is running shuts down or crashes? A supervisor takes note and starts a new actor on another system. The new type limiter instance reads the current message sequence number, loads all job handles, calculates the running counts by job type (i.e., refreshes its caches), reads and deletes messages with sequence numbers less than or equal to the current sequence number, and then starts processing new messages – writing back the sequence number before it handles each message. Got it; but how does it really work? What if the supervisor dies?
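The recovery steps listed above can be sketched as a single function. This is a simplified Python model rather than the Nebula code: plain dictionaries stand in for what would be read from ZooKeeper, and the job handle strings and type names are invented for the example.

```python
from collections import Counter


def recover(last_seq, job_handles, queue):
    """Rebuild a type limiter's in-memory state after fail-over.

    last_seq    -- sequence number read back from the queue znode
    job_handles -- mapping of job handle -> job type, as recorded durably
    queue       -- mapping of message sequence number -> message
    """
    running = Counter(job_handles.values())          # refresh the cached counts
    for seq in [s for s in queue if s <= last_seq]:
        del queue[seq]                               # already recorded as seen
    pending = [queue[s] for s in sorted(queue)]      # handle these next, in order
    return running, pending


handles = {"job42#3": "bid-update", "job43#1": "bid-update", "job50#2": "report"}
mailbox = {6: "old", 7: "seen", 8: "new-1", 9: "new-2"}
running, pending = recover(7, handles, mailbox)
```

After the call, `running` holds two "bid-update" jobs and one "report" job, and only messages 8 and 9 remain to be processed.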

Before diving in further, this is a good place to give a shout-out to the folks who designed, developed and maintain ZooKeeper. I think it’s one of the most useful open-source contributions for distributed computing of the past decade. Most of the fault tolerance in Nebula can be reduced to simple patterns that make use of the ZooKeeper model (e.g., ephemeral, session-bound znodes), its serialized atomic updates, and the associated watch mechanism.

Ephemeral znodes, master election, and actor supervision

Now let’s examine actor supervision, the second of the two patterns, and one that relies on ZooKeeper’s ephemeral znodes and watch mechanism. We start with master election, where any number of participants compete to attain mastership for a certain role. You can think of these participants as little actors that come to life under prescribed circumstances – for example, when a system starts up. We require that each election participant run on a different host in the grid.

Let’s consider the role that gives the master the privilege of starting a type limiter. Each role is uniquely identified by a well-known path in ZooKeeper where ephemeral sequential znodes are created by election participants. The winner is the participant who creates the znode with the lowest sequence number. This participant is elected master and acts as a local watchdog responsible for the life cycle of the main actor, in this case the type limiter. If the system shuts down cleanly, the type limiter terminates normally, and the master deletes its znode in ZooKeeper.

The deletion triggers a watch that’s established by another participant – we call it the slave. The slave is the participant who creates the znode with the second-lowest sequence number during the election. Instead of walking away, the slave establishes a watch on the master’s znode and acts like a supervisor for the master and its main actor. When the watch is triggered, the slave is notified, and it immediately takes over the master role by starting a new instance of the type limiter. In other words, the slave mutates into a master and now acts as a watchdog over the type limiter.
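The election rule boils down to sorting by sequence number. The Python sketch below assumes znode names that end in ZooKeeper's 10-digit sequence suffix; the host-based name prefixes are hypothetical.

```python
def elect(znodes):
    """Return (master, slave) from ephemeral sequential znode names.

    The participant with the lowest sequence number is master; the one with
    the second-lowest watches the master's znode. Names are assumed to end in
    ZooKeeper's 10-digit sequence suffix.
    """
    ordered = sorted(znodes, key=lambda name: int(name[-10:]))
    master = ordered[0] if ordered else None
    slave = ordered[1] if len(ordered) > 1 else None
    return master, slave


participants = ["hostB-0000000011", "hostA-0000000010"]
master, slave = elect(participants)

# The master's znode disappears (clean shutdown or session timeout),
# the slave's watch fires, and the slave takes over the master role.
participants.remove(master)
new_master, new_slave = elect(participants)
```

After the fail-over, `new_slave` is `None`: the role has a master again but nobody supervising it until another participant joins.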

If a master and its main actor die abruptly because their host crashes, their ZooKeeper session eventually times out, and ZooKeeper automatically deletes the corresponding ephemeral znode. The watch is triggered, and the slave takes over as described above. Except for a minor delay due to the session timeout, and the possibility of incomplete message handling by a crashing actor, the fail-over and recovery are the same. Bring on the chaos monkeys and see if you can tell anything unusual is happening in the grid – peeking at the logs is cheating.

We’re not done yet. There’s another actor called a grid monitor, which comes to life at random intervals on each host in the grid. It checks the health of local actors and then participates in an election to run a grid-level scan. One of the critical health checks is to ensure every actor has at least one active master and one active slave. The grid monitor doesn’t communicate with other hosts; it just scans ZooKeeper to see if the expected master and slave znodes exist for each role. If it detects a missing slave (not uncommon after a host goes down and associated slaves take over master roles), it starts up a new participant, which can take on the slave’s supervisor role (unless the master for the same role is running on the local host). With the grid monitor’s help, we can ensure that most fail-overs are handled immediately. Even in the rare case where a slave and then its master are terminated in short order, grid monitor instances will bring them back to life on other hosts within a few seconds.

Actor supervision in Nebula is thus quite similar in principle to the supervisor hierarchies defined in Erlang OTP. The master is the immediate local supervisor acting as a watchdog, the slave operates at the next level and runs on a different host to support rapid fail-over, and the grid monitor acts like a meta-supervisor across the whole grid. One of the most useful properties of the Nebula grid is that all hosts are created equal. There is no single point of failure. An actor can run on any host, and the grid monitor instances ensure the load is balanced across the grid.

Sequential znode suffix limits

Let’s look at one more detail regarding durable mailboxes in ZooKeeper. If you’ve used ZooKeeper, you know that the names of sequential znodes have a numeric 10-digit suffix. A counter value stored in the parent is incremented each time a new znode is created. If you rely on ZooKeeper sequence numbers to order messages in a queue, it’s easy to see that you may eventually exhaust the 10-digit integer limit (2^31-1) in a busy queue. You can deal with this limitation in different ways.
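To get a feel for how soon a busy queue would reach that limit, here is a quick back-of-the-envelope calculation in Python. The message rates are hypothetical and chosen only to show the scale.

```python
MAX_SEQUENCE = 2**31 - 1          # 2,147,483,647, the limit cited above


def days_until_exhausted(messages_per_day):
    """Days a single queue can run before its sequence counter hits the limit."""
    return MAX_SEQUENCE / messages_per_day


# Hypothetical rates, chosen only to show the scale:
for rate in (1_000_000, 10_000_000):
    print(f"{rate:>10,} msgs/day -> {days_until_exhausted(rate):7.1f} days")
```

At one million messages a day, a single queue lasts for years. At ten million a day, it runs out in about seven months.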

For example, if the grid runs idle from time to time, an actor can delete and recreate its mailbox after the sequence number reaches a certain threshold. This will reset the sequence number to zero. Another solution would be to use different queues and a signal that indicates to senders which one is active. One might also consider implementing a mailbox as a fixed-size ring buffer with no dependency on sequence numbers; however, before sending a new message, senders would have to determine which znodes are free and which have yet to be processed.

In any case, setting an upper bound on the number of messages in a mailbox is a good idea. Otherwise, a group of senders could easily overwhelm a receiver. One way to do this is to have the message dispatcher invoke getData() on the queue’s znode before sending a message. The returned value could specify the queue’s message limit, and the dispatcher could retrieve the number of messages via getNumChildren() on the znode’s stat object and compare it with the limit. If the number of children had already reached the limit, the dispatcher would back off, and the sender would try again after some delay. Again, we trade off a little throughput for higher resilience with a simple backpressure mechanism.
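The dispatcher-side check might look like the following Python sketch. The function and parameter names are hypothetical. The two numeric inputs stand for the values the dispatcher would obtain from getData() and from getNumChildren() on the stat object.

```python
def try_dispatch(queue_limit, num_children, enqueue, message):
    """Send only if the queue is below its limit; otherwise signal back-off.

    queue_limit  -- value the dispatcher would read via getData() on the queue znode
    num_children -- value from getNumChildren() on the returned stat object
    enqueue      -- callable that actually creates the message znode
    """
    if num_children >= queue_limit:
        return False          # caller should wait and retry later
    enqueue(message)
    return True


mailbox = []
results = [try_dispatch(2, len(mailbox), mailbox.append, m) for m in ("a", "b", "c")]
```

Note that the check and the send are two separate steps, so several senders racing each other can overshoot the limit slightly. For backpressure this is usually acceptable, since the goal is a soft bound rather than an exact one.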

Optimizing for short messages

We haven’t discussed how to store the message payload, and though it may be obvious that each znode in the queue can have data representing payload, I should highlight an optimization that works for very small messages. If your message can be encoded in a short string, say less than 50 characters, you can store it as a prefix in the znode name. On the receiving end, the messages can then be retrieved with a single call to getChildren() without extra calls to getData() on each child.
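On the receiving side, decoding such names takes only a few lines. In this Python sketch the payload format ("START:job42:run3:") is made up. The only assumption that matters is that each child name ends in the 10-digit sequence suffix.

```python
SUFFIX_LEN = 10   # length of ZooKeeper's sequential suffix


def decode(child_name):
    """Split a child name into (sequence number, payload stored as name prefix)."""
    return int(child_name[-SUFFIX_LEN:]), child_name[:-SUFFIX_LEN]


def read_mailbox(child_names):
    """Order messages by sequence number using only the getChildren() result."""
    return [payload for _, payload in sorted(decode(n) for n in child_names)]


# Names as they might come back, unordered, from a single getChildren() call.
# The "END:job42:run3:" payload format is made up for this example.
children = ["END:job42:run3:0000000008", "START:job42:run3:0000000007"]
ordered = read_mailbox(children)
```

Keep in mind that a payload stored this way has to be a legal znode name, so it cannot contain a "/" character.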

Design considerations

By now you’ve probably concluded that ZooKeeper doesn’t really provide an optimal foundation for messaging in general. I readily admit that your assessment is correct. If you’re dealing with a large grid where, say, tens of millions of messages are exchanged per day, or the message payloads are large, look for another solution. Also keep in mind that ZooKeeper doesn’t scale horizontally. It was designed as a highly available coordination service with very specific guarantees that limit its scalability. So if you’re building a massive grid, consider partitioning into autonomous cells and coordinate locally as much as possible. Global coordination, especially across data centers, requires special treatment.

The Nebula grid is relatively small (about 100 virtual hosts), and we usually have no more than 10 concurrent jobs running on any given host, so we are coordinating the progress of at most 1000 jobs at a time. For a grid of this size and kind of work load, our actor-based scheduling framework performs very well. The overhead in scheduling usually amounts to less than 5% of overall processing. We run ZooKeeper out of the box on an ensemble of five basic servers, and we use a database as an archive, where the state of completed jobs can be looked up if necessary. A simple REST service is provided for job submission and monitoring.


I would like to share my experiences with troubleshooting a recent issue on Oracle databases in our datacenters. While the details in this post involve batch jobs, the lessons learned can apply to any business having Oracle SQL traffic between two or more datacenters.

The problem

We noticed that the running time on a set of batch jobs had suddenly increased substantially. Some increase was understandable, since we had moved a significant percentage of the database hosts accessed by these batch jobs to another eBay data center. However, the latency had jumped beyond what had been expected or could be explained.

The investigation

The first step was to check our logs for the most expensive SQL statements.

For batch jobs, it is very easy to distinguish cross-colo transactions from intra-colo ones (those that stay within a single datacenter), since all batches run on boxes located in the same datacenter. I immediately found a discrepancy between the SQL times for intra-colo calls and those for cross-colo calls. I expected to see a consistent difference of 20 ms for both minimum and average times; 20 ms is the network latency between the two datacenters involved. In reality, while the minimum times differed by exactly 20 ms, the average difference was more than 30 ms. The following table shows the statistics collected for approximately 8500 calls in an hour:

Statistic                      Intra-Colo        Cross-Colo
Exec Count                     8600              8490
Minimum/Maximum/Average (ms)   1.00/119.00/5.70  21.00/487.00/36.25
Total Time (ms)                49028.00          307758.00

On further investigation, I discovered an additional complication. For some queries, the time difference was consistently 20 ms, and for other queries it was consistently 40 ms.

Next, I looked into database performance. I found that the query execution time was the same on all of our database hosts, and that the issue was reproducible independent of the batch jobs, using only tnsping and the Oracle SQL client.

At this point, I had all but ruled out a database issue; the statistics showed the same results for all DB hosts in the datacenter. Instead, I suspected an issue related to either the network or network security.

A possible firewall capacity problem was ruled out because firewall latency contribution is constant, and much lower than the latency introduced by the network. When we looked at the network, we saw that ping consistently showed 20 ms, regardless of the time of day or the traffic activity. Of course, in real life we do not merely exchange pings, but rather use actual SQL queries to communicate with DB hosts, so we looked at a tcpdump analysis of the DB hosts next. 

As the root cause of our problem was still unknown, there was no guarantee that the issue could be reproduced by using tcpdump while running queries manually. After trying twice and not getting enough information, we concluded the only reliable way to reproduce the issue was to run tcpdump while the batch job was running.

The last thing to do was to wait until the next high-volume batch run. Finally, we got enough data, and we almost immediately saw an answer (saving us from having to dive even deeper and look at MTU or window size as the cause of additional round trips).  Here is the analysis of the packets for a single query:

No.  Time     Source  Destination  Protocol  Length  Info
1    0.000000 [IP1]   [IP2]        TNS       549     Request, Data (6), Data
2    0.000714 [IP2]   [IP1]        TNS       852     Response, Data (6), Data
3    0.021257 [IP1]   [IP2]        TNS       71      Request, Data (6), Data
4    0.021577 [IP2]   [IP1]        TNS       153     Response, Data (6), Data

Processing the query required four packets spanning about 21.6 ms. Since the round-trip time (RTT) is about 20 ms, the first packet had already spent about 10 ms in transit from the first datacenter before it was captured as packet 1 in the second datacenter. After packet 4, there was another 10 ms of transit time back to the first datacenter. The total time for these four packets, as seen from the first datacenter, was therefore about 41.6 ms.
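The arithmetic can be reproduced directly from the capture timestamps. This short Python calculation uses the four times from the trace above and assumes, as in the text, a one-way transit time of 10 ms.

```python
# Capture timestamps (seconds) of the four packets, taken at the second datacenter.
packet_times = [0.000000, 0.000714, 0.021257, 0.021577]
ONE_WAY_MS = 10.0   # half of the ~20 ms round trip between the datacenters

span_ms = (packet_times[-1] - packet_times[0]) * 1000
client_total_ms = ONE_WAY_MS + span_ms + ONE_WAY_MS

# Gap between the first response leaving and the second request arriving:
# roughly one full round trip, i.e. the client answered with a second request.
gap_ms = (packet_times[2] - packet_times[1]) * 1000

print(f"capture span: {span_ms:.1f} ms")
print(f"client-side total: {client_total_ms:.1f} ms")
print(f"gap between packets 2 and 3: {gap_ms:.1f} ms")
```

The gap of about 20.5 ms between packets 2 and 3 is the telling number. It is one full round trip, which means the second request was sent only after the first response had reached the client.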

We saw that the SELECT statement was in packet 1 and that the response appeared to be in packet 2. We didn’t know what packets 3 and 4 were, but their small size suggested they were some kind of acknowledgement.

The solution

At this point, we had confirmation that for each query, there were four network packets instead of two. The fact that every execution took two round trips indicated that more soft-parsing was occurring than was necessary. The question was, why were commonly running queries getting parsed multiple times? Once parsed, a query should be added to the app server statement cache. We increased the statement cache size parameter, which dictates how many parsed SQL statements can remain in cache at a given time, from 80 to 500.
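To illustrate why a cache that is slightly too small can behave almost as badly as no cache at all, here is a toy simulation in Python. It assumes least-recently-used eviction and a made-up workload of 200 distinct statements executed round-robin. Neither assumption comes from our actual system; only the cache sizes of 80 and 500 do.

```python
from collections import OrderedDict


def hit_rate(cache_size, statements):
    """Fraction of executions served from an LRU statement cache."""
    cache, hits = OrderedDict(), 0
    for sql in statements:
        if sql in cache:
            hits += 1
            cache.move_to_end(sql)           # mark as most recently used
        else:
            cache[sql] = True                # "parse" and cache the statement
            if len(cache) > cache_size:
                cache.popitem(last=False)    # evict the least recently used
    return hits / len(statements)


# Hypothetical workload: 200 distinct statements executed round-robin, 50 times.
workload = [f"SELECT {i} FROM dual" for i in range(200)] * 50

small = hit_rate(80, workload)     # every statement is evicted before it is reused
large = hit_rate(500, workload)    # only the first pass over the statements misses
```

Under these assumptions the cache of 80 never produces a hit, while the cache of 500 serves 98% of executions. Real workloads are less regular, but the cliff-like behavior once the working set exceeds the cache size is typical of LRU caches.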

When the tuned settings went into effect, we were able to eliminate an extra cross-colo round trip for a majority of SQL requests. We immediately saw a positive impact, observing a 30% reduction in execution time for all instances of a batch job.

This experience demonstrates how even small changes in DB settings can positively affect execution response times. And for me, it was both challenging and interesting to drive problem resolution in an area where I did not have much prior experience.


The basic purpose of HTTP caching is to provide a mechanism for applications to scale better and perform faster. But HTTP caching is applicable only to idempotent requests, which makes a lot of sense; only idempotent and nullipotent requests yield the same result when run multiple times. In the HTTP world, this fact means that GET requests can be cached but POST requests cannot.

However, there can be cases where an idempotent request cannot be sent using GET simply because that request exceeds the limits imposed by popular Internet software. For example, search APIs typically take a lot of parameters, especially for a product with numerous characteristics, all of which have to be passed as parameters. This situation leads to the question, what then is the recommended way of communicating over the wire when the request contains more parameters than the “permitted” length of a GET request?  Here are some of the answers:

  • You may want to re-evaluate the interface design if it takes a large number of parameters. Idempotent requests typically require a number of parameters that falls well within the GET limits.
  • There is no such hard and fast limit imposed by the specification, so the HTTP specification is not to be blamed. Internet clients and servers do impose a limit. Some of them support up to 8 KB, but the safe bet is to keep the length under 2 KB.  
  • Send the body in a GET request.

At this point, we come to the realization that all of the above answers are unsatisfactory. They do not address the underlying problem or change the situation whatsoever.

HTTP caching basics

To appreciate the rest of this topic, let’s first go through the caching mechanism quickly.

HTTP caching involves the client, the proxy, and the server. In this post, we will discuss mainly the proxy, which sits between the client and server. Typically, reverse proxies are deployed close to the server, and forward proxies close to the client. Figure 1 shows the basic topology. From the figure, it should be clear that a cache-hit in the forward proxy saves bandwidth and reduces round-trip time (RTT) and latency; and a cache-hit at the reverse proxy reduces the load on the server.

forward cache proxy between client and internet, and reverse cache proxy between internet and server

Figure 1. Basic topology of proxy cache deployment in a network

The HTTP specification allows a response from cache if one of the following is satisfied:

  1. The cached response is consistent with the origin server’s response, had the origin server handled the request – in short, the proxy can guarantee a semantic equivalence between the cached response and the origin server’s response.
  2. The freshness is acceptable to the client.
  3. The freshness is not acceptable to the client but an appropriate warning is attached.

The specification has a number of flavors and associated headers and controls. Further details of the specification are available at http://tools.ietf.org/html/rfc2616, and of cache controls at http://tools.ietf.org/html/rfc2616#section-14.9.

A typical proxy caches idempotent requests. The proxy gets the request, examines it for cache headers, and sends it to the server. Then the proxy examines the response and, if it is cacheable, caches it with the URL as the key (along with some headers in certain cases) and the response as the value.  This scheme works well with GET requests, because for the same URL repeated invocation does not change the response. Intermediaries can make use of this idempotency to safely cache GET requests. But this is not the case with an idempotent POST request. The URL (and headers) cannot be used as the key because the response could be different – the same URL, but with a different body.

POST body digest

The solution is to digest the POST body (along with a few headers), append the digest to the URL, and use this combined value instead of just the URL as the cache key (see Figure 2). In other words, the cache key is modified to include a digest of the payload in addition to the URL. Subsequent requests with the same payload will hit the cache rather than the origin server. In practice, we add a few headers and their values to the cache key to establish uniqueness, as appropriate for the use case. Although we don’t have a specific algorithm recommendation, if MD5 is used to digest the body, then Content-MD5 could be used as a header.

Figure 2. Digest-based cache
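As a minimal sketch of the digest-based key (not the Apache Traffic Server customization itself), the idea can be expressed in Python. The function name cache_key, the choice of headers in KEY_HEADERS, the "#" separator, and the example URL are all assumptions made for illustration; MD5 is used only because the text mentions it.

```python
import hashlib

# Hypothetical choice of headers that take part in the key; the post only says
# "a few headers", so this tuple is an assumption for illustration.
KEY_HEADERS = ("accept", "content-type")


def cache_key(url: str, headers: dict, body: bytes) -> str:
    """Build a cache key from the URL plus a digest of selected headers and the body."""
    normalized = {name.lower(): value for name, value in headers.items()}
    digest = hashlib.md5()
    for name in KEY_HEADERS:
        digest.update(name.encode("utf-8"))
        digest.update(b":")
        digest.update(normalized.get(name, "").encode("utf-8"))
        digest.update(b"\n")
    digest.update(body)
    return f"{url}#{digest.hexdigest()}"


search = b'{"query": "camera", "brand": "Canon"}'
headers = {"Content-Type": "application/json", "Accept": "application/json"}

key_a = cache_key("http://api.example.com/search", headers, search)
key_b = cache_key("http://api.example.com/search", headers, search)
key_c = cache_key("http://api.example.com/search", headers, b'{"query": "lens"}')

print(key_a == key_b)  # True: same URL and payload give the same key
print(key_a == key_c)  # False: same URL, different payload
```

Because the payload takes part in the key, two POST requests to the same URL only share a cache entry when their bodies (and the selected headers) are identical.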

Now the problem is to distinguish idempotent POST requests from non-idempotent ones. There are a few ways to handle this problem:

  • Configure URLs and patterns in the proxy so that it does not cache if there is a match.
  • Add context-aware headers to distinguish between different requests.
  • Base the cache logic on some naming conventions. For example, APIs with names that start with words like “set”, “add”, and “delete” are not cached and will always hit the origin server.
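The first and third options can be combined in a small check, sketched here in Python. The URL patterns and the helper name is_cacheable_post are hypothetical; only the “set”, “add”, and “delete” prefixes come from the text.

```python
import re

# Hypothetical configuration; the patterns are illustrative only.
DO_NOT_CACHE_PATTERNS = [re.compile(r"^/checkout/"), re.compile(r"^/account/")]
NON_IDEMPOTENT_PREFIXES = ("set", "add", "delete")


def is_cacheable_post(path: str) -> bool:
    """Return True if a POST to this path may be treated as idempotent."""
    if any(pattern.search(path) for pattern in DO_NOT_CACHE_PATTERNS):
        return False
    # The API name is taken to be the last path segment (an assumption).
    api_name = path.rstrip("/").rsplit("/", 1)[-1].lower()
    return not api_name.startswith(NON_IDEMPOTENT_PREFIXES)


print(is_cacheable_post("/catalog/findItems"))    # True
print(is_cacheable_post("/catalog/addItem"))      # False
print(is_cacheable_post("/checkout/findTotals"))  # False
```

A prefix rule like this is deliberately conservative: an API whose name merely happens to start with “set” is sent to the origin server even if it is idempotent.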

Handling non-idempotent requests

Here’s how we solve the problem of non-idempotent requests:

  • Hit the origin server under any of the following circumstances:
    • if the URL is in the configured “DO NOT CACHE” URL list
    • if the digests do not match
    • after the cache expiry time
    • whenever a request to revalidate is received
  • Attach a warning saying the content could be stale, thereby accommodating the specification.
  • Allow users to hit the origin server by using our client-side tools to turn off the proxy.
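The four conditions for hitting the origin server might be sketched as follows. This is an in-memory illustration, not the proxy's actual logic; CacheEntry, lookup, and the example key are hypothetical names, and attaching the staleness warning is left out.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    response: bytes
    stored_at: float  # seconds since the epoch
    max_age: float    # seconds the entry is considered fresh


def lookup(cache: dict, key: str, path: str, do_not_cache: set,
           revalidate: bool = False, now: Optional[float] = None) -> Optional[bytes]:
    """Return a cached response, or None when the origin server must be hit."""
    now = time.time() if now is None else now
    if path in do_not_cache:                   # URL is on the "DO NOT CACHE" list
        return None
    entry = cache.get(key)
    if entry is None:                          # no entry with a matching digest
        return None
    if now - entry.stored_at > entry.max_age:  # cache expiry time has passed
        return None
    if revalidate:                             # a request to revalidate was received
        return None
    return entry.response


cache = {"/search#abc": CacheEntry(b"results", stored_at=100.0, max_age=60.0)}

print(lookup(cache, "/search#abc", "/search", set(), now=120.0))  # b'results'
print(lookup(cache, "/search#abc", "/search", set(), now=200.0))  # None (expired)
print(lookup(cache, "/search#xyz", "/search", set(), now=120.0))  # None (digest differs)
```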

We implemented this solution with Apache Traffic Server, customized to cache POST requests and the cache key.

Advantages

This solution provides the following benefits:

  • We speed up repeated requests by not performing the round trip from the proxy to the origin server.
  • As a hosted solution, one user’s request speeds not only that user’s subsequent requests but also the requests from other users, provided that the cache is set to be shared across requests and that the header permits it.
  • We save the bandwidth between the proxy and the origin server.

Here is a performance comparison for an API invocation with the solution deployed as a forward proxy and a total data transfer of 20 KB:

No Caching      With Caching
188 ms          90 ms

 Variants of this solution can be used to cache the request or response or both at the forward proxy, reverse proxy, or both.

Cache handshake

To get the full benefit, in this solution we deploy a forward proxy at the client end and a reverse proxy at the server end. The client sends the request to the forward proxy, and the proxy does a cache lookup. In the case of a cache miss, the forward proxy digests the body and sends only the digest to the reverse proxy. The reverse proxy looks for a match in the request cache and, if found, sends that request to the server. The difference is we don’t send the full request from the forward proxy to the reverse proxy.   

The server sends the response to the reverse proxy, which digests the response and sends only the digest – not the full response (see Figure 3). Essentially, we avoid sending the POST data, at the cost of an additional round trip (of the digest key) if the result is a cache miss. In most networks, the RTT between the client and the forward proxy and between the server and the reverse proxy is negligible when compared to the RTT between the client and the server. This is because the forward proxy and the client are typically close to each other and in the same LAN; likewise, the reverse proxy and the server are close to each other and in the same LAN. The network latency is between the proxies, where data travels through the Internet.

 Sending only digests across the Internet

Figure 3. Cache handshake
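The request half of the handshake can be sketched with two small in-memory classes. This is an illustration under stated assumptions, not the deployed implementation: the class and method names are hypothetical, the forward proxy's own response cache and the response-digest direction are omitted, and bytes_sent simply counts what would cross the link between the proxies.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


class ReverseProxy:
    """Sits next to the origin server and remembers request bodies by digest."""

    def __init__(self, origin):
        self.origin = origin  # callable: request body -> response body
        self.requests = {}    # request digest -> request body

    def handle_digest(self, request_digest):
        """Return the response, or None if the full request body is needed."""
        body = self.requests.get(request_digest)
        if body is None:
            return None
        return self.origin(body)

    def handle_full(self, body):
        self.requests[digest(body)] = body
        return self.origin(body)


class ForwardProxy:
    """Sits next to the client and sends digests across the slow link."""

    def __init__(self, reverse):
        self.reverse = reverse
        self.bytes_sent = 0  # request bytes sent across the link

    def post(self, body):
        request_digest = digest(body)
        self.bytes_sent += len(request_digest)
        response = self.reverse.handle_digest(request_digest)
        if response is None:  # miss: pay one extra round trip with the full body
            self.bytes_sent += len(body)
            response = self.reverse.handle_full(body)
        return response


def origin(body):
    return b"response for " + body


reverse = ReverseProxy(origin)
forward = ForwardProxy(reverse)
payload = b"x" * 20_000

forward.post(payload)
after_first = forward.bytes_sent
forward.post(payload)
after_second = forward.bytes_sent

print(after_first)                 # 20032: digest plus full body on the first request
print(after_second - after_first)  # 32: digest only on the repeat
```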

This solution can also be applied to just one proxy on either side, at the cost of client or server modification. In such cases, the client or server will have to send the digest instead of the whole body. With the two-proxy architecture, the client and server remain unchanged and, as a result, any HTTP client or server can be optimized.

POST requests are typically large. By not having the proxy send the whole request and the whole response, we not only save bandwidth but also reduce the response time involved in large requests and responses.

Although the savings might seem trivial at first glance, they are not so when it comes to real traffic loads. As we saw, even if the POST response is not cached, we still save bandwidth by not sending the payload. This solution gets even more interesting with distributed caches deployed within the network.

Advantages

Here is a summary of the benefits of a cache-handshaking topology:

  • The request payload travels to the reverse proxy only if there is a cache miss. As a result, both the RTT and bandwidth are improved. The same applies to the response.
  • As a hosted solution, one user’s request can spare other users’ requests from travelling in full between the proxies.
  • There is no technical debt involved. If you remove the proxies, you have a fully HTTP-compliant solution.

Conclusion

HTTP caching is not just for GET requests. By digesting the POST body, handling non-idempotent requests, and distinguishing between idempotent and non-idempotent requests, you can realize substantial savings in round trips and bandwidth. For further savings, you can employ cache handshaking to send only the digest across the Internet. Ideally, this means implementing a forward proxy on the client side and a reverse proxy on the server side, but one proxy is sufficient to make a difference.


In the first part, we covered a few fundamental practices and walked through a detailed example to help you get started with Cassandra data model design. You can follow Part 2 without reading Part 1, but I recommend glancing over the terms and conventions I’m using. If you’re new to Cassandra, I urge you to read Part 1.

Some of the practices listed below might evolve in the future. I’ve provided related JIRA ticket numbers so you can watch any evolution.

With that, let’s get started with some basic practices!

Storing values in column names is perfectly OK

Leaving column values empty (“valueless” columns) is also OK.

It’s a common practice with Cassandra to store a value (actual data) in the column name (a.k.a. column key), and even to leave the column value field empty if there is nothing else to store. One motivation for this practice is that column names are stored physically sorted, but column values are not.

Notes:

  • The maximum column key (and row key) size is 64KB.  However, don’t store something like ‘item description’ as the column key!
  • Don’t use timestamp alone as a column key. You might get colliding timestamps from two or more app servers writing to Cassandra. Prefer timeuuid (type-1 uuid) instead.
  • The maximum column value size is 2 GB. But because there is no streaming and the whole value is fetched into heap memory when requested, limit the size to only a few MBs.  (Large objects are not likely to be supported in the near future – Cassandra-265. However, the Astyanax client library supports large objects by chunking them.)

Leverage wide rows for ordering, grouping, and filtering

But don’t go too wide.

This goes along with the above practice. When actual data is stored in column names, we end up with wide rows.

Benefits of wide rows:

  • Since column names are stored physically sorted, wide rows enable ordering of data and hence efficient filtering (range scans). You’ll still be able to efficiently look up an individual column within a wide row, if needed.
  • If data is queried together, you can group that data up in a single wide row that can be read back efficiently, as part of a single query. As an example, for tracking or monitoring some time series data, we can group data by hour/date/machines/event types (depending on the requirements)  in a single wide row, with each column containing granular data or roll-ups. We can also further group data within a row using super or composite columns as discussed later.
  • Wide row column families are heavily used (with composite columns) to build custom indexes in Cassandra.
  • As a side benefit, you can de-normalize a one-to-many relationship as a wide row without data duplication. However, I would do this only when data is queried together and you need to optimize read performance.

Example:

Let’s say we want to store some event log data and retrieve that data hourly. As shown in the model below, the row key is the hour of the day, the column name holds the time when the event occurred, and the column value contains payload. Note that the row is wide and the events are ordered by time because column names are stored sorted. Granularity of the wide row (for this example, per hour rather than every few minutes) depends on the use case, traffic, and data size, as discussed next.
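As a rough illustration of this model, the following Python sketch imitates one wide row with a toy in-memory store; it is not Cassandra client code. WideRowStore and its methods are hypothetical names, the row key uses the “ddmmyyhh” format, and plain time strings stand in for column names to keep the example short (a real model would prefer timeuuid, as noted earlier).

```python
import bisect
from collections import defaultdict


class WideRowStore:
    """Toy in-memory stand-in for a column family: row key -> sorted columns."""

    def __init__(self):
        self._names = defaultdict(list)   # row key -> sorted column names
        self._values = defaultdict(dict)  # row key -> {column name: value}

    def insert(self, row_key, column_name, value):
        if column_name not in self._values[row_key]:
            bisect.insort(self._names[row_key], column_name)
        self._values[row_key][column_name] = value

    def slice(self, row_key, start, end):
        """Return (name, value) pairs with start <= name < end, in sorted order."""
        names = self._names[row_key]
        lo = bisect.bisect_left(names, start)
        hi = bisect.bisect_left(names, end)
        return [(n, self._values[row_key][n]) for n in names[lo:hi]]


events = WideRowStore()
# Row key: hour of the day; column name: event time; column value: payload.
events.insert("30071214", "14:45:10", "logout")
events.insert("30071214", "14:05:02", "login")
events.insert("30071214", "14:20:37", "search")

print(events.slice("30071214", "14:00:00", "14:30:00"))
# [('14:05:02', 'login'), ('14:20:37', 'search')]
```

Events come back in time order regardless of insertion order because the column names are kept sorted, which is what makes the range scan cheap.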

But not too wide, as a row is never split across nodes:

It’s hard to say exactly how wide a wide row should be, partly because it’s dependent upon the use case. But here’s some advice:

Traffic: All of the traffic related to one row is handled by only one node/shard (by a single set of replicas, to be more precise). Rows that are too “fat” could cause hot spots in the cluster – usually when the number of rows is smaller than the size of the cluster (hope not!), or when wide rows are mixed with skinny ones, or some rows become hotter than others. However, cluster load balancing ultimately depends on the row key selection; conversely, the row key also defines how wide a row will be. So load balancing is something to keep in mind during design.

Size: As a row is not split across nodes, data for a single row must fit on disk within a single node in the cluster. However, rows can be large enough that they don’t have to fit in memory entirely. Cassandra allows 2 billion columns per row. At eBay, we’ve not done any “wide row” benchmarking, but we model data such that we never hit more than a few million columns or a few megabytes in one row (we change the row key granularity, or we split into multiple rows). If you’re interested, Cassandra Query Plans by Aaron Morton shows some performance concerns with wide rows (but note that the results can change in new releases).

However, these caveats don’t mean you should not use wide rows; just don’t go extra wide.

Note: Cassandra-4176 might add composite types for row key in CQL as a way to split a wide row into multiple rows. However, a single (physical) row is never split across nodes (and won’t be split across nodes in the future), and is always handled by a single set of replicas. You might also want to track Cassandra-3929, which would add row size limits for keeping the most recent n columns in a wide row.

Choose the proper row key – it’s your “shard key”

Otherwise, you’ll end up with hot spots, even with RandomPartitioner.

Let’s consider again the above example of storing time series event logs and retrieving them hourly. We picked the hour of the day as the row key to keep one hour of data together in a row. But there is an issue: All of the writes will go only to the node holding the row for the current hour, causing a hot spot in the cluster. Reducing granularity from hour to minutes won’t help much, because only one node will be responsible for handling writes for whatever duration you pick. As time moves, the hot spot might also move but it won’t go away!

Bad row key:  “ddmmyyhh”

One way to alleviate this problem is to add something else to the row key – an event type, machine id, or similar value that’s appropriate to your use case.

Better row key: “ddmmyyhh|eventtype”

Note that now we don’t have global time ordering of events, across all event types, in the column family. However, this may be OK if the data is viewed (grouped) by event type later. If the use case also demands retrieving all of the events (irrespective of type) in time sequence, we need to do a multi-get for all event types for a given time period, and honor the time order when merging the data in the application.

If you can’t add anything to the row key or if you absolutely need ‘time period’ as a row key, another option is to shard a row into multiple (physical) rows by manually splitting row keys: “ddmmyyhh | 1”, “ddmmyyhh | 2”,… “ddmmyyhh | n”, where n is the number of nodes in the cluster. For an hour window, each shard will now evenly handle the writes; you need to round-robin among them. But reading data for an hour will require multi-gets from all of the splits (from the multiple physical nodes) and merging them in the application. (An assumption here is that RandomPartitioner is used, and therefore that range scans on row keys can’t be done.)
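Both row key schemes can be sketched as key-building helpers in Python. The function and class names are hypothetical, and the keys are written without spaces around the separator; how the keys are then written to and read from Cassandra depends on the client library and is not shown.

```python
import itertools
from datetime import datetime


def hour_bucket(ts: datetime) -> str:
    """Format a timestamp as the 'ddmmyyhh' bucket used in the text."""
    return ts.strftime("%d%m%y%H")


def key_by_event_type(ts: datetime, event_type: str) -> str:
    return f"{hour_bucket(ts)}|{event_type}"


class ShardedKeys:
    """Round-robin over n manual shards of a time-bucket row."""

    def __init__(self, shard_count: int):
        self.shard_count = shard_count
        self._next = itertools.cycle(range(1, shard_count + 1))

    def write_key(self, ts: datetime) -> str:
        return f"{hour_bucket(ts)}|{next(self._next)}"

    def read_keys(self, ts: datetime) -> list:
        """All physical row keys to multi-get when reading one hour."""
        return [f"{hour_bucket(ts)}|{i}" for i in range(1, self.shard_count + 1)]


ts = datetime(2012, 7, 30, 14, 5)
shards = ShardedKeys(3)

print(key_by_event_type(ts, "login"))            # 30071214|login
print([shards.write_key(ts) for _ in range(4)])  # shards 1, 2, 3, then 1 again
print(shards.read_keys(ts))                      # all three shard keys
```

The application still has to merge the results of the multi-get by column name to restore time order across shards.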

Keep read-heavy data separate from write-heavy data

This way, you can benefit from Cassandra’s off-heap row cache.

Irrespective of caching and even outside the NoSQL world, it’s always a good practice to keep read-heavy and write-heavy data separate because they scale differently.

Notes:

  • A row cache is useful for skinny rows, but harmful for wide rows today because it pulls the entire row into memory. Cassandra-1956 and Cassandra-2864 might change this in future releases. However, the practice of keeping read-heavy data separate from write-heavy data will still stand.
  • Even if you have lots of data (more than available memory) in a column family but you also have particularly “hot” rows, enabling a row cache might be useful.

Make sure column key and row key are unique

Otherwise, data could get accidentally overwritten.

  • In Cassandra (a distributed database!), there is no unique constraint enforcement for row key or column key.
  • Also, there is no separate update operation (no in-place updates!). It’s always an upsert (mutate) in Cassandra. If you accidentally insert data with an existing row key and column key, the previous column value will be silently overwritten without any error (the change won’t be versioned; the data will be gone).

Use the proper comparator and validator

Don’t just use the default BytesType comparator and validator unless you really need to.

In Cassandra, the data type for a column value (or row key)  is called a Validator. The data type for a column name is called a Comparator.  Although Cassandra does not require you to define both, you must at least specify the comparator unless your column family is static (that is, you’re not storing actual data as part of the column name), or unless you really don’t care about the sort order.

  • An improper comparator will sort column names inappropriately on the disk. It will be difficult (or impossible) to do range scans on column names later.
  • Once defined, you can’t change a comparator without rewriting all data. However, the validator can be changed later.

See comparators and validators in the Cassandra documentation for the supported data types.

Keep the column name short

Because it’s stored repeatedly.

This practice doesn’t apply if you use the column name to store actual data. Otherwise, keep the column name short, since it’s repeatedly stored with each column value. Memory and storage overhead can be significant when the size of the column value is not much larger than the size of the column name – or worse, when it’s smaller. 

For example, favor ‘fname’ over ‘firstname’, and ‘lname’ over ‘lastname’.

Note: Cassandra-4175 might make this practice obsolete in the future.

Design the data model such that operations are idempotent

Or, make sure that your use case can live with inaccuracies or that inaccuracies can be corrected eventually.

In an eventually consistent and fully distributed system like Cassandra, idempotent operations can help – a lot. Idempotent operations allow partial failures in the system, as the operations can be retried safely without changing the final state of the system. In addition, idempotency can sometimes alleviate the need for strong consistency and allow you to work with eventual consistency without causing data duplication or other anomalies. Let’s see how these principles apply in Cassandra. I’ll discuss partial failures only, and leave out alleviating the need for strong consistency until an upcoming post, as it is very much dependent on the use case.

Because of  Cassandra’s fully distributed (and multi-master) nature, write failure does not guarantee that data is not written, unlike the behavior of relational databases. In other words, even if the client receives a failure for a write operation, data might be written to one of the replicas, which will eventually get propagated to all replicas. No rollback or cleanup is performed on partially written data. Thus, a perceived write failure can result in a successful write eventually. So, retries on write failure can yield unexpected results if your model isn’t update idempotent.

Notes:

  • “Update idempotent” here means a model where operations are idempotent. An operation is called idempotent if it can be applied one time or multiple times with the same result.
  • In most cases, idempotency won’t be a concern, as writes into regular column families are always update idempotent. The exception is with the Counter column family, as shown in the example below. However, sometimes your use case can model data such that write operations are not update idempotent from the use case perspective. For instance, in part 1, User_by_Item and Item_by_User in the final model are not update idempotent if the use case operation ‘user likes item’ gets executed multiple times, as the timestamp might differ for each like. However, note that a specific instance of  the use case operation ‘user likes item’ is still idempotent, and so can be retried multiple times in case of failures. As this is more use-case specific, I might elaborate more in future posts.
  • Even with a consistency level ONE, write failure does not guarantee data is not written; the data still could get propagated to all replicas eventually.

Example

Suppose that we want to count the number of users who like a particular item. One way is to use the Counter column family supported by Cassandra to keep count of users per item. Since the counter increment (or decrement) is not update idempotent, retry on failure could yield an over-count if the previous increment was successful on at least one node. One way to make the model update idempotent is to maintain a list of user ids instead of incrementing a count, as shown below. Whenever a user likes an item, we write that user’s id against the item; if the write fails, we can safely retry. To determine the count of all users who like an item, we read all user ids for the item and count manually.
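The difference between the two models under retry can be simulated in plain Python. The classes below are hypothetical stand-ins for a Counter column family and a column family keyed by user id; like_with_retry imitates a write that was applied on a replica but reported as failed, and then retried by the client.

```python
class CounterModel:
    """Not update idempotent: every applied increment changes the total."""

    def __init__(self):
        self.likes = {}

    def like(self, item_id, user_id):
        self.likes[item_id] = self.likes.get(item_id, 0) + 1

    def count(self, item_id):
        return self.likes.get(item_id, 0)


class UserIdModel:
    """Update idempotent: writing the same user id twice leaves one column."""

    def __init__(self):
        self.liked_by = {}

    def like(self, item_id, user_id):
        self.liked_by.setdefault(item_id, set()).add(user_id)

    def count(self, item_id):
        return len(self.liked_by.get(item_id, ()))


def like_with_retry(model, item_id, user_id):
    # Simulate a write that was applied but reported as failed, then retried.
    model.like(item_id, user_id)
    model.like(item_id, user_id)


counter, user_ids = CounterModel(), UserIdModel()
like_with_retry(counter, "item-1", "user-42")
like_with_retry(user_ids, "item-1", "user-42")

print(counter.count("item-1"))   # 2: over-counted after the retry
print(user_ids.count("item-1"))  # 1: the retry had no further effect
```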

 

In the above update idempotent model, getting the counter value requires reading all user ids, which will not perform well (there could be millions). If reads are heavy on the counter and you can live with an approximate count, the counter column will be efficient for this use case. If needed, the counter value can be corrected periodically by counting the user ids from the update idempotent column family.

Note: Cassandra-2495 might add a proper retry mechanism for counters in the case of a failed request. However, in general, this practice will continue to hold true. So make sure to always litmus-test your model for update idempotency.

Model data around transactions, if needed

But this might not always be possible, depending on the use case.

Cassandra has no multi-row, cluster-wide transaction or rollback mechanism; instead, it offers row-level atomicity. In other words, a single mutation operation of columns for a given row key is atomic. So if you need transactional behavior, try to model your data such that you would only ever need to update a single row at once. However, depending on the use case, this is not always doable. Also, if your system needs ACID transactions, you might re-think your database choice.

Note: Cassandra-4285 might add an atomic, eventually consistent batch operation.

Decide on the proper TTL up front, if you can

Because it’s hard to change TTL for existing data.

In Cassandra, TTL (time to live) is not defined or set at the column family level. It’s set per column value, and once set it’s hard to change; or, if not set, it’s hard to set for existing data. The only way to change the TTL for existing data is to read and re-insert all the data with a new TTL value. So think about your purging requirements, and if possible set the proper TTL for your data upfront.

Note: Cassandra-3974 might introduce TTL for the column family, separate from column TTL.

Don’t use the Counter column family to generate surrogate keys

Because it’s not intended for this purpose.

The Counter column family holds distributed counters meant (of course) for distributed counting. Don’t try to use this CF to generate sequence numbers for surrogate keys, like Oracle sequences or MySQL auto-increment columns. You will receive duplicate sequence numbers! Most of the time you really don’t need globally sequential numbers. Prefer timeuuid (type-1 uuid) as surrogate keys. If you truly need a globally sequential number generator, there are a few possible mechanisms; but all will require centralized coordination, and thus can impact the overall system’s scalability and availability.
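For reference, a type-1 UUID can be generated client-side with Python's standard uuid module, with no coordination between app servers. The helper name new_surrogate_key is hypothetical; how the value is passed to Cassandra depends on the client library in use.

```python
import uuid


def new_surrogate_key() -> uuid.UUID:
    """Generate a type-1 (time-based) UUID, the kind Cassandra calls timeuuid."""
    return uuid.uuid1()


keys = [new_surrogate_key() for _ in range(1000)]

print(keys[0].version)  # 1
print(len(set(keys)))   # number of distinct keys generated
```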

Favor composite columns over super columns

Otherwise, you might hit performance bottlenecks with super columns.

A super column in Cassandra can be used to group column keys, or to model a two-layer hierarchy. However, super columns have the following implementation issues and are therefore becoming less favorable.

 Issues:

  • Sub-columns of a super column are not indexed. Reading one sub-column de-serializes all sub-columns.
  • Built-in secondary indexing does not work with sub-columns.
  • Super columns cannot encode more than two layers of hierarchy.

Similar (even better) functionality can be achieved by the use of the Composite column. It’s a regular column with sub-columns encoded in it. Hence, all of the benefits of regular columns, such as sorting and range scans, are available; and you can encode more than two layers of hierarchy.

Note: Cassandra-3237 might change the underlying super column implementation to use composite columns. However, composite columns will still remain preferred over super columns.

The order of sub-columns in composite columns matters

Because order defines grouping.

For example, a composite column key like <state|city> will be stored ordered first by state and then by city, rather than first by city and then by state. In other words, all the cities within a state are located (grouped) on disk together.
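Python tuples sort component by component, so they make a convenient stand-in for showing the effect of component order. The city and state values below are arbitrary sample data.

```python
from itertools import groupby

# Composite column names as (state, city) tuples, in arbitrary insertion order.
columns = [("TX", "Austin"), ("CA", "San Jose"), ("TX", "Dallas"), ("CA", "Fresno")]

# <state|city>: sorted by state first, so each state's cities are contiguous.
by_state_then_city = sorted(columns)
for state, group in groupby(by_state_then_city, key=lambda c: c[0]):
    print(state, [city for _, city in group])
# CA ['Fresno', 'San Jose']
# TX ['Austin', 'Dallas']

# <city|state>: sorted by city first, so states are interleaved.
by_city_then_state = sorted((city, state) for state, city in columns)
print(by_city_then_state)
```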

Favor built-in composite types over manual construction

Because manual construction doesn’t always work.

Avoid manually constructing the composite column keys using string concatenation (with separators like “:” or “|”). Instead, use the built-in composite types (and comparators) supported by Cassandra 0.8.1 and above.

Why?

  • Manual construction won’t work if sub-columns are of different data types. For example, the composite key <state|zip|timeuuid> will not be sorted in a type-aware fashion (state as string, zip code as integer, and timeuuid as time).
  • You can’t reverse the sort order on components in the type – for instance, with the state ascending and the zip code descending in the above key.
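Both problems can be seen in a small Python comparison, where tuples of typed values play the role of a built-in composite type. The state and integer zip values are illustrative only.

```python
# (state, zip as integer) pairs; values chosen to have different digit counts.
columns = [("NY", 10001), ("NY", 501), ("NY", 6390)]

# Manual construction: everything becomes one string and sorts as text.
manual = sorted(f"{state}|{zip_code}" for state, zip_code in columns)
print(manual)  # ['NY|10001', 'NY|501', 'NY|6390']

# Type-aware comparison: the zip component sorts as an integer.
typed = sorted(columns)
print(typed)   # [('NY', 501), ('NY', 6390), ('NY', 10001)]

# Per-component ordering: state ascending, zip descending.
state_asc_zip_desc = sorted(columns, key=lambda c: (c[0], -c[1]))
print(state_asc_zip_desc)  # [('NY', 10001), ('NY', 6390), ('NY', 501)]
```

With concatenated strings, the integer component is compared character by character, so 10001 sorts before 501, and there is no way to reverse one component without reversing the whole key.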

Note: Cassandra built-in composite types come in two flavors:

  • Static composite type: Data types for each part of a composite column are predefined per column family.  All the column names/keys within a column family must be of that composite type.
  • Dynamic composite type: This type allows mixing column names with different composite types in a column family or even in one row.

Find more information about composite types at Introduction to composite columns.

Favor static composite types over dynamic, whenever possible

Because dynamic composites are too dynamic.

If all column keys in a column family are of the same composite type, always use static composite types. Dynamic composite types were originally created to keep multiple custom indexes in one column family. If possible, don’t mix different composite types in one row using the dynamic composite type unless absolutely required. Cassandra-3625 has fixed some serious issues with dynamic composites.

Note: CQL 3 supports static composite types for column names via clustered (wide) rows. Find more information about how CQL 3 handles wide rows at DataStax docs.

Enough for now. I would appreciate your inputs to further enhance these modeling best practices, which guide our Cassandra utilization today.

–  Jay Patel, architect@eBay.

 


eBay Analytics Platform & Delivery (APD) has a roughly 100-person division within the eBay China Technology Center of Excellence (COE). In September of 2008, the APD COE team started its first  Scrum pilot project – one of the earliest Agile pilots within eBay. Since then, the team has completed its transition to the Scrum framework of Agile software development.

Scrum, a term borrowed from rugby, embodies the spirit of the Agile Manifesto and has become mainstream within the Agile community. As the rugby metaphor implies, a Scrum team works as a jelled unit —  crossing functional silos, holding each other accountable, supporting each other —  to move the ball forward and achieve the goal together. Play as a team, win as a team. 

The individual performance dilemma

Soon after the pilot, people started to realize that Scrum was not simply “yet another development process.” Agile/Scrum represents a new way of working in a broader sense and demands changes on every level — not only day-to-day development but also the way we do management. A big question soon emerged from senior management: “Scrum is all about the ‘team’. People self-organize and share the team’s performance. But how about the individual’s performance within the team? I’m not supposed to micro-manage each person, but it seems the Scrum team becomes a ‘black hole’ to me, and I lose sight of each team member’s performance behind the ‘event horizon’.”

Not coincidentally, this question has been a common one for companies adopting Agile since the beginning. It’s almost an inevitable topic at every Agile event. After hearing opinions, arguments, and debates ranging from setting “mission impossible” goals for individuals to completely abandoning individual appraisals, the APD China management team drew its own conclusions. In late 2011, managers started implementing a new framework for individual and team performance evaluation. The framework has four main components: product success (which is shared by team members), peer feedback (which distinguishes among team members), self-development, and management tasks.  Among these components, the peer feedback within teams becomes a much more significant measure of individual performance. This blog post focuses on the peer feedback component.

Using peer feedback to evaluate individual performance is not new. However, it becomes much more useful and meaningful in an Agile context. The basic idea behind it is that team members who work closely with you on a daily basis are the ones who know the most about your performance in making the team successful; thus, they can give the most meaningful feedback and evaluation. In addition, performance is not one or two managers’ judgment any more, but rather the aggregated evaluation of the other members of the same team. The “wisdom of the crowd” based on day-to-day facts tends to generate better accuracy.

How did we do it?

We needed a peer feedback system that would support summarizing, quantifying, and analyzing any number of responses. We decided to start with SurveyMonkey, a simple and free solution. Then we developed our own internal system to better suit our requirements.

The final step in implementing peer feedback for Scrum teams was determining what to ask. Since the feedback form is basically a survey, we held a “survey about the survey” to learn what people thought should be asked about their own performance. The results boiled down to the following eight question areas:

  • Q1:  Communication — This is the foundation of human interaction and teamwork.
  • Q2:  Quality — One person’s defects have a negative impact on the other team members, and ultimately on the overall quality and productivity of the team.
  • Q3:  Collaboration — We value building consensus and seeking win-win outcomes over just getting one’s own work done (i.e., “self-suboptimizing”: focusing on one’s own tasks rather than considering the team as a whole).
  • Q4:  Continuous Improvement — By improving oneself and helping others to improve, the capabilities of the overall team increase.
  • Q5:  Role Sharing — The willingness and ability to share responsibilities bi-directionally outside of one’s functional silo makes the team more robust.
  • Q6:  Energizing — An individual can positively influence the team, especially in tough times, instead of finger-pointing and dragging down team morale.
  • Q7:  Overall Satisfaction — “If you had a choice, would you continue working with this team member?”
  • Q8:  Other Comments

Questions 1 through 6 represent the teamwork behaviors that we value the most. You might wonder why there’s no question about how much a team member contributes to the team. There’s a good reason for that. Measuring actual work contribution and delivery, such as the complexity of a completed task, is related to job seniority more than to teamwork behaviors. A newly graduated junior programmer might not be able to independently design an excellent solution to a complex requirement; however, that person can be the glue enabling the team to come up with a brilliant solution by working together. On the other hand, a senior architect might prevent the team’s success by not listening to a second opinion due to ego or status. We want highly functioning teams that can produce more and better results than the individuals combined could do, but that outcome is impossible without the positive teamwork behaviors that we believe in. That’s why the questions are all about teamwork; Agile/Scrum is about teamwork.

Questions 1 through 7 are all multiple choice on a scale of 1 to 10. Question 8 is free-form text. After we replaced SurveyMonkey with our own system, we added a free-text comment area for each multiple choice. The combination provides the advantage of both quantifying and qualifying, enabling us to do data analysis as well as to drill down to detailed information and facts. The way we organize each question’s answer set also lets the respondent give relative feedback by comparing each team member on one dimension. For example:

Question 1, with rows for separate ratings of each team member

The peer feedback survey is sent monthly, a pretty high frequency that allows us to measure the “pulse” of performance and take necessary actions in time. Another reason for this frequency is to avoid recency bias, where only the most recent performance counts while the past fades in people’s memory. The performance trend over a longer period, such as a half-year, is now visible; a consistently upward trend indicates better performance than a fluctuating one, even though the average over the period may be the same.

The feedback survey is strictly anonymous and confidential. I believe that a perfectly mature team could openly discuss each other’s performance face to face, and that comments on areas for improvement could be treated as gifts without hard feelings. However, it’s more important to create a safe environment for people to offer frank and open evaluation; there are also culture and background factors to consider. Another reason for private feedback is to avoid the “anchoring effect,” where the first comment in a group discussion anchors the following ones.

What do we get from it?

After piloting in SurveyMonkey for several months and then officially switching to the internal system four months ago, we’ve gained much more than we had expected. The peer feedback results not only help the management team get much more insight into each individual’s performance, but also help identify and fix team-level issues that have more profound and meaningful impact on our ability to improve our work.

As a data analytics organization, naturally we utilize techniques for visualizing the survey results. First, let’s see the most detailed information at the individual level – which was our initial motivation for creating this peer feedback system. Each solid blue/red line represents one question, and the dashed line is the overall trend over the four months:

dashed line showing 4-month trend per individual

Here’s a deep dive into the question results for each individual:

4-month trend line per question, per team member

Now let’s move to a bit higher level to see the full picture of each team and its team members: 

heat map per question, per team member

The above “heat map” reveals a lot of information. For example:

  • Different teams have different characteristics. The members of Team 12 evaluate each other similarly and tend to measure high on the scale, indicating good teamwork sentiment. Team 10, on the other hand, might still be in an earlier forming/storming stage. Team 11 has high scores for the majority; but notice there’s one problematic team member whom the rest of the team give lower evaluations.
  • Different teams have different teamwork challenges. Team 10 has lower scores (bigger issues) on Q4 (Continuous Improvement) and Q6 (Energizing), while Team 11 may need to improve on Q5 (Role Sharing).

Next, let’s go to an even higher level to see how teams are doing. The following graph shows the average score per team per question, sorted by the average score per team in descending order:

average score per team per question

The above graph indicates the overall sentiment among the team members toward each other. A lower average score (lighter green) may indicate lower satisfaction among the team peers.

This final graph shows the score standard deviation per team per question, sorted by the grand total of the score standard deviation per team in descending order:

score standard deviation per team per question

This graph is an indication of the variation in how team members evaluated each other. Higher variation (darker red) means the evaluation numbers that team members gave to a question were more variable; this variability may indicate the existence of outliers, or of specific issues between specific team members.
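Both team-level summaries reduce to a mean and a standard deviation over the scores that a team’s members gave on one question. Here is a minimal sketch of that arithmetic, assuming a flat list of rating records. The field names (team, question, score) and the sample numbers are hypothetical, and the choice of population rather than sample standard deviation is also an assumption, since the post does not say which one the internal system uses.

```javascript
// Hypothetical rating records: one row per answer given to a teammate.
const ratings = [
  { team: "Team 10", question: "Q1", score: 6 },
  { team: "Team 10", question: "Q1", score: 9 },
  { team: "Team 10", question: "Q1", score: 3 },
  { team: "Team 12", question: "Q1", score: 8 },
  { team: "Team 12", question: "Q1", score: 9 },
  { team: "Team 12", question: "Q1", score: 8 },
];

// Group scores by "team|question", then compute the mean and the
// population standard deviation of each group.
function summarize(rows) {
  const groups = new Map();
  for (const { team, question, score } of rows) {
    const key = team + "|" + question;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(score);
  }
  const summary = {};
  for (const [key, scores] of groups) {
    const mean = scores.reduce((a, b) => a + b, 0) / scores.length;
    const variance =
      scores.reduce((a, s) => a + (s - mean) ** 2, 0) / scores.length;
    summary[key] = { mean, stdDev: Math.sqrt(variance) };
  }
  return summary;
}

const summary = summarize(ratings);
```

In this made-up data, Team 10 has both a lower mean and a higher standard deviation on Q1 than Team 12, which is the kind of contrast the two graphs are meant to surface.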

Conclusion

Companies want to benefit from Scrum’s focus on highly effective teams. But companies also need visibility into the performance of each team member. In true Agile fashion, eBay’s APD team in China has incrementally developed a peer feedback system that sheds light on both team and individual strengths and weaknesses. As a result, problems can be pinpointed and addressed more accurately and quickly.


This is the first in a series of posts on Cassandra data modeling, implementation, operations, and related practices that guide our Cassandra utilization at eBay. Some of these best practices we’ve learned from public forums, many are new to us, and a few still are arguable and could benefit from further experience.

In this part, I’ll cover a few basic practices and walk through a detailed example. Even if you don’t know anything about Cassandra, you should be able to follow almost everything.

A few words on Cassandra @ eBay

We’ve been trying out Cassandra for more than a year. Cassandra is now serving a handful of use cases ranging from write-heavy logging and tracking, to mixed workload. One of them serves our “Social Signal” project, which enables like/own/want features on eBay product pages. A few use cases have reached production, while more are in development.

Our Cassandra deployment is not huge, but it’s growing at a healthy pace. In the past couple of months, we’ve deployed dozens of nodes across several small clusters spanning multiple data centers. You may ask, why multiple clusters? We isolate clusters by functional area and criticality. Use cases with similar criticality from the same functional area share the same cluster, but reside in different keyspaces.

RedLaser, Hunch, and other eBay adjacencies are also trying out Cassandra for various purposes. In addition to Cassandra, we also utilize MongoDB and HBase. I won’t discuss these now, but suffice it to say we believe each has its own merit.

I’m sure you have more questions at this point. But I won’t tell you the full story yet. At the upcoming Cassandra summit, I’ll go into detail about each use case, the data model, multi-datacenter deployment, lessons learned, and more.

The focus of this post is Cassandra data modeling best practices that we follow at eBay. So, let’s jump in with a few notes about terminology and representations I’ll be using for each post in this series.

Terms and Conventions

  • The terms “Column Name” and “Column Key” are used interchangeably. Similarly, “Super Column Name” and “Super Column Key” are used interchangeably.
  • The following layout represents a row in a Column Family (CF):
  • The following layout represents a row in a Super Column Family (SCF):
  • The following layout represents a row in a Column Family with composite columns. Parts of a composite column are separated by ‘|’. Note that this is just a representation convention; Cassandra’s built-in composite type encodes differently, not using ‘|’. (BTW, this post doesn’t require you to have detailed knowledge of super columns and composite columns.)

With that, let’s start with the first practice!

Don’t think of a relational table

Instead, think of a nested, sorted map data structure.

The following relational model analogy is often used to introduce Cassandra to newcomers:

This analogy helps make the transition from the relational to non-relational world. But don’t use this analogy while designing Cassandra column families. Instead, think of the Cassandra column family as a map of a map: an outer map keyed by a row key, and an inner map keyed by a column key. Both maps are sorted.

SortedMap<RowKey, SortedMap<ColumnKey, ColumnValue>>

Why?

A nested sorted map is a more accurate analogy than a relational table, and will help you make the right decisions about your Cassandra data model.

How?

  • A map gives efficient key lookup, and the sorted nature gives efficient scans. In Cassandra, we can use row keys and column keys to do efficient lookups and range scans.
  • The number of column keys is unbounded. In other words, you can have wide rows.
  • A key can itself hold a value. In other words, you can have a valueless column.

Range scan on row keys is possible only when data is partitioned in a cluster using the Order Preserving Partitioner (OPP). OPP is almost never used. So, you can think of the outer map as unsorted:

Map<RowKey, SortedMap<ColumnKey, ColumnValue>>

As mentioned earlier, there is something called a “Super Column” in Cassandra. Think of this as a grouping of columns, which turns our two nested maps into three nested maps as follows:

Map<RowKey, SortedMap<SuperColumnKey, SortedMap<ColumnKey, ColumnValue>>>
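To make the nested-map picture concrete, here is a toy sketch in JavaScript of the two-level form: an unsorted outer map keyed by row key, and an inner list of columns kept sorted by column key. The class and method names (ColumnFamily, put, get, slice) are invented for illustration, and nothing here reflects Cassandra’s actual storage engine or client API.

```javascript
// A toy column family: Map<RowKey, sorted list of [columnKey, columnValue]>.
// Illustration only; this is not how Cassandra stores data internally.
class ColumnFamily {
  constructor() {
    this.rows = new Map(); // outer map: unsorted, keyed by row key
  }

  // Insert or overwrite a column, keeping the row's columns sorted by key.
  // A linear search keeps the sketch short; a real store would not do this.
  put(rowKey, columnKey, value = null) {
    if (!this.rows.has(rowKey)) this.rows.set(rowKey, []);
    const columns = this.rows.get(rowKey);
    const i = columns.findIndex(([k]) => k >= columnKey);
    if (i === -1) columns.push([columnKey, value]);
    else if (columns[i][0] === columnKey) columns[i][1] = value;
    else columns.splice(i, 0, [columnKey, value]);
  }

  // Point lookup by row key and column key.
  get(rowKey, columnKey) {
    const hit = (this.rows.get(rowKey) || []).find(([k]) => k === columnKey);
    return hit ? hit[1] : undefined;
  }

  // Range scan over the column keys of one row: from <= key <= to.
  slice(rowKey, from, to) {
    return (this.rows.get(rowKey) || []).filter(([k]) => k >= from && k <= to);
  }
}

const cf = new ColumnFamily();
cf.put("user:1", "email", "a@example.com");
cf.put("user:1", "city", "San Jose");
cf.put("user:1", "name", "Alice");
cf.put("user:1", "verified"); // a valueless column: the key itself is the data

// Columns whose keys fall between "d" and "o": email and name, in that order.
const range = cf.slice("user:1", "d", "o");
```

The row can keep growing with new column keys (a wide row), and the slice works because the columns stay sorted no matter what order they were written in.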

Notes:

  • You need to pass the timestamp with each column value, for Cassandra to use internally for conflict resolution. However, the timestamp can be safely ignored during modeling. Also, do not plan to use timestamps as data in your application. They’re not for you, and they do not define new versions of your data (unlike in HBase).
  • The Cassandra community has heavily criticized the implementation of Super Column because of performance concerns and the lack of support for secondary indexes. The same “super column like” functionality (or even better) can be achieved by using composite columns.

Model column families around query patterns

But start your design with entities and relationships, if you can.

  • Unlike in relational databases, it’s not easy to tune or introduce new query patterns in Cassandra by simply creating secondary indexes or writing complex SQL queries (with joins, ORDER BY, GROUP BY, and so on), because of its high-scale distributed nature. So think about query patterns up front, and design column families accordingly.
  • Remember the lesson of the nested sorted map, and think how you can organize data into that map to satisfy your query requirements of fast look-up/ordering/grouping/filtering/aggregation/etc.

However, entities and their relationships still matter (unless the use case is special – perhaps storing logs or other time series data?). What if I gave you query patterns to create a Cassandra model for an e-commerce website, but didn’t tell you anything about the entities and relationships? You might try to figure out entities and relationships, knowingly or unknowingly, from the query patterns or from your prior understanding of the domain (because entities and relationships are how we perceive the real world). It’s important to understand and start with entities and relationships, then continue modeling around query patterns by de-normalizing and duplicating. If this sounds confusing, make sure to go through the detailed example later in this post.

Note: It also helps to identify the most frequent query patterns and isolate the less frequent. Some queries might be executed only a few thousand times, while others a billion times. Also consider which queries are sensitive to latency and which are not. Make sure your model first satisfies the most frequent and critical queries.

De-normalize and duplicate for read performance

But don’t de-normalize if you don’t need to. It’s all about finding the right balance.

In the relational world, the pros of normalization are well understood: less data duplication, fewer data modification anomalies, conceptually cleaner, easier to maintain, and so on. The cons are also understood: that queries might perform slowly if many tables are joined, etc. The same holds true in Cassandra, but the cons are magnified since it’s distributed and of course there are no joins (since it’s high-scale distributed!). So with a fully normalized schema, reads may perform much worse.

This and the previous practice (modeling around query patterns) are so important that I would like to further elaborate by devoting the rest of the post to a detailed example.

Note: The example discussed below is just for demonstration purposes, and does not represent the data model used for Cassandra projects within eBay.

Example: ‘Like’ relationship between User & Item

This example concerns the functionality of an e-commerce system where users can like one or more items. One user can like multiple items and one item can be liked by multiple users, leading to a many-to-many relationship as shown in the relational model below:

For this example, let’s say we would like to query data as follows:

  • Get user by user id
  • Get item by item id
  • Get all the items that a particular user likes
  • Get all the users who like a particular item

Below are some options for modeling the data in Cassandra, in order of the lowest to the highest de-normalization. The best option depends on the query patterns, as you’ll soon see.

Option 1: Exact replica of relational model


This model supports querying user data by user id and item data by item id. But there is no easy way to query all the items that a particular user likes or all the users who like a particular item.

This is the worst way of modeling for this use case. Basically, User_Item_Like is not modeled correctly here.

Note that the ‘timestamp’ column (storing when the user liked the item) is dropped from User_Item_Like for simplicity. I’ll introduce that column later.

Option 2: Normalized entities with custom indexes

This model has fairly normalized entities, except that user id and item id mapping is stored twice, first by item id and second by user id.

Here, we can easily query all the items that a particular user likes using Item_By_User, and all the users who like a particular item using User_By_Item. We refer to these column families as custom secondary indexes, but they’re just other column families.

Let’s say we always want to get the item title in addition to the item id when we query items liked by a particular user. In the current model, we first need to query Item_By_User to get all the item ids that a given user likes; and then for each item id, we need to query Item to get the title. Similarly, let’s say we always want to get the usernames in addition to user ids when we query users who like a particular item. With the current model, we first need to query User_By_Item to get the ids for all users who like a given item; and then for each user id, we need to query User to get the username. An item may be liked by a couple hundred users, and an active user may have liked many items, so looking up the usernames for a given item (or the titles for a given user) causes many additional queries. So, it’s better to optimize by de-normalizing item title in Item_By_User, and username in User_By_Item, as shown in option 3.

Note: Even if you can batch your reads, they will still be slower because Cassandra (Coordinator node, to be specific) has to query each row separately underneath (usually from different nodes). Batch read will help only by avoiding the round trip — which is good, so you should always try to leverage it.

Option 3: Normalized entities with de-normalization into custom indexes

In this model, title and username are de-normalized into Item_By_User and User_By_Item, respectively. This allows us to efficiently query all the item titles liked by a given user, and all the usernames of users who like a given item. This is a fair amount of de-normalization for this use case.
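The difference between option 2 and option 3 can be sketched with plain in-memory maps standing in for the four column families. The column family names come from this example; the helper functions (readRow, like, titlesLikedBy, titlesLikedByNormalized) and the sample data are hypothetical, and the lookup counter only imitates the number of row reads a real cluster would perform.

```javascript
// In-memory stand-ins for the four column families of option 3.
const User = new Map();       // userId -> { username, ... }
const Item = new Map();       // itemId -> { title, ... }
const UserByItem = new Map(); // itemId -> Map<userId, username>
const ItemByUser = new Map(); // userId -> Map<itemId, title>

let lookups = 0; // counts row reads, to compare the two access patterns

function readRow(columnFamily, key) {
  lookups++;
  return columnFamily.get(key);
}

// Write path: one "like" updates both index column families and copies
// the username and title into them (the de-normalization step).
function like(userId, itemId) {
  const user = User.get(userId);
  const item = Item.get(itemId);
  if (!UserByItem.has(itemId)) UserByItem.set(itemId, new Map());
  if (!ItemByUser.has(userId)) ItemByUser.set(userId, new Map());
  UserByItem.get(itemId).set(userId, user.username);
  ItemByUser.get(userId).set(itemId, item.title);
}

// Option 3 read: titles come straight from the index row (one lookup).
function titlesLikedBy(userId) {
  return [...(readRow(ItemByUser, userId) || new Map()).values()];
}

// Option 2 style read, for contrast: one index lookup plus one per item.
function titlesLikedByNormalized(userId) {
  const itemIds = [...(readRow(ItemByUser, userId) || new Map()).keys()];
  return itemIds.map((id) => readRow(Item, id).title);
}

User.set("u1", { username: "alice" });
Item.set("i1", { title: "iPhone" });
Item.set("i2", { title: "Camera" });
like("u1", "i1");
like("u1", "i2");

lookups = 0;
const fast = titlesLikedBy("u1");
const fastLookups = lookups; // 1 row read

lookups = 0;
const slow = titlesLikedByNormalized("u1");
const slowLookups = lookups; // 1 index read + 2 item reads = 3
```

The cost of the cheaper read is on the write side: if an item’s title ever changes, every copy of it stored in Item_By_User has to be updated as well.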

What if we want to get all the information (title, desc, price, etc.) about the items liked by a given user? But we need to ask ourselves whether we really need this query, particularly for this use case. We can show all the item titles that a user likes and pull additional information only when the user asks for it (by clicking on a title). So, it’s better not to do extreme de-normalization for this use case. (However, it’s common to show both title and price up front. It’s easy to do; I’ll leave it for you to pursue if you wish.)

Let’s consider the following two query patterns:

  • For a given item id, get all of the item data (title, desc, etc.) along with the names of the users who liked that item.
  • For a given user id, get all of the user data along with the item titles liked by that user.

These are reasonable queries for item detail and user detail pages in an application. Both will perform well with this model. Each requires two lookups, one to query item data (or user data) and another to query user names (or item titles). As the user becomes more active (starts liking thousands of items?) or the item becomes hotter (liked by a few million users?), the number of lookups will not grow; it will remain constant at two. That’s not bad, and further de-normalization may not yield as much benefit as we gained when moving from option 2 to option 3. However, let’s see how we can optimize further in option 4.

Option 4: Partially de-normalized entities

Definitely, option 4 looks messy. In terms of savings, it’s not like what we had in option 3.

If User and Item are highly shared entities (similar to what we have at eBay), I would prefer option 3 over this option.

We’ve used the term “partially de-normalized” here because we’re not de-normalizing all item data into the User entity or all user data into the Item entity. I won’t even consider showing extreme de-normalization (keeping all item data in User and all user data in Item), as you probably agree that it doesn’t make sense for this use case.

Note: I’ve used Super Column here just for demonstration purposes. Almost all the time, you should favor composite columns over Super Column.

The best model

The winner is Option 3, particularly for this example. We’ve left out timestamp, but let’s include it in the final model below as a timeuuid (type-1 UUID). Note that the timeuuid forms a composite column key together with userid in User_By_Item, and together with itemid in Item_By_User.

Recall that column keys are physically stored sorted. Here our column keys are stored sorted by timeuuid in both User_By_Item and Item_By_User, which makes range queries on time slots very efficient. With this model, we can efficiently query (via range scans) the most recent users who like a given item and the most recent items liked by a given user, without reading all the columns of a row.
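The effect of sorting by a time-based composite key can be sketched for a single User_By_Item row. This is a rough illustration only: a plain millisecond timestamp stands in for a real timeuuid, and the function names (addLike, mostRecentLikes, likesBetween) are made up for the sketch.

```javascript
// One User_By_Item row: columns sorted by the composite key (time, userId).
// A plain millisecond timestamp stands in for a real timeuuid here.
const row = []; // [{ time, userId, username }], kept sorted ascending

// Order by time first, then by userId to break ties.
function compareKeys(a, b) {
  return a.time - b.time || a.userId.localeCompare(b.userId);
}

// Insert a column at its sorted position, walking back from the end.
function addLike(time, userId, username) {
  const column = { time, userId, username };
  let i = row.length;
  while (i > 0 && compareKeys(row[i - 1], column) > 0) i--;
  row.splice(i, 0, column);
}

// Most recent `limit` likes: read only the tail of the sorted row,
// newest first, without touching the older columns.
function mostRecentLikes(limit) {
  if (limit <= 0) return [];
  return row.slice(-limit).reverse();
}

// Likes within a time slot [from, to]. filter() scans every column to keep
// the sketch short; a sorted store would seek to `from` and stop after `to`.
function likesBetween(from, to) {
  return row.filter((c) => c.time >= from && c.time <= to);
}

addLike(300, "u3", "carol");
addLike(100, "u1", "alice");
addLike(200, "u2", "bob");

const recent = mostRecentLikes(2);      // carol (300), then bob (200)
const earlySlot = likesBetween(100, 200); // alice (100), then bob (200)
```

Because the time component leads the composite key, insertion order does not matter: the row always reads back in chronological order.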

Summary

We’ve covered a few fundamental practices and walked through a detailed example to help you get started with Cassandra data model design. Here are the key takeaways:

  • Don’t think of a relational table, but think of a nested sorted map data structure while designing Cassandra column families.
  • Model column families around query patterns. But start your design with entities and relationships, if you can.
  • De-normalize and duplicate for read performance. But don’t de-normalize if you don’t need to.
  • Remember that there are many ways to model. The best way depends on your use case and query patterns.

What I’ve not mentioned here are special, but common, use cases such as logging, monitoring, real-time analytics (rollups, counters), or other time series data. However, the practices discussed here do apply there. In addition, there are known common techniques or patterns used to model such time series data in Cassandra. At eBay, we also use some of those techniques and would love to share them in upcoming posts. For more information on modeling time series data, I would recommend reading Advanced time series with Cassandra and Metric collection and storage. Also, if you’re new to Cassandra, make sure to scan through the DataStax documentation on Cassandra.

UPDATE:  Part 2 about Cassandra is now published.

Jay Patel, architect@eBay


For an online shopping business of any size, time is quite literally money. For eBay, a delay of milliseconds is enough to make the difference between a purchase and a dissatisfied customer who abandoned a page that didn’t load as quickly as desired. Multiply that experience by eBay’s 100 million-plus active users, and you can see how even short delays can have significant financial impacts.

In this blog post, I present some of the techniques that we use to optimize the presentation stack for the eBay Marketplaces web site.

Measuring Above-the-Fold Time (AFT)

AFT is the time that elapses between the first and the last pixel change in the part of the browser window that you see without scrolling. Currently, there are no browser APIs to measure AFT. For today’s image-intensive web pages, image loading is a major contributor to AFT as well as to total page load time. If you know all of the images that will load above the fold, you can calculate AFT by measuring the last image loaded in the above-the-fold area. While not as accurate a measurement as pixel-by-pixel, this calculation can provide a useful estimate.

Here is JavaScript code you can use to calculate AFT, assuming that the first six images display above the fold. The ‘start’ variable is set when the script in the page head begins executing, which approximates the time of the first byte. You call the ‘measure’ function on the onload event of the first six images, capturing the time for each in the lastImageLoadTime variable. On the window’s onload event, you simply subtract the start value from the lastImageLoadTime value to calculate the AFT. Irrespective of the load order of the first six images, lastImageLoadTime will hold the load time of the last of these images. This technique gives you a rough idea of how much time it takes to load all of the above-the-fold images.

<HTML>
<HEAD>
   <SCRIPT>
      var start = new Date().getTime();
      var lastImageLoadTime = null;
      function measure(){
         lastImageLoadTime = new Date().getTime();
      }
      window.onload = function(){
            // window.onload fires after every image has loaded, so lastImageLoadTime is final here
            alert("Above fold time = "+(lastImageLoadTime - start));
      }
   </SCRIPT>

</HEAD>
<BODY>
   <IMG src="image1.gif" onload="measure();">
   <IMG src="image2.gif" onload="measure();">
   <IMG src="image3.gif" onload="measure();">
   <IMG src="image4.gif" onload="measure();">
   <IMG src="image5.gif" onload="measure();">
   <IMG src="image6.gif" onload="measure();">
   <IMG src="image7.gif">
   <IMG src="image8.gif">
   <IMG src="image9.gif">
   <IMG src="image10.gif">
</BODY>
</HTML>

We’re using this technique to measure and optimize the AFT.

Page rendering

So many factors influence page rendering. Today’s web sites contain lots of JavaScript, CSS, and images. The quantities of these resources and their placement within the HTML markup have a direct impact on page rendering.

JavaScript

When the browser parses the HTML markup, it stops rendering the HTML when it encounters an inline JavaScript block or external JavaScript file. At this point, the user experiences rendering delays. Moving the JavaScript to the end of the HTML markup would completely eliminate these pauses in rendering.

Here is a simple way to test how JavaScript is influencing the rendering of your page. Although you can use any browser for this test, the influence is easily detected using Internet Explorer. Launch Internet Explorer and open your web page. After the page is completely loaded, change the URL in the browser’s address bar to something else. Now click the browser’s back button. You should see the previous page instantaneously. If you instead observe any pauses in the rendering, you can definitely say that JavaScript is affecting page rendering.

Last year, we implemented this technique on the US, UK, and Germany search results pages. As a consequence, the usability, performance, and bottom-line numbers have improved greatly. Surprisingly, with this change alone we increased purchases by 1%. We also had very happy customers!

CSS

Treat CSS like a king and JavaScript like a slave. What I mean by this is to keep the CSS on top of the HTML markup and JavaScript at the bottom of the markup. Keeping the CSS on top optimizes HTML rendering and reduces the re-flows.

Images

Browsers have limited concurrent connections per domain. If your page exceeds the number of images that can be concurrently downloaded from a given domain, you will see delays in the rendering of those images. One work-around for this issue is to create more subdomains, thereby increasing the number of connections. Still, this approach requires a connection to download the images.

How about eliminating the connection altogether? You can do so by embedding the image directly in the HTML as a Base64-encoded data URI, which is supported by most modern browsers. Limit the embedding to those images that will display above the fold, to improve the above-the-fold rendering.
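As a rough illustration of the embedding step, the sketch below builds a data URI on the server side with Node.js and places it in an image tag. The helper name toDataUri is hypothetical, and the six bytes used are only the signature that starts a GIF89a file, standing in for real image data.

```javascript
// Build a data URI from raw image bytes using the Node.js Buffer API.
function toDataUri(bytes, mimeType) {
  return "data:" + mimeType + ";base64," + Buffer.from(bytes).toString("base64");
}

// The ASCII signature "GIF89a" that begins a GIF file, used as stand-in data.
const gifHeader = [0x47, 0x49, 0x46, 0x38, 0x39, 0x61];

const uri = toDataUri(gifHeader, "image/gif");

// The image now travels inside the markup instead of over its own request.
const imgTag = '<IMG src="' + uri + '">';
```

Base64 encoding makes the payload roughly a third larger than the binary file, and an embedded image cannot be cached separately from the page, which is a good reason to keep the technique to the few above-the-fold images.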

Conclusion

This post merely touches on some of the techniques we’ve found that can make a real difference in the amount of time customers spend waiting for pages to load and render. We’ll provide updates discussing more optimization techniques that we’re exploring.


The eBay Motors engineering team took up an initiative to revisit some of the legacy JavaScript code shared across various pages, optimizing it to leverage the latest advancements in HTML5. Our main focus areas were user interactions and animations in which the old JavaScript code was lacking in performance and sturdiness. When we completed the exercise and demoed the upgraded experience to our product folks, the feedback we got was “SLEEK”. This post highlights five of those JavaScript techniques that we think made a difference.

1. requestAnimationFrame over setInterval: Using the new requestAnimationFrame API for building JavaScript-based animations is a more optimized and efficient approach when compared to the traditional timers and intervals. To quickly summarize, this API lets the browser, rather than the developer, decide when to run the next UI/DOM style changes and repaint the screen. We grepped our code base and replaced all applicable occurrences of setTimeout and setInterval with requestAnimationFrame; as a result, the animations were a lot smoother. This approach also saves battery life when content is viewed on mobile devices. The requestAnimationFrame API is not supported in all browsers, so we used a polyfill, provided by Paul Irish, as shown below.

    // shim layer with setTimeout fallback
    window.requestAnimFrame = (function(w){
      return  w.requestAnimationFrame       ||
              w.webkitRequestAnimationFrame ||
              w.mozRequestAnimationFrame    ||
              w.oRequestAnimationFrame      ||
              w.msRequestAnimationFrame     ||
              function( callback ){
                w.setTimeout(callback, 1000 / 60);
              };
    })(window);
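For context, an animation loop built on such a shim might look like the sketch below. It is not the code used on our pages: the animate function and its parameters are hypothetical, and the scheduler is re-declared with an environment guard so the snippet is self-contained. Progress is computed from a clock rather than from the callback’s timestamp argument, because the setTimeout fallback does not pass one.

```javascript
// Pick a frame scheduler: the native API when present, else a ~60fps timer.
const raf =
  typeof window !== "undefined" && window.requestAnimationFrame
    ? (cb) => window.requestAnimationFrame(cb)
    : (cb) => setTimeout(cb, 1000 / 60);

// Call draw(progress) once per frame, with progress going from 0 to 1 over
// durationMs. `schedule` and `now` can be swapped out, e.g. for testing.
function animate(durationMs, draw, schedule = raf, now = Date.now) {
  const start = now();
  function frame() {
    const progress = Math.min((now() - start) / durationMs, 1);
    draw(progress);
    if (progress < 1) schedule(frame); // request the next frame until done
  }
  schedule(frame);
}

// Browser usage (assumes an element with id "box" exists on the page):
// animate(300, function (p) {
//   document.getElementById("box").style.opacity = p;
// });
```

The loop only asks for the next frame from inside the current one, so the browser stays in control of the pacing.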

2. insertAdjacentHTML over innerHTML: The insertAdjacentHTML API is a fine-grained and optimized version of the super-popular innerHTML. Since we specify the insert position, insertAdjacentHTML does not re-parse the element it is being used on and avoids the extra step of serialization, making it much faster than direct innerHTML manipulation. This approach is very effective in scenarios where we keep appending markup to a page or module, such as for the daily deals feed and for endless scroll. A simple JSPerf test result shows that insertAdjacentHTML is 100% faster than innerHTML. Surprisingly, the browser support for this API has been there for a very long time (Firefox started supporting it in version 8), and the helper function is pretty straightforward:

    // helper function to append content for a given element
    var appendContent = function(){
        // Closure to hold the insertAdjacentHTML API support
        var insertAdjacentSupported = document.createElement('div').insertAdjacentHTML;

        return function(elem, content) {
            if(insertAdjacentSupported) {
                elem.insertAdjacentHTML('beforeend', content);
            } else {
                elem.innerHTML = elem.innerHTML + content;
            }
        };
    }();

3. if-else over try-catch: Try-catch blocks provide an efficient mechanism to handle exceptions and unforeseen runtime errors. However, they impose a performance penalty in JavaScript, especially when used in iterations or recursive functions. More details can be found at dev.opera.com and O’Reilly’s site. To address this issue, we scanned our code base, particularly looking for performance-critical functions. We found a few occurrences where a try-catch was inside a long-running iterator loop, and also where the probability of the thread entering the catch block was high. The try-catch was replaced with simple if-else conditions, which took care of all error handling, and the resulting code was much more efficient.

Using try-catch:

    // jsonResponse comes from some web service
    var i, l = jsonResponse.length, item, offer, binPrice, bidPrice;
    for(i = 0; i < l; i++) {
        item = jsonResponse[i];
        // Some code
        // ...
        try {
            offer = item.offer;
            binPrice = offer.bin;
            bidPrice = offer.bid;
        } catch(e) {
            offer = {err: "Offer not found"};
        }
        // Some more code
        // ...
    }

Using if-else:

    // jsonResponse comes from some web service
    var i, l = jsonResponse.length, item, offer, binPrice, bidPrice;
    for(i = 0; i < l; i++) {
        item = jsonResponse[i];
        // Some code
        // ...
        offer = item.offer;
        if(offer) {
            binPrice = offer.bin;
            bidPrice = offer.bid;
        } else {
            offer = {err: "Offer not found"};
        }
        // Some more code
        // ...
    }

4. XMLHttpRequestUpload over server polling: Using browsers to upload files (photos, PDFs, other documents, etc.) has become a very common use case, and one of our applications had this requirement. To keep users informed during the upload process, we wanted to show a real-time progress meter with accurate percentages. One (non-Flash) way of simulating this AJAX-based upload is to create a hidden iFrame and submit the main form, which holds the input file element targeted to the iFrame. Following the form submit, the server has to be polled periodically to retrieve the upload percentage for the progress meter. Not only is this approach a hack, but it also consumes huge amounts of server and client resources.
Fortunately, the XMLHttpRequest object in modern browsers can upload files (as byte streams) through its send method, and it also exposes an upload attribute of type XMLHttpRequestUpload. The upload attribute has several associated events, the most important being the progress event. The progress event handler receives the total number of bytes to transfer and the number of bytes transferred so far through the event's total and loaded fields. With this information, end users receive real-time progress updates in an efficient and robust way. Here is a quick preview of the API usage:

    var uploadAJAX = function(serverURL) {
        var xhr = new XMLHttpRequest(),
            fileElement = document.getElementById("file"),
            fileObj = fileElement.files[0], // For demo, taking only the first file in the file list
            progressMeter = getProgressMeterComponent(); // retrieve the progress meter UI component

        xhr.upload.onprogress = function(e) {
            // total is only meaningful when the length is computable
            if (e.lengthComputable) {
                var percentComplete = Math.round(e.loaded / e.total * 100);
                progressMeter.update(percentComplete); // Updates the progress meter UI component with the given percentage
            }
        };

        xhr.open("POST", serverURL, true);
        xhr.setRequestHeader("X-Requested-With", "XMLHttpRequest");
        xhr.setRequestHeader("X-File-Name", encodeURIComponent(fileObj.name));
        xhr.setRequestHeader("Content-Type", "application/octet-stream");
        xhr.send(fileObj); // send the File object itself as the request body
    };

We have open-sourced the HTML5 image uploader utility on GitHub. For browsers that do not support this feature, the application falls back to the hidden iFrame approach.
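
The choice between the two upload paths can be made with a simple feature test. What follows is a minimal sketch under stated assumptions, not the code of the open-sourced utility: `chooseUploadStrategy` and the strategy names are hypothetical, and the constructor is passed in as a parameter purely so the check can be exercised outside a browser.

```javascript
// Hypothetical helper: returns "xhr" when the given XMLHttpRequest
// constructor produces objects with an `upload` attribute, and
// "iframe" otherwise.
function chooseUploadStrategy(XHRConstructor) {
    if (typeof XHRConstructor !== "function") {
        return "iframe";
    }
    var xhr = new XHRConstructor();
    // Browsers that support upload progress expose an `upload` object
    return (xhr.upload && typeof xhr.upload === "object") ? "xhr" : "iframe";
}

// Stand-ins for a modern and a legacy browser implementation
function ModernXHR() { this.upload = { onprogress: null }; }
function LegacyXHR() {}

var modernChoice = chooseUploadStrategy(ModernXHR);
var legacyChoice = chooseUploadStrategy(LegacyXHR);
var missingChoice = chooseUploadStrategy(undefined);
```

In a page, the call would look like `chooseUploadStrategy(window.XMLHttpRequest)`, with the result deciding whether to attach the progress handler or build the hidden iFrame.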

5. CSS3 over JavaScript: The final optimization is to avoid JavaScript as much as possible for animations (JavaScript is great but...) and to leverage modern CSS3-based transitions. CSS3 comes with a large set of animatable properties that can replace most of the basic animations currently implemented in JavaScript. The main advantages are ease of use, the browser doing most of the work, hardware acceleration where available, and above all an elegant and smooth visual touch to the transition (which is nearly impossible to achieve with JavaScript). We started changing our animations from simple tab switches (check it out) to complex 3D carousels (check out the beta version) - all with CSS, which netted a great user experience. For older browsers, we just stopped doing animations.
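
One way to hand the animation to the browser while skipping it on older browsers is sketched below. It is only an illustration: `findTransitionProp`, `showPanel`, and the 0.3s fade are hypothetical choices made here, not taken from our code base. The sketch assumes the panel was previously rendered with an opacity of 0.

```javascript
// Style property names for CSS transitions: the unprefixed name first,
// then the vendor-prefixed variants used by older browser releases
var TRANSITION_PROPS = ["transition", "WebkitTransition", "MozTransition",
                        "OTransition", "msTransition"];

// Returns the first supported transition property name, or null
function findTransitionProp(style) {
    for (var i = 0; i < TRANSITION_PROPS.length; i++) {
        if (TRANSITION_PROPS[i] in style) {
            return TRANSITION_PROPS[i];
        }
    }
    return null;
}

// Hypothetical helper: reveals a tab panel, letting the browser fade it
// in when transitions are supported and showing it instantly otherwise
function showPanel(panel) {
    var prop = findTransitionProp(panel.style);
    if (prop) {
        panel.style[prop] = "opacity 0.3s ease-in-out";
    }
    panel.style.opacity = "1";
    return prop !== null; // true when the browser animates the change
}
```

The script only declares the end state; the interpolation between frames is left entirely to the browser.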

This entire re-engineering process also helped us develop a workflow for similar upgrades, thus enabling our code base to evolve at the same pace as browser innovations. We can always hope that the need to use polyfills will be reduced in the near future.


Engineer @ eBay

