Author Archives: Mansi Narula

An Approach to Achieve Scalability and Availability of Data Stores


Today there has been an explosion of the web, specifically in social networks and users of ecommerce applications, that corresponds to an explosion in the sheer volume of data we must deal with. The web has become so ubiquitous that it is used by everyone, from the scientists in 1990s, who used it for exchanging scientific documents, to five-year-olds today exchanging emoticons about kittens. There comes the need of scalability, which is the potential of a system, network, or process to be enlarged in order to accommodate that data growth. The web has virtually brought the world closer, which means there is no such thing as “down time” anymore. Business hours are 24/7, with buyers shopping in disparate time zones. Thereby, a necessity for high availability of the data stores arises. This blog post provides a course of action required to achieve scalability and availability for data stores.

This article covers the following methods to provide a scalable and highly available data stores for applications.

  • Scalability: a distributed system with self-service scaling capability
    • Data capacity analysis
    • Review of data access patterns
    • Different techniques for sharding
    • Self-service scaling capability
  • Availability: physical deployment, rigorous operational procedures, and application resiliency
    • Multiple data center deployment
    • Self-healing tools
    • Well-defined DR tiering, RTO, RPO, and SOPs
    • Application resiliency for data stores


With the advent of the web, especially Web 2.0 sites where millions of users may both read and write data, scalability of simple database operations has become more important. There are two ways to scale a system: vertically and horizontally. This talk focuses on horizontal scalability, where both the data and the load of simple operations is distributed/sharded over many servers, where the servers do not share RAM, CPU, or disk. Although in some implementations disk and storage can be shared, auto scaling can become a challenge for such cases.

diagram abstractly illustrating scalability measures. Image by –

The following measures should be considered as mandatory methods in building a scalable data store.

  • Data capacity analysis: It is a very important task to understand the extreme requirements of the application in terms of peak and average transactions per second, peak number of queries, payload size, expected throughput, and backup requirements. This enables the data store scalability design in terms of how many physical servers are needed and hardware configuration of the data store with respect to memory footprint, disk size, CPU Cores, I/O throughput, and other resources.

  • Review data access patterns: The simplest course to scale an application is to start by looking for access patterns. Given the nature of distributed systems, all queries to the data store must have the access key in all real-time queries to avoid scatter and gather problem across different servers. Data must be aligned by the access key in each of the shards of the distributed data store. In many applications, there can be more than one access key. For example, in an ecommerce application, data retrieval can be by Product ID or by User ID. In such cases, the options are to either store the data redundantly aligned by both keys or store the data with a reference key, depending upon the application’s requirements.

  • Different techniques for sharding: There are different ways to shard the data in a distributed data store. Two of the common mechanisms are function-based sharding and lookup-based sharding.Function-based sharding refers to the sharding scheme where a deterministic function is applied on the key to get the value of shard. In this case, the shard key should exist in each entity stored in the distributed data store, for efficient retrieval. In addition, if the shard key is not random, it can cause hot spots in the system.Lookup-based sharding refers to a lookup table used to store the start range and end range of the key. Clients can cache the lookup table to avoid single point of failure.Many NoSQL databases implement one of these techniques for achieving scalability.

  • Self-service scaling capability: Self-service scaling, or auto-scaling, can work as a jewel in the scalable system crown. Data stores are designed and architected to provide enough capacity to scale up front, but rapid elasticity and cloud services can enable vertical and horizontal scaling in the true sense. Self-service vertical scaling enables the addition of resources to an existing node to increase its capacity, while self-service horizontal scaling enables the addition or removal of nodes in the distributed data store via “scale-up” or “scale-down” functionality.


Data stores need to be highly available for read and write operations. Availability refers to a system or component that is continuously operational for a desirably long length of time. Below are some of the methods to ensure that the right architectural patterns, physical deployment, and rigorous operational procedures are in place for a highly available data store.

diagram of the four availability methods discussed in this blog post

  • Multiple data center deployment: Distributed data stores must be deployed in different data centers with redundant replicas for disaster recovery. Geographical location of data centers should be chosen cautiously to avoid network latency across the nodes. The ideal way is to deploy primary nodes equally amongst the data centers along with local and remote replicas in each data center. Distributed Data stores inherently reduces the downtime footprint by the sharding factor. In addition, equal distribution of nodes across data centers causes only 1/nth of the data to be unavailable in case of a complete data center shutdown.

  • Self-healing tools: Efficient monitoring and self-healing tools must be in place to monitor the heartbeat of the nodes in the distributed data store. In case of failures, these tools should not only monitor but also provide a way to bring the failed component alive or should provide a mechanism to bring its most recent replica up as the next primary. This self-healing mechanism should be cautiously used per the application’s requirements. Some high-write-intensive applications cannot afford inconsistent data, which can change the role of self-healing tools to monitor and alert the application for on-demand healing, instead.

  • Well-defined DR tiering, RTO, RPO, and SOPs: Rigorous operational procedures can bring the availability numbers (ratio of the expected value of the uptime of a system to the aggregate of the expected values of up and down time) to a higher value. Disaster recovery tiers must be well defined for any large-scale enterprise, with an associated expected downtime for the corresponding tiers. The Recovery Time Objective (RTO) and Recovery Point Objective (RPO) should be well tested in a simulated production environment to provide a predicted loss in availability, if any. Well-written SOPs are proven saviors in a crisis, especially in a large enterprise, where Operations can implement SOPs to recover the system as early as possible.

  • Application resiliency for data stores: Hardware fails, but systems must not die. Application resiliency is the ability of an application to react to problems in one of its components and still provide the best possible service. There are multiple ways that an application can use to achieve high availability for read and write database operations. Application resiliency for reads enables the application to read from a replica in the case of primary failure. Resiliency can also be part of a distributed data store feature, as in many of the NoSQL databases. When there is no data affinity of the newly inserted data with the existing data, a round-robin insertion approach can be taken, where new inserts can write to a node other than the primary when the primary is unavailable. On the contrary, when there is data affinity of the newly inserted data with the existing data, the approach is primarily driven by consistency requirements of the application.

The key takeaway is that in order to build a scalable and highly available data store, one must take a systematic approach to implement the methods described in this paper. This list of methods is a mandatory, comprehensive list, but not exhaustive, and it can have more methods added to it as needed. Plan to grow BIG and aim to be 24/7 UP, and with the proper scalability and availability measures in place, the sky is the limit.


Image by –

A Middle Approach to Schema Design in OLTP Applications

eBay is experiencing phenomenal growth in the transactional demands on our databases, in no small part due to our being at the forefront of mobile application development. To keep up with such trends, we continually assess the design of our schemas.

Schema design is a logical representation of the structures used to store the data that applications produce and consume.  Given that database resources are finite, execution times for transactions can vary wildly as those transactions compete for the resources they require. As a result, schema design is the most essential part of any application development life cycle. This blog post covers schema design for online transaction processing applications and recommends a specific approach.

Unfortunately, there is no predefined set of rules for designing databases in an efficient manner, but there can certainly be a defined design process that achieves that outcome. Such a design process includes, but is not limited to, the following activities:

  1. determining the purpose of the database
  2. gathering the information to be recorded
  3. dividing the  information items into major entities
  4. deciding what information needs to be stored
  5. setting up relationships between entities
  6. refining the design further

Historically, OLTP systems have relied on a systematic way of ensuring that a database structure is suitable for general-purpose querying and is free of certain undesirable characteristics – insertion, update, and deletion anomalies– that could lead to loss of data integrity. A highly normalized database offers benefits such as minimizing redundancy, freeing up the relations from undesired insertion, update and deletion dependency, data consistency within the database, and a much more flexible database design.

But as they say, there are no free lunches. A normalized database exacts the price of inserting into multiple tables and reading by way of joining multiple tables. Normalization involves design decisions that are likely to cause reduced database performance. Schema design requires keeping in mind that when a query or transaction request is sent to the database, multiple factors are involved, such as CPU usage, memory usage, and input/output (I/O). Depending on the use case, a normalized database may require more CPU, memory, and I/O to process transactions and database queries than does a denormalized database.

Recent developments further compound schema design challenges. With increasing competition and technological advances such as mobile web applications, transactional workload on the database has increased exponentially. As the competitor is only a click away, an online application’s valuable users must be ensured consistently good performance via QoS and transaction prioritization. Schema design for such applications cannot be merely focused on normalization; performance and scalability are no less important.

For example, at eBay we tried a denormalized approach to improve our Core Cart Service’s DB access performance specifically for writes. We switched to using a BLOB-based cart database representation, combining 14 table updates into a single BLOB column. Here are the results:

  • The response time for an “add to cart” operation improved on average 30%. And in use cases where this same call is made against a cart that already contains several items (>20), the performance at 95 percentile has improved by 40% .
  • For the “create cart” operation, total DB call time for the worst case was improved by approximately 50% due to significant reduction in SQL counts.
  • Parallel transaction DB call times improved measurably for an average use case.

These realities do not imply that denormalization is in any way a blessing. There are costs to denormalization. Data redundancy is increased in a denormalized database. This redundancy can improve performance, but it also requires more extraneous efforts to keep track of related data. Application coding can create further complications, because the data is spread across various tables and may be more difficult to locate. In addition, referential integrity is more of a chore, because related data is divided among a number of tables.

There is a happy medium between normalization and denormalization, but both require a thorough knowledge of the actual data and the specific business requirements. This happy medium is what I call “the medium approach.” Denormalizing a database is the process of taking the level of normalization within the database down a notch or two. Remember, normalization can provide data integrity, which is the assurance of consistent and accurate data within a database, but at the same time it can slow performance because of its frequently occurring table join operations.

There is an old proverb:  “normalize until it hurts, denormalize until it works.” One has to land in the middle ground to get all of the goodies of these two different worlds.