Category Archives: Data Infrastructure and Services

Platforms, frameworks, services, best practices, etc. for managing Big Data at eBay

Enhancing the User Experience of the Hadoop Ecosystem

  At eBay, we have multiple large, multi-tenant clusters. Each of these clusters stores hundreds of petabytes of data. These clusters offer tens of thousands of cores to run computations on the data. We have thousands of internal users who use Hadoop in their roles, including data analysts, data scientists, engineers, and product managers. These
Continue Reading »

A Creative Visualization of OLAP Cuboids

Background eBay is one of the world’s largest and most vibrant marketplaces with 1.1 billion live listings every day, 169 million active buyers, and trillions of rows of datasets ranging from terabytes to petabytes. Analyzing such volumes of data required eBay’s Analytics Data Infrastructure (ADI) team to create a fast data analytics platform for this
Continue Reading »

Elasticsearch Cluster Lifecycle at eBay

Defining an Elasticsearch cluster lifecycle eBay’s Pronto, our implementation of the “Elasticsearch as service” (ES-AAS) platform, provides fully managed Elasticsearch clusters for various search use cases. Our ES-AAS platform is hosted in a private internal cloud environment based on OpenStack. The platform currently manages around 35+ clusters and supports multiple data center deployments. This blog
Continue Reading »