Category Archives: Hadoop

Enhancing the User Experience of the Hadoop Ecosystem

  At eBay, we have multiple large, multi-tenant clusters. Each of these clusters stores hundreds of petabytes of data. These clusters offer tens of thousands of cores to run computations on the data. We have thousands of internal users who use Hadoop in their roles, including data analysts, data scientists, engineers, and product managers. These
Continue Reading »

A Creative Visualization of OLAP Cuboids

Background eBay is one of the world’s largest and most vibrant marketplaces with 1.1 billion live listings every day, 169 million active buyers, and trillions of rows of datasets ranging from terabytes to petabytes. Analyzing such volumes of data required eBay’s Analytics Data Infrastructure (ADI) team to create a fast data analytics platform for this
Continue Reading »

Secure Communication in Hadoop without Hurting Performance

  Apache Hadoop is used for processing big data at many enterprises. A Hadoop cluster is formed by assembling a large number of commodity machines, and it enables the distributed processing of data. Enterprises store lots of important data on the cluster. Different users and teams process this data to obtain summary information, generate insights,
Continue Reading »