Category Archives: Data Infrastructure and Services

Platforms, frameworks, services, best practices, etc. for managing Big Data at eBay

embedded-druid: Leveraging Druid Capabilities in Stand-alone Applications

Co-authors: Ramachandran Ramesh, Mahesh Somani, and Sankar Venkatraman

The eBay Cloud Platform team is happy to announce the open-source project embedded-druid, which aims to provide Druid capability for a reasonably small amount of data without the complexity of a multi-node setup. That is, embedded-druid is Druid running in a single JVM process.

Background

Druid is
Continue Reading »

Announcing Pulsar Reporting: Near-Real-Time Metrics Reporting Framework

We are excited to announce the first open-source release of Pulsar Reporting. Earlier this year, we announced Pulsar, an open-source project that included Pulsar Pipeline, a real-time analytics platform and stream processing framework. One of the most frequently requested features for Pulsar has been integration with a metrics store for visualizing near-real-time metrics. We’ve provided
Continue Reading »

Apache Eagle: Secure Hadoop in Real Time

Co-authors: Chaitali Gupta and Edward Zhang

Update: Eagle was accepted as an Apache Incubator project on October 26, 2015.

Today’s successful organizations are data driven. At eBay we have thousands of engineers, analysts, and data scientists who crunch petabytes of data every day to provide a great experience for our users. We execute at massive scale using data
Continue Reading »