Category Archives: Machine Learning

Algorithmic and other data modeling techniques

Griffin — Model-driven Data Quality Service on the Cloud for Both Real-time and Batch Data

Overview of Griffin: At eBay, when people use big data (Hadoop or streaming systems), measuring data quality is a significant challenge. Different teams have built customized tools to detect and analyze data quality issues within their own domains. As a platform organization, we want to take a platform approach to these commonly occurring patterns.
Continue Reading »

Apache Eagle: Secure Hadoop in Real Time

Co-Authors: Chaitali Gupta and Edward Zhang. Update: Eagle was accepted as an Apache Incubator project on October 26, 2015. Today’s successful organizations are data-driven. At eBay we have thousands of engineers, analysts, and data scientists who crunch petabytes of data every day to provide a great experience for our users. We execute at massive scale using data
Continue Reading »

Using Spark to Ignite Data Analytics

At eBay we want our customers to have the best experience possible. We use data analytics to improve user experiences, provide relevant offers, optimize performance, and create many, many other kinds of value. One way eBay supports this value creation is by utilizing data processing frameworks that enable, accelerate, or simplify data analytics. One such
Continue Reading »