The Zombie Apocalypse Retrospective!

sketch1

I’ve written in the past that I believe that retrospectives should be a creative process, and I like to engage the brain using interesting visuals and ideas. I’ve attempted to employ this philosophy at Shutl (an eBay Inc. company) by trying to use a different theme for every retrospective I’ve run. (A recent example of a theme I found through funretrospectives.com is the catapult retro.)

Then a few weeks ago, I made a comment to one of our engineers, Volker, that you could pretty much take any situation you can think of and turn it into a retrospective idea; thus the challenge of a Zombie Apocalypse-themed retro was born!

Limitations of more “traditional” retrospective formats

I was first introduced to retrospectives in 2007. Back then, a typical retro would follow the starfish format (or some variation). However, over the past few years I’ve started to see some limitations with such formats. In an attempt to address the more common anti-patterns, I’ve been moving towards a slightly adapted format. I now try to incorporate action items into the brainstorming section, both to streamline the time taken and to focus the group on constructive conversation. This format achieves a few things:

  • Shortens the overall time taken by having the group identify not only what’s helping/hindering the team, but also what they can carry forward to improve their performance in the future
  • Ensures a more constructive mindset by increasing focus, during the brainstorming itself, on suggestions that address hindrances
  • Helps create more achievable solutions by modifying the typical “action item” phase of the retro to instead be a refinement phase, where previously suggested actions are analyzed and prioritized

Exploring the idea

With the above goals in mind, I started by scribbling and sketching out some ideas in my notepad; after a short while I had come up with a basic draft for the structure of the retro:

sketch2

I bandied the idea around in my head for a day or so. The finished product looked like this:

sketch3

The format

The picture above was drawn on a large whiteboard and divided into three color-coded columns (with a fourth column for action items, complete with a reminder that our final actions require a “what,” a “who,” and a “when”).

Green stickies (column 1)

This is you, huddled in the corner, with your stockpile of weaponry at the ready, bravely fighting off the ravenous horde crashing through your doorway.

What’s your ammo? On green stickies, write down all those things that are fueling your team’s successes and working in your favor.

Pink stickies (column 3)

This is the zombie horde—a relentless army of endless undead marching towards your destruction.

Use pink stickies to identify the problems that you are facing (including potential future problems).

Orange stickies (column 2)

This is your perimeter—the security measures you’ve installed to resist the horde and ensure your survival.

As you’re identifying the issues you face and the current behaviors that are fueling your success, think about what actions you can take today to either address these issues or ensure continued success. The idea is to try to come up with a solution or suggestion for every problem that you can see on a pink sticky.

I tried out the format on the team. I gave them about seven minutes for the brainstorming, with the usual guidelines around collaboration: encouraging people to talk to each other and to look at each other’s suggestions. As a countdown timer, I personally use the 3-2-1 dashboard widget, but there are plenty of others you can use.

We then had a round of grouping and voting (each team member got three votes), with a reminder to vote on things you want to discuss, not just things you agree with (e.g., you could strongly disagree with a point on the board, and vote for it to start a discussion). Due to the nature of the board (if things go well), groups of pink stickies should have corresponding orange ones to direct the discussion towards action items.

I wrote down all action items that came up, and gave the team a caveat that we’d have five minutes at the end to review the actions, prioritize them, and pick the ones that we actually wanted to address; this keeps the discussions flowing. We ended up with some conflicting action items, which was fine; the idea was to get all the potential actions down, and then at the end decide which we felt were the most valuable. During this final review of the actions, we also assigned owners and deadlines. Then we were done!

Here’s what the final board looked like after our 45-minute retro was complete:

sketch4

Next challenge: what crazy (yet effective) retrospective formats can you come up with?

Developer Anarchy Day

Have you ever imagined what would happen if you let software developers work on what they want? Well, we did it. For one day. And here are the results…

How it all began

“OK, listen: there is no backlog today.”

When we first heard these words from Megan instead of the usual beginning of standup, we didn’t know what to expect. Further explanation wasn’t elaborate either. There was only one rule: you need to demonstrate something at the end of the day.

We had different reactions. We were happy (“Great! A break from the day-to-day tasks!”), shocked (“What did they do with my safe and predictable to-do column! Help!”), and insecure (“Can I really finish something to show in just one day, with no planning, estimating, or design?”).

So that was it. For one full (work) day, all developers in our team at Shutl (an eBay Inc. company) were supposed to forget about ongoing projects, deadlines, and unfinished tasks from the day before. We could work on whatever we wanted. We could pair or work individually. We could work on a DevOps task, on a working feature, or just on a prototype. We could develop a feature on an existing application or create a brand new project.

The first thing we did was a small brainstorm where we described our ideas. It was not obligatory, but it helped in forming pairs and getting some encouragement for our little projects. Then we just started coding.

Developer Anarchy

Now, let me give some background behind this idea. You may have heard about “programmer anarchy” in the context of development processes and company culture. In a few words: letting engineers make decisions on what and how they develop in order to meet critical requirements, and getting rid of “managers of programmers” from your development process. Fred George, the inventor of the idea, implemented it in a couple of companies. There was also a big buzz about how GitHub works with no managers (or rather with everyone being a manager).

These are great examples to read and think about. There are different opinions about this philosophy. Certainly, developing a culture and process that leaves all decisions to developers requires courage, time, money, and a certain kind of people in your team. You have to think very carefully before applying developer anarchy as a day-to-day rule.

We asked ourselves whether there was anything we could do without changing our processes or getting rid of our managers, while still gaining inspiration from the concepts of developer anarchy. We reckoned we could, and Developer Anarchy Days were born!

How to start

Introducing Developer Anarchy Days required very little preparation and few changes in our organization. No planning or product management was required beforehand. We did have some discussions prior to the event on whether it should be a spontaneously picked day or a planned and scheduled one. We decided on a mix of both. Team members would get a ‘warning’ email some days in advance so that they could start thinking about it, but the actual day was a surprise.

Is it suitable for my team?

The concept is very lightweight and open to interpretation. The premise is simple: give your developers a day without a backlog or predefined tasks and let their creativity take over. This method benefits whatever team composition you may have. Less experienced developers get a chance to expand their skills and their self-confidence as they gain experience in owning and delivering something in a short time frame. More experienced developers get a chance to try out some new technologies they’ve been itching to experiment with. Pairing is always an option (and encouraged), so that there is someone to help and learn from.

What if the team is not an agile team at all? Well, that’s actually a great opportunity to taste a bit of agility. What can be more agile than delivering in just one day?

Isn’t it a waste of time?

It depends on how you define wasted time. If you see it as any time not spent directly on delivering pre-defined business requirements/stories, then yes, it is wasted time. You could say the same about avoiding technical debt, working on chores, organizing meetings, or playing ping-pong after lunch. As with any other culture-related thing, it is hard to say. You may waste one day on building software no one will ever look at again. On the other hand, you may learn something, make developers more motivated, invent internal tools that improve efficiency, and even develop some great new innovations to help achieve business goals.

Is one day enough time to develop something?

Yes and no. It’s probably not enough time to develop something production-ready, but that’s not the intention. It’s more about trying something new, developing a prototype, creating a small internal tool, or just presenting new ideas to the team. For that, we’ve found that one day is enough.

You can make it longer and spend a couple of days on building small projects in small teams. This may work better for more complex and usable projects, but it also requires more preparation, such as planning around ongoing project roadmaps and probably announcing the event earlier so everyone can prepare potential ideas for the projects.

Isn’t Developer Anarchy Day just another name for a hackathon?

Developer Anarchy Days have a lot in common with hackathons, hackfests, codefests, or hack-days. They’re all about intensively developing and presenting creative ideas. The main difference is that hackathons are usually bigger events in the form of a competition, very often involving participants from outside of the company. They require proper event organization, including marketing, a proper venue, food, and infrastructure. Usually, the promotional aspect is very important. You don’t need all this to organize a Developer Anarchy Day.

What can engineering teams get from this?

  • Developers show that they are able to make decisions and explore creative ideas
  • Engineers get a chance to come up with ideas from a technological perspective – something that businesses may sometimes miss
  • Developers feel more motivated, because they are doing something of their own
  • Developers experience what it is like to not only deliver something on time but also limit the project to something they can show and sell to others
  • Developers can feel like a product manager and understand that job better
  • The event breaks the routine of everyday (sometimes monotonous) deliveries
  • The event gives everyone an opportunity to finally do stuff that we thought would be nice, but doesn’t bring any direct or indirect business value (e.g. internal tools)
  • Finally, the event allows time to try some new technology or crazy idea!

How did it end up?

OK, let’s go back to Shutl and our very first Developer Anarchy Day. It was a busy day, but definitely a fun one. Everyone felt responsible for finishing what they began on time. After all, we all had to present something. Some of us were pairing; some decided to give it a go by themselves. Although we love pairing, it is good to get away from it from time to time.

First thing the next morning, we presented our work. The variety and creativity of our little projects was beyond all expectations! Here are a couple of examples.

“Easy login”

As Shutl has a service-oriented architecture, our everyday work (as everyone’s DevOps) involves logging into multiple boxes. One of our engineers spent Developer Anarchy Day building a super useful command line tool that automates the process of logging in to specific environments without having to ssh into multiple boxes and remember server names. We’ve used it every day since, making our lives easier.

“Shutl mood”

Every day we gather lots of feedback from our customers. The stars they give in their reviews, though, are a bit impersonal. You can learn much more by analyzing the language of the feedback comments. A pair of Shutl developers spent a day building a language sentiment analyzer that allowed us to get a sense of the general mood of our customers, based on the words they used.

“Immutable deployments”

Another Shutl engineer decided to be more DevOps for that day. He experimented with some new tools and demonstrated immutable deployments with CloudFormation and Chef.

“Predefined order”

Looking for common or possible use cases of our services, we realized that it would be really convenient to use Shutl to pick up and deliver items sent by private sellers on Gumtree or eBay. We have Shutl.it, which allows customers to deliver items from point A to B. The idea was to create a shareable link that pre-fills Shutl.it with pick-up information so any retailer or private seller can offer Shutl as an easy delivery option.

We definitely had fun and learned something. In fact, we now use “Easy login” every day, and “Predefined order” inspired some things on our roadmap.

What was the feedback from developers?

No surprise here. It was genuinely positive. What can be better for us nine-to-five workers than a little bit of anarchy, especially when it lasts only one day, after which we quickly revert to the comfort and security of a prioritized backlog and product management? We all agreed that we want to repeat anarchy on a regular basis. And we do. It has become an important part of our work culture.

Announcing Pulsar: Real-time Analytics at Scale

We are happy to announce Pulsar – an open-source, real-time analytics platform and stream processing framework. Pulsar can be used to collect and process user and business events in real time, providing key insights and enabling systems to react to user activities within seconds. In addition to real-time sessionization and multi-dimensional metrics aggregation over time windows, Pulsar uses a SQL-like event processing language to offer custom stream creation through data enrichment, mutation, and filtering. Pulsar scales to a million events per second with high availability. It can be easily integrated with metrics stores like Cassandra and Druid.

pulsar_logo

Why Pulsar

eBay provides a platform that enables millions of buyers and sellers to conduct commerce transactions. To help optimize eBay end users’ experience, we perform analysis of user interactions and behaviors. Over the past years, batch-oriented data platforms like Hadoop have been used successfully for user behavior analytics. More recently, we have newer use cases that demand collection and processing of vast numbers of events in near real time (within seconds), in order to derive actionable insights and generate signals for immediate action. Here are examples of such use cases:

  • Real-time reporting and dashboards
  • Business activity monitoring
  • Personalization
  • Marketing and advertising
  • Fraud and bot detection

We identified a set of systemic qualities that are important to support these large-scale, real-time analytics use cases:

  • Scalability - Scaling to millions of events per second
  • Latency - Sub-second event processing and delivery
  • Availability - No cluster downtime during software upgrades, stream processing rule updates, and topology changes
  • Flexibility - Ease in defining and changing processing logic, event routing, and pipeline topology
  • Productivity - Support for complex event processing (CEP) and a 4GL language for data filtering, mutation, aggregation, and stateful processing
  • Data accuracy - 99.9% data delivery
  • Cloud deployability - Node distribution across data centers using standard cloud infrastructure

Given our unique set of requirements, we decided to develop our own distributed CEP framework. Pulsar CEP provides a Java-based framework as well as tooling to build, deploy, and manage CEP applications in a cloud environment. Pulsar CEP includes the following capabilities:

  • Declarative definition of processing logic in SQL
  • Hot deployment of SQL without restarting applications
  • Annotation plugin framework to extend SQL functionality
  • Pipeline flow routing using SQL
  • Dynamic creation of stream affinity using SQL
  • Declarative pipeline stitching using Spring IOC, thereby enabling dynamic topology changes at runtime
  • Clustering with elastic scaling
  • Cloud deployment
  • Publish-subscribe messaging with both push and pull models
  • Additional CEP capabilities through Esper integration

On top of this CEP framework, we implemented a real-time analytics data pipeline.

Pulsar real-time analytics pipeline

Pulsar's real-time analytics data pipeline consists of loosely coupled stages. Each stage is functionally separate from its neighboring stage. Events are transported asynchronously across a pipeline of these loosely coupled stages. This model provides higher reliability and scalability. Each stage can be built and operated independently from its neighboring stages, and can adopt its own deployment and release cycles. The topology can be changed without restarting the cluster.

pulsar_pipeline

Here is some of the processing we perform in our real-time analytics pipeline:

  • Enrichment - Decorate events with additional attributes. For example, we can add geo location information to user interaction events based on the IP address range.
  • Filtering and mutation - Filter out irrelevant attributes and events, or transform the content of an event.
  • Aggregation - Count the number of events, or add up metrics along a set of dimensions over a time window.
  • Stateful processing - Group multiple events into one, or generate a new event based on a sequence of events and processing rules. An example is our sessionization stage, which tracks user session-based metrics by grouping a sequence of user interaction events into web sessions.
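
To make the stateful processing item more concrete, here is a minimal Python sketch of sessionization. It is not Pulsar's implementation: the `Session` class, the `sessionize` function, and the 30-minute inactivity timeout are all assumptions chosen for illustration, and events are assumed to arrive ordered by timestamp.

```python
from dataclasses import dataclass

# Assumed inactivity timeout for illustration; not a Pulsar default.
SESSION_GAP_SECONDS = 30 * 60


@dataclass
class Session:
    user_id: str
    start: float
    last_seen: float
    event_count: int = 0


def sessionize(events, gap=SESSION_GAP_SECONDS):
    """Group (user_id, timestamp) events into per-user sessions.

    A new session starts when a user's event arrives more than `gap`
    seconds after that user's previous event.
    """
    open_sessions = {}  # user_id -> currently open Session
    closed = []
    for user_id, ts in events:
        current = open_sessions.get(user_id)
        if current is not None and ts - current.last_seen > gap:
            # The user was idle for too long: close the old session.
            closed.append(current)
            current = None
        if current is None:
            current = Session(user_id, start=ts, last_seen=ts)
            open_sessions[user_id] = current
        current.last_seen = ts
        current.event_count += 1
    # Flush sessions that are still open at the end of the input.
    closed.extend(open_sessions.values())
    return closed
```

Each resulting `Session` plays the role of the single summarized event generated from a sequence of user interaction events.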

The Pulsar pipeline can be integrated with different systems. For example, summarized events can be sent to a persistent metrics store to support ad-hoc queries. Events can also be sent to some form of visualization dashboard for real-time reporting, or to backend systems that can react to event signals.

A taste of complex event processing

In Pulsar, our approach is to treat the event stream like a database table. We apply SQL queries and annotations on live streams to extract summary data as events are moving.

The following are a few examples of how common processing can be expressed in Pulsar.

Event filtering and routing

insert into SUBSTREAM select D1, D2, D3, D4
from RAWSTREAM where D1 = 2045573 or D2 = 2047936 or D3 = 2051457 or D4 = 2053742; // filtering
@PublishOn(topics="TOPIC1")   // publish sub stream at TOPIC1
@OutputTo("OutboundMessageChannel")
@ClusterAffinityTag(column = D1);    // partition key based on column D1
select * FROM SUBSTREAM;
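
For readers less familiar with SQL-style stream processing, the filtering statement above can be read roughly as the following Python sketch. This is only an analogy under assumptions: events are modeled as plain dictionaries, `filter_and_route` is a hypothetical name, and publishing to a topic or channel is reduced to yielding a partition key with each event.

```python
# Values each column is matched against, copied from the query above.
WATCHED = {"D1": 2045573, "D2": 2047936, "D3": 2051457, "D4": 2053742}


def filter_and_route(raw_events):
    """Yield (partition_key, event) for events matching any watched value.

    Only columns D1..D4 are kept, mirroring the select list, and D1 is
    used as the partition key, mirroring the cluster affinity annotation.
    """
    for event in raw_events:
        if any(event.get(col) == val for col, val in WATCHED.items()):
            projected = {col: event.get(col) for col in WATCHED}
            yield projected["D1"], projected
```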

Aggregate computation

// create 10-second time window context
create context MCContext start @now end pattern [timer:interval(10)];
// aggregate event count along dimension D1 and D2 within specified time window
context MCContext insert into AGGREGATE select count(*) as METRIC1, D1, D2 FROM RAWSTREAM group by D1,D2 output snapshot when terminated;
select * from AGGREGATE;
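
As a rough mental model of this aggregation, the Python sketch below counts events per (D1, D2) pair in fixed 10-second windows. It is an illustration only: `aggregate_counts` is a hypothetical helper, events are assumed to be `(timestamp, D1, D2)` tuples, and windows are aligned to multiples of the window size rather than to the moment the context starts, as they would be in the query above.

```python
from collections import Counter, defaultdict


def aggregate_counts(events, window_seconds=10):
    """Count events per (D1, D2) pair in non-overlapping time windows.

    Returns a dict mapping each window start time to its per-pair counts.
    """
    windows = defaultdict(Counter)
    for ts, d1, d2 in events:
        # Assign the event to the window that contains its timestamp.
        window_start = int(ts // window_seconds) * window_seconds
        windows[window_start][(d1, d2)] += 1
    return {start: dict(counts) for start, counts in sorted(windows.items())}
```

A streaming engine emits each window's snapshot when the window ends; this batch version simply returns all windows at once.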

TopN computation

// create 60-second time window context
create context MCContext start @now end pattern [timer:interval(60)];
// sort to find top 10 event counts along dimensions D1, D2, and D3
// within specified time window
context MCContext insert into TOPITEMS select count(*) as totalCount, D1, D2, D3 from RawEventStream group by D1, D2, D3 order by count(*) desc limit 10;
select * from TOPITEMS;
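
The same idea for a single time window can be sketched in a few lines of Python. The `top_n` helper and the `(D1, D2, D3)` tuple shape are assumptions for illustration; the sketch covers only the counting and ranking, not the windowing.

```python
from collections import Counter


def top_n(events, n=10):
    """Return the n most frequent (D1, D2, D3) combinations with their counts.

    `events` is assumed to hold the tuples seen within one time window.
    """
    counts = Counter((d1, d2, d3) for d1, d2, d3 in events)
    # most_common sorts by count, highest first.
    return counts.most_common(n)
```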

Pulsar deployment architecture

pulsar_deployment

Pulsar CEP processing logic is deployed on many nodes (CEP cells) across data centers. Each CEP cell is configured with an inbound channel, outbound channel, and processing logic. Events are typically partitioned based on a key such as user id. All events with the same partitioned key are routed to the same CEP cell. In each stage, events can be partitioned based on a different key, enabling aggregation across multiple dimensions. To scale to more events, we just need to add more CEP cells into the pipeline. Using Apache ZooKeeper, Pulsar CEP automatically detects the new cell and rebalances the event traffic. Similarly, if a CEP cell goes down, Pulsar CEP will reroute traffic to other nodes.
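
One common way to keep events with the same key on the same node while limiting reshuffling when nodes join or leave is a consistent hash ring. The sketch below illustrates that general technique in Python; it is an assumption about how such routing could work, not a description of Pulsar's internals. The `CellRing` class and its method names are hypothetical, and cluster discovery (handled by ZooKeeper in Pulsar) is reduced to explicit `add_cell` and `remove_cell` calls.

```python
import bisect
import hashlib


class CellRing:
    """Consistent hash ring mapping partition keys to CEP cells."""

    def __init__(self, cells, replicas=100):
        self._replicas = replicas  # virtual points per cell, for smoother balance
        self._ring = []            # sorted list of (hash, cell) points
        for cell in cells:
            self.add_cell(cell)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add_cell(self, cell):
        for i in range(self._replicas):
            bisect.insort(self._ring, (self._hash(f"{cell}#{i}"), cell))

    def remove_cell(self, cell):
        self._ring = [point for point in self._ring if point[1] != cell]

    def cell_for(self, key):
        """Return the cell owning the first ring point at or after the key's hash."""
        if not self._ring:
            raise LookupError("no cells available")
        idx = bisect.bisect_left(self._ring, (self._hash(key),))
        return self._ring[idx % len(self._ring)][1]
```

With this scheme, removing a cell only moves the keys that cell owned, and adding a cell only moves keys onto the new cell, which is one way to get the rebalancing behavior described above.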

Pulsar CEP supports multiple messaging models to move events between stages. For low delivery latency, we recommend the push model when events are sent from a producer to a consumer with at-most-once delivery semantics. If a consumer goes down or cannot keep up with the event traffic, it can signal the producer to temporarily push the event into a persistent queue like Kafka; subsequently, the events can be replayed. Pulsar CEP can also be configured to support the pull model with at-least-once delivery semantics. In this case, all events will be written into Kafka, and a consumer will pull from Kafka.
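
The push model with a fallback queue can be pictured with the following Python sketch. Everything in it is a simplification under assumptions: `Consumer` and `Producer` are hypothetical classes, an in-memory `deque` stands in for a persistent queue such as Kafka, and the back-pressure signal is reduced to a boolean return value.

```python
from collections import deque


class Consumer:
    def __init__(self):
        self.available = True  # False simulates a down or overloaded consumer
        self.received = []

    def accept(self, event):
        """Return True if the event was taken, False to signal back-pressure."""
        if not self.available:
            return False
        self.received.append(event)
        return True


class Producer:
    def __init__(self, consumer):
        self.consumer = consumer
        self.spill_queue = deque()  # stand-in for a persistent queue

    def push(self, event):
        # Push directly; if the consumer refuses, park the event for replay.
        if not self.consumer.accept(event):
            self.spill_queue.append(event)

    def replay(self):
        """Re-send parked events in order, stopping at the first refusal."""
        replayed = 0
        while self.spill_queue and self.consumer.accept(self.spill_queue[0]):
            self.spill_queue.popleft()
            replayed += 1
        return replayed
```

Note that replayed events may arrive after newer events that were pushed directly, so a design like this trades strict ordering for low latency on the normal path.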

What’s next

Pulsar has been deployed in production at eBay and is processing all user behavior events. We have open-sourced the Pulsar code, we plan to continue to develop the code in the open, and we welcome everyone’s contributions. Below are some features we are working on. We would love to get your help and suggestions.

  • Real-time reporting API and dashboard
  • Integration with Druid or other metrics stores
  • Persistent session store integration
  • Support for long rolling-window aggregation

Please visit http://gopulsar.io for source code, documentation, and more information.