
The tool for unexpected failures is statistics, hence the name for this post. A statistical anomaly detector hunts for things that seem off and then sends an alert. The power of statistical anomalies is that they can detect novel problems. The downside is that they must be followed up to track down the root cause. All this might seem abstract, but I will give a concrete example below.

To avoid confusion, I find it helps to carefully define some terminology:

- **Monitored Signals**: Results of continuous scans that look for trouble.
- **Disruption**: Behavior indicating that the site is not working as designed. A human will need to investigate to find the root cause.
- **Anomaly**: Unusual behavior in the monitored signals, suggesting a disruption.
- **Alert**: Automatic signal that an anomaly has occurred. It is common to send an alert after detecting an anomaly.

The obstacle to building useful statistical detectors is *false positives*: alerts that don’t correspond to a disruption. If the anomaly detector has too many false positives, users will turn off alerts, direct them to a spam folder, or just generally ignore them. An ignored detector is not very useful.

In this post I will describe a statistical anomaly detector in use at eBay; my paper at the recent USENIX HotCloud 2015 workshop has more details. The monitored signals in our search system are based on the search results from a set of reference queries that are repeatedly issued. A statistical anomaly detector needs a way to encode the results of the monitoring as a set of numbers. We do this by computing *metrics* on the results. For each reference query, we compute about 50 metrics summarizing the items returned by the query. Two examples of metrics are the number of items returned and the median price of the returned items. There are 3000 reference queries and 50 metrics, therefore 150,000 numbers total. Currently, reference queries are issued every 4 hours, or 6 times a day, so there are 900,000 numbers per day. In these days of terabyte databases, this is very small. But the problem of sifting through those numbers to find anomalies, and do it with a low false-positive rate, is challenging.

I’ll outline our approach using diagrams, and refer you to my HotCloud paper for details. First, here is a picture of the monitored signals—that is, the metrics collected:

Each (query, metric) pair is a number that can be tracked over time, resulting in a time series. That’s 150,000 time series, and it’s reasonable to expect that during every 4-hour collection window, at least one of them will appear anomalous just by chance. So alerting on each time series is no good, because it will result in many false positives.

Our approach is to aggregate, and it starts with something very simple: examining each time series and computing the deviation between the most recent value and what you might have expected by extrapolating previous values. I call this the *surprise*, where the bigger the deviation the more the surprise. Here’s a figure illustrating that there is a surprise for each (query, metric, time) triple.
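To make the idea concrete, here is a small sketch of a surprise score (my own illustration; the HotCloud paper defines surprise precisely, and this robust-deviation score is just one plausible instantiation):

```python
import statistics

def surprise(history, latest):
    # One plausible surprise score (illustrative; not the paper's exact
    # definition): distance of the latest value from a robust forecast of
    # the recent history, scaled by the history's typical spread.
    forecast = statistics.median(history)
    spread = statistics.median([abs(h - forecast) for h in history]) or 1.0
    return abs(latest - forecast) / spread

series = [100, 102, 99, 101, 100, 98]   # one (query, metric) time series
print(surprise(series, 101))  # small surprise: close to recent behavior
print(surprise(series, 250))  # large surprise: a big deviation
```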

The idea for our anomaly detector is this: at each collection time *T* we expect a few (query, metric, *T*) triples to have a large surprise. We signal an alert if an unusually high number of triples have a large surprise. To make this idea quantitative, fix a metric, look at the surprise for all 3000 queries, and compute the 90th percentile of surprise at time *T*: *S*_{90}(*T*).

This gives a new time series in *T*, one for each metric. Hypothetical time series for the first two metrics are illustrated below.

We’ve gone from 150,000 time series down to 50. Aggregation like this is a very useful technique in anomaly detection.
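A sketch of the aggregation step (names and numbers are illustrative): collapse the per-query surprise values for one metric at collection time *T* into a single 90th-percentile number.

```python
def percentile_90(values):
    # Simple nearest-rank-style 90th percentile, good enough for a sketch.
    ordered = sorted(values)
    return ordered[int(0.9 * (len(ordered) - 1))]

# Surprise values at time T for ten hypothetical queries, per metric
# (the real system has 3000 queries and 50 metrics).
surprises = {
    "item_count":   [0.1, 0.4, 0.3, 9.0, 0.2, 0.5, 0.3, 0.2, 0.4, 0.1],
    "median_price": [0.2, 0.1, 0.3, 0.2, 0.4, 0.3, 0.1, 0.2, 0.3, 0.2],
}
s90 = {metric: percentile_90(vals) for metric, vals in surprises.items()}
# Note how one wildly surprised query (9.0) barely moves S90: it takes many
# simultaneously surprised queries to push the 90th percentile up.
```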

Our final anomaly detector uses a simple test on this aggregated time series. We define an anomaly to occur when the current value of any of the 50 series is more than 3σ from the median of that series. Here’s an example using eBay data with the metric median sale price. There is clearly an anomaly at time *T*=88.
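A minimal sketch of that rule (illustrative data; σ here is simply the standard deviation of the series):

```python
import statistics

def is_anomaly(series, current):
    # Flag an anomaly when the current value of an aggregated S90 series
    # sits more than 3 sigma from the series' median.
    med = statistics.median(series)
    sigma = statistics.pstdev(series)
    return abs(current - med) > 3 * sigma

s90_history = [0.5, 0.6, 0.55, 0.52, 0.58, 0.54, 0.57, 0.53]
print(is_anomaly(s90_history, 0.56))  # within normal range
print(is_anomaly(s90_history, 5.0))   # far outside: anomaly
```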

For more details, see my previously cited HotCloud paper, The Importance of Features for Statistical Anomaly Detection.


No.

Well, perhaps, but it can be the lesser of two evils.

It depends.

**Disclaimer** – these thoughts reflect my personal opinions and not a consensus of how the entire Shutl engineering team at eBay feels.

As a software engineer, I’ve always had it drilled into my head that duplication is a bad thing, and teams I’ve been a part of have always worked diligently to reduce all forms of duplication they see. As a general rule, this makes a lot of sense; I can still remember reading through Martin Fowler’s Refactoring while I was at university and the excitement of discovering design patterns.

Good developers are precise and careful and understand the risks and pitfalls of duplication in a code base—see the DRY (don’t repeat yourself) principle, from The Pragmatic Programmer:

“The alternative is to have the same thing expressed in two or more places. If you change one, you have to remember to change the others, or your program will be brought to its knees by a contradiction. It isn’t a question of whether you’ll remember: it’s a question of when you’ll forget.”

Which makes perfect sense. It’s time well spent when you try to make your code streamlined and readable. You’ll end up with a cleaner, easier-to-maintain, and more extensible code base as a result. I briefly mention this point here just to give some context, but it’s been covered many times and I’m assuming that we’re all on board: Duplication is a code smell, and removing it is usually a good thing.

However, like most things, I believe this advice starts getting you into trouble if you take it to an extreme. Advice, just like code, does not live in a vacuum; context is king. For example, when it comes to unit tests, sometimes the cost introduced by removing duplication can be greater than the benefits gained. What I plan to show you in this post are a few examples of where I would leave duplication in place and how this duplication has, in my mind, led to cleaner and more readable code.

*(image source: https://www.flickr.com/photos/pokerbrit/9455644074)*

I was trying to think up an appropriate term for a thought I had in my head and settled on Peacock Programmers.

*[Edit – I’ve discovered that an alternative name might already exist for this (and here I thought I was being original) – The Magpie Developer ]*

A peacock programmer is someone who loves to use every tool in the arsenal and for various reasons attempts to decorate code with every possible feature that is available to him or her.

You can recognize the work of a peacock programmer through inelegant attempts at elegance. It’s wordy, flashy, complicated looking, and almost impossible to understand without deep, concentrated effort and untangling. Sure, it’s fun to write, and it’s a blast to experiment and learn all the different possible ways of solving a problem, but that kind of exploration belongs in an academic setting. When you see it in a production code base, it can be frustrating.

While talking through the concept of peacock programmers over lunch in the office, a colleague, Kai, raised an interesting point. The term sounds rather negative, and in truth my description in the previous section was slanted that way. However, Kai points out that it is only through the efforts of these peacock programmers and their excessive use of all these features that the rest of us learn where the limits of reasonable use are. It’s a common pattern with many technologies, such as meta-programming. The first stage is to learn the technology. The next stage is to get excited by this new knowledge and attempt to use it everywhere. It’s then, by seeing it used excessively and understanding the consequences, that you learn to regulate its use and ensure it’s appropriate.

As Kai sees it, peacock programmers are the early adopters that the rest of us ultimately learn moderation from, and their nature is an essential part of our learning.

The focus of this blog post is a real unit test from a live code base that I’ve worked on. I’ve copied the original test below** as an example that the rest of this post will focus on. There were some more interesting examples to pick from, but they started to get a little unwieldy for blogging purposes, so I’ve picked a simpler one. I’ll add some notes at the end on other issues I found in the larger tests. I’ll also not concern myself with the actual details of what is being tested; whether the testing is appropriate or well-formed is not the focus here, but rather the overall structure of the test class.

** *Disclaimer – the code and the style have been preserved exactly as they were. I’ve merely renamed classes/methods for anonymity.*

```ruby
require File.expand_path('spec/spec_helper')

describe NewFarm do
  let(:valid_params) { {size: "large", colour: "#E0E0EB", breed: "Greyface Dartmoor", user_id: 1321} }
  let(:expected_attributes) do
    { size: "large", colour: "#E0E0EB", breed: "Greyface Dartmoor", user_id: 1321,
      fake_sheep: nil, session_id: "a uuid", value: 0 }
  end
  let(:service) do
    double 'service'
  end
  let(:sheep_collection) { double("sheep collection", id: "an id", valid?: true) }

  before do
    service.stub(:generate_sheep).and_return(sheep_collection)
    SecureRandom.stub(:uuid).and_return("a uuid")
  end

  let(:new_farm) { NewFarm.new service, valid_params }
  subject { new_farm }

  describe "#formatted_base_errors" do
    let(:sheep_collection) { double("sheep collection", id: "an id", valid?: false, errors: {base: errors}) }

    before do
      new_farm.save
    end

    context "when no errors are present on the sheep_collection" do
      let(:errors) { [] }
      its(:formatted_base_errors) { should == "" }
    end

    context "when errors are present on the sheep_collection" do
      let(:errors) { ["no wool", "pretty sure it's a disguised wolf"] }
      its(:formatted_base_errors) { should == "no wool, pretty sure it's a disguised wolf" }
    end
  end

  describe "#save" do
    context "success" do
      it "generates sheep" do
        service.should_receive(:generate_sheep).
          with(expected_attributes).
          and_return(sheep_collection)
        new_farm.save
      end

      it "persists the farm" do
        new_farm.save
        Farm.last.uuid.should == "a uuid"
      end

      it "associates the sheep collection with the farm" do
        SecureRandom.stub(:uuid).and_return("a uuid")
        service.should_receive(:generate_sheep).
          with(expected_attributes).
          and_return(sheep_collection)
        new_farm.save
        Farm.last.sheep_collection_id.should == "an id"
      end

      its(:save) { should be_true }
    end

    context "when colour is wrong for breed" do
      let(:params) { {size: "newborn", colour: "#ff69b4", breed: "Badger Face Welsh Mountain"}.with_indifferent_access }
      let(:new_farm) { NewFarm.new service, params }

      its(:save) { should == false }

      it "puts an error on the colour" do
        new_farm.save
        new_farm.errors[:color].should == ["Badger Face Welsh Mountain sheep are not #ff69b4"]
      end
    end

    context "when input is missing" do
      let(:new_farm) { NewFarm.new service }

      its(:save) { should == false }

      it "doesn't call the service" do
        service.should_not_receive(:generate_sheep)
        new_farm.save
      end

      it "doesn't create a farm" do
        expect { new_farm.save }.not_to change{ Farm.count }
      end

      it "adds errors for each field" do
        new_farm.save
        [:size, :colour, :breed].each do |attr|
          new_farm.should have(1).error_on attr
        end
      end
    end

    context "when service returns errors" do
      let(:sheep_collection) do
        double 'sheep_collection', {"valid?" => false,
                                    "errors" => { "size" => ["oh noes"],
                                                  "base" => ["A base error", "Is something else"] } }
      end

      before do
        service.should_receive(:generate_sheep).and_return(sheep_collection)
      end

      it "puts the errors on itself" do
        new_farm.save
        new_farm.errors[:size].should == ["oh noes"]
        new_farm.errors[:base].should =~ ["A base error", "Is something else"]
      end

      it "returns false" do
        new_farm.save.should be_false
      end
    end
  end

  context "#farm_id" do
    it "returns the farm uuid" do
      new_farm = NewFarm.new service, valid_params
      new_farm.save
      new_farm.farm_id.should == Farm.last.uuid
    end
  end
end
```

This test was typical for the code base, and was seen by many as a well-written one. In some ways it is: the tests are small, they’re contained, and they test one thing each.

Yet I’ve always been frustrated when I see code like this. I’m a fan of simplicity and try to avoid, as much as I can, the use of superfluous language features. Maybe I’m just stupid (this is quite possible) but I look at a test like this and I really struggle to understand what’s going on. Also important to note is that I’ve picked one of the simpler samples for this post.

We ran an analysis on a small set of our tests (the tests for one of our services) to figure out the worst culprits for nested describes/contexts. Our record was **seven**. That’s right, seven layers of nested description before you finally get to the test body. Good luck trying to figure out what’s actually being tested and finding the relevant bits of setup!

Nested describes/contexts can potentially be a way of helping structure your tests, but give it a moment’s thought before you wrap that spec in a context and ask yourself “why am I doing this?” **Especially** when you’re creating a context with just one test inside it, why not just make the test name more descriptive? In isolation, the idea of being able to use contexts to help structure and add meaning to your tests is a good one, and I’ve made good use of contexts. But use them sparingly, lift your head regularly from the depths of your spec, and take a look at the whole test file and ask yourself “how’s it looking?” Once you start going more than a couple of levels deep, perhaps it’s time to rethink and ask yourself why your test is so complicated.

e.g., instead of this:

require "spec_helper" context "when something interesting happens" it "behaves like this" do ... end end

…how about this:

require "spec_helper" it "behaves like this when something interesting happens" do ... end

**Bonus:**

Here’s an example I found from a code base (again, with the names obfuscated; my only replacements are farm and sheep) that I’ve worked on, with six layers (I couldn’t find the record-breaking seven-layer spec):

```ruby
describe Farm::UpdateSheep do
  ...
  describe '#from_farm' do
    ...
    context 'saving from farm' do
      ...
      describe 'formatting for the api' do
        ...
        context 'with a nulled date' do
          ...
          it 'submits an empty date' do
            ...
```

Now imagine the message you would see when this test fails…

```
Farm::UpdateSheep #from_farm saving from farm formatting for the api with a nulled date submits an empty date
```

Another feature that I think is worryingly overused is ‘let’. Now, I understand that let is lazily evaluated and can improve performance, but we’re talking about unit tests here; running a suite should take on the order of seconds. What I’m more concerned about is the time wasted trying to decipher a test class that has ‘let’ blocks scattered all over.

Again, my biggest issue with them comes from misuse and misdirection. I have seen many simple unit test classes that have been made so much more verbose and complex through the addition of unnecessary ‘let’ blocks. Their introduction led to an explosion in the scope of a test. No longer can I look at a self-contained ‘it’ block and understand the test—now I need to scroll all around, trying to piece together the nested contexts and follow the chain of ‘let’ blocks to see which data is being set up, and then see which of these many lets my test is overriding… I’m getting anxious just thinking about it (more thoughts on this, and an alternative structure, can be found in the **When, Then, Given** section below).

‘subject’ is probably my least-favorite feature. I can’t think of a single time in my own experience when I’ve thought that using ‘subject’ was a good idea, and I am really confused as to why people are propagating its use. It just further reduces readability and adds pointless misdirection to your tests. When I first experienced the joys of RSpec after a stint working in Java, I loved the expressiveness and the flow of the test structure. But ‘subject’ feels like a step backwards.

It gets even more confusing when your spec is testing something that is happening in the initialization of the class under test. Note: bonus point for combining subject with described_class.

```ruby
subject { described_class.new }

...

# way way down in the file
it "does something" do
  Sheep.should_receive(:is_disguised_wolf?)
  ...
  subject
end
```

We came across a test like this by chance while working on a class, and my pairing partner saw the trailing subject and assumed it was some mistake as it looked like a pretty meaningless line accidentally added at the end of a long test. My partner deleted this line of code without giving it much thought, and then a few minutes later when we ran the test, it naturally started failing.

My advice: ditch the pointless subject—new is not always better. Make your tests explicit. I’d much rather see an appropriately named variable that is instantiated within the test body where it’s used as opposed to a cryptic ‘subject’ that was defined way up near the top of the test (or somewhere in the nest of contexts I had to traverse to reach the test).

I’m sure you’re starting to see a pattern here. Most of my annoyances are born from testing styles that increase the spread of focus required to understand what is happening. The idea of a unit test is that it tests a unit. I expect them to be concise and self-contained.

The way I try to achieve this structure is by following the “**When, Then, Given**” pattern. I’m assuming that you’re going to be familiar with the BDD given/when/then (or the synonymous arrange/act/assert) philosophy of test structure. When I write a unit test, I start by mentally picturing these as three distinct sections of my test. At times, when trying to make this pattern explicit or to share it with a new team. I explicitly express the sections through comments:

it "does something" do # Given ... # When ... #Then ... end

So far so good. However, the mistake people make from here is to start working through the sections in order, filling them in. I wouldn’t do that. Start with the **When**. This should be one line, and the most obvious/easiest to write. When I look at a test written in this format, I can immediately recognize the trigger for the scenario.

Next, move on to the **Then**. This is the next easiest section to fill in. What is the expected outcome?

And only then should you go back and complete the **Given**. In order to achieve this result, what data is required? And **keep everything in the test!** Again, this advice is flexible; I can understand there are times when you’d want to pull things out. But I try to resist this urge as much as possible (in the rewritten version of the test at the end of this post, you’ll notice I did use one ‘let’ block). Just keep the exceptions to this practice in check, or else a few months later your test will have grown into a mess. Keep the discipline.
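Here’s what that discipline can look like in practice. This is a toy analogue in Python/pytest style rather than RSpec (the classes below are invented stand-ins, not the real code), but the shape is the point: the When on one line, the Then right after it, and every piece of Given data visible inside the test.

```python
class FakeSheepService:
    # Invented stand-in for the real service dependency.
    def generate_sheep(self, attrs):
        return {"valid": True, "id": "an id"}

class NewFarm:
    # Invented stand-in mirroring the example under discussion.
    ALLOWED_COLOURS = {"Badger Face Welsh Mountain": {"#000000"}}

    def __init__(self, service, params):
        self.service, self.params, self.errors = service, params, {}

    def save(self):
        breed, colour = self.params["breed"], self.params["colour"]
        if colour not in self.ALLOWED_COLOURS.get(breed, {colour}):
            self.errors["colour"] = [f"{breed} sheep are not {colour}"]
            return False
        return True

def test_puts_an_error_on_the_colour_when_wrong_for_the_breed():
    # Given: everything the scenario needs, right here in the test
    farm = NewFarm(FakeSheepService(), {"size": "newborn",
                                        "colour": "#ff69b4",
                                        "breed": "Badger Face Welsh Mountain"})
    # When
    saved = farm.save()
    # Then
    assert saved is False
    assert farm.errors["colour"] == [
        "Badger Face Welsh Mountain sheep are not #ff69b4"]
```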

This point brings us back to another reason I hate ‘let’ blocks. You can no longer read a spec and understand why the class under test is behaving the way it does. The data required to achieve its result is now spread all around. In addition, while you’re following all these ‘let’ blocks to see what your test is doing, you’re chasing a load of red herrings, as much of this data is entirely irrelevant to the test’s specific scenario.

I couldn’t find a small enough sample to include as a case study of this point, but shared examples are just a nightmare! Imagine all the trouble of going back and forth across a long test file trying to make heads or tails of it, and then multiply that by 10. Just avoid.

Back to our case study: I’ve attempted to rewrite a number of tests using the style I’m recommending, and the first time I tried, I truly did expect there to be more lines of code. I was OK with that; I was willing to accept that as a trade-off for the increased readability and comprehensibility of the tests.

The original test was 162 lines of code.

The rewrite…101.

Technically speaking there is more code and the lines are longer, **but** there are fewer of them, and they convey more meaning. I’ve seen the same result in almost every test I’ve rewritten, and I was surprised.

So here’s the finished result. This is the above test as I would have written it. It’s not perfect and perhaps it’s just me, but I find this style so much easier to comprehend and get my head around.

require "spec_helper" describe NewFarm do let(:valid_params) { {size: "large", colour: "#E0E0EB", breed: "Greyface Dartmoor", user_id: 1321} } it 'only returns base errors when there are no other errors on the sheep' do sheep_collection = double(Service::SheepCollection, valid?: false, errors: {base: ["no wool", "pretty sure it's a disguised wolf"]}) service = double("service", generate_sheep: sheep_collection) new_farm = NewFarm.new(service, valid_params) new_farm.save.should be_false new_farm.formatted_base_errors.should == "no wool, pretty sure it's a disguised wolf" end it 'has no formatted base errors when there are both errors on base as well as other errors on the farm' do sheep_collection = double(Service::SheepCollection, valid?: false, errors: {base: ["farm errors should take priority"]}) service = double("service", generate_sheep: sheep_collection) new_farm = NewFarm.new(service, {breed: nil}) new_farm.save.should be_false new_farm.formatted_base_errors.should be_nil end it 'returns nil when trying to format base errors on a valid farm' do sheep_collection = double(Service::SheepCollection, valid?: true, errors: {}) service = double("service", generate_sheep: sheep_collection) new_farm = NewFarm.new(service, valid_params) new_farm.save new_farm.formatted_base_errors.should be_nil end context "#save" do it 'generates sheep on successful save' do expected_attributes = {size: "large", colour: "#E0E0EB", breed: "Greyface Dartmoor", user_id: 1321, fake_sheep: nil, session_id: "a uuid", value: 0} RandomGenerator.stub(:uuid).and_return("a uuid") service = double("service") service.should_receive(:generate_sheep).with(expected_attributes).and_return(double("sheep collection", id: "id", valid?: true)) NewFarm.new(service, valid_params).save end it "persists the sheep" do service = double("service", generate_sheep: double("sheep collection", id: "id", valid?: true)) RandomGenerator.stub(:uuid).and_return("a uuid") NewFarm.new(service, valid_params).save 
Farm.last.uuid.should == "a uuid" end it "returns the sheep uuid" do service = double("service", generate_sheep: double("sheep collection", id: "id", valid?: true)) new_farm = NewFarm.new service, valid_params new_farm.save new_farm.farm_id.should == Farm.last.uuid end it "associates the sheep collection with the farm" do service = double("service", generate_sheep: double("sheep collection", id: "an id", valid?: true)) NewFarm.new(service, valid_params).save Farm.last.sheep_collection_id.should == "an id" end it "puts an error on the colour when it doesn't make sense for the breed" do service = double('service', generate_sheep: double("sheep colletion", id: "an id", valid?: true)) farm = NewFarm.new(service, {size: "newborn", colour: "#ff69b4", breed: "Badger Face Welsh Mountain"}) farm.save farm.errors[:colour].should == ["Badger Face Welsh Mountain sheep are not #ff69b4"] end it "adds errors and doesn't create a farm when input is missing" do service = double("service", generate_sheep: double("sheep collection", id: "id", valid?: true)) new_farm = NewFarm.new service expect{ new_farm.save }.not_to change{ Farm.count } [:size, :colour, :breed].each{ |attr| new_farm.should have(1).error_on attr } end it 'puts sheep collection errors onto self when there are errors' do sheep_collection = double("sheep collection", valid?: false, errors: {base: ["Error"], size: ["oh noes"]}) service = double("service", generate_sheep: sheep_collection) new_farm = NewFarm.new(service, valid_params) new_farm.save new_farm.errors[:size].should == ["oh noes"] new_farm.errors[:base].should =~ ["Error"] end end end]]>


When making a fast but approximate function, the design parameter is the form of the approximating function. Is it a polynomial of degree 3? Or the ratio of two linear functions? The table below has a systematic list of possibilities: the top half of the table uses only multiplication; the bottom half uses division.

Table 1: Approximating Functions

The *rewritten* column rewrites the expression in a more efficient form. The expressions are used in the approximation procedure as follows: to get the approximation to log₂ x, first x is reduced to the interval [0.75, 1.5), then the reduced value is substituted into the expression. As I explained in Part I and Part II, this procedure gives good accuracy and avoids floating-point cancellation problems.
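The reduction step itself is easy to sketch (my illustration in Python; `math.frexp` is real stdlib, the rest is just the procedure described above): write x = m·2^e with m in [0.75, 1.5), so log₂ x = e + log₂ m, and only the reduced interval ever reaches the approximating expression.

```python
import math

def reduce_range(x):
    # Split x into m * 2**e with m in [0.75, 1.5).
    m, e = math.frexp(x)   # stdlib: m in [0.5, 1), x == m * 2**e
    if m < 0.75:           # shift m up into [0.75, 1.5)
        m *= 2.0
        e -= 1
    return m, e

m, e = reduce_range(10.0)   # 10.0 == 1.25 * 2**3
print(m, e)
print(math.isclose(math.log2(10.0), e + math.log2(m)))
```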

For each form, the minimax values of the coefficients *a*, *b*, *c*, … are determined—that is, the values that minimize the maximum relative error. The *bits* column gives the bits of accuracy, computed as −log₂ ε, where ε is the maximum relative error. This value is computed using the evaluation program from my Part II post.

The *“cost”* column is the execution time. In olden days, floating-point operations were a lot more expensive than integer instructions, and they were executed sequentially. So cost could be estimated by counting the number of floating-point additions and multiplications. This is no longer true, so I estimate cost using the evaluation program. I put cost in quotes since the numbers apply only to my MacBook. The numbers are normalized so that the cost of log2f in the standard C library is 1.

The table shows that using division increases accuracy about the same amount as does adding an additional free parameter. For example, line 4 has 11.3 bits of accuracy with four free parameters *a*, *b*, *c*, and *d*. Using division, you get similar accuracy (11.6) with line 8, which has only three free parameters. Similarly, line 6 has about the same accuracy as line 10, and again line 10 has one less free parameter than line 6.

It’s hard to grasp the cost and accuracy numbers in a table. The scatter plot below is a visualization, with lines 2–10 each represented by a dot.

You can see one anomaly: the two blue dots that are (roughly) vertically stacked one above the other. They have similar cost but very different accuracy. One is the blue dot from line 9, with cost 0.43 and 7.5 bits of accuracy. The other is from line 3, which has about the same cost (0.42) but 8.5 bits of accuracy. So clearly, line 9 is a lousy choice. I’m not sure I would have guessed that without doing these calculations.

Having seen the problem with line 9, it is easy to explain it using the identity . If is approximated by on , then in , and this works out to be

which means that the denominators must be roughly equal. Multiplying each by , this becomes . The only way this can be true is if and are very large, so that it’s as if the term doesn’t exist and the approximation reduces to the quotient of two linear functions. And that is what happens. The optimal coefficients are on the order of , which makes (now writing in terms of ) . Plugging in the optimal values , , gives , , which are very similar to the coefficients in line 7 shown in Table 2 later in this post. In other words, the optimal rational function in line 9 is almost identical to the one in line 7. Which explains why the bit accuracy is the same.

In the next plot I remove line 9, and add lines showing the trend.

The lines show that formulas using division outperform the multiplication-only formulas, and that the gain gets greater as the formulas become more complex (more costly to execute).

You might wonder: if one division is good, are two divisions even better? Division adds new power because a formula using division can’t be rewritten using just multiplication and addition. But a formula with two divisions can be rewritten to have only a single division, for example by putting everything over a common denominator.

Two divisions add no new functionality, but could be more efficient. In the example above, a division is traded for two multiplications. In fact, using two divisions gives an alternative way to write line 10 of Table 1. On my Mac, that’s a bad tradeoff: the execution time of the form with 2 divisions increases by 40%.
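For concreteness, here is a generic identity of that kind (my illustration, with symbolic coefficients, not necessarily the post’s exact example): a continued-fraction form with two divisions flattens to a single division,

$$a + \cfrac{b}{x + c + \cfrac{d}{x + e}} \;=\; a + \frac{b\,(x+e)}{(x+c)(x+e) + d},$$

where the inner division has been traded for the two multiplications $b(x+e)$ and $(x+c)(x+e)$.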

Up till now I’ve been completely silent about how I computed the minimax coefficients of the functions, or to put it another way, how I computed the values of *a*, *b*, *c*, etc. in Table 1. This computation used to be done using the Remez algorithm, but now there is a simpler way that reduces to solving a convex optimization problem. That in turn can be solved using (for example) CVX, a Matlab-based modeling system.

Here’s how it works for line 8, whose form is f(x) = (x − 1)(a(x − 1) + b) / ((x − 1) + c). I want to find the minimax approximation to log₂ x. As discussed in the first post of this series, it’s the relative error that should be minimized. This means solving

minimize over (a, b, c) the maximum over x in [0.75, 1.5) of |(f(x) − log₂ x) / log₂ x|

This is equivalent to finding the smallest λ for which there are a, b, and c satisfying

|f(x) − log₂ x| ≤ λ · |log₂ x| for all x in [0.75, 1.5)

For the last inequality, pick x = xᵢ, where x₁, …, x_m is a grid of m points spanning the interval. Of course, checking only grid points is not exactly equivalent to the inequality being true for all x, but it is an excellent approximation. The notation may be a little confusing, because the xᵢ are constants, and a, b, and c are the variables. Multiplying through by the denominator (xᵢ − 1) + c (which is positive here) makes each constraint linear in a, b, and c. Now all I need is a package that will report whether the system

|a(xᵢ − 1)² + b(xᵢ − 1) − yᵢ((xᵢ − 1) + c)| ≤ λ · |yᵢ| · ((xᵢ − 1) + c),  i = 1, …, m, with yᵢ = log₂ xᵢ

has a solution in a, b, and c. Because then binary search can be used to find the minimal λ. Start with an l = 0 that you know has no solution and a u large enough to guarantee a solution. Then ask if the system above has a solution for λ = (l + u)/2. If it does, replace u ← λ; otherwise, l ← λ. Continue until u − l has the desired precision.
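The search loop itself is independent of CVX. Here is a sketch in Python with a toy feasibility question standing in for the convex solver (fit a single constant c to data within tolerance γ, which is feasible exactly when the data’s spread is at most 2γ), just to show the l/u bookkeeping:

```python
def feasible(gamma, data):
    # Toy stand-in for the convex feasibility problem: does some constant c
    # satisfy |d - c| <= gamma for every d?  True iff spread <= 2*gamma.
    return max(data) - min(data) <= 2 * gamma

def minimal_gamma(data, tol=1e-9):
    l, u = 0.0, max(abs(d) for d in data) + 1.0  # u big enough to be feasible
    while u - l >= tol:
        gamma = (l + u) / 2
        if feasible(gamma, data):
            u = gamma       # a solution exists: tighten the upper bound
        else:
            l = gamma       # infeasible: raise the lower bound
    return u

print(minimal_gamma([1.0, 4.0, 2.5]))  # converges to (4.0 - 1.0) / 2 = 1.5
```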

The set of (a, b, c) satisfying the inequalities above is convex (this is easy to check), and so you can use a package like CVX for Matlab to quickly tell you if it has a solution. Below is code for computing the coefficients of line 8 in Table 1. This Matlab/CVX code is modified from http://see.stanford.edu/materials/lsocoee364a/hw6sol.pdf.

It is a peculiarity of CVX that it can report infeasibility for a value of λ, but then report a solution for a smaller value of λ. So when I see a status of Solved, I presume there is a solution, then decrease the upper bound and also record the corresponding values of a, b, and c. I do this for each step of the binary search. I decide which set to use by making an independent calculation of the minimax error for each recorded (a, b, c).

You might think that using a finer grid (that is, increasing m so that there are more xᵢ) would give a better answer, but it is another peculiarity of CVX that this is not always the case. So in the independent calculation, I evaluate the minimax error on a very fine grid that is independent of the grid size given to CVX. This gives a better estimate of the error, and also lets me compare the answers I get using different values of m. Here is the CVX code:

```matlab
format long
format compact
verbose = true;
bisection_tol = 1e-6;
m = 500;
lo = 0.70;   % check values a little bit below 0.75
hi = 1.5;
xi = linspace(lo, hi, m)';
yi = log2(xi);
Xi = linspace(lo, hi, 10000);   % pick large number so you can compare different m
Xi = Xi(Xi ~= 1);
Yi = log2(Xi);
xip = xi(xi >= 1);              % those xi for which y = x-1 is positive
xin = xi(xi < 1 & xi >= 0.75);
xinn = xi(xi < 0.75);
yip = yi(xi >= 1);
yin = yi(xi < 1 & xi >= 0.75);
yinn = yi(xi < 0.75);
Xip = Xi(Xi >= 0.75);
Xin = Xi(Xi < 0.75);
l = 0; u = 1; k = 1;            % bisection bounds on gamma; any feasible u works
while u - l >= bisection_tol
    gamma = (l + u)/2;
    cvx_begin  % solve the feasibility problem
        cvx_quiet(true);
        variable A;
        variable B;
        variable C;
        subject to
            abs(A*(xip - 1).^2 + B*(xip - 1) - yip .* (xip - 1 + C)) <= ...
                gamma * yip .* (xip - 1 + C)
            abs(A*(xin - 1).^2 + B*(xin - 1) - yin .* (xin - 1 + C)) <= ...
                -gamma * yin .* (xin - 1 + C)
            abs(A*(2*xinn - 1).^2 + B*(2*xinn - 1) - (1 + yinn) .* (2*xinn - 1 + C)) <= ...
                -gamma * yinn .* (2*xinn - 1 + C)
    cvx_end
    if verbose
        fprintf('l=%7.5f u=%7.5f cvx_status=%s\n', l, u, cvx_status)
    end
    if strcmp(cvx_status, 'Solved') | strcmp(cvx_status, 'Inaccurate/Solved')
        u = gamma;
        A_opt(k) = A; B_opt(k) = B; C_opt(k) = C;
        lo = (A*(2*Xin - 1).^2 + B*(2*Xin - 1)) ./ (2*Xin - 1 + C) - 1;
        hi = (A*(Xip - 1).^2 + B*(Xip - 1)) ./ (Xip - 1 + C);
        fx = [lo, hi];
        [maxRelErr(k), maxInd(k)] = max(abs((fx - Yi) ./ Yi));
        k = k + 1;
    else
        l = gamma;
    end
end
[lambda_opt, k] = min(maxRelErr);
A = A_opt(k)
B = B_opt(k)
C = C_opt(k)
lambda_opt
-log2(lambda_opt)
```

Here are the results of running the above code for the expressions in the first table. I don’t bother giving all the digits for line 9, since it is outperformed by line 7.

Table 2: Coefficients for Table 1

So, what’s the bottom line? If you don’t have specific speed or accuracy requirements, I recommend choosing either line 3 or line 7. Run both through the evaluation program to get the cost for your machine and choose the one with the lowest cost. On the other hand, if you have specific accuracy/speed tradeoffs, recompute the cost column of Table 1 for your machine, and pick the appropriate line. The bits column is machine independent as long as the machine uses IEEE arithmetic.

If you want a rational function with more accuracy than line 10, the next choice is cubic/quadratic which gives 20.7 bits of accuracy. That would be with coefficients *A* = 0.1501692, *B* = 3.4226132, *C* = 5.0225057, *D* = 4.1130283, *E* = 3.4813372.

Finally, I’ll close by giving the C code for line 8 (almost a repeat of code from the first posting). This is bare code with no sanity checking on the input parameter *x*. I’ve marked the lines that need to be modified if you want to use it for a different approximating expression.

```c
float fastlog2(float x)  // compute log2(x) by reducing x to [0.75, 1.5)
{
    /** MODIFY THIS SECTION **/
    // (x-1)*(a*(x-1) + b)/((x-1) + c)   (line 8 of table 2)
    const float a = 0.338953;
    const float b = 2.198599;
    const float c = 1.523692;
    #define FN fexp + signif*(a*signif + b)/(signif + c)
    /** END SECTION **/

    float signif, fexp;
    int exp;
    float lg2;
    union { float f; unsigned int i; } ux1, ux2;
    int greater;  // really a boolean
    /*
     * Assume IEEE representation, which is sgn(1):exp(8):frac(23)
     * representing (1+frac)*2^(exp-127).  Call 1+frac the significand
     */

    // get exponent
    ux1.f = x;
    exp = (ux1.i & 0x7F800000) >> 23;
    // actual exponent is exp-127, will subtract 127 later

    greater = ux1.i & 0x00400000;  // true if signif > 1.5
    if (greater) {
        // signif >= 1.5 so need to divide by 2.  Accomplish this by
        // stuffing exp = 126 which corresponds to an exponent of -1
        ux2.i = (ux1.i & 0x007FFFFF) | 0x3f000000;
        signif = ux2.f;
        fexp = exp - 126;  // 126 instead of 127 compensates for division by 2
        signif = signif - 1.0;
        lg2 = FN;
    } else {
        // get signif by stuffing exp = 127 which corresponds to an exponent of 0
        ux2.i = (ux1.i & 0x007FFFFF) | 0x3f800000;
        signif = ux2.f;
        fexp = exp - 127;
        signif = signif - 1.0;
        lg2 = FN;
    }
    // last two lines of each branch are common code, but optimize better
    // when duplicated, at least when using gcc
    return(lg2);
}
```

In response to comments, here is some sample code to compute logarithms via table lookup. A single-precision fraction has only 23 bits, so if you are willing to have a table of 2^23 floats (2^25 bytes) you can write a logarithm that is very accurate and very fast. The one thing to watch out for is floating-point cancellation, so you need to split the table into two parts (see the `log_biglookup()` code block below).

The `log_lookup()` sample uses a smaller table. It uses linear interpolation, because ordinary table lookup results in a step function. When *x* is near 0, log(1+*x*) ≈ *x* is linear, and any step-function approximation will have a very large relative error. But linear interpolation has a small relative error. In yet another variation, `log_lookup_2()` uses a second lookup table to speed up the linear interpolation.

```c
float log_lookup(float x)  // lookup using NDIGITS_LOOKUP bits of the fraction part
{
    int exp;
    float lg2, interp;
    union { float f; unsigned int i; } ux1, ux2;
    unsigned int frac, frac_rnd;
    /*
     * Assume IEEE representation, which is sgn(1):exp(8):frac(23)
     * representing (1+frac)*2^(exp-127).  Call 1+frac the significand
     */

    // get exponent
    ux1.f = x;
    exp = ((ux1.i & 0x7F800000) >> 23);   // -127 done later

    // top NDIGITS
    frac = (ux1.i & 0x007FFFFF);
    frac_rnd = frac >> (23 - NDIGITS_LOOKUP);

    // for interpolating between two table values
    ux2.i = (frac & REMAIN_MASK) << NDIGITS_LOOKUP;
    ux2.i = ux2.i | 0x3f800000;
    interp = ux2.f - 1.0f;

    if (frac_rnd < LOOKUP_TBL_LN/2) {
        lg2 = tbl_lookup_lo[frac_rnd] +
              interp*(tbl_lookup_lo[frac_rnd+1] - tbl_lookup_lo[frac_rnd]);
        return(lg2 + (exp - 127));
    } else {
        lg2 = tbl_lookup_hi[frac_rnd] +
              interp*(tbl_lookup_hi[frac_rnd+1] - tbl_lookup_hi[frac_rnd]);
        return(-lg2 + (exp - 126));
    }
}

static float log_lookup_2(float x)  // use a second table, tbl_interp[]
{
    int exp;
    float lg2;
    union { float f; unsigned int i; } ux1;
    unsigned int frac, frac_rnd, ind;
    /*
     * Assume IEEE representation, which is sgn(1):exp(8):frac(23)
     * representing (1+frac)*2^(exp-127).  Call 1+frac the significand
     */

    // get exponent
    ux1.f = x;
    exp = ((ux1.i & 0x7F800000) >> 23);   // -127 done later

    // top NDIGITS
    frac = (ux1.i & 0x007FFFFF);
    frac_rnd = frac >> (23 - NDIGITS_LOOKUP_2);

    // for interpolating between two table values
    ind = frac & REMAIN_MASK_2;   // interp = tbl_interp[ind]

    if (frac_rnd < LOOKUP_TBL_LN_2/2) {
        lg2 = tbl_lookup_lo[frac_rnd] +
              tbl_interp[ind]*(tbl_lookup_lo[frac_rnd+1] - tbl_lookup_lo[frac_rnd]);
        return(lg2 + (exp - 127));
    } else {
        lg2 = tbl_lookup_hi[frac_rnd] +
              tbl_interp[ind]*(tbl_lookup_hi[frac_rnd+1] - tbl_lookup_hi[frac_rnd]);
        return(-lg2 + (exp - 126));
    }
}

static float log_biglookup(float x)  // full lookup table with 2^23 entries
{
    int exp;
    float lg2;
    union { float f; unsigned int i; } ux1;
    unsigned int frac;
    /*
     * Assume IEEE representation, which is sgn(1):exp(8):frac(23)
     * representing (1+frac)*2^(exp-127).  Call 1+frac the significand
     */

    // get exponent
    ux1.f = x;
    exp = ((ux1.i & 0x7F800000) >> 23);   // -127 done later

    frac = (ux1.i & 0x007FFFFF);
    if (frac < TWO_23/2) {
        lg2 = tbl_lookup_big_lo[frac];
        return(lg2 + (exp - 127));
    } else {
        lg2 = tbl_lookup_big_hi[frac - TWO_23/2];
        return(-lg2 + (exp - 126));
    }
}

#define NDIGITS_LOOKUP 14
#define LOOKUP_TBL_LN 16384   // 2^NDIGITS
#define REMAIN_MASK 0x1FF     // mask with REMAIN bits where REMAIN = 23 - NDIGITS
static float *tbl_lookup_lo;
static float *tbl_lookup_hi;

static void init_lookup()
{
    int i;

    tbl_lookup_lo = (float *)malloc((LOOKUP_TBL_LN/2 + 1)*sizeof(float));
    tbl_lookup_hi = (float *)malloc((LOOKUP_TBL_LN + 1)*sizeof(float));
    tbl_lookup_hi = tbl_lookup_hi - LOOKUP_TBL_LN/2;
    for (i = 0; i <= LOOKUP_TBL_LN/2; i++)        // <= not <
        tbl_lookup_lo[i] = log2f(1 + i/(float)LOOKUP_TBL_LN);
    for (i = LOOKUP_TBL_LN/2; i < LOOKUP_TBL_LN; i++)   // log2, not log2f
        tbl_lookup_hi[i] = 1.0 - log2(1 + i/(float)LOOKUP_TBL_LN);
    tbl_lookup_hi[LOOKUP_TBL_LN] = 0.0f;
}

/* two tables */
#define NDIGITS_LOOKUP_2 12
#define LOOKUP_TBL_LN_2 4096   // 2^NDIGITS
#define TWO_REMAIN_2 2048      // 2^REMAIN, where REMAIN = 23 - NDIGITS
#define REMAIN_MASK_2 0x7FF    // mask with REMAIN bits
static float *tbl_interp;

static void init_lookup_2()
{
    int i;

    tbl_lookup_lo = (float *)malloc((LOOKUP_TBL_LN_2/2 + 1)*sizeof(float));
    tbl_lookup_hi = (float *)malloc((LOOKUP_TBL_LN_2/2 + 1)*sizeof(float));
    tbl_lookup_hi = tbl_lookup_hi - LOOKUP_TBL_LN_2/2;
    tbl_interp = (float *)malloc(TWO_REMAIN_2*sizeof(float));
    // lookup
    for (i = 0; i <= LOOKUP_TBL_LN_2/2; i++)      // <= not <
        tbl_lookup_lo[i] = log2f(1 + i/(float)LOOKUP_TBL_LN_2);
    for (i = LOOKUP_TBL_LN_2/2; i < LOOKUP_TBL_LN_2; i++)   // log2, not log2f
        tbl_lookup_hi[i] = 1.0 - log2(1 + i/(float)LOOKUP_TBL_LN_2);
    tbl_lookup_hi[LOOKUP_TBL_LN_2] = 0.0f;
    for (i = 0; i < TWO_REMAIN_2; i++)
        tbl_interp[i] = i/(float)TWO_REMAIN_2;
}

#define TWO_23 8388608   // 2^23
static float *tbl_lookup_big_lo;
static float *tbl_lookup_big_hi;

static void init_biglookup()
{
    int i;

    tbl_lookup_big_lo = (float *)malloc((TWO_23/2) * sizeof(float));
    tbl_lookup_big_hi = (float *)malloc((TWO_23/2) * sizeof(float));
    for (i = 0; i < TWO_23/2; i++) {
        tbl_lookup_big_lo[i] = log2f(1 + i/(float)TWO_23);
        // log2, not log2f
        tbl_lookup_big_hi[i] = 1.0 - log2(1 + (i + TWO_23/2)/(float)TWO_23);
    }
}
```



Here’s the explanation. Rounding error in *x+y* can happen in two ways. First, if *x* > *y* and *x* has a different exponent from *y*, then *y* will have its fractional part shifted right to make the exponents match, and so *y* might drop some bits. Second, even if the exponents are the same, there may be rounding error if the addition of the fractional parts has a carry-out from the high-order bit. In the case of *x* − 1, the exponents are the same if 1 ≤ *x* < 2, and the subtraction cannot carry out. And if 1/2 ≤ *x* < 1, the smaller number *x* is shifted right by only one place, and the shifted-out bit is preserved in the guard digit, so no bits are dropped. So there is no rounding error in *x* − 1 if 1/2 ≤ *x* < 2.

The rule of thumb is that when approximating a function *f* satisfying *f*(*a*) = 0, severe rounding error can be reduced if the approximation is written in terms of *x* − *a*. For us the function is log₂ and log₂(1) = 0, so *a* = 1, and the rule suggests polynomials written in terms of *x* − 1 rather than *x*. By the key fact above, there is no rounding error in *x* − 1 when 1/2 ≤ *x* < 2.

Let me apply that to two different forms of the quadratic polynomials used in Part I: the polynomial can be written in terms of *x* or in terms of *x* − 1.

If they are to be used on the interval [0.75, 1.5) and I want to minimize relative error, it is crucial that the polynomials be 0 when *x* = 1, so they become *Ax*² + *Bx* + *C* with *A* + *B* + *C* = 0, and *a*(*x*−1)² + *b*(*x*−1).

The second equation has no constant term, so they both cost the same amount to evaluate, in that they involve the same number of additions and multiplications.

But one is much more accurate. You can see that empirically using an evaluation program (code below) that I will be using throughout to compare different approximations. I invoked the program twice, once with points spaced 1/1024 apart and once with points spaced 2^−22 apart, and got the following:

```
SPACING IS 1/1024
using x    bits 5.5 at x=0.750000   2.06 nsecs  nsec/bit=0.372  bits/nsec=2.69
using x-1  bits 5.5 at x=0.750000   2.10 nsecs  nsec/bit=0.380  bits/nsec=2.63
SPACING IS 1/4194304 = 2^-22
using x    bits 1.7 at x=1-2.4e-07  2.08 nsecs  nsec/bit=1.222  bits/nsec=0.82
using x-1  bits 5.5 at x=0.750000   2.12 nsecs  nsec/bit=0.384  bits/nsec=2.61
```

When the approximating polynomials are evaluated at points spaced 1/1024 apart, they have similar performance. The accuracy of both is 5.5 bits, and the one using *x* − 1 is slightly slower. But when they are evaluated at points spaced 2^−22 apart, the polynomial using *x* has poor accuracy when *x* is slightly below 1. Specifically, the accuracy is only 1.7 bits at *x* = 1 − 2.4 × 10^−7.

To see why, note that when *x* ≈ 1, the form *a*(*x*−1)² + *b*(*x*−1) is summing two numbers that have rounding error but are of very different sizes, since (*x*−1)² is far smaller than *x* − 1. But *Ax*² + *Bx* + *C* is summing numbers of similar size, since *C* = −(*A* + *B*) and the sum of the first two terms is about *A* + *B*. This is the bad case of subtracting two nearby numbers (cancellation), because they both have rounding error.

I suppose it is an arguable point whether full accuracy for all *x* is worth a time performance hit of about 2%. I will offer this argument: you can reason about your program if you know it has (in this case) 5.5 bits of accuracy on *every* input. You don’t want to spend a lot of time tracking down unexpectedly low accuracy in your code that came about because you used a log library function with poor precision on a small set of inputs.

Here’s some more information on the output of the evaluation program displayed above. The first number is accuracy in bits, measured in the usual way as −log₂ ε, where ε is the maximum relative error. Following it is the value of *x* where the max error occurred. The execution time (e.g. 2.06 nsecs for the first line) is an estimate of the time it takes to do a single approximation to log₂ *x*, including reducing the argument to the interval [0.75, 1.5). The last two numbers are self-explanatory.

Estimating execution time is tricky. For example, on my MacBook, if the argument array must be brought into the cache, it will significantly affect the timings. That’s why the evaluation program brings `xarr` and `yarr` into the cache before beginning the timing runs.

For polynomials, using *x* − 1 has almost the same cost as using *x* and better accuracy, so there is a good argument that it is superior. Things are not so clear when the approximation is a rational function rather than a polynomial. Take, for example, the linear-over-linear form. Because the approximation must be 0 at *x* = 1, the numerator is proportional to *x* − 1. And because you can multiply numerator and denominator by anything (as long as it’s the same anything), it further simplifies to *B*(*x*−1)/((*x*−1) + *C*). This will have no floating-point cancellation, and will have good accuracy even when *x* ≈ 1. But there’s a rewrite of this expression that is faster:

Unfortunately, this brings back cancellation, because when *x* ≈ 1 there will be cancellation between *B* and the fraction *BC*/((*x*−1) + *C*). Because there’s cancellation anyway, you might as well make a further performance improvement eliminating the need to compute *x* − 1, namely *B* − *BC*/(*x* + (*C* − 1)).

Both sides have a division. In addition, the left-hand side has a multiplication and 2 additions. The right-hand side has no multiplications and 2 additions (*BC* is a constant and doesn’t involve a run-time multiplication, similarly for *C* − 1). So there is one less multiplication, which should be faster, but at the cost of a rounding-error problem when *x* ≈ 1.

```
SPACING IS 1/1024
using x-1  bits 7.5 at x=0.750000   2.17 nsecs  nsec/bit=0.289  bits/nsec=3.46
using x    bits 7.5 at x=0.750000   2.07 nsecs  nsec/bit=0.275  bits/nsec=3.64
SPACING IS 1/4194304 = 2^-22
using x-1  bits 7.5 at x=0.750000   2.24 nsecs  nsec/bit=0.298  bits/nsec=3.36
using x    bits 1.4 at x=1+2.4e-07  2.09 nsecs  nsec/bit=1.522  bits/nsec=0.66
```

As expected, the rational function that has one less multiplication (the line marked *using x*) is faster, but has poor accuracy when *x* is near 1. There’s a simple idea for a fix: when |*x* − 1| is small, use the Taylor series, log₂ *x* ≈ (*x* − 1)/log 2. Using it requires a subtraction and a multiplication, which is most likely cheaper than a division and two additions. What is the size cutoff? The error in the Taylor series is easy to compute: it is the next term in the series, (*x*−1)²/(2 log 2), so the relative error is about |*x* − 1|/2. And I want to maintain an accuracy of 7.5 bits, or 2^−7.5. So the cutoff is |*x* − 1| < δ, where δ/2 = 2^−7.5, or δ = 2^−6.5 ≈ 0.011.

On my MacBook, the most efficient way to implement the cutoff test appears to be `x < 1+delta && x > 1-delta`. In the evaluation program, most of the *x* are greater than 1, so usually only the first of the inequalities is executed. Despite this, adding the check still has a high cost, but no more accuracy than using *x* − 1.

```
SPACING IS 1/1024
using x-1  bits 7.5 at x=0.750000   2.17 nsecs  nsec/bit=0.289  bits/nsec=3.46
using x    bits 7.5 at x=0.750000   2.07 nsecs  nsec/bit=0.275  bits/nsec=3.64
cutoff     bits 7.5 at x=0.750000   2.58 nsecs  nsec/bit=0.343  bits/nsec=2.91
SPACING IS 1/4194304 = 2^-22
using x-1  bits 7.5 at x=0.750000   2.24 nsecs  nsec/bit=0.298  bits/nsec=3.36
using x    bits 1.4 at x=1+2.4e-07  2.09 nsecs  nsec/bit=1.522  bits/nsec=0.66
cutoff     bits 7.5 at x=0.989000   2.60 nsecs  nsec/bit=0.347  bits/nsec=2.88
```

In Part I of this series, I noted that testing whether *x* is in a range like this can be done with bit operations rather than floating-point ones. The same idea can be used here. Instead of using the Taylor series when |*x* − 1| < δ, that is 1 − δ < *x* < 1 + δ, use it in a slightly smaller interval whose endpoints are powers of two: 1 − 2^−7 < *x* < 1 + 2^−7.

The latter can be converted to bit operations on *exp* and *f*, the exponent and fraction parts of *x*, as follows:

As bit operations, this is

```
(exp == 0 && (f & 111111100...0) == 0) || (exp == -1 && (f & 11111100...0) == 11111100...0)
```

When I tested this improvement (the line marked *fcutoff* in the table below), it was faster, but still slower than using *x* − 1, at least on my MacBook.

```
SPACING IS 1/1024
using x-1  bits 7.5 at x=0.750000   2.17 nsecs  nsec/bit=0.289  bits/nsec=3.46
using x    bits 7.5 at x=0.750000   2.07 nsecs  nsec/bit=0.275  bits/nsec=3.64
cutoff     bits 7.5 at x=0.750000   2.58 nsecs  nsec/bit=0.343  bits/nsec=2.91
fcutoff    bits 7.5 at x=0.750000   2.46 nsecs  nsec/bit=0.327  bits/nsec=3.06
SPACING IS 1/4194304 = 2^-22
using x-1  bits 7.5 at x=0.750000   2.24 nsecs  nsec/bit=0.298  bits/nsec=3.36
using x    bits 1.4 at x=1+2.4e-07  2.09 nsecs  nsec/bit=1.522  bits/nsec=0.66
cutoff     bits 7.5 at x=0.989000   2.60 nsecs  nsec/bit=0.347  bits/nsec=2.88
fcutoff    bits 7.5 at x=0.750001   2.44 nsecs  nsec/bit=0.325  bits/nsec=3.08
```

Bottom line: having special-case code for *x* ≈ 1 appears to significantly underperform simply writing the approximation in terms of *x* − 1.

In the first post, I recommended reducing to [0.75, 1.5) instead of [1, 2) because you get one extra degree of freedom, which in turn gives greater accuracy. Rounding error gives another reason for preferring [0.75, 1.5). When *x* is slightly less than 1, reduction to [1, 2) will have cancellation problems. Recall the function that was optimal for the interval [1, 2): *g*(*x*) = −(1/3)(*x*−1)² + (4/3)(*x*−1). When *x* < 1, *x* must be multiplied by two to move it into [1, 2), and then, to compensate, the result is *g*(2*x*) − 1. When *x* ≈ 1, *g*(2*x*) ≈ 1, and so you get cancellation. Below are the results of running the evaluation program on *g*(2*x*) − 1. If there were no rounding error, *g* would be accurate to 3.7 bits. As you evaluate closer to 1 (finer spacing) the accuracy drops.

```
SPACING IS 1/2^19
g  bits 3.5 at x=1-3.8e-06  2.05 nsecs  nsec/bit=0.592  bits/nsec=1.69
SPACING IS 1/2^20
g  bits 2.9 at x=1-9.5e-07  2.05 nsecs  nsec/bit=0.706  bits/nsec=1.42
SPACING IS 1/2^21
g  bits 2.9 at x=1-9.5e-07  2.05 nsecs  nsec/bit=0.706  bits/nsec=1.42
SPACING IS 1/2^22
g  bits 1.7 at x=1-2.4e-07  2.05 nsecs  nsec/bit=1.203  bits/nsec=0.83
```

The goal of this series of posts is to show that you can create logarithm routines that are much faster than the library versions and have a minimum guaranteed accuracy for all *x*. To do this requires paying attention to rounding error. Summarizing what I’ve said so far, my method for minimizing rounding-error problems is to reduce to the interval [0.75, 1.5) and write the approximating expression in terms of *x* − 1, for example *B*(*x*−1)/((*x*−1) + *C*). More generally, the approximating expression would be a polynomial in *x* − 1 with no constant term, or a rational function whose numerator is such a polynomial.

I close by giving the code for the evaluation program that was used to compare the time and accuracy of the different approximations:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <math.h>

float approx_fn(float x);   // the approximation being tested, supplied elsewhere
char *format(float x);
void finish(struct timeval *start, struct timeval *stop, char *str, int n,
            int repetitions, float *xarr, float *yarr, float *lg2arr);

/*
 * Usage: eval [hi reps spacing]
 *    Evaluates an approximation to log2 in the interval [0.125, hi]
 *    For timing purposes, repeats the evaluation reps times.
 *    The evaluation is done on points spaced 1/spacing apart.
 */
int main(int argc, char **argv)
{
    float x;
    struct timeval start, stop;
    float lo, hi, delta;
    int i, j, n, repetitions, one_over_delta;
    double xd;
    float *xarr, *lg2arr, *yarr;

    // parameters
    lo = 0.125;
    hi = 10.0;
    one_over_delta = 4194304;   // 2^22
    repetitions = 1;
    if (argc > 1) {
        hi = atof(argv[1]);
        repetitions = atoi(argv[2]);
        one_over_delta = atoi(argv[3]);
    }
    delta = 1.0/one_over_delta;

    // setup
    n = ceil((hi - lo)/delta) + 1;
    xarr = (float *)malloc(n*sizeof(float));
    yarr = (float *)malloc(n*sizeof(float));
    lg2arr = (float *)malloc(n*sizeof(float));
    i = 0;
    for (xd = lo; xd <= hi; xd += delta) {
        x = xd;
        if (x == 1.0)   // relative error would be infinity
            continue;
        xarr[i] = x;
        lg2arr[i++] = log2(x);
    }
    if (i >= n)   // assert (i < n)
        fprintf(stderr, "Help!!!\n");
    n = i;

    /* cache-in xarr[i], yarr[i] */
    yarr[0] = 0.0;
    for (i = 1; i < n; i++) {
        yarr[i] = xarr[i] + yarr[i-1];
    }
    fprintf(stderr, "cache-in: %f\n\n", yarr[n-1]);   // to foil optimizer

    gettimeofday(&start, 0);
    for (j = 0; j < repetitions; j++) {
        for (i = 0; i < n; i++) {
            yarr[i] = approx_fn(xarr[i]);
        }
    }
    gettimeofday(&stop, 0);
    finish(&start, &stop, "name ", n, repetitions, xarr, yarr, lg2arr);
    exit(0);
}

// convert x to string, with special attention when x is near 1
char *format(float x)
{
    static char buf[64];
    float y;

    if (fabs(x - 1) > 0.0001)
        sprintf(buf, "%f", x);
    else {
        y = x - 1;
        if (y < 0)
            sprintf(buf, "1%.1e", y);
        else
            sprintf(buf, "1+%.1e", y);
    }
    return(buf);
}

void finish(struct timeval *start, struct timeval *stop, char *str, int n,
            int repetitions, float *xarr, float *yarr, float *lg2arr)
{
    double elapsed;   // nanosecs
    float max, rel;
    int maxi, i;
    double bits;

    elapsed = 1e9*(stop->tv_sec - start->tv_sec) +
              1000.0*(stop->tv_usec - start->tv_usec);
    max = 0.0;
    for (i = 0; i < n; i++) {
        rel = fabs((yarr[i] - lg2arr[i])/lg2arr[i]);
        if (rel > max) {
            max = rel;
            maxi = i;
        }
    }
    bits = -log2(max);
    elapsed = elapsed/(n*repetitions);
    printf("%s bits %4.1f at x=%s %.2f nsecs nsec/bit=%.3f bits/nsec=%.2f\n",
           str, bits, format(xarr[maxi]), elapsed, elapsed/bits, bits/elapsed);
}
```



Building a fully scalable website requires a strong focus on code quality. Concepts such as modularity, encapsulation, and testability become extremely important as you move across domains. Whether we are scaling up to desktop or down to mobile, we need the code to stay consistent and maintainable. Every hacked, poorly planned, or rushed piece of code we might add reduces our ability to write elegant, scalable, responsive code.

Perhaps creating a responsive app is not high on your team’s priority list right now. But one day it will be — and the conversion time frame might be very tight when that day comes.

Ideally, all you need to do is add media query CSS and everything just works. But the only way that can happen is if the code readily adapts to responsive changes.

Below are some suggestions and fixes that will make conversion to responsive easier. Some are specific to responsive design while others are general good practices.

Yes, we all know about media queries. How hard can they be? Sprinkle some on any page and you have a responsive website, right?

Using media queries on your pages is essential; they allow you to overwrite CSS values based on screen size. This technique might sound simple, but in a larger project it can quickly get out of hand. A few major problems can get in the way of using media queries properly:

- **Colliding media queries:** It is easy to make the mistake of writing media queries that overwrite each other if you do not stick to a common pattern. We recommend using the same boilerplate throughout all projects, and have created one here.
- **Setting element styles from JS:** This is a tempting, but inferior, approach to building responsive websites. When an element relies on JS logic to set its width, it is unable to properly use media queries. If the JS logic is setting width as an inline property, the width cannot be overwritten in CSS without using `!important`. In addition, you have to now maintain an ever-growing set of JS logic.
- **Media queries not at the bottom:** If your queries are not loaded last, they will not override their intended targets. Every module might have its own CSS file, and the overall ordering might not place it at the bottom, which leads us to our next point.
- **CSS namespacing for encapsulation:** If you are writing a module, its CSS selectors should be properly encapsulated via namespace. We recommend prefixing class names with the module name, such as *navbar-parent*. Following this pattern will prevent conflicts with other modules, and will ensure that media queries at the bottom of your module’s CSS file override their intended targets.
- **Too many CSS selectors:** CSS specificity rules require media queries to use the same specificity in order to override. It is easy to get carried away in LESS, which allows you to nest CSS multiple levels deep. While it can be useful to go one or two levels deep for encapsulation, usually this is unnecessarily complicating your code. We recommend favoring namespacing over nested specifiers as it is cleaner and easier to maintain.
- **Using `!important` to override styles:** Adding `!important` to your styles reduces maintainability. It is better to avoid relying on `!important` overrides and instead use CSS namespacing to prevent sharing between modules.

Both responsive and adaptive web design techniques contain powerful tools, but it is important to understand the differences between the two. Responsive techniques usually include media queries, fluid grids, and CSS percentage values. Adaptive techniques, on the other hand, are focused more on JavaScript logic, and the adding or removing of features based on device detection or screen size.

So, which should you use? Responsive or adaptive? The answer depends on the feature you are trying to implement. It can be tempting to jump straight into applying adaptive techniques to your feature, but in many cases it may not be required. Worse, applying adaptive techniques can quickly over-complicate your design. An example of this that we saw in many places is the use of JavaScript logic to set CSS style attributes.

When styling your UI, JavaScript should be avoided whenever possible. Dynamic sizing, for example, is better done through media queries. For most UI designs, you will be deciding on layouts based on *screen size*, not on device type. Confusing the need for device detection with screen size can lead us to apply adaptive where responsive would be superior.

Rethink any design that requires CSS attributes to change based on device detection; in almost all cases it will be better to rely on screen size alone, via media queries. So, when should we use adaptive JavaScript techniques?

Adaptive web design techniques are powerful, as they allow for selective loading of resources based on user agent or screen size. Logic that checks for desktop browsers, for example, can load high-resolution images instead of their mobile-optimized counterparts. Loading additional resources and features for larger screens can also be useful. Desktop browsers, for example, could show more functionality due to the increased screen size, browser capability, or bandwidth.

Ideally, additional resources will be lazy-loaded for their intended platforms. Lazily loading modules helps with site speed for mobile web, while still allowing for a full set of functionality for desktop and tablet web. This technique can be applied by checking the user agent on the client or server. If done on the server, only resources supported by the user’s platform should be returned. Alternatively, client-based lazy loading can use Ajax requests to load additional resources if they are supported. This effect can be achieved using client-side JavaScript, based on browser support or user agent. Client-side detection is generally preferred, as it allows feature detection based on actual browser functionality instead of potentially complicated user agent checks.

A responsive flex grid doesn’t have to be complicated. In our live demo page, we show a simple implementation that creates a horizontally scrolling section of image containers. The images are centered, allowed to expand up to 100% of their container, and will maintain their original aspect ratio. In addition, the container height values are set to 100%, allowing us to adjust the height in the parent wrapper only, and keeping our media query overrides simple and easy to read.

The HTML and CSS source code use the concepts mentioned above. We plan to add more boilerplate patterns; please don’t hesitate to add your own as well. Pull requests are welcome!

We hope that the information above will come in handy when you are working on your next mobile-first web project. Below is a summary of what we mentioned above and other helpful tips.

- Most responsive layout can and should be done with media queries. JS manipulation of CSS (maybe with the exception of adding/removing classes) should be avoided. Setting width in JS is not as maintainable or dynamic compared to CSS.
- Use media query boilerplate to ensure you do not have contradicting media queries or have media queries that are always skipped.
- Put media queries at the bottom. Media queries override CSS and should be the final overrides, whether page level or module level.
- If your regular CSS rules have many selectors, your media query CSS rules will have to as well, due to CSS specificity rules. Use as few selectors as possible when defining CSS rules.

- Use CSS classes, not CSS IDs, to avoid CSS specificity issues.
- Use the fewest number of selectors possible to define your selector.
- Reuse classes. If an element has the same look on different parts of the page, do not create two different classes. Make a generic class and reuse it.
- Encapsulate your CSS selectors by using proper namespacing to prevent conflicts, e.g., `class="module-name-parent"`.

- It is very rare that you need to use `!important`. Before you use it, ask yourself whether you can instead add another class (parent or same level). And then ask yourself whether the rule you are trying to override has unnecessary selectors.

- Use LESS nesting only where needed. Nesting is good for organization, but it is also a recipe for CSS specificity issues.
- Check that you do not have a CSS rule that looks like this:

```css
#wrapper #body-content #content #left-side #text { border: 1px solid #000; }
```

- Work with the design team and define LESS variables using good names. Then, use these LESS variables everywhere possible.
- If you are using a set of CSS rules repeatedly, make it a LESS mixin.

- Most DOM structures are more complex than necessary.
- Add a wrapper only when needed. Do not add a wrapper when proper CSS can do the same thing.
- If you remove the wrapper and the layout does not change, you do not need it. Now, do a global search for this wrapper’s references (JS, CSS, rhtml, jsp, tag) and remove them.

- Add a placeholder to your component for lazy loading.
- Lazy-loaded sections will start off empty, so make sure you reserve the correct amount of space for this behavior. Otherwise, you will see the page shift as modules load in.
- Use media queries for the empty section so that it closely matches the filled size.

- If you are playing around with CSS to attempt a layout and it starts working, remember to remove the unnecessary CSS rules. Many of them are probably not needed anymore. Remove the unnecessary wrappers as well.

Image source: http://upload.wikimedia.org/wikipedia/commons/e/e2/Responsive_Web_Design.png


You can find code for approximate logs on the web, but they rarely come with an evaluation of how they compare to the alternatives, or in what sense they might be optimal. That is the gap I’m trying to fill here. The first post in this series covers the basics, but even if you are familiar with this subject I think you will find some interesting nuggets. The second post considers rounding error, and the final post gives the code for a family of fast log functions.

A very common way to compute log (meaning log base *e*) is to use the identity log *x* = log 2 · log₂ *x* to reduce the problem to computing log₂. The reason is that log₂ *x* for arbitrary *x* is easily reduced to the computation of log₂ *x* for *x* in the interval [1, 2); details below. So for the rest of this series I will exclusively focus on computing log₂. The red curve in the plot below shows log₂ *x* on [1, 2]. For comparison, I also plot the straight line *y* = *x* − 1.

If you’ve taken a calculus course, you know that log *x* has a Taylor series about *x* = 1: log *x* = (*x*−1) − (*x*−1)²/2 + ···. Combining the first two terms with log₂ *x* = log *x*/log 2 gives an approximating quadratic I’ll call *t* (*t* for **T**aylor).

How well does *t*(*x*) approximate log₂ *x*?

The plot shows that the approximation is very good when *x* ≈ 1, but is lousy for *x* near 2, so *t* is a flop over the whole interval from 1 to 2. But there is a quadratic that does very well over the whole interval; I call it *b*, for **b**etter. It is shown below in red (log₂ *x* in blue). The plot makes it look like a very good approximation.

A better way to see the quality of the approximation is to plot the error *b*(*x*) − log₂ *x*. The largest errors occur at two points in the interior of the interval.

Now that I’ve shown you an example, let me get to the first main topic: how do you evaluate different approximations? The conventional answer is *minimax*. Minimax is very conservative: it only cares about the worst (max) error. It judges the approximation over the entire range by its error on the worst point. As previously mentioned, in the example above the worst error occurs at one of two points with very similar errors. The term minimax means you want to minimize the max error, in other words find the function with the minimum max error. The max error here is very close to 0.0050, and it is the smallest you can get with a quadratic polynomial. In other words, *b* solves the minimax problem.

Now, on to the first of the nuggets mentioned at the opening. One of the most basic facts about log is that log 1 = 0, whether it’s log₂ or logₑ or log₁₀. This means there’s a big difference between ordinary and relative error when *x* ≈ 1.

As an example, take an *x* very close to 1. The ordinary error in the approximation is quite small, because both the approximation and log₂ *x* are close to 0. But most likely you care much more about relative error, (approximation − log₂ *x*)/log₂ *x*, which can be huge. It’s relative error that tells you how many bits are correct. If *f*(*x*) and log₂ *x* agree to *k* bits, then (*f*(*x*) − log₂ *x*)/log₂ *x* is about 2^−*k*. Or putting it another way, if the relative error is ε, then the approximation is good to about −log₂ ε bits.

The function *b* that solved the minimax problem solved it for ordinary error. But it is a lousy choice for relative error. The reason is that its ordinary error is about 0.005 near *x* = 1. As *x* → 1, log₂ *x* → 0, and so the relative error will be roughly 0.005/log₂ *x*, which is unbounded. But no problem: I can compute a minimax polynomial with respect to relative error; I’ll call it *r*, for **r**elative. The following table compares the coefficients of the Taylor method *t*, minimax for ordinary error *b*, and minimax for relative error *r*:

The coefficients of *r* and *b* are similar, at least compared to *t*, but *r* is a function that is always good to at least 5 bits, as the plot of relative error (below) shows.

Here’s a justification for my claim that *r* is good to 5 bits. The max relative error for *r* occurs at a handful of points where the error curve peaks, and at each of those points the relative error is below 2^{−5}. If you’re a nitpicker, you might question whether agreement to within 2^{−5} really means 5 good bits; but if you round the approximate value and the true value at each of those points to 5 bits, they are equal.

Unfortunately, there’s a big problem we’ve overlooked. What happens outside the interval [1,2)? Floating-point numbers are represented as *x* = *f* × 2^{*e*} with 1 ≤ *f* < 2. This leads to the fact mentioned above: log2(*x*) = log2(*f*) + *e*. So you only need to compute log2 on the interval [1, 2). When you use *r* for [1, 2) and reduce to this range for other *x*, you get the plot of relative error shown below.

The results are awful for *x* just below 1. After seeing this plot, you can easily figure out the problem. The relative error of *r* near *x* = 2 is about 0.02, and is almost the same as the ordinary error (since the denominator log2(*x*) is close to 1 there). Now take an *x* just below 1. Such an *x* is multiplied by 2 to move it into [1,2), and the approximation to log2(*x*) is *r*(2*x*) − 1, where the −1 compensates for changing *x* to 2*x*. The ordinary error is still about 0.02. But log2(*x*) is very small for *x* just below 1, so the ordinary error of 0.02 is transformed to a relative error of roughly 0.02 ⁄ log2(*x*), which is enormous. At the very least, a candidate for small relative error must satisfy *p*(2) = 1, so that *p*(2*x*) − 1 → 0 as *x* → 1 from below. But *r*(2) differs from 1 by about 0.02. This can be fixed by finding the polynomial that solves the minimax problem for all *x* subject to the constraints *p*(1) = 0 and *p*(2) = 1. The result is a polynomial *g*, for global.

One surprise about *g* is that its coefficients appear to be simple rational numbers, suggesting there might be a simple proof that this polynomial is optimal. And there is an easy argument that it is *locally* optimal. Since *g*(*x*) = *Cx*^{2} + *Bx* + *A* must satisfy *g*(1) = 0 and *g*(2) = 1, it is of the form *g*_{C}(*x*) = *C*(*x*−1)^{2} + (1−*C*)(*x*−1). When *x* > 1 the relative error is *ε*(*x*) = (*g*_{C}(*x*) − log2(*x*)) ⁄ log2(*x*) and lim_{x→1+} *ε*(*x*) = (1−*C*)log 2 − 1. When *x* < 1 then *ε*(*x*) = (*g*_{C}(2*x*) − 1 − log2(*x*)) ⁄ log2(*x*) and lim_{x→1−} *ε*(*x*) = 2(1+*C*)log 2 − 1. The optimal *g*_{C} has these two limits equal, that is (1−*C*)log 2 − 1 = 2(1+*C*)log 2 − 1, which has the solution *C* = −1/3.

Globally (over all *x*), *g* (the blue curve) does dramatically better, but of course it comes at a cost. Its relative error is not as good as *r*’s over the interval [1, 2). That’s because *g* is required to satisfy *g*(1) = 0 and *g*(2) = 1 in order to have a small relative error for *x* just below 1. The extra requirement reduces the degrees of freedom, and so *g* does less well on [1, 2).

Finally, I come to the second nugget. The discussion so far suggests rethinking the basic strategy. Why reduce to the interval [1,2)? Any interval will do. What about using [0.75, 1.5)? It is easy to reduce to this interval (as I show below), and it imposes only a single requirement: that the approximation be 0 at *x* = 1. This gives an extra degree of freedom that can be used to do a better job of approximating log2. I call the function based on reduction to [0.75, 1.5) *s*, for shift, since the interval has been shifted.

The result is a thing of beauty! The error of *s* is significantly less than the error of *g*. But you might wonder about the cost: isn’t it more expensive to reduce to [0.75, 1.5) instead of [1.0, 2.0)? The answer is that the cost is small. A floating-point number is represented as (1 + *f*) × 2^{*e*} with 0 ≤ *f* < 1, where *f* is stored in the right-most 23 bits. To reduce to [0.75, 1.5) requires knowing when the significand 1 + *f* is at least 1.5, in other words when *f* ≥ 1/2, and that is true exactly when the left-most of the 23 fraction bits is one. In other words, it can be done with a simple bit check, not a floating-point operation.

Here is more detail. To reduce to [0.75, 1.5), I first need code to reduce to the interval [1, 2). There are library routines for this, of course. But since I’m doing this whole project for speed, I want to be sure I have an efficient reduction, so I write my own. That code, combined with the further reduction to [0.75, 1.5), is below. Naturally, everything is written in single-precision floating-point. You can see that the extra cost of reducing to [0.75, 1.5) is a bit-wise operation to compute the value *greater*, and a test to see if *greater* is nonzero. Both are integer operations.

The code does not check that *x* > 0, much less check for infinities or NaNs. This may be appropriate for a fast version of log.

```
float fastlog2(float x)  // compute log2(x) by reducing x to [0.75, 1.5)
{
    // a*(x-1)^2 + b*(x-1) approximates log2(x) when 0.75 <= x < 1.5
    const float a = -.6296735;
    const float b = 1.466967;
    float signif, fexp;
    int exp;
    float lg2;
    union { float f; unsigned int i; } ux1, ux2;
    int greater;  // really a boolean
    /*
     * Assume IEEE representation, which is sgn(1):exp(8):frac(23)
     * representing (1+frac)*2^(exp-127).  Call 1+frac the significand
     */

    // get exponent
    ux1.f = x;
    exp = (ux1.i & 0x7F800000) >> 23;
    // actual exponent is exp-127, will subtract 127 later

    greater = ux1.i & 0x00400000;  // true if signif > 1.5
    if (greater) {
        // signif >= 1.5 so need to divide by 2.  Accomplish this by
        // stuffing exp = 126 which corresponds to an exponent of -1
        ux2.i = (ux1.i & 0x007FFFFF) | 0x3f000000;
        signif = ux2.f;
        fexp = exp - 126;  // 126 instead of 127 compensates for division by 2
        signif = signif - 1.0;                    // <<--
        lg2 = fexp + a*signif*signif + b*signif;  // <<--
    } else {
        // get signif by stuffing exp = 127 which corresponds to an exponent of 0
        ux2.i = (ux1.i & 0x007FFFFF) | 0x3f800000;
        signif = ux2.f;
        fexp = exp - 127;
        signif = signif - 1.0;                    // <<--
        lg2 = fexp + a*signif*signif + b*signif;  // <<--
    }
    // lines marked <<-- are common code, but optimize better
    // when duplicated, at least when using gcc
    return(lg2);
}
```

You might worry that the conditional test *if (greater)* will slow things down. The test can be replaced with an array lookup. Instead of doing a bitwise OR with 0x3f000000 in one branch and 0x3f800000 in the other, you can have a single code path that ORs in an array element indexed by *greater*. Similarly for the other difference between the branches, *exp − 126* versus *exp − 127*. This was not faster on my MacBook.
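For the record, here is a sketch of that array-lookup variant (my own reconstruction of the idea, not the post’s code): the two branch-dependent constants become two-element tables indexed by the “significand ≥ 1.5” bit.

```c
#include <assert.h>
#include <math.h>

// Branch-free core of fastlog2: the exponent bits stuffed into the
// significand, and the exponent adjustment, are table entries selected
// by the bit test instead of an if/else.
float fastlog2_lookup(float x)
{
    const float a = -0.6296735f, b = 1.466967f;
    static const unsigned int stuff[2]  = { 0x3f800000, 0x3f000000 };
    static const float        adjust[2] = { 127.0f, 126.0f };
    union { float f; unsigned int i; } ux1, ux2;

    ux1.f = x;
    int iexp = (ux1.i & 0x7F800000) >> 23;
    int idx  = (ux1.i >> 22) & 1;            // 1 iff significand >= 1.5

    ux2.i = (ux1.i & 0x007FFFFF) | stuff[idx];
    float signif = ux2.f - 1.0f;             // now -0.25 <= signif < 0.5
    float fexp = iexp - adjust[idx];
    return fexp + a * signif * signif + b * signif;
}
```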

In summary:

- To study an approximation *p*(*x*) to a function *f*(*x*), don’t plot *f* and *p* directly, instead plot their difference.
- The measure of goodness for *p* is its maximum error.
- The best *p* is the one with the smallest max error (minimax).
- For a function like log with a zero in the domain of interest, ordinary and relative error are quite different. The proper yardstick for the quality of an approximation to log is the number of correct bits, which is relative error.
- Computing log2(*x*) requires reducing to an interval, but you don’t need that interval to be [1, 2). There are advantages to picking [0.75, 1.5) instead.

In the next post, I’ll examine rounding error, and how that affects good approximations.



One of the projects we are currently working on at Shutl involves building an iOS application. The application is essentially quite simple; it acts as a client for our API, adding animations, visuals, and notifications.

Testing is a key part of our development process, so when we started developing the application, one of the first steps was to find a testing framework that suited our needs. Xcode provides XCTest as a testing framework that works well for unit testing. Unfortunately, if you want to test the behavior of your app from a user’s perspective, XCTest’s abilities are very limited.

Because we are mostly a Ruby shop, we’re familiar with using Cucumber.

That’s how we came across Frank, a handy framework that enables you to write functional tests for your iOS applications using Cucumber.

The way Frank works is that you “frankify” your iOS app, which then lets you use the accessibility features of iOS to emulate a user using an iOS device. You can launch the app, rotate the device, and interact with the screen in most of the ways a real user can.

If you’re familiar with CSS selectors, interacting with elements on the screen should look very familiar, albeit with a slightly different syntax. Frank also provides custom selectors and predefined steps for some of the most common interactions.

For instance if you want to select a label with the content “I am a label” you could use this:

check_element_exists('label marked:"I am a label"')

There are also predefined steps provided for more complex instructions like clicking a button with the content “Click me”:

When I touch the button marked "Click me"

At first we considered testing against a live QA server but soon experienced problems with this setup. We needed predictable data for our tests, and this is difficult to achieve as the data stored in a live QA environment changes all the time. Combine this with availability issues and you’ve got yourself an unworkable solution.

After some thought, the route we decided to take was to mock these services and return fixtures.

The idea is to keep all the logic that directly interacts with the server inside a single class or struct. It provides the necessary functions, such as `fetchUser` and `updateResource`, that can be invoked from wherever they’re needed. This allows us to easily implement alternate versions of these functions without affecting the rest of the code.

In the example code below, we have two different implementations. The first one, shown here, uses our remote API to retrieve data from the server.

```
static func requestSuperHeroName(name: String, gender: String, completionHandler: (String) -> Void) {
    let url = NSURL(string: "http://localhost:4567?name=\(name)&gender=\(gender)")
    let request = NSURLRequest(URL: url!)
    NSURLConnection.sendAsynchronousRequest(request, queue: NSOperationQueue.mainQueue()) { (response, data, error) in
        if let name = NSString(data: data, encoding: NSUTF8StringEncoding) {
            completionHandler(name)
        }
    }
}
```

The second implementation – our test mock – simply returns a hard-coded value with the same structure as the ones returned by the server.

```
static func requestSuperHeroName(name: String, gender: String, completionHandler: (String) -> Void) {
    completionHandler("Super \(name)")
}
```

Next we’ll define two different targets, one using the real client and another one using the test mock client, and we’ll use the second – mocked – target to create our frankified app.

Here is a walk-through of how such an app could be implemented by using our sample app “What superhero are you?”. You provide the app with your name and gender, and it uses a highly advanced algorithm to determine which superhero you are.

- Set your app up with two targets. One will be using the real backend, and the other one will be using the mocked backend.

- Frankify your app.

- Write your first test. Our first feature looks like this:
  ```
  Feature: As a user
    I want to use the app
    So I can determine which superhero I am

    Scenario: Put in my name and gender and have it return which superhero I am
      Given I launch the app
      When I enter my name
      And I choose my gender
      And I touch the button marked "Which superhero am I?"
      Then I want to see which superhero I am
  ```

And the related steps:

```
When(/^I enter my name$/) do
  fill_in('Name', with: 'Jon')
end

When(/^I choose my gender$/) do
  touch("view:'UISegmentLabel' marked:'Male'")
end

Then(/^I want to see which superhero I am$/) do
  sleep 1
  check_element_exists("view:'UILabel' marked:'Super Jon'")
end
```

- Make the tests pass!

- Done!

## Conclusions

This solution works, but it’s not without its limitations, the most significant one being that you need to return some sensible data in your mocks. In our test app we work with very simple logic, and fixed responses did the trick. But fixed responses mean there is no way of testing more complex interactions; those can and should be covered by unit and integration tests, which come with their own problems.

It can also be hard to test certain user actions, like swiping something on the screen. The more customized your app’s interface is, the harder it will be to test it with Frank. Almost anything can be done, but the solution will most likely feel hacky. Also, we have yet to find a way of testing web UIs.

Frank is not a magic bullet for functional testing in Swift, but so far we’ve found it a useful addition to our codebase, and we’re liking it!

### Links

What Superhero are you? on Github

Testing with Frank



The podcast, *How eBay’s Search Technology Helps Users Find Your Listings*, touches on how eBay’s evolving machine-learning search technology helps users find the listings they are looking for. In it, Dan describes the parts of each listing that eBay’s algorithm searches to contextually find the best listings for users’ searches, as well as to recommend additional listings users might be interested in.

The discussion covers the following topics:

- Whether eBay search utilizes users’ prior search behavior to influence search results
- The top areas in a product listing that are crawled first
- Why search is so important to eBay
- What eBay is looking forward to in the future of our search technology
- How eBay is treating mobile search


I’ve written in the past that I believe that retrospectives should be a creative process, and I like to engage the brain using interesting visuals and ideas. I’ve attempted to employ this philosophy at Shutl (an eBay Inc. company) by trying to use a different theme for every retrospective I’ve run. (A recent example of a theme I found through funretrospectives.com is the catapult retro.)

Then a few weeks ago, I made a comment to one of our engineers, Volker, that you could pretty much take any situation you can think of and turn it into a retrospective idea; thus the challenge of a Zombie Apocalypse-themed retro was born!

I was first introduced to retrospectives in 2007. Back then, a typical retro would follow the starfish format (or some variation). However, over the past few years I’ve started to see some limitations with such formats. In an attempt to address the more common anti-patterns, I’ve been moving towards a slightly adapted format. I now try to incorporate action items into the brainstorming section, both to streamline the time taken and to focus the group on constructive conversation. This format achieves a few things:

- Shortens the overall time taken by having the group identify not only what’s helping/hindering the team, but also what they can carry forward to improve their performance in the future
- Ensures a more constructive mindset by increasing focus, during the brainstorming itself, on suggestions that address hindrances
- Helps create more achievable solutions by modifying the typical “action item” phase of the retro to instead be a refinement phase, where previously suggested actions are analyzed and prioritized

With the above goals in mind, I started by scribbling and sketching out some ideas in my notepad; after a short while I had come up with a basic draft for the structure of the retro:

I bandied the idea around in my head for a day or so. The finished product looked like this:

The picture above was drawn on a large whiteboard and divided into three color-coded columns (with a fourth column for action items, complete with a reminder that our final actions require a “what,” a “who,” and a “when”).

*This is you, huddled in the corner, with your stockpile of weaponry at the ready, bravely fighting off the ravenous horde crashing through your doorway.*

What’s your ammo? On green stickies, write down all those things that are fueling your team’s successes and working in your favor.

*This is the zombie horde—a relentless army of endless undead marching towards your destruction.*

Use pink stickies to identify the problems that you are facing (including potential future problems).

*This is your perimeter—the security measures you’ve installed to resist the horde and ensure your survival.*

As you’re identifying the issues you face and the current behaviors that are fueling your success, think about what actions you can take today to either address these issues or ensure continued success. The idea is to try to come up with a solution or suggestion for every problem that you can see on a pink sticky.

I tried out the format on the team. I gave them about seven minutes for the brainstorming, with the usual guidelines around collaboration: encouraging people to talk to each other and to look at each other’s suggestions. As a countdown timer, I personally use the 3-2-1 dashboard widget, but there are plenty of others you can use.

We then had a round of grouping and voting (each team member got three votes), with a reminder to vote on things you want to **discuss**, not just things you agree with (e.g., you could strongly disagree with a point on the board, and vote for it to start a discussion). Due to the nature of the board (if things go well), groups of pink stickies should have corresponding orange ones to direct the discussion towards action items.

I wrote down all action items that came up, and gave the team a caveat that we’d have five minutes at the end to review the actions, prioritize them, and pick the ones that we actually wanted to address; this keeps the discussions flowing. We ended up with some conflicting action items, which was fine; the idea was to get all the potential actions down, and then at the end decide which we felt were the most valuable. During this final review of the actions, we also assigned owners and deadlines. Then we were done!

Here’s what the final board looked like after our 45-minute retro was complete:

Next challenge: what crazy (yet **effective**) retrospective formats can you come up with?


**Have you ever imagined what would happen if you let software developers work on what they want? Well, we did it. For one day. And here are the results…**

“OK, listen: there is no backlog today.”

When we first heard these words from Megan instead of the usual beginning of standup, we didn’t know what to expect. Further explanation wasn’t elaborate either. There was only one rule: you need to demonstrate something at the end of the day.

We had different reactions. We were happy (“Great! A break from the day-to-day tasks!”), shocked (“What did they do with my safe and predictable to-do column! Help!”), and insecure (“Can I really finish something to show in just one day, with no planning, estimating, or design?”).

So that was it. For one full (work) day, all developers in our team at Shutl (an eBay Inc. company) were supposed to forget about ongoing projects, deadlines, and unfinished tasks from the day before. We could work on whatever we wanted. We could pair or work individually. We could work on a DevOps task, on a working feature, or just on a prototype. We could develop a feature on an existing application or create a brand new project.

The first thing we did was a small brainstorm where we described our ideas. It was not obligatory, but it helped in forming pairs and getting some encouragement for our little projects. Then we just started coding.

Now, let me give some background behind this idea. You may have heard about “programmer anarchy” in context of development processes and company culture. In a few words: letting engineers make decisions on what and how they develop in order to meet critical requirements, and getting rid of “managers of programmers” from your development process. Fred George, the inventor of the idea, implemented it in a couple of companies. There was also a big buzz about how Github works with no managers (or rather with everyone being a manager).

These are great examples to read and think about. There are different opinions about this philosophy. Certainly, developing a culture and process that leaves all decisions to developers requires courage, time, money, and a certain kind of people in your team. You have to think very carefully before applying developer anarchy as a day-to-day rule.

We asked ourselves if there was anything we could do without changing our processes and getting rid of our managers, but still gain inspiration from the concepts of developer anarchy? We reckoned we could, and Developer Anarchy Days were born!

Introducing Developer Anarchy Days required very little preparation or change in our organization. No planning or product management was required before it began. We did have some discussions prior to the event on whether it should be a spontaneously picked day or a planned and scheduled action. We decided on a mix of both: team members would get a ‘warning’ email some days in advance so that they could start thinking about it, but the actual day was a surprise.

The concept is very lightweight and open to interpretation. The premise is simple. Give your developers a day without a backlog or predefined tasks and let their creativity take over. This method has benefits for whatever team composition you may have. Less experienced developers get a chance to expand their skills and their self-confidence as they gain experience in owning and delivering something in a short time frame. More experienced developers get a chance to try out some new technologies they’ve been itching to experiment with. Pairing is always an option (and encouraged), so that there is someone to help and learn from.

What if the team is not an agile team at all? Well, that’s actually a great opportunity to taste a bit of agility. What can be more agile than delivering in just one day?

Is it wasted time? It depends on how you define wasted time. If you see it as any time not spent directly on delivering pre-defined business requirements/stories, then yes, it is wasted time. You could say the same about avoiding technical debt, working on chores, organizing meetings, or playing ping-pong after lunch. As with any other culture-related thing, it is hard to say. You may waste one day on building software no one will ever look at again. On the other hand, you may learn something, make developers more motivated, invent internal tools that improve efficiency, and even develop some great new innovations to help achieve business goals.

Is one day enough? Yes and no. It’s probably not enough time to develop something production-ready, but that’s not the intention. It’s more about trying something new, developing a prototype, creating a small internal tool, or just presenting new ideas to the team. For that, we’ve found that one day is enough.

You can make it longer and spend a couple of days on building small projects in small teams. This may be more effective for complex, usable projects, but it also requires more preparation, such as planning around ongoing project roadmaps and probably announcing the event earlier so everyone can prepare potential ideas for the projects.

Developer Anarchy Days have a lot in common with hackathons, hackfests, codefests, or hack-days. They’re all about intensively developing and presenting creative ideas. The main difference is that hackathons are usually bigger events in the form of competition, very often involving participants from outside of the company. They require proper event organization, including marketing, proper venue, food, and infrastructure. Usually, the promotional aspect of it is very important. You don’t need all this to organize a Developer Anarchy Day.

- Developers show that they are able to make decisions and explore creative ideas
- Engineers get a chance to come up with ideas from a technological perspective – something that businesses may sometimes miss
- Developers feel more motivated, because they are doing something of their own
- Developers experience how it is when they have to not only deliver something on time but also limit the project to something they can show and sell to others
- Developers get to feel like product managers and understand that job better
- The event breaks the routine of everyday (sometimes monotonous) deliveries
- The event gives everyone an opportunity to finally do stuff that we thought would be nice, but doesn’t bring any direct or indirect business value (e.g. internal tools)
- Finally, the event allows time to try some new technology or crazy idea!

OK, let’s go back to Shutl and our very first Developer Anarchy Day. It was a busy day, but definitely a fun one. Everyone felt responsible for finishing what they began on time. After all, we all had to present something. Some of us were pairing; some decided to give it a go by themselves. Although we love pairing, it is good to get away from it from time to time.

First thing the next morning, we presented our work. The variety and creativity of our little projects was beyond all expectations! Here are a couple of examples.

As Shutl has a service-oriented architecture, our everyday work (since everyone does DevOps) involves logging into multiple boxes. One of our engineers spent Developer Anarchy Day building a super useful command-line tool that automates logging in to specific environments without having to ssh into multiple boxes and remember server names. We’ve used it every day since, and it makes our lives easier.

Every day we gather lots of feedback from our customers. The stars they give in their reviews though are a bit impersonal. You can learn much more by analyzing the language of the feedback comments. A pair of Shutl developers spent a day building a language sentiment analyzer that allowed us to get a sense of the general mood of our customers, based on the words they used.

Another Shutl engineer decided to be more DevOps for that day. He experimented with some new tools and demonstrated immutable deployments with CloudFormation and Chef.

Looking for common or possible use cases of our services, we realized that it would be really convenient to use Shutl to pick up and deliver items sent by private sellers on Gumtree or eBay. We have Shutl.it, which allows customers to deliver items from point A to B. The idea was to create a shareable link that pre-fills Shutl.it with pick-up information so any retailer or private seller can offer Shutl as an easy delivery option.

We definitely had fun and learned something. In fact, we now use “Easy login” every day, and “Predefined orders” inspired some things on our roadmap.

How did the team react? No surprise here: the reaction was genuinely positive. What can be better for us nine-to-five workers than a little bit of anarchy, especially when it lasts only one day, after which we quickly revert to the comfort and security of a prioritized backlog and product management? We all agreed that we want to repeat the anarchy on a regular basis. And we do. It has become an important part of our work culture.
