Fast Approximate Logarithms, Part I: The Basics

Performance profiling of some of the eBay code base showed the logarithm (log) function to be consuming more CPU than expected. The implementation of log in modern math libraries is an ingenious wonder, efficiently computing a value that is correct down to the last bit. But for many applications, you’d be perfectly happy with an approximate logarithm, accurate to (say) 10 bits, especially if it were two or three times faster than the math library version. This post is the first of a series that examines approximate logarithms and the tradeoff between accuracy and speed. By the end of the series, you will see how to design fast log functions, and will discover that on a MacBook Pro, you can get a roughly 3x speedup if you are willing to settle for 7 bits. Alternatively, if you want more accuracy you can get 11 bits with a 2x speedup. The numbers might vary for your target machine, but I will present code so that you can determine the tradeoffs for your case.

You can find code for approximate logs on the web, but it rarely comes with an evaluation of how it compares to the alternatives, or in what sense it might be optimal. That is the gap I’m trying to fill here. The first post in this series covers the basics, but even if you are familiar with this subject I think you will find some interesting nuggets. The second post considers rounding error, and the final post gives the code for a family of fast log functions.

A very common way to compute log (meaning \ln = \log_e) is by using the formula \log x = (\log 2) \log_2 x \approx 0.6931472 \log_2x to reduce the problem to computing \log_2. The reason is that computing \log_2 x for arbitrary x is easily reduced to computing \log_2 x for x in the interval [1, 2); details below. So for the rest of this series I will focus exclusively on computing \log_2. The red curve in the plot below shows \log_2 on [1, 2]. For comparison, I also plot the straight line y=x-1.
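As a concrete instance of the conversion formula: \log 10 = 0.6931472 \log_2 10 \approx 0.6931472 \times 3.3219 \approx 2.3026.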

[Figure 1: \log_2 x (red) on [1, 2], together with the line y = x - 1.]

If you’ve taken a calculus course, you know that \log x has a Taylor series about x=1 which is \log x = (x-1) - \frac{1}{2} (x-1)^2 + \cdots. Combining with \log_2 x = (\log_2 e)\log x gives (t for Taylor)

    \[ t(x) \approx -0.7213 x^2 + 2.8854 x - 2.1640 \]
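Spelling out the arithmetic: \log_2 e \approx 1.4427, so

    \[ t(x) = 1.4427\left[(x-1) - \frac{1}{2}(x-1)^2\right] = 1.4427(x-1) - 0.7213(x-1)^2, \]

which multiplies out to the quadratic above.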

How well does t(x) approximate \log_2 x?

[Figure 2: the Taylor quadratic t(x) compared with \log_2 x on [1, 2].]

The plot shows that the approximation is very good when x \approx 1, but is lousy for x near 2, so it is a flop over the whole interval from 1 to 2. But there is a function that does very well over the whole interval: b(x) = -0.344845x^2 + 2.024658x -1.674873. I call it b for better. It is shown below in red (\log_2 in blue). The plot makes it look like a very good approximation.

[Figure 3: b(x) (red) and \log_2 x (blue) on [1, 2].]

A better way to see the quality of the approximation is to plot the error \log_2 x - b(x). The largest errors are around x \approx 1.2 and x \approx 1.7.

[Figure 4: the error \log_2 x - b(x) on [1, 2].]

Now that I’ve shown you an example, let me get to the first main topic: how do you evaluate different approximations? The conventional answer is minimax. Minimax is very conservative—it only cares about the worst (max) error. It judges the approximation over the entire range 1 \leq x < 2 by its error on the worst point. As previously mentioned, in the example above the worst error occurs at x \approx 1.2, or perhaps at x \approx 1.7, since the two have very similar errors. The term minimax means you want to minimize the max error, in other words find the function with the minimum max error. The max error here is very close to 0.0050, and it is the smallest you can get with a quadratic polynomial. In other words, b solves the minimax problem.
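If you want to check the 0.0050 figure for your own machine, a brute-force scan of the error is enough. Here is a minimal sketch; it samples a grid rather than solving the minimax problem, so it verifies the max error of b but not its optimality.

#include <math.h>
#include <stdio.h>

// Brute-force check of the max error of b(x) = -0.344845x^2 + 2.024658x - 1.674873
// over [1,2), sampled on a fine grid.
int main(void)
{
    double maxerr = 0.0;
    for (int i = 0; i < 1000000; i++) {
        double x = 1.0 + i * 1.0e-6;
        double b = -0.344845*x*x + 2.024658*x - 1.674873;
        double err = fabs(log2(x) - b);
        if (err > maxerr) maxerr = err;
    }
    printf("max |log2(x) - b(x)| on [1,2): %.6f\n", maxerr);   // close to 0.0050
    return 0;
}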

Now, onto the first of the nuggets mentioned at the opening. One of the most basic facts about \log is that \log 1 = 0, whether it’s \log_2 or \ln or \log_{10}. This means there’s a big difference between ordinary and relative error when x \approx 1.

As an example, take x = 1.001. The error in b is quite small: \log_2 x - b(x) \approx -0.005. But most likely you care much more about relative error: (\log_2 x - b(x))/\log_2(x), which is huge, about -3.35. It’s relative error that tells you how many bits are correct. If \log_2x and b(x) agree to k bits, then (\log_2 x - b(x))/\log_2(x) is about 2^{-k}. Or putting it another way, if the relative error is \epsilon, then the approximation is good to about -\log_2 \epsilon bits.
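In code, the bits-of-accuracy calculation is a one-liner; here is a small helper along those lines (the name bits_correct is my own).

#include <math.h>

// Number of correct bits implied by the relative error of an approximation.
// Assumes exact is nonzero.  For b at x = 1.001 this comes out negative,
// reflecting the huge relative error near x = 1.
float bits_correct(float approx, float exact)
{
    return -log2f(fabsf((exact - approx) / exact));
}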

The function b that solved the minimax problem solved it for ordinary error. But it is a lousy choice for relative error. The reason is that its ordinary error is about -0.005 near x=1. As x \rightarrow 1, \log_2 x \rightarrow 0, and so the relative error, roughly -0.005/\log_2 x, blows up. But no problem. I can compute a minimax polynomial with respect to relative error; I’ll call it r(x) for relative. The following table compares the coefficients of the Taylor method t, minimax for ordinary error b, and minimax for relative error r:

[Table: coefficients of t, b, and r.]

The coefficients of b and r are similar, at least compared to t, but r is a function that is always good to at least 5 bits, as the plot of relative error (below) shows.

[Figure 5: relative error of r(x) on [1, 2].]

Here’s a justification for my claim that r is good to 5 bits. The max relative error for r occurs at x=1, x=1.5, and x=2. For example, at x=1.5

    \begin{eqnarray*} r(1.5) & = & 0.1001\textbf{1000}\ldots \\ \log_2(1.5) & = & 0.1001\textbf{0101}\ldots \\ \end{eqnarray*}

If you’re a nitpicker, you might question whether this is good to 5 bits as claimed. But if you round each expression to 5 bits, each is 0.10011_2.
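In decimal, 0.10011_2 = 0.59375 while \log_2 1.5 = 0.58496\ldots, so the relative error at x=1.5 is roughly 0.015, comfortably below 2^{-5} = 0.03125.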

 

Unfortunately, there’s a big problem we’ve overlooked. What happens outside the interval [1,2)? Floating-point numbers x are represented as f \cdot 2^k with 1 \leq f < 2. This leads to the fact mentioned above: \log_2 x = \log_2(f \cdot 2^k) = \log_2 f + k. So you only need to compute \log_2 on the interval [1,2). When you use r(x) for 1 \leq x < 2 and reduce to this range for other x, you get
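For a concrete instance of the reduction, \log_2 6 = \log_2(1.5 \cdot 2^2) = \log_2 1.5 + 2 \approx 2.585. The C library will even do the split for you; here is a minimal sketch using frexpf, with log2_approx standing in for a polynomial such as r that is only valid on [1,2). (The fast code later in this post does its own reduction for speed.)

#include <math.h>

// Reduce log2(x), x > 0, to an evaluation on [1,2): frexpf gives x = f * 2^k
// with 0.5 <= f < 1, so log2(x) = log2(2f) + (k - 1) and 2f lies in [1,2).
float log2_via_reduction(float x, float (*log2_approx)(float))
{
    int k;
    float f = frexpf(x, &k);
    return log2_approx(2.0f * f) + (float)(k - 1);
}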

[Figure 6: relative error of r(x) when used for all x via range reduction.]

The results are awful for x just below 1. After seeing this plot, you can easily figure out the problem. The relative error of r for x \approx 2 is about 0.02, and is almost the same as ordinary error (since the denominator is close to \log_2 2 = 1). Now take an x just below 1. Such an x is multiplied by 2 to move it into [1,2), and the approximation to \log_2 is r(2x) - 1, where the -1 compensates for changing x to 2x. The ordinary error (r(2x) - 1) - \log_2(x) \approx r(2) - 1 is still about 0.02. But \log_2 x is very small for x \approx 1, so the ordinary error of 0.02 is transformed to 0.02/\log_2 x, which is enormous. At the very least, a candidate for small relative error must satisfy r(2) = \log_2(2) = 1. But r(2) \approx 0.98. This can be fixed by finding the polynomial that solves the minimax problem for all x. The result is a polynomial g for global.

[Table: coefficients of g.]

One surprise about g is that its coefficients appear to be simple rational numbers, suggesting there might be a simple proof that this polynomial is optimal. And there is an easy argument that it is locally optimal. Since g(x) = Cx^2 + Bx + A must satisfy g(1) = 0 and g(2) = 1, it is of the form g_C(x) = C(x-1)^2 + (1-C)(x-1). When x > 1 the relative error is \epsilon(x) = (g_C(x) - \log_2 x)/\log_2 x, and \lim_{x \rightarrow 1^+} \epsilon(x) = (1-C)\log 2 - 1. When x < 1, \epsilon(x) = (g_C(2x) - 1 - \log_2 x)/\log_2 x, and \lim_{x \rightarrow 1^-} \epsilon(x) = 2(1+C)\log 2 - 1. The optimal g_C has these two limits equal, that is (1-C)\log 2 - 1 = 2(1+C)\log 2 - 1, which has the solution C = -1/3.
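Plugging C = -1/3 back in gives the explicit polynomial

    \[ g(x) = -\frac{1}{3}(x-1)^2 + \frac{4}{3}(x-1) = -\frac{1}{3}x^2 + 2x - \frac{5}{3}, \]

which satisfies g(1) = 0 and g(2) = 1 and has the simple rational coefficients noted above.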

 

[Figure 7: relative error over all x; g (blue) compared with r.]

Globally (over all x), the blue curve g does dramatically better, but of course it comes at a cost. On the interval [1, 2) its relative error is not as good as r’s. That’s because g is required to satisfy g(2) = 1 in order to have a small relative error at x=2. The extra requirement reduces the degrees of freedom, and so g does less well on [1, 2).

[Figure 8: relative error of g on the interval [1, 2).]

Finally, I come to the second nugget. The discussion so far suggests rethinking the basic strategy. Why reduce to the interval [1,2)? Any interval [t, 2t) will do. What about using [0.75, 1.5)? It is easy to reduce to this interval (as I show below), and it imposes only a single requirement: the approximating polynomial must be 0 at x=1. This gives the polynomial an extra degree of freedom that can be used to do a better job of approximating \log_2. I call the function based on reduction to [0.75, 1.5) s for shift, since the interval has been shifted.

[Table: coefficients of s.]

[Figure 9: relative error of s, based on reduction to [0.75, 1.5).]

The result is a thing of beauty! The error of s is significantly less than the error in g. But you might wonder about the cost: isn’t it more expensive to reduce to [0.75, 1.5) instead of [1.0, 2.0)? The answer is that the cost is small. A floating-point number is represented as (1+f)2^e, with f stored in the right-most 23 bits. Reducing to [0.75, 1.5) requires knowing whether the significand 1+f is at least 1.5, in other words whether f \geq \frac{1}{2}, and that is true exactly when the left-most of the 23 fraction bits is one. In other words, it can be done with a simple bit check, not a floating-point operation.

Here is more detail. To reduce to [0.75, 1.5), I first need code to reduce to the interval [1,2). There are library routines for this of course. But since I’m doing this whole project for speed, I want to be sure I have an efficient reduction, so I write my own. That code combined with the further reduction to [0.75, 1.5) is below. Naturally, everything is written in single-precision floating-point. You can see that the extra cost of reducing to [0.75, 1.5) is a bit-wise operation to compute the value \mathsf{greater}, and a test to see if \mathsf{greater} is nonzero. Both are integer operations.

The code does not check that x > 0, much less check for infinities or NaNs. This may be appropriate for a fast version of log.

float fastlog2(float x)  // compute log2(x) by reducing x to [0.75, 1.5)
{
    // a*(x-1)^2 + b*(x-1) approximates log2(x) when 0.75 <= x < 1.5
    const float a =  -.6296735;
    const float b =   1.466967;
    float signif, fexp;
    int exp;
    float lg2;
    union { float f; unsigned int i; } ux1, ux2;
    int greater; // really a boolean 
    /* 
     * Assume IEEE representation, which is sgn(1):exp(8):frac(23)
     * representing (1+frac)*2^(exp-127)  Call 1+frac the significand
     */

    // get exponent
    ux1.f = x;
    exp = (ux1.i & 0x7F800000) >> 23; 
    // actual exponent is exp-127, will subtract 127 later

    greater = ux1.i & 0x00400000;  // true if signif >= 1.5
    if (greater) {
        // signif >= 1.5 so need to divide by 2.  Accomplish this by 
        // stuffing exp = 126 which corresponds to an exponent of -1 
        ux2.i = (ux1.i & 0x007FFFFF) | 0x3f000000;
        signif = ux2.f;
        fexp = exp - 126;    // 126 instead of 127 compensates for division by 2
        signif = signif - 1.0;                    // <<--
        lg2 = fexp + a*signif*signif + b*signif;  // <<--
    } else {
        // get signif by stuffing exp = 127 which corresponds to an exponent of 0
        ux2.i = (ux1.i & 0x007FFFFF) | 0x3f800000;
        signif = ux2.f;
        fexp = exp - 127;
        signif = signif - 1.0;                    // <<--
        lg2 = fexp + a*signif*signif + b*signif;  // <<--
    }
    // lines marked <<-- are common code, but optimize better 
    //  when duplicated, at least when using gcc
    return(lg2);
}
You might worry that the conditional test if (greater) will slow things down. The test can be replaced with an array lookup. Instead of doing a bitwise | with 0x3f000000 in one branch and 0x3f800000 in the other, you can have a single statement that uses \mathsf{arr[greater]}, with \mathsf{greater} normalized to 0 or 1. Similarly for the other difference between the branches, exp-126 versus exp-127. This was not faster on my MacBook.
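For reference, the lookup version might look something like this (a sketch; the array names and the normalization of greater to 0 or 1 are my choices):

float fastlog2_nobranch(float x)   // sketch of the array-lookup variant
{
    const float a = -.6296735;
    const float b =  1.466967;
    // index 0: significand in [1, 1.5), keep exponent field 127
    // index 1: significand in [1.5, 2), stuff exponent field 126 (divides by 2)
    static const unsigned int stuff[2] = { 0x3f800000, 0x3f000000 };
    static const int bias[2]           = { 127, 126 };
    union { float f; unsigned int i; } ux1, ux2;
    ux1.f = x;
    int exp = (ux1.i & 0x7F800000) >> 23;
    int greater = (ux1.i >> 22) & 1;            // top fraction bit, 0 or 1
    ux2.i = (ux1.i & 0x007FFFFF) | stuff[greater];
    float signif = ux2.f - 1.0f;
    float fexp = (float)(exp - bias[greater]);
    return fexp + a*signif*signif + b*signif;
}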

 

In summary:

  • To study an approximation p(x), don’t plot \log_2 and p(x) directly; instead, plot their difference.
  • The measure of goodness for p(x) is its maximum error.
  • The best p(x) is the one with the smallest max error (minimax).
  • For a function like \log_2 x, ordinary and relative error are quite different. The proper yardstick for the quality of an approximation to \log_2 x is the number of correct bits, which is relative error.
  • Computing \log requires reducing to an interval [t, 2t) but you don’t need t=1. There are advantages to picking t = 3/4 instead.

In the next post, I’ll examine rounding error, and how that affects good approximations.


Functional iOS Testing in Swift

One of the projects we are currently working on at Shutl involves building an iOS application. The application is essentially quite simple; it acts as a client for our API, adding animations, visuals, and notifications.

Testing is a key part of our development process, so when we started developing the application, one of the first steps was to find a testing framework that suited our needs. Xcode provides XCTest, a testing framework that works well for unit testing. Unfortunately, if you want to test the behavior of your app from a user's perspective, XCTest's abilities are very limited.

Because we are mostly a Ruby shop, we're familiar with using Cucumber.

That's how we came across Frank, a handy framework that enables you to write functional tests for your iOS applications using Cucumber.

Frank

The way Frank works is that you “frankify” your iOS app, which then lets you use the accessibility features of iOS to emulate a user using an iOS device. You can launch the app, rotate the device, and interact with the screen in most of the ways a real user can.

If you’re familiar with CSS selectors, interacting with elements on the screen should look very familiar, albeit with a slightly different syntax. Frank also provides custom selectors and predefined steps for some of the most common interactions.

For instance, if you want to select a label with the content “I am a label”, you could use this:

check_element_exists('label marked:"I am a label"')

There are also predefined steps provided for more complex instructions like clicking a button with the content “Click me”:

When I touch the button marked "Click me"

At first we considered testing against a live QA server but soon experienced problems with this setup. We needed predictable data for our tests, and this is difficult to achieve as the data stored in a live QA environment changes all the time. Combine this with availability issues and you’ve got yourself an unworkable solution.

After some thought, the route we decided to take was to mock these services and return fixtures.

How we are doing it

The idea is to keep all the logic that directly interacts with the server inside a single class or struct. It provides the necessary functions, such as fetchUser and updateResource, that can be invoked from wherever they're needed. This allows us to easily implement alternate versions of these functions without affecting the rest of the code.

In the example code below, we have two different implementations. The first one, shown here, uses our remote API to retrieve data from the server.


static func requestSuperHeroName(name: String, gender: String, completionHandler: (String) -> Void) {
    let url = NSURL(string: "http://localhost:4567?name=\(name)&gender=\(gender)")
    let request = NSURLRequest(URL: url!)
    NSURLConnection.sendAsynchronousRequest(request, queue: NSOperationQueue.mainQueue()) {(response, data, error) in
        if let name = NSString(data: data, encoding: NSUTF8StringEncoding) {
            completionHandler(name)
        }
    }
}

The second implementation – our test mock – simply returns a hard-coded value with the same structure as the ones returned by the server.


static func requestSuperHeroName(name: String, gender: String, completionHandler: (String) -> Void) {
    completionHandler("Super \(name)")
}

Next we’ll define two different targets, one using the real client and another one using the test mock client, and we’ll use the second – mocked – target to create our frankified app.

Step by step

Here is a walk-through of how this could be implemented, using our sample app “What superhero are you?”. You provide the app with your name and gender, and it uses a highly advanced algorithm to determine which superhero you are.

  1. Set your app up with two targets. One will be using the real backend, and the other one will be using the mocked backend.
    targets
  2. Frankify your app.
    frank
  3. Write your first test. Our first feature looks like this:
    Feature:
      As a user
      I want to use the app
      So I can determine which superhero I am
     
    Scenario:
      Put in my name and gender and have it
      return which superhero I am
     
    Given I launch the app
    When  I enter my name
    And   I choose my gender
    And   I touch the button marked "Which superhero am I?"
    Then  I want to see which superhero I am
    

    And the related steps:

    When(/^I enter my name$/) do
      fill_in('Name', with: 'Jon')
    end
     
    When(/^I choose my gender$/) do
      touch("view:'UISegmentLabel' marked:'Male'")
    end
     
    Then(/^I want to see which superhero I am$/) do
      sleep 1
      check_element_exists("view:'UILabel' marked:'Super Jon'")
    end
    
  4. Make the tests pass!
    passing-tests
  5. Done!
    done

Conclusions

This solution works, but it's not without its limitations. The most significant one is that you need to return some sensible data in your mocks. In our test app we work with very simple logic, and it did the trick. We return fixed responses, which means that there is no way of testing more complex interactions. These can and should be covered by unit and integration tests, which come with their own problems.

It can also be hard to test certain user actions, like swiping something on the screen. The more customized your app's interface is, the harder it will be to test it with Frank. Almost anything can be done, but the solution will most likely feel hacked. Also, we have yet to find a way of testing web UIs.

Frank is not a magic bullet for functional testing in Swift, but so far we've found it a useful addition to our codebase, and we're liking it!

Links

What Superhero are you? on Github
Testing with Frank
(CC image by Chris Harrison)
(CC image by Esther Vargas)

Search Engine Journal: How eBay’s Search Technology Helps Users Find Your Listings

Dan Fain, Vice President of Search Technology at eBay Inc., joined Kelsey Jones of the Search Engine Journal for a Marketing Nerds podcast to discuss eBay’s proprietary search technology, which helps both sellers and users in search results.

The podcast, How eBay's Search Technology Helps Users Find Your Listings, touches on how eBay's evolving machine-learning search technology helps users find the listings they are looking for. Dan describes the parts of each listing that eBay's algorithm searches to contextually find the best listings for users' searches, as well as how it recommends additional listings users might be interested in.

The discussion covers the following topics:

  • Whether eBay search utilizes users’ prior search behavior to influence search results
  • The top areas in a product listing that are crawled first
  • Why search is so important to eBay
  • What eBay is looking forward to in the future of our search technology
  • How eBay is treating mobile search