Wednesday, April 30, 2014

Land Registry March 2014 data uploaded

I’ve uploaded the Land Registry house price data for March 2014 to my website. Now that probably all the sales data for 2013 has come in, it’s plain to see sales volumes were up in 2013 and prices continue to drift upwards

Friday, April 18, 2014

The perils of micro-optimisations

A debate has been raging on my website over the use of StringBuilder.AppendFormat in my exception logger code. OK, raging is something of an exaggeration, there have been two comments in two years. But the point made by two people is that rather than

error.AppendLine("Application: " + Application.ProductName);

I should be using

error.AppendFormat("Application: {0}\n", Application.ProductName);

Since this means I wouldn’t be using string concatenation, which is considered bad for performance reasons. My main reason for not doing anything about this is because I’m lazy, but also because the whole point of this code is that it only runs when an exception is thrown, which hopefully is a pretty rare event, so performance is not a major concern.

But then I wondered what the difference in performance is between these two approaches? So I wrote a little test application that looks like this.

    static void Main(string[] args)
    {
      for (int j = 0; j < 10; j++)
      {
        // try using AppendLine
        Console.WriteLine("AppendLine");
        StringBuilder error = new StringBuilder();
        Stopwatch sw = new Stopwatch();
        sw.Start();
        for (int i = 0; i < 1000000; i++)
        {
          error.AppendLine("Application: " + Application.ProductName);
        }
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);

        // try using AppendFormat
        Console.WriteLine("AppendFormat");
        error.Clear();

        sw.Restart();
        for (int i = 0; i < 1000000; i++)
        {
          error.AppendFormat("Application: {0}\n", Application.ProductName);
        }
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);
      }

      Console.ReadKey();
    }

The results from this app in milliseconds are as follows (reformatted for clarity)

AppendLine 307 315 321 372 394 370 289 298 300 296
AppendFormat 366 360 362 471 353 359 354 365 365 350

So which is quicker? Well it looks like AppendLine might be marginally quicker. But, much more importantly, who the feck cares? We are repeating each operation 1 million times and the time to execute is still less than half a second. Maybe you can pick holes in my test application, but again I would ask who the feck cares? Either approach is really fast.

And this is the main problem with trying to optimise this kind of stuff. We can spend huge amounts of time figuring out if one approach is quicker than another, but a lot of the time is doesn’t matter. Either the code runs quick enough using any sensible approach, or it’s hit so infrequently that even a really poor implementation will work.

Of course we should consider performance whilst writing code, but we should only use particular approaches when we know they are going to produce more performant code. A good example is the StringBuilder class. We can be pretty sure this is going to be better than using string concatenation, otherwise it wouldn’t exist in the first place. That said, if you’re concatenating two strings I really wouldn’t worry about it.

But the key to writing efficient code is to understand what is slow on a computer. Network operations are slow. Disk access is slow. Because of that, anything that requires large amounts of memory (meaning virtual memory i.e. disk access) is slow. Twiddling bits in memory is quick. Fast code is achieved by avoiding the slow stuff and not worrying about the quick stuff.

And once you’ve written your code and found it doesn’t run ask quick as you’d hoped, don’t jump in and replace calls to AppendLine with calls to AppendFormat, profile your application! Every time I profile an application, I’m always amazed at the causes of the performance bottleneck, it’s rarely where I thought it would be.

If you don’t have a profiler, use poor man’s profiling. There are also free profilers available, I quite liked the Eqatec Profiler which seems to be available from various download sites, although it’s no longer available from Eqatec. But whatever you do, don’t get into Cargo Cult Programming