I’ve uploaded the latest house price data from the Land Registry for England and Wales to my website. I’ve also made some changes to the way annual changes are calculated. This means the annual change figure is much less volatile, but it does mean changes in the direction of travel are slower to appear. The current annual change is 3.5% but is falling each month. If the current trend continues, the annual change will go negative in about 9 months’ time.
Friday, December 30, 2016
I’ve uploaded the UK station usage data for 2016 to my site. I’ve also made a few minor tweaks to the layout which hopefully make the data more useful.
Wednesday, November 30, 2016
Monday, November 28, 2016
I’ve uploaded the October 2016 Land Registry house price data to my website. Prices are just about managing to maintain a positive annual increase. Although houses are continuing their upward march, there seems to have been a mini collapse in the price of flats, which I’m guessing is due to the Buy To Let tax changes.
Friday, October 28, 2016
Friday, September 30, 2016
As I type this, my server is churning through the latest Land Registry data for August 2016. The top level data has been imported and it suggests prices continue to glide ever upwards. The only difference seems to be that the annual change is getting smaller: whereas it’s been around 5% for the last few years, it’s now around 2%.
If you look back at the data for earlier this year, you can now clearly see the spike in sales in March just before the buy to let Stamp Duty hike. Sales now seem to have returned to the levels seen after the financial crisis.
Wednesday, August 31, 2016
I’ve uploaded the latest house price stats from the Land Registry to my website. As usual, there is very little to report: prices continue to rise gently. Maybe this apparent calm is what leads up to the Minsky Moment?
Tuesday, August 23, 2016
Wednesday, August 17, 2016
For many years I’ve been using the Google Maps APIs on my website. They’ve been fun to use and until recently the licensing has been very unrestrictive. If an API returned a response saying you’d gone over the query limit, you could just wait for a second or so, try again and generally it would work. So with some use of setTimeout, it was possible to build reasonably scalable apps that cost nothing.
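The wait-and-retry trick is easy to sketch. The site itself did this in JavaScript with setTimeout; what follows is only a rough Python equivalent, not the code from the site. The `call_with_retry` helper and the shape of the response (a dict with a `status` key) are assumptions made for illustration; `OVER_QUERY_LIMIT` is the status the Google Maps web services report when you are over the limit.

```python
import time
from typing import Any, Callable, Dict


def call_with_retry(
    request: Callable[[], Dict[str, Any]],
    max_attempts: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call request() again after a short pause while it reports OVER_QUERY_LIMIT.

    `request` is a hypothetical callable that performs one API call and
    returns the parsed response as a dict containing a "status" key.
    """
    response: Dict[str, Any] = {}
    for attempt in range(max_attempts):
        response = request()
        if response.get("status") != "OVER_QUERY_LIMIT":
            return response  # success, or an error that retrying will not fix
        if attempt < max_attempts - 1:
            sleep(delay_seconds)  # wait a second or so before trying again
    return response  # still over the limit after max_attempts tries


if __name__ == "__main__":
    # Fake request: over the limit twice, then OK.
    statuses = iter(["OVER_QUERY_LIMIT", "OVER_QUERY_LIMIT", "OK"])
    result = call_with_retry(lambda: {"status": next(statuses)}, sleep=lambda s: None)
    print(result["status"])
```

This only helps with the old soft limits. Once a hard daily limit is hit, every retry gets the same answer until the counter resets.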
It’s looking like those days are coming to an end. The various APIs are starting to introduce hard limits on their usage. Once you’re over the limit, that’s it until the counter resets at the start of the next day. I first hit this with my use of the Directions API in my Driving Distances page. I can’t say I’m too happy with the way it was introduced. The Google API Console had given no previous indication of my usage of the API, but on the same day as they started displaying the usage report, the hard limit was also introduced. Since my site was way over the limit, that page fell over almost immediately.
So I had a Baldrick cunning plan. I’d swap out the Directions API for the Distance Matrix API. This is exactly the kind of application this API was designed for. Unfortunately I failed to read the usage limits correctly and after I uploaded the new code, the page fell over in a heap again after a few hours. It turns out the usage limits apply to the elements passed to the Distance Matrix API, not the number of requests. So a 10 by 10 matrix counts as 100 towards the free 2,500 limit, not 1 as I had assumed. Given that this API provides less information and has fewer options than the Directions API but has the exact same usage limits, this is rather disappointing!
I am trying to figure out the best way forward now. I could start to pay for extra requests, but since a 100 by 100 matrix would cost $5, the costs could mount up quickly. I can put a maximum daily cost on the account so I don’t have to pay enormous amounts if someone overuses the page, but this could lead to the page becoming unavailable again.
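To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python. The free limit of 2,500 elements and the $5 for a 100 by 100 matrix are the figures quoted above; the per-1,000 price is simply derived from them, and the function names are made up for illustration. It ignores any interaction between the free allowance and paid usage.

```python
FREE_ELEMENTS_PER_DAY = 2500  # free daily limit quoted in the post

# Price implied by the post: 100 x 100 = 10,000 elements for $5,
# which works out at $0.50 per 1,000 elements.
DOLLARS_PER_1000_ELEMENTS = 0.5


def matrix_elements(origins: int, destinations: int) -> int:
    """Each origin/destination pair counts as one element towards the limit."""
    return origins * destinations


def requests_within_free_limit(origins: int, destinations: int) -> int:
    """How many matrices of this size fit inside the free daily limit."""
    return FREE_ELEMENTS_PER_DAY // matrix_elements(origins, destinations)


def cost_of_matrix(origins: int, destinations: int) -> float:
    """Cost in dollars of one paid matrix request at the implied price."""
    return matrix_elements(origins, destinations) / 1000 * DOLLARS_PER_1000_ELEMENTS


if __name__ == "__main__":
    print(matrix_elements(10, 10))             # elements used by a 10 x 10 matrix
    print(requests_within_free_limit(10, 10))  # 10 x 10 requests per day for free
    print(cost_of_matrix(100, 100))            # dollars for a 100 x 100 matrix
```

So a page that fires off 10 by 10 matrices gets 25 of them a day for free, which is not many for a public page.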
I suspect the outcome will be me removing the page, or at least no longer linking to it from the rest of the site. I make a tidy sum from Google AdSense advertising on the site, but I fear this may just be the start, as more and more APIs start to introduce a hard limit and I don’t particularly want to pay that money back to Google every month to pay for their APIs. It was fun whilst it lasted I guess!
Saturday, July 09, 2016
So there’s a page on my site, https://www.doogal.co.uk/BatchGeocoding.php, that someone complained about. Specifically, they complained that if they tried to geocode 3,000 postcodes, it was terribly slow. I tried it myself and experienced the same problem. When geocoding postcodes, the page uses my own internal database, so it should suffer none of the throttling issues of Google Maps. No worries, I thought. I can reproduce the problem, which is generally the biggest hurdle, so fixing it should be straightforward.
So I fired up Chrome’s profiler and found… absolutely nothing… None of the delays were in my code. So I tried Microsoft Edge and it was super quick. I pretty much gave up at that point and suggested the user tried MS Edge.
Three weeks later I had a look with fresh eyes. And something popped up from the recesses of my mind: spellcheck="false". I vaguely remembered that setting that attribute on a textarea in the past had improved performance and once again, this fixed the issue. A single geocode was previously taking a second; now all 3,000 took a couple of minutes. This may be a bug in Chrome or maybe spellchecking is a very CPU intensive process. Either way, turning it off makes everything better.
As always this is just a reminder for me and maybe it will be useful to someone passing through.
Friday, July 01, 2016
Tuesday, May 31, 2016
Wednesday, May 25, 2016
I’ve uploaded the latest UK postcode data to my website. Well, nearly all of it. Northern Irish postcodes are still from 2008 since the nice folk at NISRA still release their data under a restrictive license. I assumed at some point they would come into line with the rest of the UK and provide the postcode data with a liberal license, but it doesn’t seem to be happening. Since the data I have is now very old (from a time when the data was released with a liberal license), I am considering removing it from the site. Let me know if you find it useful and I’ll keep it online. And maybe send a polite request to NISRA to open up their data…
Friday, April 29, 2016
Sunday, April 24, 2016
For a while now, Google has been trying to get everyone to move their sites over to https. There are lots of valid reasons to do this, although the majority of sites don’t really need it.
The carrot of improved rankings hadn’t prompted me to make the change, but Chrome 50 removed support for geolocation on pages served over plain http, and I use geolocation in a number of places. So the site was broken in Chrome 50. And it was that stick that motivated me to make the switch.
One reason I’d held off from using https was the cost of a certificate. But things have moved on and it’s now possible to grab a certificate for nothing from Let’s Encrypt. It’s pretty simple to acquire and install a certificate using these instructions.
So the switch has been made. For most users of the site, nothing should have changed, unless I’ve broken something (let me know!). Comments are currently being migrated, so you may find some comments aren’t where they should be. If you are grabbing data from the site directly you may need to change the URL you use from http:// to https://.
Sunday, April 10, 2016
First I needed to know the number of signed up users of Strava. This is pretty straightforward: head off to https://www.strava.com/athletes/6161562 and keep increasing the number at the end until Strava says it can’t find a user. Last March there were 8.2 million users, now there are about 14.4 million, not a bad increase for just over a year.
Next I wanted to capture the active users and the premium users. Since I’m a techy, I can automate this process using the Strava API and a .NET wrapper around it. So I decided to sample 1 in every 10,000 users, giving me about 1,440 sample users which should give the results a reasonable accuracy.
After pulling down that data, the first thing I noticed was that 47 of my requests had returned ‘Not Found’ errors. In fact, most of these were grouped together, suggesting Strava decided to restart their numbering with larger IDs at some point. So the total number of users is probably just shy of 14 million.
Premium Users
Of the 1,393 users I had left, 28 were Premium users, so approximately 2% of all users. This figure is pretty close to last year’s figure so I’m happy to believe it. That equates to 280,000 premium users or $16.6 million in revenue for Strava.
As an aside, I’m a Premium member, but not because Strava offers particularly compelling features for Premium users, but mainly to show my support for a website that is exceedingly useful and fun. I suspect Strava could differentiate between free and Premium a lot more to increase the percentage of Premium users. Take a look at all the functionality available at veloviewer.com.
Active Users
The Strava API doesn’t let me get activity data for other users, so I’m not able to find out how active users are directly. But it does provide an Updated field, which I’m hoping gets updated when a user uploads an activity (the Strava API docs are a little vague on this point). Using last year’s definition of an active user being someone who has done something in the last 24 days, how many active users are there? I found 181 users where that Updated field was in the last 24 days. That’s about 13%, or 1.8 million active users. The percentage is again fairly close to last year’s figure so I’m happy to go with it.
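For anyone wanting to check the extrapolation, here is a rough sketch of it in Python. The counts (28 Premium and 181 active out of 1,393 sampled users) and the total of roughly 14 million are the figures from this post. The margin of error is something I haven't worked out above: it uses the simple normal approximation for a sample proportion and assumes the sample behaves like a random one, which sampling every 10,000th ID only roughly does. The `estimate_from_sample` name is made up for this sketch.

```python
import math
from typing import NamedTuple


class Estimate(NamedTuple):
    share: float            # proportion seen in the sample
    low: float              # lower end of the approximate 95% interval
    high: float             # upper end of the approximate 95% interval
    estimated_users: float  # share scaled up to the whole user base


def estimate_from_sample(hits: int, sample_size: int, population: int,
                         z: float = 1.96) -> Estimate:
    """Scale a sample proportion up to the population, with a rough margin.

    Uses the normal (Wald) approximation for the interval, which is only a
    rough guide when the proportion is small.
    """
    share = hits / sample_size
    margin = z * math.sqrt(share * (1 - share) / sample_size)
    return Estimate(share, max(0.0, share - margin), min(1.0, share + margin),
                    share * population)


if __name__ == "__main__":
    total_users = 14_000_000  # "just shy of 14 million", rounded
    for label, hits in (("Premium", 28), ("Active", 181)):
        e = estimate_from_sample(hits, 1393, total_users)
        print(f"{label}: {e.share:.1%} ({e.low:.1%} to {e.high:.1%}), "
              f"about {e.estimated_users:,.0f} users")
```

The interval is a reminder that with only 28 Premium users in the sample, the 2% figure is fairly soft.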
Gender
I found 806 men, 270 women and 317 blanks. Ignoring the blanks, that’s almost exactly a 75% / 25% split between men and women.
By Country
I think my sample is too small to draw accurate conclusions from the home countries of Strava users, but let’s play with the numbers anyway. 463 had blank entries for the country, which leaves 930 users with a country specified. I’ve removed countries with fewer than 5 users in my sample and then adjusted for population. Below I’ve highlighted the countries where more than 2% of the population have signed up for Strava. It seems like there is massive potential to increase usage in many countries, although that may depend on whether there is a culture of recreational running and riding in these countries (China and India being the obvious biggest potential markets). And little old Blighty, the UK, is top of the pile. Go UK!
|Country|Population|Sampled users|Approximate Strava users|% of population|
Wednesday, April 06, 2016
So since I don’t have a huge dataset to analyse, let’s see how many segments had been created at the end of every year.
So what does this tell us? It shows the total number of segments created at the end of every year and it looks like since 2011, the number of segments created every year has remained fairly constant. I guess the interesting question would be whether creation of segments can be used as some kind of proxy for usage of Strava, since Strava keep this information confidential. I think the answer to that is probably no. A new user in an area already chock-full of segments probably isn’t going to feel the need to create more, although they may create a few personal ones (home to work etc). Long term users probably already have all the segments they need. A better approach would be to repeat this study from last year.
But I guess it does show Strava is still being actively used by its users. Beyond that it’s hard to say anything definitive.
*Not entirely true, the lowest ID I’ve found is 96, but I imagine the ID of the first segment ever created was 1.
Thursday, March 31, 2016
Monday, February 29, 2016
Saturday, February 27, 2016
I’ve uploaded the latest Land Registry data to my site. The source file is a lot bigger than usual and seems to contain quite a lot of old sales.
Prices are rising even faster and the number of sales is on the increase.
Friday, February 26, 2016
Wednesday, February 03, 2016
But I was still surprised to read someone suggesting they go faster in the winter. The only time this happens to me is when I’m out during a windy winter day and catch a nice tailwind.
But I thought I’d check the Strava data from my site. Which months have the most KOMs? Checking that removes at least one variable, different roads. First I had to update my site to store the KOM date for each segment, then I had to grab some data. I chose segments in the UK, removing those that were less than 0.5km (too easily messed up with dubious GPS data) and removing segments with fewer than 100 riders (not competitive enough), and this is what I got.
This is a fairly small sample but it certainly suggests the summer months are the best time to grab a KOM, which also suggests the summer is the fastest time of the year for cycling. But then a thought crossed my mind: people tend to cycle more during the summer months, so perhaps it’s unsurprising that most KOMs are achieved then. So my conclusion is that I still have no idea and more research is required. And there are too many variables…
For my own recollection, this is the SQL I used to grab the data:
SELECT MONTH(KOMDate) AS KOMMonth, COUNT(*) AS KOMCount
FROM segments
WHERE Country = 'United Kingdom'
  AND KOMDate IS NOT NULL
GROUP BY MONTH(KOMDate)
ORDER BY MONTH(KOMDate);
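The query above doesn’t show the length and rider-count filters mentioned earlier, so here is a rough, self-contained sketch of what the full thing might look like, run from Python against an in-memory SQLite database. It is not the query from my site. The `Distance` (in km) and `RiderCount` columns are hypothetical names invented for this example, the sample rows are made up, and because SQLite has no MONTH() function the month is pulled out with strftime instead.

```python
import sqlite3

# Distance and RiderCount are hypothetical column names for this sketch.
QUERY = """
SELECT CAST(strftime('%m', KOMDate) AS INTEGER) AS KOMMonth,
       COUNT(*) AS KOMCount
FROM segments
WHERE Country = 'United Kingdom'
  AND KOMDate IS NOT NULL
  AND Distance >= 0.5      -- drop segments shorter than 0.5km
  AND RiderCount >= 100    -- drop segments with fewer than 100 riders
GROUP BY KOMMonth
ORDER BY KOMMonth
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with a handful of made-up segments."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE segments "
        "(Country TEXT, KOMDate TEXT, Distance REAL, RiderCount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO segments VALUES (?, ?, ?, ?)",
        [
            ("United Kingdom", "2015-07-04", 1.2, 250),
            ("United Kingdom", "2015-07-19", 0.8, 140),
            ("United Kingdom", "2015-01-10", 2.0, 300),
            ("United Kingdom", "2015-08-02", 0.3, 500),  # too short
            ("United Kingdom", "2015-08-09", 1.5, 40),   # too few riders
            ("United Kingdom", None, 1.0, 200),          # no KOM date stored
            ("France", "2015-07-01", 3.0, 900),          # not in the UK
        ],
    )
    return conn


def koms_per_month(conn: sqlite3.Connection) -> list:
    """Return (month, KOM count) rows for the segments that pass the filters."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for month, count in koms_per_month(build_demo_db()):
        print(month, count)
```

To get at the "people ride more in summer" problem, the counts would need dividing by the number of rides in each month, which is data I don't have.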