Sunday, May 24, 2009

To ID or not to ID

Somebody commented on my post about my implementation of a US states table, saying there was a problem with it because there was no ID field. I can understand the comment but don’t necessarily agree with it. The table has a primary key, the state code. In this instance, this seems a reasonable approach, the state code isn’t likely to ever change and non-numeric primary keys can still be used as foreign keys in other tables. In fact using the state code has its advantages since in some queries, joining to the state table won’t be necessary if only the state code is required in the result set, so SQL is simplified and is probably quicker.

But ID fields are almost expected by many people, but is this purely down to habit, or are there good reasons for having an ID field in tables? In many cases, there is no other obvious column (or number of columns) to use as a primary key in the table. Then it clearly makes sense to add an ID column. But are there other reasons to use an ID column? Does it offer better performance? I don’t actually know the answer to this question, but I’m interested to know.

I do have an example where a seemingly sensible unique column was used as a primary key, but this caused problems down the track and an ID column would have been a better solution. It was a user table that used the user name as the primary key. This seems like a reasonable idea, since it has to be unique, but the problem came when users wanted to change their user name in the system. This can be due to any number of reasons, but the main one is getting married. Since the user name was used as a foreign key on lots of other tables, things got somewhat complicated when trying to update them all. Of course with hindsight a user ID column would have been a better solution in this case.

So perhaps that’s the reason people add ID columns by default, since it’s perhaps impossible to guess beforehand whether the natural primary key will need to be updated. That said, it seems pretty unlikely that state codes will change, so I’m happy to keep the state code as my primary key.

No comments: