I debugged a problem for a friend of mine over the last day. He had installed bbPress on a server which also had WordPress on it, and his WordPress site just suddenly stopped serving comments and pages, and started killing the database server instead.
In the end (after a day of considering this and that and a lot of head-scratching), it turns out that the bbPress installation had been done using the same table prefix as the WordPress one.
See, when a web app is installed, it usually installs a load of database tables. The way to avoid these tables overwriting each other is to “namespace” them. The most common method is to simply place a unique prefix in front of the table names.
For example, WordPress and bbPress both have tables for posts. The default installation is to name them wp_posts and bb_posts, respectively, using wp_ and bb_ as prefixes. KFM does that as well, using kfm_ by default.
In the current case, what happened is that the bbPress installation was done using wp_ as the prefix, and some of the WordPress tables were then changed to match what bbPress expected.
In the case of wp_posts, the field post_status was changed from a varchar holding values such as “publish” or “draft”, to a smallint holding just ’1′ or ’0′. This had the immediate effect of unpublishing over 6000 articles on the blog. (aside: oi, WordPress – why are you not using enums?)
A few hours, spread over a day or so, were spent figuring that out then fixing that.
So, what actually happened? Why was wp_ used as the prefix?
Let’s look at the WordPress installation.
When you install a copy of WordPress, the first thing you do is fill in the database details. This is the field for the prefix:
<input id="prefix" type="text" size="25" value="wp_" name="prefix"/>
Notice the name, “prefix”. A lot of browsers these days include auto-fill, which is sometimes useful. When encountering a form, the browser would fill in the details that it already knows, such as email address, etc.
The problem, though, is that sometimes, autofill might fill in details that really should be left blank or at their default states. An example of this is in phpBB, the popular forum software – whenever an admin goes to edit a user’s details, autofill in Firefox tends to fill in the “password” field in the user form, but it really shouldn’t do that, and can potentially break a user’s account.
This is actually quite easy to fix, though, in the installer – the installation program just needs to change to use a different name for its input box:
<input id="prefix" type="text" size="25" value="wp_" name="wp_prefix"/>
Now the value should only be autofilled by values recorded from other WordPress installations, and should not affect future installations of other software.
The thing is, this is not what broke the system I was fixing.
Let’s look at the bbPress installation.
When writing bbPress, more thought appears to have gone into the installation. The only fields visible here are database name, user, and password. These are the most commonly changed details, so are shown prominently.
There is an “advanced” checkbox. When clicked, this shows fields that should only be changed if you are sure of what they do:
The fields shown here rarely need to be changed. They include database host, collation, character set and table prefix.
The default character set is UTF8, and I can’t imagine why anyone would need anything other than that (that also covers collation). My own opinion is that these ones shouldn’t even be in advanced – they should be hard-coded.
There are not many reasons why the table prefix would ever need to be changed, but in the case I was troubleshooting, the database host was held on a separate server to the site itself.
So, because the database host needed to be changed, that explains why the admin opened the Advanced section in the first place.
If the prefix input box was named “prefix”, this would be the end of the story – I might have assumed that the problem was autofill (as I did for a while), and suggest the above solution to them.
But, the authors of bbPress have obviously come across the autofill problem themselves, because they named their prefix button “bb_table_prefix”, exactly as I suggest every installer should do (now do it with WordPress, guys!):
<input [...] name="bb_table_prefix" class="text" value="bb_" tabindex="8" />
So, the problem is not the software.
In this case, I believe it’s human error. What follows is my own guess at what happened, in hind-sight and without the benefit of watching the proceedings in person.
I think what happened is that our admin’s techie had left some notes about the settings used for the WordPress installation, including database username, password, etc. Then the admin had gone to install bbPress, in the correct assumption that installation should be straightforward and compatible with WordPress.
However, I believe the problem is that the techie had left a note about table prefix, but had not explained to the admin that all the other settings must be as written but the prefix must not.
The admin had simply filled in what was written, and bbPress did what it does, and changed WordPress’s database tables.
So, who’s at fault?
I believe the problem here was a misunderstanding – to all of us who have done a million installations of a thousand different software applications, it’s very obvious what the table prefix is, and what it does. However, I don’t believe I’ve ever seen it written down, or the dangers of using an existing value.
Let’s look at the bbPress installation again, this time with the ‘?’ expanded:
It says “If you are running multiple bbPress sites in a single database, you will probably want to change this.” While literally correct, this actually doesn’t explain what the prefix is or why it should be changed. And most importantly, it does not say “This value must be unique in your database, or it may overwrite your existing database tables.” The help text in bbPress’s case does not actually explain the danger of using existing values for the prefix.
The techie is also at fault. If leaving notes for someone, always make sure to explain anything which needs explaining. In this case, it should be something like “If installing new software, use these database access details, but make sure that you do not use these database table prefixes.” In fact, the table prefix should not have been in the notes at all – if the value is a default, then don’t write it down.
The admin is also at fault, for following the notes to the letter and filling in the prefix where the default value was actually perfect. I won’t fault him for installing software when the techie wasn’t at hand to point out mistakes, as there really shouldn’t have been any – the prefix error was actually quite easy to make; it just happened to have devastating effects.
However, these are all minor faults. They’re all easy mistakes to make, and it was the combination of all three that caused the error.