Category Archives: mysql

multi-tenanted CMS architecture

Last week, I did a talk at the Dublin Google buildings titled “Multi-tenanted CMS Architecture using PHP”.

Here are the slides that I used:

While talking with Google’s Brian Brazil, he explained that it is actually more efficient to use one database and many separate tables, than to separate each installation into a separate database, so one point I made (that KV-WebME uses separate databases per site) will change in the future.

I think the talk went down well, by the number of questions afterwards.

Last year, I gave a similar talk, and made the mistake of including way too much PHP in it – I had assumed that the audience would be composed of PHP developers. This year, there is just one slide of PHP, and that’s just to illustrate one possible way to build a proxy config.

Lesson’s learned for this time:

  • Talk slower. When I’m explaining something, I tend to try to get as much in as possible, so speak very fast. This makes it hard to hear what I’m saying.
  • More pictures, less words!
  • Stats. Some of the questions were around how efficient certain parts of the method were – particularly on the overhead of piping a file through a script as opposed to simply delivering it via Apache. I need to come up with numbers for that.

Overall, I was happy with this presentation.

CMS Design using PHP and jQuery

I’m happy this week. Last week, I spent some time and organised myself a bit more. In work, things are going smoothly – managed to get over a tricky piece of work and the rest is simply a list of small tasks.

For the last few weeks, I’ve been emailing and messaging Darshana at Packt Publishing, about writing a second book (jQuery 1.3 with PHP is going very well – list of reviews).

I initially wanted to write about file management, to explain how KFM works, and to help force me to improve on it. But there’s just not enough of an interested market in that – it’s too specialised.

So instead, I’ll be writing about CMS design using PHP and jQuery.

We (webworks.ie) have a CMS engine which we’ve written and improved for the last 6 or so years. We’ve open-sourced it a number of times, but never managed to generate much interest in it. We never had the time to spend on publicising it.

The book will not be specifically about that engine, but rather about the concepts that went into creating it – how a CMS works, how to manage plugins, administration, user management, and all the other little bits and pieces that every PHP developer needs to eventually address.

By way of explanation, I will be demonstrating various parts of our CMS, and explaining how and why it was built that way. I will be closely examining the other major CMSes as well, and giving alternative methods where good ones exist.

The proposed chapter list is:

  1. Introduction
  2. CMS core design
  3. User management and access control
  4. Page creation and Navigation
  5. Template Management
  6. Plugins
  7. Form creation
  8. Image Gallery
  9. Panels
  10. Search and Polls
  11. RSS and News
  12. Online Store
  13. Products

I’m really excited about this project!

jQuery 1.3 With PHP: Datatables chapter is online

Packt have put another chapter of my book online. This time, it’s the datatables chapter, which discusses how to use the Datatables plugin to organise very large data tables.

The chapter discusses how to make a large (over two million rows) database usable through an ajaxified table.

The Datatables plugin supports search, sorting and pagination.

a tale of prefixes

I debugged a problem for a friend of mine over the last day. He had installed bbPress on a server which also had WordPress on it, and his WordPress site just suddenly stopped serving comments and pages, and started killing the database server instead.

In the end (after a day of considering this and that and a lot of head-scratching), it turns out that the bbPress installation had been done using the same table prefix as the WordPress one.

See, when a web app is installed, it usually installs a load of database tables. The way to avoid these tables overwriting each other is to “namespace” them. The most common method is to simply place a unique prefix in front of the table names.

For example, WordPress and bbPress both have tables for posts. The default installation is to name them wp_posts and bb_posts, respectively, using wp_ and bb_ as prefixes. KFM does that as well, using kfm_ by default.

In the current case, what happened is that the bbPress installation was done using wp_ as the prefix, and some of the WordPress tables were then changed to match what bbPress expected.

In the case of wp_posts, the field post_status was changed from a varchar holding values such as “publish” or “draft”, to a smallint holding just ’1′ or ’0′. This had the immediate effect of unpublishing over 6000 articles on the blog. (aside: oi, WordPress – why are you not using enums?)

A few hours, spread over a day or so, were spent figuring that out then fixing that.

So, what actually happened? Why was wp_ used as the prefix?

Let’s look at the WordPress installation.

wordpress

When you install a copy of WordPress, the first thing you do is fill in the database details. This is the field for the prefix:

<input id="prefix" type="text" size="25" value="wp_" name="prefix"/>

Notice the name, “prefix”. A lot of browsers these days include auto-fill, which is sometimes useful. When encountering a form, the browser would fill in the details that it already knows, such as email address, etc.

The problem, though, is that sometimes, autofill might fill in details that really should be left blank or at their default states. An example of this is in phpBB, the popular forum software – whenever an admin goes to edit a user’s details, autofill in Firefox tends to fill in the “password” field in the user form, but it really shouldn’t do that, and can potentially break a user’s account.

This is actually quite easy to fix, though, in the installer – the installation program just needs to change to use a different name for its input box:

<input id="prefix" type="text" size="25" value="wp_" name="wp_prefix"/>

Now the value should only be autofilled by values recorded from other WordPress installations, and should not affect future installations of other software.

The thing is, this is not what broke the system I was fixing.

Let’s look at the bbPress installation.

bbpress

When writing bbPress, more thought appears to have gone into the installation. The only fields visible here are database name, user, and password. These are the most commonly changed details, so are shown prominently.

There is an “advanced” checkbox. When clicked, this shows fields that should only be changed if you are sure of what they do:

bbpress2

The fields shown here rarely need to be changed. They include database host, collation, character set and table prefix.

The default character set is UTF8, and I can’t imagine why anyone would need anything other than that (that also covers collation). My own opinion is that these ones shouldn’t even be in advanced – they should be hard-coded.

There are not many reasons why the table prefix would ever need to be changed, but in the case I was troubleshooting, the database host was held on a separate server to the site itself.

So, because the database host needed to be changed, that explains why the admin opened the Advanced section in the first place.

If the prefix input box was named “prefix”, this would be the end of the story – I might have assumed that the problem was autofill (as I did for a while), and suggest the above solution to them.

But, the authors of bbPress have obviously come across the autofill problem themselves, because they named their prefix button “bb_table_prefix”, exactly as I suggest every installer should do (now do it with WordPress, guys!):

<input [...] name="bb_table_prefix" class="text" value="bb_" tabindex="8" />

So, the problem is not the software.

In this case, I believe it’s human error. What follows is my own guess at what happened, in hind-sight and without the benefit of watching the proceedings in person.

I think what happened is that our admin’s techie had left some notes about the settings used for the WordPress installation, including database username, password, etc. Then the admin had gone to install bbPress, in the correct assumption that installation should be straightforward and compatible with WordPress.

However, I believe the problem is that the techie had left a note about table prefix, but had not explained to the admin that all the other settings must be as written but the prefix must not.

The admin had simply filled in what was written, and bbPress did what it does, and changed WordPress’s database tables.

So, who’s at fault?

I believe the problem here was a misunderstanding – to all of us who have done a million installations of a thousand different software applications, it’s very obvious what the table prefix is, and what it does. However, I don’t believe I’ve ever seen it written down, or the dangers of using an existing value.

Let’s look at the bbPress installation again, this time with the ‘?’ expanded:

bbpress3

It says “If you are running multiple bbPress sites in a single database, you will probably want to change this.” While literally correct, this actually doesn’t explain what the prefix is or why it should be changed. And most importantly, it does not say “This value must be unique in your database, or it may overwrite your existing database tables.” The help text in bbPress’s case does not actually explain the danger of using existing values for the prefix.

The techie is also at fault. If leaving notes for someone, always make sure to explain anything which needs explaining. In this case, it should be something like “If installing new software, use these database access details, but make sure that you do not use these database table prefixes.” In fact, the table prefix should not have been in the notes at all – if the value is a default, then don’t write it down.

The admin is also at fault, for following the notes to the letter and filling in the prefix where the default value was actually perfect. I won’t fault him for installing software when the techie wasn’t at hand to point out mistakes, as there really shouldn’t have been any – the prefix error was actually quite easy to make; it just happened to have devastating effects.

However, these are all minor faults. They’re all easy mistakes to make, and it was the combination of all three that caused the error.

Having said all that, I realise that I have some installers to check now to make sure I follow my own advice!

Mastering phpMyAdmin 3.1

Last year, I reviewed Mastering phpMyAdmin 2.11 for Packt, and said it was overall a comprehensive book, but with a few points that I would change.

I was sent a copy of Mastering phpMyAdmin 3.1 a few weeks ago and after reading it, I can see that most of the points I mentioned have been addressed, and I know from other sources that the points that were not changed have good reasons for them (screenshots and we/our speech are very good for people that are not familiar with the subject).

There are many reviews of this book already, so I will mostly describe how this edition is different from the previous one. In short, though, I would say that this version is much more readable, and is still the best book to buy if you want a book on phpMyAdmin. The reviews I’ve read here all agree with my own assessment that phpMyAdmin is a surprise – right when you think you know what it does, a book points out a load of stuff you didn’t know it did.

Things that are improved

I suggested last year that the 5 page date-by-date history of phpMyAdmin’s delivery be moved from the front of the book to an appendix. At the time, the book didn’t have an appendix. The new edition has two appendices – the first one is for the history, as suggested, and the second contains the old book’s chapter 20 on Troubleshooting. The short history of MySQL at the beginning of the MySQL 5.0 chapter has also been removed – it had no purpose.

A few of the unnecessary screenshots have been removed. I understand the need for screenshots, as they sometimes describe better than mere text what the author means when he says something, but my complaint from last year was that screenshots that were simply not necessary had been put in apparently just to take up space. This has been improved this year, with many of the redundant images removed.

SQL code screenshots have been replaced with just the text of the SQL. This makes the examples much easier to read.

Language issues, such as switching collations, had originally been a full chapter, but is now spread out in the book, where appropriate. This is good, as when exporting a backup (for example), you don’t want to have an important part of the explanation, collation, hidden away at the back of the book.

A lot of the “since 2.6.1″, “since 2.6.0″ wording has been removed. This is good, as the reader of the book is obviously using version 3.1 or higher. The author does sometimes mention the version number when adding new text, and I think that’s unnecessary – the reader really should not need to know what version of phpMyAdmin a new feature was introduced in. If you really want, add another appendix listing dates of improvements, but it’s really not needed; that’s what changelogs are for.

phpMyAdmin itself has a number of new minor features that are detailed in the new book. For example, exports (SQL dumps) can be done to Texy! text format, the search wizard has been improved to allow searches for empty/non-empty fields. Support for the PBXT storage engine is added, public query bookmarks, the default browse view’s query can be customised

There are a number of major features which also make their debut here. New features include database partitioning, scheduled events, streamable blobs (for videos, etc). I’m looking forward to making use of streaming blobs!

Things that are not improved

The table of contents is even longer. At 13 pages (versus 12 last year), I still feel it’s too large. My comment last year was that maybe just the higher level headers should be shown, with more detailed contents at the beginning of each chapter. As an example, under the user management section, there is a header for adding a user. That should be enough. Instead, we then have headers describing the username, the password, the hostname, etc. That sort of thing makes the front of the book very heavy. I feel that most readers when opening the book for the first time wonder to themselves when will the contents end and the content begin.

Most of the screenshots have been redone, but in some cases, shots which were perfectly sized in the old book have been exploded up to larger images in the new one, making them huge relative to the content they portray. For example, on page 78, the top screenshot has a total of about four lines of text in it, but takes up almost two inches of page space (and is pixelated), compared to the original on page 75 of the old book at only one inch. I can see this in a lot of cases in the new book (another is the 3 inch bottom image on 132 vs the perfectly legible 2 inch one in the old book’s page 128) and wonder if that helped to add to the number of pages. The new book has 326 pages versus the old book’s 318. On the good side, in some cases, the new screenshots reduced the size of the old ones.

I didn’t mention it last year, but remember puzzling over it at the time. The last part of the Troubleshooting appendix (chapter 20 back then) discusses future enhancements to phpMyAdmin, but the goals are very vague. “improved support of mysql”, “internal code improvements”; these should not be considered goals – they should be considered givens! Besides which, it’s not productive to discuss future events that may or may not happen. They’re not available yet, so shouldn’t be written about in a manual which is about /existing/ work.

Conclusion

I noted a number of improvements, and a number of failures. The improvements outnumber the failures. Each edition of this book gets better and better.

If you are new to phpMyAdmin and are looking for a book, I really do recommend buying this.

bootnote: I have my own Packt book coming out in a few months. I hope that reviewers of that book (PHP And jQuery) will be as critical as I try to be in my own reviews – I feel that criticism is good, and can only help to improve future editions. This year’s edition of Mastering phpMyAdmin is better than last year’s, and I feel that it is because the author was receptive to criticism and improved his book based on it. I hope I can be just as receptive when my own time comes!

optimise all tables in all mysql dbs on your server

A quick script to do so:

<?php
mysql_connect('localhost','root','YOURPASSWORD');
$alldbs=mysql_query("SHOW DATABASES");
while ($dbs = mysql_fetch_assoc($alldbs)) {
	foreach($dbs as $db_name){
		if($db_name=='information_schema' || $db_name=='mysql')continue;
		mysql_select_db($db_name);
		echo '<strong>database: '.$db_name.'</strong><br />';
		$alltables = mysql_query("SHOW TABLES");
		while ($table = mysql_fetch_assoc($alltables)) {
			foreach ($table as $db => $tablename) {
				echo 'optimising '.$tablename.'<br />';
				mysql_query("OPTIMIZE TABLE `".$tablename."`") or die(mysql_error());
			}
		}
	}
}

Based on this original script for optimising all tables in a specific database.

webme, step 2: upgrading and user accounts

Last time, we wrote a basic installer and nothing else. This time, we’ll create user/admin accounts and discuss how upgrading can be done easily.

With user accounts, it is tempting to create two separate tables of users – one of admins and another of ordinary users, but that’s redundant. The same information is saved in both cases, so it makes more sense to have one table dedicated to users as a whole and some way of distinguishing the ordinary users from the administrators.

The way we’ll handle this is through use of Roles. The simplest being to have a table of users and a table of groups, and if a user is a member of the “admins” groups, then they have admin privileges. This is an example of ACL, which can get very complex if you dig into it. I’d recommend sticking to just users and groups unless there is a really good reason to get more in-depth than that.

So, the next thing we need in the CMS is the ability to set up one admin from the installer, and the tables needed to support it.

Logging in will be done via email and password.

Up until recently, the most common method was logging in via username and password, but that has a disadvantage – when you forget your password, it’s also possible that you forgot which email address you registered with, or you no longer have access to the email address.

Using your email address as the login, you are reminded every time you login what email address the new password will be sent to if you forget the old one. And if you know you’ve lost access to the old email address, you can take measures to change the login email before it becomes impossible to do so.

Adapting the installer to ask for these details was not difficult so I won’t detail it.

Upgrades are an interesting problem.

In a non-database environment, upgrades consist of merely unzipping the new version over the old one (or running “svn up” in the root of the site).

Upgrades become much trickier when databases are involved. Upgrading the files might cause an inconsistency, where the scripts may reference database tables that don’t exist or have been changed.

One project I worked on got around this by having a directory which held database patch scripts which an upgrader should run before the site should be used again. That was messy.

Other projects get around this by requiring you to upgrade your files, and then go to a URL (/upgrade.php for example) which will do the database upgrade for you. Better, but still messy.

I think the solution is to have a database version number kept in a file which is upgraded (the front index.php for example), and another version number kept in the site config file, which is not upgraded. When someone visits the site index, it’s a non-expensive process to simply compare those numbers, and if they are different, then an upgrade is automatically performed, and the config file is adjusted with the new version number.

In that way, upgrading is simply a matter of unzipping the new files, and the next time anyone visits the site, it is upgraded automatically.

Anyway – that’s enough for this step. You still can’t create any pages, but we’re almost ready for it.

As usual, you can get this code from the google Webworks-Webme repository. No zipped package available yet, although the next article might be the first which warrants it (we’ll create an actual page and discuss templates).

case-insensitive authentication in postfix-mysql

Just a quick note. If you find that your customers can log into their postfix account ok, but can’t read their email, check the directory their email is in:

[root@postfix /]# cd /home/vmail/domainname.ie/
[root@postfix domainname.ie]# ls -l
drwx------  5 postfix postfix 4096 Sep  4 11:52 joe.smith
drwx------  9 postfix postfix 4096 Sep  9 09:30 jane.smyth
drwx------  9 postfix postfix 4096 Sep  9 21:09 Jane.Smyth

The problem appears to be that postfix authenticates using MySQL, which is case-insensitive by default, then creates a default email skeleton directory named after the login username if it doesn’t find one already.

You can see in the above example that our user has logged in using uppercase letters in the username, MySQL verified the user had a right to be there, then postfix created the user’s directory using the login credentials, even though a lower-case version of the directory name already existed…

Obviously, this incorrect account will not receive email – email will be sent to the correct one.

There are a few solutions:

  • you can remove the Jane.Smyth directory and tell the user to change their authentication to use a lowercase username.
  • you can remove the Jane.Smyth directory then create a symbolic link from jane.smyth to Jane.Smyth.
  • change your MySQL installation to use a case-sensitive collation.

Personally, i go for the first solution.

book review: Learning PHP Data Objects

Overview: Learning PHP Data Objects, by Dennis Popel, is an introduction to PDO, which walks through the building of a believable test example – a library manager for your home library. Each chapter introduces a new facet of PDO and shows how to rewrite the appropriate parts of the application to slot the new ideas in. Very clear and easy to read. Non-PDO subjects are appropriately kept to the appendices.

I really couldn’t find very much about this book that I didn’t like. Ignoring the appendices, the book is 154 pages purely devoted to teaching PDO through examples, including error handling, working with BLOBs, even the creation of the M in MVC (Models).

I mentioned MVC there. One of my gripes with most tutorials of MVC is that they introduce the concept simply, then provide pages and pages of code with the end product which is “hello world”. Why I should go to all that trouble instead of simply writing <php echo 'hello world'; ?> to the screen usually escapes me. Dennis, however, concentrates solely on the Model and shows exactly why it’s a great idea. I think some more separation of concerns would have been better (don’t mix Author and Book SQL in the same object, for example), but the ideas were all good.

I think that if Dennis was going to show how the Model works, he should also have gone a little further and showed an example of an Active Record pattern as well. But I guess the point of showing MVC was more to show /an/ example of abstraction of the DB code, and that was sufficient.

The book covers a Library manager application all the way through from conception to implementation, demonstrating at all points that the code works with SQLite and MySQL (and by implication, all other DBMS’s) with a change of only the connection string.

Possible problems are explained clearly and solutions are provided. For example, Dennis explains why, after you compile the query select * from books, PDO (and indeed the database itself) does not know how many rows it will return. A solution, in the form of a very smart getRowCount() function shows a query-agnostic method for counting results of an arbitrary line of SQL.

Other areas that are covered in the book include error-handling, prepared statements and transaction-handling.

PDO can handle Prepared Statements even if the underlying DBMS cannot handle it, so it is possible to write your code in a cross-platform way. Examples of why you should use this are provided. One of the examples shows an efficient way to handle insertion or updating of a table using the same parameters for both cases, with the row-handling function deciding whether to use update or insert based on whether an ID was provided.

I feel the Transactions section could have been expanded a bit further. It is not explained how PDO handles this for DBMS’s that don’t internally support transactions, and I wouldn’t like to assume that they work all the time, only to find after deleting critical data that it’s not supported.

Overall, I enjoyed reading this book. Dennis is a good writer and I think he explained his thoughts very clearly.

On an aside, my four-year-old son Jareth loves Packt Publishing‘s books. Sometimes when I go to read another chapter, I need to covertly steal the book I’m reading back from him. For a while, he made it a bed-time ritual to grab all the Packt books he could find around and bring them up with him to read in bed. I think he loved the screen-shots and the frequent code samples. He’s high-functioning autistic and likes literary constructs, and programming books are perfect for him in that regard. Thanks Packt, you’ve made my son (and therefore me) happy.

review: Mastering phpMyAdmin 2.11 for Effective MySQL Management

Packt publishing emailed me a week ago and asked me if I was interested in writing a review of a book on phpMyAdmin.

In short, my review is this: overall, the book detailed everything I can think of (and more) about phpMyAdmin – I learned about some features in phpMyAdmin that I was not already aware of. I would ask for less screenshots, more how-tos and less “we/our” speech for the next revision, but if there are any questions I have about phpMyAdmin, the answers are in this book.

I use phpMyAdmin only occasionally, when I can’t get direct access to the database in question (I’m a console jockey – there’s nothing like the thrill of tappity-tapping away, knowing that you’re doing Important Stuff and …well, yeah – geek).

I’ve used phpMyAdmin since forever, so was familiar with its common uses – importing/exporting databases, running simple queries, etc. I was interested to see if there was anything that I might have missed. And who better to show me than one of the developers, Marc Delisle.

The table of contents was large, spread over 12 pages. I feel it would have been easier to read if there was one page detailing just h1 listings, followed by 2-3 pages of h2 listings. The h3 and h4 listings could have been shown at the beginning of the concerned chapters.

Disregarding that, scanning the contents showed some features that I was not aware of. Did you know that phpMyAdmin has a built-in relations manager? Even if the engine used is MyISAM, which does not manage referential integrity, phpMyAdmin will protect you and manage it for you if queries are run through it.

Screenshots are provided for everything. This makes some things much easier to understand. Sometimes it made interesting features stand out even more. Did you know that phpMyAdmin can generate PDFs of the relational schema of a database? Some screenshots in chapter 15 show examples of this feature.

Marc obviously knows his stuff. As a major contributor to phpMyAdmin since the early days, he is an ideal person to talk us through configuration details and the various tools that phpMyAdmin offers.

There are, however, a number of things which detract from the readability of the book.

This is the first edition for this title, so the following are probably “teething” problems for the author. However, I think they should be taken care of for the second edition, to make the book easier to read. Although, the second edition would most likely have a different title (why is the version number in the title?)

The book is written from a 1st-person-plural point of view – now we do this, now we do that, our table, our server. This distracted and annoyed me while reading the book.

It is too detailed in places, at the expense of readability. For example, chapter one goes through a blow-by-blow account of how phpMyAdmin came to be, with version numbers and even dates! I feel this should have been left to an appendix. The first chapter of any book should introduce a reader gently to the subject; not pummel the reader with historical facts.

The author sometimes forgets that the reader does not know the subject and its community’s habits. As an example, on page 28 the configuration value “PmaAbsoluteUri” is described, but afterwards, I was not sure a) what “Pma” is, or b) what the variable was supposed to do (the book described how it should be configured, but not what it was supposed to do). It isn’t until page 33 that the book pointed out that “pma” is an acronym for “phpMyAdmin” used by the community. I don’t feel stupid for not getting this immediately, as there is an awkward mixture of letter-cases going on there. By rights, “Pma” should stand for “Phpmyadmin”, not “phpMyAdmin”.

The sheer number of screenshots was a bit confusing. I feel that screenshots should only be used where the subject is not clear from the text. Two images on every page (almost) has the effect of breaking your concentration when you’re trying to read.

The book exclusively deals with how to use phpMyAdmin, but I think there should have been more about how to deal with MySQL itself, using phpMyAdmin. For example, chapter 17 details phpMyAdmin’s coverage of character sets and collations. As a person who knows the pain of upgrading a database from MySQL 4 to MySQL 5 while switching character sets, I was looking forward to seeing how this case is handled (which I am sure is pretty common in one form or another). However, the chapter describes the problems with character sets and does not offer any solutions to those problems.

Overall, the book was detailed, and I learned a lot of new stuff from it. I’d recommend it for administrators that are not comfortable with CLI MySQL and want a reference for the best web-based MySQL administration tool out there.