July 7, 2004

Hacking WordPress with Microsoft’s IIS Web Server

Making WordPress work on Microsoft’s IIS server involved a couple of tricky issues that I think are worth writing down.

Now that I’ve finished the entry, I’ve noticed that the only IIS-specific items are #6, Permalinks Formatting, and #8, RSS Comments Feed. The rest of the entry is about the other issues I encountered getting the site up and running. Hopefully some of this information will be useful to others.

  1. Installation and Configuration

    I won’t bore you with the details of getting WordPress up and running because there are several places that describe installation better than I can. I will say though that I use phpMyAdmin to manage my MySQL databases. Much easier than managing them through the command line.

  2. Customizing index.php

    index.php is the primary user-visible page in your WordPress site. WordPress generates pages dynamically, and most pages bottleneck through this single page/template, using query parameters to define whether it’s the front door, a single story, a monthly archive, or another page in your site.

    I moved a lot of things around in index.php because I wanted it to look like my old site. I also tweaked the css files quite a bit (see next item). In particular I wanted to move the “posted by” info to the bottom of each post, and I didn’t like the way that the side bar was generating sections and links. So, I went in and munged around with the template, added a couple of custom php functions, and a couple of hours later I had a site that looked almost identical to my old site. But better.

    Here’s the index page I finally settled on.

  3. CSS Files

    I searched for “wordpress templates” on Google, and found this site that had lots of example WordPress templates. I liked the look of several of them, so I downloaded them, and took a look at their css files. None of them were exactly what I wanted, but with a couple of hours of prodding and poking, I was able to shoehorn my previous css styles into the WordPress css structure.

    You’ll find my two css files here and here.

  4. Problems with Writing Options

    I started out selecting a couple of writing options (Admin -> Options -> Writing), such as “Formatting: Convert emoticons like :-) and :-P to graphics on display” and “Formatting: WordPress should correct invalidly nested XHTML automatically”. I also liked the looks of the “Search-hilite” plug-in as well as the “Textile 2” plug-in (both available under Admin -> Plugins).

    They all seemed to work fine with short test posts, but one or the other would barf when the post got long, or when there were lots of html tags in the code. At one point I could see that my posts got entered in the database, but the page would never return. I tracked it down to getting stuck in calls to do_action('publish_post', $post_ID); and do_action('edit_post', $post_ID);.

    Instead of tracking down the actual problems, I simply turned off both “Formatting” options and both plug-ins. Since then the problems seem to have gone away.

  5. Fixing Comment Moderation and the Spam Filter

    Comment spam is a big problem on this little site; I can’t imagine how much of a problem it must be for sites with more traffic. I was excited about trying out the “Comment Moderation” feature (Admin -> Options -> Discussion). I grabbed the spam words from the suggested location and entered them into the Comments Moderation field.

    Then I did a test by adding a couple of comments with spam words. But the comments didn’t get moderated as I thought they should; instead they appeared in the comments section, spam words and all.

    What was wrong, I wondered? I re-read the Comment Moderation page and noticed that it said “Separate multiple words with new lines”. But the spam words that I had gotten from the link weren’t separated by lines, and I kind of preferred being able to see all the spam words without scrolling, so I went digging in the code.

    I found the offending function, check_comment(), in /wp-includes/functions.php. To have your spam words separated by spaces instead of lines, look for the line that says $words = explode("\n", get_settings('moderation_keys')); (that’s a backslash-n, the newline character) and change it to $words = explode(" ", get_settings('moderation_keys')); (that’s a space character). With this change each of the words in the Comments Moderation field gets checked one at a time, and comments with any of the spam words will be put in the moderation queue.
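
    A variation I didn’t try at the time: rather than choosing between spaces and new lines, the list could be split on any run of whitespace, so either layout works. The sketch below is standalone PHP, not the WordPress source. The function names split_moderation_keys() and comment_needs_moderation() are made up for illustration, and the sample word list stands in for whatever get_settings('moderation_keys') would return.

```php
<?php
// Hypothetical helper (not part of WordPress): split a moderation list on
// any run of whitespace, so spaces and new lines are both accepted.
function split_moderation_keys(string $raw): array
{
    // PREG_SPLIT_NO_EMPTY drops empty entries caused by leading or trailing whitespace.
    return preg_split('/\s+/', $raw, -1, PREG_SPLIT_NO_EMPTY);
}

// Hypothetical check: true when any key occurs in the comment text,
// compared case-insensitively.
function comment_needs_moderation(string $comment, array $keys): bool
{
    foreach ($keys as $key) {
        if (stripos($comment, $key) !== false) {
            return true;
        }
    }
    return false;
}

// Sample list mixing spaces and a new line (an assumption for this sketch).
$keys = split_moderation_keys("casino poker\nviagra  ");
```

    With something like this in place of the explode() call, a word list pasted as one long line and a list entered one word per line would be treated the same way.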

  6. Permalinks Formatting

    WordPress allows you to specify whether you want Permalinks (more accurately, all the links on your site) to be specified using parameters, like this: /index.php?year=2004&monthnum=07&day=07, or using paths, like this: /feed/2004/07/07/. The latter format is preferred because search engines can be touchy about following links with lots of parameters, as it’s harder for them to determine if they’ve gotten into an infinite loop.

    The Permalinks configuration page (Admin -> Options -> Permalinks), is where you specify the permalinks format. The two most common formats are /archives/%year%/%monthnum%/%day%/%hour%/%minute%/%second%/ or /archives/%year%/%monthnum%/%day%/%postname%/.

    The former produces links that look like /archives/2004/07/07/12/01/01/ and the latter produces links that look like /archives/2004/07/07/weblog_entry_title_goes_here/. I chose the former format because it allows me to change the title of a piece without changing the permalink.
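
    To make the mapping from format to link concrete, here is a minimal sketch of how the %tags% could be filled in from a post’s timestamp. It is an illustration only, not how WordPress does it internally; expand_permalink() is a name I made up, and it handles only the date and time tags from the first format.

```php
<?php
// Hypothetical illustration (not WordPress code): expand the date/time
// permalink tags for a given Unix timestamp, using UTC throughout.
function expand_permalink(string $structure, int $timestamp): string
{
    $tags = [
        '%year%'     => gmdate('Y', $timestamp),
        '%monthnum%' => gmdate('m', $timestamp),
        '%day%'      => gmdate('d', $timestamp),
        '%hour%'     => gmdate('H', $timestamp),
        '%minute%'   => gmdate('i', $timestamp),
        '%second%'   => gmdate('s', $timestamp),
    ];
    // strtr() replaces every tag in a single pass.
    return strtr($structure, $tags);
}

// gmmktime(hour, minute, second, month, day, year): 2004-07-07 12:01:01 UTC.
$posted = gmmktime(12, 1, 1, 7, 7, 2004);
echo expand_permalink('/archives/%year%/%monthnum%/%day%/%hour%/%minute%/%second%/', $posted), "\n";
```

    Because nothing in that link depends on the title, renaming a post leaves the link untouched, which is the property I was after.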

    But choosing the format was easy. The real problem is that IIS doesn’t have a built-in rewrite rule handler like Apache does. Luckily I’d looked at rewrite rule handlers for IIS in a former life, and with a quick glance at Google found the one I’d used before – ISAPI_Rewrite. I installed the Lite version (free), and after a little futzing with the rewrite rules, got it working.

    Here’s how I did it:

    1. I downloaded and installed the Lite version from here.

    2. You can configure IIS to use ISAPI_Rewrite on all sites, or you can install it on only individual sites. I chose the latter.

      To install ISAPI_Rewrite on a single site, open the IIS Internet Service Manager. Right-click on the site you want to install it on, and choose “Properties” from the popup menu. Select the “ISAPI Filters” tab, then click the “Add” button. Name it “ISAPI Rewrite”, and click “Browse” to find ISAPI_Rewrite.dll in the install location. Select ISAPI_Rewrite.dll, then click “Open”. Finally, click “OK” to close your site’s properties panel.

    3. Now you have to add some Rewrite Rules to the ISAPI_Rewrite httpd.ini file. You’ll find this file in the same folder as the ISAPI_Rewrite.dll file.

      A nice WordPress feature is that at the bottom of the Permalinks page they provide the Apache Rewrite Rules for the Permalinks format you’ve chosen. What they provide there is very close to what we need for ISAPI_Rewrite, but we will have to tweak it a bit.

      The first three lines of the WordPress rewrite rules for my Permalinks options looked like this:

      RewriteEngine On
      RewriteBase /
      RewriteRule ^archives/category/(.*)/(feed|rdf|rss|rss2|atom)/?$ /wp-feed.php?category_name=$1&feed=$2 [QSA]

      I changed them to read as follows:


      [ISAPI_Rewrite]

      RewriteRule /archives/category/(.*)/(feed|rdf|rss|rss2|atom)/?$ /wp-feed.php?category_name=$1&feed=$2 [I,U,O]

      What did I do? I threw the first two lines away (RewriteEngine and RewriteBase), changed the ^ at the start of the first RewriteRule pattern to a /, and changed [QSA] to [I,U,O] at the end of the line. That’s it. Now you can do the same to the rest of the rewrite rules – change the initial ^ to a /, and the final [QSA] to [I,U,O]. Save the file, and you’re done.

      Now, because we’re using the Lite version, we need to restart IIS (in fact you need to do this whenever you make changes to the ISAPI_Rewrite rewrite rules). So, fire up a command window, type “iisreset /restart”, hit return, wait for IIS to restart, and you’re in business – Permalinks as they’re supposed to be.

    4. You can see my rewrite rules here.
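
    I made these edits by hand, but since the change is mechanical it could also be scripted. The following is a rough sketch under some assumptions: convert_rules_for_isapi() is a made-up name, every rule is assumed to sit on a single line, and only rules that start with ^ and end in [QSA] are rewritten. Anything else passes through unchanged and would need checking by eye.

```php
<?php
// Hypothetical helper: apply the manual edits described above to a block
// of Apache rewrite rules and return text for ISAPI_Rewrite's httpd.ini.
function convert_rules_for_isapi(string $apacheRules): string
{
    $out = ['[ISAPI_Rewrite]', ''];
    foreach (preg_split('/\R/', $apacheRules) as $line) {
        $line = trim($line);
        // Skip blank lines and the two directives that get thrown away.
        if ($line === '' || preg_match('/^Rewrite(Engine|Base)\b/', $line)) {
            continue;
        }
        // Change the ^ at the start of the pattern to a /.
        $line = preg_replace('/^RewriteRule\s+\^/', 'RewriteRule /', $line);
        // Swap the trailing [QSA] flag for [I,U,O].
        $line = preg_replace('/\[QSA\]$/', '[I,U,O]', $line);
        $out[] = $line;
    }
    return implode("\n", $out) . "\n";
}

// The three lines WordPress showed for my Permalinks options.
$apache = <<<'RULES'
RewriteEngine On
RewriteBase /
RewriteRule ^archives/category/(.*)/(feed|rdf|rss|rss2|atom)/?$ /wp-feed.php?category_name=$1&feed=$2 [QSA]
RULES;

echo convert_rules_for_isapi($apache);
```

    The output would still need to be pasted into httpd.ini, followed by an IIS restart, exactly as with the hand-edited version.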

  7. More Comments Problems

    With the rewrite rules in place, I added a test entry. But something was missing: there were no comments, and no place to add a comment. Hmmm, looks like more digging was going to be required.

    To make a long story short, I had to change code in one place:

    In wp-blog-header.php change

    if (1 == count($posts)) {
      if ($p || $name) {
        $more = 1;
        $single = 1;
      }

    to

    if (1 == count($posts)) {
      $single = 1; // <== add this line here
      if ($p || $name) {
        $more = 1;
        $single = 1;
      }

    This hack forces $single to 1 (true) whenever exactly one post is returned. This has the side effect that comments show on the home page when only one post is showing. But I decided I could live with that until the WordPress guys fixed it properly.

  8. RSS Comments Feed

    The last thing I had to debug was the RSS comments feed. When you’re looking at an entry with comments, there’s a link that says “RSS feed for comments on this post”. Clicking on it did nothing on my machine, so I went digging in the code again.

    In the end I needed to make two changes, one a code change, the other a rewrite rule change.

    First the code change. In wp-feed.php change

    if ( (($p != '') && ($p != 'all')) || ($name != '') || ($withcomments == 1) ) {
      require('wp-commentsrss2.php');
    }

    to

    if ( (($p != '') && ($p != 'all')) || ($name != '') || ($withcomments == 1) || ($feed == 'rss2_comments') ) {
      require('wp-commentsrss2.php');
    }

    Note the addition of || ($feed == 'rss2_comments') to the if statement.

    I think the bug is that $withcomments never gets set anywhere. Instead of trying to figure out where that should be getting set, I simply check the value of $feed.

    Now to the rewrite rule change. You’ll notice that the RSS comments URLs look something like this: /archives/2004/07/06/23/15/22/rss2_comments/. Notice the /rss2_comments/ at the end of the URL. We need rewrite rules that handle that case.

    I re-opened the ISAPI_Rewrite httpd.ini file and changed all of the rewrite rules that had this in them (feed|rdf|rss|rss2|atom) to this instead (feed|rdf|rss|rss2|atom|rss2_comments).

    And of course, remember to restart IIS after you make these changes.

    Again, you can see my rewrite rules here.
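
    A quick way to convince yourself that the longer alternation catches the comments feed URL is to try it against a regular expression outside of IIS. The pattern below is a hypothetical stand-in modeled on my date-based permalink format, not a line copied from my httpd.ini, and it runs through PHP’s preg_match(), whose regex dialect may differ in small ways from ISAPI_Rewrite’s.

```php
<?php
// Hypothetical pattern modeled on the date-based permalink format;
// the real rewrite rule may look different.
$pattern = '#^/archives/(\d{4})/(\d{2})/(\d{2})/(\d{2})/(\d{2})/(\d{2})/(feed|rdf|rss|rss2|atom|rss2_comments)/?$#i';

// The comments feed URL from the example above.
$url = '/archives/2004/07/06/23/15/22/rss2_comments/';

if (preg_match($pattern, $url, $m)) {
    // $m[7] is the feed type captured by the last group.
    echo "feed type: {$m[7]}\n";
} else {
    echo "no match\n";
}
```

    Dropping |rss2_comments from the pattern makes the same URL fall through to “no match”, which fits the dead link I was seeing.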

That’s it. For now. I’m up and running, but I know I haven’t found all possible bugs. And I looked at the latest CVS build, and wow there are a lot of changes. I can see that keeping up with WordPress is going to be more work than what I’ve been used to with MovableType. But I can also see that I’m going to get a lot of value for that effort, so I think it will be worth it.

Enjoy!

Posted by: Frank @ 3:16 pm — Filed under: Comments (17)

Moving from MovableType to WordPress

By Saturday last I was limping along on the 4U box, and ready to recreate this website, as well as my Photo Album Pro weblog, by re-entering the stories I’d recovered from the Google cache.

But I was hesitating. I was hesitating because I wasn’t sure I wanted to keep using MovableType as my weblog software.

I’d used MovableType originally, about 24 months or so ago now, because it seemed to be the best software out there at the time. There weren’t as many options back then, and MovableType seemed to have the nicest interface, the biggest user base, and it was free. And though I wasn’t a Perl-person, I took the plunge anyway.

I didn’t have much trouble getting it working, and in fact I’ve used it quite successfully for the past year. But I’ve never really been “happy” with it. It worked ok. But I never felt like I was going to master it. I was never going to hack on it. I was never going to make it my own.

Recently there had been a 3.0 release of MovableType, but most of that had been under-the-hood bug fixes, with few new features. And while we’d all been waiting for that release, any number of alternatives had not only appeared, but appeared to have outstripped MovableType’s capabilities. And of course there were the various licensing issues that have dogged the 3.0 release.

So I sat down and wrote down the pros and cons of continuing with MovableType:

Pros:

  1. I’m currently using it.

Cons:

  1. Perl
    I’m not a Perl person. I’ll never be a Perl person. Shell scripts make my head hurt. Perl makes my head hurt.

  2. Comment spam
    I’ve been getting an increasing amount of comment spam. I wrote my own php-based routines to make it easier to delete spam en masse, but there has to be a better way.

  3. Static pages
    MovableType produces static pages. This might have been necessary when machines were slower, with less memory, but I have a machine with plenty of speed and memory. I want dynamic pages.

  4. Templating system
    I find the MovableType template system unintuitive to use, and time-consuming to debug. Edit, change, save, regenerate static pages, view in browser, start all over again. Boring, boring, boring.

  5. YATL
    Yet Another Templating Language. I’m sure there must be a reason to use a templating language, as opposed to a real language such as php or asp or javascript, but I can’t think of one. I wrote my own minorly successful templating language in 1994, and by 1997 had been convinced that the future was not template languages. So why was I still using one in 2004?

Bottom line: +1 - 5 = -4

Add it all up and it looked like it was time to switch. Time to make lemonade out of the site crash “lemons” I’d been handed.

But to what? That was the $64,000 question.

I’ve been tracking some of the various “which blogging software is best” conversations, and the one piece of software that stood out was WordPress. It stood out for a number of reasons. First, it was written in php. Second, it had a big and vocal user community. Third, it was free. Fourth, it was at version 1.2 (better than if it had been at version 0.9 or 1.0). And fifth, even Mark Pilgrim had converted his weblog from MovableType to WordPress.

I clicked around Google a bit more, reading up on as many blogging packages as I could find. But when it was time to make the decision I realized I really wanted one based on php and MySQL. And so I took the WordPress plunge.

I downloaded the software. Configured the database. Configured wp-config.php to point at the database. And a couple of config screens later I had new weblog software running. A dynamically generated weblog, that is written in php. Ahhhh, I felt like I could breathe again.

In my next post, I’ll describe some of the changes I made to WordPress, and some of the configuration issues I found in using it with Microsoft’s IIS web server.

Posted by: Frank @ 12:20 pm — Filed under: Comments (0)

July 6, 2004

Server Crash and Site Rebuild

It has been a very, very, very, long week.

A week ago today, the server which hosts this web site crashed. The disk drive failed, refusing to ever boot again. And as I sat staring at the screen, wondering if I needed to fly back to San Francisco (from London), my worry turned to horror as I realized that I hadn’t backed up in a very very long time. Since…ouch…the end of October. Yes, eight long months ago.

Now before you start chastising me – I’ve done quite enough of that myself this past week, thank you very much – stop what you’re doing right now, and back up whatever it is you haven’t been backing up. Buy a big fat hard disk, connect it to your computer, and make a backup. I went and bought a 250GB disk from Western Digital. I partitioned it into 4 x 60GB partitions, and now I back up every night before I go to bed (well, most every night…). If you’re using a Mac, get the free SilverKeeper backup program from LaCie – it’s pretty good. And if you’re using Windows, the built-in “Backup” program is free too (Start -> Programs -> Accessories -> System Tools -> Backup).

I own the hardware that runs this site (and yes, there are times when I wonder why I own my own hardware – times like this actually). The server is parked in a colo in San Jose, in the cage of a friend who runs a medium-sized web site. I have two machines, a 1U IBM eServer 330, and a 4U no-name box that’s mostly full of air (and is a heck of a lot slower than the IBM).

I called the colo to have them reboot the 1U with a monitor attached. It wouldn’t get past the BIOS, failing with a 19990301, an error code that shows up exactly once on the IBM web site, with this informative piece of information: “19990301 - Hard disk boot failed. Run Setup Utility. Hard disk drive.” Gee thanks, that’s helpful.

After a couple of hours back and forth with the nice folks at the colo, after my fifth cup of worry coffee, and after moaning some more to Rachel, I wondered if I could at least get myself up and limping again using the 4U box. Hmmm. I’d really only used it to run secondary DNS, but it did have an old copy of my mail server program, and it had a web server…maybe…just maybe, I could go from 100% disaster, to only 98% disaster. Heck, it was worth a try.

I fired up Terminal Client and connected to the 4U box. DNS was out (looks like I’d misconfigured something, it shouldn’t have been out, but I’ll worry about that later), so no one on the net could find my web sites; of course that wasn’t really a problem since there were no longer any web sites to find. My first piece of luck was that the secondary DNS had stored dns files for all of my domains, so I fired up the DNS server user interface, switched all the domains from secondary to primary, and with that the breath of internet life had resurrected my domains.

Next I re-started the mail server. It was a version from about 18 months earlier, but it still worked. I added a couple of domains and accounts that were missing. And then I went back to the DNS server and updated all the mail server entries to point to the ip address of the 4U box.

6 hours later and I had DNS and mail working again. For the first time I thought that I might not have to fly back to San Francisco after all…

The thing I was most worried about was that I didn’t have a backup of the writing I’d done on my A Year In Cornwall weblog. I had all the pictures, but very little of the writing. I remembered reading about someone else re-creating their web site by using the Google cache, so I fired up my web browser and typed in “A Year In Cornwall”. Sure enough, there was the home page. Then I typed in “A Year In Cornwall archive” and there was one of the monthly archive pages. And so one month at a time, I downloaded the text of all of the entries in my weblog. Whew – got it all – thank you Google!

Now on to the web sites. That was going to be a bit harder, mostly because so much had changed since October. I started by opening the October 31 backup. The backup had most of the web sites, and though they’d all been updated since the backup, the folder structure was pretty much the same. I copied them all to the 4U box, and started creating web sites, one at a time. I’d forgotten I had so many sites – 18 in all if you include things like the database admin site, the stats site, and the blogging admin site. As I was doing that I also upgraded php and MySQL to the latest versions. Then I copied the database files from the October backup and got the database up and running. I even installed phpBB for my support bulletin board. And then I tried to create the ftp sites I needed. And that’s when I got stuck.

Creating ftp sites. Can there be a more complicated, convoluted and poorly-documented feature in all of Microsoft’s IIS web/ftp server? If there is, I’d like to know what it is, because it took me 5 hours – 5 HOURS – to get half a dozen ftp accounts set up. First the user accounts, then the permissions, then the ftp sites. And while it all seems to work now, there’s a big red “ERROR” icon next to each ftp site in the ftp console window. Why? Damned if I know – I just hope it doesn’t mean something like “your server is now open for attack because you’ve misconfigured everything”.

So there it was Thursday, two days later, and things were looking a lot better than they had on Tuesday.

In the meantime there was the issue of what to do about the hardware. I scrambled around trying to find a way to get the dead server up and running again, without flying all the way back to San Francisco. I called some of my old workmates at Wired – they’d all been laid off – and the two I was able to contact were too busy to make the trek to San Jose. A bunch of phone calls and emails later I realized how few friends I had who could actually do the work. I thought about calling my brother Chris, trying to lead him through hooking up the monitor and rebooting the machine, but then I envisioned him with his tool belt on – he’s a general contractor – and after a particularly frustrating moment taking one of his hammers to the equipment. Noooooo, maybe I’d better find another way. Luckily, it turned out that the colo I use, AboveNet, has a Level 2 tech-support program, which meant that if I could get them new disks, they could do the reinstall.

So I jumped on eBay, and started looking for disks. Unfortunately the disks I needed were quite specialized – they’re hot swap disks – and it was important that I get the right drive, right tray and right type of connector. More unfortunately, few people put all the pertinent serial numbers on their items, so searching for disks was very hit and miss. That and the fact that I needed to “Buy It Now”, not wait for an auction to complete. After many hours of surfing eBay I finally found two disks in Minnesota that could be overnighted to San Jose (and crossed my fingers that they were the right disks). Then I ordered Norton SystemWorks from PCConnection on the off-chance that it could recover the data from the disk. And by Thursday night all the hardware and software I needed was on its way to San Jose.

Back to the software setup. Web stats was next. I use Analog and ReportMagic for the reporting, along with QuickDNS for doing ip-to-host lookups, and Stats Automator for IIS to make it a one click process. The hardest thing here is getting all the various config files talking to each other. Analog has to output data in the right format for ReportMagic. QuickDNS has to run before Analog and cache the dns lookups in the right place. And the Automator has to be customized to put the stats reports where I want them. Another 4 hours.

Come Friday, the disks arrived in San Jose via morning FedEx. But the software didn’t show up until mid-afternoon. The tech, Rudy, who was going to do the work had to leave early and wouldn’t be back until Tuesday morning (oh yeah, I forgot, it’s Fourth of July weekend in the States). I’d talked with him on the phone and got the feeling that he really knew what he was doing, so after thinking about it for a bit I said, ok, I want you to do the work, so let’s wait until Tuesday to try to recover the disk. I figured at that point it didn’t really matter whether things came up Friday or Tuesday. If the disk was recoverable it was worth recovering. And if it wasn’t recoverable, then I might as well keep doing what I was doing with the 4U box since it was going to be next week before all the web content was restored anyway.

Finally Tuesday arrived, and Rudy tried to recover the disk. Norton SystemWorks booted the server, but it couldn’t see the disk. “Ok” I said, “let’s just reinstall the OS on the new disks.” An hour later Rudy called back and said everything was installed and connected. I logged on with Terminal Server, updated to the latest patches, rebooted several times, and set up disk mirroring using the two new disks. Hopefully disk mirroring will make it much less likely that what happened will happen again. With disk mirroring, if one disk fails the other will keep running. And because they’re hot swap drives, it’s possible to buy another drive, stick it in, and have the mirroring continue on the new disk automatically. Personally I’m hoping that I never have to test that scenario, and that, like rain in London, where carrying an umbrella makes it much less likely to rain, having mirrored disks makes a failure much less likely.

So there it is – seven very long days to get my servers up and running again. But what I think is most amazing about it is that it could be done at all. 10,000 miles away. All it took was eBay, FedEx, PCConnection, Amazon.com, a broadband connection, a web browser, email, a telephone and a credit card.

“But what about the web sites?” I can hear you saying. “Why are they still not up? And why do they look different?” Ahhh, that’s for the next entry.

Posted by: Frank @ 11:15 pm — Filed under: Comments (1)