Abandoning WordPress, Part 1: Faded Blosxoms and Shining Perl

I’m making progress with the new site. Nothing is online yet but I thought I’d post an occasional progress report.

As I described previously, my plan is to not use any dynamically generated content but instead create the whole site with static HTML on local machines and upload it when something changes. I’ve settled on Blosxom as the main component of this system. The trouble with Blosxom is that it’s something of an open-source orphan and isn’t really being maintained well. The official site is quite neglected (lots of broken links, etc.); fortunately the unofficial site is more helpful. (You can find them with Google if you need them. I’m not going to scatter links through these posts for every single bit of software I mention.) I thought for a while that static rendering in Blosxom was broken altogether, but it turned out that the real problem was that I was attempting to use a plugin written for version 1.x with version 2.x of Blosxom. It also seems that the download packaged up in a Mac OS X installer is not the same version that’s provided for “everything else”, i.e., Windows. With some persistence, though, I’ve convinced myself that it can do what I need it to do, which is take a bunch of text files and generate HTML files with some cross-linking and categorization and stuff.

I took a very brief look at pyBlosxom but recoiled in horror. Not only is it just as much of a mess as Blosxom itself, but it’s written in Python. I don’t know Python, and I don’t feel like learning yet another scripting language just to keep this site chugging along. I’m far from fluent in Perl but at least I can get stuff done with it.

So, having settled on Blosxom, I turned my attention to extracting the posts on this site from the MySQL database in which they reside. This turned out to be pretty straightforward. I used the export utility in phpMyAdmin to dump the table containing the posts into an XML file. Then I wrote a Perl script to parse the XML file and spit out separate text files, one for each post. Each text file has a meta- tag with the posting date, which Blosxom can read (thanks to the entriescache plugin) and use to organize the files by this date rather than by their creation time. (Blosxom uses the creation time by default, which obviously wouldn’t work in this application.) Writing that script turned out to be far simpler than I expected, thanks in part to the XML::Simple package for Perl.
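For anyone attempting the same extraction, the shape of the job can be sketched roughly as follows. This is not the script described above (that one was Perl with XML::Simple); it is an illustrative version in Python using only the standard library. Several things here are assumptions rather than facts from this post: that the phpMyAdmin dump wraps each row in an element named after the table (wp_posts) with one child element per column, that the meta line is spelled meta-creation_date, and that an entry file is laid out as title, meta line, blank line, body. The function and helper names are made up for the example. Check your own dump and your date plugin's documentation before relying on any of it.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed spelling of the date header; the real key depends on the plugin.
META_DATE_KEY = "meta-creation_date"


def slugify(text, fallback):
    """Turn a title or post name into a safe file name stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug or fallback


def export_posts(xml_text, out_dir, row_tag="wp_posts"):
    """Write one .txt entry per post row found in the XML dump.

    Assumes each row is an element named `row_tag` whose children are
    named after the table columns (ID, post_date, post_title, ...).
    Returns the list of files written, in document order.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    root = ET.fromstring(xml_text)
    for row in root.iter(row_tag):
        title = (row.findtext("post_title") or "").strip()
        date = (row.findtext("post_date") or "").strip()
        body = row.findtext("post_content") or ""
        post_id = (row.findtext("ID") or "").strip()
        # Prefer the post's slug column; fall back to the title, then the ID.
        stem = slugify(row.findtext("post_name") or title, f"post-{post_id}")
        path = out / f"{stem}.txt"
        # Assumed entry layout: title, meta line, blank line, body.
        path.write_text(
            f"{title}\n{META_DATE_KEY}: {date}\n\n{body}\n",
            encoding="utf-8",
        )
        written.append(path)
    return written
```

Calling export_posts with the contents of the dump and a target directory would leave one text file per post in that directory. Posts that share a slug would overwrite each other in this sketch, so a real version would want to check for collisions.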

One annoyance is that WordPress uses separate database tables for both the category information and the comments associated with posts. As a result I’m going to have to organize the categories by hand, and I’m not sure what I’m going to do about the comments. I suppose there aren’t many of those so I can just copy and paste them into the appropriate entries for posterity. (The new site won’t have any sort of commenting mechanism, since that would require dynamic content.)

By adam

Go ahead, try to summarize yourself in a sentence or two.

2 comments

  1. Now I don’t know very much about how to keep web sites secure from being hacked, but this really seems like a huge headache and a case of reinventing the wheel. Couldn’t you just use one of the other blog software options that don’t have the same WordPress vulnerability? Have you considered slashcode? http://www.slashcode.com/
    It’s likely overkill for a blog, but I don’t see Slashdot getting hacked too often and I’m sure due to the nature of the site that there have been plenty of attempts. Anyway, good luck with getting this thing secured. I’m glad you decided to fight it out instead of just taking your site down.

  2. Thanks for your comments. Yes, it’s a huge headache. There have been numerous occasions on which I’ve considered just scrapping the entire site. Yes, to some extent I am reinventing wheels. To answer your first question, yes, I could use some other option but WordPress is actually my second such option. I used to use PostNuke but abandoned it for essentially the same reason. To answer your second question, no, I haven’t considered slashcode. You’re probably right that it’s quite secure, considering its application. Taking a brief glance at the site, it looks like it might require somewhat more control over the server than I am able to exercise with my current hosting service, but I could be wrong.
    Here’s the fundamental problem, though: ANY dynamically generated site is vulnerable to hacking. It’s the nature of the beast. On top of that, the number of people who have a vested interest in hacking sites is drastically greater than the number of people who are working on any given blog software system. Hence there’s a constant battle going on as people find new ways to exploit blog systems and the authors of the systems issue patches to plug the security holes. Frankly, I really get tired of keeping up with the updates. I have many other things I’d rather do with my time than install the latest WordPress update just because some spammer in Singapore has figured out a new way to hack the commenting mechanism in WP, and yet obviously I have to stay on top of such things if I want to run a (relatively) secure site.
    So while I could find some other off-the-shelf solution, I have little confidence that I’d really end up with a better system in the long run. If I used slashcode I’d have to stay on top of slashcode’s updates. On the other hand, if there’s no code at all running on my site, there’s nothing that can be hacked, nothing that needs to be updated. I’ve decided that the advantages of building the site that way outweigh the disadvantages of reinventing wheels.
    At least that’s my current opinion. 🙂
