Blogging from the web
Sun Aug 13 12:18:42 BST 2006
Now that my blogging software can process a single text file into a set of related blog webpages, it is time to think about extending it to make it more useful. Two ideas:
- Create a webpage so that new entries can be added to the blog, rather than the existing method of writing to a text file and updating the blog by running my program on the text file. Use a .htaccess file to restrict access to the "Create blog entry" webpage.
- Make it possible to update the blog by sending email messages to a certain address on my domain (or even just my email account with a special token to trigger a procmail rule). Restrict access by a password token in the email, encrypted email, one-time passwords, etc.
Both of the above ideas boil down to writing a program to add an individual entry to the blog. One method could be to just add the entry to the existing log file and then run the couple of scripts that are used to produce the blog - the disadvantage of this approach is that the entire blog is regenerated each time a new entry is added, which doesn't scale well. So it is beginning to look like a database is the way to go, with each blog page performing a database query to generate the content dynamically (I need to check whether this is how most wikis work). I am a bit averse to this approach, since it seems excessive to dynamically generate the content each time the blog entry is accessed rather than only when new content is added.
The first thing to do is identify what needs to be done by the OCaml program itself and what can be done by shell scripts. At the moment the OCaml program reads in the text file of blog entries and writes a number of text files of processed text. This processed text consists of the text of each blog entry together with links to the category pages for that entry, formatted headings, dates, etc. The program also prunes the list of entries that appear on the main blog page to the last ten, and on every blog page it displays the most recent entry first.
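The ordering and pruning step is small enough to sketch directly. The record type and function names below are hypothetical, not taken from the existing program, and the sketch assumes each entry carries an integer id that grows with time.

```ocaml
(* Hypothetical entry record: [id] is assumed to grow with time,
   for example seconds since the Unix epoch. *)
type entry = { id : int; title : string }

(* Sort entries so the most recent (largest id) comes first. *)
let newest_first entries =
  List.sort (fun a b -> compare b.id a.id) entries

(* Keep at most the first [n] elements of a list. *)
let rec take n = function
  | x :: rest when n > 0 -> x :: take (n - 1) rest
  | _ -> []

(* Entries for the main blog page: the ten most recent, newest first. *)
let main_page_entries entries = take 10 (newest_first entries)
```

Category pages would use newest_first on its own, since only the main page is pruned.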
Processing an individual blog entry is hence simple for every page except the main blog page: the program is run over the individual text entry file, and the new output text is then concatenated with the old blog page for that category (new text first, since each page lists the most recent entry first). The htmltree program is then run over the new file to produce the appropriate menu elements.
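A minimal sketch of that concatenation step, assuming the processed text for the new entry is already available as a string. The function names are hypothetical, and writing to a temporary file before renaming is my own choice so that a reader never sees a half-written page.

```ocaml
(* Read a whole file into a string; a missing file counts as empty. *)
let read_file path =
  if Sys.file_exists path then begin
    let ic = open_in_bin path in
    let n = in_channel_length ic in
    let s = really_input_string ic n in
    close_in ic;
    s
  end else ""

(* Write a string to a file, replacing any previous contents. *)
let write_file path s =
  let oc = open_out_bin path in
  output_string oc s;
  close_out oc

(* Put the freshly processed entry at the top of a category page.
   The ".tmp" suffix is an arbitrary choice for this sketch. *)
let prepend_entry ~page ~new_text =
  let old_text = read_file page in
  let tmp = page ^ ".tmp" in
  write_file tmp (new_text ^ old_text);
  Sys.rename tmp page
```

The main blog page cannot be handled this way, because it also has to drop its oldest entry once it holds ten.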
Each blog entry is defined by its date and time and by the blog categories it falls under. The date and time can be converted into a unique integer identifier (provided no two entries share the same second) by changing the format from human-readable to the number of seconds since a defined start date; the Unix epoch is a good default choice. To do: check whether the OCaml stdlib has modules for parsing and formatting dates and times. There don't appear to be any, but the C functions strftime and strptime handle formatting and parsing, so I could link in a small C program.
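As an alternative to linking in C, the conversion can be sketched in plain OCaml. This assumes timestamps in the format used at the top of this entry ("Sun Aug 13 12:18:42 BST 2006"); the zone table and all function names are my own placeholders. The Unix library that ships with OCaml also provides Unix.mktime, which could replace the day arithmetic here, but it works in local time and does no parsing.

```ocaml
(* Days between 1970-01-01 and a Gregorian date (month 1-12, day 1-31),
   using the usual 400-year-era calculation. *)
let days_from_civil y m d =
  let y = if m <= 2 then y - 1 else y in
  let era = (if y >= 0 then y else y - 399) / 400 in
  let yoe = y - era * 400 in
  let mp = if m > 2 then m - 3 else m + 9 in
  let doy = (153 * mp + 2) / 5 + d - 1 in
  let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy in
  era * 146097 + doe - 719468

let months =
  ["Jan"; "Feb"; "Mar"; "Apr"; "May"; "Jun";
   "Jul"; "Aug"; "Sep"; "Oct"; "Nov"; "Dec"]

(* Map a three-letter month name to 1-12. *)
let month_number name =
  let rec find i = function
    | [] -> invalid_arg ("unknown month: " ^ name)
    | m :: rest -> if m = name then i else find (i + 1) rest
  in
  find 1 months

(* Assumed zone table: only the zones this sketch knows about,
   as offsets from UTC in seconds. *)
let zone_offset = function
  | "GMT" | "UTC" -> 0
  | "BST" -> 3600
  | z -> invalid_arg ("unknown zone: " ^ z)

(* Seconds since the Unix epoch for a stamp such as
   "Sun Aug 13 12:18:42 BST 2006". Needs a 64-bit OCaml:
   current values overflow the 31-bit ints of a 32-bit build. *)
let entry_id stamp =
  Scanf.sscanf stamp "%s %s %d %d:%d:%d %s %d"
    (fun _weekday mon day h mi s zone year ->
      let days = days_from_civil year (month_number mon) day in
      days * 86400 + h * 3600 + mi * 60 + s - zone_offset zone)
```

With these definitions, entry_id applied to this entry's own timestamp gives 1155467922.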
Write a program that takes a list of text files as input and extracts the dates and categories of the blog entries in them, writing this list to a new index file (together with the filename and offset of each entry). Then do the required rearrangement operations on the entries in the index file, determine which blog pages need to be updated as each new blog entry is added, and only at the last step access the text files containing the actual entries.
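One possible shape for the index, as a sketch only. The record fields follow the description above, but the tab-separated line format, the page name "main" and the function names are assumptions of mine. It also assumes a newly added entry is always the most recent one, so it always touches the main page.

```ocaml
(* One line of the index file. *)
type index_entry = {
  id : int;                  (* e.g. seconds since the Unix epoch *)
  categories : string list;  (* blog categories the entry falls under *)
  file : string;             (* text file holding the entry *)
  offset : int;              (* byte offset of the entry in that file *)
}

(* Assumed line format: id, file, offset and a comma-separated
   category list, separated by tabs. *)
let to_line e =
  Printf.sprintf "%d\t%s\t%d\t%s" e.id e.file e.offset
    (String.concat "," e.categories)

(* Parse a line written by [to_line]. *)
let of_line line =
  match String.split_on_char '\t' line with
  | [id; file; offset; cats] ->
    { id = int_of_string id;
      file;
      offset = int_of_string offset;
      categories =
        List.filter (fun c -> c <> "") (String.split_on_char ',' cats) }
  | _ -> invalid_arg ("bad index line: " ^ line)

(* Pages to regenerate when [e] is added: the main page plus one
   page per category. "main" is a placeholder page name. *)
let pages_to_update e = "main" :: List.sort_uniq compare e.categories
```

Sorting and pruning can then work on these small records alone, and the entry text is only read from file at offset when a page is actually rebuilt.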
Rewrite the htmltree program in a similar manner, so that it just deals with the pagedesc file to create a few HTML files containing the main navbar and the subnavbars for a given page, together with another file mapping each subpage to the appropriate subnavbar. This means the actual page content can be altered without having to run the htmltree program, with all of the constituent bits being stuck together with shell commands at the final step of the website generation.
Concurrency issues crop up if I allow more than one person to update the blog at the same time; to avoid them, file locking is required. Alternatively, I could move to Erlang, Yaws and Mnesia (maybe keeping the OCaml and Perl programs as text processing filters that are accessed through ports).
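A simple stand-in for file locking, sketched with the standard library only: a lock file that is created exclusively, so that only one updater at a time succeeds. This is not fcntl-style locking (Unix.lockf in OCaml's Unix library would be the closer match), and the function names are hypothetical. A crashed updater would leave a stale lock file behind that has to be removed by hand.

```ocaml
(* Try to create [lock_path] exclusively. Open_excl makes the open
   fail if the file already exists, i.e. somebody else holds the lock. *)
let try_lock lock_path =
  try
    let oc =
      open_out_gen [Open_wronly; Open_creat; Open_excl] 0o600 lock_path in
    close_out oc;
    true
  with Sys_error _ -> false

(* Run [f] while holding the lock and always release it afterwards.
   Returns None without running [f] if the lock is already taken. *)
let with_lock lock_path f =
  if try_lock lock_path then
    Some (Fun.protect ~finally:(fun () -> Sys.remove lock_path) f)
  else
    None
```

A caller that gets None back could wait and retry, or report that the blog is currently being updated.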