Michael J.A. Clark
Michael Clark is a Computer Science student from England providing freelance programming and design when not studying at Cambridge. Skills: C#, Sitecore, PHP, XHTML, CSS, AS3, Java, ML, F#.

Articles tagged coding/xml

Decentralised XML storage

Over the past couple of years I have been experimenting with an XML storage system for displaying articles. I first wrote about it during my 2008 homepage refresh. The last couple of days have seen a major overhaul, converting from a centralised XML database to dispersed XML files that accompany each entry. The article directory is parsed and these XML files are collated and serialised by PHP.

Having a decentralised storage mechanism allows greater integration with the filesystem. It removes the need to update and overwrite the XML database after every change. Now a new article can be uploaded to a directory using FTP and automatically deployed on a cache refresh.
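The collation step described above can be sketched roughly as follows. This is an illustration in Python rather than the PHP the site runs on, and it makes several assumptions: every entry directory holds one info file ending in .xml, the cache is a JSON file, and the names collate_articles, write_cache, info.xml and cache.json are all hypothetical.

```python
import json
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def collate_articles(article_root):
    """Find every XML info file under article_root and collect its metadata."""
    entries = []
    for info_file in sorted(article_root.rglob("*.xml")):
        root = ET.parse(info_file).getroot()
        if root.tag != "article":
            continue  # skip XML files that are not article info files
        entries.append({
            "title": root.findtext("title", default=""),
            "created": root.findtext("created", default=""),
            # directory of the entry, relative to the article root
            "directory": str(info_file.parent.relative_to(article_root)),
        })
    return entries


def write_cache(entries, cache_file):
    """Serialise the collated index so later requests can skip the scan."""
    cache_file.write_text(json.dumps(entries, indent=2), encoding="utf-8")


# Demo in a throwaway directory: one entry is "uploaded", then indexed.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    entry_dir = root / "decentralisedxmlstorage"
    entry_dir.mkdir()
    (entry_dir / "info.xml").write_text(
        "<article><title>Decentralised XML storage</title>"
        "<created>Sun, 15 Aug 2010 00:07:05 +0100</created></article>",
        encoding="utf-8",
    )
    index = collate_articles(root)
    write_cache(index, root / "cache.json")
    cached = json.loads((root / "cache.json").read_text(encoding="utf-8"))
```

Adding an article is then just a matter of creating a directory with its info file. Nothing central has to be rewritten until the cache is next rebuilt.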

Example XML info file

The XML format allows content to be mapped to different output types. This webpage is delivered as XHTML, so the converter selects PHP Markdown as the class library that performs the mapping.

<?xml version="1.0" encoding="UTF-8"?>
<article>
    <title>Decentralised XML storage</title>
    <created>Sun, 15 Aug 2010 00:07:05 +0100</created>
    <modified>Sun, 15 Aug 2010 00:07:05 +0100</modified>
    <content format="markdown" src="decentralisedxmlstorage.mdml"/>
    <category>Writing-Development-Flux, Flux</category>
</article>
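One way to picture the mapping from a content format to an output type is a lookup table keyed on both. The sketch below is Python, not the site's PHP. The markdown_to_xhtml function is a deliberately crude stand-in for a real Markdown library, and describe, render and CONVERTERS are hypothetical names. Splitting the category element on commas is also an assumption about how that field is meant to be read.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from xml.sax.saxutils import escape

INFO_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<article>
    <title>Decentralised XML storage</title>
    <created>Sun, 15 Aug 2010 00:07:05 +0100</created>
    <modified>Sun, 15 Aug 2010 00:07:05 +0100</modified>
    <content format="markdown" src="decentralisedxmlstorage.mdml"/>
    <category>Writing-Development-Flux, Flux</category>
</article>
"""


def markdown_to_xhtml(source):
    """Stand-in for a Markdown library: wraps each paragraph in <p>."""
    paragraphs = [p.strip() for p in source.split("\n\n") if p.strip()]
    return "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)


# (source format, output type) -> converter function
CONVERTERS = {("markdown", "xhtml"): markdown_to_xhtml}


def describe(info_xml):
    """Read the fields of an info file into a plain dictionary."""
    root = ET.fromstring(info_xml)
    content = root.find("content")
    return {
        "title": root.findtext("title"),
        # the dates in the sample use the RFC 2822 style seen in email headers
        "created": parsedate_to_datetime(root.findtext("created")),
        "format": content.get("format"),
        "src": content.get("src"),
        "categories": [c.strip() for c in root.findtext("category", "").split(",")],
    }


def render(article, source_text, output="xhtml"):
    """Pick the converter for this article's format and the requested output."""
    key = (article["format"], output)
    if key not in CONVERTERS:
        raise ValueError(f"no converter from {key[0]} to {key[1]}")
    return CONVERTERS[key](source_text)


article = describe(INFO_XML)
html = render(article, "Hello & welcome\n\nSecond paragraph")
```

Supporting another output type, a plain-text feed summary for instance, would mean registering one more entry in the table rather than touching the info files.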

How to detect changes

In many computer systems a dirty bit is set to signify that modifications have taken place and the cached copy should be refreshed. Deleting the cache is a simple solution, forcing the directories to be re-indexed. This is an appropriate solution for up to 1000 articles. Beyond that scale a more efficient solution would detect changes from filesystem modification times. Is it sensible to trust system time? I would rather not go there.
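For comparison, a modification-time check might look something like the sketch below (Python, with the hypothetical names cache_is_stale, info.xml and cache.json). It treats a missing cache as the dirty signal and otherwise rebuilds only when some info file is newer than the cache. It inherits exactly the weakness raised above: it is only as trustworthy as the timestamps. It also does not notice deleted articles.

```python
import os
import tempfile
from pathlib import Path


def cache_is_stale(article_root, cache_file):
    """True when the cache is missing or any info file is newer than it."""
    if not cache_file.exists():
        return True  # deleting the cache acts as the "dirty bit"
    cache_mtime = cache_file.stat().st_mtime
    return any(
        info.stat().st_mtime > cache_mtime
        for info in article_root.rglob("*.xml")
    )


# Demo with explicit timestamps so the outcome does not depend on the clock.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    info = root / "entry" / "info.xml"
    info.parent.mkdir()
    info.write_text("<article/>", encoding="utf-8")
    cache = root / "cache.json"

    missing = cache_is_stale(root, cache)   # no cache has been built yet
    cache.write_text("[]", encoding="utf-8")
    os.utime(info, (1000, 1000))            # info file older than the cache
    os.utime(cache, (2000, 2000))
    fresh = cache_is_stale(root, cache)
    os.utime(info, (3000, 3000))            # simulate an edited article
    edited = cache_is_stale(root, cache)
```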

Major modifications

The decentralisation is a major step towards a class library that can be used by other developers to deliver high quality content in less time. A single file management interface is the final stage before release.

Thanks for reading, please add your comments.

Journal feed generation, RSS and Atom

For the past few hours I have been trying to integrate a feed generation library into my new blog system. After a couple of minutes searching on Google I came across Anis uddin Ahmad’s PHP Universal Feed Generator. This feed generator can create Atom, RSS 1.0 and RSS 2.0 feeds.

There were a few mistakes in his program. To begin with, escaping data in the XML output using the PHP function htmlentities is incorrect because it generates named entities, such as &eacute;, that are not part of the XML standard. Instead you should leave native UTF-8 characters untouched in the XML document and escape only the small set of characters that XML reserves: &, <, > and the quote characters. I fixed other niggles, including making a method of FeedWriter public static.
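The difference is easy to demonstrate. The sketch below uses Python's standard library rather than PHP. The html_style function is a hypothetical helper that imitates an HTML entity encoder, so it only approximates what htmlentities would produce. An XML parser accepts the minimally escaped title but rejects the one containing an HTML-only entity.

```python
import xml.etree.ElementTree as ET
from html.entities import codepoint2name
from xml.sax.saxutils import escape


def html_style(value):
    """Imitate an HTML entity encoder: any character with an HTML name becomes &name;."""
    return "".join(
        f"&{codepoint2name[ord(ch)]};" if ord(ch) in codepoint2name else ch
        for ch in value
    )


title = "Café & <Flux>"

# XML-safe: only &, < and > are escaped, the é stays as a UTF-8 character.
good = f"<title>{escape(title)}</title>"

# HTML-style: é becomes &eacute;, which XML does not define.
bad = f"<title>{html_style(title)}</title>"

parsed = ET.fromstring(good).text

try:
    ET.fromstring(bad)
    bad_parses = True
except ET.ParseError:
    bad_parses = False  # the parser reports an undefined entity
```

In PHP the nearest equivalent of escape here is htmlspecialchars, which touches only the reserved characters.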

I want to write my own feed generator and release it to the world. I wrote the feed generator for Mind & Soul when I was working for Premier Media Group. With some of the knowledge gained from my first-year computer science course I could do a far better job. I will now take any excuse to try out my new skills!

Converting to absolute URLs

Many of the files on my server reference other files using an absolute path such as /journal/view/test, with no domain part. This breaks if the files are moved to another server where the path changes. Links should either be relative (which would require lots of processing) or be fully absolute and include the domain part.
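Resolving a path against a base URL is a standard operation, so the conversion itself can be small. The sketch below is Python, with http://example.org/ standing in for the real domain and absolutise and rewrite_links as hypothetical names. The regular expression only handles double-quoted href and src attributes, so it is a rough illustration rather than a robust rewriter.

```python
import re
from urllib.parse import urljoin

BASE = "http://example.org/"  # placeholder for the site's real domain


def absolutise(link, base=BASE):
    """Resolve a link against the base; links that are already absolute are unchanged."""
    return urljoin(base, link)


# Matches href="..." or src="..." that is not part of a longer attribute name.
LINK_ATTR = re.compile(r'(?<![\w-])(href|src)="([^"]*)"')


def rewrite_links(markup, base=BASE):
    """Make every href and src in a fragment of markup fully absolute."""
    return LINK_ATTR.sub(
        lambda m: f'{m.group(1)}="{absolutise(m.group(2), base)}"',
        markup,
    )


fragment = (
    '<a href="/journal/view/test">test</a>'
    '<img src="http://cdn.example.org/a.png"/>'
)
rewritten = rewrite_links(fragment)
```

Running this over feed content at generation time would leave the stored articles untouched and add the domain only on the way out.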

Webmail clients can interpret feed links and images, allowing users to travel to external pages or view images directly in the client. This means that I will definitely have to change the URL system in Base (my CMS) and the implementation of FluxLinkage. The latter is an interface to connect the journal system with other classes, in this case my CMS, and to adapt to their style of page linking.

Speech recognition

Dragon NaturallySpeaking seems to have problems interpreting my speech when writing CompSci blog entries… However, it is an awesome piece of software and it gives me an excuse for making more grammar errors.
