Matt's Blog

Lexing and Parsing again

Wed Jul 26 18:11:18 BST 2006

[Ocaml Parser Tutorial Setup]

This tutorial describes how to construct a parser in OCaml. The main thing that I found useful was the idea of compiling a separate .ml file that defines the types that the parser deals with: rather than a parser token being a simple int or string, it can carry a more complex data type. So for my blog software I define a "key" of type (string * string * string * string) to contain (year, month, day, full date string), a "tagstr" type of string list to contain the list of category tags, etc. This lets me use the lexer to do more of the string processing, rather than doing further string processing in the main blogmain.ml program. This in turn makes the program as a whole more modular, since blogmain.ml can be written to take an incoming list of verified tokens from the parser and combine them with some glue functions into the final output.
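As a rough sketch of what such a types file might look like: the type shapes below follow the description above, but the file name blogtypes.ml, the helper functions key_of_date and tags_of_string, and the "YYYY-MM-DD" date format are all assumptions made for illustration, not the actual code behind this blog.

```ocaml
(* blogtypes.ml (hypothetical file name): types shared by lexer, parser and main program. *)

(* (year, month, day, full date string) *)
type key = string * string * string * string

(* list of category tags *)
type tagstr = string list

(* Assumed helper: build a key from a date written as "YYYY-MM-DD".
   The real date format may well differ; this only shows the idea of
   doing the string processing before the main program sees the value. *)
let key_of_date (s : string) : key option =
  match String.split_on_char '-' s with
  | [y; m; d] -> Some (y, m, d, s)
  | _ -> None

(* Assumed helper: split a comma-separated tag string, trim whitespace
   and drop empty entries. *)
let tags_of_string (s : string) : tagstr =
  String.split_on_char ',' s
  |> List.map String.trim
  |> List.filter (fun t -> t <> "")

(* In an ocamlyacc grammar the tokens could then be declared with these
   types, using fully qualified names, for example:
     %token <Blogtypes.key> KEY
     %token <Blogtypes.tagstr> TAGS
   (token names here are illustrative only). *)
```

With something like this in place, the lexer actions can call the helpers and hand the parser ready-made values, so blogmain.ml never has to split strings itself.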

The key point here is that the process of "take an input list of objects and glue them together into an output" consists of two parts: rearranging the input list and gluing the elements together. The rearrangement process is identical for a range of output formats, e.g. Markdown, XML, LaTeX, RSS feed, etc. The only thing that varies between these output formats is the transformation that must be applied to each element, for example enclosing the subject in title tags for HTML output versus putting a hash symbol at the start of the line to signify a level 1 heading in Markdown.
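The split between rearranging and gluing can be sketched as follows. Everything here is hypothetical: the element type, the ordering chosen by rank, and the two formatter functions are stand-ins to show the shape of the idea, not the real blogmain.ml. The HTML formatter also does no escaping, which a real one would need.

```ocaml
(* Hypothetical element type standing in for the verified tokens from the parser. *)
type element =
  | Subject of string
  | Date of string
  | Tags of string list
  | Body of string

(* An output format is nothing more than a per-element transformation. *)
type formatter = element -> string

(* Assumed output order: subject, date, tags, body. *)
let rank = function
  | Subject _ -> 0
  | Date _ -> 1
  | Tags _ -> 2
  | Body _ -> 3

(* Part one: rearrangement, shared by every output format. *)
let rearrange (es : element list) : element list =
  List.stable_sort (fun a b -> compare (rank a) (rank b)) es

(* Part two: the per-format transformations (no escaping, sketch only). *)
let to_html : formatter = function
  | Subject s -> "<title>" ^ s ^ "</title>"
  | Date d -> "<p class=\"date\">" ^ d ^ "</p>"
  | Tags ts -> "<p class=\"tags\">" ^ String.concat ", " ts ^ "</p>"
  | Body b -> "<p>" ^ b ^ "</p>"

let to_markdown : formatter = function
  | Subject s -> "# " ^ s
  | Date d -> "*" ^ d ^ "*"
  | Tags ts -> "Tags: " ^ String.concat ", " ts
  | Body b -> b

(* Glue: rearrange once, then apply whichever formatter was asked for. *)
let render (fmt : formatter) (es : element list) : string =
  rearrange es |> List.map fmt |> String.concat "\n"
```

Adding another output format then means writing one more function of type formatter; rearrange and render stay untouched.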

[code]
