Matt's Blog

Planar trees

Sun Sep 17 17:42:01 EST 2006

A planar tree consists of a list of nodes, where each node is a pair containing the element present at that node and a list of child nodes - or in other words the subtree that is the child of that node. This data structure is very important, being able to model Lisp S-expressions, XML, and other hierarchical data structures.

As a concrete example consider a bullet point list with nested levels:

  • Point 1
    • Point 1a
    • Point 1b
    • Point 1b1
  • Point 2
  • Point 3
    • Point 3a

We want to represent this is a tree of the form

[Node(1, [Node(1a,[]); Node(1b,[Node(1b1,[])])]);
 Node(2,[]); Node(3,[Node(3a,[])])]

where "[a;b;c]" is the Ocaml notation for constructing a (linked) list. This can be translated trivially into other functional programming languages.

The key insight that I had today was how to represent the tree as a list of pairs where each pair contains (level in hierarchy, element). For the above example this would be:

[(0, 1); (1, 1a); (1, 1b); (2, 1b1); (0, 2); (0, 3); (1,3a)]

This had been a sticking point up until now. I had previously represented the tree as the list of pairs of (parent, child), i.e.

[(root,1); (1,1a); (1,1b); (1b,1b1); (root,2); (root,3); (3,3a)]

This is the representation that is used in the HtmlTree program that I use to generate the tree of HTML for my website.

The second approach has the disadvantage that the nodes have to have unique names. The first approach has the disadvantage that elements have to be inserted in order as they are encountered in the outline view, in the second approach it is possible to enter all of the top level elements first, then the next level, etc. There is probably a way to do this by treating the list form of the tree as a breadth-first traversal but presently the tradeoff is acceptable.

Below is the code for the planar tree data structure basic functions, ie conversion to or from a list.

  type 'a node = Node of 'a * 'a tree
  and 'a tree = 'a node list

  type 'a t = 'a tree

  (* Map function f: tree -> tree over the tree *)
  let rec map_list_tree f tr =
    f ( (fun (Node(el,lst)) -> (Node(el, map_list_tree f lst))) tr)

  (* let rec map_list_tree f tr = match tr with
    | [] -> []
    | (Node(el,lst))::xs -> f (Node(el, map_list_tree f lst) :: (map_list_tree f xs))

  (* Map function f: 'a -> 'b over the elements of the tree *)
  let rec map_tree f tr = match tr with
    | [] -> []
    | (Node(el,lst)) :: xs -> (Node(f el, map_tree f lst)) :: (map_tree f xs)

  (* Reverse all of the lists in the tree to make the mirror image *)
  let rec rev_tree tr = map_list_tree List.rev tr

  (* Construct a tree from a list of [(level, element); ...] *)
  let of_list lst =
    let rec instree trlst (lev,el) =
      if lev=0 then (Node (el,[])) :: trlst
      else match trlst with
        | [] -> [Node(el, [])]
        | (Node(a,lst)):: xs -> (Node(a, instree lst (lev-1,el)))::xs
    in rev_tree (List.fold_left instree [] lst);;

  (* Convert the tree back to linear list form by a depth first search *)
  let to_list tr =
    let rec aux lev t = match t with
      | [] -> []
      | (Node(el,lst))::xs ->
    (lev,el) :: ((aux (lev+1) lst) @ (aux lev xs))
    in aux 0 tr

Now that I have this basis to work with I can start processing the data in the tree into other forms. At present this is pretty basic, I can't do anything more than I could with a list of elements. Indeed many of the tree data structure processing functions can be performed by conversion to the list form, applying list data processing functions and converting back to the tree form if necessary.

The main accomplishment here is the ability to take a bullet-point formatted text file and convert it into an abstract tree. This makes it possible to talk about certain problems in a more natural language, for example finding all of the paths through a tree.



code (24)

erlang (5)
ideas (19)
lisp (1)
me (11)
notes (4)
ocaml (1)
physics (45)
qo (7)
unix (6)
vim (3)