Matt's Blog

Organisation of content

Sat Nov 4 10:21:59 EST 2006

My default state is to take in information in written form from books and websites. This is all well and good but unless I make use of that information it is no better than sitting in front of the television watching a soap opera or reality TV show.

Humans process information. Novelty is the acquisition of new facts about our surroundings and relationships that may reveal potential threats or opportunities. Hence there is a biological drive for humans to acquire and process information. As with any biological drive there is good money to be made by subverting the drive for commercial interests. For example the fast food industry exists by hacking into the biological drive for (high energy density) fatty sugary foods, traditionally a good resource for energy storage in times of famine. In the same way the human desire for understanding and novelty means that new interesting information has a drug-like effect on the human brain. Define addiction to a substance as the pursuit of that substance to the extent that it becomes harmful to the survival of the organism. It is possible to become addicted to information. MMORPGs such as Everquest (or "Evercrack" as it is also known) are a current example but there has always been addiction to news, gambling, television shows, music, crossword puzzles, pornography.

  • Sidenote: The last one is interesting - the closer the substance is to the sex drive the stronger the potential for addiction (stated not proved - sounds plausible though). Measure "closeness to sex drive" by comparing the neurochemicals produced during sexual arousal with the neurochemicals produced by the addictive substance. We can probably do this with current technology - use a laser-based instrument to measure neurochemical concentrations by analysis of the optic nerve (the way to a man's brain is through his eye). That seems the nice direct route, assuming that the neurochemicals percolate from the relevant bit of the brain all the way to the surface of the optic nerve. I'll need to check my neuroscience books to see whether this is the case. If it is not, then there is always the fallback to physiological response such as pulse rate, pupil dilation, blood chemical analysis etc - a somewhat fuzzier measurement requiring more post-processing but still plausible.

Anyway... I seem to have become sidetracked into thoughts about the fundamental nature of humanity, when I started this entry with the intention of describing a filing system. The basic point of the above is that it is easy to become an information-addicted consumer rather than using the acquired information to perform useful (positive survival value) tasks. Avoid this by having a system in place to process incoming information into useful forms, and make an effort to produce new content from what you have learned.

The system for processing incoming information that I find most attractive, at least in principle, is David Allen's "Getting Things Done" plan. I say "in principle" because I have tried to use it several times but always tail off after a month or so. A common cause is something appearing in my inbox that I have not encountered before and hence have no ready tool to deal with. Another is that my day job has fairly little written work associated with it on a day to day basis, despite the best efforts of university administration - I'm still having trouble trying to fit "twiddle the knobs on the laser until it works" into the GTD scheme.

Nonetheless, let me see how far I can get by doing a systems analysis on myself.

  • Information inputs
    • general sensory input
    • aural response range approx 20Hz to 20kHz
    • visual resolution roughly (1e3-1e6)^2 pixels at a colour depth of 3 bytes at 24 frames per second (taking 3x24 = approx 100) so 100MB/s - 100TB/s. Take the geometric average, call it 100GB/s
    • tactile, taste, smell: these involve physical transfer of biomolecules so the bandwidth will be much greater than expected. For example pheromone response, or transfer of genetic information during sexual reproduction. The bandwidth of the male orgasm is surprisingly high if total number of sperm is multiplied by discharge velocity and information content of a single sperm (although in terms of received information the total transfer is just a single sperm). (More realistically this could be a useful model for pollination although a reaction-diffusion equation is more appropriate).
    • email
    • books
    • websites
    • journals, arxiv
    • slashdot, Fark, boingboing
    • conversation
  • Information outputs
    • general physical behaviour, by symmetry this must convey at most information equal to the total sensory input capacity of the receiver
    • conversation
    • email
    • websites
    • my website
    • Facebook
    • Flickr
    • Arxiv, journal submissions
    • electronic banking or stock market dealing
    • hipster PDA (ie stack of index cards in my pocket)
    • personal journal
    • lab notebook
    • rough working books eg exercises when learning a new mathematical or programming technique
    • drawings
    • filing cabinet
    • GTD category folders eg Projects, Someday/Maybe, Next Actions
    • storage of reference material
    • the Big Text File (see 43Folders - basic idea is to store all writing in a single text file and then use editor functions for searching etc. Works well with Vim).
    • This log file.
    • separate text files (although usually use the BTF by preference)
    • code in separate directories and files.
    • blog entries and other website content (usually created from log file or BTF but sometimes separate files)
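The visual bandwidth estimate in the inputs list can be checked with a few lines of arithmetic. This is only a sketch of that back-of-envelope calculation: the function and constant names are made up for illustration, and the figures (3 bytes per pixel, 24 frames per second, 1e3 to 1e6 pixels per side) are the ones quoted in the list.

```python
import math

BYTES_PER_PIXEL = 3       # colour depth quoted in the list
FRAMES_PER_SECOND = 24    # frame rate quoted in the list


def visual_bandwidth(pixels_per_side: float) -> float:
    """Bytes per second for a square image with the given side length."""
    return pixels_per_side ** 2 * BYTES_PER_PIXEL * FRAMES_PER_SECOND


low = visual_bandwidth(1e3)    # lower bound of the resolution range
high = visual_bandwidth(1e6)   # upper bound of the resolution range

# Geometric mean of the two bounds, as used in the list.
geometric_mean = math.sqrt(low * high)

print(f"low  = {low:.1e} B/s")
print(f"high = {high:.1e} B/s")
print(f"mean = {geometric_mean:.1e} B/s")
```

Without rounding 3 x 24 up to 100, the bounds come out at roughly 72MB/s and 72TB/s with a geometric mean of roughly 72GB/s, so the order-of-magnitude figure of 100GB/s holds.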

The information inputs and outputs can be classified into those that require physical actions and those that can be accomplished with computers. At present interaction with computers is limited to visual and aural output and input via a keyboard or mouse. Within these constraints there is a wide variety of user interface metaphors such as command line interaction and graphical user interfaces. Different applications commonly require different user interfaces, which makes it more difficult to transfer information from the user to different applications. A solution to this is a common data format that can be parsed into a variety of output formats depending on the information destination, and XML has been designed with this purpose in mind.
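As a rough illustration of the "one common format, many outputs" idea, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element names (entry, body, tag) and the helper functions are assumptions made up for this example, not an established schema.

```python
import xml.etree.ElementTree as ET
from html import escape


def make_entry(title, body, tags):
    """Build a hypothetical <entry> element holding one piece of writing."""
    entry = ET.Element("entry", {"title": title})
    ET.SubElement(entry, "body").text = body
    for tag in tags:
        ET.SubElement(entry, "tag").text = tag
    return entry


def to_text(entry):
    """Render the entry as plain text, e.g. for a log file."""
    tags = " ".join(f"[{t.text}]" for t in entry.findall("tag"))
    return f"{entry.get('title')}\n\n{entry.findtext('body')}\n\n{tags}"


def to_html(entry):
    """Render the same entry as an HTML fragment, e.g. for a website."""
    title = escape(entry.get("title"))
    body = escape(entry.findtext("body"))
    return f"<h1>{title}</h1>\n<p>{body}</p>"


entry = make_entry("Organisation of content", "Notes & code", ["ideas", "me"])

# Serialise to XML and parse it back: the stored form is the XML string.
stored = ET.tostring(entry, encoding="unicode")
restored = ET.fromstring(stored)

print(to_text(restored))
print(to_html(restored))
```

The stored XML string is the single source, and each destination gets its own small rendering function.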

Here is the basic idea: for each of the information channels that I use, I want to be able to transparently transfer information from one channel to another through a standard interface. I want to be able to read a webpage and highlight and save interesting content when I find it, with the content being automatically tagged with the location and date where I found it, and with the option of commenting on the content or tagging it in relation to other articles that I have found. I want to be able to take this log file, select a region of text and click a "publish this on my website" button (or more likely command chord or macro command) and have everything happen automatically. I want to be able to read my email in the same editing environment, saving everything to a distributed backup archive when it comes in. I want all of my content to be checked in to a version control repository that I can access anywhere. I want everything to be backed up. I want to capture chains of thought, not just content - my big text file is a useful way to store everything but I find it hard to follow which paths my thoughts took over time without having separate diary entries acting as metadata. I want to start a new log entry each time I start working with a new programming idea, to capture my thoughts about what I am doing rather than relying on comments in the code.
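The "save a selection, tagged with where and when I found it" wish could look something like the following. This is a sketch only: the entry layout, the function names and the example URL are all hypothetical, and getting the selected text out of the browser or editor is left out entirely.

```python
from datetime import datetime


def format_clipping(text, source, when, comment="", tags=()):
    """Format a saved selection as a log entry with source and date.

    The layout (Clipped/Source/Tags header, quoted text, then an optional
    comment) is an assumption chosen for this example.
    """
    lines = [f"Clipped: {when.isoformat()}", f"Source: {source}"]
    if tags:
        lines.append("Tags: " + " ".join(f"[{t}]" for t in tags))
    lines.append("")
    lines.extend("> " + line for line in text.splitlines())
    if comment:
        lines += ["", comment]
    return "\n".join(lines) + "\n"


def append_to_log(path, clipping):
    """Append a formatted clipping to the log file, followed by a blank line."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(clipping + "\n")


clipping = format_clipping(
    "Store all writing in one file.",
    "http://example.org/page",          # placeholder URL
    datetime(2006, 11, 4, 10, 21, 59),
    comment="Try this with Vim.",
    tags=["ideas"],
)
print(clipping)
```

Passing the timestamp in as an argument, rather than calling datetime.now() inside the function, keeps the formatting step easy to test and lets the caller record when the page was actually read.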

The annoying thing is that I know all of the features that I want are available, they are just not integrated with each other yet. Using Emacs as my working environment solves most of the problems since the editor is extensible by adding new Lisp code and already has functions for network connectivity and so on. If I use Firefox as my web browser I can write scripts to do the auto collation of content I wished for above, ie add a new right mouse button menu option of "take selected region and add to log (with or without comment)". It is all there for me to use. I just need to learn how to use it.

Finish now with a plan for the future: learn how to use an editor so that I can write programming idea notes and the code itself in the same text file, setting up a macro so that I can select text in the file and say "make source code out of this, compile it and show me the result" (this is similar to the literate programming idea - check out ocamlweb for an example using OCaml. I want to be able to do this in any language). Initial thoughts: mark code sections in the text file so the editor can process each kind of content appropriately. Lisp style s-expression markup perhaps?
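A first stab at the "make source code out of this and show me the result" step might look like this. It is a sketch under stated assumptions: the (code LANGUAGE) and (end-code) marker lines are hypothetical markup invented for this example (a line-based nod to the s-expression idea, not a real s-expression parser), and only Python sections are actually run.

```python
import subprocess
import sys


def extract_code(notes: str, language: str) -> str:
    """Collect the bodies of all sections marked for the given language.

    A section starts at a line reading "(code LANGUAGE)" and ends at a
    line reading "(end-code)". Everything else is treated as prose.
    """
    chunks = []
    current = None  # lines of the section being read, or None in prose
    for line in notes.splitlines():
        if current is None:
            if line.strip() == f"(code {language})":
                current = []
        elif line.strip() == "(end-code)":
            chunks.append("\n".join(current))
            current = None
        else:
            current.append(line)
    return "\n".join(chunks)


def run_python(source: str) -> str:
    """Run Python source in a fresh interpreter and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


notes = """Idea: check that addition works.
(code python)
print(1 + 2)
(end-code)
That seemed to go fine.
"""

source = extract_code(notes, "python")
print(run_python(source))
```

Supporting another language would mean swapping run_python for a function that writes the extracted source to a file and calls that language's compiler, while the extraction step stays the same.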

[ideas] [me]

