The newsworthiness of current events
Sun Dec 10 18:56:56 EST 2006
Possibly interesting data mining project: Lexis search of column inches vs number of deaths vs distance from newspaper source geographically. Three dimensional data problem, but I hypothesise that there will be a power law relationship between column inches and some geometric combination (eg distance*body count) of the other parameters. How does this vary with year of publication? An event is newsworthy if it affects you. An event affects you directly if someone you know is affected. Telecommunications and long-distance travel have increased exponentially over the last few decades; has this changed the scaling law? What about further classification by type of event eg accident versus intentional action (eg warfare, terrorism)? Another hypothesis: outside a certain geographical radius accidental and intentional events will be fairly similar in terms of total coverage unless a fair proportion of the readership have ties to the location.
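A minimal sketch of how the power law hypothesis could be tested, assuming the data has already been reduced to pairs of (combined variable, column inches). The function name `fit_power_law` and the numbers below are hypothetical placeholders, not real newspaper data; the fit is an ordinary least-squares line on log-log axes.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**k by least squares on log-log axes; return (a, k)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    k = sxy / sxx                      # slope on log-log axes = exponent
    a = math.exp(mean_y - k * mean_x)  # intercept on log-log axes = log(a)
    return a, k

# Synthetic placeholder values generated from an exact power law,
# used only to show that the fit recovers the exponent.
combined = [1.0, 2.0, 4.0, 8.0, 16.0]
column_inches = [3.0 * x ** -0.5 for x in combined]
a, k = fit_power_law(combined, column_inches)
```

Repeating the fit separately for each year of publication, or for each type of event, would give one exponent per group to compare.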
Everything is Lisp if you use Emacs
Sun Dec 10 08:42:07 EST 2006
A simple editor or word processor lets you type in characters on the keyboard and save them to a file so that you can access the file later. A more advanced editor allows formatting markup of the text, eg headings, lists, paragraphs, justification of text. An extremely advanced editor has an embedded scripting language to allow the user to automate common tasks. The type of tasks that can be automated depends on how powerful the scripting language is, and what interfaces there are between the editor and the outside world.
Emacs is an extremely advanced editor. The scripting language is a dialect of Lisp (Emacs Lisp), so lack of power is not a concern. The editor can be tied into external programs such as news readers, mail agents, web browsers, compilers, contract checkers, ... with the scripting language. Programmers love this because it allows automation of common compilation tasks. (...)
Sat Dec 9 14:31:58 EST 2006
Writing a computer program at the most basic level simply involves taking a thought, translating that thought into a programming language and then running the final program on some machine. However, the downside to programming is that it requires a rigid structure in order to allow the computer to interpret the programming language correctly. This is not completely a downside as it encourages logical thought about a problem, but there are many elements of a program that consist of "boilerplate" code that is reused from problem to problem with no thought involved - a boring problem rather than an interesting one.
Programmers like interesting problems. Turning a boring problem into an interesting problem means finding a general solution to the boring problem so the programmer does not have to think about the boring problem ever again. This means building tools.
The Ocaml distribution has a good selection of tools, similar to those available for other mature languages. (...)
Programming as an expressive medium
Sat Dec 9 14:10:58 EST 2006
Programming is a way of translating thought into form. It is as expressive a medium as a novel or poem. In the poetical form of communication the writer is relying on the depth and breadth of the reader's experience to convey the poet's message. In the programming form of communication the programmer relies on the capabilities of a computational engine, including its interface with the user. This user interface can be optimized for a given purpose, minimising the impedance mismatch between the programmer and his audience.
Here is the idea: any program is a message from the programmer saying one or more of:
- This is a problem that I find interesting, and here is a method to investigate or solve that problem.
- This problem is boring; here is a method to solve it automatically so I can think about more interesting things.
- I don't care much one way or the other about this problem, but writing this program pays the bills and lets me do more interesting things. (...)
Tries, dictionaries and directories
Sat Nov 25 09:02:43 EST 2006
A trie, or prefix tree, is a tree structure that maps keys to values stored in the tree. The tree itself consists of a set of labelled edges and nodes containing the values. The key corresponding to the value at a particular node is the concatenation of the path labels leading from the root of the tree to the node.
A standard application is to use a trie to represent a dictionary (see the exercises at the end of chapter 2 of the ORA Ocaml book). Here the edges are labelled by the characters making up a word, and the associated value is a boolean stating whether the node is the end of a word.
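A minimal sketch of the dictionary trie described above, written in Python rather than Ocaml. The names `TrieNode`, `insert` and `contains` are illustrative choices, not taken from the book exercises.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # edge label (one character) -> child node
        self.is_word = False  # the boolean value stored at this node

def insert(root, word):
    """Walk or create the path labelled by the characters of word."""
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_word = True

def contains(root, word):
    """Follow the path for word; it is in the dictionary only if the
    path exists and the final node is marked as the end of a word."""
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return False
    return node.is_word

root = TrieNode()
for w in ["car", "cart", "cat"]:
    insert(root, w)
```

Here `contains(root, "cart")` is true, while `contains(root, "ca")` is false: the path exists but the node is not marked as the end of a word.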
A trie can also be used to represent the layout of a website consisting of a root index page followed by a hierarchy of subpages (ie the layout of my website). Here the node values are wrapper functions that wrap content depending on its position in the tree, for example producing a navigation menu that depends on the position of the element in the tree. (...)
Wrapping content in containers
Sun Nov 19 11:24:44 EST 2006
Everyone is pretty familiar by now with the concept of wrapping content in tags describing how to display that content. A basic example is wrapping a section heading in an <h1>Section name</h1> environment.
What I want to do is wrap an entire output format around my content. Consider my website as the example. As I write this my website consists of a hierarchical collection of linked pages. The pages at the top level of the hierarchy ("Home", "Me", "Links", etc) are used to construct a top level navigation bar that is placed on the side of each page. The lower level pages have a smaller navigation bar at the top of the page that links to children of that page and the parent node (as long as the parent is not one of the top level pages).
At the moment my website generation program is quite low level: for example, it treats the pages as strings rather than abstract datatypes (this is due to my lack of understanding of the module system when I wrote the program). (...)
The Game of Life, Metacircular interpreters and Hashlife
Sun Nov 12 09:33:31 EST 2006
J.H. Conway described the "Life" cellular automaton in 1970. This cellular automaton consists of cells that may be alive or dead, and a transition rule that says a cell stays in the same state if it has two live neighbours, becomes alive if it has three live neighbours and dies otherwise. The cells are arranged in a regular array of square cells, with the neighbourhood of a given cell being the surrounding eight cells.
- This simple setup is Turing complete
- see "The Recursive Universe" by William Poundstone or "Winning Ways (For Your Mathematical Plays)" by J.H. Conway.
- This means that it is possible to construct a Life pattern that acts as an interpreter, ie it takes other programs translated (compiled) to a Life pattern and then evaluates the program described by this input pattern. In particular, the pattern of the evaluator itself can be run by the evaluator - this is the key of a language being Turing complete. (...)
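The transition rule described above can be sketched in a few lines. This is a plain set-based implementation on an unbounded grid, not Hashlife; the function name `step` and the choice to represent the board as a set of live (x, y) coordinates are assumptions made for the example.

```python
from collections import Counter

def step(live):
    """Advance a set of live (x, y) cells by one generation."""
    # Count, for every cell adjacent to a live cell, how many live
    # neighbours it has among the surrounding eight cells.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Three neighbours: alive. Two neighbours: keep the current state.
    # Anything else: dead.
    return {
        cell
        for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in live)
    }

# A "blinker": three cells in a row that oscillate with period two.
blinker = {(0, 1), (1, 1), (2, 1)}
```

Applying `step` once turns the horizontal row into a vertical one, and applying it twice returns the original pattern.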
Nouns and Verbs
Sun Nov 5 20:04:26 EST 2006
Recently I have been reading "Structure and Interpretation of Computer Programs" by Abelson and Sussman (or "the wizard book" as it is also known). This describes the fundamentals of computer programming, using the Lisp dialect Scheme as the example language due to its ability to treat functions as data and vice versa. The ability to treat functions and data on the same footing leads to a whole new way of programming, since programs can be designed that work by functional composition rather than operations on data structures.
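A small sketch of what treating functions as data looks like in practice, written in Python rather than Scheme. The helper `compose` and the two example functions are illustrative names, not taken from the book.

```python
from functools import reduce

def compose(*funcs):
    """Return a function that applies funcs right to left,
    as in mathematical composition: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs, lambda x: x)

def increment(n):
    return n + 1

def double(n):
    return n * 2

# Functions are passed around and combined like any other value.
double_then_increment = compose(increment, double)
result = double_then_increment(5)  # increment(double(5))
```

The new function is built purely by combining existing ones; no intermediate data structure is described anywhere.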
Why is this fundamental? It boils down to how we as humans use language. We use language to describe the world around us to ourselves and other people. While it is a fallacy that tribes having no word for the colour blue are unable to see that colour (for example) there is a grain of truth - the ability to translate perception into different sensory channels reinforces events and brings different processing mechanisms into play. (...)
Organisation of content
Sat Nov 4 10:21:59 EST 2006
My default state is to take in information in written form from books and websites. This is all well and good but unless I make use of that information it is no better than sitting in front of the television watching a soap opera or reality TV show.
Humans process information. Novelty is the acquisition of new facts about our surroundings and relationships that may reveal potential threats or opportunities. Hence there is a biological drive for humans to acquire and process information. As with any biological drive there is good money to be made by subverting the drive for commercial interests. For example the fast food industry exists by hacking into the biological drive for (high energy density) fatty sugary foods, traditionally a good resource for energy storage in times of famine. In the same way the human desire for understanding and novelty means that new interesting information has a drug-like effect on the human brain. (...)
Content management systems
Sun Oct 15 19:39:42 EST 2006
It looks like it may be time to bid a fond farewell to my homebrew blogging software, at least in its current incarnation. It was a good programming exercise to write a program that read a plain text file and constructed a set of interlinked webpages with the dated entries separated into the appropriate categories. This turned out to be fairly simple in the functional programming style, and I'm tempted to go a bit further in introducing a higher degree of data abstraction (now that I have a better idea of how the Ocaml module system works).
But recently I have been playing around with the Drupal content management system (somehow volunteering to update the group website became constructing the group interactive online community. This mission creep tends to happen when I get a dull computery job to do :) ). The nice thing about this system compared to other content management systems is its extensibility. (...)
Encrypted blog entries again
Sat Oct 14 09:06:15 EST 2006
One of my friends read the blog entry from a couple of weeks ago about encrypted blog entries:
"There already exist libraries for encryption between the client and the server eg using public key cryptography or similar."
This is of course true, and probably what you would use in a secure web application where bi-directional communication is allowed.
His question clarified the idea in my mind further. The crux of the idea is unidirectional broadcast from a server to a set of clients. The messages broadcast contain plain text data, encrypted data, and code for the encryption machines. Keys to make the machines decrypt encrypted content are distributed through other channels, for example buying a decryption key in tamper-proof hardware token form from a vendor.
Is this starting to sound familiar?
Looks like I've just reinvented encryption for digital or satellite television. (...)
Encrypted blog entries
Tue Oct 3 21:48:44 EST 2006
Most social networking sites have a mechanism whereby only the friends of a person can see their blog entries. The standard way to do this is to have a back end database that keeps track of the users logged in to the system and allows them access to the entries only if they are in the list of friends of a given user. This is a server side approach that has a couple of disadvantages:
- increased server load to keep track of the state of the system
- lack of security against eavesdroppers as it can be seen when access to a restricted page is attempted
The second point deserves further discussion. Consider a high traffic website that contains some entries that are restricted to those people meeting certain criteria eg a news site that restricts access to full articles to subscribers, a blog with personal entries restricted to close friends, www.al-queda.ter restricting details of plans to the true believers, etc. (...)
Thu Sep 28 08:25:57 EST 2006
Just bought my first mobile phone ever - a cheap "Pay as you go" (UK) or "Prepaid" (AUS) Nokia 1112. Basic phone, no camera, 4MB memory (can't seem to access it to upload applications), just phone calls and text messages.
Text messages use the Short Message Service (SMS) format, which carries a payload of up to 140 bytes, and can be linked to the internet through SMS gateways.
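As a quick check on what 140 bytes buys you: the familiar 160 character limit comes from packing 7-bit characters into that payload, and messages sent as 16-bit UCS-2 fit fewer. A small sketch of the arithmetic (the function name `max_characters` is just for illustration):

```python
SMS_PAYLOAD_BYTES = 140

def max_characters(bits_per_character):
    """Number of whole characters that fit in one SMS payload."""
    return (SMS_PAYLOAD_BYTES * 8) // bits_per_character

print(max_characters(7))   # GSM 7-bit default alphabet -> 160
print(max_characters(16))  # UCS-2 -> 70
```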
A few links:
- [O'Reilly Nokia Smartphone Hacks] for fancier phones than mine
- [Kannel]: Open Source WAP and SMS gateway. Available in the FreeBSD ports through www/kannel
- [SMS Server Tools] available in comms/smstools
To use any of the gateways a computer connected to the phone network is needed. I still don't have a land-line at home, but Vodafone seems to have reasonable deals on [GPRS
A Pattern Language
Thu Sep 14 20:57:39 EST 2006
Anyone who has studied an object-oriented programming language such as Java or C++ will have come across the "Design Patterns" concept at some stage. This was popularised by the "Gang of Four" book of the same name, the basic concept being to identify common problems faced in programming tasks and use template solutions to solve them. A consistent approach to laying out the solutions and naming them appropriately enhances the value of this approach. This concept is based on an earlier work in the field of architecture, the book "A Pattern Language" by Christopher Alexander, Sara Ishikawa and Murray Silverstein et al.
"A Pattern Language" identifies 253 common design patterns encountered when designing towns, neighbourhoods and individual buildings - and a key point is that the patterns are described in this order, from the scale of regions down to the personal scale of the family home. (...)
Biological basis of homesickness?
Sun Aug 27 22:46:57 EST 2006
Why do people feel homesick when they move to a new location? Is there an evolutionary explanation? A new location requires finding a new solution to the basic survival needs of food and shelter, plus the need to build up a new protective social network. Check for animal studies where the animals are relocated from their usual environment or social group to a new social group.
Humans are a particularly social animal, and a large amount of the brain's processing power is taken up with modelling the probable behaviour of people within our social circle. Is homesickness a mechanism to try to prevent moving to a new location and having to recalculate the behaviours of your new human environment?
The Constructivist Manifesto
Sun Aug 27 22:23:15 EST 2006
There are always going to be things that you consider not to be the way the world should be. You can either try to force the world to be as you want it by destructive acts, or constructively build the new capabilities that you require. Analyse the way the system works and build the devices to extend its capabilities or hide the undesired complexities. Borrowing from Eric Raymond's "How To Become A Hacker", you shouldn't have to solve the same problem twice. After the first time you should have made something that solves the problem automatically.
Always identify why you are unhappy with life, don't just accept things as they are. Sometimes it is fine to be unhappy for the sake of it, but real conflicts between yourself and the world should be resolved. Change yourself to be in harmony with the world, change the world to be in harmony with yourself or accept the tension between the two.
Labview in Erlang
Sun Aug 13 21:44:30 BST 2006
Labview is the common programming language used for data acquisition in the experimental sciences (although Matlab is also very popular). The main selling point of Labview is the ability to write programs in a graphical programming language where functions are "virtual instruments" with defined inputs and outputs joined together by wires. This allows for "data flow" based programming where the output of a virtual instrument is calculated as soon as all of the inputs are available.
In Erlang the inputs and outputs can be represented by messages, and the virtual instruments by processes. The tricky thing is to represent the data flow of the input messages so that the process calculates the outputs only when all of the input messages have arrived. Take a simple example of a process that requires three data inputs to calculate its output. (...)
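The firing rule can be sketched without any concurrency at all. The following is a Python stand-in rather than Erlang: the class `DataflowNode` and its `receive` method are hypothetical names, and a real Erlang version would hold the pending inputs in the state of a process's receive loop instead of in an object.

```python
class DataflowNode:
    """A "virtual instrument" that fires only when every input has arrived."""

    def __init__(self, input_names, func):
        self.input_names = tuple(input_names)
        self.func = func
        self.pending = {}  # input name -> value received so far

    def receive(self, name, value):
        """Accept one input message. Returns the output once all inputs
        are present, otherwise None (so func should not return None)."""
        if name not in self.input_names:
            raise KeyError(name)
        self.pending[name] = value
        if len(self.pending) < len(self.input_names):
            return None
        args = [self.pending[n] for n in self.input_names]
        self.pending = {}  # reset, ready for the next round of inputs
        return self.func(*args)

# A node that needs three data inputs, which may arrive in any order.
adder = DataflowNode(("a", "b", "c"), lambda a, b, c: a + b + c)
first = adder.receive("b", 2)   # not ready yet
second = adder.receive("a", 1)  # still not ready
third = adder.receive("c", 3)   # all three inputs present: fires
```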
Fri Aug 11 21:52:30 BST 2006
Another concept from "Utopia or Oblivion": energy slaves. A human can do a certain amount of work per day unaided, call this the energy value of a human. Each person in a developed country uses more than this amount of energy, by making use of energy stored in fossil fuels, nuclear power, wind power etc, or by exploitation of other humans or animals. Each person therefore has a number of energy slaves (their energy use divided by the average human energy value) working for them. Increasing efficiency of technology leads to more energy slaves per person, as long as the underlying energy supply lasts. A fundamental limit of the total amount of new energy available is the energy output of the sun, together with energy stored in spin-orbital angular and linear motion of objects in the solar system.
Consider extending the energy slaves concept to other aspects of human existence. As well as moving around matter and energy, humanity also processes information. (...)
Communication defeats competition
Thu Aug 3 18:52:56 BST 2006
Communication defeats competition. One man may be stronger than you, but he won't be stronger than you and your thousand friends (unless his friends have given him powerful weapons).
If you can only talk to your local tribe you are working for the good of your genes; only by spreading your social circle outside your physical environment are you able to work for the good of society as a whole. The internet is the enabler of this.
(The locality statement above is based on the assumption that for low levels of immigration people sharing a common geographical area will share more genes in common than two people chosen at random from the entire population. This may give rise to hive-mind behaviour where the (sterile) worker bees sacrifice themselves to protect the queen bee that gives the genome immortality.)
Utopia or Oblivion
Thu Aug 3 18:31:39 BST 2006
I'm rereading R. Buckminster Fuller's book "Utopia or Oblivion" at the moment (one advantage of packing all of my books into boxes is that I discover some that I haven't read in a while). The book is a series of transcripts for public lectures given by Fuller, to a variety of audiences. The common theme of the lectures is technological progress solving all of humankind's problems, either by giving us a utopia or wiping humanity from the face of the planet.
- Ephemeralization. Doing more with less. As technology improves it (by definition?) becomes more efficient. One solution to the global warming problem would be to operate engines at a higher temperature, so that the Carnot efficiency is improved and less energy is lost as heat. This is a materials engineering problem that can be solved by development of cheap and strong high temperature materials for turbine blades.
- Language acts as an intrinsic limiter to thought. (...)
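To make the Carnot efficiency point above concrete: the upper bound on the efficiency of a heat engine is 1 - T_cold/T_hot, with both temperatures in kelvin. A small sketch follows; the temperatures are round illustrative values, not figures for any real turbine.

```python
def carnot_efficiency(t_hot_kelvin, t_cold_kelvin):
    """Upper bound on heat engine efficiency between two reservoirs."""
    if not 0 < t_cold_kelvin < t_hot_kelvin:
        raise ValueError("need 0 < t_cold < t_hot, in kelvin")
    return 1.0 - t_cold_kelvin / t_hot_kelvin

# Illustrative values only: same cold reservoir, hotter engine.
cooler_engine = carnot_efficiency(900.0, 300.0)
hotter_engine = carnot_efficiency(1500.0, 300.0)
```

Raising the hot side from 900 K to 1500 K lifts the bound from about two thirds to about four fifths, which is why high temperature turbine materials matter.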
Ultrafast lasers and nuclear fusion
Tue Jul 25 21:44:39 BST 2006
Things that we can do now that we could not five years ago: create attosecond pulses of coherent X-rays. We can do this by hitting a gas with very short (femtoseconds long), very intense (terawatt peak power) laser pulses. The underlying physics is multiphoton acceleration of the electrons in the atoms of the gas - the laser is intense enough that the potential seen by the electron changes on the timescale of the electron's vibration in the atom, so resonant forcing occurs.
How does this relate to fusion? One approach to fusion is to hit the fusion pellet fast enough and with enough energy that [inertial confinement] occurs. This means that the energy has no time to be distributed over the entire pellet, but instead is concentrated in the area where the laser pulse hits. This region turns into a plasma and the back reaction compresses and heats the pellet enough that fusion occurs. (...)
Trust and databases
Tue Jul 25 21:08:46 BST 2006
Each person has data associated with them which defines who they are and what they have done. Knowledge of this data allows prediction of future actions or verification of past actions. Consider a national DNA database that can be used for law enforcement. The idea is that once a person is arrested for any crime their details are placed in the database, allowing identification of that person if they commit a crime in the future. The arguments for this system are that it is beneficial for society to punish criminals, and that if you are not doing anything wrong you have nothing to hide.
Security researcher [Bruce Schneier] presents a very good argument against a system such as this: we should not have to justify why we do not want our data to go into the database, rather the entity running the database should have to prove why they can be trusted with our data. (...)
Blogging from email
Thu Jul 13 12:15:35 BST 2006
The data entry process for a blog usually consists of entering the post on a webpage, or writing a set of text files on a computer and running a script over the result to upload to a weblog. The first approach limits you to the editing capabilities of the web browser, while the second requires a higher level of technical expertise.
An alternative approach would be to email the post to an email account connected to a procmail script that passes the email to a processing script. The processing script takes the date and subject from the email headers, tags specified at the start of the email and then runs Markdown over the actual blog post to produce the HTML.
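A rough sketch of what the processing script might do with the email that procmail hands over, using Python's standard `email` package. The function name `parse_post` and the convention of a leading `Tags:` line in the body are assumptions made for this example, and the Markdown conversion step is left out because it needs a separate program or library.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

def parse_post(raw_email):
    """Split a raw email into title, date, tags and Markdown body."""
    msg = message_from_string(raw_email)
    lines = msg.get_payload().splitlines()
    tags = []
    # Assumed convention: an optional first body line "Tags: a, b, c".
    if lines and lines[0].lower().startswith("tags:"):
        raw_tags = lines[0][len("tags:"):]
        tags = [t.strip() for t in raw_tags.split(",") if t.strip()]
        lines = lines[1:]
    return {
        "title": msg["Subject"],
        "date": parsedate_to_datetime(msg["Date"]),
        "tags": tags,
        "body": "\n".join(lines).strip(),  # still Markdown at this point
    }

# A made-up example message.
raw = """\
From: author@example.org
Subject: Blogging from email
Date: Thu, 13 Jul 2006 12:15:35 +0100

Tags: blog, email
The *post* itself, still in Markdown.
"""
post = parse_post(raw)
```

The `body` field would then be passed through Markdown to produce the HTML for the entry.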
Authentication can be handled by entering a password as a separate password tag in the email. The security of this can be enhanced by using encrypted email or one-time passwords if necessary. (...)
Thu Apr 27 17:29:39 BST 2006
Multi-user online chat
- http://en.wikipedia.org/wiki/ThePalace(computer_program) 2D visual chat room, users are represented by icons that can travel around the "palace". Similar to the idea that I want to implement with my LambdaMOO web version.
- http://en.wikipedia.org/wiki/Talker Long article on chat rooms based on MUDs