OTP design principles
A key advantage of using Erlang to construct large fault-tolerant systems is the Open Telecom Platform (OTP) framework. This is a set of libraries that allows applications to be constructed as SUPERVISION TREES, which divide the processes used in the application into:
- workers that do the actual computation
- supervisors that monitor the behaviour of workers and restart them if they crash
The supervision tree is the hierarchical arrangement of code into supervisors and workers. This arrangement is possible because most of the modules in an application follow one of a small set of patterns (behaviours):
- supervisor: starts, stops and monitors child processes, eg restarting crashed processes or spawning new sets of processes when demand is high.
- gen_server: generic server (usually these processes make up the bulk of an application \ref{Armstrong thesis}). Acts as a server in a client-server setup, ie receives requests from clients and responds to them.
- gen_fsm: generic finite state machine. The process exists in one of a finite number of states. When an event occurs the process performs some actions and moves to another state depending on the event. (In recent OTP releases gen_fsm is deprecated in favour of gen_statem.)
- gen_event: generic event handler. The process acts as a named process to which events can be sent, eg for logging purposes.
- supervisor_bridge: connects a subsystem not designed according to the OTP principles to the supervision tree. The bridge sits between a supervisor and the subsystem, so that the subsystem can be supervised like any other child.
- application: for adding entire applications to the supervision tree.
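To make the supervisor behaviour in the list above concrete, the following is a minimal sketch of a supervisor with a single worker. The module name demo_sup, the function start_worker/0 and the trivial worker loop are hypothetical names chosen for illustration, and the restart limits are arbitrary; a real application would normally supervise a gen_server or another behaviour module rather than a bare process.

```erlang
-module(demo_sup).
-behaviour(supervisor).

%% API (hypothetical names, for illustration only)
-export([start_link/0, start_worker/0]).
%% supervisor callback
-export([init/1]).

%% Start the supervisor and register it locally as demo_sup.
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Start function for the worker. A child start function must link
%% the new process to the supervisor and return {ok, Pid}.
start_worker() ->
    Pid = proc_lib:spawn_link(fun worker_loop/0),
    {ok, Pid}.

%% The worker does nothing useful: it just discards messages.
worker_loop() ->
    receive
        _Msg -> worker_loop()
    end.

%% Called by the supervisor behaviour to obtain the restart strategy
%% and the list of children to start.
init([]) ->
    SupFlags = #{strategy => one_for_one,  % restart only the crashed child
                 intensity => 5,           % allow at most 5 restarts ...
                 period => 10},            % ... within 10 seconds
    Child = #{id => demo_worker,
              start => {?MODULE, start_worker, []},
              restart => permanent,        % always restart the child
              shutdown => 5000,            % ms to wait before killing it
              type => worker,
              modules => [?MODULE]},
    {ok, {SupFlags, [Child]}}.
```

If the worker process is killed, eg with exit(Pid, kill), the supervisor starts a replacement, which can be observed by calling supervisor:which_children(demo_sup) before and after.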
Behaviours are implemented as callback modules, ie the developer writes a set of functions that the behaviour calls in order to carry out its task. For example, a gen_server callback module typically exports the functions init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2 and code_change/3. Since gen_servers usually make up the bulk of an application (Joe Armstrong thesis), the gen_server callbacks are considered in detail below:
- init(Args) : initialises the gen_server when it is started. Args is the term passed to gen_server:start or gen_server:start_link. The callback should return {ok, State} if the server initialises correctly, where State is the initial state of the server.
- handle_call(Request, From, State) : handles a SYNCHRONOUS call, ie one where the calling process waits for the server to respond before continuing its execution. Request is a term, From is the {Pid, Tag} of the calling process and State is the current state of the server (a term). If the server successfully handles the call it returns {reply, Reply, NewState}, where Reply is sent back to the caller and NewState is the new state of the server.
- handle_cast(Request, State) : handles an asynchronous request, ie one where the calling process doesn't wait for any response to be returned before continuing execution. Used to change the internal state of the server (or get the server to communicate with other servers detailed in the request). Returns {noreply, NewState} upon successful execution.
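- terminate(Reason, State) : called when the server is about to shut down, so that it can clean up any resources it holds. The return value is ignored.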
- handle_info(Info, State) : called whenever the server times out or receives a message that isn't a call or cast. Returns {noreply, NewState} upon successful execution.
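- code_change(OldVsn, State, Extra) : called when the code of the module is upgraded or downgraded, so that the server can convert its state to the format expected by the new version. Returns {ok, NewState} if successful.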
The first three callbacks are the main ones to implement to get an application up and running; the others can initially be put in as skeletons that do no processing and simply return the required result.
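The callbacks described above can be sketched as a minimal counter server. This is an illustrative example only: the module name counter_server and the client functions increment/0 and value/0 are hypothetical, and the server state is assumed to be a single integer. The first three callbacks do the work, while the remaining three are skeletons that return the required result.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Client API (hypothetical names, for illustration only)
-export([start_link/0, increment/0, value/0]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2,
         handle_info/2, terminate/2, code_change/3]).

%% Start the server, register it locally and pass 0 to init/1.
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, 0, []).

%% Asynchronous request: returns ok without waiting for the server.
increment() ->
    gen_server:cast(?MODULE, increment).

%% Synchronous request: blocks until the server replies.
value() ->
    gen_server:call(?MODULE, value).

%% The initial state is the integer passed by start_link/0.
init(Start) ->
    {ok, Start}.

%% Reply with the current count and leave the state unchanged.
handle_call(value, _From, Count) ->
    {reply, Count, Count};
handle_call(_Other, _From, Count) ->
    {reply, {error, unknown_request}, Count}.

%% Update the state; no reply is sent to the caller.
handle_cast(increment, Count) ->
    {noreply, Count + 1};
handle_cast(_Other, Count) ->
    {noreply, Count}.

%% Skeleton callbacks: ignore stray messages, do no cleanup,
%% and keep the state unchanged on a code upgrade.
handle_info(_Info, Count) ->
    {noreply, Count}.

terminate(_Reason, _Count) ->
    ok.

code_change(_OldVsn, Count, _Extra) ->
    {ok, Count}.
```

After counter_server:start_link(), calling counter_server:increment() twice and then counter_server:value() returns 2. Messages sent from one process to another are delivered in order, so both casts are handled before the call.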