WADI Web Application Distribution Infrastructure.
The problem with the current breed of HTTP servlet session distribution mechanisms is that they do not scale well. I have just tried the first release of WADI 'Web Application Distribution Infrastructure', and it shows great promise as a next-generation session distribution mechanism that can be used for simple load balancing, as well as huge multi redundant clusters.
Web applications are put into clusters with one or more of the following aims:
Scalability: to use N nodes to increase throughput almost N times.
Availability: to use N nodes to ensure that after N-1 failures,
request will find at least 1 node running the application.
Fault Tolerance: to use N nodes to ensure that after N-1 failures, the
results of previous requests are correctly available to the application.
Unfortunately, these aims are not always mutually compatible, nor completely understood by those who want to implement a cluster. Also, the components that are used to make a cluster (load balancers, session replicators, EJB replications, distributed databases) all have different focuses with respect to these aims. Thus it is important to remember why you want a cluster and not to make the aims of a single component dominate your own.
The current crop of load balancers (eg. mod_jk, Pound and IP routers) are very good at scalability and quickly distribute the HTTP requests to a node in a cluster. But they are not so good at fault tolerance as once a failure occurs, the mechanisms for selecting the correct node for a request are replaced by a "pick a node, any node" approach.
The currently available HTTP session distribution mechanisms have mostly been written with fault tolerance in mind. When a node receives some session data, it immediately attempts to replicate it somewhere else. Thus the session is broadcast on the network or persisted in a database, and significant extra load is generated. Because load balancers cannot be trusted to select the correct node, the session is often distributed to all nodes. Thus in an N node cluster, every node must store and broadcast it's own sessions while receiving and storing the session from N-1 other nodes. This is great for fault tolerance but a disaster for scalability. Rarely is 1+1>1 or even 1+1+1+1>1 and throughput is inversely proportional to cluster size. Worse still, the complexity of the mechanism often results in more failures and less availability.
It is difficult to achieve scalability and/or availability with a session distribution mechanism that has been designed for fault tolerance. Thus, many clusters do not use session distribution. Unfortunately this has its own problems, as load balancers are not perfect at sticking a session to a node in a cluster. For example, if IP stickiness is used and the client is on AOL, then their source IP can change mid session and the load balancers will route the requests to the wrong node. To handle the imperfections of load balancers, a subset of session distribution is needed - namely session migration.
Session migration is the initial focus of WADI, so that it can provide scalability and availability to match the available load balancers. However, WADIs extensible architecture has also been designed for fault tolerance, so eventually it will be able to handle all the concerns of clustering Servlet sessions.
WADI has been simply integrated with the Jetty and Tomcat Servlet containers. For normal operation, it adds very little overhead to session access. Only the cost of synchronizing requests to the session and of some AOP interceptors. Expensive mechanisms such as migration or persistence are only used when a node has been shutdown or a load balancer has directed the request to a different node.
When a WADI node receives a request for which it has no session data, it can ask the cluster for the location of the session. This can either be on another node or in persistant storage (after a timeout or graceful node shutdown). The request can then be directed to the correct node or the session can be migrated to the current node.
WADI is still alpha software, but is actively being developed by Julian Gosnell of Core Developers Network and should soon be included with Jetty 5 and Geronimo.
Posted at 09:48AM Jun 11, 2004 by gregw in General | Comments[0]