
From earlier discussions on the dev list, we've looked at three different types of clustering:

1) transactional replication
2) transactional load balancing
3) standby/failover

Of these choices, the third, failover, is the immediate concern. What we need to have is a failover policy in which a second Mule instance can pick up the work when the primary one fails. To accomplish this, there are four areas to address:

a) configuration replication
b) resource replication
c) failover mechanism
d) shared memory space

A. Configuration Replication

Basically, all this means is that we have a mechanism for replicating the master Mule configuration to one or more child nodes. This replication should be accomplished through a multicasting mechanism like that used by JGroups and should distribute any core configuration changes. The goal here is to make sure a second Mule instance is running with exactly the same configuration as the primary, with the exception that no routes are activated (until failover occurs).

In looking through the source code, we can see that there are a number of configuration objects (i.e., pretty much everything in the configuration file) that would have to be replicated between multiple Mule nodes:

  • server properties
  • connector definitions
  • transformer definitions
  • agent definitions
  • interceptor definitions
  • security settings
  • global endpoints
  • endpoint identifiers
  • model definitions
  • descriptor definitions (routes)

This is where the Registry should be used. A registry can live in a central, well-known location from which all nodes obtain their configuration. If the configuration changes, notifications with details of the changes can be broadcast to all nodes and consumed by each of them. The actual transport of the notifications can be managed by JGroups.

Using JGroups, here is one way we could achieve replication (a rough code sketch follows the list):

  • have an o.m.replication.MuleNode that implements/uses a JGroups JChannel to communicate with other nodes
  • have the MuleNode register with the MuleManager (or somewhere) to receive any events indicating a configuration change (add, delete, update)
  • have the MuleNode broadcast messages with the configuration object change
  • other MuleNodes would pick up these messages and feed the changes back through their local MuleManagers
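
Below is a minimal sketch of what such a node might look like, assuming a JGroups 3.x-style API (JChannel, ReceiverAdapter). The ConfigChangeEvent class and the applyLocally() hook are hypothetical placeholders for whatever notification objects the MuleManager actually produces:

    import org.jgroups.JChannel;
    import org.jgroups.Message;
    import org.jgroups.ReceiverAdapter;

    import java.io.Serializable;

    // Sketch of o.m.replication.MuleNode. ConfigChangeEvent and applyLocally()
    // are hypothetical placeholders for the real Mule notification machinery.
    public class MuleNode extends ReceiverAdapter {

        /** Hypothetical notification describing a single configuration change. */
        public static class ConfigChangeEvent implements Serializable {
            public enum Type { ADD, UPDATE, DELETE }
            public final Type type;
            public final String objectName;    // e.g. a connector, transformer or descriptor name
            public final Serializable payload; // serialized form of the changed object

            public ConfigChangeEvent(Type type, String objectName, Serializable payload) {
                this.type = type;
                this.objectName = objectName;
                this.payload = payload;
            }
        }

        private final JChannel channel;

        public MuleNode(String clusterName) throws Exception {
            channel = new JChannel();     // default UDP/multicast protocol stack
            channel.setReceiver(this);    // receive changes broadcast by other nodes
            channel.connect(clusterName); // join the named cluster
        }

        /** Called by a listener registered with the MuleManager when a local change occurs. */
        public void broadcast(ConfigChangeEvent event) throws Exception {
            channel.send(new Message(null, event)); // null destination = all members
        }

        /** Changes broadcast by other nodes arrive here. */
        @Override
        public void receive(Message msg) {
            applyLocally((ConfigChangeEvent) msg.getObject());
        }

        private void applyLocally(ConfigChangeEvent event) {
            // Feed the change back through the local MuleManager (placeholder).
        }

        public void close() {
            channel.close();
        }
    }

One detail to watch: by default a node also receives its own broadcasts, so the receive side either needs to ignore messages it sent itself or the channel needs to be told to discard its own messages.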

B. Resource Replication

Many of the commercial ESB implementations that I have worked with have the concept of a virtual file system that replicates resources like jars, keystores, config files, etc., between server instances. Drop a jar in one ESB node, and it will be replicated to all others. This is also critical to any failover scenario.

Again, we would use JGroups to monitor any resource dependencies for a Mule instance (perhaps even by simply looking at the classpath). A more sophisticated approach would be to create a virtual file system that is instantiated by each Mule node on its local filesystem. This file system would have a well-defined structure, with perhaps room for user-defined directories for projects.
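
As a rough illustration of how the simpler half of this might look, a node could watch a well-known directory in that virtual file system and broadcast any new or changed file to the other members over a JGroups channel. The class name, directory layout, and wire format below are invented for the sketch; receiving nodes would write the bytes to the same relative path locally:

    import org.jgroups.JChannel;
    import org.jgroups.Message;

    import java.io.Serializable;
    import java.nio.file.*;

    // Illustrative only: watch a shared directory (e.g. the node's "lib" area) and
    // broadcast new or modified files to the rest of the cluster.
    public class ResourceReplicator {

        /** Hypothetical wire format: relative path plus file contents. */
        public static class ResourceUpdate implements Serializable {
            public final String relativePath;
            public final byte[] contents;

            public ResourceUpdate(String relativePath, byte[] contents) {
                this.relativePath = relativePath;
                this.contents = contents;
            }
        }

        public static void watchAndBroadcast(Path root, JChannel channel) throws Exception {
            WatchService watcher = root.getFileSystem().newWatchService();
            root.register(watcher, StandardWatchEventKinds.ENTRY_CREATE,
                                   StandardWatchEventKinds.ENTRY_MODIFY);
            while (true) {
                WatchKey key = watcher.take(); // block until something changes
                for (WatchEvent<?> event : key.pollEvents()) {
                    Path changed = root.resolve((Path) event.context());
                    ResourceUpdate update = new ResourceUpdate(
                            root.relativize(changed).toString(),
                            Files.readAllBytes(changed));
                    channel.send(new Message(null, update)); // replicate to all other nodes
                }
                key.reset();
            }
        }
    }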

C. Failover Mechanism

Basically, we would have to code up some rules for leader election among Mule instances. Each MuleNode would know the rules and, should the primary Mule go down, the remaining nodes would be able to decide which one is the next primary. I've rolled this myself several times and it is pretty easy.
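
JGroups gives us most of the raw material for this: every member sees the same ordered membership view, so one simple and commonly used rule is "the first member in the view is the primary". A sketch of that rule, again assuming a JGroups 3.x-style API; activateRoutes()/deactivateRoutes() stand in for enabling or disabling routes in the local model:

    import org.jgroups.Address;
    import org.jgroups.JChannel;
    import org.jgroups.ReceiverAdapter;
    import org.jgroups.View;

    // Election rule: the oldest surviving member (first in the JGroups view) is the
    // primary. When the primary dies, JGroups installs a new view on the remaining
    // nodes and each one re-evaluates the rule; no extra election messages are needed.
    public class FailoverPolicy extends ReceiverAdapter {

        private final JChannel channel;
        private volatile boolean primary;

        public FailoverPolicy(JChannel channel) {
            this.channel = channel;
        }

        @Override
        public void viewAccepted(View view) {
            Address coordinator = view.getMembers().get(0); // same ordering on every node
            boolean nowPrimary = coordinator.equals(channel.getAddress());
            if (nowPrimary && !primary) {
                activateRoutes();   // this node has just become the primary
            } else if (!nowPrimary && primary) {
                deactivateRoutes(); // demoted, e.g. after a network partition heals
            }
            primary = nowPrimary;
        }

        private void activateRoutes()   { /* enable descriptors in the local model */ }
        private void deactivateRoutes() { /* keep routes known but disabled */ }
    }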

Provided the same transactional event source (i.e., JMS or JDBC) is available to the failover nodes, there "shouldn't" be a problem. If Mule dies midway, the transaction is interrupted and nothing is committed. Thus, when the failover node kicks in, it will pick up the failed message. Problems will occur when non-transactional and transactional data processing occurs in a single request. But that's the same with any transactional system (of course, we could get tricky here and always combine a non-transactional action with a transacted checkpoint that could be queried on start-up to roll back the non-transactional resource to a consistent state...).

D. Shared Memory Space

This is similar to resource replication. A shared memory space (SMS) is used for storing 'session' state between requests, i.e., correlation of events, idempotent event tracking, etc. This SMS would need to be high performance, maybe by using a replicated memory cache.
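
To make the contract concrete, the SMS could be as small as the map-like interface below, which the correlation and idempotent-receiver code would program against; the backing implementation could then be a replicated cache (for example something along the lines of JGroups' replicated hashmap building block). The interface name and methods are purely illustrative:

    import java.io.Serializable;

    // Purely illustrative contract for the shared memory space (SMS). A failover
    // node reading through this interface should see the same session state that
    // the failed primary wrote.
    public interface SharedMemorySpace {

        /** Store session state under a key, e.g. a correlation ID. */
        void put(String key, Serializable value);

        /** Retrieve state written by this node or by any other node in the cluster. */
        Serializable get(String key);

        /** Atomically record an event ID; returns false if it was already seen (idempotency). */
        boolean putIfAbsent(String eventId);

        /** Remove state once a correlated event group is complete. */
        void remove(String key);
    }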

Any clustering plan involves two other features: implementation of a JBI registry and hot deployment. It has been suggested that the JBI registry is the way to go for the future. However, given that this is a big piece to bite off, I'd like to decouple the JBI registry from the actual clustering solution. Hot deployment is something that can also be decoupled.

There are three options when it comes to a registry:

1) Use the current concept of a Mule registry (i.e., the model) and enhance it so that it supports component enable/disable functions, component events (start, stop, enable, disable, reload), and manipulation by a clustering agent.

2) Write a Mule-specific registry that takes over the ownership of components that the model currently has. To that, add the ability to store info/pointers to resources (properties, files, etc.).

3) Write a JBI registry and make it replace the model or at least take over the ownership of components from the model.

My preference is for #1 because I don't see the need for extensive rework of something that is already working. For either #1 or #2, I would recommend building a JBI facade to the Mule-specific registry.

I suggest a phased approach:

1) Phase 1

1a) Develop a clustering agent based on JGroups that replicates configuration changes across Mule nodes, so that all Mule nodes have the same UMOModel. The agent will have election rules/policies for failover/load balancing.

1b) Enhance the UMOModel so that it has the ability to disable descriptors (routes). This will be required so that the primary model is the only one with activated routes, while the backup nodes know about the routes but have them disabled. When a backup node picks up the work of another node, it activates the appropriate routes. We should also enhance UMODescriptor so it can store its node source (i.e., whether it is local or was created through replication).

1c) Define an interface between the clustering agent and the model, such that the clustering agent can add/remove/disable/enable routes in the model (a rough sketch of such an interface follows item 1d).

1d) Alter the procedure for component building so that we can enable hot deployment.
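
A first cut at the interface mentioned in 1c might expose nothing more than the route-level operations the agent needs. The interface name and method signatures below are only a suggestion; UMODescriptor is the existing Mule type for a route definition, and the package in the import is assumed from Mule 1.x conventions:

    import org.mule.umo.UMODescriptor;

    // Suggested shape only: the operations the clustering agent needs from the model.
    public interface ClusteredModel {

        /** Register a descriptor received via replication, in a disabled state. */
        void addDescriptor(UMODescriptor descriptor);

        /** Remove a descriptor that was deleted on another node. */
        void removeDescriptor(String descriptorName);

        /** Activate a route, e.g. when this node is elected primary. */
        void enableDescriptor(String descriptorName);

        /** Deactivate a route while another node owns it. */
        void disableDescriptor(String descriptorName);
    }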

2) Phase 2

2a) Either develop a full-blown registry, or a JBI facade to the model.

2b) Enhance the registry (or model) to manage resources.

2c) Enhance the registry (or model) to support notification/component events.

3) Phase 3

3a) JMX-ize the whole thing, so that registry components can be managed via JMX.

3b) Shared memory space for session state (could be a distributed cache).

I am currently starting on 1a. Anyone interested in 1b? Ross? Anyone ...?
