Calsoft Whitepaper - Implicit replication in a Network File Server
Although research in fault tolerance by replication has matured, the results have not been widely used in practice. Existing approaches incur large overheads in resources and performance, and in many cases application programs must be altered to manage replication explicitly. These problems have limited the acceptance of replication as a means of fault tolerance to settings where the expense can be justified. Our view is that, for fault tolerance by replication to gain acceptance in practice, the following conditions must be met:
1. Failure-free performance must not be penalized because of replication.
2. Replication and failure recovery must be transparent to application programs. Moreover, existing programs should be able to benefit from replication without modification.
3. Replication techniques must support standard protocols and systems.
We have used the above guidelines in the design and implementation of a Highly Available Network File Server (HA-NFS). HA-NFS has been implemented on a network of workstations from the IBM RISC System/6000 family. HA-NFS servers preserve the semantics of the NFS protocol and can be used by existing NFS clients without modification. Therefore, existing application programs can benefit from high availability without alteration.