Felis Server - High Availability Server Solution

A white paper by Architek, September 1999

Editor's note: since this article was written, Architek has been acquired by another company, but this archived article still contains some useful ideas.
"Our view is that an HA system should not mean there are simply two machines which are inter-controlled in the case of a fault, but a system which is able to recognise and internally manage the various services provided by the system." | |
Felis Server is a High Availability / Cluster software solution integrated with the features of the operating system. It aims to offer the best cost-to-functionality ratio while guaranteeing the continuity of the services a system provides. Felis Server can be implemented rapidly whatever the type of application being supported (database, web server, mail server, file and print server, gateway), and it manages the application server through a simple API.
Our view is that an HA system should not mean there are simply two machines which are inter-controlled in the case of a fault, but a system which is able to recognise and internally manage the various services provided by the system.
Felis Server is a solution for High Availability and Manageability, offering the continuity of services from each server in the system with a solution that is both easy to manage and flexible.
DESCRIPTION

Felis Server gives a generic pool of services the ability to migrate dynamically from one machine to another. A service is characterised by one or more processes that use system resources (the Service Environment), including mass memory (internal disks, external disk arrays, RAID) and network interfaces (Ethernet 10/100 Mbit/s, Token Ring, FDDI, etc.).
The migration of the Service Environment and its associated service can be triggered automatically by the High Availability system, or manually by the System Manager through the Manageability system, either for maintenance or to tune the configuration for the best performance.
Felis Server increases the flexibility of the system. The services present on both machines at boot are characterised according to the optimisation criteria required by the services themselves and the characteristics of the machines that host them. This is separate from the requirements of the software itself.
For each defined service, Felis Server provides the required resources (the Service Environment), both in mass memory (file system devices) and in network interfaces (dynamic configuration of one or more physical or logical network interfaces, with the IP address associated with the service).
The actions that Felis Server takes in the event of an error condition are set in the configuration phase. For a service, they can be a shutdown, a halt (failover), or a restart of the service for a defined number of repetitions. For a machine that does not receive a response from the other, the action is a takeover, which consists of starting on the second machine all the services that were running on the failing one. Each service is controlled through a defined API that allows Felis Server to activate, stop and check the user application as the operation requires, so the control of a given service can be customised.
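As an illustration only, the error handling described above can be sketched in Python. The names `ServiceControl`, `Policy`, `Action` and `recover` are hypothetical and are not the Felis Server API, which this paper does not spell out. The sketch assumes that a service exposes start, stop and check operations and that the configuration supplies a restart limit and an escalation action.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Protocol


class Action(Enum):
    RESTART = "restart"      # service recovered after a local restart
    FAILOVER = "failover"    # halt locally and move to the other machine
    SHUTDOWN = "shutdown"    # stop the service and leave it stopped


class ServiceControl(Protocol):
    """Hypothetical control interface: activate, stop and check one service."""

    def start(self) -> None: ...
    def stop(self) -> None: ...
    def check(self) -> bool: ...


@dataclass(frozen=True)
class Policy:
    max_restarts: int        # assumed config value: restart attempts allowed
    on_exhausted: Action     # assumed config value: action once restarts fail


def recover(service: ServiceControl, policy: Policy) -> Optional[Action]:
    """Return None if the service is healthy, otherwise the action taken."""
    if service.check():
        return None
    for _ in range(policy.max_restarts):
        service.stop()
        service.start()
        if service.check():
            return Action.RESTART
    # Restarts exhausted: stop locally and escalate as configured.
    service.stop()
    return policy.on_exhausted
```

With `Policy(max_restarts=3, on_exhausted=Action.FAILOVER)`, a service that never passes its check is restarted three times and then handed over for failover, while one that recovers on an earlier attempt stays on the local machine.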
The modules available for the management of the commonly used applications are:
SYSTEM ARCHITECTURE

The chosen architecture offers the maximum reliability of the S/W components used in the implementation. The system is composed of these layers:
Redundant Link/Lock Manager

This is the heart of the system, providing a secure method of communication to co-ordinate operations. This layer is involved at every stage: system boot, shutdown, manual intervention to move one or more services, service activation following a takeover, and service component control.
Monitoring

In addition to the functions specified below, the two components (Takeover Monitor and Service Monitor) monitor each other, guaranteeing the continued presence of both. Their functions interact with the Lock Manager for co-ordination and use the Redundant Link for communication.
Takeover monitor

Verifies via the Redundant Link that the other half of the system is operating correctly, and activates the takeover sequence whenever it does not receive a response. The reply mechanism is implemented within the kernel, which guarantees that it always answers requests punctually on behalf of the host operating system. The local control policy can be configured to activate a failover in response to problems such as excessive resource allocation (memory, printers, etc.).
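The takeover decision can be sketched as follows. This is a minimal illustration, not the Felis Server implementation: each physical link of the Redundant Link is represented by a hypothetical probe function that returns `True` when the peer answers, and the retry count is an assumption added to avoid reacting to a single lost reply.

```python
from typing import Callable, Sequence

# A probe sends one request over one link and reports whether the peer replied.
Probe = Callable[[], bool]


def peer_alive(links: Sequence[Probe], retries: int = 3) -> bool:
    """The peer counts as alive if any link gets a reply within the retries."""
    for _ in range(retries):
        if any(probe() for probe in links):
            return True
    return False


def takeover_if_peer_silent(
    links: Sequence[Probe],
    start_takeover: Callable[[], None],
    retries: int = 3,
) -> bool:
    """Start the takeover sequence only when every link stays silent.

    Returns True when a takeover was started.
    """
    if peer_alive(links, retries):
        return False
    start_takeover()
    return True
```

Requiring silence on every link before acting is the reason for having at least two links between the machines: a single failed cable or interface should not be mistaken for a failed peer.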
Service monitor

Is responsible for checking the components that make up the Service Environment of a service. The network and media components are checked individually to ascertain the availability of a service, and the failure of one of the devices can trigger a failure sequence. A service consists of known system functions (e.g. NFS) or of application services such as a database or web server. For each service, Felis Server lets the user define the control method. This test method is used as a further check on the state of each service in order to initiate restart or takeover sequences.
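A rough sketch of this two-level check is given below. The class and function names are hypothetical. The sketch also assumes that the user-defined test runs only when every device check has passed, which is one possible reading of "further check" rather than documented behaviour.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Check = Callable[[], bool]


@dataclass
class MonitoredService:
    name: str
    components: Dict[str, Check]  # one check per device, e.g. a disk or a NIC
    test: Check                   # user-defined test of the service itself


def failed_checks(service: MonitoredService) -> List[str]:
    """Return the names of the checks that failed, in configuration order."""
    failed = [
        label for label, check in service.components.items() if not check()
    ]
    # Assumption: the service test only makes sense once its devices are up.
    if not failed and not service.test():
        failed.append("service test")
    return failed
```

An empty result means the service is considered available. Any other result would be handed to the configured error handling, which decides between a restart and a takeover.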
Administration Shell and WEB interface
Allows interaction with the system components, executing requests through a guided menu organised by couple / host / service.
The options are:
Service Environment Manager

Provides the Service Environment with management of internal disks, external disk arrays and RAID; of network interfaces (Ethernet 10/100 Mbit/s, Token Ring, FDDI, etc.); and of application processes.

Network: Service Environment Manager co-ordinates the operations required to virtualise the network services. Virtualisation is available at the level of the IP address, or of the IP address and MAC address together.

Storage: Service Environment Manager co-ordinates the operations required for the acquisition and release of disks, instructing any additional S/W that may be involved (RAID and file system).
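To make the acquisition and release of a Service Environment concrete, the sketch below builds an ordered list of steps for each direction. The step names, the `ServiceEnvironment` fields and the ordering (storage first, then file systems, then addresses, then processes, with release in exact reverse) are assumptions chosen for illustration. The paper does not state the order Felis Server uses, and no real system commands are issued here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ServiceEnvironment:
    service: str
    volumes: Tuple[str, ...]      # disks or RAID volumes to acquire
    filesystems: Tuple[str, ...]  # mount points on those volumes
    addresses: Tuple[str, ...]    # virtual IP addresses tied to the service


def acquire_plan(env: ServiceEnvironment) -> List[str]:
    """Steps to bring a service up on this machine, in dependency order."""
    steps = [f"acquire volume {v}" for v in env.volumes]
    steps += [f"mount {fs}" for fs in env.filesystems]
    steps += [f"configure address {a}" for a in env.addresses]
    steps.append(f"start {env.service}")
    return steps


def release_plan(env: ServiceEnvironment) -> List[str]:
    """Steps to give the service up, undoing acquire_plan in reverse order."""
    steps = [f"stop {env.service}"]
    steps += [f"remove address {a}" for a in reversed(env.addresses)]
    steps += [f"unmount {fs}" for fs in reversed(env.filesystems)]
    steps += [f"release volume {v}" for v in reversed(env.volumes)]
    return steps
```

A migration would then run `release_plan` on the machine giving up the service and `acquire_plan` on the machine receiving it, so that clients keep using the same address throughout.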
Software Applications:
Service Environment Manager supports the commonly used applications; however, the system also offers an open API to any application or High Availability software system. There is existing support for the following:
Administration Shell and WEB interface
Service Environment Manager allows interaction with the system components, executing requests through a guided menu organised by couple / host / service. The options are:
Configuration

Network Requirements: At least two links between the systems. With the configuration of a virtual IP address, it is possible to obtain a link from the network to the available services.

Storage Requirements: When using simple devices (single or multiple SCSI disks), it is better to use at least two SCSI buses on each machine, separate from the one where the boot device is mounted. The disks are attached to these buses and configured as mirrored volumes with one component per bus; all components must also support multi-initiator SCSI IDs. Complex devices (RAID) that offer access to their own volumes via a separate interface for each host can be managed with only one SCSI bus. To make full use of the functionality offered by the Clariion Double Storage Processor, EasyLife offers a specific driver that interacts directly with the Storage Processors.

Operating System: Solaris 2.X on all platforms.
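The storage recommendation for simple SCSI devices can be expressed as a small layout check. This is a sketch under stated assumptions: `Disk`, `mirror_problems` and the bus labels are hypothetical, and the function only encodes the three rules given above (at least two buses, one mirror component per bus, nothing on the boot bus).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Disk:
    name: str
    bus: str  # label of the SCSI bus the disk is attached to


def mirror_problems(mirrors: Dict[str, List[Disk]], boot_bus: str) -> List[str]:
    """List the ways a mirrored-volume layout breaks the recommendations."""
    problems = []
    for volume, disks in mirrors.items():
        buses = [disk.bus for disk in disks]
        if len(set(buses)) < 2:
            problems.append(f"{volume}: components span fewer than two buses")
        if len(set(buses)) != len(buses):
            problems.append(f"{volume}: more than one component on a bus")
        if boot_bus in buses:
            problems.append(f"{volume}: component on boot bus {boot_bus}")
    return problems
```

A layout such as `{"vol0": [Disk("d0", "c1"), Disk("d1", "c2")]}` with boot bus `"c0"` yields no problems, since each half of the mirror sits on its own non-boot bus.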
STORAGEsearch is published by ACSL |