AppsHosting


What is High Availability?

You are welcome to use AppsHosting documentation in your businesses and websites, but we would really appreciate it if you maintain the links to AppsHosting!

At AppsHosting, we define High Availability as several levels of data protection that together help increase application uptime, summarized as follows:

1. Effective Backup/Recovery – This is the single most important aspect of high availability.

· Scheduled rsync over SSH/VPN, and RMAN backups for the database.

· O/S dumps (on a periodic basis) – For fast recovery in the event of O/S failure, using native Unix methods (ufsdump, vxdump, etc.).

· Database backups (online or offline, using Oracle tools or the Veritas NetBackup agent) – Using the NetBackup agent eliminates the need for “disk-to-disk” backups and allows for full and incremental backups.

· Backup validation needs to occur on a regular basis.

· Offsite tape storage with defined tape rotation and retention schedules.
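
The scheduled rsync and backup validation steps above might be sketched as follows. This is a minimal illustration and not AppsHosting's actual tooling: the function names, host name, and paths are hypothetical, and comparing SHA-256 checksums of an original file and its restored copy is only one assumed way to validate a backup.

```python
import hashlib
from pathlib import Path


def build_rsync_command(source_dir: str, remote_host: str, remote_dir: str) -> list:
    """Build an rsync-over-SSH command line (host and paths are hypothetical)."""
    return [
        "rsync",
        "-az",        # archive mode, compress data during transfer
        "--delete",   # remove target files that no longer exist at the source
        "-e", "ssh",  # use SSH as the transport
        source_dir.rstrip("/") + "/",  # trailing slash: copy the directory contents
        f"{remote_host}:{remote_dir}",
    ]


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums_match(original: Path, restored: Path) -> bool:
    """Validate a restored file by comparing its checksum with the original."""
    return sha256_of(original) == sha256_of(restored)
```

The command list could be passed to a scheduler or to `subprocess.run` from a cron job; the validation function would be run against a test restore rather than against the backup media directly.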

2. Hardware/Software RAID - For protection against disk failure

· Hardware RAID is configured on the SAN storage controller. Typical RAID levels include RAID 1+0 (for high performance and high availability), RAID 1 (for high availability), and RAID 5 (for cost reduction and data protection).

· Software RAID is configured using Veritas Volume Manager or Sun Volume Manager. These tools allow for efficient management of disk volumes.
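
To make the cost trade-off between the RAID levels above concrete, here is a small sketch of the usable capacity each level yields. The function name and the disk counts are illustrative assumptions; RAID 1+0 is assumed to use two-way mirrors, and all disks are assumed to be the same size.

```python
def usable_capacity_gb(raid_level: str, disk_count: int, disk_size_gb: float) -> float:
    """Return usable capacity for a RAID set of identical disks (illustrative only)."""
    if raid_level == "1":
        if disk_count < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        # Every disk holds a full copy, so capacity equals one disk.
        return disk_size_gb
    if raid_level == "1+0":
        if disk_count < 4 or disk_count % 2 != 0:
            raise ValueError("RAID 1+0 needs an even number of disks, at least 4")
        # Striped two-way mirrors: half of the raw capacity is usable.
        return disk_count / 2 * disk_size_gb
    if raid_level == "5":
        if disk_count < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # One disk's worth of space is consumed by parity.
        return (disk_count - 1) * disk_size_gb
    raise ValueError(f"unsupported RAID level: {raid_level}")
```

With four 300 GB disks, for example, this gives 600 GB for RAID 1+0 and 900 GB for RAID 5, which is why RAID 5 is listed as the cost-reduction option.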

3. File System Protection

· File systems need to be configured for quick recovery in the event of a system crash. This can be accomplished by using Sun’s built-in file system journaling option, by utilizing ZFS, or by using Veritas File System. Veritas File System and ZFS are efficient choices, as they allow for online management of file systems, cloning, etc., and provide efficient crash-recovery mechanisms (journaling in Veritas File System, copy-on-write transactions in ZFS). These file systems also allow for file system expansion. Under Linux, the 'ext3' file system provides similar advantages.

4. Hardware Server Redundancy

· Clustering – For a clustered environment, Veritas Cluster Server or Sun Cluster software can be used to create a load-balanced, highly available environment. Clustering requires additional overhead in terms of hardware, software, and administration.

· Hot Standby Environment - An alternative to clustering is a hot standby server that will act as a replacement for an unavailable server. Properly configured, a hot-standby server could potentially replace an unavailable server within minutes, assuming there is no SAN data or database corruption. This is a simpler solution than clustering that can provide quick recovery, but downtime will nevertheless be incurred.

· Failover Environment – Another alternative to clustering is server failover. Our Oracle 11i servers are configured for failover. Any Oracle 11i server can act as a failover server for another. Properly configured, a failover server could potentially replace an unavailable server within minutes, assuming there is no SAN data or database corruption. This is a simpler solution than clustering that can provide quick recovery, but downtime will nevertheless be incurred.
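
The failover idea above, where any server can stand in for another, might be sketched as a simple heartbeat check. This is an assumption-laden illustration rather than how Veritas Cluster Server, Sun Cluster, or AppsHosting's own failover actually works: the server names, the 30-second timeout, and both function names are hypothetical.

```python
def unavailable_servers(last_heartbeat: dict, now: float, timeout_s: float = 30.0) -> list:
    """Return the servers whose last heartbeat is older than the timeout."""
    return sorted(
        name for name, seen in last_heartbeat.items() if now - seen > timeout_s
    )


def pick_failover_target(failed: str, last_heartbeat: dict, now: float,
                         timeout_s: float = 30.0):
    """Return a healthy peer to take over for `failed`, or None if none is healthy."""
    down = set(unavailable_servers(last_heartbeat, now, timeout_s))
    candidates = sorted(
        name for name in last_heartbeat if name != failed and name not in down
    )
    return candidates[0] if candidates else None
```

A real deployment would also need to confirm that the shared SAN data and database are intact before bringing services up on the replacement server, as noted above.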

5. Effective Monitoring and Notification

· Efficient monitoring and notification tools could potentially prevent a system problem. The most effective notifications involve automatic paging of system/database administrators in the event of an alert.

· An “on-call” schedule for system administrators with appropriate escalation procedures is essential for quick diagnosis and recovery.
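
An escalation procedure of the kind described above can be expressed as a small lookup. The sketch below assumes a fixed escalation interval and an ordered list of contacts; the role names, the 15-minute step, and the function name are all hypothetical.

```python
def who_to_page(minutes_unacknowledged: float, escalation_chain: list,
                step_minutes: float = 15.0) -> str:
    """Return the contact to page for an alert that has gone unacknowledged.

    The alert moves one level up the chain every `step_minutes` and stays
    with the last contact once the chain is exhausted.
    """
    if not escalation_chain:
        raise ValueError("escalation chain must not be empty")
    level = max(0, int(minutes_unacknowledged // step_minutes))
    return escalation_chain[min(level, len(escalation_chain) - 1)]
```

For a chain of a primary administrator, a secondary administrator, and a manager, an alert that has been open for 20 minutes would go to the secondary administrator under these assumptions.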

6. Access to Systems

· During an outage, systems need to be easily accessible for quick recovery. Whether system administrators are onsite or offsite, access to the Sun console is required. This can easily be accomplished with a serial console switch and VPN access, or with a LOM (Lights Out Management) device.

7. The Next Level of High Availability

· The final component of high availability involves GSLB, or Global Server Load Balancing. To implement GSLB, we use a multiple data center model, with application and database replication using Oracle Data Guard and rsync over SSH.

· DNS is configured such that rapid failover takes place in a Disaster Recovery scenario.
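
The DNS failover behaviour described above can be pictured as choosing which data center's address to hand out, based on health checks. This is only a sketch under stated assumptions: the site names, the documentation-range IP addresses, the 60-second TTL, and the function name are hypothetical, and a real GSLB product makes this decision inside the DNS service itself.

```python
FAILOVER_TTL_SECONDS = 60  # assumed low TTL so clients re-resolve quickly after failover


def select_dns_answer(sites: list, healthy: set):
    """Return the DNS answer for the highest-priority healthy site.

    `sites` is a list of (name, ip) pairs in priority order; `healthy` is the
    set of site names that currently pass their health checks.
    """
    for name, ip in sites:
        if name in healthy:
            return {"site": name, "ip": ip, "ttl": FAILOVER_TTL_SECONDS}
    return None  # no healthy site: nothing safe to answer with
```

The short TTL is the design choice that makes failover rapid: cached answers expire quickly, so clients pick up the disaster recovery site's address soon after the primary fails its health checks.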

Summary

We believe that these practices, when properly implemented, can give a business a high level of assurance that its business-sustaining applications will remain available during unexpected events.

Please contact AppsHosting with any questions, comments, or suggestions, as we constantly strive for improvement in these areas.

 


 
Copyright © 2006 AppsHosting, Inc. | All rights reserved