Keywords: zero downtime, uptime, high availability, server clustering, failover, load balancing, 99.99% reliability
Title: Clustering Solutions and Zero Downtime Hosting Pitfalls
Publishing Guidelines: You may publish this article in your newsletter, on your web site, or in your print publication provided you include the resource box at the end. Notification would be appreciated but is not required.
There are a number of benchmarks that we may use to evaluate hosting companies. One of these is reliability. Like most things in life, reliability in web hosting is typically a function of how much we are willing to spend for it. In essence, a “cost-effectiveness” equation needs to be determined and solved. Reliability can be measured as percentage availability: industry personnel talk of system availability in terms of three nines (99.9%), four nines (99.99%) or five nines (99.999%).
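To make the “nines” concrete, the short sketch below converts an availability percentage into the downtime it permits over a year. The function name and the 365-day year are illustrative assumptions, not part of any hosting standard.

```python
# Convert an availability percentage into the downtime it permits.
# Assumption: a 365-day year (8760 hours); leap years are ignored.
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(availability_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime permitted over the period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return period_hours * (1 - availability_percent / 100)


if __name__ == "__main__":
    levels = [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]
    for label, pct in levels:
        hours = allowed_downtime_hours(pct)
        print(f"{label} ({pct}%): {hours:.2f} hours/year, "
              f"or {hours * 60:.1f} minutes/year")
```

Under these assumptions, 99.9% works out to roughly 8.76 hours of downtime a year, 99.99% to about 53 minutes, and 99.999% to a little over 5 minutes, which is why each additional nine tends to cost considerably more to achieve.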
High availability can be achieved by removing, as far as possible, any single points of failure or, where this is not altogether possible, by minimizing the time spent in a “failure” situation. One of the ways in which small businesses and ISPs can reasonably avoid single points of failure is by employing server farm clustering and load-balancing solutions.

Webopedia defines server farm clustering as follows: “A server farm is a group of networked servers that are housed in one location. A server farm streamlines internal processes by distributing the workload between the individual components of the farm and expedites computing processes by harnessing the power of multiple servers. The farms rely on load-balancing software that accomplishes such tasks as tracking demand for processing power from different machines, prioritizing the tasks and scheduling and rescheduling them depending on priority and demand that users put on [...]”

It is important to note that web servers which are load-balanced in this manner typically display one external IP address to the public Internet, while using internal network IPs to communicate between the clustered servers and the load balancer.

However, this is only half of the picture. There are very important cautionary notes to keep in mind. Where web hosting is concerned, availability depends on two things:

1. Hardware reliability (RAID drives, server clustering, etc.) within the Data Center;
2. High-bandwidth Internet connectivity to the Data Center / Network Operations Center (NOC).

Now, with all your well-thought-out server clustering
solutions, what would be the result if (as recently occurred at a very high-profile web company) a fire in the network vicinity caused the entire Data Center to shut down power for hours? Or, a [...]

The ideal solution, therefore, would be to employ clustering solutions with servers in entirely different Data Centers with different bandwidth providers. Redundant Data Centers eliminate the NOC itself being a single point of failure. This scenario becomes interesting at this point, because the difficulty of addressing the potential problems [...]

We now have to deal with DNS caching, the concept of failover, and how static and dynamic web applications respond to failure events.

The terms failover and load balancing are frequently used interchangeably; however, they are in fact quite different.

· Load balancing refers to sharing the workload across servers, so that no one server is overloaded and swamped with requests.
· Failover, however, is the process that manually or automatically switches operation to a standby server or network if the primary server or bandwidth provider fails or is temporarily shut down for servicing. As such, failover software is an important component of mission-critical systems that rely on constant accessibility.
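The difference between the two ideas can be shown in a few lines. This is a minimal sketch only: the server names `web1` and `web2`, the function names, and the `healthy` dictionary are hypothetical, and a real load balancer or failover product does far more than this.

```python
from itertools import cycle
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Load balancing: rotate through every server so the work is shared."""
    return cycle(servers)


def pick_with_failover(primary: str, standby: str,
                       healthy: Dict[str, bool]) -> str:
    """Failover: stay on the primary and switch only when it is down."""
    if healthy.get(primary, False):
        return primary
    if healthy.get(standby, False):
        return standby
    raise RuntimeError("no healthy server available")


if __name__ == "__main__":
    balancer = round_robin(["web1", "web2"])  # hypothetical server names
    print([next(balancer) for _ in range(4)])
    print(pick_with_failover("web1", "web2", {"web1": False, "web2": True}))
```

With load balancing both servers are in use all the time; with failover the standby sits idle until the primary fails. Many real deployments combine the two, but they solve different problems.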
As DNS records are passed on from the original DNS servers (i.e., ns1/ns2.your-domain.com), they are cached, or stored, by several different ISPs along the way. Each DNS record has a TTL (time to live) setting assigned. By manipulating this value, it is possible to alter how long a particular DNS record/IP address combination is cached. If your site is on 2 different servers with 2 different IP addresses, you could set the TTL to a value of, say, 2 minutes.

The failover software would check server availability by “pinging” the web server every few minutes to determine whether its IP address is responding appropriately (perhaps by looking for a particular text string in a web page). If a failure is detected, the software would pull the non-working web server’s IP address out of the list of IP addresses assigned to your web site’s domain. With a TTL setting of 2 minutes, your web site should theoretically be down for just 2 minutes while DNS information switches to the other web server.

The problem with this scenario is that, while some ISPs’
caches might honor such low values, other ISPs may decide to ignore any TTLs below a certain value, say 60 minutes, to save on bandwidth utilization. So it is entirely possible that some of your visitors would see your web site while, for others, your site would be down for 1 hour.

Static, non-interactive web sites are great candidates
for server clustering, but the wicket becomes a bit sticky for dynamically generated sites. Most database software, although it has some replication capabilities, is not well suited to multiple-server master/slave relationships and real-time updating between servers. The issue can [...]

Then there is the problem of how to keep your web sites synchronized. Unix/Linux servers commonly include a synchronization tool called rsync. You can also automate the synchronizing process by setting up a cron job to run it periodically.

Your customers will also have to contend with configuring their desktop email client software with two email addresses for each email account, one on each web server, e.g. info@server1.net and info@server2.net.

It is important to realize that DNS by default hands out multiple IP addresses for the same name in a round-robin manner, so that, if you have the same web site on 2 separate servers, it is very likely that each server will get about 50% of all the web traffic.
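The DNS failover scheme described earlier (a short TTL, periodic health checks, and removal of a failed address from the round-robin list) can be sketched as follows. This is a simplified model under stated assumptions: the function names are hypothetical, the IP addresses are reserved documentation addresses, the health check is supplied by the caller, and the time needed to publish the changed DNS record is ignored.

```python
from typing import Callable, List


def healthy_addresses(addresses: List[str],
                      is_up: Callable[[str], bool]) -> List[str]:
    """Return only the addresses whose health check passes (order kept)."""
    return [ip for ip in addresses if is_up(ip)]


def worst_case_outage_minutes(record_ttl: float, check_interval: float,
                              resolver_min_ttl: float = 0) -> float:
    """Upper bound on how long a visitor may keep reaching a dead server.

    A failure can go unnoticed for up to one check interval, and a caching
    resolver may then serve the stale record for whichever is longer: the
    record's TTL or the minimum TTL that the resolver enforces.
    """
    return check_interval + max(record_ttl, resolver_min_ttl)


if __name__ == "__main__":
    records = ["192.0.2.10", "198.51.100.20"]  # documentation addresses
    # Pretend the first server has failed its health check.
    print(healthy_addresses(records, lambda ip: ip != "192.0.2.10"))
    print(worst_case_outage_minutes(record_ttl=2, check_interval=2))
    print(worst_case_outage_minutes(record_ttl=2, check_interval=2,
                                    resolver_min_ttl=60))
```

The last two calls contrast a resolver that honors the 2-minute TTL with one that enforces a 60-minute floor. In this model the second group of visitors can be sent to the failed server for about an hour, which is the pitfall the article warns about.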
Reasonably cost-effective software-based solutions may be obtained as a service or by purchasing the software yourself. Zoneedit is an example of the service model, and Simplefailover is an example of a software-based model which may be purchased on a [...]

In conclusion, at this point in time, there are several limiting factors to successfully implementing a “true” high-availability, multiple-server web hosting system. Depending on your clientele and the nature of their web sites, this may indeed be a very viable alternative. For others, simply setting up a server with high-quality components, redundant RAID hard drives and a good supply of server spare parts may be the best way [...]
--------------------------------
Godfrey Heron is the Website Manager of the Irieisle Multiple Domain Hosting Services company. Sign up for your free trial, and host multiple web sites on one account.
--------------------------------