Tuesday, May 26, 2009

Web Hosting

Definition and Overview

Definition
The World Wide Web (WWW), a web of worldwide servers connected to the Internet, provides an easily used and understood method of accessing electronic content. Accessing information requires data communication between a Web-browser client and a Web-server application. Web hosting, then, is a means of hosting the Web-server application on a computer system through which electronic content on the Internet is readily available to any Web-browser client.

Overview
This tutorial will provide a basic overview of the main components that enable the Web, present two basic methods of Web hosting known as dedicated and shared, and discuss the challenges of resource management.

1. Overview of the Web
In late 1990, while working at CERN, the European Laboratory for Particle Physics Research in Geneva, Switzerland, Tim Berners-Lee invented the Web, including the definitions of the uniform resource locator (URL), hypertext transfer protocol (HTTP), and hypertext markup language (HTML). The Web provides a method for easily linking content contained on computer systems distributed throughout the world and connected to the Internet. Using the Web, content on servers in many locations can be seamlessly linked and presented as a comprehensive resource collection. The Web further strengthens the power of the Internet's foundation of distributed computing.

The Web and the Internet remained the world's best-hidden resource until 1993, when Marc Andreessen, an undergraduate at the University of Illinois at Urbana-Champaign, and a team at the National Center for Supercomputing Applications (NCSA) created the NCSA Mosaic browser. The NCSA Mosaic browser was the first Web-browser client that provided a friendly, point-and-click method for navigating the Internet using the Web.

The invention of the NCSA Mosaic browser was the start of the unprecedented growth of Internet users, Internet service providers (ISPs), and Internet business opportunities. By means of a user-friendly approach to searching and viewing the vast amount of information on the Internet, the Web-browser client enabled nontechnical individuals to benefit from the power and resources of the Internet.

Accessing content through the Web consists of communication between a Web-browser client and a Web server utilizing HTTP (see Figure 1).


Figure 1. Web Overview

The following is a step-by-step description of the communication path, as shown in Figure 1. It assumes that the Web server, the primary domain name system (DNS) server, and the client computer are connected to the Internet and that all communication is conducted through the Internet.

  • steps 1 and 2: The end-user types a URL into the Web browser. The client computer asks the primary DNS server for the Internet protocol (IP) address associated with the domain name in the URL.
  • steps 3 and 4: The client computer uses the IP address obtained from the primary DNS server to request, through HTTP, the default HTML file from the Web server associated with the URL. The Web server sends the default HTML file to the client computer. The default HTML file tells the client computer which associated files (such as graphics) to request to assemble the Web site's complete home page.

When the client computer and Web browser request and receive further files from the same domain name, the client computer is not required to repeat the DNS lookup described in steps 1 and 2. When the client computer attempts to retrieve a Web site from a different domain name, the client computer must perform steps 1 and 2 again.

2. Overview of Web Hosting
The complex web of servers consists of computer systems installed with Web-server software and connected to the Internet. These servers can be found in any facility with Internet connectivity. The process of maintaining and operating one of these servers is called Web hosting. Web hosting can be conducted in-house by the owner of the Web site, or it can be outsourced to a Web presence provider (WPP).

WPPs are typically companies with one or more data-center facilities that are connected to the Internet. Web hosting provided by WPPs can vary widely with respect to service quality and cost. Some providers consist simply of a room in the basement of a house and a T1 (T-carrier level 1) line connected to the local ISP. Others, however, are corporations with state-of-the-art hosting centers consisting of redundant fiber paths for high-speed Internet connections, redundant electrical power sources, a dry-pipe fire-suppression system, and an experienced operations group, available 24 hours a day, seven days a week.

Web hosting can be provided on a shared computer environment or on a dedicated computer system. When a Web site consists only of standard HTML code and receives a small number of visitors, shared hosting service is the best solution. When a Web site consists of complex common gateway interface (CGI) scripts and proprietary programs and receives a large number of visitors, dedicated hosting service is the best solution.

3. Web-Hosting Implementation on a Dedicated Platform
The basic concept of Web hosting on a dedicated computer system consists of hosting one Web site on one computer system. The dedicated environment offers complete flexibility and security to both the WPP and the customer.

Web hosting on a dedicated computer system is the simplest and most straightforward method of operating a Web site. Because the computer system contains only one Web site, the configuration of software is standardized, as outlined in the software-installation documentation. Furthermore, system resources are dedicated to only one Web site and, therefore, are not constrained by any other process not associated with the operations of that site.

The essential components of Web hosting on a dedicated computer system are as follows (see Figure 2):

  • computer system hardware
  • operating system (including transmission control protocol [TCP]/Internet protocol [IP] stack)
  • Internet connection (IP address and domain name)
  • Web server software (HTTP)


Figure 2. Dedicated Hosting Basic Elements

Additional software applications can be added to the computer system to enhance the Web site and to simplify the process of uploading content. One of these applications is a file transfer protocol (FTP) server for remote access to the computer system for transferring HTML content files.
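
As a rough illustration of the "Web server software" element, the sketch below uses Python's standard `http.server` module to serve one site's files from one directory. The function names and the choice of address and port are assumptions made for this example; a production dedicated host would run full Web-server software rather than this module.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_site_server(document_root, address="127.0.0.1", port=0):
    """Create an HTTP server that serves one site's files from `document_root`.

    port=0 asks the operating system for any free port; a public site
    would normally listen on port 80.
    """
    handler = functools.partial(SimpleHTTPRequestHandler, directory=document_root)
    return ThreadingHTTPServer((address, port), handler)


def serve_in_background(server):
    """Run the server's request loop in a daemon thread and return the thread."""
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return thread
```

A request for `/` returns the directory's `index.html` if one exists, which corresponds to the default HTML file described in section 1.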

4. Web-Hosting Implementation on a Shared Platform
The basic concept of Web hosting on a shared computer environment consists of hosting many different Web sites on one computer system. The shared environment offers economic benefits to both the WPP and the customer. Because the Web-hosting environment is the same for all customers, the provider gains economic benefits from allocating portions of the total cost of the hardware, software, maintenance and operation, and customer support amongst all customers. Therefore, the total fixed cost is less on a per-customer basis than with dedicated hosting. The customer gains economic benefit by the reduced price of the Web-hosting service.

The essential components of Web hosting on a shared computer environment are the same as with dedicated hosting, except for the configuration of the software and the management of system resources. There are two basic ways to configure Web-server software for multiple Web sites. The first method is to run a single Web server whose configuration file contains each Web site's specific configuration information. The second method is to run multiple instances of the Web-server software on a single computer system. The first method has greater scalability but does not provide a means of limiting the resources consumed by each Web site. Therefore, a combination of both methods is ideal for creating a scalable shared-hosting service: the single-configuration-file method serves Web sites requiring small amounts of resources, and the multiple-Web-server method limits the resources consumed by Web sites that demand large amounts of resources.
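
One common way to realize the single-configuration method is name-based virtual hosting, in which the server chooses a site from the Host header of each HTTP request. The tutorial does not say which mechanism it has in mind, so the sketch below is an assumption, and the site table, paths, and function name are hypothetical.

```python
# Hypothetical site table: one entry per hosted Web site.
SITES = {
    "www.example.com": "/var/www/example.com",
    "www.example.org": "/var/www/example.org",
}

# Served when the Host header matches no configured site.
DEFAULT_ROOT = "/var/www/default"


def document_root_for(host_header):
    """Pick a site's document root from the HTTP Host header.

    Any ":port" suffix is dropped and the name is compared
    case-insensitively, since host names are not case-sensitive.
    """
    host = host_header.split(":", 1)[0].strip().lower()
    return SITES.get(host, DEFAULT_ROOT)
```

Adding a customer is then a one-line change to the table, which is why this method scales well. Nothing in it limits how much CPU or memory any one site consumes, which is the weakness the paragraph above points out.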

When a Web site demands large amounts of system resources, the logical next step is to move the Web site to a dedicated computer system (i.e., dedicated hosting).

5. Web Hosting–Resource Management Challenges
Managing computer-system resources in the shared platform and the dedicated platform is challenging. As a Web site becomes more popular and is sought after by millions of Internet users, the Web site demands more and more system resources. Being able to measure, monitor, and manage the amount of system resources is essential for Web-site availability and server performance.

Critical system resources to manage include the following:

  • central processing unit (CPU) utilization
  • memory utilization
  • disk-swap space
  • disk space
  • disk input and output
  • network input and output
  • Internet bandwidth (not a computer-system resource but still requires monitoring and managing)

These critical system resources have a direct relationship with the performance of a specific Web site. A Web site can be created or modified to minimize the demand on these system resources, but some Web sites are developed without any consideration of system-resource utilization. When a Web site contains and executes a CGI script, it consumes CPU resources. If the Web site contains a large number of CGI scripts and requires these scripts to be executed for every Web-site visitor, then the CPU becomes a major bottleneck and causes the Web site to appear slow. It is important for the Web-site designer and developer to balance system-resource demands with Web-site functionality and creativity.

To measure, monitor, and manage the computer-system resources, additional software must be installed on the computer system. Each type of computer-system hardware requires specific software for resource management. The computer-system manufacturer and the operating-system software developer should be able to identify the necessary software applications for measuring, monitoring, and managing the system resources for their specific computer systems.
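
To make the idea of measuring and checking resources concrete, here is a minimal sketch that uses only Python's standard library. It covers just two of the resources listed above (disk space and CPU load), and the resource names, limits, and function names are assumptions for illustration, not a substitute for the vendor tools the text refers to.

```python
import os
import shutil


def disk_used_fraction(path="/"):
    """Return the fraction (0.0 to 1.0) of disk space in use at `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total if usage.total else 0.0


def sample():
    """Take one reading of the resources this sketch knows how to measure."""
    readings = {"disk_space": disk_used_fraction()}
    try:
        # 1-minute load average divided by core count (Unix-like systems only).
        readings["cpu_load_per_core"] = os.getloadavg()[0] / (os.cpu_count() or 1)
    except (AttributeError, OSError):
        pass  # load average is not available on this platform
    return readings


def over_limit(readings, limits):
    """Return the sorted names of resources whose reading exceeds its limit."""
    return sorted(
        name for name, value in readings.items()
        if name in limits and value > limits[name]
    )
```

A monitoring job could call `over_limit(sample(), {"disk_space": 0.9, "cpu_load_per_core": 1.0})` at regular intervals and alert the operations group whenever the returned list is not empty.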

6. Advanced Web-Hosting Methods
During the last several years, Web hosting has evolved from simple one-computer system architectures to redundant, load-balanced server farms. A server farm is a network of computer systems. As a Web site demands more and more system resources, the traditional hosting environment is constrained by the limited amount of available resources. There are two basic means of providing more resources: a larger computer system or a distributed computer environment. To provide redundancy and scalability, the distributed computer environment is the preferred method of expanding system resources.

The simplest distributed computer environment consists of two identical Web servers on the same local-area network (LAN) with a load-balancing device (see Figure 3). The load-balancing device is the gateway for all traffic entering and leaving the Web servers. The load balancer directs incoming traffic to the best-performing Web server, which helps relieve resource bottlenecks on any one server. With the load balancer as the gateway, the two Web servers appear as one large computing environment to all end-users on the Internet. This simple distributed computer environment can be expanded to accommodate more Web servers, providing greater scalability and more consistent performance.


Figure 3. Load Balancing Two Web Servers
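
The load balancer's choice of the "best performing" Web server can be sketched as follows. The tutorial does not say how performance is measured, so this example assumes a least-connections rule: among healthy servers, pick the one handling the fewest requests. The `Backend` class and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """One Web server behind the load balancer (illustrative fields only)."""
    name: str
    active_connections: int = 0
    healthy: bool = True


def pick_backend(backends):
    """Return the healthy backend with the fewest active connections."""
    candidates = [b for b in backends if b.healthy]
    if not candidates:
        raise RuntimeError("no healthy Web server available")
    return min(candidates, key=lambda b: b.active_connections)
```

Because unhealthy servers are skipped, the same rule also gives the redundancy described above: if one of the two Web servers fails, all traffic goes to the other.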

The simple distributed-computer environment provides a method for increasing the available computer-system resources, but it will not prevent performance problems associated with specific network issues within the LAN or with the Internet connection at that specific location. To overcome local network problems, Web hosting has continued to evolve into a geographically distributed computing–environment architecture.

By distributing the traffic of a Web site across multiple servers located in dispersed geographic locations, system resources can be added without interruptions in the Web-hosting service, and the Web site can remain available despite LAN or Internet-connection problems at any one location. Moreover, with intelligent wide-area network (WAN) load balancing, each visitor can be directed to a well-performing site, which improves Web-site performance for visitors regardless of their geographic location.

Figure 4 illustrates Web hosting in a geographically distributed computing environment.


Figure 4. Two Site Architectures
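
A WAN load-balancing decision for the two-site arrangement can be sketched in the same spirit. What makes the balancing "intelligent" is not specified in the tutorial; the example below assumes that each visitor is sent to the reachable site with the lowest measured round-trip time. The site names and function name are hypothetical.

```python
def pick_site(latency_ms, site_up):
    """Return the reachable hosting site with the lowest latency.

    latency_ms: dict mapping site name -> round-trip time in milliseconds,
                as measured for the current visitor
    site_up:    dict mapping site name -> True if the site is reachable
    """
    reachable = {
        site: ms for site, ms in latency_ms.items() if site_up.get(site, False)
    }
    if not reachable:
        raise RuntimeError("no hosting site is reachable")
    return min(reachable, key=reachable.get)
```

For a visitor who measures 20 ms to a hypothetical "east" site and 90 ms to "west", the function returns "east". If "east" loses its Internet connection, the same visitor is sent to "west" instead.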


Glossary
CGI
common gateway interface

DNS
domain name system

FTP
file transfer protocol

HTML
hypertext markup language

HTTP
hypertext transfer protocol

SSL
secure sockets layer

URL
uniform resource locator

