A Tutorial on the World-Wide Web and Mosaic

This is a bit dated now...

This document assumes that you know a bit about the Internet but are pretty new to the World-Wide Web. If you prefer your material on dead trees, may I suggest Spinning The Web, available from PUBORDER. It focuses on information providers but has all the information that I would consider important.

Although the terms World-Wide Web and Mosaic are sometimes used interchangeably, it is important to understand the distinction between them. The World-Wide Web is actually the design of the interconnected documents, their formats, the protocols for transferring them, and, by extension, the actual documents, data, and facilities that use the design. Mosaic, on the other hand, is merely one implementation of a tool to present the results to a user. From this, it should be clear that there is no such thing as a Mosaic server; it is properly called a Web server.

The Goals of the World-Wide Web

The Web was created by Tim Berners-Lee to facilitate sharing information amongst high-energy physicists.

Basic goals of the design are:

Seamless interface to disparate and distributed data
Information in different formats, of different qualities, served by different methods, and created in distant places is all accessible at the click of a button.
Support of pre-existing data
The massive quantities of data on FTP, Gopher, and WAIS servers made it essential to integrate that data into the model.

The Design

By making use of simple and open standards, the Web has been allowed to mature at an incredible rate. Interoperability has been very successful and there is a wide selection of software to choose from.

Client-server
The client-server design point is important to understanding The Mechanics of the WWW.
This organization (along with the use of openly documented protocols) is key to the goal of distributed data as well as allowing the independence of data providers and consumers.
Each data provider or consumer can run whatever software he sees fit, so long as it supports the protocol. This allows for individual wants and needs.
URLs
The URL Primer will introduce you to this way of specifying a document. The URL typically specifies the scheme used to access the document, the server that will provide the document, and a path that the server will use to uniquely identify the document.
HTML
The HyperText Markup Language provides enough clues to the client software to allow it to display the document in a sensible manner on a variety of different hardware configurations. The appearance of HTML is much like BookMaster or other SGML-based markup languages. This is the native format of WWW documents.
MIME
The Multipurpose Internet Mail Extensions are an openly documented standard to facilitate the identification and transmission of many different types and encoding styles of files.
MIME was originally intended to assist in delivering binary files of various kinds across links that were not 8-bit safe. The WWW uses it mainly as an established set of file types, and as a way to communicate the type by way of headers.
HTTP
The HyperText Transfer Protocol is a lightweight protocol intended to allow a file to be requested and returned quickly. Both the request and the reply can carry a variable amount of information in addition to the filename and the actual file.
Strictly speaking, this protocol should not be necessary for the success of the WWW; however, it eliminates the limitations of using any of the existing protocols.
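
As a rough sketch of how URLs, MIME types, and HTTP fit together, the Python fragment below splits a URL into its scheme, server, and path, formats a minimal HTTP/1.0 request for it, and reads the MIME type out of a reply. The host www.example.org, the User-Agent string, the function names, and the canned reply are all made up for illustration, and no network connection is made.

```python
from urllib.parse import urlsplit


def build_request(url):
    """Split a URL and format a minimal HTTP/1.0 GET request for it."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path means the server's top document
    request = (
        f"GET {path} HTTP/1.0\r\n"
        "Accept: text/html\r\n"              # extra information sent with the request
        "User-Agent: TutorialSketch/0.1\r\n"  # hypothetical client name
        "\r\n"                                # blank line ends the request headers
    )
    return parts.scheme, parts.netloc, request


def parse_response(raw):
    """Separate a raw reply (bytes) into status line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


if __name__ == "__main__":
    scheme, server, request = build_request("http://www.example.org/docs/intro.html")
    print(scheme, server)
    print(request)

    # A canned reply standing in for what a server might send back.
    reply = b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<title>Hi</title>"
    status, headers, body = parse_response(reply)
    print(status, headers["content-type"], body)
```

The Content-Type header in the reply is where the MIME type travels; a client would use it to decide whether to render the body itself or hand it to an external viewer.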

The Data

The subjects covered on the WWW vary widely and are truly mind-boggling. You should be prepared to find mind-numbingly boring material as well as blatantly offensive material. You are encouraged to choose what you read rather than to choose what others may publish.

Tutorials
There are significant tutorials on the subject of the World-Wide Web available as well as a few on popular science topics. Many of these involve hyper-text, many of them are multi-media, and some are even interactive.
Documentation
The first thing to appear on the WWW, of course, was massive quantities of information on how to use the WWW. This has started to subside, and information on almost anything is now available.
Advertisements
An increasing portion of the WWW seems to be devoted to making a profit. Yellow-pages style lists of companies are being formed, with each company offering its wares or services.
Databases
Through the use of advanced features in the servers (not specifically spelled out by any of the network protocols), it has become popular to perform database functions and return their results rather than a static document.
Databases available include entire product lines, theatrical movie information, dictionaries, telephone books, etc.
Indexes
A specific kind of database which has proven to be very useful is one built from an index of a large group of documents (perhaps the entire World-Wide Web). The request then initiates a lookup in the index and returns a list of documents which might be appropriate.
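
The index lookup described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the document URLs and texts are invented, words are split on whitespace, and a query matches a document only if every query word appears in it. Real index services are far more elaborate.

```python
from collections import defaultdict


def build_index(documents):
    """Map each word to the set of document URLs that contain it."""
    index = defaultdict(set)
    for url, text in documents.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def lookup(index, query):
    """Return the sorted URLs of documents containing every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


if __name__ == "__main__":
    # Hypothetical documents standing in for pages gathered from the Web.
    documents = {
        "http://a.example/physics.html": "high energy physics data",
        "http://b.example/movies.html": "movie data",
    }
    index = build_index(documents)
    print(lookup(index, "physics data"))
```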

The Clients

The client or browser is the software that is run by the end user to view the data retrieved from the WWW. There are already many implementations.

Popular Features
There are a wide variety of possible features, and each client might incorporate a slightly different set.
1. Tables
  A recent addition to HTML is a method of specifying a table. The language is very flexible. Different implementations seem to allow differing levels of nesting.
2. Forms
  Fill-out forms allow the user to give feedback to the document creator or provide more information to a gateway.
3. Graphics
  Many HTML documents now specify that an image be embedded in the document. In some cases, it's convenient for an independent image to be viewable directly from the client as well.
4. External Viewers
  For unrecognized document types, it is important to be able to define the handling of the data. Sound-clips, movie-clips, etc. are examples of documents that are best handled by an external viewer.
5. Caching
  Frequently during browsing, a document is requested that was recently retrieved. A cache can speed access to those documents.
6. Authentication/Encryption
  Secure communication on the web is of rising importance. Various methods of authentication and/or encryption are being investigated.
Popular Software
1. NCSA Mosaic
  NCSA Mosaic is responsible for making the WWW popular with its embedded graphics and graphical user interface. It has fallen behind in implementing advanced features, though. It is still freeware but suffers in comparison to other clients.
2. Netscape Navigator
  Netscape was written by the original authors of NCSA Mosaic. It has become the most popular browser software and has lots and lots of features.
3. Robots, etc.
  It's important to understand that some requests do not come from people. There are robots wandering the web that collect information for indexing or for off-line delivery to people.
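
To give a feel for what a robot does with each document it fetches, here is a minimal Python sketch that pulls the hyperlinks out of an HTML page so they could be visited or indexed next. The class and function names, the base URL, and the sample page are hypothetical, and the sketch deliberately leaves out the fetching itself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the targets of <a href="..."> anchors as absolute URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag and attribute names in lower case.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the absolute URLs linked from the given HTML text."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    page = '<a href="intro.html">Intro</a> <A HREF="http://other.example/">Other</A>'
    for link in extract_links("http://www.example.org/docs/index.html", page):
        print(link)
```

A well-behaved robot would also pace its requests and respect any exclusion rules a server publishes; none of that is shown here.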

The Servers

Popular Features
1. Caching
2. Proxy
3. CGI
4. Authentication/Encryption
Popular Software
1. NCSA HTTPD
2. CERN HTTPD
3. Apache
4. FTP, etc.

Paul Chamberlain