Intelligent Bandwidth (white paper)

Joe Touch and Paul Mockapetris
USC/Information Sciences Institute

In today's Internet, information services such as WWW, Gopher, WAIS, etc. are oriented towards a reactive model: the Internet and its servers transport and respond to individual requests. Currently, those requests are serviced by conventional client/server systems. With the evolution of these systems, the expectations of the user interface are changing from an Internet model of "point, click, and wait" to an interactive "point, click, and action" behavior. Intelligent Bandwidth (IB) examines the tradeoffs among bandwidth, storage, processing, and the network topology itself in proactively reducing server response time.

IB is a middleware component. We consider middleware to be a system component that is not encapsulated in either the network or the conventional operating system. Middleware is also a component of applications that either is replicated across them or can only effectively manage resources when implemented as a monolithic layer. IB provides caching and replication management, which is not a networking component. It also manages multicast, bandwidth, latency, and topology, which are not conventional OS components. IB optimizes this management among all applications on a host, so it cannot be effectively replicated per application.

IB Services
-----------

Even with existing client/server implementations, the Web can use tens of megabits of bandwidth to provide an interactive response time. Reaction speeds near 100 milliseconds (0.1 sec.) are required for transparent real-time interactive use. Given simple files in the 10 Kbyte range, this requires 0.8 Mbps, exclusive of transaction processing, file retrieval, and intermediate buffering. Current measurements of Web logs indicate that simple hypertext files are near this size; graphic files are typically 60-100 Kbytes or larger, and it is not uncommon for a single page to incorporate tens of such images, so bandwidths near 5-10 Mbps would be required. We expect available end-user bandwidth to be in the 14-28 Kbps range for conventional modem access, and 128 Kbps for ISDN in the near future: all insufficient to support interactive access.

We are developing techniques to adapt client/server hypermedia systems to the available bandwidth, and to use idle bandwidth between user requests to reduce the bandwidth required for real-time interactive access. Such techniques also adapt to other overall system parameters to optimize performance proactively, rather than waiting to react to user requests. Such proactive mechanisms are also suited to supporting emerging information services such as satellite, radio, and cable, as well as supporting mechanisms for replication in support of reliability (as above).

Recent workshops have concluded that emerging Internet applications will need to be adaptive to their environment [Pa94]. Parameters that describe the environment include:

    Network parameters, e.g.:
        bandwidth, packet rate
        latency, jitter
        access method (shared vs. dedicated, polled vs. signalled)
        multicast (serial vs. parallel vs. broadcast-based)
        topology

    Local parameters, e.g.:
        RAM space
        disk space
        CPU load

    System parameters, e.g.:
        number and type of processors
        RPC overhead (marshalling, etc.)

These parameters can be tuned to reduce latency, bandwidth, storage, processing, or any other cost function (even one imposed by telcos).
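The bandwidth figures above make the need for such tuning concrete. Below is a minimal sketch (in Python; the function name, rate table, and file sizes are ours, chosen only to mirror the figures in the text) of the arithmetic behind them: the bandwidth needed to deliver a single file within the 100 ms interactive target, and the best-case delivery time over typical end-user access rates.

    # Sketch: bandwidth needed for interactive (100 ms) delivery of a single
    # file, versus nominal end-user access rates. Figures mirror those used in
    # the text; the names are illustrative only, not part of any IB system.

    INTERACTIVE_RESPONSE_S = 0.1   # ~100 ms for transparent interactive use

    def required_bandwidth_bps(file_bytes, response_s=INTERACTIVE_RESPONSE_S):
        """Raw transmission bandwidth to move file_bytes in response_s,
        exclusive of transaction processing, file retrieval, and buffering."""
        return file_bytes * 8 / response_s

    ACCESS_RATES_BPS = {
        "14.4 Kbps modem": 14_400,
        "28.8 Kbps modem": 28_800,
        "128 Kbps ISDN":   128_000,
    }

    for name, size in [("10 Kbyte hypertext file", 10_000),
                       ("100 Kbyte image", 100_000)]:
        print(f"{name}: {required_bandwidth_bps(size) / 1e6:.1f} Mbps "
              f"needed for a 100 ms response")
        for link, rate in ACCESS_RATES_BPS.items():
            print(f"    {link}: best case {size * 8 / rate:.1f} s per file")

Run as written, this reproduces the 0.8 Mbps figure for a 10 Kbyte file and roughly 8 Mbps for a 100 Kbyte image, and shows that even ISDN needs well over half a second per such file, which is why the techniques below spend idle time and storage rather than raw bandwidth.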
For example, we can use bandwidth and user-based RAM and disk space to preload a client-side cache, reducing user-perceived latency [TF94][To94]. This is useful in high-bandwidth networks as well as over modem lines, wherever unused bandwidth is available. We can alternately delay individual requests to amortize the cost of responses. Consider a cable distribution plant where back-channels are provided by modem lines (e.g., Hybrid's product). Requests can be aggregated, or logged over time windows, and frequently-accessed data can be pushed "down" the data hierarchy to client-end caches. Distribution can utilize multicast mechanisms as well. Such distribution can also adapt to out-of-band control signals; e.g., after an emergency, "official information" can be proactively broadcast to client caches. This reduces subsequent load on centralized servers, as well as providing fault tolerance.

Response Time
-------------

Response time is critical in making Web browsers acceptable interfaces for general daily use. These browsers were originally conceived as graphical user interfaces (GUIs) to Internet file transfer mechanisms, based on a slow transaction model. The Web is now being considered a true interactive distributed application, requiring response times nearer to 100 ms.

The bandwidth required to support interactive response time is enormous, even excluding other latencies such as transmission delay, disk access, and computation costs. Transmission of even a moderate hypertext page of 40 Kbytes in 100 ms requires a bandwidth of 3 Mbps. Note that the Internet backbone is only 45 Mbps currently, and that most of the Internet is limited to much lower speeds. Bandwidth to the end user is likely to be in the 20-100 Kbps range for the foreseeable future. Even at these speeds, with the conventional client/server architecture, only 15 people could use the system simultaneously (fifteen 3 Mbps transfers saturate the 45 Mbps backbone). Fortunately, memory costs, both in RAM and disk storage, have decreased to a point where we can consider methods that trade space for time or bandwidth, or consider amortizing replies in a single multicast response.

One solution to real-time response latency is source preloading of a receiver cache. When viewing a Web page, a user stays on that page for several seconds before requesting another. The workstation and network connection are otherwise idle during that time, and even inexpensive mass-market workstation disks can store 10-20 files of 100 Kbytes each in temporary space. The server is in the best position to anticipate the needs of the user, and also in the best position to determine its own load, preloading receivers when it is idle and avoiding this additional load when otherwise occupied.

At an estimated 10 links per page, and 40 Kbytes per link as before, we can send all 10 links from the server to the receiver with the same 3 Mbps bandwidth required before, if the user spends only 1 second browsing each page. As the time the user spends per page increases, the file sizes decrease, or the number of links per document decreases, the bandwidth requirement decreases as well. In this way, memory (disk storage) can be used to reduce response time.

This method also works when the "memory" is the bandwidth-delay product, i.e., the bits in transit between the source and receiver. It is thus useful when there is idle bandwidth-delay product, either as a result of a high-speed long-distance transmission line, or an otherwise idle low-speed or short-distance link.
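The preloading arithmetic above fits in a few lines. The following is a rough sketch (in Python; the function and parameter names are ours, purely illustrative) of the tradeoff: the idle bandwidth needed to push every linked file into the client cache during the user's dwell time, and the temporary space that preloading consumes.

    # Sketch of the source-preloading tradeoff described above. The names and
    # defaults are illustrative only, not part of any IB implementation.

    def preload_bandwidth_bps(links_per_page, bytes_per_link, dwell_time_s):
        """Bandwidth needed to push every linked file into the client cache
        while the user dwells on the current page (8 bits per byte)."""
        return links_per_page * bytes_per_link * 8 / dwell_time_s

    def cache_space_bytes(links_per_page, bytes_per_link):
        """Client-side temporary space consumed by one page's worth of links."""
        return links_per_page * bytes_per_link

    # The example from the text: 10 links per page, 40 Kbytes per link,
    # and a user who requests a new page every second.
    bw = preload_bandwidth_bps(10, 40_000, 1.0)
    space = cache_space_bytes(10, 40_000)
    print(f"{bw / 1e6:.1f} Mbps of idle bandwidth, "
          f"{space / 1e3:.0f} Kbytes of client cache")
    # Longer dwell times, smaller files, or fewer links per page all reduce
    # the bandwidth requirement proportionally.

Dwell time is the key lever: the longer the user reads, the less idle bandwidth preloading needs, as the next paragraph quantifies.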
If the user spends 10 seconds reading each page, only about 400 Kbps of bandwidth is required for the previous example. If we further constrain the hypertext file size to 10 Kbytes, we can accommodate 100 ms access times with current ISDN telephone links.

Memory can also be used to reduce bandwidth in a multiaccess medium, such as cable, radio, or Ethernet. Consider a video-on-demand system (or the Web home-page server), in which two users request the same video. If the requests are simultaneous, we can transmit the video once, halving the bandwidth requirements. If the requests are not simultaneous, we have to transmit the same information twice, with a time offset between the copies. Instead, we can insert some buffering into the network to reduce the probability of having to retransmit. The buffer acts as a cache, amortizing the load on the server while delivering the video at different times, at only the cost of storing the offset. This is a very effective use of buffering near the "local loop". Such techniques are also useful in maintaining replicates for reliability purposes.

References
----------

[Pa94] "Report of the ARPA/NSF Workshop on Research in Gigabit Networking," C. Partridge, Ed., Washington DC, July 20-21, 1994.

[TF94] "An Experiment in Latency Reduction," J.D. Touch and D.J. Farber, IEEE Infocom, Toronto, June 1994, pp. 175-181.

[To94] "Defining 'High Speed' Protocols: Five Challenges & an Example That Survives the Challenges," J.D. Touch, IEEE Gigabit Networking Workshop, Toronto, 1994. Also in IEEE JSAC special issue on Applications Enabling Gigabit Networks.