Intelligent Bandwidth (white paper)

Joe Touch and Paul Mockapetris
USC/Information Sciences Institute

In today's Internet, information services such as WWW, Gopher, WAIS, etc. are oriented towards a reactive model: the Internet and its servers transport and respond to individual requests. Currently, those requests are serviced by conventional client/server systems. With the evolution of these systems, the expectations of the user interface are changing from an Internet model of "point, click, and wait" to an interactive "point, click, and action" behavior. Intelligent Bandwidth (IB) examines the tradeoffs among bandwidth, storage, processing, and the network topology itself in proactively reducing server response time.

IB is a middleware component. We consider middleware to be a system component that is not encapsulated in either the network or the conventional operating system. Middleware is also a component of applications that either is replicated across them or can only effectively manage resources when implemented as a monolithic layer. IB provides caching and replication management, which is not a networking component. It also manages multicast, bandwidth, latency, and topology, which are not conventional OS components. IB optimizes this management among all applications on a host, so it cannot be effectively replicated per application.

IB Services
-----------

Even with existing client/server implementations, the Web can use tens of megabits of bandwidth to provide an interactive response time. Reaction speeds near 100 milliseconds (0.1 sec.) are required for transparent real-time interactive use. Given simple files in the 10 Kbyte range, this requires 0.8 Mbps, exclusive of transaction processing, file retrieval, and intermediate buffering. Current measurements of Web logs indicate that simple hypertext files are near this size; graphic files are typically 60-100 Kbytes or larger, and it is not uncommon for a single page to incorporate tens of such images, so bandwidths near 5-10 Mbps would be required. We expect available end-user bandwidth to be in the 14-28 Kbps range for conventional modem access, and 128 Kbps for ISDN in the near future: all insufficient to support interactive access.

We are developing techniques to adapt client/server hypermedia systems to the available bandwidth, and to use idle bandwidth between user requests to reduce the bandwidth required for real-time interactive access. Such techniques also adapt to other overall system parameters to optimize performance proactively, rather than waiting to react to user requests. Such proactive mechanisms are also suited to supporting emerging information services such as satellite, radio, and cable, as well as supporting mechanisms for replication in support of reliability (as above).

Recent workshops have concluded that emerging Internet applications will need to be adaptive to their environment [Pa94]. Parameters that describe the environment include:

    Network parameters, e.g.:
        bandwidth, packet rate
        latency, jitter
        access method (shared vs. dedicated, polled vs. signalled)
        multicast (serial vs. parallel vs. broadcast-based)
        topology

    Local parameters, e.g.:
        RAM space
        disk space
        CPU load

    System parameters, e.g.:
        number and type of processors
        RPC overhead (marshalling, etc.)

These parameters can be tuned to reduce latency, bandwidth, storage, processing, or any other cost function (even one imposed by telcos).
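The bandwidth figures above make the need for such tuning concrete. Below is a minimal sketch (in Python; the function name, rate table, and file sizes are ours, chosen only to mirror the figures in the text) of the arithmetic behind them: the bandwidth needed to deliver a single file within the 100 ms interactive target, and the best-case delivery time over typical end-user access rates.

    # Sketch: bandwidth needed for interactive (100 ms) delivery of a single
    # file, versus nominal end-user access rates. Figures mirror those used in
    # the text; the names are illustrative only, not part of any IB system.

    INTERACTIVE_RESPONSE_S = 0.1   # ~100 ms for transparent interactive use

    def required_bandwidth_bps(file_bytes, response_s=INTERACTIVE_RESPONSE_S):
        """Raw transmission bandwidth to move file_bytes in response_s,
        exclusive of transaction processing, file retrieval, and buffering."""
        return file_bytes * 8 / response_s

    ACCESS_RATES_BPS = {
        "14.4 Kbps modem": 14_400,
        "28.8 Kbps modem": 28_800,
        "128 Kbps ISDN":   128_000,
    }

    for name, size in [("10 Kbyte hypertext file", 10_000),
                       ("100 Kbyte image", 100_000)]:
        print(f"{name}: {required_bandwidth_bps(size) / 1e6:.1f} Mbps "
              f"needed for a 100 ms response")
        for link, rate in ACCESS_RATES_BPS.items():
            print(f"    {link}: best case {size * 8 / rate:.1f} s per file")

Run as written, this reproduces the 0.8 Mbps figure for a 10 Kbyte file and roughly 8 Mbps for a 100 Kbyte image, and shows that even ISDN needs well over half a second per such file, which is why the techniques below spend idle time and storage rather than raw bandwidth.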
For example, we can use bandwidth and user-based RAM and disk space to preload a client-side cache, reducing user-perceived latency [TF94][To94]. This is useful in high-bandwidth networks as well as over modem lines, wherever unused bandwidth is available. We can alternately delay individual requests to amortize the cost of responses. Consider a cable distribution plant where back-channels are provided by modem lines (e.g., Hybrid's product). Requests can be aggregated, or logged over time windows, and frequently-accessed data can be pushed "down" the data hierarchy to client-end caches. Distribution can utilize multicast mechanisms as well. Such distribution can also adapt to out-of-band control signals; e.g., after an emergency, "official information" can be proactively broadcast to client caches. This reduces subsequent load on centralized servers, as well as providing fault tolerance.

Response Time
-------------

Response time is critical in making Web browsers acceptable interfaces for general daily use. These browsers were originally conceived as graphical user interfaces (GUIs) to Internet file transfer mechanisms, based on a slow transaction model. The Web is now being considered a true interactive distributed application, requiring response times nearer to 100 ms.

The bandwidth required to support interactive response time is enormous, even excluding other latencies such as transmission delay, disk access, and computation costs. Transmission of even a moderate hypertext page of 40 Kbytes in 100 ms requires a bandwidth of 3 Mbps. Note that the Internet backbone is only 45 Mbps currently, and that most of the Internet is limited to much lower speeds. Bandwidth to the end user is likely to be in the 20-100 Kbps range for the foreseeable future. Even at these speeds, with the conventional client/server architecture, only 15 people could use the system simultaneously (fifteen 3 Mbps transfers saturate the 45 Mbps backbone). Fortunately, memory costs, both in RAM and disk storage, have decreased to a point where we can consider methods that trade space for time or bandwidth, or consider amortizing replies in a single multicast response.

One solution to real-time response latency is source preloading of a receiver cache. When viewing a Web page, a user stays on that page for several seconds before requesting another. The workstation and network connection are otherwise idle during that time, and even inexpensive mass-market workstation disks can store 10-20 files of 100 Kbytes each in temporary space. The server is in the best position to anticipate the needs of the user, and also in the best position to determine its own load, preloading receivers when it is idle and avoiding this additional load when otherwise occupied.

At an estimated 10 links per page, and 40 Kbytes per link as before, we can send all 10 links from the server to the receiver with the same 3 Mbps bandwidth required before, if the user spends only 1 second browsing each page. As the time the user spends per page increases, the file sizes decrease, or the number of links per document decreases, the bandwidth requirement decreases as well. In this way, memory (disk storage) can be used to reduce response time.

This method also works when the "memory" is the bandwidth-delay product, i.e., the bits in transit between the source and receiver. It is thus useful when there is idle bandwidth-delay product, either as a result of a high-speed long-distance transmission line, or an otherwise idle low-speed or short-distance link.
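The preloading arithmetic above fits in a few lines. The following is a rough sketch (in Python; the function and parameter names are ours, purely illustrative) of the tradeoff: the idle bandwidth needed to push every linked file into the client cache during the user's dwell time, and the temporary space that preloading consumes.

    # Sketch of the source-preloading tradeoff described above. The names and
    # defaults are illustrative only, not part of any IB implementation.

    def preload_bandwidth_bps(links_per_page, bytes_per_link, dwell_time_s):
        """Bandwidth needed to push every linked file into the client cache
        while the user dwells on the current page (8 bits per byte)."""
        return links_per_page * bytes_per_link * 8 / dwell_time_s

    def cache_space_bytes(links_per_page, bytes_per_link):
        """Client-side temporary space consumed by one page's worth of links."""
        return links_per_page * bytes_per_link

    # The example from the text: 10 links per page, 40 Kbytes per link,
    # and a user who requests a new page every second.
    bw = preload_bandwidth_bps(10, 40_000, 1.0)
    space = cache_space_bytes(10, 40_000)
    print(f"{bw / 1e6:.1f} Mbps of idle bandwidth, "
          f"{space / 1e3:.0f} Kbytes of client cache")
    # Longer dwell times, smaller files, or fewer links per page all reduce
    # the bandwidth requirement proportionally.

Dwell time is the key lever: the longer the user reads, the less idle bandwidth preloading needs, as the next paragraph quantifies.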
If the user spends 10 seconds reading each page, only about 400 Kbps of bandwidth is required for the previous example. If we further constrain the hypertext file size to 10 Kbytes, we can accommodate 100 ms access times with current ISDN telephone links.

Memory can also be used to reduce bandwidth in a multiaccess medium, such as cable, radio, or Ethernet. Consider a video-on-demand system (or the Web home-page server), in which two users request the same video. If the requests are simultaneous, we can transmit the video once, halving the bandwidth requirements. If the requests are not simultaneous, we have to transmit the same information twice, with a time offset between the copies. Instead, we can insert some buffering into the network to reduce the probability of having to retransmit. The buffer acts as a cache, amortizing the load on the server while delivering the video at different times, at only the cost of storing the offset. This is a very effective use of buffering near the "local loop". Such techniques are also useful in maintaining replicates for reliability purposes.

References
----------

[Pa94] "Report of the ARPA/NSF Workshop on Research in Gigabit Networking," C. Partridge, Ed., Washington DC, July 20-21, 1994.

[TF94] "An Experiment in Latency Reduction," J.D. Touch and D.J. Farber, IEEE Infocom, Toronto, June 1994, pp. 175-181.

[To94] "Defining 'High Speed' Protocols: Five Challenges & an Example That Survives the Challenges," J.D. Touch, IEEE Gigabit Networking Workshop, Toronto, 1994. Also in IEEE JSAC special issue on Applications Enabling Gigabit Networks.