One of the uses mentioned for the application in Chapter 4 was to predict network growth. Unfortunately, the approach taken by the previous chapter was not really sufficient to provide the data required to make accurate predictions of network usage and growth. Part of the reason for this is that it does not record historical data to allow for comparisons over time.
In order to get an accurate picture of network growth and utilisation, a new approach was required. One of the key features of such an approach would be that it should keep historical information on the utilisation of various parts of the network.
At a high level, the approach taken by this application is straightforward. Every half an hour, it collects information on which hosts are available on the network and records that information to a database. Logistically, however, things are more complicated than that.
Rhodes has a class B network block assigned to it by ARIN, the American Registry for Internet Numbers. In a class B network, the first two octets of an IP address are fixed and the last two are variable. In other words, the first 16 bits of the 32-bit IP address identify the network block, and the last 16 bits identify hosts within it. This gives 65536 available IP addresses [BCP 12]. In Rhodes' class B, the first two octets correspond to the dotted-decimal representation 146.231, meaning that any address matching 146.231.*.* is considered to be on Rhodes' network.
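The address arithmetic described here can be checked with a minimal sketch using Python's standard ipaddress module. The CIDR form 146.231.0.0/16 is simply the modern notation for the same class B block; the names RHODES_BLOCK and is_on_rhodes_network are illustrative and are not taken from the application itself.

```python
import ipaddress

# The class B block in CIDR notation: 16 fixed network bits, 16 host bits.
RHODES_BLOCK = ipaddress.ip_network("146.231.0.0/16")


def is_on_rhodes_network(address: str) -> bool:
    """Return True if the address matches 146.231.*.* (illustrative helper)."""
    return ipaddress.ip_address(address) in RHODES_BLOCK


# 16 variable bits give 2 ** 16 addresses in the block.
total_addresses = RHODES_BLOCK.num_addresses

print(total_addresses)
print(is_on_rhodes_network("146.231.128.1"))
print(is_on_rhodes_network("146.232.0.1"))
```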
It was estimated from entries in Rhodes' DNS tables that about 4500 IP addresses were in use, or roughly seven percent of the available IP space. (It will be shown later in this chapter that this estimate was a little generous.) Rhodes' network is subnetted according to Tsuchiya's scheme for assigning subnet numbers [RFC 1219], which means that the IP addresses in use are scattered throughout the class B network rather than forming a contiguous block within the available IP addresses. This makes finding an IP address that is actively in use akin to finding the proverbial needle in a haystack.
The simple solution is to ping every host on the network block and see which ones respond [ping(8)].
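As a rough illustration of this idea, and not the implementation used by the application, a sweep could be sketched as below. It assumes a Linux-style ping(8) in which -c sets the number of echo requests and -W the reply timeout in seconds; other platforms interpret these flags differently. The function names are hypothetical, and the sketch probes hosts one at a time, which would be far too slow for the full block in practice.

```python
import ipaddress
import subprocess
from typing import Iterator, List


def ping_command(address: str, timeout_s: int = 1) -> List[str]:
    """Build the command line for a single echo request (Linux iputils flags assumed)."""
    return ["ping", "-c", "1", "-W", str(timeout_s), address]


def is_alive(address: str) -> bool:
    """Return True if ping exits with status 0, i.e. a reply was received."""
    result = subprocess.run(
        ping_command(address),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def sweep(network: str) -> Iterator[str]:
    """Yield every address in the block that answers a ping, one probe at a time."""
    for address in ipaddress.ip_network(network):
        if is_alive(str(address)):
            yield str(address)
```

Calling list(sweep("146.231.0.0/16")) would attempt all 65536 addresses in sequence, so a small block such as a /28 is a more sensible way to try the sketch out.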
If, in order to do a scan of the network, a single 64-byte ICMP ping packet is sent to each of the 65536 hosts, the result is four megabytes of outgoing network traffic. In addition, from the estimate above, the application can expect 4500 replies to these ICMP echo requests, adding another 280 KB to the amount of data on the network. Together, this represents a significant amount of data generated by this method, and the sheer volume of traffic led to some implementation problems, as shall be seen in Section 5.3.
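The figures in this paragraph can be reproduced with a short calculation. The sketch below assumes, as the text implies, that each echo reply is the same 64 bytes as the request; the constant and function names are illustrative only.

```python
PACKET_SIZE_BYTES = 64        # size of one ICMP echo request, from the text
ADDRESSES_PROBED = 2 ** 16    # every address in the class B block
ESTIMATED_LIVE_HOSTS = 4500   # estimate taken from the DNS tables


def scan_traffic(packet_size: int, probed: int, live: int):
    """Return (request_bytes, reply_bytes) for one full scan of the block."""
    request_bytes = packet_size * probed
    reply_bytes = packet_size * live  # assumes replies match the request size
    return request_bytes, reply_bytes


request_bytes, reply_bytes = scan_traffic(
    PACKET_SIZE_BYTES, ADDRESSES_PROBED, ESTIMATED_LIVE_HOSTS
)

print(request_bytes / 1024 ** 2)  # 4.0 (MB of outgoing requests)
print(reply_bytes / 1024)         # 281.25 (KB of replies, roughly 280 KB)
```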
For such a probe to gather useful information, it has to be repeated at regular intervals. Since this is an active probe (as explained in Section 2.2.3), there is a trade-off between usefulness and invasiveness. If the experiment is run too often, it risks saturating the low-bandwidth shared-media segments of the network. This had to be avoided, and was taken into account during the implementation of the probe.
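To give a feel for this trade-off, the sketch below estimates the total traffic generated per day at a few scan intervals. It reuses the per-scan figures from the text (64-byte packets, 65536 requests, an estimated 4500 replies) and assumes replies are the same size as requests; the intervals other than thirty minutes are purely illustrative.

```python
PACKET_SIZE_BYTES = 64
ADDRESSES_PROBED = 2 ** 16
EXPECTED_REPLIES = 4500

# Requests plus replies for one full scan of the block.
BYTES_PER_SCAN = PACKET_SIZE_BYTES * (ADDRESSES_PROBED + EXPECTED_REPLIES)


def daily_scan_bytes(interval_minutes: int) -> int:
    """Total bytes generated per day when scanning once every interval."""
    scans_per_day = (24 * 60) // interval_minutes
    return scans_per_day * BYTES_PER_SCAN


for interval in (60, 30, 10):
    megabytes = daily_scan_bytes(interval) / 1024 ** 2
    print(f"every {interval:>2} min: {megabytes:.1f} MB per day")
```

Halving the interval doubles the daily load, which is why the scan frequency has to be weighed against the capacity of the slowest segments.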
(64 × 65536) B ÷ 1024 B/KB ÷ 1024 KB/MB = 4.0 MB