Novel Approaches to the Monitoring of Computer Networks
Chapter 5. Tracking Network Growth
The first attempt at an implementation was a single Perl program that used Perl's Net::Ping module. The program attempted to ping every host in Rhodes' class B network sequentially, once every half hour. The IP addresses of those hosts that were visible on the network were written to a text file, which served as a database of hosts that had been seen. The data obtained from this probe was also fed into a round-robin database of the type mentioned in Section 3.3.
This approach suffered from a serious limitation: each run took longer than half an hour to complete. This led to a situation where the monitoring machine was constantly sending out ICMP packets, and where consecutive runs overlapped. The overlap caused the most problems, occasionally corrupting the data file when two processes attempted to write to it simultaneously. Clearly a better solution was required.
The first improvement came from realising that the excessively long run time was caused by the process blocking while it waited for each ICMP echo reply, rather than by network saturation. By splitting Rhodes' class B network into smaller blocks, each block could be passed to a separate worker process, allowing multiple IP addresses to be queried simultaneously. This is a crude form of parallel processing, and it brings with it the associated Inter-Process Communication (IPC) difficulties.
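The idea of probing several blocks at once can be sketched as follows. This is a minimal illustration in Python rather than the original Perl, and it swaps the separate worker processes described above for a thread pool, which is sufficient when the time is spent waiting for replies. The names scan_block and scan_network are hypothetical, and the probe argument stands in for whatever ping routine is used; none of these come from the original program.

```python
from concurrent.futures import ThreadPoolExecutor
from ipaddress import ip_address, ip_network


def scan_block(block, probe):
    """Probe every address in one CIDR block; return those that answered."""
    # probe is an assumed callable: it takes an address string and
    # returns True if the host replied (for example, to an ICMP echo).
    return [str(addr) for addr in ip_network(block) if probe(str(addr))]


def scan_network(blocks, probe, max_workers=8):
    """Scan several blocks concurrently and return the live addresses, sorted."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each block is handled by one worker, so a slow or silent host
        # only delays its own block rather than the whole run.
        results = list(pool.map(lambda block: scan_block(block, probe), blocks))
    live = [addr for block_result in results for addr in block_result]
    return sorted(live, key=ip_address)
```

Because each worker returns its own list and the results are merged in one place, the workers never write to a shared file, which avoids the kind of corruption seen with overlapping runs.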
Attempts were made with various file locking and IPC methods (such as shared memory) to ensure that the data from each of the worker processes was correctly stored, and that data was not lost or corrupted. The KISS principle (Keep It Simple, Stupid) prevailed, however, and the problem of IPC and locking was offloaded to a SQL database package, namely MySQL.
MySQL was chosen because it is generally accepted to be a very fast database for cases where data integrity is not of the utmost importance [MySQL, 2001]. In this application, speed was certainly more important than integrity, since there were likely to be a large number of database INSERTs in a very short period of time. Integrity matters less than speed here because the network itself is a lossy medium: packets get lost in the normal course of operation, so the collected data is never guaranteed to be complete in any case. So long as the database performs each INSERT atomically (which database management systems generally guarantee for a single statement) and each record is limited to a single INSERT, it does not particularly matter if a relatively small number of these INSERTs fail. In practice, however, this never happened. Another reason for preferring MySQL over other database products is its ability to handle large numbers of simultaneous connections well, which is an important consideration for an application that is going to create a large number of child processes.
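The "one record, one INSERT" rule might look like the sketch below. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL so that the example is self-contained; the table name sightings, its columns, and the function record_sighting are assumptions made for illustration and do not come from the original schema.

```python
import sqlite3


def record_sighting(conn, address, seen_at):
    """Store one sighting as a single INSERT; report failure instead of raising."""
    try:
        with conn:  # commits on success, rolls back if the statement fails
            conn.execute(
                "INSERT INTO sightings (address, seen_at) VALUES (?, ?)",
                (address, seen_at),
            )
        return True
    except sqlite3.Error:
        # A lost record is tolerated, much as a lost packet would be.
        return False
```

Since each record is complete after one statement, a failed INSERT loses only that record and cannot leave a partially written one behind.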
Several experiments were done to determine the optimal number of child processes: too many would overload the machine's processor, while too few would increase the time taken to complete a run. Since the machine running this application was also used for other development work, it was important not to compromise its usability, so a longer run was preferable to a heavier load. In the end, Rhodes' class B network was divided into 128 blocks, with each block being processed by a separate worker process. These blocks were created as 23-bit networks (512 addresses each) on CIDR boundaries, allowing them to be manipulated easily by the program [RFC 1519]. This configuration seemed to give the best trade-off between run time and load.
The round-robin database was maintained even after the switch to a SQL database backend. Real-time generation of graphs for the web interface described in Section 5.3 was found to perform better when linked to an RRD than when connected directly to the MySQL database. To facilitate this, another program was developed to extract records from the MySQL database and insert them into the RRD. This program ran every half hour, offset from the data-gathering program by fifteen minutes.
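The arithmetic behind the 128 blocks can be checked with a short sketch. A class B network has a 16-bit prefix, so extending the prefix to 23 bits gives 2^(23-16) = 128 blocks of 2^(32-23) = 512 addresses each. The example below uses Python's standard ipaddress module; the address 172.16.0.0/16 is a placeholder private range, not the actual campus network, and split_into_blocks is a hypothetical helper name.

```python
from ipaddress import ip_network


def split_into_blocks(network, new_prefix):
    """Split a network into equal sub-blocks on CIDR boundaries."""
    return list(ip_network(network).subnets(new_prefix=new_prefix))


# Placeholder class B sized network (assumption, for illustration only).
blocks = split_into_blocks("172.16.0.0/16", 23)
print(len(blocks))              # 128
print(blocks[0].num_addresses)  # 512
print(blocks[1])                # 172.16.2.0/23
```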
Once the initial configuration and performance tuning had been done, this application was left to run on Rhodes' network for just over a year. The results that were gathered from this year-long run are discussed later in this chapter.
A CIDR boundary is a bit position in the binary representation of an address such that all the bits to the left of the boundary do not vary, whilst those to the right of the boundary do. This defines a range of addresses in a network. It is efficient to implement, since a simple bitwise AND with the network mask can be used to determine whether a host is on the network.
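The AND test described in this note can be illustrated with a small sketch. The function name in_block and the example addresses are assumptions chosen for illustration; the addresses come from a private range rather than the real network.

```python
from ipaddress import ip_address


def in_block(address, network_address, prefix_len):
    """Return True if an IPv4 address lies inside the given CIDR block."""
    # Build a 32-bit mask with prefix_len leading one bits.
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    # The fixed bits (left of the boundary) must match the block's.
    return (int(ip_address(address)) & mask) == (int(ip_address(network_address)) & mask)


print(in_block("172.16.3.255", "172.16.2.0", 23))  # True
print(in_block("172.16.4.0", "172.16.2.0", 23))    # False
```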