Network latency considerations

Network latency can impact the performance of NFM-P. The following are known impacts of latency between the various NFM-P components:

Because SNMP communication with a single network element is synchronous, the impact of latency is directly proportional to the number of SNMP gets and responses. Operations to a network element with a round trip latency of 50 ms take ten times longer on the network than operations to a network element with a round trip latency of only 5 ms. For example, if a specific operation requires NFM-P to send 1000 SNMP gets to a single network element, NFM-P spends a total of 5 seconds sending and receiving packets when the round trip latency to the network element is 5 ms. The time that NFM-P spends sending and receiving the same packets increases to 50 seconds if the round trip latency increases to 50 ms.

Network element re-sync can be especially sensitive to latency as the number of packets exchanged can number in the hundreds of thousands. For example, if a re-sync consists of the exchange of 100 000 packets (50 000 gets and 50 000 replies), 50 ms of round trip latency would add almost 42 minutes to the overall re-sync time and 100 ms of round trip latency would add almost 84 minutes to the overall re-sync time.
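
The arithmetic above can be reproduced with a short calculation. The following is a minimal sketch, not an NFM-P tool; it counts only network transmission time and ignores node and NFM-P processing time:

    # Estimated network transmission time for synchronous SNMP exchanges,
    # as a function of the number of round trips and the round trip latency.
    def snmp_transmission_time_s(round_trips: int, rtt_ms: float) -> float:
        return round_trips * rtt_ms / 1000.0

    # 1000 SNMP gets to a single network element
    print(snmp_transmission_time_s(1000, 5))    # 5.0 seconds at 5 ms RTT
    print(snmp_transmission_time_s(1000, 50))   # 50.0 seconds at 50 ms RTT

    # Re-sync of 100 000 packets = 50 000 gets + 50 000 replies = 50 000 round trips
    print(snmp_transmission_time_s(50_000, 50) / 60)   # ~41.7 minutes at 50 ms RTT
    print(snmp_transmission_time_s(50_000, 100) / 60)  # ~83.3 minutes at 100 ms RTT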

NFM-P can use a proprietary mechanism, TCP Streaming, to discover and resync specific node types and versions; this can dramatically reduce discovery and resync times for network elements with high network latency. TCP Streaming is supported on the following network element types, on the releases that support streaming:

Geographical redundancy of NFM-P components

Ideally, all NFM-P stations and NFM-P XML-API clients are collocated within a geographical site on a high-availability network to avoid the impact of network latency.

In cases where geographic redundancy is configured, all active NFM-P stations (NFM-P server, NFM-P auxiliaries, and NFM-P database) should be located within one geographical site on a high-availability network, so that the network latency between components remains below 1 ms. When an NFM-P component (server, auxiliary, or database) switchover or failover occurs, manual intervention may be required to realign the stations on the same geographical site and minimize the performance impact of network latency. This task can be automated by enabling the database alignment feature within NFM-P.

NFM-P has been tested with up to 250 ms of geographic latency. Specifically for the NFM-P database, Oracle does not provide any guidance on latency, other than adjusting TCP socket buffer sizes. If the NFM-P deployment includes the NSP auxiliary database, the latency between the active NFM-P auxiliary statistics collectors and the NSP auxiliary database must be less than 200 ms, which effectively reduces the tested geographic redundancy limit from 250 ms to 200 ms.

Optimizing throughput between NFM-P components

In high-speed, high-latency networks, the TCP socket buffer size controls the maximum network throughput that can be achieved. A TCP socket buffer that is too small limits network throughput, even when the available bandwidth would support much higher transfer rates.

Adjusting the TCP socket buffer size to achieve optimal network throughput may be necessary if the network bandwidth is more than 10 Mbps and the round trip latency is higher than 25 ms.

The optimal TCP socket buffer size is the bandwidth-delay product (BDP). The bandwidth-delay product is a combination of the network bandwidth and the latency, or round-trip time (RTT); that is, it is the maximum amount of data that can be in transit on the network at any given time.

For example, given a 20 Mbps network with an RTT of 40 ms, the optimal TCP socket buffer size is computed as follows:

BDP = 20 Mbps * 40 ms
    = 20,000,000 bits/s * 0.04 s
    = 800,000 bits
    = 800,000 bits / 8 bits per byte
    = 100,000 bytes

socket buffer size = BDP = 100,000 bytes
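
The same calculation can be scripted as a quick sanity check. This is a minimal sketch (the function name and units are assumptions, not an NFM-P utility):

    # Bandwidth-delay product: bandwidth in Mbps, round trip time in ms,
    # result in bytes (the optimal TCP socket buffer size).
    def bdp_bytes(bandwidth_mbps: float, rtt_ms: float) -> float:
        bits_in_flight = bandwidth_mbps * 1_000_000 * (rtt_ms / 1000.0)
        return bits_in_flight / 8  # convert bits to bytes

    print(bdp_bytes(20, 40))  # 100000.0 bytes, matching the example above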

See the RHEL documentation for information about how to modify the TCP socket buffer size and ensure that the change is persistent.
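
On RHEL, the relevant limits are standard Linux sysctl tunables. The following is an illustrative example only, with the per-socket maximums set to the 100,000-byte BDP computed above; the values for your deployment must come from your own BDP calculation, and the exact procedure from the RHEL documentation:

    # Example /etc/sysctl.d/ drop-in; illustrative values, not NFM-P defaults.
    # The third field of tcp_rmem/tcp_wmem is the per-socket maximum in bytes.
    net.core.rmem_max = 100000
    net.core.wmem_max = 100000
    net.ipv4.tcp_rmem = 4096 87380 100000
    net.ipv4.tcp_wmem = 4096 16384 100000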

Note that increasing the TCP socket buffer size directly increases the amount of system memory that each socket can consume. When tuning the TCP socket buffer size at the operating system level, ensure that the available system memory can support the expected number of network connections with the new buffer size; for example, 1000 connections that each fill a 100,000-byte buffer consume roughly 100 Mbytes for socket buffering alone.

Additional NFM-P database throughput optimizations

In addition to the optimizations above, the NFM-P database station requires changes to the sqlnet.ora and listener.ora files in the oracle/network/admin directory. The lines containing SEND_BUF_SIZE and RECV_BUF_SIZE should be uncommented (delete the leading "#" character) and set to three times the BDP value calculated above. The database must be shut down while this change is made.
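
Using the 100,000-byte BDP example above, three times the BDP is 300,000 bytes, so the uncommented lines in both files would read as follows (illustrative value; substitute three times your own BDP):

    SEND_BUF_SIZE=300000
    RECV_BUF_SIZE=300000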