Sonoran Systems
Article KB0005 - Dated: 28-May-2004
Problem:
Occasionally the ArTrac Fault Management system misses an alarm or shows a hanging alarm (an alarm that does not clear even though the network element is no longer in alarm).
Solution:
There are three common causes of this problem. They are:
Network Failures - ArTrac relies on the network to which it is connected to faithfully deliver event messages from the monitored network elements. Frame slips, data collisions, data switch and router failures, data corruption, and a host of other transient network problems can compromise the delivery of messages to ArTrac. Depending on the design and maintenance of the network, these types of problems may be very infrequent or may occur quite often.
To identify possible network problems, you should perform the following:
Transmit Buffer Congestion in the Network Element - Depending on the design of the network element, the transmit buffer may be inadequate for the volume of messages that the network element must pass to the network. In such cases, the network element may discard messages in overflow conditions. Of course, if an important message is discarded, it will never reach the ArTrac system and may result in an alarm being missed or in an alarm hanging (the clearing message wasn't received). This is not the fault of the ArTrac system or the network but rather the network element itself.
To identify possible buffer congestion problems in the network element, check the following:
Many switches, including those from Lucent, Nortel, Motorola, and Ericsson, can develop buffer congestion problems. Open the Connection Manager administration screen and set the "Routing Type" to Analysis and File for any connectors that establish connectivity for the network element you are having a problem with. Allow the ArTrac system to record all network communications for that network element to file for one or more days. After the recording period, open the recording file(s) in an editor and search for either of the following:
- Most network elements place a sequence number somewhere in the transmitted message. Thus, each successive message received from the network element should have a sequence number that is one higher than that of the previous message. Check the contents of the recording file for any missing messages, that is, locations in the file where the sequence number skips one or more numbers. If any such occurrences are found, the network element is likely experiencing transmit buffer congestion.
- Search the recording file for any messages like "X messages discarded." or any other messages that the network element may transmit to indicate transmit buffer congestion. If any such messages are found, the network element is experiencing transmit buffer congestion.
If any of the above conditions are found in the recording file, contact the manufacturer of the network element for information on how to resolve the problem.
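For large recording files, the two checks above can be automated instead of searching by hand. The following Python sketch is illustrative only and is not part of ArTrac. It assumes the recording file is plain text with one message per line, and that the sequence number appears as `SEQ=<number>`; that pattern, the discard pattern, and the function names are hypothetical and must be adapted to the message format of your network element.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical patterns (assumptions): adjust both to the actual message
# format that your network element writes to the recording file.
SEQ_PATTERN = re.compile(r"SEQ[=:\s]+(\d+)")
DISCARD_PATTERN = re.compile(r"\d+\s+messages?\s+discarded", re.IGNORECASE)


def find_sequence_gaps(lines: Iterable[str]) -> List[Tuple[int, int, int]]:
    """Return (line_number, previous_seq, current_seq) for every skipped number."""
    gaps = []
    previous = None
    for line_no, line in enumerate(lines, start=1):
        match = SEQ_PATTERN.search(line)
        if match is None:
            continue  # line carries no recognizable sequence number
        current = int(match.group(1))
        # A lower number is treated as a counter wrap or restart, not a gap.
        if previous is not None and current > previous + 1:
            gaps.append((line_no, previous, current))
        previous = current
    return gaps


def find_discard_reports(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for lines that report discarded messages."""
    return [
        (line_no, line.strip())
        for line_no, line in enumerate(lines, start=1)
        if DISCARD_PATTERN.search(line)
    ]
```

To use it, read the recording file into a list (for example with `open(path).read().splitlines()`) and pass that list to both functions. Any entry returned by either function points to a line worth reviewing before contacting the manufacturer.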
Conflict in Rule Sets - ArTrac has a very solid alarm analysis engine that uses a system of checks to guarantee 100% reliability in the parsing of alarm messages. The ArTrac database is, in its simplest description, a relational database, meaning that complex relationships can be established between rule sets and look-up tables. On occasion, the system administrator might create a rule set that is either so ambiguous that it interferes with other rule sets or that directly conflicts with another rule set. This can result in analysis failure of certain event messages received by the system. By nature, this type of failure is difficult to locate. Fortunately, ArTrac has a built-in Database Analysis feature that checks the integrity of its databases and performs a comprehensive evaluation of how rule sets interact structurally and relationally. The Database Analysis feature will send a detailed report to the Chronological Display screen (and to the historical database), which will assist you in identifying any problems in your database.
To solve Rule Set conflicts perform the following:
If you continue to experience this problem, contact your product support representative for further assistance.