Performing a detailed health check

Perform a detailed system health check on a regular basis and any time you make configuration changes. These checks should be done for each configured report server. This will ensure that your NAM installation performs well and that it is configured to reflect the current environment it monitors and your monitoring objectives.

Memory

The memory limit is the most important and most commonly exceeded limit on the NAM Server.

  1. In the NAM Console, open Deployment ► Manage devices.
  2. On the Devices tab, select the NAM Server system to be checked.
  3. On the Device Status tab, in the Hardware health section, click the Capacity status link.
  4. View the Memory utilization section.

(Do not use the Windows Task Manager for this purpose. The values it reports do not include the internal Java memory structures, which garbage collection can easily release. The NAM Server runs garbage collection automatically when needed.)

The memory threshold value should be approximately 50 percent (with a local SQL Server) or 80 percent (with a remote SQL Server) of available RAM. Memory utilization over the last 24 hours should stay comfortably under the threshold. For more information on memory use, see Tools ► Diagnostics ► Memory status.
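
The threshold rule above is simple arithmetic, so it can be sanity-checked with a short script. The following Python sketch is illustrative only: the function names, the treatment of "approximately 50 or 80 percent" as exact fractions, and the sample values are assumptions made for this example and are not part of NAM.

```python
def expected_memory_threshold_gb(total_ram_gb: float, local_sql_server: bool) -> float:
    """Return the approximate memory threshold for a NAM Server host, in GB."""
    # About 50 percent of RAM with a local SQL Server,
    # about 80 percent with a remote SQL Server.
    fraction = 0.5 if local_sql_server else 0.8
    return total_ram_gb * fraction


def stays_under_threshold(samples_gb, threshold_gb: float) -> bool:
    """Return True if every utilization sample is below the threshold."""
    return all(sample < threshold_gb for sample in samples_gb)


if __name__ == "__main__":
    # Hypothetical host: 64 GB of RAM with a local SQL Server.
    threshold = expected_memory_threshold_gb(64, local_sql_server=True)
    # Hypothetical utilization samples (GB) read from the Memory utilization chart.
    samples = [20.5, 24.0, 27.3]
    print("threshold (GB):", threshold)
    print("under threshold:", stays_under_threshold(samples, threshold))
```

The samples would be read manually from the Memory utilization chart; the sketch does not query the NAM Server.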

Check whether Java memory limits have been set incorrectly due to a 32/64-bit mismatch or a lack of free memory during installation. Adding memory after installing the NAM Server software will not cause Java to use this new memory.

Do not use a 32-bit version of Java on a 64-bit operating system; the additional memory that 64-bit Java could use is not available to a 32-bit Java installation. In addition, if you set more memory for a 32-bit Java version than it can allocate, the server will not start.
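
One way to spot a 32/64-bit mismatch is to inspect the text printed by `java -version`, which on HotSpot-based JVMs includes "64-Bit" in the VM line for 64-bit builds. The Python sketch below assumes that output format; the function names are hypothetical, and you would need to point `read_java_version` at the Java executable that your NAM Server actually uses, which is not specified here.

```python
import subprocess


def is_64bit_java(version_output: str) -> bool:
    """Return True if `java -version` output reports a 64-bit VM."""
    # HotSpot-based JVMs print "64-Bit" in the VM line; 32-bit builds omit it.
    return "64-bit" in version_output.lower()


def read_java_version(java_executable: str = "java") -> str:
    """Run `java -version` and return its text (the JVM prints it to stderr)."""
    result = subprocess.run(
        [java_executable, "-version"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stderr + result.stdout


if __name__ == "__main__":
    # Sample output in the usual HotSpot format, used here instead of a live call.
    sample = (
        'java version "1.8.0_202"\n'
        "Java(TM) SE Runtime Environment (build 1.8.0_202-b08)\n"
        "Java HotSpot(TM) 64-Bit Server VM (build 25.202-b08, mixed mode)\n"
    )
    print("64-bit Java:", is_64bit_java(sample))
```

To check a real installation, pass the result of `read_java_version(...)` to `is_64bit_java` on the NAM Server host.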

Data processing times

  1. In the NAM Console, open Deployment ► Manage devices.
  2. On the Devices tab, select the NAM Server system to be checked.
  3. On the Device Status tab, in the Hardware health section, click the Capacity status link.
  4. View the Average data files processing time section.

The average processing time should not be much higher than the threshold indicated on the chart. However, longer processing times are acceptable during certain time intervals, especially during nightly task generation. For more diagnostic information, go to Tools ► Diagnostics ► Processing Status.
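
If you note the processing times per interval, a few lines of code can flag the intervals worth a closer look. This Python sketch is only an illustration: the function name, the 20 percent tolerance used to stand in for "not much higher", and the sample values are assumptions, not NAM settings.

```python
def intervals_over_threshold(processing_times_s, threshold_s: float, tolerance: float = 1.2):
    """Return the indexes of intervals whose processing time exceeds the tolerated limit."""
    # The tolerance factor is a hypothetical stand-in for "not much higher
    # than the threshold"; adjust it to match your own judgment.
    limit = threshold_s * tolerance
    return [i for i, t in enumerate(processing_times_s) if t > limit]


if __name__ == "__main__":
    # Hypothetical processing times in seconds, with one long nightly interval.
    times = [30, 45, 200, 50]
    print("intervals to review:", intervals_over_threshold(times, threshold_s=60))
```

Flagged intervals that coincide with nightly task generation are expected; repeated flags at other times are the ones to investigate in Processing Status.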

Nightly tasks

  1. In the NAM Console, open Deployment ► Manage devices.
  2. On the Devices tab, select the NAM Server system to be checked.
  3. On the Device Status tab, in the Hardware health section, click the Capacity status link.
  4. View the Nightly tasks execution time section.

Perform this check over an extended period of time (for example, seven days). There should be no alerts for extended execution time during that period.

Number of sessions

  1. In the NAM Console, open Deployment ► Manage devices.
  2. On the Devices tab, select the NAM Server system to be checked.
  3. On the Device Status tab, in the Hardware health section, click the Capacity status link.
  4. View the Number of sessions section.

The session capacity limit on standard recommended hardware in typical environments is about 3 to 4 million.

Basic NAM Probe statistics

  1. In the NAM Console, open Deployment ► Manage devices.
  2. On the Devices tab, select the NAM Probe system to be checked.
  3. On the Device Status tab, in the Hardware health section, click the Capacity status link.
    The NAM Probe capacity report opens to the Capacity tab.
  4. Tabs to review include:
    • Capacity
      The table shows average and maximum bandwidth for each NAM Probe.

      • A red icon indicates that the selected NAM Probe has capacity issues.
      • Click the Alerts tile to show related alerts.
      • Select one NAM Probe (by row) to chart capacity for that NAM Probe.
    • Packet stats (wire-level and IP packet distribution)
      Displays drops and errors at the driver level.
      Your packet loss rate should be less than 1 percent.

      • The total of all types of errors and non-sampling drops should not exceed 1 percent over longer periods of time.
      • For checksum, IP header, and fragmentation errors, verify the span connection, cable connections, and packet fragmentation.
      • The second chart shows the traffic type used most.
    • Interface utilization

      • Select a NAM Probe in the Connected NAM Probes list to see statistics.
    • Resource utilization
      This information is helpful in resolving NAM Probe performance issues.

      • Received traffic
      • CPU usage
      • Memory usage
      • Disk space usage
        For more disk usage information, run the df shell command.
    • CPU stats
      This tab charts CPU usage.

      • Individual CPU cores are not utilized equally.
      • All CPUs must be comfortably below 100 percent.
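
The 1 percent guideline for errors and non-sampling drops listed under Packet stats can be checked with basic arithmetic once you have the counters. The Python sketch below is a minimal illustration; the function names and the sample counter values are hypothetical, and the counters themselves would be read from the Packet stats tab.

```python
def error_and_drop_rate_percent(total_packets: int, errors: int, non_sampling_drops: int) -> float:
    """Return errors plus non-sampling drops as a percentage of all packets."""
    if total_packets <= 0:
        raise ValueError("total_packets must be positive")
    return 100.0 * (errors + non_sampling_drops) / total_packets


def within_loss_limit(total_packets: int, errors: int, non_sampling_drops: int,
                      limit_percent: float = 1.0) -> bool:
    """Return True if the combined rate does not exceed the limit (1 percent by default)."""
    rate = error_and_drop_rate_percent(total_packets, errors, non_sampling_drops)
    return rate <= limit_percent


if __name__ == "__main__":
    # Hypothetical counters collected over a longer period.
    rate = error_and_drop_rate_percent(1_000_000, errors=2_000, non_sampling_drops=3_000)
    print("combined rate (%):", rate)
    print("within limit:", within_loss_limit(1_000_000, 2_000, 3_000))
```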

To investigate NAM Probe restarts, see var/spool/adlex/log/rtm.log on the probe.