Big Brother help





Color codes by order of severity

  • Trouble (red) - Bad things are happening.
  • No report (purple) - No report from this client in the last 30 minutes. The client may have died.
  • Attention (yellow) - The reporting system has crossed a threshold you should know about.
  • OK (green) - Everything is fine. Have a nice day.
  • Unavailable (clear) - The associated test has been turned off, or does not apply. A common example is connectivity on disconnected dialup lines.
  • Disabled - Notification for this test has been disabled. Used when performing maintenance.
  • Acked - A current event has been acknowledged by one or more recipients. The acknowledgement is valid until the longest delay has expired.

All connections are checked every 5 minutes.
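
The idea that the screen takes on the most severe condition present can be sketched in a few lines of Python. This is only an illustration, not Big Brother's own code: the function name and the list are made up for the example, and the ordering simply follows the legend above. Acked is left out because it marks an acknowledged event rather than a severity level.

```python
# Status names in the order the legend lists them, most severe first.
# (Illustrative only; Big Brother itself is not written this way.)
SEVERITY_ORDER = ["Trouble", "No report", "Attention", "OK", "Unavailable", "Disabled"]


def most_severe(statuses):
    """Return the status that appears earliest in SEVERITY_ORDER."""
    if not statuses:
        raise ValueError("no statuses given")
    return min(statuses, key=SEVERITY_ORDER.index)


if __name__ == "__main__":
    # One yellow dot among green ones is enough to turn the screen yellow.
    print(most_severe(["OK", "OK", "Attention"]))
```

A status name that is not in the list raises a ValueError, which is a deliberate choice for the sketch: an unknown status should not be silently ranked.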


Trouble

The most severe conditions result in the administrator being notified. These include loss of network connectivity, loss of HTTP access, and disks more than 95% full, since these can result in a system hang. Furthermore, any "NOTICE" message in the message file causes a notification, since it may signal a disk fault.

Under these circumstances, the screen should turn red. Click on the corresponding red dot for additional information about the condition.

If a severe situation is occurring that is not being noticed by Big Brother, use the PAGE/ACK button on the main screen to notify the administrator manually.


Attention

Attention conditions include HTTP server errors, disks 90-94% full, the death of important processes, and "WARNING" messages in the system logs.

The screen should turn yellow if this is the most severe situation at the time. Click on the corresponding yellow dot for additional information, and notify the administrator manually if necessary.


No Report

Each report is checked for freshness. If any report is more than 30 minutes old, it is marked with a purple dot, and the screen turns purple, assuming that it is the most serious situation at the time.

Stale reports may be the result of heavily loaded systems, but they may also indicate a more serious loss of communication within the Big Brother system itself.
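
As a rough illustration of the freshness rule, the following Python sketch flags a report as stale once it is more than 30 minutes old. The names is_stale and STALE_AFTER are invented for this example. How Big Brother actually stores and compares report times is not described on this page.

```python
from datetime import datetime, timedelta

# Reports older than this are considered stale (purple).
STALE_AFTER = timedelta(minutes=30)


def is_stale(last_report, now):
    """Return True if the report is more than 30 minutes old."""
    return now - last_report > STALE_AFTER


if __name__ == "__main__":
    now = datetime.now()
    print(is_stale(now - timedelta(minutes=45), now))
```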


Disabled

Notification for this test has been disabled. This is used when performing maintenance tasks on a host.


Unavailable

Under some conditions, the administrator may elect to disable some tests, in which case the dot for the reported condition turns clear.


OK

Green dots indicate that all is well. Green screens represent a seldom-achieved state of administrative bliss.


Pager Codes

The administrator will be notified when conditions merit. The numeric message is formatted as follows: [3 DIGIT CODE] [IP-ADDRESS]

  • 100 - Disk Error. Disk is over 95% full...
  • 200 - CPU Error. CPU load average is unacceptably high.
  • 300 - Process Error. An important process has died.
  • 400 - Message file contains a serious error.
  • 500 - Network error, can't connect to that IP address.
  • 600 - Web server HTTP error - server is down.
  • 7-- - Generic server error - 7 followed by the server port number, e.g. 721 = ftp down.
  • 800 - DNS server on that machine is down.
  • 911 - User Page. Message is phone number to call back.
  • 999 - The service reported in an error could not be found in the svcerrlist token in the etc/bbwarnsetup.cfg file.
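
If you receive these numeric pages often, a small lookup helper can save some head-scratching. The Python sketch below decodes a message of the form [CODE] [IP-ADDRESS] using the table above. The function name decode_page and the wording of the descriptions are made up for this example. It also assumes that a 7-- code is simply the digit 7 followed by the port number, as the 721 example suggests.

```python
# Fixed pager codes, taken from the list above.
PAGER_CODES = {
    "100": "Disk error",
    "200": "CPU error",
    "300": "Process error",
    "400": "Message file contains a serious error",
    "500": "Network error",
    "600": "Web server HTTP error",
    "800": "DNS server down",
    "911": "User page",
    "999": "Service not found in svcerrlist",
}


def decode_page(message):
    """Split a page into (code, meaning, payload).

    The payload is the IP address, or the call-back number for a 911 page.
    """
    code, _, payload = message.strip().partition(" ")
    if not code.isdigit():
        raise ValueError("pager code must be numeric: %r" % code)
    if code in PAGER_CODES:
        meaning = PAGER_CODES[code]
    elif code.startswith("7") and len(code) > 1:
        # Assumption: the digits after the leading 7 are the port number.
        meaning = "Generic server error on port %d" % int(code[1:])
    else:
        meaning = "Unknown code"
    return code, meaning, payload.strip()


if __name__ == "__main__":
    print(decode_page("721 10.0.0.5"))
```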

System Information

Click on any server name for additional details about the machine. Information about all components is available, including serial numbers, partition sizes, SCSI addresses, and the physical locations of the devices. This information lives in the www/notes directory.


General Information

The current status of any individual component is always available by clicking the appropriate dot in the display matrix. You may have to hit Reload to get the most recent entry.

Occasionally the screen changes color for CPU or HTTP warnings. These can usually be disregarded since Big Brother has been instructed to be very sensitive during this initial test. Similarly, Internet connections may turn yellow when the network is heavily loaded. Although this should be checked out, it is usually not a problem unless the whole Internet section goes yellow.


Big Brother Column Information

conn

The conn column denotes the ping check performed periodically. This code is located in bb-network.sh.


nntp

The nntp column denotes the nntp check performed periodically. This code is located in bb-network.sh. It makes sure the news server is alive and well.


cpu

The cpu column denotes the CPU check performed periodically. This figure is based on the 5-minute load average, which is the second of the load figures reported by the 'uptime' command. The code for this test is located in bb-local.sh.
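
To see which figure is meant, here is a small Python sketch that pulls the 5-minute load average out of a line of 'uptime' output. It is not the code from bb-local.sh. The function name is made up, and the sketch assumes the usual "load average: a, b, c" layout with a period as the decimal separator.

```python
import re


def five_minute_load(uptime_output):
    """Return the second load figure (the 5-minute average) as a float."""
    match = re.search(r"load averages?:\s*(.+)$", uptime_output.strip())
    if match is None:
        raise ValueError("no load average found in uptime output")
    # Figures may be separated by commas, spaces, or both.
    fields = match.group(1).replace(",", " ").split()
    return float(fields[1])


if __name__ == "__main__":
    sample = " 10:15:01 up 12 days,  3:04,  2 users,  load average: 0.42, 0.87, 1.03"
    print(five_minute_load(sample))
```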


disk

The disk column denotes the disk check performed periodically. This test runs the 'df' command and reports the fullest disk. The warning level is 90% by default, and the system is set to panic at 95%. These values are set in $BBHOME/etc/bbdef.sh and may be changed. The code for the disk test lives in bb-local.sh. You may also set warning and panic levels for individual disks in the etc/bb-dftab file. See the etc/bb-dftab.DIST file.
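
The logic of the disk test can be sketched as follows. This is an illustrative Python version, not the shell code in bb-local.sh. The function names and the constants WARN_PCT and PANIC_PCT are made up for the example and stand in for the defaults mentioned above. The sketch assumes 'df' output with one filesystem per line, a percentage column, and the mount point in the last column.

```python
WARN_PCT = 90   # default warning level from the text above
PANIC_PCT = 95  # default panic level from the text above


def fullest_filesystem(df_output):
    """Return (mount_point, percent_used) for the fullest filesystem, or None."""
    worst = None
    for line in df_output.splitlines()[1:]:  # skip the header line
        fields = line.split()
        pct_fields = [f for f in fields if f.endswith("%")]
        if not pct_fields:
            continue
        try:
            pct = int(pct_fields[0].rstrip("%"))
        except ValueError:
            continue
        if worst is None or pct > worst[1]:
            worst = (fields[-1], pct)
    return worst


def disk_status(pct):
    """Map a usage percentage to a dot color; 95% counts as panic."""
    if pct >= PANIC_PCT:
        return "red"
    if pct >= WARN_PCT:
        return "yellow"
    return "green"
```

With the defaults, a disk at 92% comes back yellow and a disk at 95% comes back red, which matches the 90-94% range given for Attention conditions.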


dns

The dns column verifies the status of the DNS server on that machine. The test is basically an nslookup with the server name and IP address as arguments.


ftp

The ftp column denotes the ftp check performed periodically. This code is located in bb-network.sh. It is part of the new group of generic server tests performed. To test this service on a given machine, just include 'ftp' on the line in the bb-hosts file.


http

The http column denotes the http check performed periodically. This code is located in bb-network.sh. It returns OK if the server responds and the response does not contain the word 'Error'. The test should be more rigorous. Note that password-protected pages return an error when they shouldn't.
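
The rule is simple enough to write down directly. The Python sketch below is only an illustration of it, with a made-up function name. The mapping to colors is an assumption based on the Trouble and Attention sections of this page (loss of HTTP access is red, an HTTP server error is yellow), not a description of what bb-network.sh actually does.

```python
def http_status(connected, response_text=""):
    """Classify an HTTP check result as a dot color.

    connected      -- True if the server answered at all
    response_text  -- whatever the server sent back
    """
    if not connected:
        return "red"      # assumption: loss of HTTP access is a Trouble condition
    if "Error" in response_text:
        return "yellow"   # assumption: an HTTP server error is an Attention condition
    return "green"
```

Because the check only looks for the word 'Error', any page whose text happens to contain that word is flagged, which is why the test is described as not rigorous enough.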


msgs

The msgs column denotes the msgs check performed periodically. This code is located in bb-local.sh. Only NOTICE and WARNING conditions are considered. Note that a NOTICE condition will cause a notification (code red) whereas a WARNING just turns the screen yellow. There is no way to turn these messages off, short of clearing out the messages file manually or modifying the tags from WARNING to wARNING and NOTICE to nOTICE. You may also introduce tags in the etc/bbdef.sh file in the PAGEMSG and MSGS variables. You can also have the check performed on multiple log files by setting MSGFILE in bbsys.local to list all log files to be checked. The msgs test also checks that the log files are readable and non-empty.
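
The following Python sketch shows the NOTICE/WARNING rule in miniature, and why changing the case of a tag silences it. It is not the code from bb-local.sh. The function name is made up, and the sketch assumes a plain case-sensitive substring match on each log line.

```python
def msgs_status(log_lines):
    """Return "red" for any NOTICE, "yellow" for any WARNING, else "green"."""
    status = "green"
    for line in log_lines:
        if "NOTICE" in line:
            return "red"        # NOTICE causes a notification
        if "WARNING" in line:
            status = "yellow"   # WARNING only turns the screen yellow
    return status


if __name__ == "__main__":
    # The match is case-sensitive, so an edited tag is ignored.
    print(msgs_status(["unix: wARNING: /tmp: file system full"]))
```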


pop3

The pop3 column denotes the pop3 check performed periodically. This is part of the generic test code in bb-network.sh. It checks that the pop3 server is alive and well. To test a machine for the pop3 server, put the word 'pop3' on that server's line in the bb-hosts file. You may have to put pop-3 instead on certain platforms. Check /etc/services for the correct spelling.


procs

The procs column denotes the procs check performed periodically. This code is located in bb-local.sh. It makes sure that the processes defined in etc/bbdef.sh in the PROCS variable exist on the local machine. If a process does not exist, and it has been defined in the PAGEPROC variable, then the code is red and a notification is sent out. The ps command is used to get a current process listing.
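
In rough terms the check works like the Python sketch below. This is an illustration rather than the actual shell code. The function name and arguments are made up, process names are matched as plain substrings of the 'ps' lines, and the yellow result for a missing process that is not in PAGEPROC is an assumption based on the Attention section of this page.

```python
def procs_status(ps_lines, procs, pageproc):
    """Return (color, missing) for the processes that should be running.

    ps_lines  -- lines of 'ps' output
    procs     -- names that must be present (the PROCS variable)
    pageproc  -- names whose absence triggers a page (the PAGEPROC variable)
    """
    missing = [name for name in procs
               if not any(name in line for line in ps_lines)]
    if any(name in pageproc for name in missing):
        return "red", missing
    if missing:
        return "yellow", missing   # assumption: missing but not pageable
    return "green", missing
```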


smtp

The smtp column denotes the smtp check performed periodically. This is part of the generic server test code located in bb-network.sh. It makes sure that the SMTP process (usually sendmail) is alive and well.




Copyright © 1997-2001 BB4 Technologies Inc. - All Rights Reserved