Angela Mao ee7e0ba7db Report Tool: Package and add plugins/correlator
This update packages the report tool and plugin files into Debian, and
bundles it with the collect tool so that they are added to the
'collect' tarballs at the time of creation.

The report tool now allows users to point it at any collect bundle and
have it automatically extract the tarball and tar files for each host
before running.

This update also adds heartbeat loss, maintenance errors, daemon
failures, and state changes plugin algorithms to the report tool.
Some of the existing algorithms were enhanced to extract more relevant
log events. The alarm algorithm was updated to only track when alarms
switch between set and clear.

Lastly, a correlator function has been implemented in the tool. It
determines failures in collect bundles and their root causes, and
finds significant events and state changes in the log files. The
correlator also counts and prints the number of times alarms are set
and cleared. The results are written to output files, and summaries
are printed on the command line.

Users can also specify if they want the correlator to only find events,
alarm transitions and state changes for a specific host.

Test Plan:

PASS: Verify tool is packaged in Debian
PASS: Verify tool is inserted into 'collect' tarballs
PASS: Verify tool extracts tarballs and host tarfiles
PASS: Verify tool can point at any collect bundle and run successfully
PASS: Verify substring plugin algorithm is working
PASS: Verify swact activity plugin algorithm is working
PASS: Verify heartbeat loss plugin algorithm is working
PASS: Verify maintenance errors plugin algorithm is working
PASS: Verify daemon failures plugin algorithm is working
PASS: Verify state changes plugin algorithm is working
PASS: Verify alarm plugin algorithm is working
PASS: Verify failures and correct root causes are found by correlator
PASS: Verify significant events are found by correlator
PASS: Verify alarm transitions are found by correlator
PASS: Verify state changes are found by correlator
PASS: Verify failures/events/alarms/state changes are printed into files
PASS: Verify tool prints correct info onto command line
PASS: Verify correlator only finds events for specified host
PASS: Verify correlator only finds alarm transitions for specified host
PASS: Verify correlator only finds state changes for specified host

Story: 2010166
Task: 46177
Signed-off-by: Angela Mao <Angela.Mao@windriver.com>
Change-Id: I02e28edf16b342abf2224cc98325d77ba0678055
2022-12-10 00:30:37 +00:00

Refer to report.py file header for a description of the tool

Example:

Consider the following collect bundle structure:

SELECT_NODES_20220527.193605
├── controller-0_20220527.193605
│   ├── etc
│   ├── root
│   └── var
├── controller-1_20220527.193605
│   ├── etc
│   ├── root
│   └── var
└── report
    ├── plugins   (where the plugin files will be placed)
    │   ├── alarm
    │   ├── substring
    │   └── ...
    ├── tool      (where the tool will be placed)
    └── output    (where the output files will be placed)


> cat plugins/alarm

algorithm=alarm
alarm_exclude=400., 800.
entity_exclude=subsystem=vim

> cat plugins/substring

algorithm=substring
files=var/log/mtcAgent.log, var/log/sm.log
hosts=controllers
substring=operation failed
substring=Failed to send message
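
The plugin files above are plain 'key=value' text. As a rough sketch only
(this is not the parser used by report.py), such a file could be read as
shown below. The helper name parse_plugin, the choice of which keys hold
comma-separated lists, and the dict-of-lists result are all assumptions
made for illustration.

```python
from collections import defaultdict

# Assumption: these keys hold comma-separated lists; other values are kept whole.
LIST_KEYS = {"files", "hosts", "alarm_exclude", "entity_exclude"}


def parse_plugin(text):
    """Parse 'key=value' lines into a dict of lists (hypothetical helper)."""
    options = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        # Split on the first '=' only, so values like 'subsystem=vim' survive.
        key, value = line.split("=", 1)
        key = key.strip()
        if key in LIST_KEYS:
            options[key].extend(v.strip() for v in value.split(",") if v.strip())
        else:
            # Repeated keys (e.g. several 'substring' lines) accumulate in order.
            options[key].append(value.strip())
    return dict(options)
```

Repeated 'substring' lines end up as a list under one key, and
'entity_exclude=subsystem=vim' keeps 'subsystem=vim' as its value.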

> report/tool/report.py --start 20220501 --end 20220530

Running the command above will populate the report folder with output files.
The tool also provides default values; see 'report.py -h' for details.

The substring algorithm creates an output file for every host of the
specified host type. Each file contains the log events within the
provided date range that contain the substring 'operation failed' or
'Failed to send message'.
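
The filtering described above can be sketched as follows. This is an
illustration, not the tool's implementation: the function name
matching_events and the assumption that every log line starts with a
'YYYY-MM-DDTHH:MM:SS' timestamp are hypothetical.

```python
from datetime import datetime


def matching_events(lines, substrings, start, end):
    """Return log lines dated within [start, end] that contain any substring."""
    events = []
    for line in lines:
        try:
            # Assumption: each log line begins with 'YYYY-MM-DDTHH:MM:SS'.
            stamp = datetime.strptime(line[:19], "%Y-%m-%dT%H:%M:%S")
        except ValueError:
            continue  # skip lines without a parsable timestamp
        if start <= stamp <= end and any(s in line for s in substrings):
            events.append(line)
    return events
```

With the substring plugin shown earlier, the call would pass
['operation failed', 'Failed to send message'] together with the
--start and --end dates converted to datetime objects.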

The alarm algorithm creates two output files: 'log' and 'alarm'.
'log' contains customer log messages created within the provided date range,
and 'alarm' contains system alarms created within the provided date range,
excluding any whose alarm IDs or entity IDs are listed in the alarm plugin file.
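
A minimal sketch of the exclusion step, under stated assumptions: the
helper keep_alarm is hypothetical, alarm_exclude entries such as '400.'
are treated as alarm ID prefixes, and entity_exclude entries such as
'subsystem=vim' are treated as substrings of the entity ID. The real
matching rules in report.py may differ.

```python
def keep_alarm(alarm_id, entity_id, alarm_exclude, entity_exclude):
    """Return True if the alarm should appear in the output (hypothetical helper)."""
    # Assumption: an entry like '400.' excludes every alarm ID starting with it.
    if any(alarm_id.startswith(prefix) for prefix in alarm_exclude):
        return False
    # Assumption: an entry like 'subsystem=vim' excludes entity IDs containing it.
    if any(entry in entity_id for entry in entity_exclude):
        return False
    return True
```

For example, with alarm_exclude=['400.', '800.'] an alarm whose ID begins
with '400.' is dropped, while one whose ID begins with '200.' is kept
unless its entity ID contains 'subsystem=vim'.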

For more detailed information about an algorithm use 'report.py <algorithm> -h'.

Here is the report directory after running the above command:

report
├── output
│   └── SELECT_NODES_20220527.193605 (collect bundle that the report tool was run on)
│       ├── plugins      (output files for plugins)
│       │   ├── alarm
│       │   └── ...
│       ├── correlator_failures
│       ├── correlator_events
│       ├── correlator_state_changes
│       ├── report.log   (log file for report tool)
│       └── untar.log    (log file for untarring collect bundle and host tar files)
├── plugins   (where the plugin files are)
└── tool      (where the report tool is)

The report tool also allows users to point it at any collect bundle and
have it automatically extract the tarball and tar files for each host
before running.

> report/tool/report.py -d CGTS-19143
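
The two-stage extraction (bundle tarball first, then each host's tar file)
can be sketched with Python's standard tarfile module. The function name
extract_bundle is hypothetical, and detecting host archives with
tarfile.is_tarfile rather than by file extension is an assumption made so
that no archive naming scheme has to be guessed. This is not the tool's
own extraction code.

```python
import tarfile
from pathlib import Path


def extract_bundle(bundle_tar, dest):
    """Extract a bundle tarball, then every tar file found inside it."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)

    # Stage 1: unpack the outer collect bundle (only use trusted archives).
    with tarfile.open(bundle_tar) as outer:
        outer.extractall(dest)

    # Stage 2: find nested archives (one per host) and unpack each in place.
    nested = [p for p in sorted(dest.rglob("*"))
              if p.is_file() and tarfile.is_tarfile(p)]
    for host_tar in nested:
        with tarfile.open(host_tar) as inner:
            inner.extractall(host_tar.parent)
    return nested
```

The list of nested archives is built before any host archive is unpacked,
so only one level of nesting is handled.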

Users may specify if they want the correlator to only find events
and state changes for a specific host.

> report/tool/report.py --hostname controller-0