

By Ed Woods
Consulting IT Specialist
IBM
The Tivoli Enterprise Portal provides IBM Tivoli OMEGAMON XE solutions with powerful automation, command and control capabilities. This article expands on Ed Woods’ earlier pair of articles that introduced the concept of situations. Here he discusses Tivoli OMEGAMON XE policy automation and command capabilities.
The IBM Tivoli Enterprise Portal provides several command and control options, including a manual ‘take action’ command function, commands issued from situations, and commands issued by policy automation. Policies tighten the integration of performance and availability monitoring with automation, and enable more effective, efficient and intelligent monitoring and systems management.
Situations form the starting point for policies. A policy can combine situations, commands and other logical steps, so that multiple commands, situation checks and automated actions execute as a logical flow in response to a performance problem or availability issue detected by Tivoli OMEGAMON software.
The multi-step nature of policies adds flexibility and power to the Tivoli Enterprise Portal, and extends command automation concepts established with situations.
Uses of policies include:
- Basic alert correlation with corrective actions
- Forwarding alerts to other alert managers, such as Simple Network Management Protocol (SNMP)-based alert managers
- Feeding other automation technologies, such as console automation
- Dynamically managing the monitoring infrastructure.
Defining a policy
Figure 1 shows an example of how to launch the policy editor. If the Tivoli Enterprise Portal is enabled for policies, the policy icon will appear on the Tivoli Enterprise Portal toolbar. (Customers running Tivoli OMEGAMON XE solutions and the Tivoli Enterprise Portal in a z/OS environment need IBM Tivoli OMEGAMON DE on z/OS, a separately licensed component, for policy enablement.)
 Figure 1: Launching the policy editor
Like situations, policies are named, stopped and started, and distributed for execution. However, policy distribution is more involved than situation distribution. Because policies employ situations as essential components of a logical flow, situations must be distributed to the appropriate managed systems for correct policy execution.
In addition, when a policy is started, you will be prompted to specify an IBM Tivoli Enterprise Management Server for the policy to execute in. Because the restart option is checked in Figure 1, the policy will restart after it executes. In this example, policy logic is launched by a situation check.
Figure 2 shows an example of a simple policy flow. In this case, the policy will wait for the situation Demo_DB2_Alert to be true. When the situation is true, the policy will execute the first command, and then the second command, following the lines as shown in the policy graphic editor. After the second command is executed, the policy will restart, and the logical flow will begin anew with the situation check.
 Figure 2: An example of a basic policy flow
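The flow in Figure 2 can be pictured as a small loop. The Python below is only a conceptual sketch of that logic, not the product's implementation or API: the function names, the command strings and the list of situation samples are all hypothetical, chosen for illustration.

```python
from typing import Callable, List


def run_policy_once(situation_is_true: Callable[[], bool],
                    commands: List[str],
                    issue_command: Callable[[str], None]) -> bool:
    """Run one pass of the flow: check the situation, then issue the commands in order.

    Returns True if the situation was true and the commands were issued.
    """
    if not situation_is_true():
        return False              # the policy keeps waiting on the situation
    for command in commands:      # follow the flow lines: first command, then second
        issue_command(command)
    return True


def run_policy(samples: List[bool], commands: List[str]) -> List[str]:
    """Drive the flow over a series of situation samples (hypothetical input).

    With the restart option on, each sample begins again at the situation check.
    """
    issued: List[str] = []
    for sample in samples:
        run_policy_once(lambda: sample, commands, issued.append)
    return issued


if __name__ == "__main__":
    # The situation is false once, then true twice, so the command pair runs twice.
    print(run_policy([False, True, True], ["first_command", "second_command"]))
```

The point of the sketch is the ordering: nothing is issued until the situation check passes, the commands run in the sequence drawn in the editor, and the restart returns control to the situation check.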
Command execution considerations
Figure 3 shows an example of the command options for the ‘take action’ policy activity. The command will be executed as entered in the system command line. Attribute substitution may be used in the command string, as it is in situations.
 Figure 3: Command execution from policies
When specifying commands, consider these important details:
- If the command will be executed at a Tivoli Enterprise Management Server (TEMS), keep in mind the location of the TEMS chosen when the policy was started, and ensure that your command is appropriate for that platform.
- Similarly, if the policy will issue a command through the agent task, keep in mind where the agent is executing, and ensure that the command is appropriate for that platform.
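The two considerations above, together with attribute substitution, can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the `&{name}` placeholder style, the platform keys, the command names and the attribute names are hypothetical and may not match the syntax the product actually uses.

```python
import re
from typing import Dict, Mapping

# Hypothetical command templates, keyed by the platform where the command will run
# (the platform of the chosen TEMS, or of the agent, depending on the option selected).
COMMAND_TEMPLATES: Dict[str, str] = {
    "zos": "zos_notify_cmd &{subsystem} &{metric}=&{value}",
    "windows": "windows_notify_cmd &{subsystem} &{metric}=&{value}",
}

# Assumed placeholder style for this sketch: &{attribute_name}
PLACEHOLDER = re.compile(r"&\{(\w+)\}")


def render_command(platform: str, attributes: Mapping[str, object]) -> str:
    """Pick the template for the execution platform and fill in attribute values."""
    if platform not in COMMAND_TEMPLATES:
        raise ValueError(f"no command defined for platform {platform!r}")
    template = COMMAND_TEMPLATES[platform]

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in attributes:
            raise KeyError(f"attribute {name!r} missing from the alert")
        return str(attributes[name])

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    alert = {"subsystem": "DB2A", "metric": "threads", "value": 95}
    print(render_command("windows", alert))
```

Failing loudly on an unknown platform or a missing attribute mirrors the advice above: a command that is valid where you wrote it may be meaningless where it actually runs.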
Multiple situations, multiple commands, same host
Figure 4 shows a policy with multiple situation checks – a DB2 and an IBM WebSphere MQ situation – and multiple commands. The ‘correlate by’ host name option is chosen, which is acceptable because both the DB2 and WebSphere MQ situation alerts will originate from the same z/OS LPAR (host name).
The DB2 and WebSphere MQ alerts are independent events detected by separate monitoring agents. For the policy flow to fully execute, both the DB2 situation and the WebSphere MQ situation must be true. However, because DB2 and WebSphere MQ are monitored by independent tasks, the DB2 situation being true and the WebSphere MQ situation being true are technically separate events. While the WebSphere MQ issue may be due to the DB2 alert, a direct causation cannot be assumed.
 Figure 4: A policy example with multiple situations and commands
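One way to picture correlation by host name is as a grouping step followed by an "all required situations are true" test. The sketch below is a conceptual model only, under the assumption that each true situation is reported as a (situation, host) pair; the situation and host names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def hosts_ready_for_action(true_situations: Iterable[Tuple[str, str]],
                           required: Set[str]) -> List[str]:
    """Return the hosts on which every required situation is currently true.

    true_situations: (situation_name, host_name) pairs reported by independent agents.
    """
    by_host: Dict[str, Set[str]] = defaultdict(set)
    for situation, host in true_situations:
        by_host[host].add(situation)
    # The flow completes only where the full set of situations is true on one host.
    return sorted(host for host, names in by_host.items() if required <= names)


if __name__ == "__main__":
    events = [
        ("DB2_Alert", "LPAR1"),   # hypothetical DB2 situation, true on LPAR1
        ("MQ_Alert", "LPAR1"),    # hypothetical WebSphere MQ situation, also on LPAR1
        ("DB2_Alert", "LPAR2"),   # only one of the two is true on LPAR2
    ]
    print(hosts_ready_for_action(events, {"DB2_Alert", "MQ_Alert"}))
```

Note that the test says only that both situations are true on the same host at the same time. It says nothing about one causing the other, which matches the caution above.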
Multiple situations, multiple commands, different host
The example in Figure 5 shows a different scenario where the policy is checking for alerts on DB2 on z/OS and Microsoft Windows. z/OS and Windows will have different host names, so the policy uses the ‘logical application group’ correlation option.
Distributing this policy would require a managed system list containing the DB2 on z/OS managed system and the Windows managed system. As in the previous example, the DB2 alert and the Windows alert are independent events.
 Figure 5: A policy example with multiple situations and commands on different hosts
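When the situations live on different hosts, the correlation scope becomes the group of managed systems rather than a single host name. The sketch below is a simplified model of that idea, not the product's correlation logic; the managed system names and situation names are hypothetical.

```python
from typing import Dict, Mapping, Set


def group_is_ready(required_by_system: Mapping[str, str],
                   true_situations: Mapping[str, Set[str]]) -> bool:
    """Check whether each managed system in the list has its required situation true.

    required_by_system: managed system name -> situation that must be true there.
    true_situations:    managed system name -> situations currently true there.
    """
    return all(
        situation in true_situations.get(system, set())
        for system, situation in required_by_system.items()
    )


if __name__ == "__main__":
    # Hypothetical managed system list for the policy in this scenario.
    required: Dict[str, str] = {
        "DB2_ZOS_SYSTEM": "DB2_Alert",
        "WINDOWS_SERVER_1": "Windows_Alert",
    }
    current = {"DB2_ZOS_SYSTEM": {"DB2_Alert"}, "WINDOWS_SERVER_1": set()}
    print(group_is_ready(required, current))
```

A managed system missing from the reported data counts as "not true", which reflects why the situations must be distributed to every system in the list before the policy can complete.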
Common usage of policies
Policies have many excellent uses. They work well in scenarios where a situation alert requires execution of multiple commands, such as issuing a corrective action command, and then a notification command to inform the user that the corrective action was taken.
Policies can also feed information to console automation, in particular where alert information from multiple situations needs to be fed to automation.
One of the most common uses of policies is for passing situation alert information to other alert management technologies, such as passing alerts to SNMP managers or the IBM Tivoli Enterprise Console. Figure 6 shows an example of how policies can be used to send alert information to SNMP.
 Figure 6: Using a policy to forward an SNMP event
You can also use policies to optimize situation and monitoring usage. Policies can start and stop situations, which lets you start higher cost situations only when needed, for example.
In addition, policies can manage situations based on monitoring and management requirements, such as time of day sensitivity, subsystem or application maintenance windows, or normal variations in a monitored workload. By using policies this way, you may avoid some of the overhead generated by continuously running alerts and monitoring.
Using policies to oversee the Tivoli Monitoring infrastructure
Some situation alerts are sensitive to certain times of day or day of week considerations, due to operational or off-hours processing issues. In addition, some issues may be critical during prime time and less critical during off hours. You can reduce monitoring overhead and eliminate unnecessary alerts by running situations only when needed.
Figure 7 shows an example of an ‘Overseer’ policy, which manages situation start and stop. Using an overseer policy can simplify the coding and maintenance of the underlying situations, because the policy will be able to handle the time sensitivity logic, so the situation code doesn’t have to. In addition, the situation’s formula will have increased space available for application attributes.
 Figure 7: An overseer policy can manage situations.
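The time sensitivity logic that an overseer policy takes over from the situations can be sketched as a schedule lookup. This is an illustrative model only: the schedule, the situation names and the "start"/"stop" action labels are hypothetical, and the sketch does not call any product interface.

```python
from datetime import time
from typing import Dict, Set, Tuple

# Hypothetical schedule: situation name -> (window start, window end).
SCHEDULE: Dict[str, Tuple[time, time]] = {
    "Prime_Shift_Response_Time": (time(8, 0), time(18, 0)),
    "Overnight_Batch_Delay": (time(22, 0), time(6, 0)),   # window wraps past midnight
}


def in_window(now: time, start: time, end: time) -> bool:
    """True if 'now' falls in [start, end), allowing windows that cross midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def desired_actions(now: time, running: Set[str]) -> Dict[str, str]:
    """Return the start/stop actions needed to bring situations in line with the schedule."""
    actions: Dict[str, str] = {}
    for situation, (start, end) in SCHEDULE.items():
        should_run = in_window(now, start, end)
        is_running = situation in running
        if should_run and not is_running:
            actions[situation] = "start"
        elif not should_run and is_running:
            actions[situation] = "stop"
    return actions


if __name__ == "__main__":
    print(desired_actions(time(9, 0), {"Overnight_Batch_Delay"}))
```

Keeping the schedule in one place is the design point: the situations themselves contain no time checks, so changing a maintenance window means changing one entry rather than editing every situation formula.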
What policies can and can’t do
Policies extend concepts established with situations and add functionality to the Tivoli Enterprise Portal by expanding its integrated command and control capabilities. While situations remain the starting point for alerts and automation, policies add function and flexibility to situation capabilities, and can be used for basic alert correlation.
The command capabilities of situations and policies are not a substitute for a full-function automation engine, such as IBM Tivoli System Automation for z/OS or IBM Tivoli AF/Operator on z/OS, though policies and situations can feed information to them.
Although policies can be used for basic alert correlation, they aren’t a correlation engine. For high volume, complex event correlation, consider other Tivoli solutions, such as Tivoli Business Service Manager.
Best practices: policies versus situations
For best results, use situations as the primary alert/command mechanism when possible. Situations will typically be easier to deploy than policies, simply because they usually involve less coding and less knowledge of the infrastructure. Policies require a more complete understanding of the monitoring topology and configuration, because although situations can usually run at the agent task level, policies must be executed within the Tivoli Enterprise Management Server infrastructure.
Therefore, if the task can be accomplished with a situation alone, that approach will often be most efficient. Well-crafted situations using appropriate sampling intervals and Boolean logic can be highly effective.
Use policies when the job cannot be accomplished by situations alone, such as in scenarios that require multiple commands. In addition, use policies to optimize the monitoring and alerting environment, such as to activate situations when needed. And use policies when you need to forward alerts to SNMP managers or the Tivoli Enterprise Console.
Best practices employ a ‘keep it simple’ methodology: Consider carefully before deploying large multi-situation policies across large numbers of managed systems. Test carefully to make sure that policies produce the desired outcome. Have a clear understanding of the monitoring topology and requirements when deploying policies.
Understand the subtleties of how policies operate. Policies allow for more sophisticated performance automation scenarios than can be accomplished with situations alone. When used appropriately, policies can provide powerful command and control, and monitoring optimization capabilities.