Real-time log monitoring on Windows
Let's say you have a Windows server (NT, 2000, XP, etc.) application that generates log files, and you want to monitor those logs for particular messages, especially those pertaining to errors. By application we mean any kind of program that runs as a service on Windows: Siebel, PeopleSoft, databases such as Oracle or MSSQL, web servers, or all manner of custom-built integration systems that may support an enterprise. Many applications create log files precisely so they can be monitored, but how would you actually implement monitoring of the log directory in an automated fashion? That is always left to the administrator; Windows has no built-in, automated way to do it.
The most common method is to use some programming language to open the directory you're monitoring, open each file, read through it, and compare the file contents against the error string(s) you're looking for. A good manual way to do this on Windows 2000 is to search the contents of a directory, including searching within the files for the phrase. It can be done on XP too, but it requires indexing the folder in question. However, this kind of searching generally falls into the "after the fact" category of log file checking, similar to web log checking. What if you want to monitor a log file all the time, and be alerted when an event happens, with no manual steps?
Using a scripting/programming language, you could write a program that opens the files and searches them periodically. On UNIX you'd probably use grep or the other built-in utilities. You would also want to do something with the results: output them, send an email, or whatever else your script is written to do. Then you have to worry about scheduling the script, running it in the background, how often it should run, and how to deal with the results. If you have multiple servers and multiple directories, you may have to run it many times on each machine. What if you want to incorporate multiple phrases or errors in the search string? Different scripting languages can handle all of this, but a key problem always crops up: performance. Opening a directory and searching through every file for errors is expensive, and depending on how you implement it, trying to get real-time performance can be very expensive.
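As an illustration of the home-built approach, here is a minimal sketch in Python (the article's scripting example is Perl, but the logic is the same). The directory path and error strings are hypothetical placeholders:

```python
import os
import re
import time

# Hypothetical values -- substitute your own directory and error strings.
LOG_DIR = r"c:\myapp\log"
ERROR_PATTERNS = [
    "Failed at invoking service",
    "Process exited with error",
]
PATTERN = re.compile("|".join(re.escape(p) for p in ERROR_PATTERNS))

def scan_directory(log_dir):
    """Open every file in log_dir and return (filename, line_no, line)
    for each line matching one of the error patterns.  This is the
    expensive full-rescan approach described above."""
    hits = []
    for name in os.listdir(log_dir):
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "r", errors="replace") as f:
            for line_no, line in enumerate(f, 1):
                if PATTERN.search(line):
                    hits.append((name, line_no, line.rstrip("\n")))
    return hits

def monitor(log_dir, interval=60):
    """Poll forever: rescan the whole directory every `interval` seconds."""
    while True:
        for hit in scan_directory(log_dir):
            print("ALERT: %s:%d: %s" % hit)  # or email, run a script, etc.
        time.sleep(interval)
```

Note that every polling pass re-reads every file from the beginning, which is exactly the performance problem described above.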
VA2 was written to handle this. It uses Windows API calls to reduce overhead, and it does not open a whole directory and scan the entire thing every time a log file error is being searched for. Furthermore, it provides several key features if you're looking to monitor log files for any application:
Let's compare the steps needed to monitor a log file using a home-built script versus VA2. We'll use Perl as the example scripting language:
With VA2, there are some necessary steps:
This example shows that you will be looking for the string 'Failed at invoking service' under the application appserver:PREPROD1. If that error string ever appears in the directory you are monitoring, an event will be generated with a level of 0, the type listed in the Event Type field, and the subtype listed in the Event Sub Type field.
You may ask: OK, it is searching for the 'Failed at invoking service' string, but in what directory? The answer lies in the appserver:PREPROD1 software element, which determines what directory is being searched. You can also create new software elements to point at any directory you need monitored.
In this case, the e:\sblppr1752\siebsrvr\log\ directory is being monitored. Notice that you can also monitor an NT service with the click of a button; that is another feature of VA2. With VA2 you can monitor any Windows NT log file in real time and instantly react to search strings by generating events.
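To see why avoiding full rescans matters, contrast the script above with an incremental approach. The sketch below illustrates the general technique (it is not VA2's actual implementation, which is not public): remember a byte offset per file, so each pass reads only data appended since the last check.

```python
import os

class LogTailer:
    """Track how far into each file has already been read, so each
    pass scans only bytes appended since the last check -- the key to
    near-real-time monitoring without rescanning whole files."""

    def __init__(self, log_dir, needle):
        self.log_dir = log_dir   # directory being monitored
        self.needle = needle     # error string to react to
        self.offsets = {}        # path -> byte offset already consumed

    def poll(self):
        """Return (filename, line) for matches appended since last poll."""
        events = []
        for name in os.listdir(self.log_dir):
            path = os.path.join(self.log_dir, name)
            if not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            start = self.offsets.get(path, 0)
            if size > start:     # file grew: read only the new tail
                with open(path, "rb") as f:
                    f.seek(start)
                    for raw in f:
                        line = raw.decode("utf-8", "replace").rstrip("\r\n")
                        if self.needle in line:
                            events.append((name, line))
                    self.offsets[path] = f.tell()
        return events
```

A production tool would also have to handle truncated or rotated files (size shrinking), and would use Windows change notifications rather than timed polling to react the instant a file grows.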
Here is a screenshot of the events once generated:
Although there are some setup steps with VA2, there is also a cost to writing custom scripts. VA2 has the ability to email events when they happen, and to run reaction scripts, either on a remote machine or on a central server. If you're using Siebel, there is a built-in interface to send Siebel Server commands when an event is detected.
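In a home-built script, emailing on an event is one more thing you have to write yourself. A minimal sketch of that piece follows; the SMTP host and addresses are placeholders, not anything VA2 defines:

```python
import smtplib
from email.message import EmailMessage

# Placeholder settings -- substitute your own server and addresses.
SMTP_HOST = "mail.example.com"
FROM_ADDR = "monitor@example.com"
TO_ADDR = "oncall@example.com"

def build_event_mail(app, search_string, log_line):
    """Compose an alert email for a generated event."""
    msg = EmailMessage()
    msg["Subject"] = "Log alert for %s: %s" % (app, search_string)
    msg["From"] = FROM_ADDR
    msg["To"] = TO_ADDR
    msg.set_content("Matched log line:\n%s" % log_line)
    return msg

def send_event_mail(msg):
    """Deliver the alert (requires a reachable SMTP server)."""
    with smtplib.SMTP(SMTP_HOST) as server:
        server.send_message(msg)
```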
Event handling diagram: (more information available at http://recursivetechnology.com/documentation/Tutorial_event_routing.html)
Here is a review of the main benefits of monitoring log files with VA2:
The most difficult part of log file monitoring is knowing what to monitor. Application design can help with that, by defining at what points, and why, you may want to monitor log files. Siebel is a good example; VA2 was initially built to monitor Siebel. The log files Siebel produces are fairly standardized, but there still isn't a definite set of cases where you always want to monitor Siebel log files. The closest candidate is 'Process exited with error', but even that is a message you don't always care about. Often the customization of Siebel, for example for integration processes, results in applications that emit their own customized errors. With good application design, you can define the errors that signal critical situations, and use log file monitoring to check for them.
If you do have a situation where you definitely want to monitor a log file for known strings, it's likely that you'll want infrastructure like VA2 assisting you with the monitoring and handling its results. VA2 is not limited to monitoring Siebel log files; the same infrastructure can be used to monitor multiple custom applications.