Five mistakes of log analysis

October 21, 2004 (Computerworld) As the IT market grows, organizations are deploying more security solutions to guard against the ever-widening threat landscape. All those devices are known to generate copious amounts of audit records and alerts, and many organizations are setting up repeatable log collection and analysis processes.
However, when planning and implementing log collection and analysis infrastructure, the organizations often discover that they aren’t realizing the full promise of such a system. This happens due to some common log-analysis mistakes.
This article covers the typical mistakes organizations make when analyzing audit logs and other security-related records produced by security infrastructure components.
No. 1: Not looking at the logs
Let’s start with an obvious but critical one. While collecting and storing logs is important, it’s only a means to an end — knowing what ‘s going on in your environment and responding to it. Thus, once technology is in place and logs are collected, there needs to be a process of ongoing monitoring and review that hooks into actions and possible escalation.
It’s worthwhile to note that some organizations take a half-step in the right direction: They review logs only after a major incident. This gives them the reactive benefit of log analysis but fails to realize the proactive one — knowing when bad stuff is about to happen.
Looking at logs proactively helps organizations better realize the value of their security infrastructures. For example, many complain that their network intrusion-detection systems (NIDS) don’t give them their money’s worth. A big reason for that is that such systems often produce false alarms, which leads to decreased reliability of their output and an inability to act on it. Comprehensive correlation of NIDS logs with other records such as firewalls logs and server audit trails as well as vulnerability and network service information about the target allow companies to “make NIDS perform” and gain new detection capabilities.

Some organizations also have to look at log files and audit tracks due to regulatory pressure.
No. 2: Storing logs for too short a time
This makes the security team think they have all the logs needed for monitoring and investigation (while saving money on storage hardware) and then leading to the horrible realization after the incident that all logs are gone due to its retention policy. The incident is often discovered a long time after the crime or abuse has been committed.
If cost is critical, the solution is to split the retention into two parts: short-term online storage and long-term off-line storage. For example, archiving old logs on tape allows for cost-effective off-line storage, while still enabling future analysis.
No. 3: Not normalizing logs
What do we mean by “normalization”? It means we can convert the logs into a universal format, containing all the details of the original message but also allowing us to compare and correlate different log data sources such as Unix and Windows logs. Across different application and security solutions, log format confusion reigns: some prefer Simple Network Management Protocol, others favor classic Unix syslog. Proprietary methods are also common.
Lack of a standard logging format leads to companies needing different expertise to analyze the logs. Not all skilled Unix administrators who understand syslog format will be able to make sense out of an obscure Windows event log record, and vice versa.
The situation is even worse with security systems, because people commonly have experience with a limited number of systems and thus will be lost in the log pile spewed out by a different device. As a result, a common format that can encompass all the possible messages from security-related devices is essential for analysis, correlation and, ultimately, for decision-making.
No. 4: Failing to prioritize log records
Assuming that logs are collected, stored for a sufficiently long time and normalized, what else lurks in the muddy sea of log analysis? The logs are there, but where do we start? Should we go for a high-level summary, look at most recent events or something else? The fourth error is not prioritizing log records. Some system analysts may get overwhelmed and give up after trying to chew a king-size chunk of log data without getting any real sense of priority.
Thus, effective prioritization starts from defining a strategy. Answering questions such as “What do we care about most?” “Has this attack succeeded?” and “Has this ever happened before?” helps to formulate it. Consider these questions to help you get started on a prioritization strategy that will ease the burden of gigabytes of log data, collected every day.
No. 5: Looking for only the bad stuff
Even the most advanced and security-conscious organizations can sometimes get tripped up by this pitfall. It’s sneaky and insidious and can severely reduce the value of a log-analysis project. It occurs when an organization is only looking at what it knows is bad.
Indeed, a vast majority of open-source tools and some commercial ones are set up to filter and look for bad log lines, attack signatures and critical events, among other things. For example, Swatch is a classic free log-analysis tool that’s powerful, but only at one thing — looking for defined bad things in log files.
However, to fully realize the value of log data, it needs to be taken to the next level — to log mining. In this step, you can discover things of interest in log files without having any preconceived notion of what you need to find. Some examples include compromised or infected systems, novel attacks, insider abuse and intellectual property theft.
It sounds obvious: How can we be sure we know of all the possible malicious behavior in advance? One option is to list all the known good things and then look for the rest. It sounds like a solution, but such a task is not only onerous, but also thankless. It’s usually even harder to list all the good things than it is to list all the bad things that might happen on a system or network. So many different events occur that weeding out attack traces just by listing all the possibilities is ineffective.
A more intelligent approach is needed. Some of the data mining (also called “knowledge discovery in databases”) and visualization methods actually work on log data with great success. They allow organizations to look for real anomalies in log data, beyond “known bad” and “not known good.”
Avoiding these mistakes will take your log-analysis program to the next level and enhance the value of your company’s security and logging infrastructures.