Ten Reasons Log Data is Not Enough: #5. Who dat installing software?
September 21, 2009
As we continue down our analysis of why log data is not enough, the next issue we discover is installed software. Most malware (at least persistent malware) will do some kind of installation of the malicious code to steal data, which could be sniffing network traffic or keystrokes or account numbers. The list goes on and on. Many log management or security information and event management (SIEM) products will look at logs to figure out if some software was installed.
So that should solve the problem, right? The host log says software was installed and then you’ll know malware has been installed, act decisively and be the hero. Well, not so much. Do you know how many times a day a typical enterprise installs software on its managed devices? Hundreds? Thousands? More? It’s very likely too much for human analysis to figure out. What about those fancy correlation engines that will look for bad software? Hmm. For that to work, the engine needs a list of “good” software – which is always changing. I hope you are good with coding and regular expressions, because you’ll need to build a number of custom rules to make that work in a SIEM product.
The key is to be able to ENFORCE POLICY. As we discussed in Reason #3 about configuration data, the key to reacting faster to emerging threats is to detect something different, anomalous, and not normal. By establishing a set of policies for what software is allowed and then detecting when a device violates that policy, you can reduce the noise of watching every software install.
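To make the policy idea concrete, here is a minimal sketch in Python. It assumes install events have already been collected from each device; the `InstallEvent` type, the `ALLOWED_SOFTWARE` set, and the package names are all made up for illustration and do not reflect how any particular product does this.

```python
from dataclasses import dataclass

# Hypothetical policy: the software allowed on managed devices.
ALLOWED_SOFTWARE = {"openssh-server", "nginx", "python3"}


@dataclass(frozen=True)
class InstallEvent:
    device: str
    package: str


def policy_violations(events, allowed=ALLOWED_SOFTWARE):
    """Return only the install events whose package is not on the allowed list."""
    return [e for e in events if e.package.lower() not in allowed]


# Illustrative events; in practice these would come from asset polling.
events = [
    InstallEvent("web01", "nginx"),
    InstallEvent("web01", "keylogger-x"),
    InstallEvent("db01", "Python3"),
]

violations = policy_violations(events)
for v in violations:
    print(f"policy violation: {v.package} installed on {v.device}")
```

Only the install that breaks policy surfaces, which is the whole point: you look at the exceptions instead of every install.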
All of the anti-virus companies are talking about their shiny new white-listing widget, and justifiably so. Taking a positive security approach (only allowing authorized software to run on managed devices) will definitely reduce the likelihood of infection. So this idea of monitoring for new software installs is sort of a poor-man’s white-listing.
Which is really the point of this entire series. There are many defensive techniques that are required to keep pace with today’s attackers. Just relying on a log management toaster or a SIEM “in a box” is not going to get the results you are looking for. Aggregating, parsing and even correlating on log data is just not enough.
Ten Reasons Log Data is Not Enough: #4. Network Blind Mice
September 17, 2009
As we discussed in the last post in the Ten Reasons Log Data is Not Enough series, configuration data provides an important additional set of information to help pinpoint potential attacks and make sure that in the absence of log data (if logging is turned off, for example) attacks can still be detected.
Network flow data is another data type that can yield important and interesting corroborating data to go Beyond Security Information and Event Management (SIEM) and Log Management. First, what is network flow data? Basically, every network device tracks some simple information about who is talking to whom and what protocols they are using. Cisco’s data is called NetFlow. Juniper has a format called (surprisingly) JFlow and there is a more standard format called cflowd.
Regardless, this network flow data comes in fast and furious, with millions of flow records being generated every second. So scalability is a key requirement if you are planning to analyze and correlate network flows, along with everything else.
Why is being blind to network flows a huge problem for security professionals? Basically, the network sees everything, at one point or another. In the event of an attack, the attacker needs to move data either within the environment or outside of the environment. Typically you wouldn’t see huge amounts of data moving to a server in Eastern Europe. Or an open FTP server in Brazil. Or in a government processing center in China. So these are good indications that something may be a bit funky.
Now to be clear, network flow data is not going to be a definitive answer to the presence of an attack, which is probably why the network behavior analysis (NBA) market never really took off, especially for the security use case. But the data can tell you what isn’t normal and give you some more information to analyze and correlate. It’s really about having another data source to provide additional corroborating evidence to the potential presence of an attack.
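As a rough illustration of "what isn't normal," the sketch below builds a baseline from a known-good period and flags flows that go to a peer never seen before or that are far above the usual volume. The flow tuples `(src, dst, bytes)`, the `factor` threshold, and the addresses are assumptions for the example; real NetFlow records carry many more fields.

```python
from collections import defaultdict


def build_baseline(flows):
    """Sum bytes per (src, dst) pair seen during a known-good period."""
    baseline = defaultdict(int)
    for src, dst, nbytes in flows:
        baseline[(src, dst)] += nbytes
    return dict(baseline)


def unusual_flows(flows, baseline, factor=10):
    """Flag flows to never-seen peers or far above the baseline volume."""
    flagged = []
    for src, dst, nbytes in flows:
        normal = baseline.get((src, dst))
        if normal is None:
            flagged.append((src, dst, nbytes, "new peer"))
        elif nbytes > factor * normal:
            flagged.append((src, dst, nbytes, "volume spike"))
    return flagged


# Illustrative data only (documentation address ranges for the external host).
known_good = [("10.0.0.5", "10.0.0.9", 1000), ("10.0.0.5", "10.0.0.9", 1000)]
today = [
    ("10.0.0.5", "10.0.0.9", 1500),
    ("10.0.0.5", "203.0.113.9", 5_000_000),
    ("10.0.0.5", "10.0.0.9", 50_000),
]

flagged = unusual_flows(today, build_baseline(known_good))
for src, dst, nbytes, reason in flagged:
    print(f"{reason}: {src} -> {dst} ({nbytes} bytes)")
```

A flagged flow is corroborating evidence to correlate with other data, not proof of an attack on its own.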
As a bit of an unplanned benefit, your network operations folks could be very interested in building their own alerts based on network flows because not only can flow data detect attacks, it can also pinpoint network performance issues pretty effectively. So here is yet another reason that log data is not enough, and security professionals need to go Beyond Log Management to keep pace with today’s attacks.
Ten Reasons Log Data is Not Enough: #3. What’s the Configuration, Kenneth?
September 10, 2009
As we resume our series on why Log Data is Not Enough, the 3rd reason we have underscores the importance of configuration data as part of the security analysis. As we’ve repeatedly mentioned, log management systems are driven by log data. And as we showed in Reason #1, logging can be (and usually is) turned off – by savvy attackers anyway.
So how do you detect an attack, if you have no log data to analyze? Basically you need other data sources to figure out what’s happening and that is where configuration data comes in. Every device (whether it’s a firewall, switch, Windows Server, Linux Server, desktops, etc.) has a configuration and you can poll that configuration (with proper authorization) to figure out what’s going on.
Note you have to PULL the config data out of the device. It’s not going to just send it to you (like with log data), so this is actually a big deal to have in a security management platform. It’s a totally different way to gather data and is very hard to do in a scalable fashion with the reliability enterprises demand.
Once you have the configuration baseline, then you can compare new versions of the config to the baseline at a user defined interval. If something changes (like logging is turned off, a new service is turned on, or a registry change happens, for example), it will create an event in the system that can then be used with other data types to determine if it’s really an attack.
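The baseline comparison can be sketched in a few lines of Python. This assumes the polled configuration has already been flattened into a dictionary of settings; the setting names and values below are hypothetical, and a real device config would need parsing first.

```python
def diff_config(baseline, current):
    """Return {setting: (old, new)} for every setting added, removed, or changed."""
    changes = {}
    for key in baseline.keys() | current.keys():
        old, new = baseline.get(key), current.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical settings; None in the output means "not present".
baseline = {"logging": "enabled", "telnet": "disabled", "ssh_port": 22}
polled = {"logging": "disabled", "telnet": "disabled", "ssh_port": 22, "ftp": "enabled"}

changes = diff_config(baseline, polled)
for key in sorted(changes):
    old, new = changes[key]
    print(f"config change: {key}: {old!r} -> {new!r}")
```

Each reported change would become an event to weigh against the other data types, such as logging going quiet at the same moment a new service appears.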
Remember systems relying just on log data can’t do this level of analysis. And those vendors that say they do require customers to buy a totally different product with a totally different management interface. Many of these other folks ONLY track network device configuration as well.
So this is another reason that Log Data is Not Enough, and those folks that know they need to go beyond compliance know they need to go beyond log management.
Ten Reasons Log Data is Not Enough: #2. Partial Regulation Coverage
September 8, 2009
For those organizations looking specifically to check the compliance box, log management is one of the things toward the top of their shopping list. I mean, the product is called out specifically in Requirement 10 of PCI, and is a “best practice” in many other regulations and frameworks.
And a lot of organizations just figure if they only deploy a log manager, and a web application firewall, and a regular firewall, and anti-virus – they’ll be in good shape when the PCI assessor shows up to put the organization through its paces. And depending on the assessor, they may be right.
But to be clear, those thinking that log management = compliance are sorely mistaken. Putting on my master of the obvious hat, log management products are driven by logs (duh!). But the logs can’t tell you if AV is installed on a device and if the signatures are up to date. They can’t tell you how the database device is configured. Logs don’t tell you whether the default passwords have been changed on sensitive devices or whether the firewall policies are in place.
To get answers to those questions, you need to go beyond log data and look at the configuration and asset data of these devices. eIQ is the only security information and event management (SIEM)/log management solution to aggregate and analyze configuration and asset data as part of security analysis. And we don’t stop there, we also look at performance, vulnerability, and network flow data, in addition to logs.
As we continue through the 10 reasons, you’ll hear all about these other data types. But in the meantime, just remember that log data is not enough.
Till next time…
Ten Reasons Log Data is Not Enough: #1. Logging can be turned off
September 3, 2009
Welcome to the latest series here on eIQviews. Over the next 10 days, we’ll discuss a number of reasons that log data is not enough. And no, Bunny (from the movie) will not be making a guest appearance. Sorry to disappoint you folks.
The first of the reasons that log data is not enough is so simple, sometimes you kind of forget about it. Actually, given the amount of time we spend harping on it, I’d hope you don’t forget, but let’s go through it anyway. Log management systems are driven by log data. Security information and event management (SIEM) systems are driven by log data as well. Yes, I know, that’s quite an insight. But one of the first things that even the least savvy attacker is going to do upon compromising a device is to (you ready?) turn logging off.
I know, it can’t be that simple. But in many cases it is. The attacker turns off the logging, does their evil deeds, turns logging back on and the log management and/or SIEM system doesn’t know the difference. Sure, you can set most log management systems to alert if you don’t get logs for a certain amount of time. How long do you think it takes the bad guys to make changes and install malware on a device? Right, not that long.
So unless you have a very short time period defined in that alert (think minutes, not hours), which will create a lot of noise and false positives, you are going to miss the attacker that shuts down logging. So your fancy log management system, which is supposed to make you compliant, isn’t much help.
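To see why the alert window matters, here is a small Python sketch that looks for silent periods in a device's log timestamps. The timestamps and the five-minute default are made up for illustration; a real log manager would run this kind of check continuously rather than over a list.

```python
from datetime import datetime, timedelta


def silent_gaps(timestamps, max_gap=timedelta(minutes=5)):
    """Return (start, end) pairs where consecutive log records are further apart than max_gap."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


# Illustrative record times: logging goes quiet for 12 minutes.
t0 = datetime(2009, 9, 3, 12, 0)
seen = [
    t0,
    t0 + timedelta(minutes=1),
    t0 + timedelta(minutes=13),
    t0 + timedelta(minutes=14),
]

gaps = silent_gaps(seen)                               # 5-minute window catches the gap
missed = silent_gaps(seen, max_gap=timedelta(hours=1)) # 1-hour window sees nothing

print(f"5-minute window: {len(gaps)} gap(s); 1-hour window: {len(missed)} gap(s)")
```

The short window catches the twelve quiet minutes, and the hour-long window misses them entirely. On a real network, the short window will also fire on every device that is simply idle, which is the noise problem described above.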
Then again, we all acknowledge that compliance does not equal security. And neither does log management. Thus, the first reason that log data is not enough.
How long to keep security data?
August 24, 2009
In digging back through some of my bookmark archives, I came across this post from Burton’s Trent Henry about how much (and what kind) of log data you should be storing. Now to level set, Trent is talking specifically about logs and we all know that Log Data is Not Enough, so I’d extend the same conversation to include a broader data set, including configuration, asset, performance, vulnerability and network flow data. Yet the general discussion and concepts are consistent when considering the idea of security data, regardless of how broadly you define that term.
It reminds me of when I was in the anti-spam business and we came across those customers that wanted to keep everything INDEFINITELY. That’s right, there were organizations out there that wanted to keep everything (spam included). I just scratched my head, and that is really Trent’s point here.
Each organization needs to understand what kind of data will be:
- Useful from a security operations standpoint
- Useful from a compliance standpoint.
In dealing with security operations, you need enough data to isolate the root cause of any abnormalities you find in your IT systems.
We also believe this data should be kept for a longer, rather than a shorter amount of time. The reality is with today’s low and slow attacks, a patient adversary may take months to perpetrate an attack. Once you roll over that data or don’t archive it, you can’t get it back. That doesn’t mean you keep stuff indefinitely, but you should be thinking in terms of years, not months.
When thinking about compliance, your assessor will tend to have opinions about what data you need or don’t need. And unfortunately those opinions can vary between assessors (or depending on which way the wind blows). So what enterprises need to do is DOCUMENT their retention policies and be able to defend them.
You can certainly have a difference of opinion with the assessor, but unless you have your data retention policies well-thought out and documented, you don’t have a leg to stand on when the assessor challenges you.
Finally, Trent’s point about the “skeletons in the closet” is exactly right. Every organization has them and hopefully we all have learned the lessons of all the high profile cases where emails provided pretty damning evidence. Just imagine your CEO doing stammer stammer stammer backpedaling during a video deposition. That worked pretty well for Microsoft a couple of times.
So only keep what you definitely need, but that’s only the third decision point – after meeting security ops and compliance data requirements.
Defining SIEM/Log Management “Integration”
July 22, 2009
integrate
verb [ trans. ]
1 combine (two things) so that they become a whole
Based on market dynamics and confirmed with the recent Gartner MQ criteria, there are no longer separate log management and SIEM markets. Thus, every vendor is talking about their “integrated” solution. What’s comical is how many of the players in the market define “integrated.” So before I define our idea of integrated, let’s talk about what integrated is NOT.
- If a vendor requires you buy two different technology hardware platforms, with (at least) two different data stores – it’s not integrated.
- If a vendor requires two platforms, one to collect data at high speed and another to analyze the data because they can’t analyze fast enough – that’s not integrated either.
- If the vendor’s correlation engine is outsourced, acquired, or licensed from another technology vendor, the solution is not integrated.
- If the vendor has totally different interfaces for their SIEM and log management offerings, that’s not integrated by a long shot.
- If the product doesn’t correlate all data because that’s too hard and their product would require a Cray supercomputer to keep pace, which forces a log-only collection layer to capture all that data – it’s not integrated.
- If a product needs to archive data off their platform after 30 days because it slows down the correlation engine, and then forces you to use a separate appliance to do a forensic search of the archived data – you got it, it’s not integrated.
- If the vendor talks about network configuration management, but it’s nothing more than a bolt-on of a failed product they acquired for cents on the dollar – that’s not really integrated either.
- If a vendor talks about an integrated solution, yet their design looks like the schematic of a nuclear reactor – you got it, it’s probably not integrated.
So what does eIQ mean when we say “integrated”?
- Single platform and single data store – SecureVue is one INTEGRATED product. You buy it once, deploy it once and both the SIEM and log management capabilities are built into the platform natively. No separate boxes or different interfaces are required.
- Scalable from the entry level to the largest enterprises – Data collection can happen on the same box or within a multi-tier architecture, with the same level of correlation, reporting, and dashboards. SecureVue is linearly scalable, so there is no need to deploy a front-end logging layer to overcome a dog-slow correlation engine.
- Correlation is done on ALL data – SecureVue uses all data in its correlation analysis; there is no “selective” data forwarding from the logging layer to reduce the amount of data to correlate.
- Reports and Compliance Audits are pulled from ALL data – Similarly some of the competition basically discards data they don’t term as “relevant” for reporting and audit information. SecureVue doesn’t have those limitations, so reports can be pulled on all data collected and archived.
Delivering an integrated system is hard. That’s why most of the vendors out there wave their hands a lot, but don’t really want you to look behind the curtain. Integration requires a single interface, not a cobbled together console with totally different user experiences. Integration requires a purpose-built data store, not your favorite relational database. These folks built on a relational back-end would need a brain transplant to do all the processing required to do integrated SIEM/log management on a single platform. Brain transplants are hard too.
So they don’t DO integration, they just talk integration. They just glue an “integrated” sticker in the front of the multiples of boxes and hope no one really asks what integrated means.
Hopefully Mr. Market is smarter than that.
LogDataisNOTEnough.com
April 15, 2009
Today we launched both a major update to our corporate site (http://www.eiqnetworks.com) as well as a new site called Log Data is Not Enough (http://www.logdataisnotenough.com). LogDataisNOTEnough.com features a funny video about a data breach and its impact. If you like 24, you’ll like the video. There is also a bunch of educational material on the site about dealing with a data breach and the like. Enjoy, and tell your friends about it.
What we learn from Log Data
April 9, 2009
As we continue through our series on Log Management, let’s evaluate the kinds of information that are available from the typical log records we see and what we can learn from them.
First, keep in mind that pretty much everything can generate a log file for just about anything that it does. So one of the first places we’ll look is the firewall. Why? Simply put, the firewall is the first line of defense. So what kind of stuff are we looking for in the firewall logs? Here is a little list:
- Traffic for unused ports – This is a pretty simple one because it’s usually indicative of some attack. After all, why would legitimate traffic be sent using ports you don’t have open? Right, it wouldn’t. So this is typically reconnaissance traffic trying to figure out what holes the attackers can start to exploit.
- Failed logons – This is self-evident as well, given that one sure-fire way to compromise a firewall is to log into it. So brute force attacks are definitely something you’ll see. This is another reason changing the default password and not using a predictable user ID (like “admin”) is critical.
- Outbound connections – Given the prevalent attack vector of compromising a server and then transferring that data outside of the organization, it makes sense to scrutinize strange outbound connections.
Of course there are a bunch of other things to look for as well. And a log management system that is configured to watch for strange results in these logs can save your bacon.
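The three firewall checks listed above can be sketched in Python as follows. This assumes the firewall log has already been parsed into dictionaries; the field names (`event`, `src`, `dst`, `dport`), the port sets, the internal network range, and the failed-logon threshold are all hypothetical and would differ by firewall and by organization.

```python
import ipaddress
from collections import Counter

# Assumptions for the example, not real defaults.
INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")
OPEN_INBOUND_PORTS = {22, 443}
ALLOWED_OUTBOUND_PORTS = {80, 443}
MAX_FAILED_LOGONS = 3


def is_internal(addr):
    return ipaddress.ip_address(addr) in INTERNAL_NET


def review_firewall_log(records):
    """Return findings for unused-port traffic, odd outbound connections, and repeated failed logons."""
    findings = []
    failed = Counter()
    for r in records:
        if r["event"] == "login_failed":
            failed[r["src"]] += 1
        elif r["event"] == "connection":
            inbound = not is_internal(r["src"]) and is_internal(r["dst"])
            outbound = is_internal(r["src"]) and not is_internal(r["dst"])
            if inbound and r["dport"] not in OPEN_INBOUND_PORTS:
                findings.append(("unused port", r["src"], r["dport"]))
            if outbound and r["dport"] not in ALLOWED_OUTBOUND_PORTS:
                findings.append(("odd outbound", r["src"], r["dport"]))
    for src, count in failed.items():
        if count >= MAX_FAILED_LOGONS:
            findings.append(("repeated failed logons", src, count))
    return findings


# Illustrative records only.
records = [
    {"event": "connection", "src": "198.51.100.7", "dst": "10.0.0.5", "dport": 23},
    {"event": "connection", "src": "198.51.100.7", "dst": "10.0.0.5", "dport": 443},
    {"event": "connection", "src": "10.0.0.5", "dst": "203.0.113.9", "dport": 6667},
    {"event": "login_failed", "src": "198.51.100.7"},
    {"event": "login_failed", "src": "198.51.100.7"},
    {"event": "login_failed", "src": "198.51.100.7"},
]

findings = review_firewall_log(records)
for finding in findings:
    print(finding)
```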
Similarly, you can look for information in your intrusion detection/prevention (IPS) device as well. Things to look for here include:
- Attack signatures – since the IPS is firing, we know some kind of rule was violated. The attack type is helpful to understand what you are up against.
- IP addresses – You can learn a lot from the IP addresses of attack packets. Are they spoofed packets with an internal IP address? Do they originate from a web server in your DMZ? These are all clues to some type of successful attack.
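The two IP address questions above lend themselves to a quick sketch using Python's standard `ipaddress` module. The network ranges, the `ingress_interface` labels, and the `classify_alert` function are assumptions for illustration; your address plan and IPS fields will differ.

```python
import ipaddress

# Assumed address plan for the example.
INTERNAL = ipaddress.ip_network("10.0.0.0/8")
DMZ = ipaddress.ip_network("192.0.2.0/24")


def classify_alert(src_ip, ingress_interface):
    """Give a rough reading of an IPS alert based on where its source address sits."""
    addr = ipaddress.ip_address(src_ip)
    if addr in DMZ:
        return "source is a DMZ host: possible compromise"
    if addr in INTERNAL and ingress_interface == "external":
        return "internal address on external interface: likely spoofed"
    return "external source"


print(classify_alert("10.1.2.3", "external"))
print(classify_alert("192.0.2.10", "dmz"))
print(classify_alert("198.51.100.7", "external"))
```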
There are similar nuggets of information in database logs (like connection type, application ID, transaction type, etc.), server and application logs as well. So there is no lack of log data to collect and analyze.
Not even Einstein could make heads or tails of all this log data that is gathered on an ongoing basis, especially given that the typical large enterprise has hundreds of network devices, tons of databases, and thousands of servers. Wading through gigabytes of log data every day isn’t really practical (though it would make for an interesting Dilbert series).
Thus you need to automate the analysis and correlation of these logs. In future posts, I’ll discuss how this needs to work, and why merely correlating log files is inherently limiting. The key takeaway here is that there is a lot of great data in the logs; you just need to know how to leverage it.
The next topic in our Log Management series will discuss the intersection of log data and identity. Stay tuned for more information on log management.
Limitations of Logs
March 30, 2009
As we continue our series on log management (check out: Why do we care about logs anyway?), let’s discuss some of the clear limitations of logs and why we say log data is not enough. The reason we at eIQ continually harp on this concept is that far too many organizations gather their logs and think they are done. Especially those just trying to “check the compliance box.”
There are two main reasons that logs can be somewhat limiting in detecting attacks.
- Logs (by definition) are backwards looking – Logs are great and important, especially for investigations and compliance reporting. But when trying to determine if you are under attack, looking in the rear view mirror can be too late. By the time your logs see it, it’s already happened.
- Logs are really corroborating evidence – Once an attack is launched, there are records of that attack and that is important to isolate the root cause and to eventually remediate the issues.
So what kind of data should we also gather to supplement the information in the logs? From a threat management perspective, there are a number of other important data types.
- Configuration data – most attacks have some impact on the configuration of a device. Maybe it’s a different setting or the opening of a non-standard port. Or turning off logging. Usually there is some kind of trail, unless the devices have some well-known vulnerability that can be exploited.
- Vulnerability data – Vulnerabilities are not a sure path to exploit, but certainly can be. So it’s important to understand what devices are exposed to what, if only to tighten thresholds around specific attacks.
- Asset data – One of the most important pieces of asset data is installed software, because another typical “tell” of an attack underway is new software showing up on a device. This isn’t always indicative of a compromise, but most Trojans and other attacks do involve additional executables on a device.
- Performance data – Understanding if a device is operational and looking for abnormal utilization can be indicative of a compromised device. As with the other data types, performance data by itself is not conclusive, but can certainly be used to define the issues, determine the attack vectors, and understand how critical the issues are.
- Network Flow data – The last data type we’ll mention is network flow data. This is information that comes directly from your routers and switches and provides a lot of information about which devices are talking to one another. Tracking anomalous network traffic can yield clues to attack behavior. For example, if an internal web server is sending data to an external source, it could indicate a problem.
Yet gathering all of these information types is only the first step in threat management. First of all, information in different silos is not really information at all, it’s just data. So all of these disparate data sources must be analyzed and correlated to ensure clear corroboration of the different data types.
Data doesn’t help you understand what you need to investigate and how quickly. And that is what most security professionals really need to understand.
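One simple way to picture correlation across silos is to rank devices by how many different data types are reporting something odd about them. The sketch below is only an illustration of that idea under made-up assumptions: the `(device, data_type)` events, the `min_sources` cutoff, and the ranking rule are hypothetical, and this is not a description of how SecureVue or any other product correlates.

```python
from collections import defaultdict


def prioritize(events, min_sources=2):
    """Rank devices by how many distinct data types reported an anomaly for them."""
    sources = defaultdict(set)
    for device, data_type in events:
        sources[device].add(data_type)
    ranked = sorted(sources.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(dev, sorted(types)) for dev, types in ranked if len(types) >= min_sources]


# Illustrative anomalies, one tuple per (device, data type).
events = [
    ("web01", "config"),
    ("web01", "flow"),
    ("web01", "asset"),
    ("db01", "performance"),
    ("mail01", "log"),
    ("mail01", "vulnerability"),
]

worklist = prioritize(events)
for device, types in worklist:
    print(f"{device}: corroborated by {', '.join(types)}")
```

A device flagged by one data type alone drops off the list, while one flagged by three rises to the top. That gives you a starting point for what to investigate first.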
Fundamentally, log management solutions just gather information. Although broader than a typical log management product, eIQ’s SecureVue focuses on monitoring all of these data types and providing information to help security professionals prioritize their activities. But you’ll still have to deal with the threats.
This goes beyond log management and enters the domains of anti-malware, intrusion prevention, and application control, among other technologies. Knowing what is happening is just one part of the battle (though it’s most interesting to us); doing something about it is a totally different discipline.
The next piece in our log management series will delve into some of the nuggets of information found in log files, and how to use them.



