Information security professionals are beginning to shake off their love affair with big data. Collecting, storing and parsing information is proving to be a prohibitively expensive and daunting endeavour. In our constant security-monitoring efforts, we're discovering that our data gluttony may just be making us fat and stupid, not omniscient as we hoped.
For years, we've been on a quest for more and better tools to process the truckloads of information we're gathering. But this has led us nowhere. We are like Dorothy dancing down a yellow brick road, hoping for Kansas, only to find a fake wizard behind a curtain of smoke and mirrors.
Big data glut in enterprise IT security
Once, during a technical interview, a network security engineer on the panel told me that he always set syslog to debug on the firewalls he managed so that he could have everything, just in case he needed it. These were perimeter devices on a medium-sized enterprise network, but he told me he was accumulating about 30 GB of logs a day. I was flabbergasted, because that's about the same amount of log data I saw when I worked for a financial services provider that was located in three regions with around 400 firewalls.
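To put that comparison in perspective, here is a rough back-of-the-envelope calculation using only the two figures above (about 30 GB a day, around 400 firewalls). It assumes decimal units (1 GB = 1,000 MB) and a constant daily volume, neither of which the anecdote specifies, so treat the results as orders of magnitude rather than measurements.

```python
# Figures taken from the anecdote above; both are approximate.
DAILY_LOG_GB = 30   # roughly what either environment produced per day
FIREWALLS = 400     # size of the financial services estate

MB_PER_GB = 1000    # assumption: decimal units
DAYS_PER_YEAR = 365 # assumption: volume stays constant all year

# Average volume each firewall contributed in the 400-firewall estate.
per_firewall_mb = DAILY_LOG_GB * MB_PER_GB / FIREWALLS

# What 30 GB a day adds up to over a year of retention.
annual_tb = DAILY_LOG_GB * DAYS_PER_YEAR / 1000

print(f"{per_firewall_mb:.0f} MB per firewall per day")
print(f"{annual_tb:.2f} TB per year")
```

Spread across 400 devices, 30 GB a day is a modest per-device figure; coming from the perimeter of a single medium-sized network, it is not.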
The same company's hiring manager told me his team was using a commercial log analysis product licensed by the amount of data consumed and that, as in most organizations, the team was struggling to decide whether to invest further or move to a less costly option, such as open source. I asked whether, beyond the heavy hardware and software investment, he had considered other possible consequences of so much monitoring, such as the increased risk of an outage due to performance issues caused by processing logs during heavy traffic loads or a denial-of-service (DoS) attack. Needless to say, I didn't get the job.
Prior to that job interview, I rode around on my high horse on the topic of advanced event correlation. I wanted every log, every event, with complete visibility, and I didn't want to hear about the cost. If a mosquito landed on a switch in a remote hub-site, I demanded to have it logged with access to advanced analysis tools. Then I had to work in environments with budgets too tight for my flights of security fancy. I had to start getting smarter about what was collected, why and when. I was forced to justify everything that we gathered, really thinking through how much intelligence I could gain from certain types of data. I asked myself, "Can I live without this?"
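One way to make "Can I live without this?" concrete is to filter on syslog severity before anything is stored. The sketch below is illustrative only: it relies on the standard syslog convention that the `<PRI>` prefix equals facility times 8 plus severity (0 is emergency, 7 is debug), while the function names, the warning-and-above threshold and the sample lines are all invented for the example. A real policy would depend on the device and on what you actually intend to analyze.

```python
import re

# Standard syslog severities: 0 = emergency ... 7 = debug.
SEVERITY_NAMES = ["emerg", "alert", "crit", "err",
                  "warning", "notice", "info", "debug"]

_PRI = re.compile(r"^<(\d{1,3})>")


def severity_of(line):
    """Return the severity (0-7) from a line's <PRI> prefix, or None."""
    match = _PRI.match(line)
    if match is None:
        return None
    pri = int(match.group(1))
    if pri > 191:  # highest valid PRI is facility 23, severity 7
        return None
    return pri % 8  # PRI = facility * 8 + severity


def worth_keeping(line, max_severity=4):
    """Hypothetical policy: keep warning (4) and more severe messages.

    Lines without a parseable prefix are kept so that nothing is
    silently discarded.
    """
    severity = severity_of(line)
    return severity is None or severity <= max_severity


# Invented sample lines, not real firewall output.
sample = [
    "<134>fw01 connection built",      # local0.info
    "<135>fw01 packet trace detail",   # local0.debug
    "<131>fw01 interface down",        # local0.err
    "<132>fw01 high session count",    # local0.warning
]

kept = [line for line in sample if worth_keeping(line)]
for line in kept:
    print(SEVERITY_NAMES[severity_of(line)], line)
```

With the default threshold, the debug and informational lines are dropped and only the error and warning lines survive, which is the opposite of setting every device to debug "just in case."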
I began to wonder whether we had become like Connor MacLeod in the movie Highlander. Is there some kind of digital quickening that leads to increasing levels of wisdom with every piece of data we ingest? Or are we just becoming organizationally obese from information we don't have the time or resources to use?
We could have seen the big fail in big data coming
It isn't as if there haven't been plenty of warning signs when it comes to the challenges of dealing with big data. You only need to look at the 2008 economic meltdown. Even with access to data scientists, huge data sets and advanced predictive models, the financial industry's failure was epic. Simply put, the financial world didn't use the information available to prevent the disaster.
And the financial industry is not alone in its inability to make good use of big data. A 2012 SANS Institute survey shows that a whopping 35% of organizations spend "none to a few hours per week" on log analysis. Probably the best example of the excess of data collection is within intelligence organizations, which reportedly gather way more data than is ever analyzed. Even after shattering revelations from Edward Snowden on the extent of NSA surveillance, experts in the intelligence community have indicated that their data-mining efforts are pretty useless in detecting threats.
If the federal government, with legions of contractors and a generous budget, can't get it right, why do we think that the enterprise can do any better? Why do we continue to fall for the big data lie? Maybe because, as author and former derivatives trader Nassim Taleb points out, "Big data may mean more information, but it also means more false information." The failure seems to be a case of confirmation bias: we fall into the trap of finding patterns that match our beliefs, and we want to believe we can predict what is often unpredictable.
What's the message here? We need to manage our expectations, always considering the big cost of big data. We need to remember that in security, just as in Vegas, the odds are usually against us. Solving the problem will not be about a simple analysis tool, but about a paradigm shift in the way we monitor and analyze data.