If you’re a Facebook user, or even if you’re not, you’ve probably noticed the flurry of news about the leak of millions of Facebook user profiles. The story is not new, but it recently resurfaced and ultimately led to a full-page mea culpa from CEO Mark Zuckerberg. What is different about this incident is that the data wasn’t stolen from Facebook, so it wasn’t a data breach. Instead, it was a misuse of data, and the data was toxic precisely because it allowed the firms involved to re-identify individual users. To me that is disturbing, and anyone who still believes in anonymity online should be questioning that belief.
The Facebook “Breach of Trust”
In fact, most of this data leakage occurred as far back as 2014 through 2016, but it was largely unknown and underreported until recently. On TV, Zuckerberg used his 20/20 hindsight to note that it was a “breach of trust, and I’m sorry we didn’t do more at the time.” To me, and probably to a lot of you, these words fall flat and leave a sour taste in the mouth. This “breach of trust” likely not only personally identified more than 50 million US Facebook users, but was also used to carry out psyops against approximately 25% of US registered voters, likely influenced the 2016 US election and the UK Brexit vote, and fed various foreign government intelligence apparatuses. The full uses and impacts may never be publicly known, as Facebook has attempted to quash the story and, of course, those who may have benefited are potentially untraceable.
Ill-gotten booty or ill-booten gotty?
But, you might be asking, how did all this happen, and how could it happen? UK firms such as Cambridge Analytica accessed user data through allowable channels. In fact, every company today collects data provided by its customers, and every company values that data. It’s not the collection and use of data that is disturbing or different. It’s that the harvested data could be “weaponized” by linking it to real people who were, in the case of the US election, real voters. That leap from data analysis to data identification is the problem. Companies like Facebook sell advertising, but showing users ads is a far cry from targeting an individual user because you know how they feel inside and that they will be voting, not buying laundry detergent or eating at a restaurant. Privacy regulations may have been skirted, but it was abuse of the system that allowed it, and likely the passive behavior of the data owners that fueled it. Depraved indifference or willful conduct? That is for the courts and regulators to decide. What you can do is what companies like Facebook should have been doing all along: safeguard your customer data using pervasive data-centric protection. Fortunately, we at comforte live in this space.
comforte (should be) living with your data
At comforte, we talk daily with clients who have large datasets that they need to use and keep accessible. They tell us that the data has to be usable and it has to be safe, and that the lack of built-in data-level security in Hadoop and other big data environments inhibits their ability to complete projects. For many organizations, a “breach of trust” is one they cannot recover from. ‘Trust’ in this case is, financial penalties aside, the trust your customers place in your ability to protect them when they freely give you personally identifying details, bank account information, or PCI data, which in turn is what enables you to monetize that data. Data-centric security lets you protect any data you choose, whether it is at rest, in motion, or in use. It may seem contradictory to allow access to customer data for analytics without the ability to identify an individual customer, but that is one of the core uses of comforte SecurDPS Enterprise. Using our native integrations, you can seamlessly tokenize individual data elements before they reach your Hadoop cluster (either in the source system as the data is extracted or on an edge node as it is ingested) or in a Hadoop job as it processes the data. Since comforte tokenization always gives you a unique token for each value you protect, you can still run your analysis, all without giving internal or external organizations the ability to re-identify any user. This lets you truly unleash the power of the data you need to collect and anonymize it without losing its analytical value. And if you need to re-identify some customer data under scrutinized access, you still can.
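To make the idea concrete, here is a minimal sketch of why consistent tokens keep data useful for analytics. It is not the SecurDPS API or comforte's actual tokenization method: it assumes a keyed hash (HMAC-SHA256) as a stand-in technique, and the names `SECRET_KEY`, `tokenize`, and `TokenVault` are hypothetical. With a keyed hash, token collisions are improbable rather than impossible, so treat this as an illustration only.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret for illustration; a real key would live in a key manager.
SECRET_KEY = b"demo-key-not-for-production"


def tokenize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a deterministic token: the same input always yields the same token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest


class TokenVault:
    """Hypothetical lookup table that maps tokens back to original values."""

    def __init__(self):
        self._store = {}  # token -> original value

    def protect(self, value: str) -> str:
        token = tokenize(value)
        self._store[token] = value
        return token

    def reveal(self, token: str, authorized: bool) -> str:
        # Re-identification is only allowed for explicitly authorized callers.
        if not authorized:
            raise PermissionError("re-identification requires authorization")
        return self._store[token]


vault = TokenVault()
events = [
    ("alice@example.com", "login"),
    ("bob@example.com", "login"),
    ("alice@example.com", "purchase"),
]

# Tokenize the identifying field before the records reach the analytics store.
protected = [(vault.protect(email), action) for email, action in events]

# Analysts can still count events per user without seeing any email address.
events_per_user = Counter(token for token, _ in protected)
```

Because each email always maps to the same token, grouping and counting work on the protected records exactly as they would on the raw ones, while turning a token back into an email requires going through the vault with explicit authorization.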
The key point is that with comforte, you get to choose who accesses your data and when, no matter how large the dataset is. If companies take security seriously and are truly interested in data privacy, data-centric protection is a must. Regulations like GDPR, PCI, and HIPAA/HITECH may help to encourage this behavior, but they only offer punishment and fines for violations. Data-centric protection from comforte, by contrast, is a proactive approach that lets you truly protect customer data and secure your growth using your most precious asset: data. As for Facebook and the long list of other companies with actual data breaches, they are in reactive mode and the damage is done.