Data classification – the first step towards automating data protection

Digitalization is an irreversible trend shaping the future, with countless benefits for people and industries. However, with almost all aspects of life, work and commerce now online, data protection and data security are critical concerns for every industry. Data theft is the objective of most cybercrime, and data breaches have serious ramifications for companies, including loss of customers and revenue, downtime, and damage to brand and reputation. Data protection begins with a thorough understanding of data classification and of what constitutes sensitive data.

What is sensitive data? A simple definition

Sensitive data is any confidential information that must be kept safe and out of reach of unauthorized users. It should be accessible only to those with the relevant permissions.

Data classification and protection

Data is classified into levels from zero to three, based on the extent of the damage it would cause if it became available in the public domain, whether intentionally or unintentionally. Government and non-governmental organizations use different labels for these levels.

Data classification | Government   | Non-government             | Potential adverse impact from a data breach
Class 0             | Unclassified | Public                     | No damage caused
Class 1             | Confidential | Sensitive                  | Some damage caused
Class 2             | Secret       | Private                    | Serious damage caused
Class 3             | Top Secret   | Confidential / Proprietary | Exceptionally grave damage

Data is classified according to the adverse impact it can cause if leaked
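For teams that want to carry these levels into code, the four classes listed above can be sketched as a small lookup. This is only an illustration: the names `DataClass`, `LABELS`, `describe` and `needs_protection` are hypothetical, and the rule that anything above Class 0 needs access controls is an assumed policy, not part of any standard.

```python
from enum import IntEnum


class DataClass(IntEnum):
    """Classification levels 0 to 3 (0 = least sensitive)."""
    CLASS_0 = 0
    CLASS_1 = 1
    CLASS_2 = 2
    CLASS_3 = 3


# (government label, non-government label, potential adverse impact)
LABELS = {
    DataClass.CLASS_0: ("Unclassified", "Public", "No damage caused"),
    DataClass.CLASS_1: ("Confidential", "Sensitive", "Some damage caused"),
    DataClass.CLASS_2: ("Secret", "Private", "Serious damage caused"),
    DataClass.CLASS_3: ("Top Secret", "Confidential / Proprietary",
                        "Exceptionally grave damage"),
}


def describe(level: DataClass, government: bool = False) -> str:
    """Return a readable summary of a classification level."""
    gov_label, non_gov_label, impact = LABELS[level]
    label = gov_label if government else non_gov_label
    return f"Class {int(level)} ({label}): {impact}"


def needs_protection(level: DataClass) -> bool:
    """Assumed policy: everything above Class 0 needs access controls."""
    return level > DataClass.CLASS_0


print(describe(DataClass.CLASS_2))
print(describe(DataClass.CLASS_3, government=True))
```

Because the levels are ordered integers, a policy check such as "protect everything at Class 2 or above" becomes a simple comparison.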

When you hear that a movie was behind the data leak at Sony Pictures Entertainment, it seems, on the face of it, that the data involved should be Class 0. A movie is public information, so what damage could be caused if it was intended for public viewing anyway? However, the 2014 hack of Sony Pictures shows the complexity of data classification and risk. Sony had produced ‘The Interview’, a comedy about two Americans recruited to assassinate North Korean leader Kim Jong Un. The hackers, believed to be working on behalf of North Korea, leaked embarrassing information about employees to the media and eventually demanded that Sony cancel the release of the movie. After theatres received threats and refused to screen it, Sony initially decided not to release the movie, a decision that critics including President Obama opposed as giving in to terror demands. Sony eventually released the film online on OTT (over-the-top) platforms. The incident demonstrated that a hack connected to a movie release could have maximum adverse effect on the movie industry, the general public and foreign policy, even changing notions of warfare.

The leakage of private or personal data is also increasingly governed by regulation in many countries. Examples of data classified as private include anything containing personally identifiable information (PII) or protected health information (PHI). For organizations, leaks of employee or payroll data are considered to have serious consequences. Organizations that violate the General Data Protection Regulation (GDPR) can face fines of up to 20 million euros or 4% of total worldwide annual turnover, whichever is higher.
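The "whichever is higher" rule is easy to misread, so here is a small worked sketch of the upper bound. The function name `gdpr_max_fine` and the sample turnover figures are illustrative assumptions; actual fines are set case by case by regulators and are often far below this ceiling.

```python
def gdpr_max_fine(annual_turnover_eur: float) -> float:
    """Upper bound on the fine: the higher of EUR 20 million
    or 4% of annual turnover."""
    four_percent = annual_turnover_eur * 4 / 100
    return max(20_000_000, four_percent)


# Hypothetical companies: below EUR 500 million turnover the flat
# EUR 20 million ceiling applies, above it the 4% figure takes over.
for turnover in (100_000_000, 500_000_000, 1_000_000_000):
    print(turnover, gdpr_max_fine(turnover))
```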

Addressing human error through data security automation

A joint study by Stanford University professor Jeff Hancock and security firm Tessian found that 88% of data breach incidents are caused by human error. IBM’s Cost of a Data Breach Report estimates the average cost of insider cyber incidents caused by human error, across sectors, at $3.33 million.

We have to assess this problem at two levels: from the user’s perspective, and from the perspective of automated technology solutions and their implementation.

To address the problem of data security, security professionals recommend best practices such as asking users to maintain strong passwords and to use two-factor authentication for email and sensitive data. However, depending on users to follow recommended practices consistently is unreliable.

Deploying additional technology-led security, combined with an expert keeping guard, is the better option. Such ‘Managed Detection and Response’ (MDR) services provide organizations with trained, skilled analysts who use cutting-edge security tools, have access to global threat databases, and can keep track of evolving cyberthreats. Using the latest in SOAR (security orchestration, automation and response), external security providers can streamline incident response workflows, automate data aggregation to assist human and machine-led analysis, and coordinate response actions.

Both MDR and SOAR incorporate the latest automation and endpoint detection and response (EDR) tools. But while such solutions may fully automate the defensive posture that protects the organization, they do not address the security of the data itself. If an attack succeeds despite the best security practices, the data becomes available to hackers with malicious intentions. This is why, in addition to automating security, the data itself needs to be protected through encryption or other means.

Data protection – encryption vs tokenization

Most developers or security experts who provide recommendations on data protection focus on encrypting or hashing sensitive data such as passwords. In such cases, there may be some comfort in knowing that the hacker can see only salted, hashed passwords. On the flip side, though, the hacker still has other PII or private information such as names, addresses, mobile numbers or Social Security numbers (SSNs). DLP (data loss prevention) and CASB (cloud access security broker) solutions protect data from exfiltration to a large extent, but insider threats, or a professional hacker who bypasses all these controls, can still leave your data compromised.
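To make the gap concrete, here is a minimal sketch of the common pattern: the password is stored as a salted one-way hash, while the rest of the record stays readable. It uses PBKDF2 from Python's standard library; the iteration count, the helper names and the field names in `user_record` are assumptions for illustration, not a recommended schema.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune for your own hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 with a random salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(digest, expected)


salt, digest = hash_password("correct horse battery staple")

# Hypothetical user record: only the password is protected.
user_record = {
    "name": "Jane Doe",          # readable if the table leaks
    "phone": "+1-555-0100",      # readable if the table leaks
    "password_salt": salt.hex(),
    "password_hash": digest.hex(),
}
```

If this record were exfiltrated, the attacker would still have to crack the hash to learn the password, but the name and phone number would be exposed as-is.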

Shareholders, senior management and CXOs rely on the standards set by compliance requirements, but these are usually the bare minimum. Measured against the cutting-edge techniques hackers use in the latest breaches and advanced attacks, there is no fully automated solution that focuses on securing the data itself.

Data encryption is a step in the right direction, but it must be implemented so that more than just passwords are protected. Protecting all types of data helps increase trust and manage risk. Encryption uses a key to transform data into an unreadable form until it is decrypted. The drawback is that encryption is reversible: anyone who obtains the key, or who breaks a weak algorithm or implementation, can recover the original data. For this reason, the PCI Security Standards Council and other regulatory bodies generally still treat encrypted data as sensitive data, so a breach of encrypted data could still attract fines for non-compliance.

We at Entersoft recommend and implement tokenization for our customers. Tokenization replaces sensitive data with a non-sensitive equivalent, referred to as a token. When tokenization is properly implemented, a hacker cannot read or use the information even if data is lost. Whereas encrypted data is derived from the original by an algorithm and can be reversed, tokenization replaces the data with randomly generated, non-sensitive values, while the sensitive data itself is stored securely elsewhere. Even if a hacker gets hold of the tokens, they cannot use them, because anyone presenting a token goes through additional security checks before it is swapped for the real data. Plus, unlike encrypted data, the tokens themselves have no intrinsic value and cannot be reversed to reveal the original, which helps meet compliance regulations while providing a cost-effective and highly secure way of protecting organizational data.
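The idea can be sketched in a few lines. This is a toy, in-memory illustration and not a description of any particular product: the class `TokenVault`, the `tok_` prefix and the boolean `caller_is_authorized` flag (standing in for the additional security checks) are all hypothetical. A real vault would be a separate, hardened service with its own access control and audit trail.

```python
import secrets


class TokenVault:
    """Toy in-memory vault mapping random tokens to sensitive values."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # The token is random, so it has no mathematical link to the value.
        token = "tok_" + secrets.token_urlsafe(16)
        self._store[token] = value
        return token

    def detokenize(self, token: str, caller_is_authorized: bool) -> str:
        # Stand-in for the extra security checks made before a swap.
        if not caller_is_authorized:
            raise PermissionError("caller may not detokenize")
        return self._store[token]


vault = TokenVault()
ssn_token = vault.tokenize("078-05-1120")  # sample value, not a real SSN

# Application databases and logs would hold only the token.
print(ssn_token)
print(vault.detokenize(ssn_token, caller_is_authorized=True))
```

The design point is that a stolen token is useless on its own: there is nothing to decrypt, and recovering the original value requires access to the vault.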

Tokenization platforms provide data security and allow businesses to leverage data

End-to-end automated data protection platforms combine data classification and data security, taking into account the unique regulatory and business needs of the organization. Sophisticated platforms can also secure the data while still allowing it to be used to build insights for business or operational purposes. As organizations look to leverage their data for competitive advantage, operational efficiency and savings, data security is paramount. Investing in a secure, mature platform or service to protect data at an organizational level provides a shield against data breaches, reputation loss and non-compliance. Having done this, companies can then freely use the data to power their growth journey.
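As a rough illustration of how tokenized data can still feed analytics, the sketch below assumes a tokenizer that returns the same token whenever the same value is seen again, so records can be grouped without exposing the underlying email addresses. The class `ConsistentTokenizer` and the sample orders are hypothetical. Note the trade-off: consistent tokens reveal which records share a value, which may or may not be acceptable for a given data class.

```python
import secrets
from collections import Counter


class ConsistentTokenizer:
    """Hypothetical tokenizer: a repeated value always gets the same token."""

    def __init__(self) -> None:
        self._token_by_value: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        if value not in self._token_by_value:
            self._token_by_value[value] = "tok_" + secrets.token_hex(8)
        return self._token_by_value[value]


orders = [
    {"email": "ana@example.com", "amount": 40},
    {"email": "bo@example.com", "amount": 15},
    {"email": "ana@example.com", "amount": 25},
]

tokenizer = ConsistentTokenizer()
safe_orders = [
    {"customer": tokenizer.tokenize(o["email"]), "amount": o["amount"]}
    for o in orders
]

# Analytics run on tokens only: total spend per tokenized customer.
spend: Counter = Counter()
for order in safe_orders:
    spend[order["customer"]] += order["amount"]

print(dict(spend))
```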