iData Solution
Software testing is, and always will be, a crucial element of the development and QA process, and it has traditionally been carried out using readily available production data. However, data protection regulations such as GDPR effectively make it illegal to process personal and private data for any purpose not authorized by the individual(s) concerned. Organizations therefore need to find an acceptable solution that keeps them on the right side of the rules, or face the risk of significant financial penalties.
As discussed in previous blogs, a possible solution might be to use one of the data obfuscation techniques such as encryption, tokenization, or data masking. However, whilst these approaches have many practical applications, such as departmental data sharing or protecting financial transactions, not every method gets you off the GDPR hook when it comes to the testing or training environment. They can also come with cost implications and practical usability downsides.
Data Obfuscation is Simple to Reverse Engineer
Data obfuscation typically involves clouding or replacing one or more of the data elements, such as a name or address line, with meaningless fake values of similar length and structure. However, from a GDPR perspective this would still be considered non-compliant because, depending on which approach is used, obfuscated data can often be reverse-engineered back to its original state, disclosing an individual’s personal and private information.
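To make the idea concrete, here is a minimal Python sketch of this kind of masking. The record, the field names, and the `mask_value` helper are all made up for illustration and do not represent any particular masking product. Only the name and address are masked; the other fields are deliberately left untouched, as is common in practice.

```python
import random
import string


def mask_value(value: str, rng: random.Random) -> str:
    """Swap every letter and digit for a random one of the same kind,
    keeping length, letter case, spaces, and punctuation."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(out)


# Hypothetical customer record (not real data).
record = {
    "name": "Jane Smith",
    "address": "12 High Street",
    "postcode": "AB1 2CD",
    "birth_year": 1984,
}

MASKED_FIELDS = ("name", "address")  # assumption: only these fields are masked
rng = random.Random(0)
masked = {
    key: (mask_value(value, rng) if key in MASKED_FIELDS else value)
    for key, value in record.items()
}
print(masked)
```

The masked name and address are meaningless, but the postcode and birth year survive intact, and it is exactly these leftover fields that make re-identification possible.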
Reverse engineering of obfuscated data has proven to be relatively simple because not all the data elements are obfuscated, leaving multiple clues and signatures in the data that can be traced back to real people. A 2019 study by researchers in Belgium and the UK developed a statistical model estimating that nearly every real person (99.98% of Americans) could be correctly re-identified in any anonymized dataset using just 15 demographic attributes. Another study, covering 3 months of credit card metadata for 1.1 million people, found that 90% of individuals could be re-identified.
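The underlying weakness can be illustrated with a short Python sketch. This is not the algorithm from either study, just a simple uniqueness count over made-up rows: it measures what fraction of records is the only one with a given combination of unmasked attributes. The field names and the `unique_fraction` helper are assumptions for illustration.

```python
from collections import Counter

# Hypothetical rows after masking: names are scrambled, other fields are not.
rows = [
    {"name": "Xqzt Lmnop", "postcode": "AB1", "birth_year": 1984, "gender": "F"},
    {"name": "Bvcd Wert", "postcode": "AB1", "birth_year": 1984, "gender": "F"},
    {"name": "Hjkl Yuiop", "postcode": "AB1", "birth_year": 1990, "gender": "M"},
    {"name": "Mnbv Asdfg", "postcode": "CD2", "birth_year": 1984, "gender": "M"},
]


def unique_fraction(rows, keys):
    """Fraction of rows whose combination of `keys` appears exactly once."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    unique = sum(1 for row in rows if counts[tuple(row[k] for k in keys)] == 1)
    return unique / len(rows)


one_attribute = unique_fraction(rows, ("gender",))
three_attributes = unique_fraction(rows, ("postcode", "birth_year", "gender"))
print(one_attribute, three_attributes)
```

Each extra attribute splits the data into smaller groups, so more records end up alone in their group. A record that is unique on its unmasked attributes can be matched against any outside source that holds the same attributes alongside a real name.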
For obfuscated data to stand a chance of being GDPR compliant, it needs to be strictly protected and audited in the same way as the source data. This can be resource-intensive and potentially expensive to implement, and it can also leave other stakeholders open to liability issues.
Encryption is another option and can be a much stronger obfuscation technique, because the data cannot be read without the encryption key. But unless the key is destroyed after use, encrypted data would still not be considered GDPR compliant for test purposes, since anyone holding the key can restore the original values. And without the key, the encrypted data is effectively useless in a test environment.
Synthetic Data offers a Secure and Scalable Alternative
An alternative approach is to use Synthetic Data as a replacement for production data. Because all the source data is completely replaced with randomly generated values, the result does not fall under the normal GDPR rules and can be used freely in the test environment without limiting or compromising the testing processes.
Synthetic data generation uses the raw production data to produce a completely new dataset that has the same characteristics, attributes, and predictive potential as the original dataset. This makes it statistically indistinguishable from the real data, and because it is entirely fictitious it cannot be linked to any real people, which means there is no risk of a breach of data privacy or accidental disclosure to a non-authorized third party.
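As a rough illustration of the principle, the Python sketch below learns a simple model of each column from a small made-up source table and then samples brand-new rows from those models. It is a deliberately naive sketch, not a description of any specific product: it treats each column independently, so it preserves per-column statistics but not the relationships between columns, which real synthetic data generators also aim to keep. All names and values are hypothetical.

```python
import random
import statistics


def fit_column(values):
    """Learn a simple model of one column: mean and spread for numbers,
    category frequencies for everything else."""
    if all(isinstance(v, (int, float)) for v in values):
        return ("numeric", statistics.mean(values), statistics.pstdev(values))
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return ("categorical", list(counts), list(counts.values()))


def sample_column(model, n, rng):
    """Draw n new values from a fitted column model."""
    if model[0] == "numeric":
        _, mean, spread = model
        return [rng.gauss(mean, spread) for _ in range(n)]
    _, categories, weights = model
    return rng.choices(categories, weights=weights, k=n)


def synthesize(rows, n, seed=0):
    """Generate n synthetic rows; no source row is copied as a whole."""
    rng = random.Random(seed)
    columns = {
        key: sample_column(fit_column([row[key] for row in rows]), n, rng)
        for key in rows[0]
    }
    return [{key: columns[key][i] for key in columns} for i in range(n)]


# Hypothetical source table (not real data).
source = [
    {"age": 34, "plan": "basic"},
    {"age": 45, "plan": "premium"},
    {"age": 29, "plan": "basic"},
    {"age": 52, "plan": "basic"},
]

synthetic = synthesize(source, n=1000)
print(synthetic[:3])
```

Note that the number of generated rows is independent of the size of the source table: four source rows produce a thousand synthetic ones here, and the same call could produce millions.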
Synthetic data also has additional benefits in the testing environment. As well as removing the issue of protecting data privacy, synthetic data is highly scalable: it can be generated in unlimited quantities to enable wider test innovation. System stress testing is one example, where large synthetic datasets make it possible to measure performance under extreme production workloads, helping to future-proof your corporate systems.
Which form of Data Masking is Right for my Organization?
All companies have different needs when it comes to masking or obfuscating their datasets, as much depends on their specific data usage. If you are not sure that your approach to maintaining data security and privacy is fully compliant with GDPR or other data protection regulations, or just want reassurance that you are doing the right thing, a call with one of our data management experts will provide you with the answers you need, help to minimize the risk of a data breach, and help you avoid a hefty financial hit.