The amount of data that is generated every day is tremendous. According to the International Business Machines Corporation (IBM), 90% of the data in the world today has been created in the last two years. Different data sources such as instant messaging, transactions, sensors, apps, click streams, search queries, and social media, just to name a few, have created the concept of big data.
The advent of big data has brought new challenges for organisations. One of those challenges is managing the privacy and security concerns related to the personal data that is uploaded to the cloud every day. A clear example is the information that people upload daily to social media platforms such as Facebook, Snapchat or Instagram. Facebook alone grew from 2 billion to over 6 billion uploaded pictures between 2010 and 2012. The uploaded data is gathered, processed, stored and analysed by a variety of organisations for different purposes.
Making this data open and accessible raises important privacy concerns. One of the most significant concerns during data analysis is identity disclosure.
The main problem with identity disclosure is that one can identify people by putting together data that comes from multiple data sources, which is not considered a difficult task. Professor Latanya Sweeney explains that 85% of the United States population is exposed to the risk of being individually identified just by knowing their gender, zip code and date of birth. Not surprisingly, this information is more than likely to be found in public data sources such as social media and public records.
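To illustrate how little effort such linking can take, the following is a minimal sketch in Python, not a description of any real attack or dataset. All records, names and field names are hypothetical. It assumes one dataset from which names have been removed but gender, zip code and date of birth were kept, and a second, public dataset that contains the same three fields alongside names.

```python
from collections import defaultdict

# Hypothetical "anonymised" dataset: names removed, but gender,
# zip code and date of birth are still present.
anonymised_records = [
    {"gender": "F", "zip": "10001", "dob": "1965-07-22", "diagnosis": "asthma"},
    {"gender": "M", "zip": "10002", "dob": "1980-01-15", "diagnosis": "diabetes"},
    {"gender": "F", "zip": "10002", "dob": "1990-03-02", "diagnosis": "migraine"},
]

# Hypothetical public dataset (for example a public register or profile data).
public_records = [
    {"name": "Alice Example", "gender": "F", "zip": "10001", "dob": "1965-07-22"},
    {"name": "Bob Example", "gender": "M", "zip": "10002", "dob": "1980-01-15"},
    {"name": "Carol Example", "gender": "F", "zip": "10002", "dob": "1975-11-30"},
    {"name": "Dan Example", "gender": "M", "zip": "10002", "dob": "1980-01-15"},
]

# The three attributes mentioned in the text.
QUASI_IDENTIFIERS = ("gender", "zip", "dob")


def key(record):
    """Build the combined (gender, zip, dob) key for a record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(anonymised, public):
    """Return (name, diagnosis) pairs where the key matches exactly one person."""
    names_by_key = defaultdict(list)
    for person in public:
        names_by_key[key(person)].append(person["name"])

    matches = []
    for record in anonymised:
        names = names_by_key.get(key(record), [])
        # Only an unambiguous match counts as a re-identification.
        if len(names) == 1:
            matches.append((names[0], record["diagnosis"]))
    return matches


reidentified = link(anonymised_records, public_records)
print(reidentified)
```

In this made-up data, only the first record is linked to a single name. The second record matches two people with the same gender, zip code and date of birth, so it stays ambiguous, and the third has no counterpart in the public dataset.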
Identity disclosure is just one of many privacy concerns. A team of researchers led by Professor Xindong Wu has identified other privacy issues related to the analysis of big data. One example is the need to prevent third parties from accessing information that reveals users’ “data access patterns”, that is, how, when, and from where a user goes online.
Although location-based services need access to users’ geo-location to produce accurate results, there can be serious privacy consequences if someone tracks users’ movements over a long period and stores them in a publicly accessible database. Because of these and other privacy issues, the European Union has decided to take action and has put new rules on the table. Any company that wants to use European Union citizens’ personal data must comply with the new legislation.
General Data Protection Regulation
The new General Data Protection Regulation (GDPR) is designed to: “[…] harmonize data privacy laws across Europe, to protect and empower all EU citizens data privacy and to reshape the way organizations across the region approach data privacy.”
This regulation will become enforceable on the 25th of May 2018. It builds on its predecessor, the old data protection directive. The GDPR includes some important modifications, such as an extended jurisdiction: every organisation that processes the personal data of citizens within the EU must comply with the GDPR, regardless of where the organisation is located, or face fines.
Other important adjustments include stronger conditions for consent and the right to be informed of a data breach that puts one’s personal data at risk, with organisations required to report such breaches within 72 hours of becoming aware of them. Additionally, any citizen has the right to be "forgotten" from any database if the right conditions are met, as well as the right to obtain the personal data they have provided in a commonly used and machine-readable format. Finally, there is the adoption of “privacy by design”, which means that data protection should be built in from the start when designing new systems rather than added later.
Although the GDPR looks promising for the future of data protection regulation, it has received some negative feedback. It should be noted that the GDPR is not a silver bullet that will solve every issue surrounding privacy. Still, it is important to acknowledge its efforts to address its predecessor’s flaws. One of the main flaws of the old data protection directive was that it did not protect citizens’ data from re-identification.
The old data protection directive held that once personal data is anonymised, it no longer counts as personal data and the directive therefore no longer applies. The GDPR has almost the same definition of personal data; the difference is that it adds location data and online identifiers to what counts as personally identifiable information, which is not a major change.
What the GDPR proposes is to limit the extent to which pseudonymised data may be used. Although this is an improvement, an important weakness remains. It has already been shown that almost no data is completely anonymised: in the right context and with the right additional data, even supposedly anonymous data can be used to re-identify individuals.
In summary, the GDPR represents an important step for data protection regulation. However, there are still some weaknesses that must be taken into consideration. Any company working with personal data must understand the details of the GDPR in order to be ready when it comes into force.