Fairness, Accountability & Transparency (F.Acc.T) under GDPR

Why the need for Regulation?

Algorithmic decisions are already crucially affecting our lives. Over the last few years, headlines like the ones listed below have become more and more common:
  • “There’s software used across the country to predict future criminals. And it’s biased against blacks”, 2016 [source: ProPublica].
  • “Amazon reportedly scraps internal AI recruiting tool that was biased against women”, 2018 [source: The Verge]
  • “Apple Card is being investigated over claims it gives women lower credit limits”, 2019 [source: MIT Technology Review]
  • “Is an Algorithm Less Racist Than a Loan Officer?”, 2020 [source: NY Times]
The above indicate why we are seeing a rise in new regulations and laws that aim to control, to some extent, the growing power of algorithms. One of the first and most well-known is the General Data Protection Regulation (GDPR), introduced by the European Parliament and the Council of the European Union in May 2016.
In this article, we will focus on how the properties of Fairness, Accountability and Transparency (better known as F.Acc.T) are reflected in the GDPR and we will examine the scope of the restrictions they impose on the public and private sector.

Principles of GDPR

Article 5 GDPR lists the following principles: lawfulness, fairness and transparency; purpose limitation; data minimisation; accuracy; storage limitation; integrity and confidentiality; and accountability.
While the F.Acc.T properties are clearly listed among these principles, we need to take a deeper look into the GDPR text to understand what these terms actually mean.

Transparency under the GDPR

Transparency has been a matter of debate among legal scholars, since it is not entirely clear whether the GDPR contains a right to explanation of automated decisions. For example, the following two papers [3, 4] from the same journal have almost contradictory titles (see Image 1).
Image 1: Contradicting papers [3, 4] about the right to explanation
The epicentre of this debate is Article 22 (3) GDPR, where it is stated that: “[…] the data controller shall implement suitable measures to safeguard the data subject’s rights and freedoms and legitimate interests, at least the right to obtain human intervention on the part of the controller, to express his or her point of view and to contest the decision”. Here, it is not explicitly stated whether a right to an explanation exists.
However, in Recital 71 GDPR, it is stated categorically that (inter alia) this right exists: “[…] such processing should be subject to suitable safeguards, which should include specific information to the data subject and the right to obtain human intervention, to express his or her point of view, to obtain an explanation of the decision reached after such assessment and to challenge the decision.”
Since only the articles of the GDPR are binding, and not the recitals (which provide context and greater depth of meaning to the articles), most scholars [3, 4, 5] believe that a right to an explanation does not follow from the GDPR.
There is another point of contention: Article 15 (1)(h) states that the data controller should provide the data subject with “meaningful information about the logic involved”.
  • Some scholars [3, 4] interpret “meaningful information” as referring only to the model’s general structure, so explanations for individual predictions are not required.
  • Other scholars [5] believe that, for the information to be meaningful, it must allow data subjects to exercise the rights defined in Article 22 (3), namely the right to “express his or her point of view and to contest the decision”. Since explanations provide this ability, it is argued that they must be presented.
Loophole in the system
However, even in the best-case scenario, where everyone agrees that a right to explanations exists in the GDPR, the regulation states that it refers only to “decisions based solely on automated processing” (Article 22 (1)). This means that any kind of human intervention in the decision exempts the controller from the obligation to provide explanations. For example, if a creditor uses an automated credit scoring system only in an advisory capacity and takes the final decision on the credit application themselves, then the GDPR does not require the system to provide explanations either to the creditor or to the applicant.
Code4Thought’s stance
We believe that explanations are essential to ensure transparency and trust across all kinds of stakeholders of an algorithmic system. In our experience, users of advisory automated systems are prone to following the system’s decision, even when that decision does not make much sense. From faithfully following erroneous Google Maps directions to judges using racially biased risk assessment tools, the immense power of today’s algorithmic systems tends to override people’s capacity for critical thinking and take its place.
So, regardless of human intervention, we firmly believe that when algorithmic systems affect real people, explanations are necessary, as they establish trust between the system, the data controllers and the data subjects. Moreover, they allow stakeholders to scrutinize the algorithms for systematic bias, consequently increasing fairness. Last but not least, understanding the algorithm’s reasoning can be very helpful for debugging and tuning purposes, while in other cases (e.g. medical diagnosis) it is imperative.
Artificial Intelligence (AI) as a “black-box”
The intricate and obscure inner structure of the most common AI models (e.g. deep neural nets, ensemble methods, GANs, etc.) forces us, more often than not, to treat them as “black boxes”, that is, to accept their predictions on a no-questions-asked basis. As mentioned above, this can lead to erroneous behavior of the models, which may be caused by undesired patterns in our data. Explanations help us detect such patterns: Image 2 shows a photo of a man (B.J. Armstrong, lead singer of “Green Day”) who was misclassified as “Female” by a gender recognition “black-box” model. The explanation shows that the (red) pixels representing the make-up around his eyes contributed heavily towards the “Female” prediction, suggesting that our model chose the gender of a picture solely by detecting make-up or the absence of a moustache. This discovery would motivate us to take a closer look at our training data and update it.
Image 2: Explanation of a false prediction by a gender recognition model (photo from the CelebA dataset)
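To make the above concrete, here is a minimal sketch of how such a pixel-level explanation could be produced with the open-source LIME library. The gender model (predict_proba) and the input image are hypothetical placeholders, so the snippet only illustrates the workflow, not our actual tooling.

# Minimal sketch: pixel-level explanation of one image prediction with LIME.
# Assumptions: predict_proba stands in for a gender-recognition model that
# maps a batch of H x W x 3 images to class probabilities, and the random
# image stands in for the photo shown in Image 2.
import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

def predict_proba(images):
    # Placeholder for the real "black-box" model (e.g. a CNN's predict()).
    rng = np.random.default_rng(0)
    return rng.dirichlet([1.0, 1.0], size=len(images))  # columns: [Male, Female]

image = np.random.rand(224, 224, 3)  # stand-in for the input photo

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image, predict_proba, top_labels=2, hide_color=0, num_samples=1000
)

# Highlight the superpixels that pushed the prediction towards the top class;
# in the Image 2 example these would be the make-up regions around the eyes.
overlay, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=False, num_features=10, hide_rest=False
)
highlighted = mark_boundaries(overlay, mask)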
Explainability methods
There are various explainability methods and Image 3 presents a taxonomy of them. One of Pythia’s transparency tools is MASHAP, a model-agnostic method that outputs feature summaries in both local and global scope (a sketch of this kind of output, using an open-source library, follows Image 3).
Image 3: Taxonomy of explainability methods
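MASHAP itself ships with Pythia and its API is not reproduced here. Purely as an illustration of what a model-agnostic, local-and-global feature-attribution method produces, the sketch below uses the open-source shap library’s KernelExplainer on a scikit-learn model; the dataset and model are arbitrary stand-ins.

# Sketch only: model-agnostic feature attributions with the open-source
# shap library. This is NOT MASHAP's API; it merely illustrates local
# (single prediction) and global (whole dataset) feature summaries of a
# "black-box" model.
import numpy as np
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier

data = load_breast_cancer()
X, y = data.data, data.target
model = GradientBoostingClassifier().fit(X, y)   # treated as a "black box"

def predict_positive(inputs):
    # Probability of the positive class, as returned by the underlying model.
    return model.predict_proba(inputs)[:, 1]

background = shap.sample(X, 100)                 # reference data for the explainer
explainer = shap.KernelExplainer(predict_positive, background)

# Local scope: why did the model score this single instance the way it did?
local_attributions = explainer.shap_values(X[:1], nsamples=200)

# Global scope: aggregate attributions over many instances to see which
# features matter most overall.
sample_attributions = explainer.shap_values(X[:50], nsamples=200)
global_importance = np.abs(sample_attributions).mean(axis=0)
print(dict(zip(data.feature_names, np.round(global_importance, 4))))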
Fairness under the GDPR
Fairness might be the most difficult principle to define, since determining what is fair is a very subjective process that varies from culture to culture. Ask ten different philosophers what defines fairness and you would probably get ten different answers.
Even the translation of the word “fair” into various European languages is not straightforward: two or three different words are used, each carrying slightly different nuances. For example, in the German text, across different articles and recitals, “fair” is translated as “nach Treu und Glauben”, “faire” or “gerecht”.
In the context of GDPR, scholars believe that fairness can have many possible nuances, such as non-discrimination, fair balancing, procedural fairness, etc. [1, 2]. Fairness as non-discrimination refers to the elimination of “[…] discriminatory effects on natural persons on the basis of racial or ethnic origin, political opinion, religion or beliefs, trade union membership, genetic or health status or sexual orientation, or processing that results in measures having such an effect” as defined in Recital 71 GDPR. Procedural fairness is linked with timeliness, transparency and burden of care by data controllers [1]. Fair balancing is based on proportionality between interests and necessity of purposes [1, 2].
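As a rough illustration of what “discriminatory effects” can look like in practice (the GDPR itself prescribes no metric), the sketch below computes a simple group-fairness check, the disparate-impact ratio between two groups, on hypothetical credit decisions. The data and the 0.8 threshold (the “four-fifths rule” from US employment law) are assumptions for demonstration only.

# Illustrative sketch: a simple group-fairness check on hypothetical data.
# The decisions, group labels and the 0.8 threshold are assumptions made
# for demonstration; the GDPR does not define any numeric fairness metric.
import numpy as np

decisions = np.array([1, 1, 1, 1, 0, 1, 0, 0, 1, 0])   # 1 = credit approved
group = np.array(["A", "A", "A", "A", "A",
                  "B", "B", "B", "B", "B"])             # protected attribute

rate_a = decisions[group == "A"].mean()  # approval rate for group A (0.80)
rate_b = decisions[group == "B"].mean()  # approval rate for group B (0.40)
disparate_impact = min(rate_a, rate_b) / max(rate_a, rate_b)

print(f"approval rates: A={rate_a:.2f}, B={rate_b:.2f}, ratio={disparate_impact:.2f}")
if disparate_impact < 0.8:
    print("Possible adverse impact on one group (ratio below the four-fifths rule).")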