Publications
2024
- Constructing Capabilities: The Politics of Testing Infrastructures for Generative AI. Gabriel Grill. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 2024
The advertised and perceived capabilities of generative AI products like ChatGPT have recently stimulated considerable investment and discourse surrounding their potential to aid and replace work. The prominence of these systems, and their promise to be general-purpose, has resulted in an avalanche of tests to discover and certify their capabilities. This new testing regime is concerned with creating ever-more tasks for generative AI products instead of testing a model for one specialized task. Beyond efforts to understand products’ capabilities, the construction of tasks and corresponding tests is also a performative enactment meant to convince others and thus to gain attention, scientific legitimacy, and investment. The current market concentration of a few big AI companies points to a concerning conflict of interest: those with a vested interest in the success of the technology also have control over globalized testing infrastructures and thereby the exclusive means to create extensive knowledge claims about these systems. In this paper, I theorize capabilities as contested constructions and situated accomplishments shaped by power imbalances. I further unpack the globalized testing infrastructures involved in the construction and stabilization of generative AI products’ capabilities. Furthermore, I discuss how the testing of these AI models and products is externalized, extracting value from the unpaid or underpaid labor of researcher and developer communities, content creators, subcontractors, and users. Lastly, I discuss a reflexive and critical approach to testing that challenges depoliticization and seeks to produce lasting critiques that serve more emancipatory goals.
- Reparations of the Horse? Algorithmic Reparation and Overspecialized Remedies. Colin Doyle, Melissa Alvarez-Garcia, Pelle Tracey, Gabriel Grill, Cedric Whitney, and 1 more author. In Big Data & Society, 2024
In his seminal article, “Cyberspace and the Law of the Horse,” Frank Easterbrook criticized the scholarly trend of developing overspecialized legal approaches to emerging technologies. Easterbrook argued that these approaches are confusing, shallow, and superfluous. Algorithmic reparation has emerged as a framework for addressing algorithmic systems’ role in inequity and injustice. One understanding of algorithmic reparation is as a method for repairing algorithmic harms. This article examines how this understanding fares against the “law of the horse” critique by posing two questions. First, is algorithmic reparation overspecialized in its methods? Second, is algorithmic reparation overspecialized in the harm it targets? If its methods are too particularized, then algorithmic reparation will only work within a narrow range of circumstances and may undercut a more robust conception of remedies for algorithmic injustice. If the harm it targets is too particularized, then algorithmic reparation will result in incomplete or misguided redress of harms. We determine that algorithmic reparation is not too specific in its methods by demonstrating how, under algorithmic reparation principles, existing methods for reparations can be applied to address algorithmic harm. We also determine that algorithmic reparation can sometimes be too narrow in the harm it targets, which can reduce its effectiveness. When an algorithmic system is both necessary and sufficient for a harm to occur, algorithmic reparation is an effective method of redress. But when an algorithmic system is not both necessary and sufficient for a given harm, algorithmic reparation may be incomplete, only temporarily effective, or miss the mark entirely.
2023
- Bias as Boundary Object: Unpacking the Politics of an Austerity Algorithm Using Bias Frameworks. Gabriel Grill, Fabian Fischer, and Florian Cech. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 2023
Whether bias is an appropriate lens for analysis and critique remains a subject of debate among scholars. This paper contributes to this conversation by unpacking the use of bias in a critical analysis of a controversial austerity algorithm introduced by the Austrian public employment service in 2018. It was envisioned to classify the unemployed into three risk categories based on predicted prospects for re-employment. The system promised to increase the efficiency and effectiveness of counseling while objectifying a new austerity-driven scheme for allocating support measures. This approach was intended to cut spending on those deemed at the highest risk of long-term unemployment. Our in-depth analysis, based on internal documentation not available to the public, systematically traces and categorizes various problematic biases to illustrate harms to job seekers and challenge the promises used to justify the adoption of the system. The classification is guided by a long-established bias framework for computer systems developed by Friedman and Nissenbaum, which provides three sensitizing basic categories. In our analysis, we identified “technical biases,” like issues around measurement, rigidity, and coarseness of variables; “emergent biases,” such as disruptive events that change the labor market; and, finally, “preexisting biases,” like the use of variables that act as proxies for inequality. Grounded in our case study, we argue that articulated biases can be strategically used as boundary objects to enable different actors to critically debate and challenge problematic systems without prior consensus building. We unpack the benefits and risks of using bias classification frameworks to guide analysis. Such frameworks have recently received increased scholarly attention and thereby may influence the identification and construction of biases. By comparing four bias frameworks and drawing on our case study, we illustrate how they are political, prioritizing certain aspects in analysis while disregarding others. Furthermore, we discuss how they vary in their granularity and how this can influence analysis. We also problematize how these frameworks tend to favor explanations for bias that center the algorithm instead of social structures. We discuss several recommendations to make bias analyses more emancipatory, arguing that biases should be seen as starting points for reflection on harmful impacts, questioning the framing imposed by the imagined “unbiased” center that the bias is supposed to distort, and seeking out deeper explanations and histories that also center bigger social structures, power dynamics, and marginalized perspectives. Finally, we reflect on the risk that these frameworks may stabilize problematic notions of bias, for example, when they become a standard or enshrined in law.
- Online Harassment in Majority Contexts: Examining Harms and Remedies across Countries. Sarita Schoenebeck, Amna Batool, Giang Do, Sylvia Darling, Gabriel Grill, and 4 more authors. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 2023
Online harassment is a global problem. This article examines perceptions of harm and preferences for remedies associated with online harassment among nearly 4,000 participants in 14 countries around the world. The countries in this work reflect a range of identities and values, with a focus on those outside of North American and European contexts. Results show that perceptions of harm are higher in all countries studied than in the United States. Non-consensual sharing of sexual photos is consistently rated as harmful in all countries, while insults and rumors are perceived as more harmful in non-U.S. countries, especially harm to family reputation. Lower trust in other people and a lower sense of safety in one’s neighborhood correlate with increased perceptions of the harm of online harassment. In terms of remedies, participants in most countries prefer monetary compensation, apologies, and publicly revealing offenders’ identities more than U.S. participants do. Social media platform design and policy must consider regional values and norms, which may depart from U.S.-centric approaches.
2022
- Constructing Certainty in Machine Learning: On the Performativity of Testing and Its Hold on the Future. Gabriel Grill. In OSF Preprints, 2022
The use of opaque machine learning algorithms is often justified by their accuracy. For example, IBM has advertised its algorithms as being able to predict when workers will quit with 95% accuracy, an EU research project on lie detection in border control has reported 75% accuracy, and researchers have claimed to be able to deduce sexual orientation from face images with 91% accuracy. Such performance numbers are, on the one hand, used to make sense of the functioning of opaque algorithms and promise to quantify the quality of algorithmic predictions. On the other hand, they are also performative, rhetorical, and meant to convince others of the ability of algorithms to know the world and its future objectively, making calculated, partial visions appear certain. This duality marks a conflict of interest when the actors who conduct an evaluation also profit from positive outcomes. Building on work in the sociology of testing and agnotology, I discuss seven ways in which the construction of high accuracy claims also involves the production of ignorance. I argue that this ignorance should be understood as productive and strategic, as it is imbued with epistemological authority by making uncertain matters seem certain in ways that benefit some groups over others. Several examples illustrate how tech companies increasingly and strategically produce ignorance, reminiscent of tactics used by controversial companies with a high concentration of market power such as big oil or tobacco. My analysis deconstructs claims of certainty by highlighting the politics and contingencies of the testing used to justify the adoption of algorithms. I further argue that current evaluation practices in ML are prone to producing problematic forms of ignorance, like misinformation, and to reinforcing structural inequalities: human judgment and power structures are invisibilized, narrow and oversimplified metrics are overused, and pernicious incentive structures encourage overstatements enabled by flexibility in testing. I provide recommendations on how to deal with and rethink incentive structures, testing practices, and the communication and study of accuracy, with the goal of opening possibilities, making contingencies more visible, and enabling the imagination of different futures.
- Women’s Perspectives on Harm and Justice after Online Harassment. Jane Im, Sarita Schoenebeck, Marilyn Iriarte, Gabriel Grill, Daricia Wilkinson, and 6 more authors. In Proceedings of the ACM on Human-Computer Interaction (PACM HCI), 2022
Social media platforms aspire to create online experiences where users can participate safely and equitably. However, women around the world experience widespread online harassment, including insults, stalking, aggression, threats, and non-consensual sharing of sexual photos. This article describes women’s perceptions of harm associated with online harassment and their preferred platform responses to that harm. We conducted a survey in 14 geographic regions around the world (N = 3,993), focusing on regions whose perspectives have been insufficiently elevated in social media governance decisions (e.g., Mongolia, Cameroon). Results show that, on average, women perceive greater harm associated with online harassment than men, especially for non-consensual image sharing. Women also favor most platform responses more strongly than men do, especially removing content and banning users; however, women are less favorable towards payment as a response. Addressing global gender-based violence online requires understanding how women experience online harms and how they wish for them to be addressed. This is especially important given that the people who build and govern technology are not typically those who are most likely to experience online harms.
- Attitudes and Folk Theories of Data Subjects on Transparency and Accuracy in Emotion Recognition. Gabriel Grill and Nazanin Andalibi. In Proceedings of the ACM on Human-Computer Interaction (PACM HCI), 2022
The growth of technologies promising to infer emotions raises political and ethical concerns, including concerns regarding their accuracy and transparency. A marginalized perspective in these conversations is that of data subjects potentially affected by emotion recognition. Taking social media as one emotion recognition deployment context, we conducted interviews with data subjects (i.e., social media users) to investigate their notions about accuracy and transparency in emotion recognition and interrogate stated attitudes towards these notions and related folk theories. We find that data subjects see accurate inferences as uncomfortable and as threatening their agency, pointing to privacy and ambiguity as desired design principles for social media platforms. While some participants argued that contemporary emotion recognition must be accurate, others raised concerns about possibilities for contesting the technology and called for better transparency. Furthermore, some challenged the technology altogether, highlighting that emotions are complex, relational, performative, and situated. In interpreting our findings, we identify new folk theories about accuracy and meaningful transparency in emotion recognition. Overall, our analysis shows an unsatisfactory status quo for data subjects that is shaped by power imbalances and a lack of reflexivity and democratic deliberation within platform governance.
2021
- Future Protest Made Risky: Examining Social Media Based Civil Unrest Prediction Research and Products. Gabriel Grill. In Computer Supported Cooperative Work (CSCW): The Journal of Collaborative Computing and Work Practices, 2021
Social media has both been hailed for enabling social movements and critiqued for its affordances as a surveillance infrastructure. In this work, I focus on the latter by analyzing research, products, and discourses around the recent history of civil unrest prediction based on social media data and other public data sources, thereby giving insights into current and often opaque protest surveillance and forecasting practices. Technologies to monitor individuals and groups online have been developed, for instance, to predict US protests following the election of President Trump in 2016 and labor strikes across global supply chains. These works are part of an emerging computer science research field focused on “civil unrest prediction” dedicated to forecasting protests across the globe (e.g., Indonesia, Brazil, and Australia). I focus foremost on scholarly literature as my unit of analysis, but I also examine other artifacts discussing or detailing applications for companies, organizations, or governments. I provide a conceptualization of civil unrest prediction technology by illustrating the data sources, features, and methods used, and how prediction and detection are necessarily entangled. I then show how various kinds of unrest activity are framed as risks to be fixed or averted for actors with differing interests, such as the military, law enforcement, and various industries. Finally, I critically unpack justifications and ascribed benefits of the technology and point to how the perspectives of protestors are almost completely absent. My analysis shows a critical need for regulation centering activists and workers, and for reflection within academia, particularly in the fields of computer and data science, on the ethics and politics of protest research and ensuing technological applications.
2020
- Algorithmic Profiling of Job Seekers in Austria: How Austerity Politics Are Made Effective. Doris Allhutter, Florian Cech, Fabian Fischer, Gabriel Grill, and Astrid Mager. In Frontiers in Big Data, 2020
As of 2020, the Public Employment Service Austria (AMS) makes use of algorithmic profiling of job seekers to increase the efficiency of its counseling process and the effectiveness of active labor market programs. Based on a statistical model of job seekers’ prospects on the labor market, the system, which has become known as the AMS algorithm, is designed to classify clients of the AMS into three categories: those with high chances to find a job within half a year, those with mediocre prospects on the job market, and those with poor employment prospects over the next two years. Depending on the category a particular job seeker is classified under, they will be offered differing support in (re)entering the labor market. Drawing on science and technology studies, critical data studies, and research on fairness, accountability, and transparency of algorithmic systems, this paper examines the inherent politics of the AMS algorithm. An in-depth analysis of relevant technical documentation and policy documents investigates crucial conceptual, technical, and social implications of the system. The analysis shows how the design of the algorithm is influenced by technical affordances, but also by social values, norms, and goals. A discussion of the tensions, challenges, and possible biases that the system entails calls into question the objectivity and neutrality of data claims and of the high hopes pinned on evidence-based decision-making. In this way, the paper sheds light on the co-production of (semi)automated managerial practices in employment agencies and the framing of unemployment under austerity politics.
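The three-way categorization described in this abstract amounts to a thresholding rule over the model’s predicted re-employment probabilities. Below is a minimal sketch of that rule, assuming the thresholds reported in the public ITA documentation (a short-term integration chance of at least 66% for the high group, a long-term integration chance of at most 25% for the low group); the function and variable names are illustrative and not taken from the system itself.

```python
def classify_job_seeker(p_short_term: float, p_long_term: float) -> str:
    """Sketch of the AMS three-category scheme (illustrative names).

    p_short_term: predicted probability of re-employment within ~6 months
    p_long_term:  predicted probability of re-employment within ~2 years
    Thresholds follow the public ITA documentation; the deployed system
    derives these probabilities from logistic regression models over
    variables such as age, citizenship, and prior employment history.
    """
    if p_short_term >= 0.66:
        return "high"    # good short-term prospects
    if p_long_term <= 0.25:
        return "low"     # poor long-term outlook; reduced support measures
    return "medium"      # in between; targeted active labor market programs


# Illustrative usage:
print(classify_job_seeker(0.70, 0.90))  # -> "high"
print(classify_job_seeker(0.10, 0.20))  # -> "low"
```

As the paper argues, the apparent simplicity of such a rule is precisely what makes its embedded policy choices, such as where the cutoffs sit and who falls below them, easy to overlook.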
- Der AMS-Algorithmus: Eine soziotechnische Analyse des Arbeitsmarktchancen-Assistenz-Systems (AMAS) [The AMS Algorithm: A Sociotechnical Analysis of the Labor Market Chances Assistance System (AMAS)]. Doris Allhutter, Astrid Mager, Florian Cech, Fabian Fischer, and Gabriel Grill. In Technical Report ITA 2020-02, Wien: ÖAW, 2020
- Der AMS-Algorithmus [The AMS Algorithm]. Ben Wagner, Paola Lopez, Florian Cech, Gabriel Grill, and Marie-Therese Sekwenz. In zeitschrift für kritik | recht | gesellschaft, 2020
2017
- Network Analysis on the Austrian Media Corpus. Gabriel Grill, Julia Neidhardt, and Hannes Werthner. In VSS 2017 - Vienna Young Scientists Symposium, 2017