
In the following, I provide an example of how AI systems can be regularly monitored for gender bias through a structured, ongoing process.

Monitoring AI Systems for Gender Bias: A Norwegian Example

In November 2023, the Norwegian Equality and Anti-Discrimination Ombud (Likestillings- og diskrimineringsombudet, LDO) launched a guide to uncovering and preventing discrimination in the development and use of artificial intelligence. Discrimination is defined as unlawful differential treatment based on protected grounds such as gender, ethnicity, disability, religion, sexual orientation, age, and combinations thereof (compound discrimination). From a legal perspective, it is important to note that the prohibition of discrimination, for instance as formulated in Norwegian law, applies regardless of whether decisions are made by humans or by AI systems. The guide aims to make stakeholders aware of anti-discrimination legislation and to help them systematically prevent discrimination throughout all phases of AI system development and use. Alex Moltzau at the European AI Office has provided an informative and accessible presentation of the guidelines (Moltzau 2023).

The approach taken in the guide is that of “built-in protection against discrimination”: measures to prevent discrimination and promote equality must be built into all development phases of an AI system, from planning to use of the technology. Guides with a similar perspective have been published elsewhere as policy briefs, such as the Dutch government’s Fundamental Rights and Algorithm Impact Assessment (FRAIA) (Government of the Netherlands 2021) and the Finnish government’s “Promoting equality in the use of Artificial Intelligence – an assessment framework for non-discriminatory AI” (Ojanen et al. 2022). “Built-in protection” implies that the AI technology in question must be monitored from planning to implementation.
To avoid discriminatory practice later, critical questions need to be asked in the very first planning phase. At the outset, it is crucial to design the project with an awareness of the potential risks of discrimination. Are the selected data sources and methods representative of the population they will affect? Is there potential bias in historical data or systemic structures that needs to be analysed carefully? What is the anticipated impact of the system on different demographic groups? The planning phase is about laying a fair foundation before data or algorithms are introduced.

In the second phase, which concerns the training data, the focus shifts to the choice and justification of data variables. The Norwegian guide stresses that variables must be aligned with the defined purpose of the system, and that the inclusion of sensitive variables should be explained and defended on grounds of necessity. Comparing different data models and datasets to evaluate how they influence outcomes is also recommended. This step helps ensure that the model does not reproduce unfair or unjustified distinctions between groups.

The third phase is the development of the AI model itself. During development, clarity of purpose and context is essential. The model’s intent must be well defined, and the groups that could be affected by its decisions or predictions must be identified. Involving a broad set of stakeholders (technical experts, policymakers, domain specialists, and representatives of affected communities) helps to identify risks and keeps the system accountable. This prevents development from becoming a purely technical process detached from societal concerns.

Before deployment, models should undergo rigorous testing to evaluate fairness across different groups.
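One simple way to make such a fairness test concrete is to compare how often the model gives a positive decision to each group. The following is a minimal sketch in Python, not a method prescribed by the Norwegian guide: the function names, the toy data, and the review threshold of 0.8 are all illustrative assumptions, and a real assessment would use the system's actual decisions and a threshold justified for the context.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the share of positive decisions per group.

    `records` is an iterable of (group, selected) pairs, where
    `selected` is True if the model gave a positive decision.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, selected in records:
        totals[group] += 1
        positives[group] += int(selected)
    return {group: positives[group] / totals[group] for group in totals}


def selection_rate_ratio(rates):
    """Lowest group rate divided by the highest; 1.0 means equal rates."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a positive decision
    return min(rates.values()) / highest


# Hypothetical toy data: 10 decisions per group (not real results).
decisions = (
    [("women", True)] * 3 + [("women", False)] * 7
    + [("men", True)] * 6 + [("men", False)] * 4
)

# Illustrative assumption: flag the model for review below this ratio.
REVIEW_THRESHOLD = 0.8

rates = selection_rates(decisions)
ratio = selection_rate_ratio(rates)
needs_review = ratio < REVIEW_THRESHOLD

print(rates)
print(f"ratio={ratio:.2f}, needs_review={needs_review}")
```

A low ratio does not by itself establish unlawful discrimination; it only indicates that the outcomes for the groups differ enough to warrant closer examination of the variables and data behind the decisions.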
Such testing covers both direct discrimination (e.g., a model treating two groups differently due to explicit variables) and indirect or compound discrimination (e.g., when neutral variables unintentionally act as proxies for protected characteristics). An example of direct discrimination would be a hiring model that takes “gender” as an input and systematically assigns lower suitability scores to women than to men with identical CVs and test results. Indirect discrimination is more subtle: consider a hiring system that does not see “gender” but uses features such as participation in certain sports, membership in specific fraternities, or patterns in previous job titles that, in the historical data, are much more common among men. The model then prefers “male-coded” CVs and disfavours “female-coded” career paths, even though gender is never explicitly provided. These tests help uncover patterns that may lead to systematic disadvantages.

Implementation must be followed by monitoring. Once in operation, AI systems must remain under oversight and corrective governance. If discriminatory patterns or biases are detected, corrective measures should be applied, whether through technical adjustments, rule changes, or systemic oversight mechanisms. Importantly, affected individuals must be informed about the system’s function and their rights in relation to its decisions. Building trust requires transparency, accountability, and ongoing monitoring long after the model’s initial release.

The Norwegian guide is just one example of systematic monitoring, which can serve as a source of inspiration. What the “built-in design” requires is that questions of non-discrimination are addressed early enough to have an
