ChatGPT sorts names in resumes with racial bias

ChatGPT sorts names in resumes with racial bias

The frequency of choosing the best candidate according to ChatGPT from eight groups. Picture: Bloomberg

OpenAI’s ChatGPT chatbot prefers names that are more specific to certain demographic groups than others when sorting resumes, according to a Bloomberg study.

Bloomberg journalists claim that the chatbot systematically creates prejudice by sorting names with a racial bias, all things being equal and the same professional data. During the research, they talked to 33 specialists in the field of artificial intelligence, and also conducted their own experiment, taking fictitious names and resumes for it.

GPT 3.5 was used for the experiment — the most popular and most accessible version of ChatGPT. Journalists created resumes with as close parameters as possible, including names that reflect the racial and ethnic diversity of the United States, according to census data.

In total, eight fictitious resumes were used in the experiment, which were also created with the help of a neural network. Each of these resumes was then assigned male and female names according to the four largest racial or ethnic groups, and asked ChatGPT to sort the resumes to identify the most qualified candidate for a real job opening at a Fortune 500 company.

When journalists asked GPT 3.5 to rank these resumes, it found that CVs with black US names ended up at the bottom of the list 29% of the time. In turn, such indicators for Asians, whites and Hispanic Americans turned out to be 22%, 24% and 25%, respectively.

Such results show that the work of the neural network on sorting does not meet the criteria established for evaluating employment discrimination against protected groups, writes Bloomberg. The publication assessed discrimination according to the 80% rule used by US federal agencies. According to this criterion, if the number of hired candidates from a certain demographic group turns out to be less than 80% of the number of hired representatives of the group to which the best belong, this is considered discrimination.

The experiment was conducted on the example of four vacancies: HR business partner, retail manager, senior software engineer and financial analyst. Journalists note that on a fairly large sample, the results of all eight groups should have reached a value of 12.5%. However, the results of the experiment revealed that ChatGPT prefers Asian women (they were selected 17.2% of the time) and Asian men, as well as white and Hispanic women, when choosing a candidate for the position of financial analyst. Resumes with names characteristic of black men showed the worst result — they were chosen only 7.6% of the time.

Bloomberg claims that ChatGPT preferred not to select resumes with the names of black candidates for the positions of senior software engineer and financial analyst. In addition, the neural network favored men in human resources and retail management positions, despite the fact that these positions are more often held by women in the US.

The journalists also conducted similar experiments with the more advanced GPT 4. According to them, this model also shows clear advantages, although they did not provide any results of this experiment. Thus, if American companies were to make hiring decisions based solely on neural network sorting, certain demographic and ethnic groups would be adversely affected, according to Bloomberg reporters.

Related posts