Scientific topics and careers at the intersection: an algorithmic approach

Description

The goal of this project is to investigate the direct harms to science wrought by structural racism and the benefits derived from the inclusion of people of color and other historically marginalized groups in the scientific workforce. Specifically, this work seeks to (a) quantify the participation of people of color and members of historically marginalized populations in the production of science, (b) elucidate their role in propelling intellectual innovations, and (c) understand how the distribution of labor and the composition of scientific teams create barriers and pathways to their scientific success. The project will support the mission of open science by making its algorithms and publications openly available to propel this area of research. Finally, the PI team will recruit a cohort of a dozen student Fellows from a variety of disciplines and countries to discuss the ways in which they incorporate their lived experiences into research design, and the challenges and barriers to this process. Priority will be given to doctoral students of color and to those who identify as members of a historically marginalized population within their country of affiliation. The goal of the fellowship is to empower students to navigate academic spaces by suggesting new topical directions to advisors, to cultivate change in how authorship is distributed in scientific publications, and to examine what science is conducted and how. Our research aims to examine empirically the degree to which diversity in the scientific workforce creates a more innovative and robust scientific system. The research has strong implications for all sectors of society.


This research builds upon previous quantitative analyses to construct more robust and equitable algorithms that take into account the contextual factors that influence their performance. To address our primary aim, we will use each article's abstract, title, and keywords to train a Latent Dirichlet Allocation (LDA) model that infers the topics within a corpus of papers and the distribution of topics within each article. Data sources include millions of articles and distinct authors indexed in the Web of Science (WoS) database. We will also extend our work on intersecting race, ethnicity, and gender inequalities in the US research landscape to citation and collaboration patterns, the role of institutional affiliation, and changes over time. Our second aim is to determine whether the variation by race, ethnicity, and gender identified in the US context translates to other national contexts. To address this second aim, we will replicate and expand our methodology in two other scientifically productive, diverse societies. Comparison across all three nation states will allow for the identification of potentially generalizable characteristics, of mechanisms that can be used to improve equity in science across the globe, and of how the topicality of research in different countries is affected by the racial composition of teams. This research will provide a scalable methodological contribution that extends beyond the confines of this single research project and will allow other researchers to analyze race, ethnicity, and gender in any dataset that includes individual names.
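
The topic-modeling step described above can be illustrated with a minimal sketch. It is not the project's implementation: it is a small pure-Python collapsed Gibbs sampler for LDA, and the record fields (`title`, `abstract`, `keywords`), the stopword list, the hyperparameters `alpha` and `beta`, the function names, and the four toy articles are all assumptions invented for illustration.

```python
import random
import re

# Hypothetical stopword list; a real pipeline would use a much fuller one.
STOPWORDS = {"the", "and", "for", "with", "from", "this", "that", "are", "our"}


def tokenize(article):
    """Join title, abstract, and keywords, then keep lowercase word tokens."""
    text = " ".join(
        [article["title"], article["abstract"], " ".join(article["keywords"])]
    )
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if len(t) > 2 and t not in STOPWORDS]


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    Returns (doc_topic, topic_word, vocab): per-document topic proportions,
    per-topic word probabilities, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    topics = range(n_topics)

    doc_topic_counts = [[0] * n_topics for _ in docs]
    topic_word_counts = [[0] * n_words for _ in topics]
    topic_totals = [0] * n_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for w in doc:
            k = rng.randrange(n_topics)
            doc_assignments.append(k)
            doc_topic_counts[d][k] += 1
            topic_word_counts[k][word_id[w]] += 1
            topic_totals[k] += 1
        assignments.append(doc_assignments)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                doc_topic_counts[d][k] -= 1
                topic_word_counts[k][v] -= 1
                topic_totals[k] -= 1
                weights = [
                    (doc_topic_counts[d][t] + alpha)
                    * (topic_word_counts[t][v] + beta)
                    / (topic_totals[t] + n_words * beta)
                    for t in topics
                ]
                k = rng.choices(topics, weights=weights)[0]
                assignments[d][i] = k
                doc_topic_counts[d][k] += 1
                topic_word_counts[k][v] += 1
                topic_totals[k] += 1

    # Smoothed estimates of the two distributions the text refers to.
    doc_topic = [
        [(doc_topic_counts[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in topics]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(topic_word_counts[t][v] + beta) / (topic_totals[t] + n_words * beta)
         for v in range(n_words)]
        for t in topics
    ]
    return doc_topic, topic_word, vocab


# Toy records, invented purely for illustration (not Web of Science data).
articles = [
    {"title": "Gene expression in tumor cells",
     "abstract": "We measure gene expression and protein levels in tumor cells.",
     "keywords": ["gene", "tumor", "protein"]},
    {"title": "Protein folding and gene regulation",
     "abstract": "Protein folding affects gene regulation in cells.",
     "keywords": ["protein", "gene", "cells"]},
    {"title": "Galaxy formation and dark matter",
     "abstract": "Dark matter shapes galaxy formation and stellar mass.",
     "keywords": ["galaxy", "matter", "stellar"]},
    {"title": "Stellar mass in distant galaxies",
     "abstract": "We estimate stellar mass and dark matter in each galaxy.",
     "keywords": ["stellar", "galaxy", "mass"]},
]

docs = [tokenize(a) for a in articles]
doc_topic, topic_word, vocab = fit_lda(docs, n_topics=2)

for article, proportions in zip(articles, doc_topic):
    print(article["title"], [round(p, 2) for p in proportions])
```

Each row of `doc_topic` is the distribution of topics within one article, and each row of `topic_word` characterizes one topic across the corpus vocabulary. A sampler this simple is only suitable for toy inputs; at the scale of millions of articles, an optimized library implementation would presumably be needed.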

Funding

Standard Grant

National Science Foundation

2022-04-01 - 2025-03-31

Related research areas

Principal investigator

T. Monroe-White

Co-investigators

Cassidy Sugimoto