Active STANDARD GRANT National Science Foundation (US)

Frameworks: Infrastructure For Political And Social Event Data using Machine Learning

$15.89M USD

Funder National Science Foundation (US)
Recipient Organization University of Texas At Dallas
Country United States
Start Date Aug 01, 2023
End Date Jul 31, 2026
Duration 1,095 days
Number of Grantees 4
Roles Principal Investigator; Co-Principal Investigator
Data Source National Science Foundation (US)
Grant ID 2311142
Grant Description

This project intends to revolutionize computerized data extraction for conflict scholars, security analysts, and practitioners who for decades have devoted significant resources to monitoring, understanding, and predicting armed violence, social protests, and other politically relevant events worldwide. Currently, the vast majority of conflict event data are coded by humans, at great expense, from increasingly large volumes of news reports.

This project uses recent advances in artificial intelligence and large language models to address this fundamental issue for conflict research. It builds on earlier NSF efforts that created a publicly available large language model to study inter- and intra-state conflict and armed violence, called ConfliBERT. This project expands the ConfliBERT model to multilingual settings, including Arabic and Spanish.

This will help researchers and policymakers better understand the context of local events and create a continuous data analysis process by feeding in current news stories to identify new political actors and events in real time. As the project's cyberinfrastructure develops, the research community will be empowered through training, education, and outreach with groups at local, national, and international levels, including academics and government.

In the last five years, state-of-the-art language models have revolutionized the field of natural language processing (NLP). In particular, there have been significant advances in the use of domain-specific models for understanding social processes. Our research and that of other experts in this field demonstrate that ConfliBERT outperforms prior models for coding and understanding conflict and violence from raw text (Hu et al. 2022; Haffner et al. 2023).

This project supports new NLP developments for conflict research and broadens access to them for the academic and policy communities. Specifically, it builds on earlier NSF efforts that led to the development of ConfliBERT, a domain-specific language model, publicly available at Hugging Face, trained on an expert-curated corpus about conflict and political violence (Hu et al. 2022).

This project will integrate, extend, and apply ConfliBERT and our related innovations (e.g., actor detection for network construction) into a sustainable ecosystem to engineer data from text. It will expand ConfliBERT to multilingual settings including Arabic and Spanish, update the corpora in sustainable ways and retrain ConfliBERT on a continuous basis, provide new political network data, and develop language models for users to create customized datasets and applications.

All developed cyberinfrastructure is and will continue to be broadly accessible for the community of researchers, analysts, and others with interests in conflict dynamics, security studies, and international relations. 

This project, funded by the NSF Office of Advanced Cyberinfrastructure, is jointly supported by the Directorate for Social, Behavioral, and Economic Sciences and the Directorate for STEM Education.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

University of Texas At Dallas
