Paper Review - Event Extraction for Criminal Legal Text (Q Li, 2020)

2023-08-01 paper review

이진헌 (Jin-Heon, Lee)
MSc Course
AI-based Crime Investigation System

This paper applies event extraction to the case description section of Chinese legal texts, focusing on larceny cases. The authors define the event types, event arguments, and argument roles for larceny cases, and build a dataset for event extraction through data annotation. Extraction is divided into two stages: the first identifies event trigger words and extracts arguments jointly, and the second assigns roles to the event arguments. The method uses BERT to obtain Chinese character vectors and a BiLSTM-CRF model for the initial extraction, then integrates additional features and uses a CRF model to improve the extraction. The extracted events are arranged in chronological order to support litigation visualization. The paper also covers formatting Chinese time expressions, sorting event information chronologically, and a web application that displays the event timeline. Its main topics are Chinese legal text, event dataset creation, event extraction, and litigation visualization.
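To make the time-formatting and chronological-sorting step more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the regular expression, the function names `parse_chinese_time` and `sort_events`, the event dictionary layout, and the sample events are all assumptions made for illustration, and the sketch only handles fully specified dates of the form 年/月/日 with optional 时/分.

```python
import re
from datetime import datetime

# Hypothetical pattern: matches e.g. "2019年3月5日14时30分"; hour and minute are optional.
_TIME_PATTERN = re.compile(
    r"(\d{4})年(\d{1,2})月(\d{1,2})日(?:(\d{1,2})时)?(?:(\d{1,2})分)?"
)

def parse_chinese_time(expr):
    """Convert a Chinese time expression to a datetime, or None if it does not match."""
    match = _TIME_PATTERN.search(expr)
    if match is None:
        return None
    year, month, day, hour, minute = match.groups()
    # Missing hour/minute default to 0 (an assumption of this sketch).
    return datetime(int(year), int(month), int(day), int(hour or 0), int(minute or 0))

def sort_events(events):
    """Return the events that have a parseable time, ordered from earliest to latest."""
    dated = [(parse_chinese_time(event["time"]), event) for event in events]
    dated = [(when, event) for when, event in dated if when is not None]
    return [event for when, event in sorted(dated, key=lambda pair: pair[0])]

# Invented sample events, only to exercise the functions above.
events = [
    {"trigger": "抓获", "time": "2019年3月7日"},
    {"trigger": "盗窃", "time": "2019年3月5日14时30分"},
    {"trigger": "销赃", "time": "2019年3月6日9时"},
]
timeline = sort_events(events)
```

In this sketch, events whose time expression cannot be parsed are simply dropped from the timeline; relative expressions such as "the next day" would need extra handling that is not shown here.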


Related research covered in the paper includes event extraction, event graph construction, and legal intelligence research. The authors review studies that apply event extraction techniques across a variety of domains, including news articles, biomedical literature, and social media. They highlight the specific challenges of applying event extraction to legal texts, which stem from the complexity and ambiguity of legal language, and the importance of event extraction in the legal domain for helping legal professionals evaluate and judge cases.


Data labeling tool: YEDDA
Labeling scheme: BIO scheme
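As a rough illustration of the BIO scheme at the character level, the sketch below converts annotated spans into one tag per character. The label names (`Suspect`, `Trigger`, `Item`), the sample sentence, and the `spans_to_bio` helper are hypothetical and are not taken from the paper's annotation guidelines or from YEDDA's export format.

```python
def spans_to_bio(text, spans):
    """Turn (start, end, label) character spans (end exclusive) into BIO tags."""
    tags = ["O"] * len(text)  # characters outside every span stay "O"
    for start, end, label in spans:
        tags[start] = f"B-{label}"  # first character of the span
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"  # remaining characters of the span
    return tags

# Invented example sentence: "Zhang San stole a mobile phone".
text = "张三盗窃了手机"
spans = [(0, 2, "Suspect"), (2, 4, "Trigger"), (5, 7, "Item")]
tags = spans_to_bio(text, spans)

for char, tag in zip(text, tags):
    print(char, tag)
```

Each character receives exactly one tag, which is the form a sequence labeling model such as BiLSTM-CRF is trained to predict.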



Event argument extraction results are generally good. Location entities are extracted relatively poorly because they are expressed in more irregular forms.