On April 19, the Legal Informatics and Forensic Science (LIFS) lab and the Artificial Intelligence and Machine Learning (AIML) lab of Hallym University held a joint seminar.
The Hallym University LIFS lab is a research and development institution in charge of the National Police Agency's Scientific Criminal Investigation Advanced Technology Development Project in 2021.
The theme of the seminar was how to build datasets for the AI-based criminal investigation support project.
The AI-based criminal investigation support project aims to develop a system that uses artificial intelligence to help police prepare and review investigative documents.
The seminar focused on how to construct and process the datasets, which are the most important element in developing this technology.
1. Presentation of Data Collection and Dataset Construction Guidelines
The first session of the seminar presented and discussed guidelines for collecting and building datasets, an essential element in supporting the creation of investigative documents with natural language processing technology.
Because the collection and management of datasets is critical to AI training, the seminar examined the overall dataset construction process and discussed improvements that would yield more reliable results.
The session was very helpful for future work: experts reviewed and discussed the status and procedures of data collection to date and the technical data processing that the work requires.
2. Presentation of Annotation Dataset Construction and Data Quality Assessment Guidelines
The second session covered the construction of the annotation dataset, which is critical to developing support for investigative documentation, and presented and discussed quality verification of the generated data.
Because the annotation dataset, which is based on argument structure, is a key factor in automatically creating more logical and objective investigative documents, the seminar examined and discussed in detail the annotation dataset built so far.
Quality checks on the constructed annotation datasets are also an essential step in AI training, so the quality-check process was reviewed as well.
Participants also discussed how to process and use metadata, which is important input for automatically generating investigation timelines, one of the project's deliverables.
It was a very productive session for receiving feedback on the review of the annotation dataset and on future construction plans.
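As a rough illustration of how metadata could feed an investigation timeline, the sketch below sorts event records by timestamp and formats them as timeline lines. The `Event` fields, the `build_timeline` function, and the sample records are all hypothetical; the project's actual metadata schema and generation method are not described in this report.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    # Hypothetical metadata fields, chosen only for illustration.
    occurred_at: datetime
    location: str
    description: str


def build_timeline(events):
    """Return the events as formatted lines in chronological order."""
    ordered = sorted(events, key=lambda e: e.occurred_at)
    return [
        f"{e.occurred_at:%Y-%m-%d %H:%M} | {e.location} | {e.description}"
        for e in ordered
    ]


# Made-up sample records, deliberately given out of order.
events = [
    Event(datetime(2021, 3, 2, 14, 30), "Station", "Statement recorded"),
    Event(datetime(2021, 3, 1, 9, 0), "Scene", "Incident reported"),
]
timeline = build_timeline(events)
for line in timeline:
    print(line)
```

Sorting on a single timestamp field is the simplest possible design; a real system would likely also need to handle missing or approximate times extracted from documents.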
3. Discussion on Data Augmentation
The last session of the seminar was presented by Professor Dong-Ok Won of the AIML lab, who discussed ways to augment data for AI training.
The professor explained various natural language processing models that can augment text data and discussed which form of the technology best suits this project, giving participants a chance to think about how to expand the dataset in the future.
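To make the idea of text augmentation concrete, here is a minimal sketch of two simple rule-based operations, random token swap and random token deletion. These are not the model-based methods discussed in the session, only a lightweight stand-in; the function names, the whitespace tokenization, and the sample sentence are assumptions made for illustration.

```python
import random


def random_swap(tokens, rng):
    """Return a copy of tokens with two randomly chosen positions swapped."""
    if len(tokens) < 2:
        return list(tokens)
    i, j = rng.sample(range(len(tokens)), 2)
    out = list(tokens)
    out[i], out[j] = out[j], out[i]
    return out


def random_delete(tokens, rng, p=0.1):
    """Drop each token with probability p, always keeping at least one."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(sentence, n=3, seed=0):
    """Generate n perturbed variants of a sentence (whitespace tokenization)."""
    rng = random.Random(seed)  # seeded so results are reproducible
    tokens = sentence.split()
    variants = []
    for _ in range(n):
        op = rng.choice([random_swap, random_delete])
        variants.append(" ".join(op(tokens, rng)))
    return variants


original = "the suspect left the building at noon"
variants = augment(original, n=4, seed=1)
for v in variants:
    print(v)
```

Rule-based perturbations like these are cheap but can distort meaning, which is one reason model-based augmentation may be preferable for legal and investigative text.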
This joint seminar between the LIFS lab and the AIML lab gave researchers a valuable opportunity to check the current status of the AI-based criminal investigation support project and discuss future development directions.
For a detailed explanation of the AI-based criminal investigation support work under the [Scientific Criminal Investigation Advanced Technology Development Project] conducted by the Hallym University LIFS lab, see the link attached below.