Model Building in the Humanities through Data-Driven Problem Solving (2024)
In 2024, the National Institute of Japanese Literature (NIJL) started a new project, "Model Building in the Humanities through Data-Driven Problem Solving".
Under the "NIJL-NW project", NIJL digitized 300,000 pre-modern Japanese texts. In the new project, a further 150,000 digitized pre-modern works will be added in collaboration with various institutions, including those overseas. In addition, we will attempt to extract full text from the digital images of pre-modern works.
We will also improve the functionality of the "Union Catalogue Database of Japanese Texts (国書データベース)" and enrich its content. Building on this database, we will promote "Data-driven research" and other projects.
In this presentation, we will give an overview of the new project and describe our further efforts regarding this database (e.g., text data creation by OCR).