Notice

Nara takes third place in international competition to recognize characters on papyri with AI and OCR technology

 


ICDAR2023 Competition on Detection and Recognition of Greek Characters on Papyri


This competition investigates the performance of glyph detection and recognition on a very challenging type of historical document: Greek papyri. The detection and recognition of Greek letters on papyri is a preliminary step for computational analysis of handwriting that can lead to major steps forward in our understanding of this major source of information on Antiquity. It can be done manually by trained papyrologists. It is, however, a time-consuming task that would need automatising. We provide two different tasks: localization and classification, or classification only. The document images are provided by several institutions and are representative of the diversity of book hands on papyri (a millennium time span, various script styles, provenance, states of preservation, means of digitization and resolution).



AI Times article (https://www.aitimes.com/news/articleView.html?idxno=151203)


Nara Knowledge Information (CEO Son Young-ho) announced on the 19th that its affiliated Humanities Artificial Intelligence Research Institute took part in the 'Ancient Text Interpretation Recognition and Identification Artificial Intelligence (AI) Contest', held online since last month, and won third place with a character recognition score of 39 points.


The AI Competition is organized by the International Conference on Document Analysis and Recognition (ICDAR) and is aimed at deciphering and restoring ancient documents to advance research in historical information processing and analysis.


The organizers, the Institute for Pattern Recognition at FAU University in Germany, collected and provided images of ancient Greek papyri from libraries and museums across Europe, including the Bodleian Library in the UK. The challenge is to recognize and identify Greek characters that remain blurred or fragmentary on the papyrus.


Nara Knowledge Information explained that it used image preprocessing methods such as background reduction and outline enhancement to restore the poor-quality papyrus images. For the OCR deep learning model that performs character recognition, it said it used HRNet (High-Resolution Network), a model well suited to high-resolution images.
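The article names the two preprocessing steps but not the filters behind them. The sketch below is only an illustration under stated assumptions: background reduction is approximated by subtracting a local mean, and outline enhancement by unsharp masking. The function names (`reduce_background`, `enhance_outlines`, `preprocess`) and all parameters are hypothetical and are not taken from the institute's software. The HRNet recognition stage is not shown.

```python
from typing import List

# Grayscale image as nested lists, row-major, values in [0, 255].
Image = List[List[float]]


def box_blur(img: Image, radius: int) -> Image:
    """Mean filter over a (2*radius+1) square window, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            window = [img[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            out[y][x] = sum(window) / len(window)
    return out


def clip(value: float) -> float:
    """Keep a pixel value inside the valid [0, 255] range."""
    return max(0.0, min(255.0, value))


def reduce_background(img: Image, radius: int = 2) -> Image:
    """Assumed technique: subtract a blurred copy (the background estimate)
    and re-centre the result on mid-grey (128)."""
    background = box_blur(img, radius)
    return [
        [clip(p - b + 128.0) for p, b in zip(row, bg_row)]
        for row, bg_row in zip(img, background)
    ]


def enhance_outlines(img: Image, amount: float = 1.0) -> Image:
    """Assumed technique: unsharp masking, which adds back the difference
    between the image and a lightly blurred copy to sharpen stroke edges."""
    blurred = box_blur(img, 1)
    return [
        [clip(p + amount * (p - b)) for p, b in zip(row, b_row)]
        for row, b_row in zip(img, blurred)
    ]


def preprocess(img: Image) -> Image:
    """Background reduction followed by outline enhancement."""
    return enhance_outlines(reduce_background(img))


if __name__ == "__main__":
    # A 5x5 patch: light papyrus background (200) with a dark vertical stroke (50).
    patch = [[200.0, 200.0, 50.0, 200.0, 200.0] for _ in range(5)]
    for row in preprocess(patch):
        print([round(v, 1) for v in row])
```

In practice an image library would replace the nested-list loops. The pure-Python version is used here only to keep the sketch self-contained.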


It also said that the OCR model and data restoration and augmentation techniques used in the competition were applied to the 'Annotation Workbench Software' used by Nara Knowledge Information.


“We are planning to introduce the developed AI OCR engine and software at the ICDAR workshop on historical documents in San Jose, California, in August,” the institute said.


Nara Knowledge Information, founded in 2008, specializes in the digitization of historical information, with a focus on high quality, usability, standardization, and accuracy. In 2019 the company established an affiliated research institute, which builds on its accumulated experience in digitization and related services to analyze and research new technology trends in humanities AI. It is currently developing and commercializing AI data analysis tools tailored to historical data.



Reporter Jang Se-min semim99@aitimes.com

Source: AI Times (https://www.aitimes.com)