- by Gersi Mirashi
- December 18, 2024
The Application of Deep Learning in Optical Character Recognition
By Sildi Shahini, Ardiana Topi, Forsian Elezi
Abstract
Optical Character Recognition (OCR) is an essential technology for document digitization, enabling the conversion of scanned paper documents, PDFs, and images into editable and searchable data. This paper focuses on the application of deep learning in OCR, particularly in digitizing handwritten medical prescriptions, where accuracy is critical for reducing errors and improving healthcare outcomes. Traditional OCR methods struggle with handwritten text because of the variability in handwriting styles and in the quality of scanned documents. These limitations can result in recognition errors, which, in a medical context, may lead to serious consequences such as medication errors. To address this issue, the study explores deep learning approaches, especially Convolutional Neural Networks (CNNs), which have shown significant promise in overcoming these challenges by learning from large datasets. The study involves collecting handwritten prescriptions, preprocessing the images, and training a deep learning-based OCR model. Performance evaluation metrics, including accuracy, precision, recall, and F1-score, indicate that the deep learning model significantly outperforms traditional OCR methods in recognizing handwritten prescriptions. The results demonstrate the deep learning model’s ability to handle the variability of handwriting more effectively, providing a more reliable solution for digitizing medical documents. This research underlines the transformative potential of deep learning in OCR technology, particularly for critical applications such as healthcare. The findings advocate for the wider adoption of deep learning in the healthcare sector, aiming to improve patient care, reduce human error, and enhance operational efficiency, especially in pharmacy management and medical record-keeping.
INGENIOUS No. 4, Issue 2/2024
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.