Degraded Document OCR
Welcome to the Degraded Document OCR demo on Hugging Face.
This interactive application showcases a specialized OCR model developed by the Artificial Intelligence group at the Alfréd Rényi Institute of Mathematics.
The model is based on LightOn OCR 2.1B Base and was fine-tuned with our Degraded Document Generator framework to handle aged, heavily degraded, and noisy document images. The current variant is tailored to Hungarian-language OCR.
You can adjust the longest_edge parameter before running OCR. It sets the maximum length of the image's longest side during preprocessing: higher values preserve more visual detail and usually improve recognition of small text, while lower values run faster and use less memory.
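To make the effect of longest_edge concrete, the following minimal sketch computes the size an image would have after such a cap is applied. It is an illustration only, not the demo's actual preprocessing code: the function name fit_longest_edge is hypothetical, and the sketch assumes the value is in pixels, that the aspect ratio is preserved, and that images already within the limit are left unchanged.

```python
def fit_longest_edge(width: int, height: int, longest_edge: int) -> tuple[int, int]:
    """Return (width, height) scaled so the longer side is at most longest_edge.

    Hypothetical helper: assumes pixel units, a preserved aspect ratio,
    and no upscaling of images that are already small enough.
    """
    if width <= 0 or height <= 0 or longest_edge <= 0:
        raise ValueError("width, height and longest_edge must be positive")

    longest = max(width, height)
    if longest <= longest_edge:
        # Assumption: images within the limit are not enlarged.
        return width, height

    # Scale both sides by the same factor to keep the aspect ratio.
    scale = longest_edge / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    # A 3000x2000 scan capped at 1500 is halved on both sides.
    print(fit_longest_edge(3000, 2000, 1500))
    # An 800x600 image is already within a 1024 limit and stays as it is.
    print(fit_longest_edge(800, 600, 1024))
```

Under these assumptions, halving longest_edge cuts the pixel count of a large scan to roughly a quarter, which is why lower values reduce runtime and memory use at the cost of fine detail.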
If you are interested in this demo, please contact us at gabar92@renyi.hu.