TU Wien Informatics

Few-Shot Learning for Historical Document Analysis

  • 2025-12-17
  • Award
  • Students

This ACPR 2025 runner-up for Best Student Paper shows how compact neural networks can accurately segment text lines from just a few annotated pages.

Rafael Sterzinger
Picture: Marvin Burges

We’re delighted to announce that the paper “Few-Shot Connectivity-Aware Text Line Segmentation in Historical Documents” was named a runner-up for the Best Student Paper Award at ACPR 2025! The paper was written by Rafael Sterzinger, Tingyu Lin, and Robert Sablatnig.

In their paper, the authors address the challenge of text line segmentation in historical documents. Text line segmentation is a foundational task for the digital analysis of documents, but automating it with deep learning is difficult because it requires large annotated datasets, which are often unavailable for historical documents. Annotation is labor-intensive and requires expert knowledge, making few-shot learning a promising alternative.

The paper demonstrates that small, simple architectures, combined with a topology-aware loss function, can outperform more complex models in both accuracy and data efficiency. The method uses a compact version of the UNet++ neural network together with a specialized loss function that helps prevent common errors, such as splitting a single text line into pieces or merging separate lines by mistake. It is trained on very small regions extracted from just three annotated pages per manuscript, which shows that it can learn effectively from limited data. On the U-DIADS-TL dataset, this approach improves Recognition Accuracy by 200% and Line Intersection over Union by 75% over previous methods. It also matches or exceeds the F-Measure of the top-performing method in the DIVA-HisDB benchmark.
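To make the split and merge errors mentioned above more concrete, here is a minimal toy sketch in Python. It is not the authors' loss function or evaluation code: the helper names `count_components` and `pixel_iou` and the tiny masks are assumptions made purely for illustration. The sketch shows that a prediction can overlap the ground truth almost perfectly pixel by pixel while still containing the wrong number of text lines, which is the kind of error a connectivity-aware objective is meant to discourage.

```python
from collections import deque


def count_components(mask):
    """Count 4-connected foreground regions in a binary grid (list of lists)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # New region found: flood-fill it with a breadth-first search.
                count += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


def pixel_iou(pred, target):
    """Plain pixel-wise intersection over union of two binary grids."""
    pairs = [(p, t) for p_row, t_row in zip(pred, target)
             for p, t in zip(p_row, t_row)]
    inter = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    return inter / union if union else 1.0


# Hypothetical ground truth: two separate text lines.
target = [
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
# One missing pixel splits the first line into two pieces.
split = [
    [1, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
# One stray pixel bridges the gap and merges both lines.
merged = [
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]

for name, pred in (("split", split), ("merged", merged)):
    print(name, count_components(pred), round(pixel_iou(pred, target), 3))
```

In both toy predictions a single pixel is wrong, so the pixel-wise overlap stays above 0.9, yet the number of connected regions changes from two to three (split) or to one (merged). Note that the pixel-wise IoU used here is a simplification and is not the Line Intersection over Union metric reported in the paper.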

The 8th Asian Conference on Pattern Recognition (ACPR 2025) took place in Australia in November. The conference serves as a hub for global collaboration and innovation in the field of pattern recognition.

Congratulations to Rafael, Tingyu, and Robert on this excellent achievement!

If you’re curious about the implementation of the authors’ methods, have a look at Rafael’s GitHub page.

About the Authors

Tingyu Lin

Tingyu Lin is a predoctoral researcher at the Computer Vision Lab within the Institute of Visual Computing & Human-Centered Technology at TU Wien Informatics. He is currently conducting research for the FWF doctoral program Visual Analytics and Computer Vision Meet Cultural Heritage (VaCoViCu) under the supervision of Robert Sablatnig, in collaboration with the Institute of Creative Media Technologies at the St. Pölten University of Applied Sciences. His research primarily focuses on the automatic analysis and visualization of historical photographs and amateur films, with the goal of preserving cultural heritage.

Robert Sablatnig

Robert Sablatnig was born in Carinthia, Austria, in 1965. At the Pattern Recognition and Image Processing Group, he served as an assistant professor from 1992 to 2003 and as an associate professor of computer vision from 2003 to 2010. From 2005 to 2017, he headed the Institute of Computer Aided Automation. Since 2010, he has headed the Computer Vision Lab, which is part of the Institute of Visual Computing & Human-Centered Technology. The institute was founded in 2018, and he has headed it since 2019. His research interests are 3D Computer Vision (including Range Finder, Stereovision, Shape from X, Registration, Calibration, and Robot Vision), Machine and Deep Learning for Computer Vision, Video Data Analysis (Motion and Tracking), Automated Document Analysis, Multispectral Imaging, Virtual and Augmented Reality, and Applications in Industry and Cultural Heritage Preservation. He has edited 17 conference proceedings and is the author or co-author of more than 300 refereed scientific publications in journals and at international conferences and workshops. He has served on many program committees for international conferences, and as an editorial board member and referee for international journals and conferences. He is also the Vice President of the Austrian Association for Pattern Recognition (AAPR/OAGM), the Austrian branch of IAPR, and a member of the IEEE.

Rafael Sterzinger

Rafael Sterzinger is a PhD student at the Computer Vision Lab within the Institute of Visual Computing & Human-Centered Technology at TU Wien Informatics. His doctoral research focuses on meta-learning for rapid domain adaptation of vision-language models to specialized data-scarce domains, and he is supervised by Robert Sablatnig. He holds a master’s degree in Computer Science from ETH Zürich (2020-2023), where he worked on machine learning approaches for power grid optimization under the supervision of Prof. Andreas Krause, and a bachelor’s degree in Business Informatics from TU Wien Informatics (2017-2020). His research interests center on computer vision, meta-learning, and probabilistic artificial intelligence. He has published at venues such as ICDAR, ICPR, ICIP, and ICMLA on topics including deep segmentation for extracting art from ancient Etruscan mirrors, human-guided refinement of segmentation masks, and few-shot segmentation of historical maps.
