A Software System for Semi-Automatic Processing of Historical Handwritten Arabic Documents

The HADARA System is a modular toolkit for processing historical documents, with particular support for handwritten Arabic documents. Its processing techniques are implemented as separate modules that can be chained into complex processing pipelines. Built upon these modules and a common framework, the HADARA System also contains several applications and tools, both for users such as historians and linguists and for developers and researchers in the fields of document image analysis, image processing, and pattern recognition.

The source code of the HADARA System is written to run in the MathWorks MATLAB® environment.
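The idea of chaining processing modules can be illustrated with a minimal, generic sketch. It is written in Python for brevity and is not HADARA code: the names `binarize`, `crop_empty_rows`, and `run_chain`, the list-of-lists image format, and the threshold value are all assumptions made for this illustration only.

```python
from typing import Callable, List

# Hypothetical types for illustration: a grayscale image as rows of 0-255 values,
# and a "module" as any function that maps one image to another.
Image = List[List[int]]
Module = Callable[[Image], Image]


def binarize(threshold: int) -> Module:
    """Build a module that marks pixels darker than `threshold` as ink (1)."""
    def run(img: Image) -> Image:
        return [[1 if px < threshold else 0 for px in row] for row in img]
    return run


def crop_empty_rows(img: Image) -> Image:
    """Drop rows that contain no ink pixels."""
    return [row for row in img if any(row)]


def run_chain(modules: List[Module], img: Image) -> Image:
    """Feed the output of each module into the next one, in order."""
    for module in modules:
        img = module(img)
    return img


# A "configuration" in this sketch: an ordered chain plus per-module settings.
chain = [binarize(threshold=128), crop_empty_rows]

# Tiny synthetic page: white background with two dark pixels in the middle row.
page = [
    [255, 255, 255],
    [255, 10, 20],
    [255, 255, 255],
]

result = run_chain(chain, page)
```

In this sketch the module settings are fixed when a module is built, and the information flow is simply the order of the list. A real system may route data between modules in more flexible ways.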


Among the applications of the HADARA System are:

HADARA Tool (main application):
  • annotate documents
  • perform word spotting
  • search transcriptions
  • create configurations consisting of processing chains and module settings
  • define the information flow between modules
Palaeography GUI:
  • add palaeographic comments to document images and any kind of pre-annotated segments
  • re-use annotation files from the HADARA Tool
Writer Analysis GUI:
  • perform several kinds of writer analysis on document images
  • (not yet open sourced)

For more details about these applications and other tools, please refer to the project documentation (available soon).


Figure: HADARA Tool, Annotation View

Figure: Paleography Tool

Figure: Setup GUI

See Also

The HADARA80P dataset provides a comprehensive ground truth to aid in the development and evaluation of keyword spotting approaches targeting Arabic manuscripts.


The HADARA System and its source code are distributed under the GNU General Public License Version 2, or (at your option) any later version.

We are very interested in how this software is used. If you have any feedback, suggestions, or comments, please contact Volker Märgner (email).


For paper authors: We appreciate references to the following paper whenever the HADARA System or parts of it have been used in any way related to your paper:

  1. W. Pantke, V. Märgner, D. Fecker, T. Fingscheidt, A. Asi, O. Biller, J. El-Sana, R. Saabni, and M. Yehia, "HADARA – A software system for semi-automatic processing of historical handwritten Arabic documents" in Proc. Archiving Conf. 2013, Washington DC, USA, April 2013, pp. 161–166.


This work has been funded by the German Research Foundation (DFG) within the scope of the research grant FI 1494/3-2.


The source code will be available for download soon. In the meantime, please contact Volker Märgner (email).