Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
OCR engine for all languages
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets
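
To illustrate what an hOCR to ALTO conversion has to do, here is a minimal sketch of the core mapping in Python. It is not the XSL stylesheets mentioned above; it is a hand-written stand-in that only shows how an hOCR word box (`bbox x0 y0 x1 y1` in the `title` attribute) could become an ALTO `String` element with `HPOS`, `VPOS`, `WIDTH` and `HEIGHT`. The sample input, the function name `hocr_words_to_alto_line` and the decision to ignore namespaces, confidences and page or block structure are all assumptions made for brevity.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample input: a tiny, namespace-free hOCR fragment.
# Real Tesseract hOCR is a full XHTML document with an XML namespace.
HOCR_SAMPLE = """<div class="ocr_page" title="bbox 0 0 200 100">
  <span class="ocr_line" title="bbox 10 20 150 44">
    <span class="ocrx_word" title="bbox 10 20 70 44; x_wconf 95">Hello</span>
    <span class="ocrx_word" title="bbox 80 20 150 44; x_wconf 90">world</span>
  </span>
</div>"""

# hOCR stores the box as "bbox x0 y0 x1 y1" inside the title attribute.
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


def hocr_words_to_alto_line(hocr_xml: str) -> ET.Element:
    """Map every hOCR word span to an ALTO String inside one TextLine."""
    root = ET.fromstring(hocr_xml)
    text_line = ET.Element("TextLine")
    for span in root.iter("span"):
        if span.get("class") != "ocrx_word":
            continue  # skip line-level spans, keep only words
        match = BBOX_RE.search(span.get("title", ""))
        if match is None:
            continue  # no usable coordinates on this word
        x0, y0, x1, y1 = (int(v) for v in match.groups())
        # ALTO uses origin plus size instead of two corner points.
        ET.SubElement(text_line, "String", {
            "CONTENT": (span.text or "").strip(),
            "HPOS": str(x0),
            "VPOS": str(y0),
            "WIDTH": str(x1 - x0),
            "HEIGHT": str(y1 - y0),
        })
    return text_line


alto_line = hocr_words_to_alto_line(HOCR_SAMPLE)
print(ET.tostring(alto_line, encoding="unicode"))
```

A stylesheet-based converter would express the same corner-to-size arithmetic in XSLT templates and would also carry over the page, block and line hierarchy, which this sketch leaves out.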