Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets
OCR engine for all languages
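
To give a feel for what an hOCR to ALTO conversion involves, here is a minimal sketch in Python. It is not the XSL stylesheet approach named above: it uses only the standard library and maps word-level hOCR spans to ALTO-style String elements. The sample markup and the function name hocr_words_to_alto_strings are made up for illustration. Real hOCR is usually namespaced XHTML, and a full ALTO file needs the surrounding Layout, Page, TextBlock and TextLine structure, both of which this sketch leaves out.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical hOCR fragment, invented for this example. Real hOCR output is
# normally XHTML with a namespace, which this sketch does not handle.
HOCR_SAMPLE = """<div class="ocr_page" title="bbox 0 0 200 100">
  <span class="ocrx_word" title="bbox 36 52 96 76; x_wconf 95">Hello</span>
  <span class="ocrx_word" title="bbox 104 52 170 76; x_wconf 88">world</span>
</div>"""

# hOCR stores geometry and confidence in the title attribute.
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")
WCONF_RE = re.compile(r"x_wconf (\d+)")


def hocr_words_to_alto_strings(hocr_xml):
    """Return one ALTO-style <String> element per hOCR ocrx_word span."""
    root = ET.fromstring(hocr_xml)
    strings = []
    for span in root.iter("span"):
        if span.get("class") != "ocrx_word":
            continue
        title = span.get("title", "")
        match = BBOX_RE.search(title)
        if match is None:
            continue  # no usable geometry, skip the word
        x0, y0, x1, y1 = map(int, match.groups())
        # hOCR gives two corners; ALTO gives origin plus width and height.
        element = ET.Element("String", {
            "HPOS": str(x0),
            "VPOS": str(y0),
            "WIDTH": str(x1 - x0),
            "HEIGHT": str(y1 - y0),
            "CONTENT": (span.text or "").strip(),
        })
        conf = WCONF_RE.search(title)
        if conf:
            # hOCR x_wconf is 0-100; ALTO WC is a fraction between 0 and 1.
            element.set("WC", f"{int(conf.group(1)) / 100:.2f}")
        strings.append(element)
    return strings


if __name__ == "__main__":
    for element in hocr_words_to_alto_strings(HOCR_SAMPLE):
        print(ET.tostring(element, encoding="unicode"))
```

The main design point is the coordinate change: hOCR records a bounding box as two corners, while ALTO records a position plus width and height, so any converter, XSL-based or otherwise, has to do this subtraction somewhere.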