Recent Releases of ocr-fileformat

ocr-fileformat - v0.8.1

What's Changed

  • update page-to-alto to v2.0.1 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/190
  • Add ORCID for author in citation file by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/182

Full Changelog: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.8.0...v0.8.1

- JavaScript
Published by stweil 9 months ago

ocr-fileformat - v0.8.0

What's Changed

  • update page-to-alto to v1.4.1 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/188
  • install page2img by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/189

Full Changelog: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.7.0...v0.8.0

Published by stweil 10 months ago

ocr-fileformat - v0.7.0

What's Changed

  • Add transformation from hOCR to TEI and update transformation matrix by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/170
  • update textract2page to include slub/textract2page#13 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/171
  • update vendor/page-to-alto v1.2.0 -> v1.3.0 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/172
  • Update Dockerfile, fix #173 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/174
  • update textract2page by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/177
  • update textract2page (for valid @conf ranges) by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/180
  • update textract2page (v 0.2 - full LAYOUT etc.) by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/186

Full Changelog: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.6.0...v0.7.0

Published by stweil over 1 year ago

ocr-fileformat - v0.6.0

What's Changed

  • Add CodeQL workflow for GitHub code scanning by @lgtm-com in https://github.com/UB-Mannheim/ocr-fileformat/pull/155
  • gcv__page: use -source-json instead of -source-xml by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/156
  • make install: use newline in sed c cmd by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/158
  • Add textract2page by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/160
  • ensure venv for Python tools by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/162
  • add PRImA converter for GCV→ALTO by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/163
  • Update Makefile to support macOS by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/165
  • update textract2page, hOCR-to-ALTO and alto-schema by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/166
  • Fix two issues reported by CodeQL CI by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/161
  • Fix broken conversions from hOCR to ALTO by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/167
  • Replace broken Travis CI by GitHub action by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/168
  • Use first bash from PATH (allows running on macOS) by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/169

New Contributors

  • @lgtm-com made their first contribution in https://github.com/UB-Mannheim/ocr-fileformat/pull/155

Full Changelog: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.5.0...v0.6.0

Published by stweil over 2 years ago

ocr-fileformat - v0.5.0

What's Changed

  • ⬆️ Update JPageConverter to 1.5.05 by @mikegerber in https://github.com/UB-Mannheim/ocr-fileformat/pull/131
  • update hocr2alto to include filak/hOCR-to-ALTO#23 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/130
  • page schemas: use github not primaresearch.org by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/132
  • Page to alto python by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/134
  • [doc][fix] clear README cli links by @M3ssman in https://github.com/UB-Mannheim/ocr-fileformat/pull/141
  • Add ImageWare MyBib to ALTO conversion by karkraeg, fix #139 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/140
  • page__alto: process all arguments by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/142
  • when converting to PAGE, always use latest schema by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/146
  • docker: unlimit POST upload size, #136 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/137
  • Update Saxon-HE by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/144
  • Use git submodules by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/148
  • update page-to-alto by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/152
  • page to text: rewrite by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/151
  • Update SaxonHE to version 11.2 by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/149
  • vendor/Makefile: page-to-alto is phony by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/154

New Contributors

  • @mikegerber made their first contribution in https://github.com/UB-Mannheim/ocr-fileformat/pull/131
  • @M3ssman made their first contribution in https://github.com/UB-Mannheim/ocr-fileformat/pull/141
  • @bertsky made their first contribution in https://github.com/UB-Mannheim/ocr-fileformat/pull/142

Full Changelog: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.4.0...v0.5.0

Published by stweil over 3 years ago

ocr-fileformat - v0.4.0

Update JPageConverter and saxon9he, drop support for Python 2

Published by stweil over 5 years ago

ocr-fileformat - v0.3.2

  • Fix error handling for missing wget, unzip or git

Published by stweil over 5 years ago

ocr-fileformat - v0.3.1

  • Improve error handling for missing wget, unzip or git

Published by stweil over 5 years ago

ocr-fileformat - v0.3.0

  • Improve PAGE support
  • Update ALTO support
  • Add new conversions, e.g. hOCR to TEI, ABBYY to hOCR, PAGE to ALTO, ABBYY / ALTO / GCV / hOCR to PAGE, GCV to hOCR
  • Add new command line option --version
  • Fix bugs

Published by stweil about 6 years ago

ocr-fileformat -

Fixed

  • Fix download button in web interface #73
  • Fix https URL in Docker builds #75

Changed

  • Tab bar above input #72
  • Example URLs via https

Added

  • make help

Published by kba about 8 years ago

ocr-fileformat - Add transformation gcv2hocr and fix some issues with the web interface

  • Support new transformation from Google Cloud Vision format to hOCR
  • Fix format switching in transform web interface
  • Produce valid HTML
  • Use eslint for JS code style checking
  • Use best practices for Dockerfile

Published by kba about 8 years ago

ocr-fileformat - Update to new URLs for ABBYY schema and Docker fixes

  • Docker fixes (busybox/alpine incompatibilities + allow overriding web config) and add documentation for Docker https://github.com/UB-Mannheim/ocr-fileformat/pull/33, https://github.com/UB-Mannheim/ocr-fileformat/pull/45, https://github.com/UB-Mannheim/ocr-fileformat/pull/53
  • Update URLs to ABBYY schemas, add new PAGE format 2016-07-15 https://github.com/UB-Mannheim/ocr-fileformat/commit/fded289165d557ba016fc83f5fbbf034295313eb
  • Switch to official filak/hOCR-to-ALTO repo, linking language codes lookup xml https://github.com/UB-Mannheim/ocr-fileformat/pull/48, https://github.com/UB-Mannheim/ocr-fileformat/issues/46, https://github.com/UB-Mannheim/ocr-fileformat/pull/52

Published by zuphilip almost 9 years ago

ocr-fileformat - Improved web interface, code cleanup and script support

  • Add option to run arbitrary scripts: in addition to XSD/XSLT, arbitrary executables (written in Python, bash, or compiled C code) can be placed in ./script/validate/ and ./script/transform/.
  • Validation: check hOCR with hocr-check from tmbdev/hocr-tools
  • Web interface: Download button for transformation results
  • Web interface: Support file uploads for transformation and validation
  • Enable ALTO/hOCR to plain text transformations
  • Code cleanup of the shared shell script library
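
As an illustration of the script support listed above, here is a minimal sketch of what an executable placed in ./script/transform/ could look like: a small Python program that turns hOCR into plain text. These notes do not describe how such scripts are invoked, so the calling convention (input path as first argument, optional output path as second) is an assumption, as are the names hocr_to_text and HocrTextExtractor. The sketch also assumes well-formed XHTML input and only looks at elements with class "ocr_line".

```python
#!/usr/bin/env python3
"""Hypothetical transform script: hOCR -> plain text (illustrative only)."""
import sys
from html.parser import HTMLParser


class HocrTextExtractor(HTMLParser):
    """Collect the words of each hOCR line (class="ocr_line") as one text line."""

    def __init__(self):
        super().__init__()
        self.lines = []   # one list of words per ocr_line element
        self._depth = 0   # > 0 while the parser is inside an ocr_line element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1          # nested element inside the current line
        elif "ocr_line" in classes:
            self._depth = 1           # a new line starts here
            self.lines.append([])

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth and data.strip():
            self.lines[-1].extend(data.split())


def hocr_to_text(hocr: str) -> str:
    """Return the text of all ocr_line elements, one line of output per element."""
    parser = HocrTextExtractor()
    parser.feed(hocr)
    parser.close()
    return "\n".join(" ".join(words) for words in parser.lines)


def main(args):
    # Assumed convention: args[0] is the input file, args[1] an optional output file.
    with open(args[0], encoding="utf-8") as infile:
        text = hocr_to_text(infile.read())
    if len(args) > 1:
        with open(args[1], "w", encoding="utf-8") as outfile:
            outfile.write(text + "\n")
    else:
        print(text)


# Only act as a script when at least an input path is given.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```

Keeping the conversion in a plain function (hocr_to_text) separates it from the argument handling, so the same logic can be reused or tested if the real calling convention differs from the one assumed here.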

More details: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.1.0...v0.2.0

Published by zuphilip over 9 years ago

ocr-fileformat - Add transformation: alto2 -> alto3

  • Add transformation from alto2 to alto3: alto2.0__alto3.0.xsl. Thanks to @cneud!
  • Normalize project name and fix some links
  • Makefile: release goal

More details: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.0.1...v0.0.2

Published by zuphilip over 9 years ago

ocr-fileformat -

Published by kba over 9 years ago

ocr-fileformat - Initial commit

Initial commit

  • Transform hOCR <-> ALTO 2.0/2.1
  • Validate ALTO 1/2/3, ABBYY 6, 8, 9, 10, PAGE

Published by kba almost 10 years ago