Recent Releases of ocr-fileformat
ocr-fileformat - v0.8.1
What's Changed
- update page-to-alto to v2.0.1 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/190
- Add ORCID for author in citation file by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/182
Full Changelog: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.8.0...v0.8.1
- JavaScript
Published by stweil 9 months ago
ocr-fileformat - v0.8.0
What's Changed
- update page-to-alto to v1.4.1 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/188
- install page2img by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/189
Full Changelog: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.7.0...v0.8.0
Published by stweil 10 months ago
ocr-fileformat - v0.7.0
What's Changed
- Add transformation from hOCR to TEI and update transformation matrix by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/170
- update textract2page to include slub/textract2page#13 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/171
- update vendor/page-to-alto v1.2.0 -> v1.3.0 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/172
- Update Dockerfile, fix #173 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/174
- update textract2page by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/177
- update textract2page (for valid @conf ranges) by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/180
- update textract2page (v 0.2 - full LAYOUT etc.) by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/186
Full Changelog: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.6.0...v0.7.0
Published by stweil over 1 year ago
ocr-fileformat - v0.6.0
What's Changed
- Add CodeQL workflow for GitHub code scanning by @lgtm-com in https://github.com/UB-Mannheim/ocr-fileformat/pull/155
- gcv__page: use -source-json instead of -source-xml by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/156
- make install: use newline in sed c cmd by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/158
- Add textract2page by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/160
- ensure venv for Python tools by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/162
- add PRImA converter for GCV→ALTO by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/163
- Update Makefile to support macOS by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/165
- update textract2page, hOCR-to-ALTO and alto-schema by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/166
- Fix two issues reported by CodeQL CI by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/161
- Fix broken conversions from hOCR to ALTO by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/167
- Replace broken Travis CI by GitHub action by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/168
- Use first bash from PATH (allows running on macOS) by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/169
New Contributors
- @lgtm-com made their first contribution in https://github.com/UB-Mannheim/ocr-fileformat/pull/155
Full Changelog: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.5.0...v0.6.0
Published by stweil over 2 years ago
ocr-fileformat - v0.5.0
What's Changed
- ⬆️ Update JPageConverter to 1.5.05 by @mikegerber in https://github.com/UB-Mannheim/ocr-fileformat/pull/131
- update hocr2alto to include filak/hOCR-to-ALTO#23 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/130
- page schemas: use github not primaresearch.org by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/132
- Page to alto python by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/134
- [doc][fix] clear README cli links by @M3ssman in https://github.com/UB-Mannheim/ocr-fileformat/pull/141
- Add ImageWare MyBib to ALTO conversion by karkraeg, fix #139 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/140
- page__alto: process all arguments by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/142
- when converting to PAGE, always use latest schema by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/146
- docker: unlimit POST upload size, #136 by @kba in https://github.com/UB-Mannheim/ocr-fileformat/pull/137
- Update Saxon-HE by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/144
- Use git submodules by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/148
- update page-to-alto by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/152
- page to text: rewrite by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/151
- Update SaxonHE to version 11.2 by @stweil in https://github.com/UB-Mannheim/ocr-fileformat/pull/149
- vendor/Makefile: page-to-alto is phony by @bertsky in https://github.com/UB-Mannheim/ocr-fileformat/pull/154
New Contributors
- @mikegerber made their first contribution in https://github.com/UB-Mannheim/ocr-fileformat/pull/131
- @M3ssman made their first contribution in https://github.com/UB-Mannheim/ocr-fileformat/pull/141
- @bertsky made their first contribution in https://github.com/UB-Mannheim/ocr-fileformat/pull/142
Full Changelog: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.4.0...v0.5.0
Published by stweil over 3 years ago
ocr-fileformat - v0.4.0
Update JPageConverter and saxon9he, drop support for Python 2
Published by stweil over 5 years ago
ocr-fileformat - v0.3.2
- Fix error handling for missing wget, unzip or git
Published by stweil over 5 years ago
ocr-fileformat - v0.3.1
- Improve error handling for missing wget, unzip or git
Published by stweil over 5 years ago
ocr-fileformat - v0.3.0
- Improve PAGE support
- Update ALTO support
- Add new conversions, e.g. hOCR to TEI, ABBYY to hOCR, PAGE to ALTO, ABBYY / ALTO / GCV / hOCR to PAGE, GCV to hOCR
- Add new command line option --version
- Fix bugs
Published by stweil about 6 years ago
ocr-fileformat -
Fixed
- Fix download button in web interface #73
- Fix https URL in Docker builds #75
Changed
- Tab bar above input #72
- Example URLs via https
Added
- make help
Published by kba about 8 years ago
ocr-fileformat - Add transformation gcv2hocr and fix some issues with web interface
- Support new transformation from Google Cloud Vision format to hOCR
- Fix format switching in transform web interface
- Produce valid HTML
- Use eslint for JS code style checking
- Use best practices for Dockerfile
Published by kba about 8 years ago
ocr-fileformat - Update to new URLs for ABBYY schema and Docker fixes
- Docker fixes (busybox/alpine incompatibilities + allow overriding web config) and add documentation for Docker https://github.com/UB-Mannheim/ocr-fileformat/pull/33, https://github.com/UB-Mannheim/ocr-fileformat/pull/45, https://github.com/UB-Mannheim/ocr-fileformat/pull/53
- Update URLs to ABBYY schemas, add new PAGE format 2016-07-15 https://github.com/UB-Mannheim/ocr-fileformat/commit/fded289165d557ba016fc83f5fbbf034295313eb
- Switch to official filak/hOCR-to-ALTO repo, linking language codes lookup xml https://github.com/UB-Mannheim/ocr-fileformat/pull/48, https://github.com/UB-Mannheim/ocr-fileformat/issues/46, https://github.com/UB-Mannheim/ocr-fileformat/pull/52
Published by zuphilip almost 9 years ago
ocr-fileformat - Improved web interface, code cleanup and script support
- Add option to run arbitrary scripts: In addition to XSD/XSLT, arbitrary executable scripts can be placed in ./script/validate and ./script/transform/, written in Python, bash or compiled C code.
- Validation: hocr against hocr-check from tmbdev/hocr-tools
- Web interface: Download button for transformation results
- Web interface: Support file uploads for transformation and validation
- Enable ALTO/hocr to plain text transformations
- Code cleanup of the shared shell script library
More details: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.1.0...v0.2.0
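As an illustration of the script support and the hOCR to plain text transformation listed above, here is a minimal sketch of what such an executable script could look like in Python. It is not the project's actual script. The calling convention (input path as the only argument, result on standard output) is an assumption, and the names HocrTextExtractor and hocr_to_text are hypothetical. Only the class name ocr_line comes from the hOCR format itself.

```python
#!/usr/bin/env python3
"""Hypothetical transform script: extract plain text from hOCR lines."""
import sys
from html.parser import HTMLParser

# Elements that never get a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class HocrTextExtractor(HTMLParser):
    """Collect the text of every element carrying the class 'ocr_line'."""

    def __init__(self):
        super().__init__()
        self.lines = []   # finished text lines
        self._depth = 0   # nesting depth inside the current ocr_line (0 = outside)
        self._parts = []  # text fragments of the current line

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1
        elif "ocr_line" in classes:
            self._depth = 1
            self._parts = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse runs of whitespace between words into single spaces.
            self.lines.append(" ".join("".join(self._parts).split()))

    def handle_data(self, data):
        if self._depth:
            self._parts.append(data)


def hocr_to_text(markup):
    """Return one text line per ocr_line element found in the markup."""
    parser = HocrTextExtractor()
    parser.feed(markup)
    parser.close()
    return "\n".join(parser.lines)


# Assumed calling convention: script.py INPUT_FILE, result on stdout.
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], encoding="utf-8") as handle:
        print(hocr_to_text(handle.read()))
```

A script like this would have to be made executable before it could be dropped into a script directory. How the project actually passes input and output to scripts should be checked in its README.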
Published by zuphilip over 9 years ago
ocr-fileformat - Add transformation: alto2 -> alto3
- Add transformation from alto2 to alto3: alto2.0__alto3.0.xsl. Thanks to @cneud!
- Normalize project name and fix some links
- Makefile: release goal
More details: https://github.com/UB-Mannheim/ocr-fileformat/compare/v0.0.1...v0.0.2
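The stylesheet name alto2.0__alto3.0.xsl, like the names gcv__page and page__alto in later releases, suggests a naming pattern of source format, double underscore, target format. The sketch below splits such a name into its two parts. The function name parse_transform_name is hypothetical, and the pattern is inferred from these release notes, not taken from the project's code.

```python
def parse_transform_name(filename):
    """Split 'SOURCE__TARGET[.xsl]' into a (source, target) tuple.

    Only a trailing '.xsl' is stripped, so version suffixes such as
    the '.0' in 'alto3.0' are left intact.
    """
    suffix = ".xsl"
    stem = filename[:-len(suffix)] if filename.endswith(suffix) else filename
    source, separator, target = stem.partition("__")
    if not separator or not source or not target:
        raise ValueError(f"not a SOURCE__TARGET name: {filename!r}")
    return source, target


if __name__ == "__main__":
    for name in ("alto2.0__alto3.0.xsl", "gcv__page", "page__alto"):
        print(name, "->", parse_transform_name(name))
```

Stripping only the known .xsl suffix avoids the pitfall of a generic extension splitter, which would treat the .0 of a version number as a file extension.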
Published by zuphilip over 9 years ago
ocr-fileformat - Initial commit
Initial commit
- Transform hOCR <-> ALTO 2.0/2.1
- Validate ALTO 1/2/3, ABBYY 6,8,9,10, PAGE
Published by kba almost 10 years ago