Recent Releases of frog
frog - v0.31
[Ko van der Sloot]
* use ticcutils > 0.34; NFC normalization is standard now
* use Tokenizer::config_prefix() instead of the magic string 'tokconfig-'
* code cleanup and quality improvement (cppcheck is very useful)
[Maarten van Gompel]
* added frog demo gif
- C++
Published by kosloot over 2 years ago
frog - v0.29
[Ko van der Sloot]
* added a fix for https://github.com/LanguageMachines/frog/issues/100 (where Frog created invalid FoLiA in a corner case)
* improved api_test
* small code refactoring
* require libfolia >= 2.15, for word correction to work correctly
* improved MWU code: using Unicode strings and detecting MWUs that start with a capital
[Maarten van Gompel]
* .gitignore: added build dir
- C++
Published by kosloot about 3 years ago
frog - v0.27
[Ko van der Sloot] Major Release. Internally we now always perform a 'deep' morphological analysis. This information is used for XML and JSON output. For the 'classic' tabbed output, we maintain backward compatibility: you need to specify '--deep-morph' to get the deep analysis in the output. You may also specify '--compounds' to get an extra column with compound information.
Other changes:
* C++ code quality
* adapted to more recent Timbl implementations (Unicode awareness)
* Tokenizer:
  - better handling of the --languages option
  - 'und' is now also acceptable as a "language"
  - better debugging possibilities
* Mbma: too many alternatives with inverted verbs were generated. As the tagger doesn't help us directly, we filter on the person of the next word, and only return V/te2I when the next word is 2nd person
- C++
Published by kosloot over 3 years ago
frog - v0.26
[Ko van der Sloot]
* fix for https://github.com/LanguageMachines/frog/issues/96
* code improvements, readability and fixing CppCheck warnings
* needs recent ticcutils (>=0.30)
* needs newest Timbl (6.8) for more Unicode awareness
* updated GitHub action
[Maarten van Gompel]
* added MAINTAINERS file
* updated codemeta.json
- C++
Published by kosloot over 3 years ago
frog - v0.25
[Maarten van Gompel]
* updated metadata (codemeta.json) following new (proposed) CLARIAH requirements (CLARIAH/clariah-plus#38)
* added builds-deps.sh for automatically building and installing dependencies
* added Dockerfile and instructions
* added support for user-based configuration dirs ($XDG_CONFIG_HOME/frog), which take precedence over global data dirs
* use frogdata 0.21, ucto 0.25
[Ko vd Sloot]
* updated Doxygen config file
- C++
Published by proycon almost 4 years ago
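The precedence of user-based configuration directories over global data directories can be pictured as a simple search order. The following is a minimal sketch, not Frog's actual code: the helper names `config_candidates` and `find_config`, the fallback to `$HOME/.config` (the XDG Base Directory default when the variable is unset), and the global directory passed in are all assumptions made for illustration.

```cpp
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: candidate configuration directories, highest priority first.
// The environment values are passed in explicitly so the function is easy to test.
std::vector<fs::path> config_candidates(const char* xdg_config_home,
                                        const char* home,
                                        const fs::path& global_dir) {
  std::vector<fs::path> dirs;
  if (xdg_config_home && *xdg_config_home) {
    dirs.push_back(fs::path(xdg_config_home) / "frog");
  } else if (home && *home) {
    // Assumption: fall back to the XDG default location when the variable is unset.
    dirs.push_back(fs::path(home) / ".config" / "frog");
  }
  dirs.push_back(global_dir);  // the global data dir is always tried last
  return dirs;
}

// Hypothetical helper: return the first candidate directory that contains `name`.
std::optional<fs::path> find_config(const std::vector<fs::path>& dirs,
                                    const std::string& name) {
  for (const auto& dir : dirs) {
    const fs::path candidate = dir / name;
    std::error_code ec;
    if (fs::exists(candidate, ec)) {
      return candidate;
    }
  }
  return std::nullopt;
}
```

A caller would typically pass `std::getenv("XDG_CONFIG_HOME")` and `std::getenv("HOME")` as the first two arguments; because the user directory comes first in the list, a file found there shadows the one in the global directory.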
frog - v0.24
[Ko vd Sloot]
* start using the newest UTF-8 aware Timbl, Mbt and Ucto
* use NFC normalized UnicodeString more generally internally
* added a fix in the MBMA coding, to get better reproducible results on different OS/compiler combinations
* lots of small refactorings
* bumped library version, because of some API changes
[Maarten van Gompel]
* merged a patch for configure.ac suggested by Helmut Grohne <helmut@subdivi.de>:
  Bug#993123: frog FTCBFS: hard codes the build architecture pkg-config
  Source: frog
  Version: 0.20-2
  Tags: patch upstream
  User: debian-cross@lists.debian.org
  Usertags: ftcbfs
  frog fails to cross build from source, because configure.ac hard codes the build architecture pkg-config in one place (after correctly detecting the host architecture one). Simply using the correct substitution variable makes frog cross buildable. Please consider applying the attached patch.
  Helmut
  Signed-off-by: Maarten van Gompel <proycon@anaproy.nl>
- C++
Published by kosloot over 4 years ago
frog - v0.20
[Ko vd Sloot]
* added Doxygen to the build
* added a lot of comments in Doxygen format
* adapted to the newest ticcutils version
* adapted to latest libfolia
* adapted to latest ucto
* lots of code refactorings
* implemented --JSONin option (server only)
* implemented --JSONout option
* added a --allow-word-correction option which allows ucto to correct FoLiA Word nodes
[Iris Hendrix]
* documentation updates
- C++
Published by kosloot about 6 years ago
frog - v0.19
- added code to use a locally installed Alpino parser
- added code to use a remote Alpino Server
- added code to use (remote) timblservers and mbtservers for all modules using JSON calls. Still experimental.
Several code refactorings and small fixes:
- memory leaks
- using NER files in non-standard locations
- bug fixes for some corner cases.
frog.*.debug files are cleaned up after 1 day.
- C++
Published by kosloot over 6 years ago
frog - v0.18
Bug fixes and enhancements:
* provenance uses new 'generate_id' option in libfolia:processor
* solved problems when frogging partly tokenized FoLiA
* solved problems when processing with --skip=t
* small improvement in compound detection (still more to do...)
- C++
Published by kosloot almost 7 years ago
frog - v0.16
This is the last release using pre-2.0 FoLiA. It includes a total rework of the Frog internals, aiming at better maintainability and hoping for a speedup and a smaller memory footprint. This work will continue in the upcoming release for FoLiA 2.0.
Major Changes:
* total rework: a FrogData structure, not a FoLiA document, is now the internal data structure
* use folia::engine for all FoLiA processing
* the -Q option is NOT supported anymore. It was unreliable anyway
* builds on the newest ucto versions only
* fix for https://github.com/proycon/LaMachine/issues//135 https://github.com/LanguageMachines/frog/issues/66
* handles some corner cases in FoLiA better
* lots of code cleanup
* numerous small fixes (e.g. in NER and MBMA results)
* improved working of the --languages option
* avoid invalid FoLiA: https://github.com/LanguageMachines/frog/issues/60
* fixed memory leaks
* better handling of weird FoLiA
[Maarten van Gompel]
* added skeleton for new Frog documentation
- C++
Published by kosloot about 7 years ago
frog - v0.15
[Ko vd Sloot]
* uctotokenizermod: removed call of (useless) ucto:setSentenceDetection(true)
* fix to close the server when a socket fails
* when frogging a file, and the docID is NOT specified, use the filename as the docID (filtering out non-NCName characters)
* fix building the documentation from TeX files
* a lot of small code improvements
[Maarten van Gompel]
* added codemeta.json
* fixed python-frog example in documentation (closes #48)
- C++
Published by kosloot about 8 years ago
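Deriving a docID from a filename by filtering out non-NCName characters, as described for this release, can be sketched as follows. This is an assumption-laden illustration rather than Frog's code: it handles ASCII only (real NCNames also allow many non-ASCII letters), the function name `docid_from_filename` is hypothetical, and the "doc-" prefix used when the result would not start with a letter or underscore is an invented convention.

```cpp
#include <string>

namespace {
bool is_ascii_letter(char c) {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}
bool is_ascii_digit(char c) { return c >= '0' && c <= '9'; }
}  // namespace

// Hypothetical helper: build an NCName-safe document ID from a file path.
std::string docid_from_filename(const std::string& filename) {
  // Keep only the last path component.
  const auto slash = filename.find_last_of("/\\");
  const std::string base =
      (slash == std::string::npos) ? filename : filename.substr(slash + 1);

  // Drop every character that may not appear in an (ASCII) NCName,
  // which notably removes spaces, colons and parentheses.
  std::string id;
  for (char c : base) {
    if (is_ascii_letter(c) || is_ascii_digit(c) ||
        c == '_' || c == '-' || c == '.') {
      id.push_back(c);
    }
  }

  // An NCName must start with a letter or underscore.
  // Assumption: prepend a fixed prefix when it does not.
  if (id.empty() || !(is_ascii_letter(id[0]) || id[0] == '_')) {
    id.insert(0, "doc-");
  }
  return id;
}
```

With this sketch, a path such as `/data/my file (1).txt` becomes `myfile1.txt`, and a name starting with a digit gets the prefix.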
frog - v0.14
- use TiCC::UniFilter now
- use TiCC::diacritics_filter now
- configuration modernized. OSX build supported too
- XML (FoLiA) files are autodetected
- some more logging and time stamps added
- added code to the NER module to override original tags (e.g. from a gazetteer)
- C++
Published by kosloot over 8 years ago
frog - v0.13.8
- added -t / --textredundancy option, which is passed to ucto
- set textclass attributes on entities (folia 1.5 feature)
- better textclass handling in general
- multiple types of entities (setnames) are stored in different layers
- some small provisions for 'multi word' words added. mblem may use them; other modules just ignore them (seeing a multiword as multiple words)
- added --inputclass and --outputclass options (preferred over textclass)
- added a --retry option, to redo complete directories, skipping what is done.
- added a --nostdout option to suppress the tabbed output to stdout
- refactoring and small fixes
- C++
Published by kosloot over 8 years ago
frog - v0.13.6
- rework done on compounding in MBMA. (still work in progress)
- lots of improvement in MBMA rule handling. (but still work in progress)
- support for 'glue' rules added.
- support for 'hidden' morphemes added.
- proper CELEX tags are now output in the XML
- some structure labels have better names now
- removed exit() calls from library modules (issue #17)
- added a languages option, which is handed over to ucto too
- detect multiple languages
- handle a selected language and ignore the rest
- C++
Published by kosloot over 9 years ago
frog - v0.13.4
- added long options --help and --version
- interactive use is limited to TTYs only, so pipes from stdin work
- added a --language='name' option. It tries to read the configuration from a subdirectory 'name' in the configdir. The default is 'nl'
- tokenizer timing is fixed at last
- be robust against a missing clex tag
- better warning when OpenMP is not present
- adaptation in mbma
- added 2 convenience functions to FrogAPI: get_full_morph_analysis() and get_compound_analysis()
- CompoundType is now in its own namespace
- some code refactoring, as usual
- C++
Published by kosloot almost 10 years ago
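Limiting interactive use to TTYs, so that piped standard input is processed as ordinary batch input, is commonly done with the POSIX `isatty` call. The sketch below shows that general technique; the function names are hypothetical and it is not taken from Frog's sources.

```cpp
#include <string>
#include <unistd.h>  // isatty, STDIN_FILENO (POSIX)

// True only when the descriptor is attached to a terminal.
bool is_interactive(int fd = STDIN_FILENO) {
  return isatty(fd) == 1;
}

// Hypothetical helper: choose an input mode based on the descriptor type.
// A pipe or redirected file yields "batch", a terminal yields "interactive".
std::string input_mode(int fd = STDIN_FILENO) {
  return is_interactive(fd) ? "interactive" : "batch";
}
```

A program following this pattern would show a prompt only when `input_mode()` reports "interactive", and otherwise read its input to the end without prompting.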