Recent Releases of https://github.com/acdh-oeaw/arche-lib-ingest
https://github.com/acdh-oeaw/arche-lib-ingest - Honor proxy settings from env vars
- PHP
Published by zozlak 9 months ago
https://github.com/acdh-oeaw/arche-lib-ingest - Indexer SKIP_SPECIAL skip mode added
The SKIP_SPECIAL mode skips all files whose names start with a dot, as well as Thumbs.db files.
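The filtering rule described above can be sketched as a small predicate. This is only an illustration of the rule, not the library's code: `isSpecialFile()` is a hypothetical helper, and the case-sensitive comparison against `Thumbs.db` is an assumption.

```php
<?php
// Hypothetical helper, not part of arche-lib-ingest: tells whether a file
// would be skipped under the rule described for the SKIP_SPECIAL mode.
function isSpecialFile(string $path): bool
{
    $name = basename($path);
    // Names starting with a dot (e.g. .DS_Store) or the Thumbs.db file.
    // Assumption: the comparison is case-sensitive.
    return str_starts_with($name, '.') || $name === 'Thumbs.db';
}

$files = ['data/report.pdf', 'data/.DS_Store', 'data/img/Thumbs.db', 'data/img/a.png'];
// Keep only the files that the rule does not mark as special.
$kept = array_values(array_filter($files, fn(string $f): bool => !isSpecialFile($f)));
print_r($kept);
```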
- PHP
Published by zozlak 12 months ago
https://github.com/acdh-oeaw/arche-lib-ingest - Indexer: report new version resource URI
- PHP
Published by zozlak about 1 year ago
https://github.com/acdh-oeaw/arche-lib-ingest - Indexer's automatic versioning rework
Previously, metadata handling during automatic new version creation was hardcoded (and configurable only with a bool $pidPass parameter). Now Indexer::setVersioning(), File::upload() and File::uploadAsync() take two callables with the following signatures:
php
function (
rdfInterface\DatasetNodeInterface $oldMeta,
acdhOeaw\arche\lib\Schema $repoSchema
): array{rdfInterface\DatasetNodeInterface $oldMeta, rdfInterface\DatasetNodeInterface $newMeta}
and
php
function (
acdhOeaw\arche\lib\RepoResource $old,
acdhOeaw\arche\lib\RepoResource $new
): void
The first one should generate old and new version metadata according to a given repository's business logic. The second one is responsible for doing any metadata adjustments which require both old and new version resource to exist (e.g. updating references which pointed to the old version to point to the new one).
A sample implementation of such handlers can be found in tests/IndexerTest.php::testNewVersionCreation().
- PHP
Published by zozlak about 1 year ago
https://github.com/acdh-oeaw/arche-lib-ingest - Reduce concurrency on reattempts
- PHP
Published by zozlak about 1 year ago
https://github.com/acdh-oeaw/arche-lib-ingest - Redmine class restored
For unknown reasons the acdhOeaw\arche\lib\ingest\Redmine class was silently removed in 4.0. Now it's back.
- PHP
Published by zozlak about 1 year ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
SkosVocabulary::preprocess(): do not add self-pointing parent link to the schema resource
- PHP
Published by zozlak over 1 year ago
https://github.com/acdh-oeaw/arche-lib-ingest - arche-lib bumped to ^7
- PHP
Published by zozlak over 1 year ago
https://github.com/acdh-oeaw/arche-lib-ingest - PHP 8.4 deprecation fixes
- PHP
Published by zozlak over 1 year ago
https://github.com/acdh-oeaw/arche-lib-ingest - File::updateAsync(): avoid unnecessary metadata update
- PHP
Published by zozlak over 1 year ago
https://github.com/acdh-oeaw/arche-lib-ingest - Various minor fixes
- MetadataCollection: terminate ingestion on terminal errors even in the pass error mode
- File: handle versioning when a repository resource exists but lacks a binary (it is then updated with the binary without creating a new version)
- CI tuning
- PHP
Published by zozlak over 1 year ago
https://github.com/acdh-oeaw/arche-lib-ingest - Retry if "deadlock detected" reported on a server side
- PHP
Published by zozlak over 1 year ago
https://github.com/acdh-oeaw/arche-lib-ingest - Allow PHP ^8.1
- PHP
Published by zozlak almost 2 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
- PHP
Published by zozlak almost 2 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Allow PHP 8.2
- PHP
Published by zozlak about 2 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Ported from EasyRdf to RdfInterface
- PHP
Published by zozlak about 2 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Ported from EasyRdf to RdfInterface
- PHP
Published by zozlak about 2 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - FileId class created
The file path to repository resource id translation code extracted as a separate class (acdhOeaw\arche\lib\ingest\util\FileId) allowing easy reuse in different libraries.
- PHP
Published by zozlak over 2 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
- Fixed network connection recognition in acdhOeaw\arche\lib\ingest\MetadataCollection::import()
- acdhOeaw\arche\lib\ingest\MetadataCollection::import() and acdhOeaw\arche\lib\ingest\Indexer::import(): waiting before a reingestion attempt on network errors tuned
- PHP
Published by zozlak almost 3 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
acdhOeaw\arche\lib\ingest\File::uploadAsync() emits the progress message for resources skipped on SKIP_NOT_EXIST
- PHP
Published by zozlak about 3 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - MetadataCollection and Indexer retry on network errors
Until now any network error simply interrupted the ingestion. Now network errors are treated in (almost) the same way as conflicts: each resource is retried up to a given limit.
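A per-resource retry limit of this kind could look roughly like the sketch below. Everything in it is hypothetical (`NetworkError`, `ingestWithRetry()`, the meaning of `$maxRetries` as "retries after the first attempt"); it shows the general pattern, not the library's implementation.

```php
<?php
// Hypothetical exception type standing in for a network failure.
class NetworkError extends RuntimeException {}

/**
 * Runs $ingest for a single resource, retrying on NetworkError.
 * Returns the number of attempts made; rethrows the last error once
 * $maxRetries retries have been used up.
 */
function ingestWithRetry(callable $ingest, int $maxRetries): int
{
    $attempt = 0;
    while (true) {
        $attempt++;
        try {
            $ingest();
            return $attempt;
        } catch (NetworkError $e) {
            if ($attempt > $maxRetries) {
                throw $e; // limit reached: give up on this resource
            }
            // A real implementation would probably wait here before retrying.
        }
    }
}

// Usage: the callable fails twice and succeeds on the third attempt.
$calls = 0;
$flaky = function () use (&$calls): void {
    $calls++;
    if ($calls < 3) {
        throw new NetworkError('connection reset');
    }
};
$attempts = ingestWithRetry($flaky, 3);
```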
- PHP
Published by zozlak about 3 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Redmine class added
- PHP
Published by zozlak about 3 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Tuning
acdhOeaw\arche\lib\ingest\Indexer::pathToUtf8(): assume UTF-8 on linux systems
- PHP
Published by zozlak about 3 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
acdhOeaw\arche\lib\ingest\Indexer::index(): assure id prefix ends with a slash for flat-structured imports.
- PHP
Published by zozlak about 3 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
Fixed the way acdhOeaw\arche\lib\ingest\Indexer::createFile() generates an id for flat structure ingestions (now the ids are also generated "flat", by combining just the id prefix and the filename).
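As a rough illustration of "flat" id generation, the sketch below joins an id prefix with the bare filename and ignores the directory part of the path. `flatId()` is a hypothetical helper; it assumes the prefix should end with a single slash and that no URL-encoding of the filename is applied, neither of which is confirmed for the library.

```php
<?php
// Hypothetical helper: builds a "flat" id from the id prefix and the file name
// only, dropping any directories in the file path.
function flatId(string $idPrefix, string $path): string
{
    // Assumption: the prefix is normalized to end with exactly one slash.
    $prefix = rtrim($idPrefix, '/') . '/';
    return $prefix . basename($path);
}

// Both calls yield the same id, regardless of where the file sits on disk.
$a = flatId('https://id.example.org/collection', '/data/sub/dir/file.txt');
$b = flatId('https://id.example.org/collection/', '/data/file.txt');
```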
- PHP
Published by zozlak about 3 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Indexer enhancements
- acdhOeaw\arche\lib\ingest\Indexer allows combining multiple skip modes (SKIP_NOT_EXIST | SKIP_BINARY_EXIST can be useful)
- acdhOeaw\arche\lib\ingest\Indexer::import() supports a new Indexer::ERRMODE_CONTINUE error mode, which continues the ingestion regardless of errors.
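Combining modes with `|` implies they behave as bit flags. The sketch below shows that pattern with stand-in constants; the numeric values and the `hasSkipMode()` helper are assumptions made for illustration, and the real Indexer constants may be defined differently.

```php
<?php
// Stand-in constants with assumed values (one bit each).
const SKIP_NOT_EXIST = 1;
const SKIP_BINARY_EXIST = 2;

// Hypothetical helper: checks whether a given mode is set in the combined value.
function hasSkipMode(int $skip, int $mode): bool
{
    return ($skip & $mode) === $mode;
}

// Both modes at once, as in the release note.
$skip = SKIP_NOT_EXIST | SKIP_BINARY_EXIST;
$onlyNotExist = SKIP_NOT_EXIST;
```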
- PHP
Published by zozlak about 3 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Indexer: prolong transaction if listing files takes too long
- PHP
Published by zozlak about 3 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bump uri-normalizer to v2
- PHP
Published by zozlak over 3 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - SkosVocabulary class tuning
acdhOeaw\arche\lib\ingest\SkosVocabulary::assureTitles() - if everything fails, create a title from URI.
- PHP
Published by zozlak over 3 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - SkosVocabulary class added
A new class (acdhOeaw\arche\lib\ingest\SkosVocabulary) for SKOS vocabularies ingestion added.
It's a specialization of acdhOeaw\arche\lib\ingest\MetadataCollection class with:
- Additional configurable preprocessing steps added.
- Vocabulary binary ingestion.
- Removal of obsolete vocabulary resources from the repository.
- PHP
Published by zozlak over 3 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Allow arche-lib 5
- PHP
Published by zozlak over 3 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - MetadataCollection::import - handle NotFound exceptions just like Conflict ones
- PHP
Published by zozlak almost 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - MetadataCollection - make error reporting a little more verbose
- PHP
Published by zozlak almost 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - MetadataCollection - introduce two debug levels
acdhOeaw\arche\lib\ingest\MetadataCollection::$debug can now have the following values:
- false or 0 - no debug messages at all
- true or 1 - basic information on preprocessing stages and detailed information on ingestion progress
- 2 - detailed information on both preprocessing and ingestion progress
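One way to treat the bool and int values uniformly is to map them onto a single integer level, as sketched below. `debugLevel()` and `shouldLog()` are hypothetical helpers, and clamping out-of-range values to 0..2 is an assumption.

```php
<?php
// Hypothetical helper: maps the documented $debug values to a level 0..2.
function debugLevel(bool|int $debug): int
{
    // (int) false === 0 and (int) true === 1; other ints are clamped (assumption).
    return max(0, min(2, (int) $debug));
}

// Hypothetical helper: a message is emitted when its level does not exceed
// the configured one, e.g. detailed preprocessing messages would need level 2.
function shouldLog(bool|int $debug, int $messageLevel): bool
{
    return $messageLevel <= debugLevel($debug);
}
```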
- PHP
Published by zozlak almost 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
acdhOeaw\arche\lib\ingest\MetadataCollection ingestion progress meter fixed
- PHP
Published by zozlak almost 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Small fixes
- Ingestion chunk size is now not bigger than $concurrency * 100, giving the $errorMode = ERRMODE_FAIL chances to fail early for large ingestions.
- 409 Transaction xyz locked ARCHE REST API response being handled correctly (as an ordinary 409 Conflict error).
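The chunk size cap can be illustrated with `array_chunk()`. The sketch assumes every chunk uses exactly the cap of `$concurrency * 100` (the release note only says chunks are not bigger than that), and `chunkResources()` is a hypothetical helper.

```php
<?php
// Hypothetical helper: splits resources into chunks of at most $concurrency * 100,
// so that a failure in an early chunk can stop a large ingestion sooner.
function chunkResources(array $resources, int $concurrency): array
{
    $maxChunk = max(1, $concurrency * 100); // guard against a zero chunk size
    return array_chunk($resources, $maxChunk);
}

// 450 resources with a concurrency of 2 are split into chunks of at most 200.
$chunks = chunkResources(range(1, 450), 2);
$sizes = array_map('count', $chunks);
```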
- PHP
Published by zozlak about 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
Required PHP version constraint fixed in composer.json
- PHP
Published by zozlak about 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
acdhOeaw\arche\lib\ingest\Indexer() - harden against slash at the end of the directory path.
- PHP
Published by zozlak about 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
acdhOeaw\arche\lib\ingest\Indexer::createFile() - top level directory recognition fixed.
- PHP
Published by zozlak about 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Parallel ingestion conflicts handling added
- PHP
Published by zozlak about 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
acdhOeaw\arche\lib\ingest\MetadataCollection::import() - report ARCHE response error messages while importing in the ERRMODE_PASS mode.
- PHP
Published by zozlak about 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - 3.0.0
New features
- Parallel ingestion
Backward-incompatible changes
- TODO
Bugfixes
- TODO
- PHP
Published by zozlak about 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Allow arche-lib v4
- PHP
Published by zozlak over 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Allow arche-lib 3.0.0
- PHP
Published by zozlak over 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Made compatible both with guzzle/psr7 v1 and v2
- PHP
Published by zozlak over 4 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - 2.0.0
Adjusted to arche-lib 2.0.0
- PHP
Published by zozlak almost 5 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Minor improvements
MetadataIndexer now normalizes all triple object URIs because, at the end of the day, all of them cause repository resource creation and end up as identifiers.
- PHP
Published by zozlak about 5 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Adapt for arche-schema v2.0
Arche-schema v2.0 doesn't allow filename property on the acdh:Collection class which is typically used for directories, therefore arche-lib-ingest stops providing the property indicated in config by $.schema.fileName for directories.
- PHP
Published by zozlak about 5 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
Hardcoded property URIs removed from File::getMetadata().
- PHP
Published by zozlak over 5 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
No need for the hack from 1.6.1 as the solution has been introduced to the easyrdf library 1.14.3
- PHP
Published by zozlak over 5 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
acdhOeaw\acdhRepoIngest\MetadataCollection hardened against EasyRdf\Literal\Date and EasyRdf\Literal\DateTime issues.
- PHP
Published by zozlak over 5 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Use schema.isNewVersionOf for versioning
The \acdhOeaw\acdhRepoIngest\schema\SchemaObject::createNewVersion() now uses only schema.isNewVersionOf to denote the old<->new resource relationship.
The change follows dropping of inverse properties from the arche-schema v2.
- PHP
Published by zozlak over 5 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
Avoid truncating long repository REST API error messages while reporting import errors in MetadataCollection class.
- PHP
Published by zozlak over 5 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Minor fixes
Prolong the repository transaction during the acdhOeaw\acdhRepoIngest\MetadataCollection::filterResources()
- PHP
Published by zozlak over 5 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - MetadataCollection enhancements
- acdhOeaw\acdhRepoIngest\MetadataCollection::preprocess() extracted from acdhOeaw\acdhRepoIngest\MetadataCollection::index(), which solves an issue with preprocessing taking longer than the repository's transaction timeout. To assure backward compatibility, index() calls preprocess() when needed (if it wasn't called before).
- acdhOeaw\acdhRepoIngest\MetadataCollection::setAddTitle() added, allowing adjustment of the automatic title creation behaviour.
- Automatic title creation for resources missing it turned off by default.
- Obsolete code removed from the acdhOeaw\acdhRepoIngest\MetadataCollection class.
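The backward-compatible relation between preprocess() and index() described in this release follows a common "run once, on demand" pattern. The class below is a hypothetical stand-in written only to show that pattern; it does not reproduce the real MetadataCollection.

```php
<?php
// Hypothetical stand-in class, not the real MetadataCollection.
class CollectionSketch
{
    private bool $preprocessed = false;
    /** @var string[] records the order in which steps ran */
    public array $log = [];

    public function preprocess(): void
    {
        // Potentially slow work that can now run before a transaction is opened.
        $this->log[] = 'preprocess';
        $this->preprocessed = true;
    }

    public function index(): void
    {
        // Backward compatibility: run preprocessing if the caller skipped it.
        if (!$this->preprocessed) {
            $this->preprocess();
        }
        $this->log[] = 'index';
    }
}

// Old-style usage: index() alone still triggers preprocessing.
$old = new CollectionSketch();
$old->index();

// New-style usage: preprocess first, then index; preprocessing runs only once.
$new = new CollectionSketch();
$new->preprocess();
$new->index();
```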
- PHP
Published by zozlak over 5 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - URI normalization rules from package instead of a config
URI normalization rules for resource identifiers are no longer provided in the config file. They are now read from the required acdh-oeaw/uri-norm-rules composer package.
- PHP
Published by zozlak over 5 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Added support for AmbiguousMatch errors to MetadataCollection::index()
- PHP
Published by zozlak almost 6 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - MetadataIndexer error mode added
acdhOeaw\acdhRepoIngest\MetadataIndexer::index() now takes an optional third parameter allowing to set the error mode:
- MetadataIndexer::ERRMODE_FAIL: the default mode, in which the first HTTP 400 response generated on a repository resource creation/update breaks the import.
- MetadataIndexer::ERRMODE_PASS: in this mode HTTP 400 responses don't break the import but turn off the autocommit and cause an error to be thrown at the end of the import. This mode allows collecting metadata problems for all resources at once, speeding up the curation process.
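The difference between the two modes can be sketched with a simple import loop. The constants, the `BadRequest` exception and `importAll()` are stand-ins with assumed names and values; the autocommit handling is represented only by the error thrown at the end.

```php
<?php
// Stand-in constants and exception type; values are assumed for illustration.
const ERRMODE_FAIL = 'fail';
const ERRMODE_PASS = 'pass';
class BadRequest extends RuntimeException {}

/**
 * Hypothetical import loop. In ERRMODE_FAIL the first BadRequest is rethrown
 * immediately; in ERRMODE_PASS errors are collected and reported at the end.
 */
function importAll(array $resources, callable $ingest, string $errorMode): void
{
    $errors = [];
    foreach ($resources as $res) {
        try {
            $ingest($res);
        } catch (BadRequest $e) {
            if ($errorMode === ERRMODE_FAIL) {
                throw $e; // the first HTTP 400 breaks the import
            }
            $errors[$res] = $e->getMessage(); // remember it and keep going
        }
    }
    if (count($errors) > 0) {
        // Nothing would be committed; all problems are reported together.
        throw new RuntimeException('Failed resources: ' . implode(', ', array_keys($errors)));
    }
}

// Usage: resource "b" is rejected; $seen records which resources were attempted.
$seen = [];
$ingest = function (string $res) use (&$seen): void {
    $seen[] = $res;
    if ($res === 'b') {
        throw new BadRequest('invalid metadata');
    }
};
```

With ERRMODE_PASS all three resources are attempted before the final error, which is what lets a curator see every metadata problem in one run.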
- PHP
Published by zozlak almost 6 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Indexer::setParent() strictLocations parameter added
An additional bool $strictLocations parameter added to the acdhOeaw\acdhRepoIngest\Indexer::setParent() method allowing to choose how strictly contained paths described in the parent resource's metadata should be checked.
- PHP
Published by zozlak almost 6 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
- PHP
Published by zozlak almost 6 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Bugfixes
- PHP
Published by zozlak almost 6 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Indexer class API adjusted
The containerDir and the containerToUriPrefix configuration properties were removed.
As they are different for every ingestion, it made no sense to store them in a common configuration file.
They are now taken by the acdhOeaw\acdhRepoIngest\Indexer class constructor instead.
- PHP
Published by zozlak almost 6 years ago
https://github.com/acdh-oeaw/arche-lib-ingest - Initial release
- PHP
Published by zozlak almost 6 years ago