Recent Releases of colrev
colrev - Version 0.14.0
- Replace poetry by uv (#611)
- Write internal BibTeX parser to replace pybtex (#605)
- Replace pkg_resources with importlib (#605)
- Replace zope interfaces with abstract base classes (#610)
- Add Prospero search source (#586)
- Add PLOS search source (#594)
- Extract colrev-sync to a separate (PyPI) package
- Implement `colrev convert`
- Python
Published by geritwagner 12 months ago
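The switch from zope interfaces to abstract base classes in 0.14.0 can be illustrated with a minimal sketch. The class names `SearchSourceBase`, `InMemorySource`, and `IncompleteSource` are hypothetical and do not reflect colrev's actual base classes; the sketch only shows how Python's `abc` module enforces that subclasses implement the required methods.

```python
from abc import ABC, abstractmethod


class SearchSourceBase(ABC):
    """Hypothetical base class (an assumption, not colrev's real interface)."""

    @abstractmethod
    def search(self, query: str) -> list[dict]:
        """Return the records matching the query."""


class InMemorySource(SearchSourceBase):
    """Toy implementation that filters a fixed list of records by title."""

    def __init__(self, records: list[dict]) -> None:
        self.records = records

    def search(self, query: str) -> list[dict]:
        return [r for r in self.records if query.lower() in r["title"].lower()]


class IncompleteSource(SearchSourceBase):
    """Does not implement search(), so it cannot be instantiated."""


source = InMemorySource([{"title": "Systematic reviews"}, {"title": "Git basics"}])
hits = source.search("review")

# Instantiating a subclass that lacks an abstract method raises TypeError.
try:
    IncompleteSource()
    missing_method_detected = False
except TypeError:
    missing_method_detected = True
```

Unlike a declared interface that is verified separately, the abstract base class rejects incomplete implementations at instantiation time, without an extra dependency.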
colrev - Version 0.13.2
- Minor release to install with updated pre-commit
- Python
Published by geritwagner about 1 year ago
colrev - Version 0.13.1
- Minor release to install with updated dependencies
- Python
Published by geritwagner about 1 year ago
colrev - Version 0.13.0
- Restructure package management, moving dependencies to built-in packages (#442)
- Relay prep requirements (#529)
- Add GitHub SearchSource (#468), Unpaywall SearchSource (#469), SpringerLink SearchSource (#466), OSF SearchSource (#471)
- Refactor other SearchSources
- Replace dacite by pydantic
- Stop Docker containers
- CLI: option to add packages interactively
- Testing and bugfixes in built-in packages (paper-md, files_dir, aisel)
- Update docs (add asciinema demonstration)
- Python
Published by geritwagner over 1 year ago
colrev - Version 0.12.3
- Extend documentation (package development, package summaries, asciinema demo)
- Bugfixes and codebase improvements (e.g., package management and discovery, closing sqlite connections)
- Reduce dependencies (e.g., levenshtein, PyPDF2, pdfminer, daff, psutil)
- Refactor colrev.bibliography_export (add writers)
- Extend tests: cover MacOS and Python 3.12
- Remove unnecessary options (e.g., init --localpdfcollection)
- Add and test support for GitHub codespaces
- Python
Published by geritwagner over 1 year ago
colrev - Version 0.12.2
- Update CoLRev packages (including interfaces, development docs etc.)
- Refactoring (local-index)
- Implement json-loader
- Make ui_web (dash, blinker) optional to prevent errors in WSL
- Bugfixes
- Python
Published by geritwagner almost 2 years ago
colrev - Version 0.12.1
- Refactor and test (dataset, records, provenance, local_index)
- Extract package_manager into a separate internal package
- Use bib-dedupe for matching (instead of simple similarities)
- Update docs
- Python
Published by geritwagner almost 2 years ago
colrev - Version 0.12.0
Added
- Add linter `colrev_records_variable_naming_convention`
- Test coverage increased from 71% to 80%
Changed
- Split `records`, `dataset`, created `records` package.
- Extracted `process` as a separate package.
- Implemented loaders as a separate package, created a standard interface. SearchSources now create the specific mapping of IDs, entrytypes and fields.
- Moved field standardization from `load` to SearchSources.
- Extended use of constants
- SearchSourceInterface: renamed `run_search` to `search`, prefer `prep_link_md` over `get_masterdata`
- Renamed and refactored `GeneralOriginFeed` to `SearchAPIFeed`
- Pass record objects instead of dicts (in `local_index` in particular)
- Replaced unnecessary keyword arguments by positional arguments
- Moved `zotero_translation_service` to `bibliography_export` package
- Consolidated code for reference parsing in `tei_parser`
- Upgraded Grobid to 0.8.0
Removed
- Removed dead code
- Dropped `INCONSISTENT_WITH_DOI_METADATA` `transitions` dependency
Fixed
- Do not require review_manager for `colrev env -i`
- Fixed `status_stats`, including special cases.
- Repository registration: resolve() and absolute() path
- Python
Published by geritwagner almost 2 years ago
colrev - Version 0.11.0
Added
- Separate PDF quality model (#268)
- `download_from_website` pdf-get package
- Separate loader utilities for nbib, ris, bib
- SearchSources: SemanticScholar (#288), Arxiv (#203)
- Constants module for Fields, ENTRYTYPES, etc.
- CEP003 for SearchSources
- New default dedupe package based on bib-dedupe
- Colrev pandas for Jupyter notebooks
- GitHub actions: pip-install test, make documentation
Changed
- Integrated `colrev.resolve_crossrefs` into `load_utils_bib.py`
- Defect codes can be ignored based on the `IGNORE:` prefix (#269)
- Documentation for setup (VM, MacOS, WSL)
- Revised interfaces for SearchSources
- Integrated: pdfdir + videodir > files_dir
- poetry extras
- Backward search: export of parameters and expected sample sizes
- Replace thefuzz with rapidfuzz
Removed
- Package based on dedupe-io, including incompatible dependencies
- Crossref resolution package (integrated in bib-loader)
- Python
Published by geritwagner about 2 years ago
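The `IGNORE:` prefix for defect codes mentioned in 0.11.0 can be sketched as a simple filter. This is an illustration under assumptions: the function name `active_defects`, the representation of defect codes as plain strings, and the example codes are hypothetical and not taken from colrev's code.

```python
IGNORE_PREFIX = "IGNORE:"


def active_defects(codes: list[str]) -> list[str]:
    """Return the defect codes that are not marked as ignored.

    Assumption: a code is ignored when it starts with the IGNORE: prefix.
    """
    return [code for code in codes if not code.startswith(IGNORE_PREFIX)]


# Illustrative defect codes (hypothetical examples).
codes = ["missing", "IGNORE:mostly-all-caps", "inconsistent-with-doi-metadata"]
remaining = active_defects(codes)
```

Keeping the ignored code in the list, rather than deleting it, preserves a record that the defect was seen and deliberately dismissed.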
colrev - Version 0.10.4
Fixed
- Removed unstable test case
- Python
Published by geritwagner over 2 years ago
colrev - Version 0.10.3
Changed
- GitHub actions for CoLRev updates now install with Poetry because the fixed dependencies are more stable compared to pip installation
- Python
Published by geritwagner over 2 years ago
colrev - Version 0.10.2
Fixed
- paper_md: export BibTeX file and replace keys containing `.` to prevent pandoc error
- Python
Published by geritwagner over 2 years ago
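The key replacement described for 0.10.2 might look roughly like the following sketch. The function names and the choice of `_` as the replacement character are assumptions for illustration; the release notes do not state which character the paper_md package uses.

```python
def pandoc_safe_key(key: str, replacement: str = "_") -> str:
    """Replace dots in a citation key (replacement character is an assumption)."""
    return key.replace(".", replacement)


def rename_keys(records: dict[str, dict]) -> dict[str, dict]:
    """Return a copy of the records with dot-free citation keys.

    Raises ValueError if two keys would collide after the replacement.
    """
    renamed: dict[str, dict] = {}
    for key, fields in records.items():
        new_key = pandoc_safe_key(key)
        if new_key in renamed:
            raise ValueError(f"key collision after replacement: {new_key}")
        renamed[new_key] = fields
    return renamed


# Hypothetical records keyed by citation key.
records = {"Smith2020.a": {"title": "A"}, "Lee2019": {"title": "B"}}
safe_records = rename_keys(records)
```

The collision check matters because two distinct keys such as `Smith2020.a` and `Smith2020_a` would otherwise silently merge into one entry.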
colrev - Version 0.10.1
Changed
- SearchTypes: API, TOC, MD are added, PDFS is replaced by FILES.
- SearchTypes are explained in the docs.
- Package documentation is imported to docs.
- colrev.pdfsdir and colrev.videodir are integrated into colrev.files_dir.
- Python
Published by geritwagner over 2 years ago
colrev - Version 0.10.0
Added
- SearchSources: SYNERGY datasets, OpenAlex, ERIC, IEEEXplore, ArXiv
- JournalRankings: index, prep, and prescreen
- CoLRev shell via cli-repl (`colrev shell`)
- prep operation: pause and resume
- Dashboard overview of the sample and project status
- Extended tests, updated documentation (especially for extension development)
- GitHub workflows to update dependencies (poetry update)
- Ruff linter
Changed
- Load: ris/csv/... files are loaded directly (without creating intermediate BibTeX file)
- Introduced namespaced fields (e.g., `colrev.pubmed.pubmedid` instead of `pubmedid`)
- Extracted quality checks to separate Quality Model
- Docs: instructions for development setup
- Code quality improvements (codacy)
Removed
- colrev-asreview: extracted to separate package
- watchdog-based service
- Python
Published by geritwagner over 2 years ago
colrev - Version 0.9.3
Changed
- Introduced namespaced fields (e.g., `colrev.pubmed.pubmedid` instead of `pubmedid`).
- Python
Published by geritwagner over 2 years ago
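The namespaced fields introduced in 0.9.3 can be illustrated with a small renaming sketch. The helper `namespace_fields` and the sample record are hypothetical; only the example field name `colrev.pubmed.pubmedid` comes from the release notes.

```python
def namespace_fields(record: dict, namespace: str, source_fields: set[str]) -> dict:
    """Prefix source-specific fields with a namespace, leaving other fields as-is.

    Hypothetical helper for illustration; not part of colrev's API.
    """
    return {
        (f"{namespace}.{key}" if key in source_fields else key): value
        for key, value in record.items()
    }


# Hypothetical record with one generic field and one source-specific field.
record = {"title": "Example", "pubmedid": "12345"}
namespaced = namespace_fields(record, "colrev.pubmed", {"pubmedid"})
```

Prefixing source-specific fields avoids clashes when several sources use the same short field name with different meanings.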
colrev - Version 0.9.2
Changed
- Updated colrev-asreview dependency (PyPI instead of GitHub)
- Python
Published by geritwagner over 2 years ago
colrev - Version 0.9.1
Changed
- Integrated `load` into `SearchSource`. Removed `load_conversion` endpoint: `settings.json`, `packages`, `interface` etc.
- Python
Published by geritwagner over 2 years ago
colrev - Version 0.9.0
Added
- The `quality_model` was created to check for quality defects
- The `auto_upgrade` flag allows users to enable/disable automated upgrades
- All-contributors bot to acknowledge contributions to CoLRev
- Implemented OpenLibrary as a SearchSource
- Pylint check for direct assignment of colrev_status
- Test battery for built-in SearchSources (heuristics, load, prep)
- Backward-search comparison with OpenCitations data
Changed
- Refactored `language_service`
- Refactored the tests (`conftest.py` now provides the `base_repo_review_manager` fixture)
- Changed pdf-hash (pdf to image) from poppler to mupdf for cross-platform compatibility (`cpid1` -> `cpid2`)
- Local settings changed from yaml to json
- Quality defects (colrev_masterdata_provenance notes) change
- The `colrev.global_ids_consistency_check` prep-endpoint is removed (integrated into the quality model)
- Individual quality checks can be disabled through the `prep/defects_to_ignore` settings
- Update the GitHub action workflows in CoLRev repositories
Removed
- `timeout-decorator` dependency (for better compatibility with MacOS)
- Docker image `pdf-hash-service` (replaced by mupdf)
- Redundant fields for the backward search are removed (`cited_by_file` and `cited_by_id`)
Fixed
- Documentation: typos and inconsistencies
- Codacy issues and refactored complex files
- Windows paths in `iter_commit` (git history)
- Python
Published by geritwagner over 2 years ago
colrev - Version 0.8.4
Changed
- Implemented new quality model
- Quality defects (colrev_masterdata_provenance notes) change
- The `colrev.global_ids_consistency_check` prep-endpoint is removed (integrated into the quality model)
- Individual quality checks can be disabled through the `prep/defects_to_ignore` settings
- Redundant fields for the backward search are removed (`cited_by_file` and `cited_by_id`)
- Python
Published by geritwagner almost 3 years ago
colrev - Version 0.8.3
Changed
- CoLRev pdf IDs are now based on the mupdf library
- Python
Published by geritwagner almost 3 years ago
colrev - Version 0.8.2
Fixed
- Fix InvalidGitRepositoryError (raised upon status in empty directories)
- Python
Published by geritwagner almost 3 years ago
colrev - Version 0.8.1
Changed
- Update the GitHub action workflows in CoLRev repositories
- Add auto-upgrade flag to settings
- Python
Published by geritwagner almost 3 years ago
colrev - Version 0.8.0
Added
- Unit tests: increased test coverage to 70%, added GitHub actions matrix tests across OS and Python versions
- Completed OpenSSF Best Practices checks
- Added forward and backward searches based on OpenCitations
- Moved documentation to readthedocs and revised documentation
- Added dependabot and pre-commit.ci: automated code and security checks
- Added support for GitHub actions, distinguishing packages that are supported in ci-environments (`ci_supported` flag)
- Added Pubmed API searches and metadata preparation support
- Option to initialize and run CoLRev repositories without requiring Docker
- Overview video presented at ESMARConf2023
- CITATION.cff and Zenodo
- API-searches for the AIS eLibrary
Changed
- Numerous modifications based on the user tests
- Replaced OpenSearch with sqlite
- SearchSource interface: `run_search` and `add_package` are now mandatory
- Documentation review, including detailed information on development status
- Consistent setup of GitHub actions (test, publish to PyPI)
- Built-in packages renamed from `colrev_built_in` to `colrev`
- Data package `manuscript` renamed to `paper_md`
- Simplified upgrade operation and activated upgrades per default
- Extracted and refactored language-service
Fixed
- Several bugfixes
- Python
Published by geritwagner almost 3 years ago
colrev - Version 0.7.1
Added
- GitHub action: publish to PyPI
- Python
Published by geritwagner almost 3 years ago
colrev - Version 0.7.0
Added
- Add retrieve and pdfs as high-level operations
- Metadata preparation can add records to separate origin feeds
- Initial package manager functionality (registering packages and displaying them in the docs)
- Search: update of records and propagation of changes
- Several SearchSources (including SearchSource query validation)
- Revisions of CLI (verbose mode, user feedback)
- Colrev merge (reconciliation coding when merging git branches)
- dedupe --merge/--unmerge
- Integrated colrev pre-commit hooks
- PRISMA diagram (data endpoint)
- Obsidian (data endpoint)
- Preparation: not-in-toc exception/warning
- Setup of pytests
Changed
- Curated records are now explicitly identified through curation_IDs
- Revise colrev validate (commits, users, properties)
- Detailed advisor (using get_advice() for data endpoints)
- Performance improvements and simplification of status (cli)
- Moved correction functionality to SearchSources (refactored correction path)
- Preparation: simplified preparation rounds (default settings)
- Retrieve TEIs through local_index (if available) instead of recreating it
- Replace pathos by Threadpool
- Revise the documentation
- Revise and extend exceptions
Removed
- Remove persistent colrev-ids
- Remove realtime review
- Dependencies ansiwrap and p-tqdm
Fixed
- **kwargs calls in ReviewManager
- Indexing of non-curated records
- Address special cases in dedupe (active learning)
- Python
Published by geritwagner about 3 years ago
colrev - Version 0.6.0
Added
- Web-based editor for project settings
- Comprehensive architecture refactoring
- Conformance with pylint, mypy, flake8
- Introduced packages
- Updated file and directory structure
- Documentation of modules, classes, and methods
- Github-pages as a data package_endpoint
Changed
- Renamed from colrev_core to colrev (integrated cli)
- Switch to poetry for dependency management
- Renamed scripts to package_endpoints
- PDF-hash generation based on Docker to avoid platform dependency issues
- Switch to Jinja templates (instead of concatenating multiple strings)
Fixed
- Concurrent request session handling
- StatusStats calculations
- Python
Published by geritwagner over 3 years ago
colrev - Version 0.5.0
Added
- Push/pull (including corrections), sync, validate, service operations
- Data provenance model (colrev_data_provenance, colrev_masterdata_provenance)
- Extensible endpoints (search, prep, prescreen, pdf-get, pdf-prep, screen, data)
- Prescreen scope
Changed
- Improvements: prep, dedupe operations
- Performance improvements (e.g., status, bibtexparser > pybtex)
- Extended Record class (e.g., merge and fuse_best_fields)
- LocalIndex: Elasticsearch to Opensearch
- Dedupe: testing and parameter optimization (option to prevent same-source merges)
- Settings.json and validation
- Updated documentation
- Testing and refactoring (e.g., for Windows, prefer keyword arguments in functions, python package type information)
- Python
Published by geritwagner over 3 years ago
colrev - Version 0.4.0
Added
- Extract functionality: ReviewDataset, Process
- Developed LocalIndex, EnvironmentManager, OpenSearch
- Curation model, including Resource installation and a "correction path"
- Search operation (reintegrating paper_feed and local_paper_index)
- Prep exclusion based on languages
Changed
- Object-oriented refactoring of the whole codebase
- Use Zotero translators (instead of bibutils) for imports
- Duplicate identification (add FP safeguards based on LocalIndex, add a procedure for small samples)
- Consistent PDF path handling
- Structured data extraction based on csv
Fixed
- Loggers
- Performance issues in prep and status
- Python
Published by geritwagner almost 4 years ago
colrev - Version 0.3.0
Added
- Introduced ReviewManager and integrated hooks/checks
- Fetch metadata from Open Library
- Required fields for misc
- Information on needs_manual_preparation (man_prep_hints)
- Activated mypy hooks
- Introduced custom load scripts
- Documentation
- LocalIndex: hash-table implementation for indexing and retrieval
Changed
- Dedupe: based on active learning (dedupe-io)
- Improved batches
- Pass records instead of BibDatabase
- PDF prep and longer pdf hashes
Removed
- CLI: now in separate colrev repository
Fixed
- Initializing repositories
- Backward search adds two entries to search_details
- Logging (reinitialize after batches/commits)
- Python
Published by geritwagner about 4 years ago
colrev - Version 0.2.0
Added
- Status model (rev_status, md_status, pdf_status)
- Implemented cli interface
- Import formats (bib, ris, endn, pdf, text list of references)
- Docker services for import, ocr, building the paper etc.
- Metadata repositories for record preparation (crossref, dblp, semantic scholar)
- PDF preparation (OCR, metadata validation)
- Commit message reporting
- Check and validation of iteration completeness
- Support for building papers based on pandoc
Changed
- Integrated review process status (including prescreen, screen inclusion vs exclusion) in the references.bib
- Renamed scripts and cli entrypoints
- Refactored code
- Tracing from hash_id to origin links
- Extended and refactored pre-commit hooks
Removed
- R scripts for sample statistics (the goal is to implement them in Python)
- hash_id function, trace_entry, trace_hash_id
Fixed
- Bugs in `analysis/combine_individual_search_results.py` and in `analysis/acquire_pdfs.py`
- Catch exceptions and check bad responses in `analysis/acquire_pdfs.py`
- Bug in git modification check for `references.bib` in `analysis/utils.py`
- Exception in `analysis/screen_2.py` (IndexError)
- Global constant conflict with `analysis/entry_hash_function.py` (nameparser.config/CONSTANTS)
- Python
Published by geritwagner over 4 years ago
colrev - Version 0.1.0
Added
- First version of the pipeline, including `status`, `reformat_bibliography`, `trace_entry`, `trace_hash_id`, `combine_individual_search_results`, `cleanse_records`, `screen_sheet`, `screen_1`, `acquire_pdfs`, `screen_2`, `data_sheet` and `data_pages`
- Environment setup including `Dockerfile` and `Makefiles`
- Python
Published by geritwagner almost 5 years ago