Recent Releases of extralit
extralit - v0.6.1: Integrated PDF processing workflow with extralit-hf-space and incremental Dataset building with imports
This release delivers major upgrades to document processing and import workflows, and exposes additional dataset-building functionality in the UI. Highlights include OCRmyPDF-powered PDF processing via Redis jobs, a workspace selector in the breadcrumb, and incremental import with dataset mapping.
What's Changed
- [FEAT] integrate OCRmyPDF and on document upload in Redis Queue jobs by @priyankeshh and @JonnyTran in https://github.com/Extralit/extralit/pull/115
- [FIX] Import Files Flow by @JonnyTran in https://github.com/Extralit/extralit/pull/120
- [FEAT] Workspace Pinia Store and Dataset Breadcrumb Selector in AppHeader by @JonnyTran in https://github.com/Extralit/extralit/pull/121
- [FIX] Import File Parsing and Matching Flow and Refactoring by @JonnyTran in https://github.com/Extralit/extralit/pull/122
- [FIX] DocumentAPI to query by params and return multiple documents & fix PDF file fetching by @JonnyTran in https://github.com/Extralit/extralit/pull/123
- [FEAT] minio presigned url for pdf by @JonnyTran in https://github.com/Extralit/extralit/pull/124
- [FIX] Import Analysis and Batch Refactoring, File Matching algorithm, Document Panel by @JonnyTran in https://github.com/Extralit/extralit/pull/130
- [FIX] Consolidating linting configuration by @JonnyTran in https://github.com/Extralit/extralit/pull/133
- [FEAT] Document workflows with rq jobs by @JonnyTran in https://github.com/Extralit/extralit/pull/136
- [FEAT] Import dataset mapping by @JonnyTran in https://github.com/Extralit/extralit/pull/140
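The import fixes above mention a file matching algorithm that pairs uploaded files with reference entries. These notes do not describe how it works, so the following is only a minimal sketch of one plausible approach: fuzzy matching on normalized names with Python's standard `difflib`. The `match_files` helper, the `cutoff` value, and the sample names are hypothetical and are not taken from the Extralit codebase.

```python
import difflib
import re
from pathlib import Path


def normalize(name: str) -> str:
    """Lowercase a name and collapse non-alphanumeric runs into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def match_files(filenames, reference_keys, cutoff=0.6):
    """Map each filename to its closest reference key, or None if nothing is close.

    Hypothetical helper for illustration; not Extralit's actual matching logic.
    """
    normalized_keys = {normalize(key): key for key in reference_keys}
    matches = {}
    for filename in filenames:
        stem = normalize(Path(filename).stem)
        # get_close_matches returns at most n candidates scoring >= cutoff.
        candidates = difflib.get_close_matches(
            stem, list(normalized_keys), n=1, cutoff=cutoff
        )
        matches[filename] = normalized_keys[candidates[0]] if candidates else None
    return matches


result = match_files(
    ["Smith_2020.pdf", "unrelated-notes.pdf"],
    ["smith2020", "jones2019"],
)
```

Files that fall below the cutoff map to `None`, which leaves room for a manual matching step in the UI rather than forcing a wrong pairing.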
Contributors
Many thanks to @priyankeshh for work on the https://github.com/Extralit/extralit-hf-space repo for PyMuPDF integration. Welcome @Mr-Youssef-Sherif!
Full Changelog: https://github.com/Extralit/extralit/compare/v0.6.0...v0.6.1
- Python
Published by JonnyTran 6 months ago
extralit - v0.6.0: PDF Importer feature with BibTeX support and namespace refactoring
What's Changed
- [FEATURE] Papers Library BibTeX Importer by @JonnyTran in https://github.com/Extralit/extralit/pull/107
- Update Frontend Dependencies: Migrate deprecated Babel plugins and refresh Vue 2 tooling by @Copilot in https://github.com/Extralit/extralit/pull/113
- feat(import-history): sidebar integration by @JonnyTran in https://github.com/Extralit/extralit/pull/116
- [FEAT] Rename question in Dataset Configuration by @JonnyTran in https://github.com/Extralit/extralit/pull/117
- Complete Python namespace refactor: argilla → extralit with directory restructure by @Copilot in https://github.com/Extralit/extralit/pull/118
- [RELEASE] v0.6.0 by @JonnyTran in https://github.com/Extralit/extralit/pull/119
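For readers unfamiliar with the input format behind the BibTeX importer, here is a minimal sketch of reading simple BibTeX entries into dictionaries using only the standard library. It is an illustration under stated assumptions, not the importer's implementation: `parse_bibtex` is a hypothetical name, and the regular expressions only handle flat `field = {value}` or `field = "value"` pairs without nested braces.

```python
import re

# Matches "@type{key, ... \n}" where the closing brace starts its own line.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
# Matches field = {value} or field = "value" (no nested braces supported).
FIELD_RE = re.compile(r'(\w+)\s*=\s*(?:\{([^{}]*)\}|"([^"]*)")')


def parse_bibtex(text):
    """Return a list of {"type", "key", "fields"} dicts for simple entries."""
    entries = []
    for entry_type, key, body in ENTRY_RE.findall(text):
        fields = {}
        for name, braced, quoted in FIELD_RE.findall(body):
            fields[name.lower()] = (braced or quoted).strip()
        entries.append({"type": entry_type.lower(), "key": key, "fields": fields})
    return entries


SAMPLE = """
@article{smith2020,
  title = {A Study of Things},
  author = "Smith, Jane",
  year = {2020}
}
"""

papers = parse_bibtex(SAMPLE)
```

Real-world BibTeX files often contain nested braces, string macros, and comments, so a production importer would more likely rely on a dedicated parser than on regular expressions.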
Full Changelog: https://github.com/Extralit/extralit/compare/v0.5.0...v0.6.0
Published by JonnyTran 7 months ago
extralit - v0.5.0: Latest Argilla Upgrade and Repo Restructuring
This release synchronizes with the latest changes from the upstream Argilla project, improves CI pipelines, continues the restructuring from Argilla to Extralit, and introduces support for legacy migrations.
What's New
Upstream Argilla Sync (v2.6.0-v2.8.0): We've merged the latest changes from Argilla, bringing in new features and bug fixes. Key highlights include:
- Similarity Search with Scores: The API now returns similarity scores when performing similarity searches, providing more context for your results.
- Predefined IDs for Users & Workspaces: You can now create users and workspaces with predefined IDs, simplifying integration and migration workflows.
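To illustrate what a similarity score adds to a search result, here is a small self-contained sketch that ranks stored vectors by cosine similarity and returns each match together with its score. It does not call the Argilla or Extralit API, and the choice of cosine similarity is an assumption for the example; `search_with_scores` and the sample vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search_with_scores(query, vectors, top_k=2):
    """Return the top_k (record_id, score) pairs, best match first."""
    scored = [
        (record_id, cosine_similarity(query, vector))
        for record_id, vector in vectors.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


vectors = {
    "rec-1": [1.0, 0.0],
    "rec-2": [0.0, 1.0],
    "rec-3": [1.0, 1.0],
}
hits = search_with_scores([1.0, 0.0], vectors)
```

Returning the score alongside each record lets a caller apply its own threshold, for example discarding matches that are only weakly related to the query.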
Legacy Migration Support:
- To support users migrating from older versions, the `extralit_v1` package has been added under `argilla-v1/src/extralit_v1`.
Project Refinements:
- We have completed the project-wide refactoring from `argilla` to `extralit`, ensuring consistency across module paths and configurations like the cache directory (`~/.extralit/`).
- The `upload_file` function and document listing process have been streamlined for a better user experience.
Upgrade Notes
- To upgrade to the latest version, run: `pip install --upgrade extralit`
Contributors
A big thank you to our community for the continuous support, contributions, and feedback that made this release possible!
Full Changelog
extralit/CHANGELOG.md v0.4.1...v0.5.0
Published by JonnyTran 8 months ago
extralit - v0.4.0: Argilla v2 API and CLI Rebuild
New Features
- Major CLI Overhaul PR #57: The `extralit` CLI has been rebuilt and now includes comprehensive commands for workspace, file, document, and schema management. This makes it easier than ever to interact with Extralit from the command line, automate workflows, and integrate with other tools.
- Workspace API Improvements: The Workspace API now supports more robust operations, improved error handling, and better logging for easier debugging and development.
- Enhanced CLI error messages and user feedback.
- Improved file and schema management commands.
- Refactored codebase for better maintainability and developer experience.
- Updated developer documentation and issue templates.
Upgrade Notes
- Python 3.9+ required.
- Upgrade with: `pip install --upgrade extralit`
- To get started with the CLI: `extralit --help`
Contributors
Special thanks to the contributors of PR #57, who made this major milestone possible:
- @priyankeshh
- @Ashutoshx7
And thanks to everyone else (@ArthrowAbstract, @SanjayUG, @Nakshatra05) who contributed to this release through code, reviews, and feedback!
Full Changelog
argilla/CHANGELOG.md v0.3.0...v0.4.0
Published by JonnyTran 9 months ago
extralit - v0.3.0: TableField and TableQuestion types
This release focuses on introducing table support for fields and questions in feedback datasets, along with infrastructure improvements.
New Features
- Schema & Fields:
- Added support for `TableField` and `es_field_for_record_field` for table fields
- Added `TableQuestion` and `TableQuestionSetting` to support table questions
Infrastructure Improvements
- DevOps:
- Added redis service to the Tilt k8s deployment for argilla-server
- Improved argilla-server and extralit-server dockerfile multi-stage build
- Changed environment variables in Tilt k8s deployment at `argilla-server-deployment.yaml`
Bug Fixes
- Fixed elasticsearch reindexing errors with dynamic schema
- Fixed certain extralit-specific changes when loading Dataset
Full Changelog
https://github.com/extralit/extralit/compare/v0.2.2...v0.3.0
Published by JonnyTran about 1 year ago
extralit - v0.2.1: devcontainers and unit test for files and documents
This release focuses on enhancing the continuous integration, testing, and DevOps setup, ensuring a more robust and efficient development workflow.
New Features
Development Environment:
- Added singleton schema support in SchemaStructure.
- Added docs site for the Extralit project at `argilla/docs/`.
- Added pytest-xdist for parallel testing.
- Pytest and Python environment setup in the "PostgreSQL & Elasticsearch for Docker-Compose" GitHub Codespaces devcontainer.
- Added .devcontainer for "Docker, Tilt, and K8s" local development on GitHub Codespaces.
Testing:
- Added tests for:
- Response: update duration.
- Files: get, put, list, delete.
- Models: get, post, put, delete.
- Records: include response_suggestions.
Changes
Dependencies:
- Updated Elasticsearch to 8.15.0.
Database:
- Reverted the Suggestion table's unique constraint to only "record_id" and "question_id", fixing the test suites.
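A composite unique constraint of this kind allows at most one suggestion per record and question pair. The sketch below demonstrates the behaviour with an in-memory SQLite database from the standard library rather than the project's actual PostgreSQL schema; the table layout, the `record_id` and `question_id` column names, and the `add_suggestion` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE suggestions (
        id INTEGER PRIMARY KEY,
        record_id TEXT NOT NULL,
        question_id TEXT NOT NULL,
        value TEXT,
        UNIQUE (record_id, question_id)
    )
    """
)


def add_suggestion(record_id, question_id, value):
    """Insert a suggestion; return False if the pair already has one."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO suggestions (record_id, question_id, value) "
                "VALUES (?, ?, ?)",
                (record_id, question_id, value),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = add_suggestion("r1", "q1", "yes")
duplicate = add_suggestion("r1", "q1", "no")
other_question = add_suggestion("r1", "q2", "no")
row_count = conn.execute("SELECT COUNT(*) FROM suggestions").fetchone()[0]
```

The second insert is rejected because it repeats the same record and question, while a different question on the same record is accepted.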
API:
- Disabled adding `LIST_DATASET_RECORDS_DEFAULT_SORT_BY` when there's no sort-by on GET records.
- Changed the `/api/v1/documents` POST endpoint to use `UploadFile`.
DevOps:
- Changed K8s Elasticsearch deployment from Helm to `docker.elastic.co/elasticsearch/elasticsearch` to fix PVC restarting issues.
- Refactored Extralit Dockerfile and Docker Hub images to `extralit/argilla-server` and `extralit/argilla-quickstart`.
- Changed `develop` branch changes in argilla/docs to `https://docs.extralit.ai/latest` instead of `dev`.
- Changed `examples/deployments/k8s/extralit-configs.yaml` for configuring the Extralit service and secrets in a K8s cluster.
Bug Fixes
- Fixed Tiltfile and k8s manifests for mono-repo setup.
- Fixed creating a new Weaviate collection with Weaviate client v4.
- Fixed an error with checking Weaviate collection existence when one doesn't exist.
- Fixed an issue with reindexing Elasticsearch by handling exceptions on failed datasets.
- Added Workspace relationship Document to enable cascade delete.
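The reindexing fix above follows a common pattern: process each dataset independently so that one failure does not abort the whole run. The following is a minimal sketch of that pattern, not the project's code; `reindex_all`, `fake_reindex`, and the dataset names are hypothetical, and the real reindexing call against Elasticsearch is replaced by a stand-in function.

```python
import logging

logger = logging.getLogger("reindex")


def reindex_all(datasets, reindex_one):
    """Reindex every dataset, recording failures instead of stopping early."""
    succeeded, failed = [], []
    for dataset in datasets:
        try:
            reindex_one(dataset)
            succeeded.append(dataset)
        except Exception as exc:  # keep going; report the failure afterwards
            logger.warning("Reindexing failed for %s: %s", dataset, exc)
            failed.append(dataset)
    return succeeded, failed


def fake_reindex(dataset):
    """Stand-in for a real reindex call; fails for one dataset on purpose."""
    if dataset == "broken":
        raise ValueError("mapping conflict")


ok, errors = reindex_all(["a", "broken", "b"], fake_reindex)
```

Collecting the failed datasets in a list makes it possible to retry or inspect them afterwards without rerunning the datasets that already succeeded.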
Security
- Allow admin role for workspace creation.
Full Changelog: https://github.com/extralit/extralit/compare/v0.2.0...v0.2.1
Published by JonnyTran over 1 year ago
extralit - v0.2.0: Extralit CLI workspace management and Github Actions CI workflows
This release, following Argilla v1.29.1, brings significant improvements to the Extralit CLI and workspace management, along with various bug fixes and enhancements for a smoother user experience.
New Features
Workspace Management:
- Added workspace schema and file management to the Extralit CLI.
- Refined workspace schema and file management in the Extralit CLI.
- Updated `rg.Workspace` with `update_schemas` and `get_schemas` methods.
- Enabled `_ID` reference IDs in schemas.
- Added `inserted_at` and `updated_at` fields to `Suggestion`.
CLI Enhancements:
- Introduced the Extralit CLI for improved command-line interactions.
User Interface:
- Updated status filter options in `StatusFilter.vue` and `RecordRepository.ts`.
- Added tooltip in `LabelSelection`.
Translation and Localization:
- Updated translation for "Use Table" option.
- Added `use_table` option to `QuestionSetting`.
Bug Fixes
- Fixed import statements in `SchemaStructure` and `Workspace`.
- Ensured `.mjs` files are properly transpiled with `babel-loader`.
- Fixed validation errors in `FeedbackRecord` suggestions to server payload.
- Fixed `RecordRepository.ts` to remove fetching "All data".
Continuous Integration and Deployment
- Updated GitHub Actions and updated Docker Hub image name deployments.
- Added GitHub Codespaces in `.devcontainer`.
- Updated package names and build configurations for Extralit.
- Set up mono repo to merge `extralit-server`.
Documentation
- Updated README.md with new information.
Miscellaneous
- Updated pip dependencies for Python tests.
- Updated community links.
Full Changelog: https://github.com/extralit/extralit/compare/v1.27.0a...v0.2.0
Published by JonnyTran over 1 year ago
extralit - v0.1.0: Enhancements in RenderTable and UI Improvements
This release, following Argilla v1.27.0, brings a series of new features, improvements, and bug fixes, primarily focusing on the RenderTable component and UI enhancements.
New Features:
- Added filter values in RenderTable.
- Added reference field to Document class for useLLMExtractionViewModel.
- Added fetchDocumentSegments.
- ReactiveData on add rows.
- Added duplicate multiple rows selected in range.
- Update column context menu label to freeze/unfreeze column.
- Generate empty rows within a group.
- Added button to fetch latest schema in RenderTable component.
- RenderTable now automatically fetches validation from the server.
- Added history entry for updateTableData.
- Extraction completion working.
- Added optional parameter to retrieve documents in ArgillaMixin.
- Update Document to use URL instead of file_data.
- Delete document in RemoteFeedbackDataset.
Improvements:
- Updated useFocusAnnotationViewModel to update 'context-relevant' questions with document segments asynchronously.
- Improved UI of suggestion dropdown.
- Refactored sidebar width and transition.
- Updated RenderTable height & TextAreaSuggestion.
- Fixes suggestion & responses to dynamic multilabel questions.
Bug Fixes:
- Fixed grouping issue in editable mode and improved subgroup labeling.
- Fixed QuestionsForm duration resetting.
- Fixed focus in RenderHTML.
- Fixed search and replace in RenderHTML.
- Fixed RenderTable columns ordering on addColumn.
- Fixed issues with focus on RenderTable and RenderHTML.
- Fixed sidebar resizing overflow of QuestionsForm.
- Fixed auto-submit behavior in QuestionsForm.
Chores:
- Updated .dockerignore to ignore additional directories and files.
- Updated project name to "extralit-client" in pyproject.toml.
Full Changelog: https://github.com/extralit/extralit/compare/v1.21.0b...v1.27.0a
Published by JonnyTran over 1 year ago
extralit - v0.0.9: UI Tabs & Improved Tables editing
v0.0.9
This release, following Argilla v1.21.0b, includes several enhancements, bug fixes, and refactoring to improve the overall user experience, performance, and code maintainability.
Enhancements
- UI Refinements: We've made several updates to the user interface to make it more intuitive and user-friendly. This includes updates to RenderTable, BaseCardWithTabs, TextField, and various other components.
- Resizable Components: Added the ability to resize the form and sidebar panels for better user experience.
- PDF Viewer: Refactored the PDFViewer base component and updated to use `@jonnytran/vue-pdf-viewer`.
- Keyboard Shortcuts: Updated keyboard shortcuts for better usability and added new ones for table editing and clearing records.
Bug Fixes
- RenderTable Fixes: Fixed various issues with RenderTable including group header styling, column update, and error handling.
- Overflow Issues: Fixed overflow issues in RenderHTML and RenderMarkdown components.
- Document Creation Bug: Fixed a bug that was causing issues with document creation.
Refactoring
- Code Refactoring: Refactored several parts of the codebase for better maintainability. This includes TypeScript types, SCSS mixins, and various components like TextArea, TextAreaSuggestion, and RenderTable.
- Table Component: Refactored the table component to add a resizable feature and improve its functionality.
Please note that this release is based on Argilla's v1.21.0 release and may still contain bugs. We appreciate your feedback and bug reports to help us improve the application.
Published by JTran-IDM almost 2 years ago