Recent Releases of extralit

extralit - v0.6.1: Integrated PDF processing workflow with extralit-hf-space and incremental Dataset building with imports

This release delivers major upgrades to document processing and import workflows, and exposes additional dataset-building functionality in the UI. Highlights include OCRmyPDF-powered PDF processing via Redis jobs, a workspace selector in the breadcrumb, and incremental import with dataset mapping.

What's Changed

  • [FEAT] integrate OCRmyPDF and on document upload in Redis Queue jobs by @priyankeshh and @JonnyTran in https://github.com/Extralit/extralit/pull/115
  • [FIX] Import Files Flow by @JonnyTran in https://github.com/Extralit/extralit/pull/120
  • [FEAT] Workspace Pinia Store and Dataset Breadcrumb Selector in AppHeader by @JonnyTran in https://github.com/Extralit/extralit/pull/121
  • [FIX] Import File Parsing and Matching Flow and Refactoring by @JonnyTran in https://github.com/Extralit/extralit/pull/122
  • [FIX] DocumentAPI to query by params and return multiple documents & fix PDF file fetching by @JonnyTran in https://github.com/Extralit/extralit/pull/123
  • [FEAT] minio presigned url for pdf by @JonnyTran in https://github.com/Extralit/extralit/pull/124
  • [FIX] Import Analysis and Batch Refactoring, File Matching algorithm, Document Panel by @JonnyTran in https://github.com/Extralit/extralit/pull/130
  • [FIX] Consolidating linting configuration by @JonnyTran in https://github.com/Extralit/extralit/pull/133
  • [FEAT] Document workflows with rq jobs by @JonnyTran in https://github.com/Extralit/extralit/pull/136
  • [FEAT] Import dataset mapping by @JonnyTran in https://github.com/Extralit/extralit/pull/140
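
The file-matching step mentioned in the import PRs above is not described in detail in these notes. As a rough illustration only, and not Extralit's actual algorithm, the sketch below pairs uploaded PDF filenames with reference keys by normalized string similarity using Python's standard difflib module. The function names, the 0.6 threshold, and the sample data are all hypothetical.

```python
from difflib import SequenceMatcher
from pathlib import PurePath
import re


def normalize(text: str) -> str:
    """Lowercase and strip everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def match_files(filenames, reference_keys, threshold=0.6):
    """Map each filename to its most similar reference key, or None.

    Hypothetical helper: the threshold is an arbitrary illustration value.
    """
    matches = {}
    for name in filenames:
        stem = normalize(PurePath(name).stem)
        best_key, best_score = None, 0.0
        for key in reference_keys:
            score = SequenceMatcher(None, stem, normalize(key)).ratio()
            if score > best_score:
                best_key, best_score = key, score
        # Leave the file unmatched when nothing is similar enough.
        matches[name] = best_key if best_score >= threshold else None
    return matches


# Hypothetical sample data: two uploaded files, two reference keys.
result = match_files(
    ["Smith_2020.pdf", "unrelated-notes.pdf"],
    ["smith2020", "jones2019"],
)
```

Files that fall below the threshold map to None, so they can be left for manual assignment instead of being silently attached to the wrong reference.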

Contributors

Many thanks to @priyankeshh for the work on the https://github.com/Extralit/extralit-hf-space repo for PyMuPDF integration. Welcome @Mr-Youssef-Sherif!

Full Changelog: https://github.com/Extralit/extralit/compare/v0.6.0...v0.6.1

- Python
Published by JonnyTran 6 months ago

extralit - v0.6.0: PDF Importer feature with BibTeX support and namespace refactoring

What's Changed

  • [FEATURE] Papers Library BibTeX Importer by @JonnyTran in https://github.com/Extralit/extralit/pull/107
  • Update Frontend Dependencies: Migrate deprecated Babel plugins and refresh Vue 2 tooling by @Copilot in https://github.com/Extralit/extralit/pull/113
  • feat(import-history): sidebar integration by @JonnyTran in https://github.com/Extralit/extralit/pull/116
  • [FEAT] Rename question in Dataset Configuration by @JonnyTran in https://github.com/Extralit/extralit/pull/117
  • Complete Python namespace refactor: argilla → extralit with directory restructure by @Copilot in https://github.com/Extralit/extralit/pull/118
  • [RELEASE] v0.6.0 by @JonnyTran in https://github.com/Extralit/extralit/pull/119

Full Changelog: https://github.com/Extralit/extralit/compare/v0.5.0...v0.6.0

- Python
Published by JonnyTran 7 months ago

extralit - v0.5.0: Latest Argilla Upgrade and Repo Restructuring

This release synchronizes with the latest changes from the upstream Argilla project, improves the CI pipelines, restructures the repository from Argilla to Extralit, and introduces support for legacy migrations.

What's New

  • Upstream Argilla Sync (v2.6.0-v2.8.0): We've merged the latest changes from Argilla, bringing in new features and bug fixes. Key highlights include:

    • Similarity Search with Scores: The API now returns similarity scores when performing similarity searches, providing more context for your results.
    • Predefined IDs for Users & Workspaces: You can now create users and workspaces with predefined IDs, simplifying integration and migration workflows.
  • Legacy Migration Support:

    • To support users migrating from older versions, the extralit_v1 package has been added under argilla-v1/src/extralit_v1.
  • Project Refinements:

    • We have completed the project-wide refactoring from argilla to extralit, ensuring consistency across module paths and configurations like the cache directory (~/.extralit/).
    • The upload_file function and document listing process have been streamlined for a better user experience.

Upgrade Notes

  • To upgrade to the latest version, run: pip install --upgrade extralit

Contributors

A big thank you to our community for the continuous support, contributions, and feedback that made this release possible!

Full Changelog

extralit/CHANGELOG.md v0.4.1...v0.5.0

- Python
Published by JonnyTran 8 months ago

extralit - v0.4.0: Argilla v2 API and CLI Rebuild

New Features

  • Major CLI Overhaul (PR #57):
    The extralit CLI has been rebuilt and now includes comprehensive commands for workspace, file, document, and schema management. This makes it easier than ever to interact with Extralit from the command line, automate workflows, and integrate with other tools.
  • Workspace API Improvements:
    The Workspace API now supports more robust operations, improved error handling, and better logging for easier debugging and development.
  • Enhanced CLI error messages and user feedback.
  • Improved file and schema management commands.
  • Refactored codebase for better maintainability and developer experience.
  • Updated developer documentation and issue templates.

Upgrade Notes

  • Python 3.9+ required.
  • Upgrade with: pip install --upgrade extralit
  • To get started with the CLI: extralit --help
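
After upgrading, it can help to confirm that the interpreter meets the Python 3.9+ requirement and to see which version of the package is installed. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and it assumes the installed distribution is named extralit, as in the pip command above.

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(minimum=(3, 9)) -> bool:
    """Check the running interpreter against the Python 3.9+ requirement."""
    return sys.version_info >= minimum


print("Python 3.9+:", python_is_supported())
print("extralit version:", installed_version("extralit"))
```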

Contributors

Special thanks to the contributors of PR #57, who delivered a major milestone: @priyankeshh and @Ashutoshx7.

And thanks to everyone else (@ArthrowAbstract, @SanjayUG, @Nakshatra05) who contributed to this release through code, reviews, and feedback!

Full Changelog

argilla/CHANGELOG.md v0.3.0...v0.4.0

- Python
Published by JonnyTran 9 months ago

extralit - v0.3.0: TableField and TableQuestion types

This release focuses on introducing table support for fields and questions in feedback datasets, along with infrastructure improvements.

New Features

  • Schema & Fields:
    • Added support for TableField and es_field_for_record_field for table fields
    • Added TableQuestion and TableQuestionSetting to support table questions

Infrastructure Improvements

  • DevOps:
    • Added redis service to the Tilt k8s deployment for argilla-server
    • Improved the multi-stage builds in the argilla-server and extralit-server Dockerfiles
    • Changed envvars in Tilt k8s deployment at argilla-server-deployment.yaml

Bug Fixes

  • Fixed elasticsearch reindexing errors with dynamic schema
  • Fixed certain extralit-specific changes when loading Dataset

Full Changelog

https://github.com/extralit/extralit/compare/v0.2.2...v0.3.0

- Python
Published by JonnyTran about 1 year ago

extralit - v0.2.1: devcontainers and unit test for files and documents

This release focuses on enhancing the continuous integration, testing, and DevOps setup, ensuring a more robust and efficient development workflow.

New Features

  • Development Environment:

    • Added singleton schema support in SchemaStructure.
    • Added docs site for the Extralit project at argilla/docs/.
    • Added pytest-xdist for parallel testing.
    • Added Pytest and Python environment setup to the "PostgreSQL & Elasticsearch for Docker-Compose" GitHub Codespaces devcontainer.
    • Added .devcontainer for "Docker, Tilt, and K8s" local development on GitHub Codespaces.
  • Testing:

    • Added tests for:
      • Response: update duration.
      • Files: get, put, list, delete.
      • Models: get, post, put, delete.
      • Records: include response_suggestions.

Changes

  • Dependencies:

    • Updated Elasticsearch to 8.15.0.
  • Database:

    • Reverted the Suggestion table's unique constraint to only "record_id" and "question_id", fixing the test suites.
  • API:

    • Disabled adding LIST_DATASET_RECORDS_DEFAULT_SORT_BY when there's no sort-by on GET records.
    • Changed the /api/v1/documents POST endpoint to use UploadFile.
  • DevOps:

    • Changed K8s Elasticsearch deployment from Helm to docker.elastic.co/elasticsearch/elasticsearch to fix PVC restarting issues.
    • Refactored Extralit Dockerfile and Docker Hub images to extralit/argilla-server and extralit/argilla-quickstart.
    • Changed develop branch changes in argilla/docs to publish to https://docs.extralit.ai/latest instead of dev.
    • Changed examples/deployments/k8s/extralit-configs.yaml for configuring the Extralit service and secrets in a K8s cluster.
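
The Database change above restricts the Suggestion table's unique constraint to the record and question identifiers, which means each record can hold at most one suggestion per question. The sketch below illustrates that behavior with an in-memory SQLite database. It is a simplified stand-in, not the project's actual schema or database layer: the table name, the column names record_id and question_id, and the add_suggestion helper are assumptions made for illustration.

```python
import sqlite3

# In-memory database with a simplified, hypothetical suggestions table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE suggestions (
        id INTEGER PRIMARY KEY,
        record_id TEXT NOT NULL,
        question_id TEXT NOT NULL,
        value TEXT,
        UNIQUE (record_id, question_id)
    )
    """
)


def add_suggestion(record_id: str, question_id: str, value: str) -> bool:
    """Insert a suggestion; return False if the pair already has one."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO suggestions (record_id, question_id, value) "
                "VALUES (?, ?, ?)",
                (record_id, question_id, value),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = add_suggestion("rec-1", "q-1", "yes")
duplicate = add_suggestion("rec-1", "q-1", "no")       # same pair: rejected
other_question = add_suggestion("rec-1", "q-2", "no")  # new pair: accepted
```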

Bug Fixes

  • Fixed Tiltfile and k8s manifests for mono-repo setup.
  • Fixed creating a new Weaviate collection with Weaviate client v4.
  • Fixed an error with checking Weaviate collection existence when one doesn't exist.
  • Fixed an issue with reindexing Elasticsearch by handling exceptions on failed datasets.
  • Added a Workspace relationship to Document to enable cascade delete.

Security

  • Allow admin role for workspace creation.

Full Changelog: https://github.com/extralit/extralit/compare/v0.2.0...v0.2.1

- Python
Published by JonnyTran over 1 year ago

extralit - v0.2.0: Extralit CLI workspace management and GitHub Actions CI workflows

This release, following Argilla v1.29.1, brings significant improvements to the Extralit CLI and workspace management, along with various bug fixes and enhancements for a smoother user experience.

New Features

  • Workspace Management:

    • Added workspace schema and file management to the Extralit CLI.
    • Refined workspace schema and file management in the Extralit CLI.
    • Updated rg.Workspace with update_schemas and get_schemas methods.
    • Enabled _ID reference IDs in schemas.
    • Added inserted_at and updated_at fields to Suggestion.
  • CLI Enhancements:

    • Introduced the Extralit CLI for improved command-line interactions.
  • User Interface:

    • Updated status filter options in StatusFilter.vue and RecordRepository.ts.
    • Added tooltip in LabelSelection.
  • Translation and Localization:

    • Updated translation for "Use Table" option.
    • Added use_table option to QuestionSetting.

Bug Fixes

  • Fixed import statements in SchemaStructure and Workspace.
  • Ensured .mjs files are properly transpiled with babel-loader.
  • Fixed validation errors in FeedbackRecord suggestions to server payload.
  • Fixed RecordRepository.ts to remove fetching "All data".

Continuous Integration and Deployment

  • Updated GitHub Actions and the Docker Hub image names used for deployments.
  • Added GitHub Codespaces in .devcontainer.
  • Updated package names and build configurations for Extralit.
  • Set up mono repo to merge extralit-server.

Documentation

  • Updated README.md with new information.

Miscellaneous

  • Updated pip dependencies for Python tests.
  • Updated community links.

Full Changelog: https://github.com/extralit/extralit/compare/v1.27.0a...v0.2.0

- Python
Published by JonnyTran over 1 year ago

extralit - v0.1.0: Enhancements in RenderTable and UI Improvements

This release, following Argilla v1.27.0, brings a series of new features, improvements, and bug fixes, primarily focusing on the RenderTable component and UI enhancements.

New Features:

  • Added filter values in RenderTable.
  • Added reference field to Document class for useLLMExtractionViewModel.
  • Added fetchDocumentSegments.
  • ReactiveData on add rows.
  • Added duplication of multiple rows selected in a range.
  • Updated column context menu label to freeze/unfreeze column.
  • Added generation of empty rows within a group.
  • Added button to fetch latest schema in RenderTable component.
  • RenderTable now automatically fetches validation from the server.
  • Added history entry for updateTableData.
  • Extraction completion working.
  • Added optional parameter to retrieve documents in ArgillaMixin.
  • Updated Document to use URL instead of file_data.
  • Added document deletion in RemoteFeedbackDataset.

Improvements:

  • Updated useFocusAnnotationViewModel to update 'context-relevant' questions with document segments asynchronously.
  • Improved UI of suggestion dropdown.
  • Refactored sidebar width and transition.
  • Updated RenderTable height & TextAreaSuggestion.
  • Fixed suggestions and responses for dynamic multilabel questions.

Bug Fixes:

  • Fixed grouping issue in editable mode and improved subgroup labeling.
  • Fixed QuestionsForm duration resetting.
  • Fixed focus in RenderHTML.
  • Fixed search and replace in RenderHTML.
  • Fixed RenderTable columns ordering on addColumn.
  • Fixed issues with focus on RenderTable and RenderHTML.
  • Fixed sidebar resizing overflow of QuestionsForm.
  • Fixed auto-submit behavior in QuestionsForm.

Chores:

  • Updated .dockerignore to ignore additional directories and files.
  • Updated project name to "extralit-client" in pyproject.toml.

Full Changelog: https://github.com/extralit/extralit/compare/v1.21.0b...v1.27.0a

- Python
Published by JonnyTran over 1 year ago

extralit - v0.0.9: UI Tabs & Improved Tables editing

This release, following Argilla v1.21.0b, includes several enhancements, bug fixes, and refactoring to improve the overall user experience, performance, and code maintainability.

Enhancements

  • UI Refinements: We've made several updates to the user interface to make it more intuitive and user-friendly. This includes updates to RenderTable, BaseCardWithTabs, TextField, and various other components.
  • Resizable Components: Added the ability to resize the form and sidebar panels for a better user experience.
  • PDF Viewer: Refactored the PDFViewer base component and updated to use @jonnytran/vue-pdf-viewer.
  • Keyboard Shortcuts: Updated keyboard shortcuts for better usability and added new ones for table editing and clearing records.

Bug Fixes

  • RenderTable Fixes: Fixed various issues with RenderTable including group header styling, column update, and error handling.
  • Overflow Issues: Fixed overflow issues in RenderHTML and RenderMarkdown components.
  • Document Creation Bug: Fixed a bug that was causing issues with document creation.

Refactoring

  • Code Refactoring: Refactored several parts of the codebase for better maintainability. This includes TypeScript types, SCSS mixins, and various components like TextArea, TextAreaSuggestion, and RenderTable.
  • Table Component: Refactored the table component to add a resizable feature and improve its functionality.

Please note that this release is based on Argilla's v1.21.0 release and may still contain bugs. We appreciate your feedback and bug reports to help us improve the application.

- Python
Published by JTran-IDM almost 2 years ago