Recent Releases of pyjedai

pyjedai - 0.3.2

⚒️ Fixed

  • llm_matching.py: Fixed examples not being created properly [@Teris45]

Added

  • None

⚠️ Issues

  • None

Authored by @Teris45

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.3.1...0.3.2

- Python
Published by Teris45 10 months ago

pyjedai - 0.3.1

⚒️ Fixed

  • None

Added

  • llm_matching.py: Matching can now be performed with LLMs through Ollama [@Teris45]
  • docs/tutorials/LLMsMatching.ipynb: See this tutorial for a walkthrough of the llm_matching process [@Teris45]

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.3.0...0.3.1


Authored by @Teris45

- Python
Published by Teris45 10 months ago

pyjedai - 0.3.0

⚒️ Fixed

  • schema/matching.py: Fixed wrong attributes being used for Coma [@Teris45]
  • schema/schema_model.py: Minor Bug when loading Schema [@Teris45 ]

Added

  • None

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.2.9...0.3.0


Authored by @Teris45

- Python
Published by Teris45 about 1 year ago

pyjedai -

⚒️ Fixed

  • schema_model.py: `self` parameter missing from some methods [@Teris45]

➕ Added

  • None

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.2.6...0.2.9

- Python
Published by Teris45 about 1 year ago

pyjedai - 0.2.8

⚒️ Fixed

  • None

Added

  • workflow.py: JoinWorkflow added. [@Teris45]

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.2.5...0.2.6


Authored by @Teris45

- Python
Published by Teris45 about 1 year ago

pyjedai - 0.2.7

⚒️ Fixed

  • workflow.py: EmbeddingsNNWorkflow didn't work correctly with clustering method and export_pairs. [@Teris45]

Added

  • None

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.2.5...0.2.6


Authored by @Teris45

- Python
Published by Teris45 about 1 year ago

pyjedai - 0.2.6

⚒️ Fixed

  • workflow.py: EmbeddingsNNWorkflow didn't work correctly with clustering method and export_pairs. [@Teris45]

Added

  • None

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.2.5...0.2.6


Authored by @Teris45

- Python
Published by Teris45 about 1 year ago

pyjedai -

⚒️ Fixed

  • None

Added

  • vectorbasedblocking.py: EmbeddingsNNBlockBuilding now accepts custom word and sentence embedding models, provided the user passes the appropriate argument to EmbeddingsNNBlockBuilding.build_blocks [@jstammers]

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.2.4...0.2.5


Authored by @Teris45

- Python
Published by Teris45 about 1 year ago

pyjedai - 0.2.4

⚒️ Fixed

  • None

Added

  • schema/schema_model.py: Added class to load data for schema-matching [@Teris45]
  • schema/utils.py: Functions needed for schema-matching.

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.2.3...0.2.4


Authored by @Teris45

- Python
Published by Teris45 about 1 year ago

pyjedai - 0.2.3

⚒️ Fixed

  • blockbuilding.py: export_to_df bug for Dirty ER fixed [@Teris45]
  • clustering.py: export_to_df bug for Dirty ER fixed [@Teris45]
  • joins.py: export_to_df bug for Dirty ER fixed [@Teris45]
  • matching.py: export_to_df bug for Dirty ER fixed [@Teris45]
  • vectorbasedblocking.py: export_to_df bug for Dirty ER fixed [@Teris45]

Added

  • None

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.2.2...0.2.3


Authored by @Teris45

- Python
Published by Teris45 about 1 year ago

pyjedai - 0.2.2

⚒️ Fixed

  • joins.py: Export pairs bug [@Nikoletos-K]
  • matching.py: Export pairs bug [@Nikoletos-K]

Added

  • None

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.2.0...0.2.1


Authored by @Nikoletos-K

- Python
Published by Nikoletos-K about 1 year ago

pyjedai - 0.2.1

⚒️ Fixed

  • joins.py: Export pairs [@Nikoletos-K]

Added

  • A new method that reads datasets from JSON files [@Nikoletos-K]
  • Reproducibility guide for the 11 Clean-Clean ER datasets [@Nikoletos-K]

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.1.9...0.2.0


Authored by @Nikoletos-K

- Python
Published by Nikoletos-K over 1 year ago

pyjedai - 0.2.0

⚒️ Fixed

  • None

Added

  • @Teris45 is the new maintainer!
  • @Nikoletos-K is out!

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.1.9...0.2.0


Authored by @Teris45

- Python
Published by Teris45 over 1 year ago

pyjedai - 0.1.9

⚒️ Fixed

  • Issue #25

Added

  • Optimized exports runtime - Removed pandas concat

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.1.8...0.1.9


Authored by @Nikoletos-K

- Python
Published by Nikoletos-K almost 2 years ago

pyjedai - 0.1.8

⚒️ Fixed

  • Issues #22 and #23.
  • NNs save/load embeddings issue [ @JacobMaciejewski ].
  • NN unused print.
  • Matching issues.

Added

  • New visualizations (PCA and tSNE)

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.1.7...0.1.8


Authored by @Nikoletos-K

- Python
Published by Nikoletos-K almost 2 years ago

pyjedai - 0.1.7

⚒️ Fixed

  • Issues #19, #20, #21
  • Removed FALCONN and SCANN
  • Refined dependencies
  • Removed Optuna injection
  • Fixed typos
  • Reports

Added

  • New utilities to docs

⚠️ Issues

  • None

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.1.6...0.1.7


Authored by @Nikoletos-K

- Python
Published by Nikoletos-K about 2 years ago

pyjedai - 0.1.6

⚒️ Fixed

  • Issue #16
  • Typos in clustering.py
  • Datamodel gt initialization
  • Imports in utils
  • Bugs in NN-workflow
  • Bugs and evaluation of simple Schema Clustering

Added

  • Dataframe memory consumption
  • New Schema Clustering method for RDF data [Not final implementation - alpha version]

⚠️ Issues

  • SCANN and FALCONN produce warnings

Full Changelog: https://github.com/AI-team-UoA/pyJedAI/compare/0.1.5...0.1.6


Authored by @Nikoletos-K

- Python
Published by Nikoletos-K about 2 years ago

pyjedai - 0.1.5

⚒️ Fixed

  • Schema Matching structure [ @Nikoletos-K ]

Added

  • First working version of Schema Clustering [ @Nikoletos-K ]
  • vectorbasedblocking component: SCANN/FAISS full functionality on Linux OS only! [ @JacobMaciejewski ]
  • RowColumnClustering: new clustering algorithm [ @JacobMaciejewski ]

⚠️ Issues

  • Minor changes in ProgressiveWorkFlow(PYJEDAIWorkFlow). [ @JacobMaciejewski ]

- Python
Published by Nikoletos-K over 2 years ago

pyjedai - 0.1.4

⚒️ Fixed

  • Correlation Clustering method.
  • nltk.download('stopwords') download only when needed.
  • Schema Matching component to align with the latest version of Valentine.

➕ Added

  • datamodel.py: SchemaData for Schema Matching Component
  • ‼️ New component: pyJedAI Spatial, for interlinking geospatial RDF data. [ @IordanisT ]
  • SCANN functionality, only available for Linux OS. [ @JacobMaciejewski ]

⚠️ Issues

  • None

- Python
Published by Nikoletos-K over 2 years ago

pyjedai - 0.1.3

⚒️ Fixed

  • None

➕ Added

  • Clustering algorithms: [ Author: @JacobMaciejewski 📌 ]

    • EquivalenceCluster
    • ExtendedSimilarityEdge
    • Vertex
    • RicochetCluster
    • ExactClustering
    • CenterClustering
    • BestMatchClustering
    • MergeCenterClustering
    • CorrelationClustering
    • CutClustering
    • MarkovClustering
    • KiralyMSMApproximateClustering
    • RicochetSRClustering
  • Blocking:

    • Statistics

⚠️ Issues

  • None

- Python
Published by Nikoletos-K over 2 years ago

pyjedai - 0.1.2

⚒️ Fixed

  • Fixed export methods for the case where no ground truth is provided
  • Restructured and optimized Joins methods by creating a vectorizer module and in-memory transactions
  • Reduced vectorization time by saving and retrieving the distance matrix

➕ Added

  • 'sqeuclidean' metric in matching step
  • Valentine as a Schema Matching plugin
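
For reference, squared Euclidean distance is the sum of squared coordinate differences, i.e. Euclidean distance without the final square root. A minimal pure-Python sketch follows; the function name `sqeuclidean` and its signature are illustrative assumptions, not pyJedAI's API.

```python
from typing import Sequence


def sqeuclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors.

    Illustrative helper only; not taken from pyJedAI.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    # Sum of squared per-coordinate differences; no square root is taken.
    return sum((a - b) ** 2 for a, b in zip(u, v))
```

Skipping the square root keeps the ordering of distances unchanged, which is why the metric is often used when only the ranking of candidates matters.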

⚠️ Issues

  • Vectorizers (tfidf, etc.) don't support Dirty ER. Will be fixed in the next release.

- Python
Published by Nikoletos-K over 2 years ago

pyjedai - 0.1.1

⚒️ Fixed

  • Removed deprecated whoosh imports from prioritization file

➕ Added

  • None

⚠️ Issues

  • None

- Python
Published by Nikoletos-K over 2 years ago

pyjedai - 0.1.0

⚒️ Fixed

  • Restructured Matching Module - vectorizer, tokenizer, and qgrams as arguments (not inferred)
  • Clustering step randomization bug
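
To illustrate what an explicit q-gram setting controls, here is a minimal character q-gram tokenizer. The function name `char_qgrams` and its parameters are hypothetical and only sketch the idea; they are not the Matching module's actual signature.

```python
from typing import List


def char_qgrams(text: str, q: int = 3) -> List[str]:
    """Split text into overlapping character q-grams (hypothetical helper)."""
    if q < 1:
        raise ValueError("q must be a positive integer")
    text = text.lower()
    if not text:
        return []
    if len(text) <= q:
        # Strings shorter than q are kept as a single token.
        return [text]
    # Slide a window of width q over the string, one character at a time.
    return [text[i:i + q] for i in range(len(text) - q + 1)]
```

Passing `q` explicitly, rather than inferring it, makes the token granularity visible at the call site.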

➕ Added

  • PER notebook tutorials
  • PER grid-search pipeline (config files, search scripts, storage)
  • PER workflows visualization and comparison through:
    • feature configuration budget-centric metric progress plots
    • feature configuration dataset-centric sorting and comparison

⚠️ Issues

  • None

- Python
Published by Nikoletos-K almost 3 years ago

pyjedai - 0.0.9

⚒️Fixed:

  • FAISS euclidean distance
  • Workflow methods
  • Removed whoosh
  • Removed SCANN

➕Added:

  • 3 New workflow methods
  • Export pairs in each step
  • Tfidf weights in matching options
  • Website:
    • code API
    • new tutorials

⚠️ Issues:

  • None

- Python
Published by Nikoletos-K almost 3 years ago

pyjedai - 0.0.8

Fixed:

  • Word grams tokenization
  • Code architecture in entity matching
  • py_stringmatching dependencies
  • PyPI readme

Added:

  • Boolean/Tfidf/Tf weights
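
As a rough illustration of how the three weighting schemes differ, the sketch below computes boolean, TF, and TF-IDF weights for tokenized documents. The function name `term_weights` and the exact formulas (length-normalized TF, unsmoothed `log(N / df)` IDF) are assumptions chosen for clarity; pyJedAI's own computation may differ.

```python
import math
from collections import Counter
from typing import Dict, List


def term_weights(docs: List[List[str]], scheme: str = "tfidf") -> List[Dict[str, float]]:
    """Return one {token: weight} dict per tokenized document (illustrative)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each token appears.
    df = Counter(tok for doc in docs for tok in set(doc))
    weighted = []
    for doc in docs:
        counts = Counter(doc)
        weights = {}
        for tok, cnt in counts.items():
            if scheme == "boolean":
                weights[tok] = 1.0  # presence only
            elif scheme == "tf":
                weights[tok] = cnt / len(doc)  # relative frequency
            elif scheme == "tfidf":
                weights[tok] = (cnt / len(doc)) * math.log(n_docs / df[tok])
            else:
                raise ValueError(f"unknown scheme: {scheme}")
        weighted.append(weights)
    return weighted
```

With this formulation, a token that occurs in every document gets a TF-IDF weight of zero, while boolean weighting treats all present tokens equally.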

- Python
Published by Nikoletos-K almost 3 years ago

pyjedai - 0.0.7

Fixed:

  • Issues in block filtering
  • Issues in vector based blocking
  • Data model set types
  • EJoin wrong naming

Added:

  • Prioritization algorithms
  • Tf-Idf functionality
  • More metrics on entity matching
  • Optional data cleaning functionalities
  • New visualizations
  • New stats for the blocking workflows

- Python
Published by Nikoletos-K almost 3 years ago

pyjedai - v0.0.6

Fixed an issue in vector-based blocking (VB).

- Python
Published by Nikoletos-K almost 3 years ago

pyjedai - v0.0.5

Added:

  • New evaluation module
  • Matching metrics
  • Vector based blocking techniques
  • Data process methods
  • Entity matching plots
  • sphinx website
  • New tests

Fixed:

  • Architecture, abstract data types
  • Data bugs in block building
  • Bugs in vector based blocking
  • Using workflows without gt
  • Code runtime

- Python
Published by Nikoletos-K about 3 years ago

pyjedai - v0.0.4

Python 3.7 and 3.8 are now supported!

New dependencies. pyJedAI now supports older Python versions. Supported versions:

  • 3.7
  • 3.8
  • 3.9
  • 3.10

Also added tests for all supported Python versions and macOS.

- Python
Published by Nikoletos-K over 3 years ago

pyjedai - v0.0.3

First official release in PyPI

Contains:

  • Tutorials and demos
  • Fixed issues

- Python
Published by Nikoletos-K over 3 years ago

pyjedai - v0.0.2

Optimizations, User-friendly Approach Updates

This is the second release. The project is still under development. In this release we:

  • Added the WorkFlow module: a high-level, user-friendly method that simplifies the whole process.
  • Added comments to the basic methods.
  • Performed runtime optimizations by making fuller use of Python.
  • Created automatic tests.
  • Created a new Block Building method that uses pre-trained embeddings and Gensim, with similarity search via the FAISS framework.
  • Uploaded to PyPI.
  • Added visualization techniques for performance checks.

- Python
Published by Nikoletos-K over 3 years ago

pyjedai - v0.0.1

First pyJedAI release: this release brings the basic structure of the well-known JedAI toolkit to the Python environment. Contains:

  • Data reading techniques: RDF/OWL, SPARQL, CSV, JSON, DB
  • Block building: Standard Blocking, QGrams & Extended, SuffixArray & Extended
  • Block cleaning: Block purging, Block filtering
  • Comparison cleaning: Weighted edge/node pruning, Cardinality edge/node pruning, BLAST, etc.
  • Entity matching: strsimpy
  • Entity clustering: Connected component clustering
  • Similarity Joins: SchemaAgnosticΕJoin, TopKSchemaAgnosticJoin
  • Evaluation through Jupyter notebook
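
The connected component clustering mentioned above groups entities that are linked, directly or transitively, by matched pairs. A minimal union-find sketch of that idea follows; the function name `connected_components` and the pair-list input format are assumptions for illustration, not pyJedAI's interface.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def connected_components(pairs: Iterable[Tuple[Hashable, Hashable]]) -> List[Set[Hashable]]:
    """Cluster entity ids so that every matched pair ends up in the same cluster."""
    parent: Dict[Hashable, Hashable] = {}

    def find(x: Hashable) -> Hashable:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    # Union the two endpoints of every matched pair.
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the members of each root into one cluster.
    clusters: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    for node in parent:
        clusters[find(node)].add(node)
    return list(clusters.values())
```

For example, the pairs `(1, 2)`, `(2, 3)` and `(4, 5)` produce two clusters, one holding entities 1, 2 and 3 and one holding 4 and 5.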

- Python
Published by Nikoletos-K almost 4 years ago