Recent Releases of extractor
extractor - v1.3.2 Final version for first delivery
Final release for first delivery.
This release fixes a problem with the Dockerfile. The documentation has also been updated.
- Python
Published by adrijmz about 2 years ago
extractor - v1.3.2 Final version
Final release, with the documentation updated and the script and tests finished.
Ready for first delivery.
- Python
Published by adrijmz about 2 years ago
extractor - v1.3.2 Add some test & refactor script.py
This release refactors some functions so that the tests can be run.
- Python
Published by adrijmz about 2 years ago
extractor - v1.3.1 Creating a Docker image to install the app
Update
This release allows the user to install the application using Docker. It also includes an updated README file with instructions for using the application.
- Python
Published by adrijmz about 2 years ago
extractor - v1.2.1: Use Docker Container to run GROBID
Update README
In this release, README.md has been updated to explain how to run the Docker container.
Update Script
The script now uses a local GROBID server.
Unittest
Unit test that checks that all files in the papers directory are .pdf files.
Requirements
Added requirements.txt
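The check described under Unittest could look roughly like the sketch below. It is not the repository's actual test: the helper name `non_pdf_files` and the use of a temporary directory in place of the real papers directory are assumptions made for illustration.

```python
import tempfile
import unittest
from pathlib import Path


def non_pdf_files(directory):
    """Return the names of regular files in `directory` that do not end in .pdf.

    Hypothetical helper, not taken from the project. The comparison is
    case-insensitive, so "paper.PDF" is accepted.
    """
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() != ".pdf"
    )


class TestPapersDirectory(unittest.TestCase):
    def test_all_files_are_pdf(self):
        # A temporary directory stands in for the real papers directory.
        with tempfile.TemporaryDirectory() as papers:
            (Path(papers) / "a.pdf").touch()
            (Path(papers) / "b.PDF").touch()
            self.assertEqual(non_pdf_files(papers), [])
```

Saved as a test module, this can be run with `python -m unittest`. Returning the offending file names, rather than a bare boolean, makes a failing assertion show which files need to be removed.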
- Python
Published by adrijmz about 2 years ago
extractor - v1.1.1: Update metadata files
Update metadata files
In this release codemeta.json and extractor.cff have been updated
- Python
Published by adrijmz about 2 years ago
extractor -
This release fixes an error in the extract_links function that prevented links from being read properly from papers.
Errors fixed in this release
extract_links function:
This version fixes an error that prevented links from being read properly from papers. Links can now be read and written to links.txt.
Automatic saving of figures:
This version automatically saves the keyword cloud and the diagram in the output directory.
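As a rough illustration of reading links from a paper's text and writing them to links.txt, the sketch below uses a regular expression. The project's own extract_links implementation is not shown in these notes and may work differently. The names `find_links` and `write_links` and the URL pattern are assumptions.

```python
import re
from pathlib import Path

# Simple URL pattern (an assumption): http(s) followed by any run of characters
# that are not whitespace, angle brackets, quotes, or closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def find_links(text):
    """Return the unique URLs in `text`, in order of first appearance."""
    links = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:")  # drop trailing sentence punctuation
        if url not in links:
            links.append(url)
    return links


def write_links(links, path="links.txt"):
    """Write one link per line to `path`."""
    content = "\n".join(links) + ("\n" if links else "")
    Path(path).write_text(content, encoding="utf-8")
```

For example, `write_links(find_links(paper_text))` would produce a links.txt file with one URL per line, where `paper_text` is the extracted text of a paper.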
- Python
Published by adrijmz about 2 years ago
extractor - Release v1.0.0: Initial Release
This release introduces a Python script for extracting and analyzing information from scientific articles in PDF format. It provides a suite of features to facilitate the analysis of multiple articles, including text extraction, keyword cloud generation, and figure counting.
Features
Text Extraction:
Uses GROBID to extract text from PDF documents, enabling further analysis of the content.
Keyword Cloud Generation:
Creates a keyword cloud based on the abstracts of the articles, providing a visual representation of the most common words.
Figure Counting:
Counts the number of figures in each article, aiding in understanding the visual content of the research presented.
Link Extraction (WIP):
Initial implementation for extracting links within the PDF documents, particularly references cited in the articles. Please note that this feature is still a work in progress and may not function correctly.
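The figure-counting feature can be sketched as below. This assumes that the text extraction step yields TEI XML in which each figure appears as a `figure` element in the TEI namespace, and that tables are marked as `figure` elements with `type="table"`. Both points are assumptions about the GROBID output and not details confirmed by these notes. The function name `count_figures` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def count_figures(tei_xml):
    """Count <figure> elements in a TEI XML string, skipping tables.

    Assumption: tables are encoded as <figure type="table"> and should not
    be counted as figures.
    """
    root = ET.fromstring(tei_xml)
    figures = root.findall(".//tei:figure", TEI_NS)
    return sum(1 for fig in figures if fig.get("type") != "table")
```

Calling `count_figures` once per article gives the per-article counts that a summary diagram could then be drawn from.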
- Python
Published by adrijmz about 2 years ago