extractor - v1.3.2 Final version for first delivery

Final release for first delivery.

This release fixes a problem with the Dockerfile and updates the documentation.

- Python
Published by adrijmz over 2 years ago

extractor - v1.3.2 Final version

Final release with updated documentation and the script and tests finished.

Ready for first delivery.

- Python
Published by adrijmz over 2 years ago

extractor - v1.3.2 Add some test & refactor script.py

This release refactors some functions so that the tests can be run.

- Python
Published by adrijmz over 2 years ago

extractor - v1.3.1 Creating a Docker image to install the app

Update

This release allows the user to install the application using Docker. It also includes an updated README file with instructions for using the application.

- Python
Published by adrijmz over 2 years ago

extractor - v1.2.1: Use Docker Container to run GROBID

Update README

In this release, README.md has been updated to explain how to run the Docker container.

Update Script

The script now uses a local GROBID server.
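
As a minimal sketch of what pointing a script at a local server can look like, assuming GROBID is listening on its default port 8070 on localhost. The helper names grobid_url and grobid_is_alive are hypothetical and are not taken from the extractor code.

```python
import urllib.error
import urllib.request

# Assumption: GROBID's default local address; change it if the container maps another port.
GROBID_BASE = "http://localhost:8070"


def grobid_url(endpoint: str, base: str = GROBID_BASE) -> str:
    """Build the full URL of a GROBID REST endpoint (hypothetical helper)."""
    return f"{base.rstrip('/')}/api/{endpoint.lstrip('/')}"


def grobid_is_alive(base: str = GROBID_BASE, timeout: float = 5.0) -> bool:
    """Return True if the server answers on its health-check endpoint."""
    try:
        with urllib.request.urlopen(grobid_url("isalive", base), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure: treat all as "not running".
        return False
```

Calling grobid_is_alive() before processing any PDF gives a clear early failure when the container has not been started, and grobid_url("processFulltextDocument") yields the address to which a PDF would be posted.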

Unittest

Added a unit test that checks that all files in the papers directory are .pdf files.
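
A test of this kind could be sketched as follows. This is an illustration only: the directory name "papers" comes from the note above, while the helper non_pdf_files and the test class name are hypothetical and may not match the project's test file.

```python
import unittest
from pathlib import Path

# Assumption: the PDFs live in a "papers" directory relative to where the tests run.
PAPERS_DIR = Path("papers")


def non_pdf_files(directory: Path) -> list:
    """Return the sorted names of regular files in `directory` without a .pdf suffix."""
    return sorted(
        p.name
        for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() != ".pdf"
    )


class TestPapersDirectory(unittest.TestCase):
    def test_all_files_are_pdf(self):
        # Skip rather than fail when the directory is missing on this machine.
        if not PAPERS_DIR.is_dir():
            self.skipTest(f"{PAPERS_DIR} not found")
        self.assertEqual(non_pdf_files(PAPERS_DIR), [])
```

Saved in a test module, it can be run with python -m unittest. On failure, the assertion message lists the offending file names.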

Requirements

Added requirements.txt

- Python
Published by adrijmz over 2 years ago

extractor - v1.1.1: Update metadata files

Update metadata files

In this release, codemeta.json and extractor.cff have been updated.

- Python
Published by adrijmz over 2 years ago

extractor -

This release fixes an error in the extract_links function that prevented links from being read properly from papers.

Errors fixed in this release

extract_links function:

This version fixes an error that prevented links from being read properly from papers. Links can now be read and written to links.txt.
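
The read-then-write flow described above might look like the sketch below. It is not the project's extract_links: the URL pattern, the helper names find_links and write_links, and the layout of links.txt are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumption: a simple pattern for http(s) URLs; the real function may match differently.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def find_links(text: str) -> list:
    """Return the unique URLs found in `text`, in order of first appearance."""
    seen = {}
    for match in URL_PATTERN.finditer(text):
        # Drop punctuation that belongs to the sentence, not the URL.
        url = match.group().rstrip(".,;:")
        seen.setdefault(url, None)
    return list(seen)


def write_links(links_by_paper: dict, out_file: Path) -> None:
    """Write each paper name followed by its links, one indented link per line."""
    with out_file.open("w", encoding="utf-8") as fh:
        for paper, links in links_by_paper.items():
            fh.write(f"{paper}\n")
            for link in links:
                fh.write(f"  {link}\n")
```

For example, write_links({"paper1.pdf": find_links(text)}, Path("links.txt")) would produce a links.txt with one section per paper.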

Save figures automatically:

This version automatically saves the keyword cloud and the diagram in the output directory.

- Python
Published by adrijmz over 2 years ago

extractor - Release v1.0.0: Initial Release

This release introduces a Python script for extracting and analyzing information from scientific articles in PDF format. It provides a suite of features to facilitate the analysis of multiple articles, including text extraction, keyword cloud generation, and figure counting.

Features

Text Extraction:

Uses GROBID to extract text from PDF documents, enabling further analysis of the content.
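
GROBID returns its results as TEI XML, so getting usable text out of a response means walking that XML. The sketch below pulls the abstract out of such a document with the standard library. The function name abstract_text is hypothetical, and the release notes do not say how the script itself parses the response.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by GROBID's XML output.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def abstract_text(tei_xml: str) -> str:
    """Return the abstract of a TEI document as plain text ('' if there is none)."""
    root = ET.fromstring(tei_xml)
    abstract = root.find(".//tei:profileDesc/tei:abstract", TEI_NS)
    if abstract is None:
        return ""
    # itertext() walks nested <div>/<p> elements; split/join collapses whitespace.
    return " ".join(" ".join(abstract.itertext()).split())
```

Joining the text fragments with a space keeps adjacent paragraphs from running together, at the cost of occasionally inserting a space around inline markup.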

Keyword Cloud Generation:

Creates a keyword cloud based on the abstracts of the articles, providing a visual representation of the most common words.
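
Behind any keyword cloud is a word-frequency count. The sketch below shows that counting step only, under stated assumptions: the stop-word list is a tiny illustrative one, the three-letter minimum is arbitrary, and keyword_frequencies is a hypothetical name rather than a function from the script.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on", "the", "to", "we", "with"}


def keyword_frequencies(abstracts, top_n=50):
    """Count lowercase words of 3+ letters across all abstracts, minus stop words."""
    counts = Counter()
    for abstract in abstracts:
        words = re.findall(r"[a-z]{3,}", abstract.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    # List of (word, count) pairs, most frequent first.
    return counts.most_common(top_n)
```

The resulting pairs can be turned into a dict and handed to a cloud renderer. If the wordcloud package is used (an assumption), WordCloud.generate_from_frequencies accepts such a dict.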

Figure Counting:

Counts the number of figures in each article, aiding in understanding the visual content of the research presented.
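
One way to count figures, sketched here under the assumption that the count is taken from GROBID's TEI XML output: figures appear there as figure elements, and tables are encoded as figure elements with type="table", so they are skipped. The name count_figures is hypothetical.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by GROBID's XML output.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def count_figures(tei_xml: str) -> int:
    """Count <figure> elements, skipping those marked as tables."""
    root = ET.fromstring(tei_xml)
    figures = root.findall(".//tei:figure", TEI_NS)
    return sum(1 for fig in figures if fig.get("type") != "table")
```

Collecting this count per paper gives the data needed for a figures-per-article diagram.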

Link Extraction (WIP):

Initial implementation for extracting links within the PDF documents, particularly references cited in the articles. Please note that this feature is still a work in progress and may not function correctly.

- Python
Published by adrijmz over 2 years ago
