paperscraper - v0.3.2

What's Changed

Update pdf.py by @MoonDavid in https://github.com/jannisborn/paperscraper/pull/82
Chemrxiv limit by @jannisborn in https://github.com/jannisborn/paperscraper/pull/84
prepare 0.3.2 by @jannisborn in https://github.com/jannisborn/paperscraper/pull/86

New Contributors

@MoonDavid made their first contribution in https://github.com/jannisborn/paperscraper/pull/82

Full Changelog: https://github.com/jannisborn/paperscraper/compare/v0.3.1...v0.3.2

- Python
Published by jannisborn 8 months ago

paperscraper - v0.3.1

What's Changed

Load API keys automatically from .env file if available -- by @jannisborn in https://github.com/jannisborn/paperscraper/pull/77
Optionally download bioRxiv PDFs via requester-pays S3 bucket -- by @jannisborn in https://github.com/jannisborn/paperscraper/pull/80

Pre-release

Selflink by @jannisborn in https://github.com/jannisborn/paperscraper/pull/76
Homogenize self citation/reference client by @jannisborn in https://github.com/jannisborn/paperscraper/pull/78

Full Changelog: https://github.com/jannisborn/paperscraper/compare/v0.3.0...v0.3.1

- Python
Published by jannisborn 11 months ago

paperscraper - v0.3.0

What's Changed

Citations of a paper can now be retrieved from a DOI by @jannisborn in https://github.com/jannisborn/paperscraper/pull/73
Full text download fallback implementation by @mathinic in https://github.com/jannisborn/paperscraper/pull/72

New Contributors

@mathinic made their first contribution in https://github.com/jannisborn/paperscraper/pull/72

Full Changelog: https://github.com/jannisborn/paperscraper/compare/v0.2.16...v0.3.0

- Python
Published by jannisborn about 1 year ago

paperscraper - v0.2.16

What's Changed

feat: support retries for chemrxiv api by @jannisborn in https://github.com/jannisborn/paperscraper/pull/66
BREAKING CHANGE: Homogenize the usage of begindate instead of startdate by @jannisborn in https://github.com/jannisborn/paperscraper/pull/69
Ensure unique DOI from PubMed API by @jannisborn in https://github.com/jannisborn/paperscraper/pull/71
More robust PubMed requests (bumped pymed-paperscraper dependency)

Full Changelog: https://github.com/jannisborn/paperscraper/compare/v0.2.15...v0.2.16

- Python
Published by jannisborn about 1 year ago

paperscraper - v0.2.15

What's Changed

feat: support scraping arxiv entirely by @jannisborn in https://github.com/jannisborn/paperscraper/pull/64
feat: support date search in arxiv by @jannisborn in https://github.com/jannisborn/paperscraper/pull/63
feat: Journal impact factors are now up to date until 2024 by @jannisborn in https://github.com/jannisborn/paperscraper/pull/55
feat: paperscraper.pdf.save_pdf can now also save paper metadata in json format by @jannisborn in https://github.com/jannisborn/paperscraper/pull/57

Pre-releases

Adding support for self-referencing (#59) by @jannisborn in https://github.com/jannisborn/paperscraper/pull/60
Base setup for self-linking by @jannisborn in https://github.com/jannisborn/paperscraper/pull/61

Full Changelog: https://github.com/jannisborn/paperscraper/compare/v0.2.14...v0.2.15

- Python
Published by jannisborn about 1 year ago

paperscraper - v0.2.14

What's Changed

Refactor to pymed-paperscraper as dependency by @jannisborn in https://github.com/jannisborn/paperscraper/pull/53
Support and Tests for higher Python versions by @jannisborn in https://github.com/jannisborn/paperscraper/pull/48
Expand unit tests by @jannisborn in https://github.com/jannisborn/paperscraper/pull/49
doc: Basic mkdocs setup by @jannisborn in https://github.com/jannisborn/paperscraper/pull/50
Add codespell support (config, workflow to detect/not fix) and make it fix few typos by @yarikoptic in https://github.com/jannisborn/paperscraper/pull/54

New Contributors

@yarikoptic made their first contribution in https://github.com/jannisborn/paperscraper/pull/54

Full Changelog: https://github.com/jannisborn/paperscraper/compare/v0.2.13...v0.2.14

- Python
Published by jannisborn over 1 year ago

paperscraper - v0.2.13

What's Changed

Bump scholarly dependency by @jannisborn in https://github.com/jannisborn/paperscraper/pull/47

Full Changelog: https://github.com/jannisborn/paperscraper/compare/v0.2.12...v0.2.13

- Python
Published by jannisborn almost 2 years ago

paperscraper - v0.2.12

What's Changed

chore(deps): bump requests from 2.31.0 to 2.32.0 by @dependabot in https://github.com/jannisborn/paperscraper/pull/42
add retry logic in XRXivApi to tackle request timed out by @memray in https://github.com/jannisborn/paperscraper/pull/43
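
The retry idea behind this change can be sketched as follows. This is only an illustration of retrying a timed-out request with exponential backoff; `call_with_retries`, its parameters, and `flaky_request` are hypothetical names, and the actual logic inside XRXivApi may differ.

```python
import time


def call_with_retries(func, max_retries=3, base_delay=1.0,
                      exceptions=(TimeoutError, ConnectionError),
                      sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff.

    Hypothetical helper for illustration; not part of paperscraper.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except exceptions:
            if attempt == max_retries:
                raise  # out of retries, surface the last error
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay * 2 ** attempt)


# Demo: a stand-in request that times out twice before succeeding.
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("request timed out")
    return {"status": "ok"}


delays = []  # record the waits instead of actually sleeping
result = call_with_retries(flaky_request, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the demo instant; in real use the default `time.sleep` would apply.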

New Contributors

@memray made their first contribution in https://github.com/jannisborn/paperscraper/pull/43

Full Changelog: https://github.com/jannisborn/paperscraper/compare/v0.2.11...v0.2.12

- Python
Published by jannisborn almost 2 years ago

paperscraper - v0.2.11

What's Changed

fix: lower default max_results by @jannisborn in https://github.com/PhosphorylatedRabbits/paperscraper/pull/41

Full Changelog: https://github.com/PhosphorylatedRabbits/paperscraper/compare/v0.2.10...v0.2.11

- Python
Published by jannisborn about 2 years ago

paperscraper - Impact factor restoration

Version 0.2.9 was broken because the dependencies of paperscraper.impact were not shipped via PyPI (installation from source worked). This release fixes that and expands the tests to catch such cases in the future.

What's Changed

Hotfix by @jannisborn in https://github.com/PhosphorylatedRabbits/paperscraper/pull/39

Full Changelog: https://github.com/PhosphorylatedRabbits/paperscraper/compare/v0.2.9...v0.2.10

- Python
Published by jannisborn about 2 years ago

paperscraper - Impact factor integration

Fuzzy search of impact factor from journals

What's Changed

Impact factor by @jannisborn in https://github.com/PhosphorylatedRabbits/paperscraper/pull/37

Full Changelog: https://github.com/PhosphorylatedRabbits/paperscraper/compare/v0.2.8...v0.2.9

- Python
Published by jannisborn over 2 years ago

paperscraper - v0.2.8

What's Changed

Graceful handling of connection errors by @jannisborn in https://github.com/PhosphorylatedRabbits/paperscraper/pull/35
chore(deps): bump requests from 2.24.0 to 2.31.0 by @dependabot in https://github.com/PhosphorylatedRabbits/paperscraper/pull/30

New Contributors

@dependabot made their first contribution in https://github.com/PhosphorylatedRabbits/paperscraper/pull/30

Full Changelog: https://github.com/PhosphorylatedRabbits/paperscraper/compare/v0.2.7...v0.2.8

- Python
Published by jannisborn over 2 years ago

paperscraper - v0.2.7

What's Changed

fix: OS agnostic urljoining by @jannisborn in https://github.com/PhosphorylatedRabbits/paperscraper/pull/29

A bugfix for Windows users, who were previously unable to query the chemrxiv API.
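
To illustrate why URL joining can be OS dependent, here is a minimal sketch. It assumes the common pitfall of building URLs with `os.path.join`, which inserts backslashes on Windows; whether that was the exact cause in paperscraper is an assumption. `join_url` is a hypothetical helper and the base URL is a placeholder, not the real chemrxiv endpoint.

```python
import ntpath  # the Windows flavour of os.path, importable on any OS


def join_url(base, *parts):
    """Join URL segments with forward slashes, independent of the host OS."""
    segments = [base.rstrip("/")] + [part.strip("/") for part in parts]
    return "/".join(segment for segment in segments if segment)


base = "https://example.org/api/v1"  # placeholder URL for illustration

# What os.path.join would produce on Windows: a backslash sneaks into the URL.
windows_style = ntpath.join(base, "items")

# An OS-agnostic join always uses forward slashes.
portable = join_url(base, "items")

print(repr(windows_style))
print(repr(portable))
```

`urllib.parse.urljoin` is another option, but it replaces the last path segment when the base lacks a trailing slash, so an explicit helper like the one above can be easier to reason about.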

Full Changelog: https://github.com/PhosphorylatedRabbits/paperscraper/compare/v0.2.6...v0.2.7

- Python
Published by jannisborn almost 3 years ago

paperscraper - 0.2.6

What's Changed

Save DOIs from arxiv papers by @jannisborn in https://github.com/PhosphorylatedRabbits/paperscraper/pull/27
This also allows scraping PDFs from arxiv metadata.

Full Changelog: https://github.com/PhosphorylatedRabbits/paperscraper/compare/v0.2.5...v0.2.6

- Python
Published by jannisborn almost 3 years ago

paperscraper - v0.2.5

What's Changed

Extract records from biorxiv and medrxiv based on start date and end date by @achouhan93 in https://github.com/PhosphorylatedRabbits/paperscraper/pull/24
Extract records from chemrxiv based on start date and end date by @achouhan93 and @jannisborn in https://github.com/PhosphorylatedRabbits/paperscraper/pull/25

EXAMPLE

Since v0.2.5, paperscraper can also scrape {med/bio/chem}rxiv for specific dates:

    medrxiv(begin_date="2023-04-01", end_date="2023-04-08")

But watch out: the resulting .jsonl file will be labelled according to the current date, and all your subsequent searches will be based on this file only. If you use this option, keep an eye on the source files (paperscraper/server_dumps/*jsonl) to ensure they contain the metadata for all papers you're interested in.
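
One way to check which dates a dump file actually covers is to scan it directly. The sketch below assumes each line of the .jsonl file is a JSON object with a `date` field in YYYY-MM-DD format; the field name, the helper `dump_date_range`, the file name, and the DOIs are all illustrative assumptions, not taken from the package.

```python
import json
import tempfile
from pathlib import Path


def dump_date_range(path, date_key="date"):
    """Return (earliest, latest) date strings found in a .jsonl dump, or None."""
    dates = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if date_key in record:
                dates.append(record[date_key])
    if not dates:
        return None
    # ISO-formatted dates sort correctly as plain strings.
    return min(dates), max(dates)


# Demo with a throwaway file standing in for a real server dump.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "medrxiv_demo.jsonl"
    demo.write_text(
        '{"doi": "10.0000/example.1", "date": "2023-04-01"}\n'
        '{"doi": "10.0000/example.2", "date": "2023-04-08"}\n',
        encoding="utf-8",
    )
    covered = dump_date_range(demo)

print(covered)
```

If the reported range is narrower than the period you want to search, the dump was probably created from a date-restricted scrape.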

New Contributors

@achouhan93 made their first contribution in https://github.com/PhosphorylatedRabbits/paperscraper/pull/24

Full Changelog: https://github.com/PhosphorylatedRabbits/paperscraper/compare/v0.2.4...v0.2.5

- Python
Published by jannisborn almost 3 years ago

paperscraper - v0.2.4

v0.2.4 - release summary

1. Support for scraping PDFs
2. Harmonize return types of scraper classes to pd.DataFrame rather than List[Dict]

1. Scraping PDFs

v0.2.4 now supports downloading PDFs. The core function is paperscraper.pdf.save_pdf, which receives a dictionary with the key doi and downloads the PDF for the desired DOI. There's also a wrapper function, paperscraper.pdf.save_pdf_from_dump, that can be called with the filepath of a .jsonl file previously obtained in a metadata search. This wrapper downloads all PDFs from the metadata search. Examples are given in the README.

Thanks to @daenuprobst for suggestions!

2. Return types

With this version, all scraper classes return their results as a pandas DataFrame (one paper per row) rather than a list of dictionaries (one paper per dict).

Full Changelog: https://github.com/PhosphorylatedRabbits/paperscraper/compare/v0.2.3...v0.2.4

- Python
Published by jannisborn over 3 years ago

paperscraper - v0.2.3

What's Changed

fix: preprint['id'] should be preprint['item']['id'] by @oppih in https://github.com/PhosphorylatedRabbits/paperscraper/pull/22

Full Changelog: https://github.com/PhosphorylatedRabbits/paperscraper/compare/v0.2.2...v0.2.3

- Python
Published by jannisborn almost 4 years ago

paperscraper - v0.2.2

What's Changed

refactor: extraction of published DOI/URL by @oppih in https://github.com/PhosphorylatedRabbits/paperscraper/pull/21

New Contributors

@oppih made their first contribution in https://github.com/PhosphorylatedRabbits/paperscraper/pull/21

Full Changelog: https://github.com/PhosphorylatedRabbits/paperscraper/compare/0.2.1...v0.2.2

- Python
Published by jannisborn almost 4 years ago

paperscraper - 0.2.1 Streamline .jsonl handling (saving/loading)

This version streamlines the handling of .jsonl files throughout the package. It removes an inconsistency between the arxiv/pubmed and the biorxiv/chemrxiv/medrxiv entry points: the former dumped papers as one string per line, whereas the latter dumped them as one dict (JSON) per line. Thanks @juliusbierk for pointing this out.
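
The "one dict (JSON) per line" convention can be sketched with the standard library alone. The helpers `dump_jsonl` and `load_jsonl` and the sample records are hypothetical and only illustrate the file format, not the package's internal code.

```python
import io
import json


def dump_jsonl(records, handle):
    """Write one JSON object per line."""
    for record in records:
        handle.write(json.dumps(record) + "\n")


def load_jsonl(handle):
    """Read one JSON object per non-empty line."""
    return [json.loads(line) for line in handle if line.strip()]


# Placeholder records for illustration only.
papers = [
    {"title": "Example paper A", "doi": "10.0000/example.a"},
    {"title": "Example paper B", "doi": "10.0000/example.b"},
]

buffer = io.StringIO()  # stands in for a .jsonl file on disk
dump_jsonl(papers, buffer)
buffer.seek(0)
restored = load_jsonl(buffer)

print(restored == papers)
```

Because every line is a complete JSON document, such files can be appended to and read back record by record without loading everything at once.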

What's Changed

Export to json format by @juliusbierk in https://github.com/PhosphorylatedRabbits/paperscraper/pull/19
0.2.1 - Streamline jsonl file saving/loading by @jannisborn in https://github.com/PhosphorylatedRabbits/paperscraper/pull/20

New Contributors

@juliusbierk made their first contribution in https://github.com/PhosphorylatedRabbits/paperscraper/pull/19

Full Changelog: https://github.com/PhosphorylatedRabbits/paperscraper/compare/0.2.0...0.2.1

- Python
Published by jannisborn over 4 years ago

paperscraper - 0.2.0 - Integrate chemRxiv API from Open Engage

What's Changed

0.2.0 - Chemrxiv engage api by @jannisborn in https://github.com/PhosphorylatedRabbits/paperscraper/pull/18
Bring back the support of chemrxiv
Extend functionalities compared to old figshare API (more searchable fields)

Full Changelog: https://github.com/PhosphorylatedRabbits/paperscraper/compare/0.1.1...0.2.0

- Python
Published by jannisborn over 4 years ago

paperscraper - 0.1.1 - Reflect ChemRxiv API shutdown

Release 0.1.1 to reflect ChemRxiv API shutdown

What's Changed

ChemRxiv update by @jannisborn in https://github.com/PhosphorylatedRabbits/paperscraper/pull/16:
- Decided to keep the chemrxiv-related code in the package to ensure backwards compatibility.
- Attempting to download the latest chemrxiv dump (paperscraper.get_dumps.chemrxiv) now raises a ConnectionRefusedError
- Loading the package still tries to find a local chemrxiv dump. If one is available, package behaves as before (i.e., existing local chemrxiv dumps will continue to be fully searchable with all associated functionalities)
- If no chemrxiv dump is available, package silently proceeds (no logging, since this is the new default, closing #13)
- Improved dump loading in case the .jsonl files are empty or faulty (fixes #15)
- README description with details about chemrxiv migration from figshare to Open Engage
- Added badges about download statistics
ci: switch from travis to GA by @jannisborn in https://github.com/PhosphorylatedRabbits/paperscraper/pull/12
- PyPI releases now triggered with releases instead of tags

Full Changelog: https://github.com/PhosphorylatedRabbits/paperscraper/compare/0.1.0...0.1.1

- Python
Published by jannisborn over 4 years ago
