https://github.com/cedergrouphub/limesoup
LimeSoup is a package to parse HTML or XML papers from different publishers.
Statistics
- Stars: 17
- Watchers: 7
- Forks: 6
- Open Issues: 2
- Releases: 0
LimeSoup
LimeSoup is a package to parse HTML or XML papers from different publishers. It can be used to feed a database.
Usage
Full Usage:
```
import json

from LimeSoup import (
    ACSSoup,
    AIPSoup,
    APSSoup,
    ECSSoup,
    ElsevierSoup,
    IOPSoup,
    NatureSoup,
    RSCSoup,
    SpringerSoup,
    WileySoup,
)

# Path to the downloaded HTML or XML article (placeholder; use your own file).
article = 'article.html'

with open(article, 'r', encoding='utf-8') as f:
    html_str = f.read()

# Choose the parser that matches the article's publisher.
data = ECSSoup.parse(html_str)

# Write the parsed article to a JSON file.
with open('filetest.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, sort_keys=True, indent=4, ensure_ascii=False)
```
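To feed a database you will usually want to convert many articles at once. The sketch below is one possible way to do that, not part of LimeSoup itself: `convert_directory` and its parameters are hypothetical names, and the parser is passed in as a plain callable (for example `ECSSoup.parse`) so the helper does not depend on any particular publisher. It assumes the parser output is JSON-serializable, as in the usage example above.

```python
import json
from pathlib import Path
from typing import Any, Callable, List


def convert_directory(
    src_dir: str,
    dst_dir: str,
    parse: Callable[[str], Any],
    pattern: str = "*.html",
) -> List[Path]:
    """Parse every file matching `pattern` in `src_dir`; write one JSON file each.

    Hypothetical helper: `parse` is any callable that takes the article text
    and returns JSON-serializable data, e.g. `ECSSoup.parse`.
    """
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(Path(src_dir).glob(pattern)):
        html_str = path.read_text(encoding="utf-8")
        data = parse(html_str)
        # Name the output after the input file, e.g. paper1.html -> paper1.json
        target = out / (path.stem + ".json")
        with open(target, "w", encoding="utf-8") as f:
            json.dump(data, f, sort_keys=True, indent=4, ensure_ascii=False)
        written.append(target)
    return written
```

With LimeSoup installed, a call such as `convert_directory('papers', 'parsed', ECSSoup.parse)` would process all `.html` files in `papers`; the directory names here are placeholders.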
Currently, we have implemented the following parsers:
- ECS: The Electrochemical Society
- RSC: The Royal Society of Chemistry
- Elsevier
- Nature Publishing Group
- Springer
- Wiley
- ACS: American Chemical Society
- APS: American Physical Society
- IOP Publishing
- AIP: American Institute of Physics
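When the publisher is only known at run time (for instance, read from a metadata field), a small lookup table can select the parser. This is a minimal sketch under stated assumptions: the class names come from the import list in the usage example, while the short labels used as keys, `PARSER_CLASS_NAMES`, `parser_class_name`, and `parse_article` are hypothetical names introduced here and are not part of LimeSoup.

```python
import importlib

# Hypothetical labels mapped to the parser class names imported in the usage example.
PARSER_CLASS_NAMES = {
    "ECS": "ECSSoup",
    "RSC": "RSCSoup",
    "Elsevier": "ElsevierSoup",
    "Nature": "NatureSoup",
    "Springer": "SpringerSoup",
    "Wiley": "WileySoup",
    "ACS": "ACSSoup",
    "APS": "APSSoup",
    "IOP": "IOPSoup",
    "AIP": "AIPSoup",
}


def parser_class_name(publisher: str) -> str:
    """Return the parser class name for a publisher label (case-insensitive)."""
    key = publisher.strip().lower()
    for label, class_name in PARSER_CLASS_NAMES.items():
        if label.lower() == key:
            return class_name
    raise ValueError(f"No LimeSoup parser known for publisher {publisher!r}")


def parse_article(publisher: str, html_str: str):
    """Look up the parser class by label and call its `parse` method.

    LimeSoup is imported lazily so the lookup table can be used without it.
    """
    module = importlib.import_module("LimeSoup")
    parser = getattr(module, parser_class_name(publisher))
    return parser.parse(html_str)
```

For example, `parse_article("RSC", html_str)` would be equivalent to calling `RSCSoup.parse(html_str)` directly.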
Development documentation
Please refer to the wiki pages.
Change logs
Please see change logs.
Credits
LimeSoup was contributed to by these genius people:
- Tiago Botari
- Ziqin Rong
- Vahe Tshitoyan
- Nicolas Mingione
- Jason Madeano
- Haoyan Huo
- Tanjin He
- Zach Jensen
- Alex van Grootel
- Edward Kim
- Haihao Liu
- Zheren Wang
If you are planning to use LimeSoup in your work, please consider citing the following paper:
- Kononova et al., "Text-mined dataset of inorganic materials synthesis recipes", Scientific Data 6 (1), 1-11 (2019). DOI: 10.1038/s41597-019-0224-1
Owner
- Name: Ceder Group
- Login: CederGroupHub
- Kind: organization
- Website: http://ceder.berkeley.edu/
- Repositories: 19
- Profile: https://github.com/CederGroupHub
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Dependencies
- beautifulsoup4 >=4.6.1
- lxml *
- lxml >=4.2.6,<=4.3.5
- old *
- unexpected *