https://github.com/ggnowayback/cathodedataextractor

A document-level information extraction pipeline for layered cathode materials for sodium-ion batteries.

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: rsc.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.1%) to scientific vocabulary

Keywords

battery-information electrochemistry information-extraction materials-science nature-inspired-algorithms nature-language-process relation-extraction synthesis-parameters text-mining
Last synced: 5 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: GGNoWayBack
  • License: mit
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 608 KB
Statistics
  • Stars: 8
  • Watchers: 1
  • Forks: 2
  • Open Issues: 1
  • Releases: 2
Created over 2 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License

README.md

CathodeDataExtractor


CathodeDataExtractor is a lightweight, document-level information extraction pipeline. It automatically extracts synthesis parameters and cycling and rate performance data for cathode materials from the literature on layered cathode materials for sodium-ion batteries.

Installation


pip install cathodedataextractor

Features


  • Built on the open-source libraries pymatgen, text2chem, and ChemDataExtractor v2, with some modifications.
  • A BatterySciBERT-uncased multi-label text classification model for filtering documents.
  • An automated, comprehensive data extraction pipeline for cathode materials.
  • Paragraph multi-class classification algorithms for documents (HTML/XML) from the RSC and Elsevier.
  • A normalised entity handling process.
  • An effective chemical abbreviation detection module.
  • A heuristic multi-level relation extraction algorithm for electrochemical properties.
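To give a rough idea of what abbreviation detection involves, here is a minimal sketch of a first-letter matching heuristic. It is not the package's detection module: the function name `find_abbreviations` and the pattern are assumptions made for illustration, and it only handles all-uppercase abbreviations whose letters match the initials of the preceding words.

```python
import re

# Hypothetical pattern: a parenthesised run of 2 to 6 capital letters, e.g. "(TM)".
ABBR_PATTERN = re.compile(r"\(([A-Z]{2,6})\)")


def find_abbreviations(text):
    """Map each abbreviation to the preceding words whose initials spell it."""
    pairs = {}
    for match in ABBR_PATTERN.finditer(text):
        abbr = match.group(1)
        # Alphabetic words before the parenthesis; hyphens split words,
        # so "sodium-ion" counts as "sodium" and "ion".
        words = re.findall(r"[A-Za-z]+", text[:match.start()])
        candidate = words[-len(abbr):]
        initials = "".join(word[0] for word in candidate).upper()
        # Accept only when there are enough words and the initials match.
        if len(candidate) == len(abbr) and initials == abbr:
            pairs[abbr] = " ".join(candidate)
    return pairs


sample = ("Layered transition metal (TM) oxides are promising cathodes "
          "for sodium-ion batteries (SIB).")
print(find_abbreviations(sample))
```

Because the words are rejoined with spaces, a hyphenated long form such as "sodium-ion batteries" comes back as "sodium ion batteries". Real chemical abbreviations (plurals, digits, mixed case) need considerably more care than this sketch provides.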

The pipeline can also extract from plain text strings.

Quick start


Extract from documents

```python
from glob import iglob
from cathodedataextractor.information_extraction_pipe import Pipeline

pipeline = Pipeline()
# '*ml' matches file names ending in "ml", such as .html and .xml documents.
for document in iglob('*ml'):
    extraction_results = pipeline.extract(document)
```
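The `'*ml'` glob above matches any file name ending in "ml", which also picks up files such as `.toml` or `.yaml` if they sit in the same directory. A minimal sketch of a stricter filter follows. The helper name `find_documents` and the `SUPPORTED_SUFFIXES` set are assumptions for illustration and are not part of the package.

```python
from pathlib import Path

# Assumed set of input formats, based on the HTML/XML documents mentioned above.
SUPPORTED_SUFFIXES = {".html", ".xml"}


def find_documents(root):
    """Return sorted paths of HTML/XML files located directly under root."""
    return sorted(
        str(path)
        for path in Path(root).iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )
```

Each returned path can then be passed to the pipeline's `extract` call in place of the glob results.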

Extract from string

```python
from cathodedataextractor.information_extraction_pipe import Pipeline

extraction_results = Pipeline.from_string(
    'Apart from the conventional cationic redox of transition metals, '
    'both Na-deficit and Na-excess materials have showcased the ability '
    'to exploit oxygen redox activity as O2–/O2n– for a charge '
    'compensation mechanism. To realize cathodes with enhanced energy '
    'density, a technique like the incorporation of alkali metal ions '
    'into transition metal layers has been adopted. Recent work by Boisse '
    '(13) et al. displayed the impact of honeycomb cation ordering of '
    'a highly stabilized intermediate phase for a Na2RuO3 cathode material '
    'in instigating the anionic redox activity and providing a capacity '
    'of 180 mAh g–1 at 0.2C with a capacity retention of 89% for over '
    '50 cycles. More devoted efforts to realize the utmost potential '
    'of anionic redox ought to be carried out in the future.')
```
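To show the kind of values the sample text above contains, here is a minimal regex sketch that pulls the capacity, rate, retention, and cycle count out of its performance sentence. This is not the package's relation extraction algorithm or its output format: the `PATTERNS` table, the field names, and `extract_properties` are assumptions made for illustration.

```python
import re

# The performance clause from the sample text above.
SENTENCE = ("providing a capacity of 180 mAh g–1 at 0.2C with a capacity "
            "retention of 89% for over 50 cycles.")

# Hypothetical field names, each mapped to a pattern with one numeric group.
PATTERNS = {
    "capacity_mAh_g": re.compile(r"(\d+(?:\.\d+)?)\s*mAh\s*g"),
    "c_rate": re.compile(r"(\d+(?:\.\d+)?)\s*C\b"),
    "retention_percent": re.compile(r"retention of (\d+(?:\.\d+)?)\s*%"),
    "cycles": re.compile(r"(\d+)\s*cycles"),
}


def extract_properties(sentence):
    """Return the first numeric match of each pattern as a float."""
    record = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(sentence)
        if match:
            record[name] = float(match.group(1))
    return record


print(extract_properties(SENTENCE))
```

A flat pattern table like this only works when a sentence describes a single material and a single test condition. Linking values to the right material across a whole document is the harder problem the pipeline addresses.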

Issues?


You can either report an issue on GitHub or contact me directly at gouyx@mail2.sysu.edu.cn.

Citing


If the source code turns out to be helpful to your research, please cite the following work:

Gou, Y., Zhang, Y., Zhu, J. et al. A document-level information extraction pipeline for layered cathode materials for sodium-ion batteries. Sci Data 11, 372 (2024).

Owner

  • Login: GGNoWayBack
  • Kind: user

GitHub Events

Total
  • Issues event: 1
  • Watch event: 4
  • Issue comment event: 1
  • Fork event: 1
Last Year
  • Issues event: 1
  • Watch event: 4
  • Issue comment event: 1
  • Fork event: 1

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 26
  • Total Committers: 2
  • Avg Commits per committer: 13.0
  • Development Distribution Score (DDS): 0.192
Past Year
  • Commits: 26
  • Committers: 2
  • Avg Commits per committer: 13.0
  • Development Distribution Score (DDS): 0.192
Top Committers
| Name | Email | Commits |
|------|-------|---------|
| GGNoWayBack | 2****9@q****m | 21 |
| GGNoWayBack | 9****k | 5 |
Committer Domains (Top 20 + Academic)
qq.com: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 1
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 2 minutes
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • GGNoWayBack (1)
Pull Request Authors
  • GGNoWayBack (2)
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 9 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 4
  • Total maintainers: 1
pypi.org: cathodedataextractor

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 9 Last month
Rankings
Dependent packages count: 9.9%
Average: 37.5%
Dependent repos count: 65.2%
Maintainers (1)
Last synced: 6 months ago