duecredit

Automated collection and reporting of citations for used software/methods/datasets

https://github.com/duecredit/duecredit

Science Score: 59.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
    3 of 22 committers (13.6%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.6%) to scientific vocabulary

Keywords from Contributors

closember neuroimaging brain-imaging fmri neuroscience brainweb usable git-annex data-storage mvpa
Last synced: 6 months ago

Repository

Automated collection and reporting of citations for used software/methods/datasets

Basic Info
  • Host: GitHub
  • Owner: duecredit
  • License: other
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 1.07 MB
Statistics
  • Stars: 239
  • Watchers: 7
  • Forks: 28
  • Open Issues: 56
  • Releases: 9
Created almost 11 years ago · Last pushed 7 months ago
Metadata Files
Readme Changelog Contributing License Zenodo

README.md

duecredit


duecredit is being conceived to address the problem of inadequate citation of scientific software and methods, and limited visibility of donation requests for open-source software.

It provides a simple framework (at the moment for Python only) to embed publication or other references in the original code so they are automatically collected and reported to the user at the necessary level of reference detail, i.e. only references for actually used functionality will be presented back if software provides multiple citeable implementations.
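The "only references for actually used functionality" idea can be pictured with a toy collector that records a reference only when the decorated function is actually called. This is a minimal sketch for illustration, not duecredit's implementation; `SketchCollector`, `method_a`, `method_b`, and the reference strings are hypothetical names.

```python
import functools


class SketchCollector:
    """Toy collector (hypothetical): records a reference when a decorated function runs."""

    def __init__(self):
        self.used = {}  # maps reference -> description, filled in lazily

    def dcite(self, reference, description=""):
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                # Record the citation only at call time, not at definition time.
                self.used.setdefault(reference, description)
                return func(*args, **kwargs)
            return wrapper
        return decorator


collector = SketchCollector()


@collector.dcite("doi:1.2.3/x.y.z", description="method A")
def method_a(x):
    return x + 1


@collector.dcite("doi:4.5.6/u.v.w", description="method B")
def method_b(x):
    return x * 2


method_a(1)  # only method_a is used in this run
print(collector.used)
```

Because `method_b` is never called, its reference is not collected, which mirrors the behaviour described above for software that offers several citeable implementations.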

Installation

duecredit is easy to install via pip. Simply type:

pip install duecredit

Examples

To cite the modules and methods you are using

You can already start "registering" citations using duecredit in your Python modules and even registering citations (we call this approach "injections") for modules that do not (yet) use duecredit. duecredit will remain an optional dependency, i.e. your software will work correctly even without duecredit installed.

For example, list citations of the modules and methods yourproject uses with a few simple commands:

    cd /path/to/yourmodule  # for ~/yourproject
    cd yourproject  # change directory into where the main code base is
    python -m duecredit yourproject.py

Or you can also display them in BibTeX format, using:

    duecredit summary --format=bibtex

An animated GIF in the original README illustrates these steps.

To let others cite your software

For using duecredit in your software

  1. Copy duecredit/stub.py to your codebase, e.g.

    wget -q -O /path/tomodule/yourmodule/due.py \
      https://raw.githubusercontent.com/duecredit/duecredit/master/duecredit/stub.py
    

    Note that it might be better to avoid naming it duecredit.py to avoid shadowing installed duecredit.

  2. Then import due and the entry types you need in your code:

    from .due import due, Doi, BibTeX
    

    To provide a generic reference for the entire module just use e.g.

     due.cite(Doi("1.2.3/x.y.z"), description="Solves all your problems", path="magicpy")
    

    By default, the added reference does not show up in the summary report (but see the User-view section below). If your reference is to a core package and you think it should be listed in the summary, set cite_module=True (see here for a complete description of the arguments):

     due.cite(Doi("1.2.3/x.y.z"), description="The Answer to Everything", path="magicpy", cite_module=True)
    

    Similarly, to provide a direct reference for a function or a method, use the dcite decorator (by default this decorator sets cite_module=True)

     @due.dcite(Doi("1.2.3/x.y.z"), description="Resolves constipation issue")
     def pushit():
         ...
    

    You can easily obtain a DOI for your software using Zenodo.org and a few other DOI providers.

References can also be entered as BibTeX entries

    due.cite(BibTeX("""
            @article{mynicearticle,
            title={A very cool paper},
            author={Happy, Author and Lucky, Author},
            journal={The Journal of Serendipitous Discoveries}
            }
            """),
            description="Solves all your problems", path="magicpy")
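The optional-dependency behaviour described above can be sketched as an import with a no-op fallback. This is a simplified illustration in the spirit of the stub, not the contents of duecredit/stub.py; `NoOpCollector` and `_no_entry` are hypothetical names introduced here.

```python
class NoOpCollector:
    """Hypothetical fallback: accepts the same calls but does nothing."""

    active = False

    def cite(self, *args, **kwargs):
        return None

    def dcite(self, *args, **kwargs):
        # Return a decorator that hands the function back unchanged.
        def decorator(func):
            return func
        return decorator


def _no_entry(*args, **kwargs):
    """Hypothetical placeholder for Doi/BibTeX when duecredit is absent."""
    return None


try:
    from duecredit import due, Doi, BibTeX
except ImportError:
    # duecredit is not installed: fall back to objects that do nothing.
    due, Doi, BibTeX = NoOpCollector(), _no_entry, _no_entry


@due.dcite(Doi("1.2.3/x.y.z"), description="Example method")
def pushit():
    return "done"


print(pushit())
```

Either way the decorated function behaves the same for the caller, so the citation annotations stay in the code without making duecredit a hard requirement.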

Now what

Do the due

Once you have obtained the references in the duecredit output, include them in the references section of your paper or software.

Add injections for other existing modules

We hope that eventually this somewhat cruel approach will not be necessary. But until other packages support duecredit "natively", we provide a way to "inject" citations for modules, functions, and methods: the citations are attached to the corresponding functionality when those modules are imported.

All injections are collected under duecredit/injections. See any file there with the mod_ prefix for a complete example. Overall, each one is just a regular Python module defining a function inject(injector), which adds new entries to the injector; the injector in turn registers those entries with duecredit whenever the corresponding module gets imported.
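The shape of such an injection module can be sketched as follows. The `inject(injector)` entry point comes from the description above; the `add` signature, the `RecordingInjector` stand-in, and the `magicpy` names are assumptions made for illustration, so check the files under duecredit/injections for the real calls.

```python
class RecordingInjector:
    """Hypothetical stand-in for the real injector: it only records what is added."""

    def __init__(self):
        self.entries = []

    def add(self, modulename, obj, entry, **kwargs):
        # Assumed signature: target module, optional object name, reference, extras.
        self.entries.append((modulename, obj, entry, kwargs))


def inject(injector):
    """Register citations for a hypothetical third-party package 'magicpy'."""
    # A reference for the package as a whole (no specific object).
    injector.add("magicpy", None, "doi:1.2.3/x.y.z",
                 description="Solves all your problems")
    # A reference for one specific function inside a submodule.
    injector.add("magicpy.core", "pushit", "doi:1.2.3/x.y.z",
                 description="Example method")


injector = RecordingInjector()
inject(injector)
for modulename, obj, entry, kwargs in injector.entries:
    print(modulename, obj, entry, kwargs["description"])
```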

User-view

By default duecredit does exactly nothing: decorators do not decorate and cite functions simply return, so there should be no fear that it would break anything. Collection happens only when someone runs an analysis that uses your code with the DUECREDIT_ENABLE=yes environment variable set (or via python -m duecredit) and invokes any of the cited functions or methods. At the end of such a run, the collected bibliography is printed to the screen and pickled into the .duecredit.p file in the current directory, or into the file named by your DUECREDIT_FILE environment setting:

$> python -m duecredit examples/example_scipy.py
I: Simulating 4 blobs
I: Done clustering 4 blobs

DueCredit Report:
- Scientific tools library / numpy (v 1.10.4) [1]
- Scientific tools library / scipy (v 0.14) [2]
  - Single linkage hierarchical clustering / scipy.cluster.hierarchy:linkage (v 0.14) [3]

2 packages cited
0 modules cited
1 function cited

References
----------

[1] Van Der Walt, S., Colbert, S.C. & Varoquaux, G., 2011. The NumPy array: a structure for efficient numerical computation. Computing in Science & Engineering, 13(2), pp.22–30.
[2] Jones, E. et al., 2001. SciPy: Open source scientific tools for Python.
[3] Sibson, R., 1973. SLINK: an optimally efficient algorithm for the single-link cluster method. The Computer Journal, 16(1), pp.30–34.

Incremental runs of various software would keep enriching that file. Then you can use duecredit summary command to show that information again (stored in .duecredit.p file) or export it as a BibTeX file ready for reuse, e.g.:

$> duecredit summary --format=bibtex
@article{van2011numpy,
        title={The NumPy array: a structure for efficient numerical computation},
        author={Van Der Walt, Stefan and Colbert, S Chris and Varoquaux, Gael},
        journal={Computing in Science \& Engineering},
        volume={13},
        number={2},
        pages={22--30},
        year={2011},
        publisher={AIP Publishing}
        }
@Misc{JOP+01,
      author =    {Eric Jones and Travis Oliphant and Pearu Peterson and others},
      title =     {{SciPy}: Open source scientific tools for {Python}},
      year =      {2001--},
      url = "http://www.scipy.org/",
      note = {[Online; accessed 2015-07-13]}
    }
@article{sibson1973slink,
        title={SLINK: an optimally efficient algorithm for the single-link cluster method},
        author={Sibson, Robin},
        journal={The Computer Journal},
        volume={16},
        number={1},
        pages={30--34},
        year={1973},
        publisher={Br Computer Soc}
    }

By default only references tagged "implementation" are listed, but listing of references for other tags can be enabled as well (e.g. "edu", which marks instructional materials such as textbooks on the topic):

$> DUECREDIT_REPORT_TAGS=* duecredit summary

DueCredit Report:
- Scientific tools library / numpy (v 1.10.4) [1]
- Scientific tools library / scipy (v 0.14) [2]
  - Hierarchical clustering / scipy.cluster.hierarchy (v 0.14) [3, 4, 5, 6, 7, 8, 9]
  - Single linkage hierarchical clustering / scipy.cluster.hierarchy:linkage (v 0.14) [10, 11]

2 packages cited
1 module cited
1 function cited

References
----------

[1] Van Der Walt, S., Colbert, S.C. & Varoquaux, G., 2011. The NumPy array: a structure for efficient numerical computation. Computing in Science & Engineering, 13(2), pp.22–30.
[2] Jones, E. et al., 2001. SciPy: Open source scientific tools for Python.
[3] Sneath, P.H. & Sokal, R.R., 1962. Numerical taxonomy. Nature, 193(4818), pp.855–860.
[4] Batagelj, V. & Bren, M., 1995. Comparing resemblance measures. Journal of classification, 12(1), pp.73–90.
[5] Sokal, R.R., Michener, C.D. & University of Kansas, 1958. A Statistical Method for Evaluating Systematic Relationships, University of Kansas.
[6] Jain, A.K. & Dubes, R.C., 1988. Algorithms for clustering data, Prentice-Hall, Inc..
[7] Johnson, S.C., 1967. Hierarchical clustering schemes. Psychometrika, 32(3), pp.241–254.
[8] Edelbrock, C., 1979. Mixture model tests of hierarchical clustering algorithms: the problem of classifying everybody. Multivariate Behavioral Research, 14(3), pp.367–384.
[9] Fisher, R.A., 1936. The use of multiple measurements in taxonomic problems. Annals of eugenics, 7(2), pp.179–188.
[10] Gower, J.C. & Ross, G., 1969. Minimum spanning trees and single linkage cluster analysis. Applied statistics, pp.54–64.
[11] Sibson, R., 1973. SLINK: an optimally efficient algorithm for the single-link cluster method. The Computer Journal, 16(1), pp.30–34.

The DUECREDIT_REPORT_ALL flag allows one to output all the references for the modules that lack objects or functions with citations. Compared to the previous example, the following output additionally shows a reference for scikit-learn since example_scipy.py uses an uncited function from that package.

$> DUECREDIT_REPORT_TAGS=* DUECREDIT_REPORT_ALL=1 duecredit summary

DueCredit Report:
- Scientific tools library / numpy (v 1.10.4) [1]
- Scientific tools library / scipy (v 0.14) [2]
  - Hierarchical clustering / scipy.cluster.hierarchy (v 0.14) [3, 4, 5, 6, 7, 8, 9]
  - Single linkage hierarchical clustering / scipy.cluster.hierarchy:linkage (v 0.14) [10, 11]
- Machine Learning library / sklearn (v 0.15.2) [12]

3 packages cited
1 module cited
1 function cited

References
----------

[1] Van Der Walt, S., Colbert, S.C. & Varoquaux, G., 2011. The NumPy array: a structure for efficient numerical computation. Computing in Science & Engineering, 13(2), pp.22–30.
[2] Jones, E. et al., 2001. SciPy: Open source scientific tools for Python.
[3] Sneath, P.H. & Sokal, R.R., 1962. Numerical taxonomy. Nature, 193(4818), pp.855–860.
...
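When an analysis is launched from Python rather than from a shell, the environment variables used in this section can be assembled in one place. The helper below is a sketch; `duecredit_env` and its parameters are hypothetical names, and only the four variable names are taken from the text above.

```python
import os


def duecredit_env(base=None, report_file=None, report_tags=None, report_all=False):
    """Return a copy of an environment with duecredit collection switched on."""
    env = dict(os.environ if base is None else base)
    env["DUECREDIT_ENABLE"] = "yes"
    if report_file is not None:
        env["DUECREDIT_FILE"] = report_file  # where the pickled report is stored
    if report_tags is not None:
        env["DUECREDIT_REPORT_TAGS"] = report_tags  # e.g. "*" to list every tag
    if report_all:
        env["DUECREDIT_REPORT_ALL"] = "1"
    return env


env = duecredit_env(base={}, report_file="my_run.p", report_tags="*", report_all=True)
for name in sorted(env):
    print(f"{name}={env[name]}")

# The result can then be passed to a child process, for example:
#   subprocess.run([sys.executable, "analysis.py"], env=duecredit_env())
# where analysis.py is a placeholder for your own script.
```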

Tags

You are welcome to introduce new tags specific to your citations, but we hope that, for consistency across projects, you will use the following tags:

  • implementation (default) — an implementation of the cited method
  • reference-implementation — the original implementation (ideally by the authors of the paper) of the cited method
  • another-implementation — some other implementation of the method, e.g. if you would like to provide a citation for another implementation of the method you have implemented in your code and for which you have already provided implementation or reference-implementation tag
  • use — publications demonstrating a noteworthy use of the method
  • edu — tutorials, textbooks and other materials useful to learn more about cited functionality
  • donate — should be commonly used with URL entries to point to the websites describing how to contribute some funds to the referenced project
  • funding — to point to the sources of funding which provided support for a given functionality implementation and/or method development
  • dataset - for datasets
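How tags narrow down a report can be illustrated with a small filter in which "*" selects everything, as in the DUECREDIT_REPORT_TAGS=* example earlier. This is a conceptual sketch only, not duecredit's reporting code; `select_by_tags` and the sample records are hypothetical.

```python
def select_by_tags(citations, report_tags=("implementation",)):
    """Keep citations carrying at least one requested tag; '*' keeps all of them."""
    if "*" in report_tags:
        return list(citations)
    wanted = set(report_tags)
    return [c for c in citations if wanted & set(c["tags"])]


# Hypothetical collected records, each with a key and its tags.
citations = [
    {"ref": "paper-a", "tags": ["implementation"]},
    {"ref": "textbook-b", "tags": ["edu"]},
    {"ref": "donation-page-c", "tags": ["donate"]},
]

print([c["ref"] for c in select_by_tags(citations)])
print([c["ref"] for c in select_by_tags(citations, ("edu", "donate"))])
print([c["ref"] for c in select_by_tags(citations, ("*",))])
```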

Ultimate goals

Reduce demand for prima ballerina projects

Problem: Scientific software is often developed so that its use earns citations for the original publication describing the method it implements. Unfortunately, this established practice discourages contributions to existing projects and encourages developing new projects from scratch.

Solution: With easy ways to provide all-and-only relevant references for used functionality within a large(r) framework, scientific developers will prefer to contribute to already existing projects.

Benefits: As a result, scientific developers will immediately benefit from adhering to proper development procedures (codebase structuring, testing, etc.) and from the delivery and deployment channels that existing projects have already established. This will increase the efficiency and standardization of scientific software development, thus addressing many (if not all) of the core problems with scientific software development that everyone likes to bash (reproducibility, longevity, etc.).

Adequately reference core libraries

Problem: Scientific software often, if not always, uses 3rd party libraries (e.g., NumPy, SciPy, atlas) which might not even be visible at the user level. Therefore they are rarely referenced in the publications despite providing the fundamental core for solving a scientific problem at hand.

Solution: With automated bibliography compilation for all used libraries, such projects and their authors would get a chance to receive adequate citability.

Benefits: Adequate appreciation of scientific software development. Coupled with a solution for the "prima ballerina" problem, more contributions will flow into the core/foundational projects, making new methodological developments readily available to even wider audiences without a proliferation of low-quality scientific software.

Similar/related projects

sempervirens -- an experimental prototype for gathering anonymous, opt-in usage data for open scientific software. Eventually, in duecredit we aim either to provide similar functionality (since we are collecting such information as well) or just interface/report to sempervirens.

citepy -- Easily cite software libraries using information automatically gathered from their package repository.

Currently used by

This is a running list of projects that use DueCredit natively. If you are using DueCredit, or plan to use it, please consider sending a pull request to add your project to this list. Thanks to @fedorov for the idea.

Last updated 2024-02-23.

Owner

  • Name: DueCredit
  • Login: duecredit
  • Kind: organization

GitHub Events

Total
  • Issues event: 5
  • Watch event: 4
  • Delete event: 3
  • Issue comment event: 44
  • Push event: 26
  • Pull request review comment event: 7
  • Pull request review event: 26
  • Pull request event: 48
  • Fork event: 1
  • Create event: 1
Last Year
  • Issues event: 5
  • Watch event: 4
  • Delete event: 3
  • Issue comment event: 44
  • Push event: 26
  • Pull request review comment event: 7
  • Pull request review event: 26
  • Pull request event: 48
  • Fork event: 1
  • Create event: 1

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 479
  • Total Committers: 22
  • Avg Commits per committer: 21.773
  • Development Distribution Score (DDS): 0.37
Top Committers
Name Email Commits
Yaroslav Halchenko d****n@o****m 302
Matteo Visconti dOC m****r@d****u 126
John T. Wodder II g****t@v****g 13
Michał Szczepanik m****k@n****l 7
Jason Gors j****k@g****m 7
Chris Barnes b****c@j****g 3
Emily Irvine e****e@g****m 3
Pradeep Reddy Raamana p****a@r****g 2
auto a****o@n****l 2
Jakub Wilk j****k@j****t 2
Chris Markiewicz e****s@g****m 1
AdityaSavara 3****a@u****m 1
Benjamin Drung b****g@c****m 1
Chris Markiewicz m****z@s****u 1
Katrin Leinweber k****i@p****e 1
Loïc Estève l****e@y****m 1
Oliver Beckstein o****t@g****m 1
Fernando Pérez-García f****r@g****m 1
Pradeep Reddy Raamana r****a@g****m 1
sanjay-cpu s****2@g****m 1
David Volgyes d****s@i****g 1
Omer Faruk Gulban o****n@u****m 1

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 53
  • Total pull requests: 106
  • Average time to close issues: 7 months
  • Average time to close pull requests: 13 days
  • Total issue authors: 28
  • Total pull request authors: 23
  • Average comments per issue: 2.3
  • Average comments per pull request: 2.81
  • Merged pull requests: 96
  • Bot issues: 0
  • Bot pull requests: 6
Past Year
  • Issues: 3
  • Pull requests: 24
  • Average time to close issues: about 21 hours
  • Average time to close pull requests: 9 days
  • Issue authors: 3
  • Pull request authors: 3
  • Average comments per issue: 0.67
  • Average comments per pull request: 1.58
  • Merged pull requests: 21
  • Bot issues: 0
  • Bot pull requests: 1
Top Authors
Issue Authors
  • yarikoptic (20)
  • mvdoc (4)
  • clbarnes (2)
  • ihincks (2)
  • marcelzwiers (2)
  • pckroon (1)
  • RMeli (1)
  • ayushsuhane (1)
  • mtb-za (1)
  • hjmjohnson (1)
  • emdupre (1)
  • janosh (1)
  • richardjgowers (1)
  • jbwexler (1)
  • lvotapka (1)
Pull Request Authors
  • DimitriPapadopoulos (46)
  • yarikoptic (37)
  • jwodder (13)
  • dependabot[bot] (9)
  • a-detiste (7)
  • marcelzwiers (4)
  • clbarnes (3)
  • mvdoc (2)
  • mikemhenry (2)
  • effigies (2)
  • orbeckst (1)
  • sanjaymsh (1)
  • raamana (1)
  • hroncok (1)
  • bdrung (1)
Top Labels
Issue Labels
good-for-hackathon (8) enhancement (4) question (3) bug (1) enhancement-for-developers (1) tests (1)
Pull Request Labels
internal (29) patch (10) tests (9) release (8) major (5) minor (2) deployment (2) documentation (1)

Packages

  • Total packages: 12
  • Total downloads:
    • pypi 13,986 last-month
  • Total docker downloads: 466
  • Total dependent packages: 29
    (may contain duplicates)
  • Total dependent repositories: 70
    (may contain duplicates)
  • Total versions: 52
  • Total maintainers: 3
pypi.org: duecredit

Publications (and donations) tracer

  • Versions: 33
  • Dependent Packages: 27
  • Dependent Repositories: 70
  • Downloads: 13,986 Last month
  • Docker Downloads: 466
Rankings
Dependent packages count: 0.5%
Docker downloads count: 1.2%
Dependent repos count: 1.8%
Average: 3.1%
Downloads: 3.2%
Stargazers count: 4.4%
Forks count: 7.6%
Maintainers (1)
Last synced: 7 months ago
spack.io: py-duecredit

Publications (and donations) tracer.

  • Versions: 3
  • Dependent Packages: 2
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Average: 13.1%
Stargazers count: 13.7%
Dependent packages count: 19.0%
Forks count: 19.9%
Maintainers (1)
Last synced: about 1 year ago
alpine-edge: py3-duecredit-pyc

Precompiled Python bytecode for py3-duecredit

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Average: 13.4%
Dependent packages count: 14.3%
Stargazers count: 17.1%
Forks count: 22.1%
Maintainers (1)
Last synced: 7 months ago
alpine-edge: py3-duecredit

Automated collection and reporting of citations for used software/methods/datasets

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Average: 13.4%
Dependent packages count: 14.3%
Stargazers count: 17.1%
Forks count: 22.1%
Maintainers (1)
Last synced: 7 months ago
alpine-v3.19: py3-duecredit-pyc

Precompiled Python bytecode for py3-duecredit

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 7 months ago
alpine-v3.19: py3-duecredit

Automated collection and reporting of citations for used software/methods/datasets

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 7 months ago
alpine-v3.20: py3-duecredit

Automated collection and reporting of citations for used software/methods/datasets

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 7 months ago
alpine-v3.21: py3-duecredit

Automated collection and reporting of citations for used software/methods/datasets

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 7 months ago
alpine-v3.22: py3-duecredit

Automated collection and reporting of citations for used software/methods/datasets

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 7 months ago
alpine-v3.20: py3-duecredit-pyc

Precompiled Python bytecode for py3-duecredit

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 7 months ago
alpine-v3.21: py3-duecredit-pyc

Precompiled Python bytecode for py3-duecredit

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 7 months ago
alpine-v3.22: py3-duecredit-pyc

Precompiled Python bytecode for py3-duecredit

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 7 months ago