https://github.com/chendaniely/pyprojroot

Finding project directories in Python (data science) projects, just like in R rprojroot and here packages


Science Score: 33.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: plos.org
  • Committers with academic emails
    1 of 15 committers (6.7%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.1%) to scientific vocabulary

Keywords from Contributors

bioinformatics
Last synced: 10 months ago

Repository

Finding project directories in Python (data science) projects, just like in R rprojroot and here packages

Basic Info
  • Host: GitHub
  • Owner: chendaniely
  • License: mit
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 63.5 KB
Statistics
  • Stars: 151
  • Watchers: 7
  • Forks: 16
  • Open Issues: 30
  • Releases: 0
Created about 7 years ago · Last pushed over 3 years ago
Metadata Files
Readme License

README.md

Project-oriented workflow in Python

Finding project directories in Python (data science) projects.

This library aims to provide both the programmatic functionality from the R rprojroot package and the interactive functionality from the R here package.

Motivation

Problem: I have a project that has a specific folder structure, for example, one mentioned in Noble 2009 or something similar to this project template, and I want to be able to:

  1. Run my Python scripts without having to specify a series of ../ to get to the data folder.
  2. cd into the directory of my Python script instead of calling it from the project root directory and spelling out every folder on the way to the script.
  3. Reference datasets from a root directory when using a Jupyter notebook, because every time I use a Jupyter notebook, the working directory changes to the location of the notebook, not where I launched the notebook server.

Solution: pyprojroot finds the root working directory for your project as a pathlib.Path object. You can then pass the here function a path relative to the project root, and it returns the full path to that file no matter which directory inside the project you are working from. For example, in a Jupyter notebook you can write pandas.read_csv(here('data/my_data.csv')) instead of pandas.read_csv('../data/my_data.csv'). This lets you move scripts and notebooks around within your project without having to change the file paths inside them.

Great for reading and writing datasets!
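To make the problem concrete, here is a minimal sketch of why a ../ path is fragile while a root-anchored path is not. It uses only the standard library and pure path arithmetic (nothing touches the file system); the /project layout and the resolve_relative helper are hypothetical, chosen for illustration.

```python
import posixpath
from pathlib import PurePosixPath


def resolve_relative(cwd: str, relative: str) -> PurePosixPath:
    """Resolve `relative` as if the process were running inside `cwd`."""
    joined = PurePosixPath(cwd) / relative
    # normpath collapses the ".." segments the same way the OS would
    # (ignoring symlinks, which this sketch does not model).
    return PurePosixPath(posixpath.normpath(str(joined)))


# Hypothetical project layout: /project/data and /project/notebooks.
project_root = PurePosixPath("/project")

# The same relative string points at different files depending on where we are.
from_notebooks = resolve_relative("/project/notebooks", "../data/my_data.csv")
from_drafts = resolve_relative("/project/notebooks/drafts", "../data/my_data.csv")

# A path anchored at the project root is the same from anywhere.
anchored = project_root / "data/my_data.csv"
```

Here from_notebooks agrees with anchored, but from_drafts silently points into /project/notebooks/data instead, which is the kind of breakage an anchored path avoids.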

Further reading:

Installation

pip

```bash
python -m pip install pyprojroot
```

conda

https://anaconda.org/conda-forge/pyprojroot

```bash
conda install -c conda-forge pyprojroot
```

Example Usage

pyprojroot looks for certain files like .here or .git to identify the here directory. To make any of the following examples work, you'll need one of those files in the current directory or one of its parents. (For the complete list of files, see here.py.)
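The lookup described above can be sketched roughly as follows. This is not the library's implementation, only an illustration of the idea under stated assumptions: the find_project_root name is hypothetical, and the MARKERS tuple contains just the two marker files named in the text rather than the complete list from here.py.

```python
from pathlib import Path

# Only the two markers mentioned above; the real list lives in here.py.
MARKERS = (".here", ".git")


def find_project_root(start: Path, markers=MARKERS) -> Path:
    """Return the closest directory at or above `start` that holds a marker."""
    start = start.resolve()
    # Check the starting directory first, then each parent up to the
    # file system root.
    for candidate in (start, *start.parents):
        if any((candidate / name).exists() for name in markers):
            return candidate
    raise RuntimeError(f"No marker from {markers} found at or above {start}")
```

Calling find_project_root(Path.cwd()) from any subfolder of a project that has a .git directory or a .here file at its top level would return that top-level directory.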

Interactive

This is based on the R here library.

```python
from pyprojroot.here import here

here()
```

Programmatic

This is based on the R rprojroot library.

```python
import pyprojroot

base_path = pyprojroot.find_root(pyprojroot.has_dir(".git"))
```
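One way to think about the programmatic style is that a criterion is just a predicate on a directory, which makes criteria easy to combine. The sketch below illustrates that design with the standard library only. All names ending in _sketch, and the any_of helper, are hypothetical and are not part of pyprojroot's API.

```python
from pathlib import Path
from typing import Callable, Optional

# A criterion answers one question: "is this directory the project root?"
Criterion = Callable[[Path], bool]


def has_dir_sketch(name: str) -> Criterion:
    """Criterion that matches directories containing a subdirectory `name`."""
    return lambda directory: (directory / name).is_dir()


def has_file_sketch(name: str) -> Criterion:
    """Criterion that matches directories containing a file `name`."""
    return lambda directory: (directory / name).is_file()


def any_of(*criteria: Criterion) -> Criterion:
    """Combine criteria: match when at least one of them matches."""
    return lambda directory: any(check(directory) for check in criteria)


def find_root_sketch(criterion: Criterion, start: Optional[Path] = None) -> Path:
    """Walk upward from `start` (default: cwd) until `criterion` matches."""
    origin = (Path(start) if start is not None else Path.cwd()).resolve()
    for candidate in (origin, *origin.parents):
        if criterion(candidate):
            return candidate
    raise RuntimeError(f"Project root not found at or above {origin}")
```

With this shape, a call such as find_root_sketch(any_of(has_dir_sketch(".git"), has_file_sketch("pyproject.toml"))) reads much like the one-liner above while letting the caller decide what counts as a project root.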

Demonstration

Load the packages

```python
In [1]: from pyprojroot.here import here
In [2]: import pandas as pd
```

The current working directory is the "notebooks" folder

```python
In [3]: !pwd
/home/dchen/git/hub/scipy-2019-pandas/notebooks
```

In the notebooks folder, I have all my notebooks

```python
In [4]: !ls
01-intro.ipynb  02-tidy.ipynb  03-apply.ipynb  04-plots.ipynb  05-model.ipynb  Untitled.ipynb
```

If I wanted to access data in my notebooks I'd have to use ../data

```python
In [5]: !ls ../data
billboard.csv  country_timeseries.csv  gapminder.tsv  pew.csv  table1.csv
table2.csv  table3.csv  table4a.csv  table4b.csv  weather.csv
```

However, with the here function, I can access all my data from the project root. This means that if I move the notebook to another folder or subfolder, I don't have to change the path to my data. Only if I move the data to another folder would I need to change the path in my notebook (or script).

```python
In [6]: pd.read_csv(here('data/gapminder.tsv'), sep='\t').head()
Out[6]:
       country continent  year  lifeExp       pop   gdpPercap
0  Afghanistan      Asia  1952   28.801   8425333  779.445314
1  Afghanistan      Asia  1957   30.332   9240934  820.853030
2  Afghanistan      Asia  1962   31.997  10267083  853.100710
3  Afghanistan      Asia  1967   34.020  11537966  836.197138
4  Afghanistan      Asia  1972   36.088  13079460  739.981106
```

By the way, you get a pathlib.Path object back!

```python
In [7]: here('data/gapminder.tsv')
Out[7]: PosixPath('/home/dchen/git/hub/scipy-2019-pandas/data/gapminder.tsv')
```
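Getting a path object rather than a string is convenient because the usual pathlib operations are available on the result. The short sketch below shows a few of them on the path from the demonstration. It uses PurePosixPath so that it behaves the same on any operating system and does not need the files to exist; the variable names are illustrative only.

```python
from pathlib import PurePosixPath

# Stand-in for the value returned in Out[7] above.
data_file = PurePosixPath("/home/dchen/git/hub/scipy-2019-pandas/data/gapminder.tsv")

file_name = data_file.name      # final path component
extension = data_file.suffix    # file extension, including the dot
data_dir = data_file.parent     # the containing "data" directory

# The / operator joins path segments, so sibling files are easy to reach.
sibling = data_dir / "pew.csv"
```

Because the result is a regular path object, it can also be passed directly to functions such as pandas.read_csv, as the demonstration does.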

Owner

  • Name: Daniel Chen
  • Login: chendaniely
  • Kind: user
  • Location: JFK -> DCA -> ROA -> JFK -> YVR
  • Company: @rstudio @UBC-DSCI @UBC-MDS

bow ties are cool

GitHub Events

Total
  • Watch event: 11
  • Fork event: 1
Last Year
  • Watch event: 11
  • Fork event: 1

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 57
  • Total Committers: 15
  • Avg Commits per committer: 3.8
  • Development Distribution Score (DDS): 0.632
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Daniel Chen c****y@g****m 21
Eric Ma e****g@g****m 8
Achim Gädke a****e 4
Tranquilize m****l@g****m 4
Yashika Khurana y****a@g****m 3
James Myatt j****t@c****m 3
Achim Gädke a****m@m****m 3
Carol Willing c****e@w****m 2
Jason Brackman j****n 2
Jason McLaren j****n@f****a 2
Agustina p****a@g****m 1
Joseph Egan j****b@g****m 1
S.J. van Rijn s****n@l****l 1
Stefan Lehmann s****m@p****e 1
James Myatt m****j@t****k 1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 41
  • Total pull requests: 30
  • Average time to close issues: 29 days
  • Average time to close pull requests: 2 months
  • Total issue authors: 15
  • Total pull request authors: 13
  • Average comments per issue: 1.71
  • Average comments per pull request: 1.2
  • Merged pull requests: 26
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • chendaniely (21)
  • jamesmyatt (4)
  • majidaldo (2)
  • willingc (2)
  • jasonbrackman (2)
  • rorynolan (1)
  • achimgaedke (1)
  • tbsexton (1)
  • buhtz (1)
  • sjvrijn (1)
  • NickleDave (1)
  • jasonmclaren (1)
  • r0f1 (1)
  • yashikakhurana (1)
  • joshpsawyer (1)
Pull Request Authors
  • chendaniely (4)
  • jamesmyatt (4)
  • jasonbrackman (3)
  • jasonmclaren (3)
  • yashikakhurana (3)
  • rj-wilson (2)
  • willingc (2)
  • eganjs (2)
  • achimgaedke (2)
  • ericmjl (2)
  • sjvrijn (1)
  • stlehmann (1)
  • aguspesce (1)
Top Labels
Issue Labels
good first issue (10)
Pull Request Labels

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 118,629 last-month
  • Total docker downloads: 603
  • Total dependent packages: 14
    (may contain duplicates)
  • Total dependent repositories: 99
    (may contain duplicates)
  • Total versions: 6
  • Total maintainers: 1
pypi.org: pyprojroot

Project-oriented workflow in Python

  • Versions: 4
  • Dependent Packages: 14
  • Dependent Repositories: 82
  • Downloads: 118,629 Last month
  • Docker Downloads: 603
Rankings
Dependent packages count: 0.8%
Dependent repos count: 1.6%
Docker downloads count: 1.9%
Downloads: 2.7%
Average: 3.8%
Stargazers count: 6.5%
Forks count: 9.1%
Maintainers (1)
Last synced: 10 months ago
conda-forge.org: pyprojroot
  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 17
Rankings
Dependent repos count: 8.6%
Stargazers count: 32.8%
Average: 33.1%
Forks count: 39.5%
Dependent packages count: 51.6%
Last synced: 10 months ago

Dependencies

.github/workflows/python-build-test.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/python-package-conda.ignore-yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
pyproject.toml pypi
  • typing-extensions *