Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (6.1%) to scientific vocabulary
Repository
Package for visualising polarised genomic data
Basic Info
- Host: GitHub
- Owner: Studenecivb
- License: mit
- Language: Python
- Default Branch: main
- Size: 673 KB
Statistics
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 5
Metadata Files
README.md
CarpePy
Welcome to the CarpePy documentation!
CarpePy is a toolset for visualising polarised genomic data. It requires pre-processed data from the diem package: https://github.com/StuartJEBaird/diem
CarpePy depends on the pandas and numpy Python packages, and we recommend running it in a virtual environment.
Example of running CarpePy with Pneumocystis data
Pneumocystis data: Jan Petružela, Beate Nürnberger, Alexis Ribas, et al. Comparative genomic analysis of co-occurring hybrid zones of house mouse parasites Pneumocystis murina and Syphacia obvelata using genome polarisation. Authorea. January 31, 2025.
First load the input data from diem as a pandas dataframe:
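```python
import pandas as pd

# file_path is the path to the CSV file produced by diem
HonzaPneumo_df = pd.read_csv(file_path, sep=',')
# Plain list of rows, used for the column slicing below
HonzaPneumo = HonzaPneumo_df.values.tolist()
```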
HonzaPneumo_df should look approximately like this:
| Row | diemmarkpos | scaffold | refpos | V3 | ... | SK1151DM | SU4201DM | ... | isig | osig | admixturecategory | genic | cds | msg |
|-----|-------------|----------|----------------|-----|-----|----------|----------|-----|------|------|-------------------|-------|-----|-----|
| 0 | 0 | m1 | AFWA02000001.1 | 44 | ... | 0 | 2 | ... | 0 | 0 | barr | 0 | 0 | 0 |
| 1 | 1 | m2 | AFWA02000001.1 | 94 | ... | 0 | _ | ... | 0 | 0 | barr | 0 | 0 | 0 |
| 2 | 2 | m3 | AFWA02000001.1 | 314 | ... | 0 | 2 | ... | 0 | 0 | barr | 0 | 0 | 0 |
Next, we take the third and fourth columns to store the BED information in a separate variable:
```python
# Third and fourth columns hold the BED information for each marker
third_column = [row[2] for row in HonzaPneumo]
fourth_column = [row[3] for row in HonzaPneumo]
HonzaPneumoBED = list(zip(third_column, fourth_column))
```
As the next step, we want to extract the names of the individuals:
```python
first_row = HonzaPneumo_df.columns.tolist()
# Individuals start at column 18; the last 6 columns are annotations
HonzaPneumoIndIDs = first_row[18:-6]
```
And then the markers and the selected input data:
```python
# Columns holding the per-individual states
column_indices = list(range(18, 18 + len(HonzaPneumoIndIDs)))
HonzaPneumoSelected = [[row[i] for i in column_indices] for row in HonzaPneumo]
# One string per marker, one character per individual
HonzaPneumoMarkers = ["".join(map(str, row)) for row in HonzaPneumoSelected]
# Append each marker string to its BED entry
HonzaPneumoPolariseNjoin = [list(row) + [marker] for row, marker in zip(HonzaPneumoBED, HonzaPneumoMarkers)]
```
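To make the slicing steps above concrete, here is a minimal sketch on a toy table that uses plain Python lists only. The helper name `build_polarise_n_join`, the toy column layout (BED columns at positions 2 and 3, individuals starting at column 4, two trailing annotation columns) and the toy values are all illustrative assumptions; the real data uses the offsets 18 and -6 shown above.

```python
def build_polarise_n_join(header, rows, bed_cols, first_ind_col, n_trailing):
    """Return (individual IDs, rows of [bed fields..., marker string]).

    Hypothetical helper that mirrors the README steps on a small table.
    """
    # Individual IDs sit between the leading columns and the trailing annotations
    ind_ids = header[first_ind_col:len(header) - n_trailing]
    last_ind_col = first_ind_col + len(ind_ids)
    joined = []
    for row in rows:
        bed = [row[c] for c in bed_cols]
        # One character per individual, in column order
        marker = "".join(str(state) for state in row[first_ind_col:last_ind_col])
        joined.append(bed + [marker])
    return ind_ids, joined


# Toy data (assumed layout, not the Pneumocystis file)
toy_header = ["idx", "mark", "scaffold", "pos", "ind_A", "ind_B", "ind_C", "genic", "cds"]
toy_rows = [
    [0, "m1", "scaf1", 44, 0, 2, 1, 0, 0],
    [1, "m2", "scaf1", 94, 0, "_", 2, 0, 0],
]

ind_ids, polarise_n_join = build_polarise_n_join(
    toy_header, toy_rows, bed_cols=(2, 3), first_ind_col=4, n_trailing=2
)
print(ind_ids)
print(polarise_n_join)
```

Each output row has the same shape as a row of `HonzaPneumoPolariseNjoin`: the BED fields followed by one string of states.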
Now we are finally ready to run the diem Plot Prepper:
```python
plot_theme = "Pneumocystis"
PneumoPlotPrep = DiemPlotPrep(
    plottheme='Pneumocystis',
    polariseddata=HonzaPneumoPolariseNjoin,
    indids=HonzaPneumoIndIDs,
    dithreshold="NO DI FILTER",
    dicolumn=5,
    physres=1,
    ticks='kb',
)
PneumoPlotPrep.formatbeddata()
```
The arguments are the plot theme, the polarised and processed data, the individual IDs, the diagnostic index (DI) threshold (if we want any filtering) and the column to use for it, the resolution (1 in this case) and the tick unit, either kb or mb depending on the data.
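As a rough illustration of what a diagnostic index threshold does, the sketch below keeps only the markers whose DI value is at or above a threshold, and keeps everything when the threshold is the string "NO DI FILTER". This is an assumption about the general idea, not CarpePy's implementation: the function name `filter_by_di`, the direction of the comparison and the toy DI values are hypothetical.

```python
NO_FILTER = "NO DI FILTER"


def filter_by_di(rows, di_column, di_threshold):
    """Keep rows whose DI value is >= di_threshold (assumed semantics)."""
    if di_threshold == NO_FILTER:
        # No filtering requested: return an unfiltered copy
        return list(rows)
    return [row for row in rows if float(row[di_column]) >= di_threshold]


# Toy rows: scaffold, position, marker string, DI value (hypothetical values)
toy = [
    ["scaf1", 44, "021", -5.0],
    ["scaf1", 94, "0_2", -25.0],
    ["scaf1", 314, "022", -1.5],
]

kept_all = filter_by_di(toy, di_column=3, di_threshold=NO_FILTER)
kept_strict = filter_by_di(toy, di_column=3, di_threshold=-10.0)
print(len(kept_all), len(kept_strict))
```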
We are now ready to draw either the unit plots, each of which shows one unit of the genome (a scaffold or a chromosome), or the IrisPlot, which shows the whole genome.
```python
# One unit plot per scaffold or chromosome
for i in range(len(PneumoPlotPrep.unit_plot_prep)):
    diemUnitPlot(PneumoPlotPrep.unit_plot_prep[i],
                 bed_data=PneumoPlotPrep.DIfilteredBED_formatted[i],
                 index=i+1,
                 path='output_path',
                 names_list=PneumoPlotPrep.IndIDs_ordered,
                 ticks='kb')
```
The output of the unit plot:

And now the IrisPlot:
```python
diemIrisPlot(
    PneumoPlotPrep.diemDITgenomesordered,
    names=PneumoPlotPrep.IndIDs_ordered,
    bedinfo=PneumoPlotPrep.irisplotprep,
    lengthofchromosomes=PneumoPlotPrep.lengthofchromosomes,
    heatmap=heatmap_map,
    path=outputpath,
    png='cute_iris',
    pdf='cute_iris',
)
```
The arguments include the chromosome names, the BED information and the diem genomes (`diemDITgenomesordered`), which are ordered according to the hybrid index. If you give no png or pdf name, the plot is only shown; giving a png or pdf name saves it to the output folder.
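To illustrate what ordering by hybrid index means, here is a minimal sketch that scores each individual from its string of states and sorts the individuals by that score. It assumes the common convention that states 0 and 2 are the two homozygous classes, 1 is heterozygous, any other symbol (such as `_`) is missing data, and the hybrid index is (n1 + 2*n2) / (2*(n0 + n1 + n2)). The function names and toy data are hypothetical, and CarpePy's own ordering may differ in detail.

```python
def hybrid_index(states):
    """Hybrid index from a string of states; None if no state is called."""
    n0, n1, n2 = states.count("0"), states.count("1"), states.count("2")
    called = n0 + n1 + n2
    if called == 0:
        return None
    return (n1 + 2 * n2) / (2 * called)


def order_by_hybrid_index(ind_ids, markers):
    """Sort individual IDs by ascending hybrid index (unscored ones last).

    markers: one string per site, one character per individual.
    """
    # Transpose so that each string describes one individual across all sites
    per_individual = ["".join(column) for column in zip(*markers)]
    scored = [(hybrid_index(genome), name) for name, genome in zip(ind_ids, per_individual)]
    scored.sort(key=lambda pair: (pair[0] is None, pair[0] or 0.0))
    return [name for _, name in scored]


toy_ids = ["A", "B", "C"]
toy_markers = ["201", "2_1", "202"]  # three sites, three individuals
ordered_ids = order_by_hybrid_index(toy_ids, toy_markers)
print(ordered_ids)
```

In the toy data, individual A carries only state 2, B only state 0 (with one missing site) and C a mixture, so the ordering runs from B through C to A.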
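We can also add a heatmap to the IrisPlot. The heatmap values must be processed first, before the `diemIrisPlot` call:
```python
import numpy as np

# Fourth column from the end holds the values used for the heatmap
heatmap_pre_values = list(HonzaPneumo_df.iloc[:, -4])
# Run-length encode the values, then drop the second column of the result
rle_heatmap_values = np.array(RichRLE(heatmap_pre_values)).T
heatmap_map = np.delete(rle_heatmap_values, 1, axis=1)
```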
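For readers unfamiliar with run-length encoding, the sketch below shows the general technique with the standard library: consecutive equal values are collapsed into (value, run length) pairs. It is a generic illustration only. The function name `run_length_encode` is hypothetical, and the exact layout returned by `RichRLE` is not shown in this README, so it may differ from this.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


encoded = run_length_encode([0, 0, 1, 1, 1, 0])
print(encoded)  # [(0, 2), (1, 3), (0, 1)]
```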
The output of Iris Plot:

If you have any questions, please contact us at: ninahaladova@gmail.com
Cite as: Baird, S. J. E., & Daley, N. (2025). CarpePy (Version 0.0.1) [Computer software]
Owner
- Name: IVB_Studenec
- Login: Studenecivb
- Kind: organization
- Repositories: 1
- Profile: https://github.com/Studenecivb
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: Baird
given-names: Stuart J.E.
- family-names: Daley
given-names: Nina
title: "CarpePy"
version: 0.0.6
date-released: 2025-01-26
GitHub Events
Total
- Release event: 5
- Push event: 10
- Create event: 8
Last Year
- Release event: 5
- Push event: 10
- Create event: 8
Packages
- Total packages: 1
- Total downloads: pypi 22 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 4
- Total maintainers: 1
pypi.org: carpepy
Module for visualising polarised genomes
- Documentation: https://carpepy.readthedocs.io/
- License: mit
- Latest release: 0.0.5 (published over 1 year ago)
Rankings
Maintainers (1)
Dependencies
- matplotlib ==3.6.2
- numpy ==1.24.1
- pandas ==1.5.2
- scipy ==1.15.1
- setuptools ==65.5.0
- setuptools ==68.2.0
- matplotlib *
- numpy *
- pandas *