firecaller

FIREcaller: Python library for detecting Frequently Interacting REgions (FIREs) from Hi-C data

https://github.com/cellular-genomics/python-firecaller

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: ncbi.nlm.nih.gov
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.3%) to scientific vocabulary

Keywords

3d-genome cooler genomic-data-analysis genomics hic

Last synced: 9 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: cellular-genomics
  • License: apache-2.0
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 21.5 KB
Statistics
  • Stars: 5
  • Watchers: 1
  • Forks: 1
  • Open Issues: 2
  • Releases: 0
Created over 6 years ago · Last pushed about 4 years ago
Metadata Files
Readme License

README.md

Detect Frequently Interacting REgions (FIREs) in Python


This project is a Python port of the R package for detecting frequently interacting regions (FIREs) from Hi-C data. FIREs are described in the paper "A Compendium of Chromatin Contact Maps Reveals Spatially Active Regions in the Human Genome".

Command line usage

Install the package: python3 -m pip install FIREcaller

Download some HiC experiment results from the 4D Nucleome database. Choose Contact Matrix (.mcool) as a file type.

Download the mappability file from Yunjiang's website. Ensure that the genomic assembly and the bin size match your HiC files. If the header line is missing, add it: chr start end F GC M
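The header check in the step above can be scripted. The following is a minimal sketch, not part of FIREcaller: the function name `ensure_mappability_header` is hypothetical, and it assumes the mappability file is whitespace-delimited and small enough to read into memory. The header is written tab-separated, which is also an assumption about the file's delimiter.

```python
from pathlib import Path

# Header named in the step above.
EXPECTED_HEADER = ["chr", "start", "end", "F", "GC", "M"]


def ensure_mappability_header(path):
    """Prepend the header line to the mappability file if it is missing.

    Returns True if the file was modified, False if the header was already
    present. Assumption: columns are separated by whitespace.
    """
    path = Path(path)
    text = path.read_text()
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    if first_line.split() == EXPECTED_HEADER:
        return False
    # Assumption: a tab-separated header suits the file's delimiter.
    path.write_text("\t".join(EXPECTED_HEADER) + "\n" + text)
    return True
```

Calling the function a second time on the same file leaves it unchanged, so it is safe to run before every FIRE calling.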

Perform FIRE calling:

    FIREcaller \
        --cooler_filenames 4DNFIT5YVTLO.mcool \
        --cooler_filenames 4DNFIJTOIGOI.mcool \
        --mappability_filename F_GC_M_Hind3_10Kb_el.GRCh38.txt \
        --bin_size 10000 \
        --output_filename fires.csv

The output file will consist of the genomic regions and their corresponding FIRE scores and log p-values for each HiC file:

    chr   start    end      F     GC      M       0_count_neig  1_count_neig  0_fire  1_fire  0_logpvalue  1_logpvalue
    chr1  1970000  1980000  2000  0.5175  1.0000  5365          2376          0.9515  0.8702  0.5861       0.4317
    chr1  2020000  2030000  3000  0.6287  0.9907  4305          2005          0.5806  0.5831  0.1128       0.1144
    chr1  2060000  2070000  4000  0.4770  0.9210  4029          2171          0.6880  0.8678  0.1954       0.4277
    (...)

The {n}_fire column stores the FIRE score for the n-th cooler file passed via the --cooler_filenames argument, counting from 0.

The {n}_logpvalue column stores the log p-value for the n-th cooler file passed via the --cooler_filenames argument.
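To make the column naming concrete, here is a small sketch that maps each cooler file to its per-file columns. The helper `columns_by_cooler_file` is hypothetical and not part of FIREcaller. It assumes that n is the zero-based position of the file among the --cooler_filenames arguments, as the 0_ and 1_ prefixes in the example output suggest.

```python
import re

# Matches per-file columns such as "0_fire" or "1_logpvalue".
_PER_FILE_COLUMN = re.compile(r"^(\d+)_(count_neig|fire|logpvalue)$")


def columns_by_cooler_file(header, cooler_filenames):
    """Group the per-file output columns under the cooler file they belong to."""
    grouped = {name: {} for name in cooler_filenames}
    for column in header:
        match = _PER_FILE_COLUMN.match(column)
        if match is None:
            continue  # shared columns: chr, start, end, F, GC, M
        index, kind = int(match.group(1)), match.group(2)
        # Assumption: index n refers to the n-th file, counting from 0.
        grouped[cooler_filenames[index]][kind] = column
    return grouped
```

With the header from the example output and the two files from the command above, the entry for 4DNFIJTOIGOI.mcool would point at the 1_count_neig, 1_fire and 1_logpvalue columns.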

Python usage

Use the FIREcaller.calc_fires() function to perform FIRE calling from your Python program:

calc_fires(mappability_filename, cooler_filenames, bin_size, neighborhood_region, perc_threshold=.25, avg_mappability_threshold=0.9)

mappability_filename : str - Path to mappability file

cooler_filenames : list of str - List of paths to HiC experiment results in cooler format

bin_size : int - Bin size

neighborhood_region : int - Size of neighbor region

perc_threshold : float - Maximum ratio of "bad" neighbors allowed

avg_mappability_threshold : float - Minimum mappability allowed

The function returns the results as a pandas DataFrame.
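As an illustration of the call, the sketch below wraps calc_fires with explicit keyword arguments. The wrapper `run_fire_calling` is hypothetical. It assumes that calc_fires is importable from the top level of the FIREcaller package, as the FIREcaller.calc_fires() notation suggests. The caller must supply neighborhood_region, because the signature shows no default for it.

```python
def run_fire_calling(mappability_filename, cooler_filenames, bin_size,
                     neighborhood_region):
    """Call calc_fires with explicit keyword arguments and return its result."""
    # Assumption: calc_fires is exposed at the top level of the package.
    import FIREcaller

    return FIREcaller.calc_fires(
        mappability_filename=mappability_filename,
        cooler_filenames=list(cooler_filenames),
        bin_size=bin_size,
        neighborhood_region=neighborhood_region,
        perc_threshold=0.25,            # default from the signature above
        avg_mappability_threshold=0.9,  # default from the signature above
    )
```

Passing the arguments by keyword keeps the call readable and guards against mixing up bin_size and neighborhood_region, which are both integers.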

Verification

To compare the results of this project with those of the FIREcaller R package, follow the steps described in that package's README.md file to produce the FIRE_ANALYSIS_40KB.txt file.

Now download and uncompress the Hippo_Hi-C_inputs_chr1_22.tar.gz and F_GC_M_HindIII_40KB_hg19.txt.gz files.

Convert the HiC files to get the hippo.mcool cooler file using the scripts/build_hippo_cool.py python script:

python scripts/build_hippo_cool.py

Run the FIRE calling:

    FIREcaller \
        --cooler_filenames hippo.mcool \
        --mappability_filename F_GC_M_HindIII_40KB_hg19.txt \
        --bin_size 40000 \
        --output_filename hippo_fires.csv

Compare the FIRE_ANALYSIS_40KB.txt and hippo_fires.csv files.
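The text does not say how to compare the two files, so the following is only one possible approach. The hypothetical helper `compare_scores` takes two mappings from a region, for example a (chr, start, end) tuple, to a FIRE score and reports where they disagree. Loading the mappings from the files is left out, because the column layout of FIRE_ANALYSIS_40KB.txt is not described here. The tolerance value is an arbitrary assumption.

```python
import math


def compare_scores(reference, candidate, tolerance=1e-6):
    """Compare two {region: score} mappings.

    Returns a tuple of three sorted lists: regions only in `reference`,
    regions only in `candidate`, and shared regions whose scores differ
    by more than `tolerance` in absolute value.
    """
    only_reference = sorted(set(reference) - set(candidate))
    only_candidate = sorted(set(candidate) - set(reference))
    mismatched = sorted(
        region
        for region in set(reference) & set(candidate)
        if not math.isclose(reference[region], candidate[region],
                            rel_tol=0.0, abs_tol=tolerance)
    )
    return only_reference, only_candidate, mismatched
```

Three empty lists would mean that both implementations report the same regions with matching scores, within the chosen tolerance.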

Citation

Crowley, C., Yang, Y., Qiu, Y., Hu, B., Abnousi, A., Lipiński, J., Plewczynski, D., Wu, D., Won, H., Ren, B., Hu, M., Li, Y*. (2021) FIREcaller: Detecting Frequently Interacting Regions from Hi-C Data. Computational and Structural Biotechnology Journal, 19: 355–362.

TODO:

  • [ ] Add option to remove the MHC regions
  • [ ] Calculate the Super FIREs
  • [x] Convert the script into the Python package

Let me know if you need any of the above and I'll be happy to implement it. I also accept PRs and comments. Enjoy.

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 8
  • Total Committers: 2
  • Avg Commits per committer: 4.0
  • Development Distribution Score (DDS): 0.125
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Jakub Lipinski j****i@g****m 7
dependabot[bot] 4****] 1

Issues and Pull Requests

Last synced: 11 months ago

All Time
  • Total issues: 2
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 4 months
  • Total issue authors: 2
  • Total pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • hmyh1202 (1)
  • okurman (1)
Pull Request Authors
  • dependabot[bot] (1)
Top Labels
Issue Labels
Pull Request Labels
dependencies (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 13 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 1
  • Total maintainers: 1
pypi.org: firecaller

Python library for detecting Frequently Interacting REgions (FIREs) from Hi-C data

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 13 Last month
Rankings
Dependent packages count: 10.0%
Dependent repos count: 21.7%
Forks count: 22.6%
Stargazers count: 23.0%
Average: 25.8%
Downloads: 51.7%
Maintainers (1)
Last synced: 9 months ago

Dependencies

requirements.txt pypi
  • cooler ==0.8.7
  • dask ==2021.10.0
  • fsspec ==0.6.2
  • pandas ==1.0.1
  • statsmodels ==0.11.0
  • tables ==3.6.1
setup.py pypi
  • cooler >=0.8.7
  • dask >=2.10.1
  • fsspec >=0.6.2
  • pandas >=1.0.1
  • statsmodels >=0.11.0
  • tables >=3.6.1