xist
R and Python code and data for the "XCut imitation via st-MinCuts" (Xist) algorithm
Statistics
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Content
This repository contains four related sets of files:
- Python code supplementary to the paper "A scalable algorithm to approximate graph cuts" (Suchan, Li, Munk; 2023), available on arXiv:
  - xist.py
  - xist_applications.py
  - bash_chaco.sh
- R code supplementary to the same paper:
  - xist.R
  - xist_applications.R
- R code supplementary to the paper "Distributional limits of graph cuts on discretized grids" (Suchan, Li, Munk; 2024), to appear on arXiv:
  - graph_cut_limits.R
  - graph_cut_limit_applications.R
- The NIH 3T3 dataset:
  - The folder NIH3T3_Data, containing 21 mouse embryo stem cell images. This data belongs to Ulrike Rölleke and Sarah Köster (University of Göttingen).
In particular, the Xist algorithm is implemented in xist.py for Python as well as in xist.R for R.
Usage
1. Installation of the Python code supplement to "A scalable algorithm to approximate graph cuts"
- Install KaHIP for Python (https://github.com/KaHIP/KaHIP; follow the installation instructions in the section "Using KaHIP in Python" of its README).
- Install the Chaco algorithm (https://www3.cs.stonybrook.edu/~algorith/implement/chaco/implement.shtml).
- Download xist.py, xist_applications.py and bash_chaco.sh from this repository. Put both Python files into the deploy folder of your KaHIP installation, and put bash_chaco.sh into the exec folder of your Chaco installation.
- Edit the path fragments /home/lsuchan/Chaco-2.2 inside the functions ncut_chaco_unweighted and ncut_chaco in xist.py so that they point to your Chaco installation directory.
- If your Chaco installation directory is not ~/Chaco-2.2/, change this expression in lines 5 and 6 of bash_chaco.sh so that it points to your Chaco installation directory.
- Install the following Python packages via pip: numpy, igraph, pandas, PIL (distributed on PyPI under the name Pillow), leidenalg, pymetis, tqdm.
- Run xist.py.
- (Optional) To work with the NIH 3T3 dataset, download the NIH3T3_Data folder from this repository and place it into your Python working directory.
- (Optional) To work with the large datasets used in the paper, download them from the SNAP database. Create a Datasets folder inside the deploy folder of your KaHIP installation, and put the following files into it:
  - musae_squirrel_edges.csv (from https://snap.stanford.edu/data/wikipedia-article-networks.html)
  - CA-HepPh.txt (from https://snap.stanford.edu/data/cit-HepPh.html)
  - musae_facebook_edges.csv (from https://snap.stanford.edu/data/facebook-large-page-page-network.html)
  - Email-Enron.txt (from https://snap.stanford.edu/data/email-Enron.html)
  - artist_edges.csv (from https://snap.stanford.edu/data/gemsec-Facebook.html)
  - large_twitch_edges.csv (from https://snap.stanford.edu/data/twitch_gamers.html)
- Done! You are now ready to run any part of xist_applications.py and should therefore be able to reproduce the results from "A scalable algorithm to approximate graph cuts" (Suchan, Li, Munk; 2023).
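Before running xist_applications.py, it can help to confirm that the installation steps above left everything in place. The following is a minimal sketch and is not part of this repository: the function names, the message strings and the example KaHIP path are hypothetical. It only assumes the folder layout described in the steps above (a deploy folder inside the KaHIP installation, an exec folder inside the Chaco installation, and an optional Datasets folder inside deploy), and it does not check KaHIP or Chaco themselves.

```python
from importlib.util import find_spec
from pathlib import Path

# Import names of the packages from the pip step (PIL is provided by Pillow).
REQUIRED_MODULES = ["numpy", "igraph", "pandas", "PIL", "leidenalg", "pymetis", "tqdm"]

# File names from the optional SNAP step.
SNAP_FILES = [
    "musae_squirrel_edges.csv",
    "CA-HepPh.txt",
    "musae_facebook_edges.csv",
    "Email-Enron.txt",
    "artist_edges.csv",
    "large_twitch_edges.csv",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if find_spec(name) is None]


def missing_files(folder, names):
    """Return the entries of `names` that are not regular files inside `folder`."""
    folder = Path(folder).expanduser()
    return [name for name in names if not (folder / name).is_file()]


def preflight(kahip_dir, chaco_dir):
    """Collect readable problem descriptions for the assumed folder layout."""
    problems = [f"missing Python package: {name}" for name in missing_modules()]
    deploy = Path(kahip_dir).expanduser() / "deploy"
    chaco_exec = Path(chaco_dir).expanduser() / "exec"
    for name in missing_files(deploy, ["xist.py", "xist_applications.py"]):
        problems.append(f"missing file: {deploy / name}")
    for name in missing_files(chaco_exec, ["bash_chaco.sh"]):
        problems.append(f"missing file: {chaco_exec / name}")
    # The SNAP datasets are optional, so they are reported separately.
    for name in missing_files(deploy / "Datasets", SNAP_FILES):
        problems.append(f"missing optional dataset: {name}")
    return problems
```

For example, `preflight("~/KaHIP", "~/Chaco-2.2")` returns a list of problem descriptions, where `~/KaHIP` is a placeholder for your own KaHIP installation directory. Entries starting with "missing optional dataset" only matter if you want to reproduce the large-dataset experiments.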
2. Usage of the R code supplement to "A scalable algorithm to approximate graph cuts"
- Download xist.R and xist_applications.R from this repository.
- Run xist.R.
- (Optional) To work with the NIH 3T3 dataset, download the NIH3T3_Data folder from this repository and place it into your R working directory. Edit xist_applications.R to replace the two occurrences of setwd("/path/to/NIH3T3/data") with the appropriate path to the NIH 3T3 dataset folder.
- (Optional) Similarly, to work with the large datasets used in the paper, download them from the SNAP database and put the files listed in the SNAP step of section 1 into a folder. Then replace setwd("/path/to/SNAP/data") in xist_applications.R with the path to your newly created folder.
- Done! You should now be able to run any part of xist_applications.R.
All runtime and algorithm comparisons in "A scalable algorithm to approximate graph cuts" (Suchan, Li, Munk; 2023) were carried out in Python and are not present in the R supplement. This is because some of the state-of-the-art algorithms used in the paper, namely Chaco, KaHIP, and METIS, do not have an R implementation.
Note that xist.R is heavily commented to aid the user. Please read through the comments if the use of a function is not immediately obvious.
3. Usage of the R code supplementary to "Distributional limits of graph cuts on discretized grids"
- Download graph_cut_limits.R and graph_cut_limit_applications.R from this repository.
- Run graph_cut_limits.R.
- Done! You are now ready to run any part of graph_cut_limit_applications.R and should therefore be able to reproduce the results from "Distributional limits of graph cuts on discretized grids" (Suchan, Li, Munk; 2024).
4. Citing the NIH 3T3 Dataset
The NIH 3T3 Dataset can be found in the folder NIH3T3_Data in this repository. It should be cited using the metadata in NIH3T3_Data/CITATION.cff.
Owner
- Login: leosuchan
- Kind: user
- Repositories: 1
- Profile: https://github.com/leosuchan
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: R and Python implementations of the Xist algorithm
message: >-
  If you use this dataset, please cite it using the metadata
  from this file.
type: repository
authors:
  - given-names: Leo
    family-names: Suchan
    email: leo.suchan@uni-goettingen.de
repository-code: 'https://github.com/leosuchan/Xist'
abstract: >-
  This repository contains R and Python implementations of the Xist algorithm as introduced in "A scalable clustering algorithm to approximate graph cuts" (Suchan, Li, Munk; 2023) as well as the code accompanying this paper, and it contains the code accompanying "Distributional limits of graph cuts on discretized grids" (Suchan, Li, Munk; 2024). All code belongs to Leo Suchan.