fishtoolbox
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: UlrikeScherer
- Language: Jupyter Notebook
- Default Branch: main
- Size: 14.6 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
fishtoolbox
Contains a set of modules to analyze and visualize data from block1 and block2 of the September 2021 experiment.
Dependencies
This repository is based on Python and requires conda and python-pip for installation. The following repositories are project dependencies that have to be built inside the underlying environment:
- fishproviz
Data Flow
Using the HPC cluster
Installation
- Environment installations using Conda, including the Python environment and C++ dependencies

```bash
conda env create -n toolbox --file environment.yml
conda activate toolbox
```

- Python package installations using python-pip

```bash
# conda environment should be activated
python -m venv .venv  # python virtual environment creation
source .venv/bin/activate
pip install -r requirements.txt
```

- Fishproviz project-dependencies installation

```bash
# working directory should be equal to <path/to/fishtoolbox>
# conda environment should be activated
# python venv should be activated
cd ..
git clone git@github.com:lukastaerk/Fish-Tracking-Visualization.git
cd Fish-Tracking-Visualization
python setup.py install
```

- Motionmapper project-dependencies installation

```bash
# working directory should be equal to <path/to/fishtoolbox>
# conda environment should be activated
# python venv should be activated
cd ..
git clone git@github.com:lukastaerk/motionmapperpy.git
cd motionmapperpy
python setup.py install
```
HPC Usage
- Start on the GPU: `sbatch scripts/hpc-python.sh`
- NOTEBOOK
  - `conda activate rapids-22.04`
  - Type `ifconfig` and get the `inet` entry for `eth0`, i.e. the IP address of the node
  - `srun --pty --partition=ex_scioi_gpu --gres=gpu:1 --time=0-02:00 bash -i` to start a new shell with a GPU
  - `ssh -L localhost:5000:localhost:5000 user.name@[IP address]` on your local machine
  - `jupyter-lab --no-browser --port=5000`
Start
- set the BLOCK variable to BLOCK1 or BLOCK2 in `config.py`
- set the projectPath variable in `config.py` to the path of a new folder; this is where the data will be stored
- set up fishproviz with the correct paths and area configurations
- export the preprocessed data with `python3 -m data_factory.processing`
- repeat for the other block
Program Parts
Parameters
`parameters = set_parameters()` to get the parameters that are used throughout the fishtoolbox
Data Factory
Processing
- `load_trajectory_data_concat`: load the x and y coordinates, projections (the three features), time index, and area
- `load_zVals_concat`: load the UMAP data
- `load_clusters_concat`: load the cluster labels for individuals and days; set, e.g., `parameters.kmeans = 5` to specify the clustering that you want to load
Plasticity
There are three ways in this module to compute plasticity.
- `compute_cluster_entropy` computes the cluster entropy for each individual and day. It works on either the watershed regions or the k-means clusters, depending on which cluster-loading function you pass in.
- `compute_coefficient_of_variation` computes the coefficient of variation for each individual and day.
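The two measures listed above can be illustrated with a minimal, standard-library sketch. It is not the toolbox's implementation: the helper names `cluster_entropy` and `coefficient_of_variation`, the choice of Shannon entropy in bits, the use of the population standard deviation, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def cluster_entropy(labels):
    """Shannon entropy (in bits) of the cluster-label distribution."""
    counts = Counter(labels)
    total = len(labels)
    # p * log2(1 / p), summed over all clusters that occur at least once
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


# Toy data keyed by (individual, day); all ids and values are made up.
labels_by_fish_day = {
    ("fish_a", 1): [0, 0, 1, 1],  # time split evenly between two clusters
    ("fish_a", 2): [0, 0, 0, 0],  # a single cluster all day
}
entropy_by_fish_day = {
    key: cluster_entropy(labels) for key, labels in labels_by_fish_day.items()
}
cv_example = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])
```

A day spent evenly across two clusters gives an entropy of 1 bit, and a day spent in one cluster gives 0, so higher values indicate more varied behaviour under this definition.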
Table Export
Functions to export the averaged step length to a CSV file and to melt the records into a long-format table for statistical analysis (repeatability).
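The reshaping step can be sketched without any third-party dependency. The column names (`id`, `block`, `day_1`, `day_2`, `step_mean`), the values, and the `melt` helper below are hypothetical; with pandas, `DataFrame.melt` performs the equivalent reshaping.

```python
import csv
import io

# Wide table: one row per individual, one averaged-step-length column per day.
wide_rows = [
    {"id": "fish_a", "block": 1, "day_1": 0.52, "day_2": 0.47},
    {"id": "fish_b", "block": 1, "day_1": 0.31, "day_2": 0.35},
]


def melt(rows, id_vars, var_name, value_name):
    """Turn each non-id column of each row into its own long-format row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column in id_vars:
                continue
            long_row = {key: row[key] for key in id_vars}
            long_row[var_name] = column
            long_row[value_name] = value
            long_rows.append(long_row)
    return long_rows


long_rows = melt(wide_rows, id_vars=("id", "block"), var_name="day", value_name="step_mean")

# Write the long table as CSV text; an in-memory buffer stands in for a file.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["id", "block", "day", "step_mean"], lineterminator="\n"
)
writer.writeheader()
writer.writerows(long_rows)
csv_text = buffer.getvalue()
```

Each measurement ends up on its own row, alongside the identifiers it belongs to, which is the layout that most statistics packages expect for variance-component models.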
Repeatability
Computed from means of features (step, angle, wall distance) taken over batches of, e.g., 60 data frames. Produces a long table that records the block number and the id.
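Repeatability is commonly estimated as the share of total variance that lies among individuals. The sketch below uses the one-way ANOVA estimator for a balanced design. This is an assumption: the toolbox may rely on a different estimator (for example a mixed model fitted in a statistics package), and the function name and data layout are hypothetical.

```python
from statistics import mean


def repeatability(groups):
    """ANOVA-based repeatability for a balanced design.

    groups maps an individual id to its list of measurements; every
    individual must have the same number of measurements (at least two).
    """
    k = len(groups)                        # number of individuals
    n = len(next(iter(groups.values())))   # measurements per individual
    if any(len(values) != n for values in groups.values()):
        raise ValueError("balanced design required")
    grand_mean = mean(x for values in groups.values() for x in values)
    group_means = {g: mean(values) for g, values in groups.items()}
    ss_among = n * sum((m - grand_mean) ** 2 for m in group_means.values())
    ss_within = sum(
        (x - group_means[g]) ** 2 for g, values in groups.items() for x in values
    )
    ms_among = ss_among / (k - 1)
    ms_within = ss_within / (k * (n - 1))
    var_among = (ms_among - ms_within) / n  # among-individual variance component
    return var_among / (var_among + ms_within)


# Made-up feature means for two individuals, two batches each.
r = repeatability({"fish_a": [1.0, 2.0], "fish_b": [3.0, 4.0]})
```

Note that this estimator can return negative values when individuals differ less than expected by chance, and it is undefined when all measurements are identical.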
Sampling
The research question is how many samples are needed to get a good estimate of the repeatability. Given a table with means of a feature (step length) over a number of consecutive data frames, we can sample a number of minutes for a number of days from this table. We also compare two schemes: sampling the time of day once and reusing it for all days, versus sampling the time of day separately for each day.
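The two sampling schemes can be sketched as follows. The function name, its parameters, the fixed seed, and the 480 minutes per day are assumptions for illustration, and the sketch only draws the minute indices; looking up the feature means for those minutes is left to the caller.

```python
import random


def sample_minutes(n_days, minutes_per_day, n_minutes, same_time_each_day, seed=0):
    """Return, for each day, the sorted minute indices to sample."""
    rng = random.Random(seed)
    if same_time_each_day:
        # Draw the times of day once and reuse them for every day.
        shared = sorted(rng.sample(range(minutes_per_day), n_minutes))
        return [list(shared) for _ in range(n_days)]
    # Draw a fresh set of times of day for each day.
    return [
        sorted(rng.sample(range(minutes_per_day), n_minutes)) for _ in range(n_days)
    ]


shared_plan = sample_minutes(
    n_days=3, minutes_per_day=480, n_minutes=5, same_time_each_day=True
)
per_day_plan = sample_minutes(
    n_days=3, minutes_per_day=480, n_minutes=5, same_time_each_day=False
)
```

Repeating the repeatability estimate over many seeds and over increasing `n_minutes` and `n_days` would then show how quickly the estimate stabilises under each scheme.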
Plotting
Caterpillar Plots
- ethnogramofclusters
TODOs:
- check the new area files: are there significant updates for any of them, what are the differences, and do we need a refined getareafunction(fishkey,day)?
Owner
- Login: UlrikeScherer
- Kind: user
- Repositories: 1
- Profile: https://github.com/UlrikeScherer
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Beese"
  given-names: "Marvin"
  orcid: "https://orcid.org/0009-0008-7417-515X"
- family-names: "Stärk"
  given-names: "Luka"
  orcid: "https://orcid.org/0000-0002-1390-7432"
- family-names: "Scherer"
  given-names: "Ulrike"
  orcid: "https://orcid.org/0000-0003-1086-8527"
- family-names: "Ehlman"
  given-names: "Sean"
  orcid: "https://orcid.org/0000-0001-6981-9549"
- family-name: "Wolf"
  given-name: "Max"
  orcid: "https://orcid.org/0000-0001-8878-0504"
title: "Tool for behavioural clustering and analysis of trajectories."
version: 1.0.0
date-released: 2023-06-30
url: "https://github.com/UlrikeScherer/fishtoolbox"
```
Dependencies
- conda 23.5.0
- conda-content-trust
- cudatoolkit 11.8
- cudf 23.04
- cuml 23.04
- graph-tool 2.56
- h5py 3.7.0
- hdf5storage 0.1.19
- moviepy 1.0.3
- pip
- python 3.10.11.*
- scikit-image 0.20.0
- scikit-learn 1.2.2
- setuptools
- statsmodels 0.14.0
- wheel
- cython ==0.29.34
- envbash ==1.2.0
- matplotlib ==3.3.4
- numpy ==1.21.6
- pandas ==1.2.4
- plotly ==5.0.0
- scikit_learn ==1.2.2
- scipy ==1.8.0