fishtoolbox

https://github.com/ulrikescherer/fishtoolbox

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: found
✓ codemeta.json file: found
✓ .zenodo.json file: found
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (11.7%) to scientific vocabulary

Repository

Basic Info

Host: GitHub
Owner: UlrikeScherer
Language: Jupyter Notebook
Default Branch: main
Size: 14.6 MB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created over 3 years ago · Last pushed over 2 years ago

fishtoolbox

Contains a set of modules to analyze and visualize data from block1 and block2 of the experiment from September 2021.

Dependencies

This repository is based on Python and requires conda and python-pip for installation. The following repositories are project dependencies that have to be built inside the underlying environment:

- fishproviz
- motionmapperpy fork

Data Flow

[Figure: data flow diagram]

Using the HPC cluster

Installation

1. Install the environment with Conda, including the Python environment and the C++ dependencies:

```bash
conda env create -n toolbox --file environment.yml
conda activate toolbox
```

2. Install the Python packages with python-pip:

```bash
# conda environment should be activated
python -m venv .venv  # python virtual environment creation
source .venv/bin/activate
pip install -r requirements.txt
```

3. Install the fishproviz project dependency:

```bash
# working directory should be equal to <path/to/fishtoolbox>
# conda environment should be activated
# python venv should be activated
cd ..
git clone git@github.com:lukastaerk/Fish-Tracking-Visualization.git
cd Fish-Tracking-Visualization
python setup.py install
```

4. Install the motionmapperpy project dependency:

```bash
# working directory should be equal to <path/to/fishtoolbox>
# conda environment should be activated
# python venv should be activated
cd ..
git clone git@github.com:lukastaerk/motionmapperpy.git
cd motionmapperpy
python setup.py install
```

HPC Usage

Start a job on the GPU: `sbatch scripts/hpc-python.sh`

Notebook

1. `conda activate rapids-22.04`
2. Run `ifconfig` and note the inet entry for eth0, i.e. the IP address of the node.
3. Run `srun --pty --partition=ex_scioi_gpu --gres=gpu:1 --time=0-02:00 bash -i` to start a new shell with a GPU.
4. On your local machine, run `ssh -L localhost:5000:localhost:5000 user.name@[IP address]`.
5. Run `jupyter-lab --no-browser --port=5000`.

Start

1. Set the BLOCK variable to BLOCK1 or BLOCK2 in config.py.
2. Set the projectPath variable in config.py to the path of a new folder. This is where the data will be stored.
3. Set up fishproviz with the correct paths and area configurations.
4. Export the preprocessed data with `python3 -m data_factory.processing`.
5. Repeat for the other block.

Program Parts

Parameters

Call `parameters = set_parameters()` to get the parameters that are used throughout the fishtoolbox.

Data Factory

Processing

- load_trajectory_data_concat loads the x and y coordinates, the projections (the three features), the time index and the area.
- load_zVals_concat loads the UMAP data.
- load_clusters_concat loads the cluster labels for individuals and days. Set parameters.kmeans = 5 to specify the clustering that you want to load.

Plasticity

There are three ways in this module to compute plasticity; two of them are documented here:

- compute_cluster_entropy computes the cluster entropy for each individual and day. It uses either the watershed regions or the kmeans clusters, depending on which cluster-loading function is passed to it.
- compute_coefficient_of_variation computes the coefficient of variation for each individual and day.
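The two measures named above have standard textbook definitions, which the following minimal sketch illustrates with NumPy. The function names `cluster_entropy` and `coefficient_of_variation`, their arguments and the example labels are hypothetical. They are not the signatures of `compute_cluster_entropy` or `compute_coefficient_of_variation`, which this README does not document, and the module may normalize or aggregate differently.

```python
import numpy as np


def cluster_entropy(labels, n_clusters):
    """Shannon entropy (in nats) of the cluster-label distribution."""
    labels = np.asarray(labels, dtype=int)
    counts = np.bincount(labels, minlength=n_clusters)
    probs = counts / counts.sum()
    probs = probs[probs > 0]  # skip empty clusters: 0 * log(0) is taken as 0
    return float(-np.sum(probs * np.log(probs)))


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean of a feature."""
    values = np.asarray(values, dtype=float)
    return float(np.std(values) / np.mean(values))


# Hypothetical cluster labels for one individual on one day (5 clusters).
day_labels = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
entropy = cluster_entropy(day_labels, n_clusters=5)

# Hypothetical feature values for the same individual and day.
cv = coefficient_of_variation([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(entropy, cv)
```

With labels spread evenly over all five clusters the entropy reaches its maximum, log(5); an individual that stays in a single cluster all day has entropy 0.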

Table Export

Provides functions that export the averaged step length to a CSV file and melt it into a long-format table for statistical analysis (repeatability).
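As an illustration of the wide-to-long conversion described above, here is a small sketch using `pandas.DataFrame.melt`. The column names (`id`, `day_1`, `day_2`, `step_length`) and the values are made up for the example; the actual export functions and table layout of the toolbox may differ.

```python
import pandas as pd

# Hypothetical wide table: one row per individual, one column per day,
# each cell holding the averaged step length.
wide = pd.DataFrame(
    {
        "id": ["fish_a", "fish_b"],
        "day_1": [1.2, 0.8],
        "day_2": [1.4, 0.9],
    }
)

# Melt into long format: one row per (id, day) observation.
long = wide.melt(id_vars="id", var_name="day", value_name="step_length")

# Render the long table as CSV text; pass a file path to to_csv to save it instead.
csv_text = long.to_csv(index=False)
print(csv_text)
```

The long format has one observation per row, which is the layout that mixed-model and repeatability tools generally expect.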

Repeatability

Computed from means of features (step length, angle, wall distance) over batches of, e.g., 60 data frames. Produces a long table that records the block number and the individual id.
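The batching step can be sketched as follows. This is only an illustration under assumptions: `batch_means` and `long_table` are hypothetical helper names, the column names are invented, and dropping a trailing partial batch is a choice made for this example rather than documented behavior of the toolbox.

```python
import numpy as np
import pandas as pd


def batch_means(values, batch_size=60):
    """Mean of each consecutive batch of `batch_size` frames.

    A trailing partial batch is dropped (an assumption of this sketch).
    """
    values = np.asarray(values, dtype=float)
    n_batches = len(values) // batch_size
    trimmed = values[: n_batches * batch_size]
    return trimmed.reshape(n_batches, batch_size).mean(axis=1)


def long_table(step, block, fish_id, batch_size=60):
    """Long table with one row per batch, recording block number and id."""
    means = batch_means(step, batch_size)
    return pd.DataFrame(
        {
            "block": block,
            "id": fish_id,
            "batch": np.arange(len(means)),
            "step": means,
        }
    )


# Hypothetical step lengths for 150 frames of one individual.
steps = np.arange(150)
table = long_table(steps, block=1, fish_id="fish_a")
print(table)
```

With 150 frames and a batch size of 60, the table has two rows; the last 30 frames do not fill a batch and are left out.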

Sampling

The research question is how many samples are needed to get a good estimate of the repeatability. Given a table with means of a feature (step length) over a number of consecutive data frames, we can sample from this table a number of minutes for a number of days. We further examine the effect of sampling the time of day once for all days versus sampling it separately for each day.
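The two sampling schemes can be sketched like this. The function `sample_minutes`, its arguments and the table layout (rows are days, columns are minutes of the day) are assumptions made for illustration, not the toolbox's actual interface.

```python
import numpy as np


def sample_minutes(table, n_days, n_minutes, same_times, rng):
    """Sample `n_minutes` minute-means on each of `n_days` days.

    table: 2-D array, rows are days and columns are minutes of the day.
    same_times: if True, draw the minutes once and reuse them on every day;
                if False, draw a fresh set of minutes for each day.
    """
    table = np.asarray(table, dtype=float)
    total_days, total_minutes = table.shape
    days = rng.choice(total_days, size=n_days, replace=False)
    if same_times:
        minutes = rng.choice(total_minutes, size=n_minutes, replace=False)
        return table[np.ix_(days, minutes)]
    rows = [
        table[day, rng.choice(total_minutes, size=n_minutes, replace=False)]
        for day in days
    ]
    return np.vstack(rows)


rng = np.random.default_rng(0)
# Hypothetical table: 28 days x 480 minutes; each cell holds its minute index
# so that the sampled times of day are easy to inspect.
table = np.tile(np.arange(480.0), (28, 1))
shared = sample_minutes(table, n_days=5, n_minutes=10, same_times=True, rng=rng)
per_day = sample_minutes(table, n_days=5, n_minutes=10, same_times=False, rng=rng)
print(shared.shape, per_day.shape)  # both are (5, 10)
```

Repeating either scheme for increasing `n_days` and `n_minutes` and computing the repeatability on each sample would show how the estimate stabilizes with sample size.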

Plotting

Caterpillar Plots

ethnogramofclusters

TODOs:

Check the new area files: are there significant updates for any of them, and what are the differences? Do we need a refined getareafunction(fishkey,day)?

Owner

Login: UlrikeScherer
Kind: user

Repositories: 1
Profile: https://github.com/UlrikeScherer

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Beese"
  given-names: "Marvin"
  orcid: "https://orcid.org/0009-0008-7417-515X"
- family-names: "Stärk"
  given-names: "Luka"
  orcid: "https://orcid.org/0000-0002-1390-7432"
- family-names: "Scherer"
  given-names: "Ulrike"
  orcid: "https://orcid.org/0000-0003-1086-8527"
- family-names: "Ehlman"
  given-names: "Sean"
  orcid: "https://orcid.org/0000-0001-6981-9549"
- family-names: "Wolf"
  given-names: "Max"
  orcid: "https://orcid.org/0000-0001-8878-0504"
title: "Tool for behavioural clustering and analysis of trajectories."
version: 1.0.0
date-released: 2023-06-30
url: "https://github.com/UlrikeScherer/fishtoolbox"

Dependencies

environment.yml conda

conda 23.5.0
conda-content-trust
cudatoolkit 11.8
cudf 23.04
cuml 23.04
graph-tool 2.56
h5py 3.7.0
hdf5storage 0.1.19
moviepy 1.0.3
pip
python 3.10.11.*
scikit-image 0.20.0
scikit-learn 1.2.2
setuptools
statsmodels 0.14.0
wheel

requirements.txt pypi

cython ==0.29.34
envbash ==1.2.0
matplotlib ==3.3.4
numpy ==1.21.6
pandas ==1.2.4
plotly ==5.0.0
scikit_learn ==1.2.2
scipy ==1.8.0
