https://github.com/bionf/fas

FAS - Tool for Feature Architecture Similarity calculation

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    1 of 5 committers (20.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.5%) to scientific vocabulary

Keywords

feature-architectures protein-comparison protein-domains proteins

Keywords from Contributors

orthologs homology homology-search orthology-inference
Last synced: 7 months ago

Repository

Basic Info
Statistics
  • Stars: 6
  • Watchers: 5
  • Forks: 1
  • Open Issues: 2
  • Releases: 20
Topics
feature-architectures protein-comparison protein-domains proteins
Created over 6 years ago · Last pushed 9 months ago
Metadata Files
Readme License

README.md

FAS - Feature Architecture Similarity

FAS is a new release of the original FACT algorithm. It calculates the so-called FAS score, a measure of how similar the feature architectures of two proteins are, by combining the Multiplicity Score (MS) and the Positional Score (PS) from FACT. Unlike the original FACT, FAS can resolve feature architectures that contain overlapping features by searching for the best overlap-free path. This search can be done either exhaustively or in priority mode, a greedy approach. FAS also offers more options for weighting features.
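To illustrate the difference between an exhaustive search and a greedy one, here is a minimal Python sketch on a toy architecture. It is not FAS's implementation: the Feature tuple, the per-feature weight and the "highest total weight" objective are assumptions made for this example only, and FAS scores candidate paths with its own measure.

```python
from itertools import combinations
from typing import NamedTuple


class Feature(NamedTuple):
    """Hypothetical feature record: name, inclusive start/end, toy weight."""
    name: str
    start: int
    end: int
    weight: float


def overlaps(a, b):
    # Two inclusive intervals overlap if each starts before the other ends.
    return a.start <= b.end and b.start <= a.end


def total(features):
    return sum(f.weight for f in features)


def best_path_exhaustive(features):
    """Try every subset and keep the overlap-free one with the highest total weight."""
    best = ()
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            overlap_free = all(not overlaps(a, b) for a, b in combinations(subset, 2))
            if overlap_free and total(subset) > total(best):
                best = subset
    return sorted(best, key=lambda f: f.start)


def best_path_greedy(features):
    """Take features in order of decreasing weight, skipping any that overlap a chosen one."""
    chosen = []
    for feature in sorted(features, key=lambda f: f.weight, reverse=True):
        if all(not overlaps(feature, c) for c in chosen):
            chosen.append(feature)
    return sorted(chosen, key=lambda f: f.start)


toy = [
    Feature("A", 1, 50, 5.0),
    Feature("B", 40, 90, 6.0),
    Feature("C", 80, 120, 5.0),
]
print([f.name for f in best_path_exhaustive(toy)])  # ['A', 'C']
print([f.name for f in best_path_greedy(toy)])      # ['B']
```

In this toy case the greedy pass commits to B first and can no longer take A or C, while the exhaustive pass finds the better pair. The exhaustive version enumerates every subset, so it is only practical for small inputs, which is the usual reason to offer a greedy mode at all.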

Installation

FAS is provided as a Python package and is compatible with Python 3.

You can install FAS with pip:

python3 -m pip install greedyFAS

If you do not have admin rights and do not use a package system such as Anaconda to manage environments, you need to use the --user option (not recommended):

python3 -m pip install --user greedyFAS

Then add the following line to the end of your .bashrc or .bash_profile file and restart the current terminal to apply the change:

export PATH=$HOME/.local/bin:$PATH
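After installing, you may want to confirm that the FAS commands are reachable from your PATH. The sketch below uses only the Python standard library and checks the command names mentioned in this README; the helper name missing_commands is made up for this example.

```python
import shutil

# Command names as they appear in this README.
FAS_COMMANDS = ["fas.setup", "fas.doAnno", "fas.parseAnno", "fas.run"]


def missing_commands(commands=FAS_COMMANDS):
    """Return the commands that cannot be found on the current PATH."""
    return [name for name in commands if shutil.which(name) is None]


missing = missing_commands()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All FAS commands were found on PATH.")
```

If commands are reported as missing after a --user install, the PATH line above is the first thing to check.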

Usage

Download and install annotation tools

Before using FAS, some annotation tools and databases need to be installed. FAS' standard databases/annotation tools are PFAM, SMART, COILS, TMHMM 2.0c and SignalP 4.1g, plus two optional tools, fLPS and SEG. To get these tools and create a configuration file for FAS, please use the setupFAS function:

fas.setup -t /directory/where/you/want/to/save/annotation/tools

Inside the output directory you will find a file called annoTools.txt that lists all installed annotation tools. If you wish to exclude any of them from the annotation process, just remove the unneeded tools from that file.

Please read our wiki page of setupFAS for other use-cases, such as how to use your old annotation tools with the new FAS, etc.

NOTE: we provide compiled code only for PFAM, COILS and SEG. fLPS will be automatically downloaded and installed. For SMART, you need to download it from EMBLEM and give the path to fas.setup. For TMHMM and SignalP, you can decide whether to include those two tools in the annotation step (recommended) or ignore them. To use TMHMM version 2.0c and SignalP version 4.1g, you need to request a license from the authors at https://services.healthtech.dtu.dk and save the downloaded files in the same directory. FAS will do the rest for you ;-)

NOTE2: SignalP 5.0b is not supported yet!

We suggest you test the annotation tools by running this command:

fas.doAnno -i test_annofas.fa -o test_output

test_annofas.fa is a demo multi-FASTA file, which is saved in the installed greedyFAS directory.

Perform feature annotation

If you only want to annotate your protein sequences without calculating the FAS scores, you can use the doAnno function.

fas.doAnno --fasta your_proteins.fa --outPath /annotation/path/

The annotation output (your_proteins.json by default) will be saved in /annotation/path/.
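If you have several FASTA files to annotate, a small Python wrapper can build the command for each of them. This is only a sketch: the helper names are made up for this example, it uses just the two doAnno options shown above, and it assumes the default output name is the FASTA file name with its extension replaced by .json, as in the your_proteins.json example.

```python
from pathlib import Path


def doanno_command(fasta, out_path):
    """Build the fas.doAnno call shown above as an argument list."""
    return ["fas.doAnno", "--fasta", str(fasta), "--outPath", str(out_path)]


def expected_annotation_file(fasta, out_path):
    """Assumed default output location: <outPath>/<FASTA name without extension>.json."""
    return Path(out_path) / (Path(fasta).stem + ".json")


for fasta in ["your_proteins.fa", "more_proteins.fa"]:
    cmd = doanno_command(fasta, "/annotation/path/")
    print(" ".join(cmd), "->", expected_annotation_file(fasta, "/annotation/path/"))
```

To actually run a command, pass the list to subprocess.run(cmd, check=True) from the standard library.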

Alternatively, you can do the annotation using InterProScan and use the parseAnno function to convert InterProScan's tsv output into json format for use with FAS.

fas.parseAnno -i INPUT.tsv -o /annotation/path/INPUT.json -t <tool_name> -f <feature columns> ...

Please check the usage of parseAnno for more info (using fas.parseAnno -h).

Compare protein feature architectures

The main purpose of FAS is to calculate the similarity score between two given proteins (or two lists of proteins). This can be done using the run function.

fas.run -s seed.fa -q query.fa -a /annotation/path/ -o /output/path/

If the annotations of seed and query protein(s) already exist in /annotation/path/ (seed.json and query.json, respectively), run will use these annotations for calculating the FAS scores. Otherwise, it will first annotate the proteins and then compare the feature architectures of those two protein sets.
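Because annotation can take a while, it can be handy to know in advance whether fas.run will find existing annotations. The following Python sketch checks for the json files described above and builds the fas.run call; the helper names are made up for this example, and the naming rule (FASTA name with the extension replaced by .json) is inferred from the seed.json and query.json example rather than taken from the FAS code.

```python
from pathlib import Path


def missing_annotations(anno_dir, seed_fasta, query_fasta):
    """List the annotation files that are not yet present in anno_dir."""
    anno_dir = Path(anno_dir)
    wanted = [Path(seed_fasta).stem + ".json", Path(query_fasta).stem + ".json"]
    return [name for name in wanted if not (anno_dir / name).is_file()]


def run_command(seed_fasta, query_fasta, anno_dir, out_dir):
    """Build the fas.run call shown above as an argument list."""
    return ["fas.run", "-s", str(seed_fasta), "-q", str(query_fasta),
            "-a", str(anno_dir), "-o", str(out_dir)]


missing = missing_annotations("/annotation/path/", "seed.fa", "query.fa")
if missing:
    print("fas.run would annotate first; missing:", ", ".join(missing))
print(" ".join(run_command("seed.fa", "query.fa", "/annotation/path/", "/output/path/")))
```

The command list can be passed to subprocess.run(cmd, check=True) to start the comparison.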

Additional Information

A thorough guide to all FAS commands and options can be found at our WIKI page.

How-To Cite

Julian Dosch, Holger Bergmann, Vinh Tran, Ingo Ebersberger, FAS: assessing the similarity between proteins using multi-layered feature architectures, Bioinformatics, Volume 39, Issue 5, May 2023, btad226, https://doi.org/10.1093/bioinformatics/btad226

Contact

Julian Dosch dosch@bio.uni-frankfurt.de

Ingo Ebersberger ebersberger@bio.uni-frankfurt.de

Owner

  • Name: BIONF
  • Login: BIONF
  • Kind: organization

GitHub Events

Total
  • Issues event: 1
  • Watch event: 2
  • Issue comment event: 1
  • Push event: 18
  • Create event: 5
Last Year
  • Issues event: 1
  • Watch event: 2
  • Issue comment event: 1
  • Push event: 18
  • Create event: 5

Committers

Last synced: about 3 years ago

All Time
  • Total Commits: 396
  • Total Committers: 5
  • Avg Commits per committer: 79.2
  • Development Distribution Score (DDS): 0.52
Top Committers
Name           Email           Commits
julian         j****h@w****e   190
trvinh         t****h@g****m   183
Julian Dosch   4****o@u****m   15
trvinh         t****n@b****e   7
Vinh Tran      v****h@M****l   1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 8
  • Total pull requests: 18
  • Average time to close issues: 5 months
  • Average time to close pull requests: 6 minutes
  • Total issue authors: 4
  • Total pull request authors: 2
  • Average comments per issue: 2.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 18
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • trvinh (4)
  • JuRuDo (2)
  • leuaqut (1)
  • LucyJimenez (1)
Pull Request Authors
  • trvinh (10)
  • JuRuDo (8)
Top Labels
Issue Labels
feature request (3) enhancement (2)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 388 last-month
  • Total dependent packages: 1
  • Total dependent repositories: 1
  • Total versions: 104
  • Total maintainers: 1
pypi.org: greedyfas

A tool to compare protein feature architectures

  • Versions: 104
  • Dependent Packages: 1
  • Dependent Repositories: 1
  • Downloads: 388 Last month
Rankings
Dependent packages count: 4.8%
Downloads: 9.1%
Average: 15.9%
Dependent repos count: 21.6%
Stargazers count: 21.6%
Forks count: 22.7%
Maintainers (1)
Last synced: 8 months ago

Dependencies

setup.py pypi
  • biopython *
  • gnureadline *
  • graphviz *
  • tqdm *
.github/workflows/github_build.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • casperdcl/deploy-pypi v2 composite