ffdb

nosql flat-file database and document storage

https://github.com/g-insana/ffdb.py

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.2%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: g-insana
  • License: agpl-3.0
  • Language: Python
  • Default Branch: master
  • Size: 191 KB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created almost 6 years ago · Last pushed about 1 year ago
Metadata Files
Readme Funding License Citation

README.md

ffdb.py - Flat-File DB: nosql single file database and document storage

The module ffdb and the associated set of utility scripts (indexer, extractor, filestorer, remover and merger) allow the creation, maintenance and usage of a database and document storage which employs a single file acting as a container for all the data.

This file can be distributed in several copies, made locally available or placed online to be accessed remotely (e.g. over ftp or www).

The resulting database is hence accessible everywhere, without the need for any complex installation and in particular without requiring any service to be running continuously.
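Retrieving a single entry from a remotely hosted container file comes down to fetching one byte range. The following is a minimal sketch, assuming that the server honours HTTP Range requests and that the entry's offset and length are already known from an index. The function names and the example position are hypothetical and are not part of ffdb's API.

```python
from urllib.request import Request, urlopen


def range_request(url, offset, length):
    """Build an HTTP request for `length` bytes starting at `offset`."""
    # HTTP byte ranges are inclusive on both ends.
    last_byte = offset + length - 1
    return Request(url, headers={"Range": f"bytes={offset}-{last_byte}"})


def fetch_entry(url, offset, length):
    """Download a single entry; performs network access when called."""
    with urlopen(range_request(url, offset, length)) as response:
        return response.read()


# Hypothetical position of one entry inside the remote container file.
request = range_request("http://remote.host/addressbook", 1024, 200)
print(request.get_header("Range"))
```

Only the requested bytes travel over the network, which is why no server-side service beyond a plain file host would be needed under these assumptions.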

Simple 1-2-3 procedure:

  1. Index your flat-file and/or choose documents to store
  2. Place the resulting file wherever you want (locally or online, e.g. ftp or www)
  3. Retrieve the entries or the documents from any device and location

FFDB can index entries according to any pattern, creating indexes of identifiers that are either unique to each entry or shared across many. For a biological database, this could for example allow retrieval of all entries belonging to a certain species.
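To illustrate the idea of pattern-based indexing, here is a minimal sketch that maps every captured identifier to the byte position of the entries containing it. It assumes a toy flat-file whose entries end with a `//` line; `split_entries`, `build_index` and the sample data are hypothetical, and the on-disk index format used by ffdb is not shown here.

```python
import re
from collections import defaultdict

# Toy flat-file: three entries, two of them sharing the same species line.
SAMPLE = (
    b"AC P1;\nOS Homo sapiens\n//\n"
    b"AC P2;\nOS Mus musculus\n//\n"
    b"AC P3;\nOS Homo sapiens\n//\n"
)


def split_entries(data, terminator=b"//\n"):
    """Yield (offset, entry_bytes) pairs, keeping the terminator with its entry."""
    offset = 0
    while offset < len(data):
        end = data.find(terminator, offset)
        end = len(data) if end == -1 else end + len(terminator)
        yield offset, data[offset:end]
        offset = end


def build_index(data, patterns):
    """Map each captured identifier to a list of (offset, length) positions."""
    index = defaultdict(list)
    compiled = [re.compile(pattern, re.MULTILINE) for pattern in patterns]
    for offset, entry in split_entries(data):
        for regex in compiled:
            for identifier in regex.findall(entry):
                index[identifier.decode()].append((offset, len(entry)))
    return dict(index)


# One pattern yields unique identifiers, the other identifiers shared by many.
index = build_index(SAMPLE, [rb"^AC (.+?);", rb"^OS (.+)$"])
for offset, length in index["Homo sapiens"]:
    print(SAMPLE[offset:offset + length].decode(), end="")
```

Looking up `"P2"` returns a single position, while `"Homo sapiens"` returns the positions of both entries for that species.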

When requesting entries, the user can choose to retrieve all the entries corresponding to the given identifier, or only the first or only the last one. This makes it possible to store a version history of the entries by continuously appending new entry versions to the file.
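The version-history idea can be sketched as follows, assuming an in-memory index that maps an identifier to a list of positions in append order. All names here (`append_entry`, `select_positions`, `read_entries`) are hypothetical and only illustrate the first/last/all selection; they are not ffdb functions.

```python
container = bytearray()  # stands in for the single container file
index = {}               # identifier -> list of (offset, length), oldest first


def append_entry(identifier, text):
    """Append a new version of an entry and record where it was written."""
    index.setdefault(identifier, []).append((len(container), len(text)))
    container.extend(text)


def select_positions(identifier, mode="all"):
    """Return the positions for an identifier: 'all', 'first' or 'last'."""
    positions = index.get(identifier, [])
    if mode == "first":
        return positions[:1]
    if mode == "last":
        return positions[-1:]
    if mode == "all":
        return list(positions)
    raise ValueError(f"unknown mode: {mode}")


def read_entries(identifier, mode="all"):
    """Read the selected versions of an entry back from the container."""
    return [bytes(container[offset:offset + length])
            for offset, length in select_positions(identifier, mode)]


append_entry("john", b"name: John\nemail: john@abc.de\n")
append_entry("john", b"name: John\nemail: john@xyz.org\n")  # newer version

print(read_entries("john", "last")[0].decode(), end="")
```

Because older versions are never overwritten, requesting `"all"` returns the full history, while `"last"` returns only the most recent version.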

When indexing entries of a flat-file or storing documents inside the database, the user can optionally encrypt (AES) or compress (ZLIB) the entries/files.
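As a rough illustration of per-entry compression, the sketch below compresses each entry on its own with the standard `zlib` module and records the position of the compressed bytes, so that a single entry can be restored without touching the others. The layout and helper names are assumptions made for this example, and encryption is left out.

```python
import zlib


def pack_entry(entry, level=6):
    """Compress one entry independently of all the others."""
    return zlib.compress(entry, level)


def unpack_entry(blob):
    """Restore one entry from its compressed bytes."""
    return zlib.decompress(blob)


entries = [
    b"name: John\nemail: john@abc.de\n",
    b"name: Mary\nemail: mary@abc.de\n",
]

container = bytearray()
positions = []  # (offset, length) of each compressed entry
for entry in entries:
    packed = pack_entry(entry)
    positions.append((len(container), len(packed)))
    container += packed

# Restore only the second entry, using its recorded position.
offset, length = positions[1]
restored = unpack_entry(bytes(container[offset:offset + length]))
print(restored.decode(), end="")
```

The positions refer to the compressed bytes, so an index built this way has to be created after compression, not before.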

Encrypted entries share a single master password but each has a unique IV (Initialization vector). No plaintext information is sent or received, thus allowing secure access to encrypted entries of the database over insecure channels.

With an external utility (gztool), the container file can even be gzip-compressed as a whole while still allowing retrieval of single entries. This is useful when the file also needs to be distributed and used in its entirety, rather than only through ffdb.

If the container file is compressed with bgzip, then no external utility is needed.
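The reason a block-compressed file can be read at random positions is that each block is an independent gzip member. The sketch below is a simplified stand-in for that principle, using only the standard library: it writes one gzip member per entry and later decompresses a single member starting from its recorded offset. It does not reproduce the actual BGZF block layout used by bgzip, nor ffdb's own handling of such files.

```python
import gzip
import zlib


def build_blocked_gzip(entries):
    """Concatenate one gzip member per entry and record each member's offset."""
    blob = bytearray()
    offsets = []
    for entry in entries:
        offsets.append(len(blob))
        blob += gzip.compress(entry)
    return bytes(blob), offsets


def read_member(blob, offset):
    """Decompress only the gzip member that starts at `offset`."""
    # wbits=31 tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(wbits=31)
    # Decompression stops at the end of this member; later bytes are ignored.
    return decompressor.decompress(blob[offset:])


entries = [b"entry one\n", b"entry two\n", b"entry three\n"]
blob, offsets = build_blocked_gzip(entries)

print(read_member(blob, offsets[2]).decode(), end="")
```

The concatenated file remains a valid gzip stream as a whole, so it can still be decompressed in its entirety by ordinary tools.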

Documentation

Apart from the present README, please refer to the usage page for each of the utility scripts:

  • indexer
  • extractor
  • remover
  • merger

Download and installation

ffdb is pure Python code. It has no platform-specific dependencies and should thus work on all platforms. It requires the packages requests, pycryptodomex and sortedcontainers. The latest version of ffdb can be installed by typing either:

    pip3 install -U ffdb

(from the Python Package Index), or:

    pip3 install git+https://github.com/g-insana/ffdb.py.git

(from GitHub).

The utility scripts should get installed for you by pip (if installed as --user you may need to add e.g. ~/.local/bin/ to your $PATH). Alternatively, you can directly download those you need:

    curl -LO https://github.com/g-insana/ffdb.py/raw/master/scripts/indexer.py
    curl -LO https://github.com/g-insana/ffdb.py/raw/master/scripts/extractor.py
    curl -LO https://github.com/g-insana/ffdb.py/raw/master/scripts/remover.py
    curl -LO https://github.com/g-insana/ffdb.py/raw/master/scripts/merger.py

Quickstart: some examples

```bash
# index a flat-file:
$ indexer.py -i '^name: (.+)$' 'email: (.+)$' -f addressbook >addressbook.idx
$ indexer.py -i '^AC (.+?);' -f uniprot.dat -e '^//$' >uniprot.pac

# store all the files contained in a directory:
$ filestorer.py -f myphotos.db -d iceland_photos/

# add some more files
$ filestorer.py -f myphotos.db -s sunset.jpg icecream.jpg

# extract a series of entries from the db
$ extractor.py -f uniprot.dat -i uniprot.pac -l sars_proteins

# extract an entry from remote location
$ extractor.py -f http://remote.host/addressbook -i addressbook.idx -s john@abc.de

# extract a file from the db
$ extractor.py -f myphotos.db -s sunset.jpg >sunset.jpg

# merge two indexed flat-files (mydb will incorporate newentries)
$ merger.py -f mydb -i mydb.idx -e newentries -n newentries.idx -d

# remove a series of entries from the db
$ remover.py -f addressbook -i addressbook.idx -l disagreedwithme.list
```

Copyright

ffdb is licensed under the GNU Affero General Public License.

(c) Copyright Giuseppe Insana, 2020-

Owner

  • Name: Giuseppe Insana
  • Login: g-insana
  • Kind: user
  • Location: Cambridge, UK
  • Company: EMBL-EBI @embl-ebi

Data scientist, biology and linguistic expert, business consultant.

Citation (CITATION.cff)

cff-version: 1.1.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Insana"
  given-names: "Giuseppe"
  orcid: "https://orcid.org/0000-0002-8186-1026"
title: "nosql flat-file database and document storage"
version: v2.5.6
date-released: 2024-12-08
url: "https://github.com/g-insana/ffdb.py"

GitHub Events

Total
  • Release event: 4
  • Push event: 7
  • Create event: 4
Last Year
  • Release event: 4
  • Push event: 7
  • Create event: 4

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 73
  • Total Committers: 1
  • Avg Commits per committer: 73.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 8
  • Committers: 1
  • Avg Commits per committer: 8.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Giuseppe Insana 5****a 73

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 124 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 20
  • Total maintainers: 1
pypi.org: ffdb

ffdb

  • Versions: 20
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 124 Last month
Rankings
Dependent packages count: 10.1%
Dependent repos count: 21.5%
Average: 27.9%
Forks count: 29.8%
Stargazers count: 31.9%
Downloads: 46.3%
Maintainers (1)
Last synced: 7 months ago

Dependencies

requirements.txt pypi
  • pycryptodomex >=3.9.7
  • requests >=2.18.4
  • sortedcontainers >=2.1.0
  • tqdm >=4.28.1
setup.py pypi
  • pycryptodomex *
  • requests *
  • sortedcontainers *
  • tqdm *