Recent Releases of pyscenic

pyscenic - 0.12.1

Updates:

  • Add support for running arboreto_with_multiprocessing.py with spawn instead of fork as the multiprocessing.Pool start method.
  • Use ravel instead of flatten to avoid unnecessary memory copy in aucell
  • Update the Docker image file and add a separate Dockerfile for pySCENIC with scanpy.
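
The ravel/flatten item above comes down to view versus copy semantics in NumPy. A minimal sketch, assuming a small contiguous array as a hypothetical stand-in for the much larger matrices used in aucell (the variable names are illustrative, not taken from pySCENIC):

```python
import numpy as np

# Hypothetical stand-in for a rankings matrix (assumption: C-contiguous layout).
rankings = np.arange(12, dtype=np.int32).reshape(3, 4)

flat_view = rankings.ravel()    # returns a view when no copy is required
flat_copy = rankings.flatten()  # always allocates a new array

# shares_memory reports whether two arrays overlap in memory.
shares_view = bool(np.shares_memory(rankings, flat_view))
shares_copy = bool(np.shares_memory(rankings, flat_copy))
print(shares_view, shares_copy)
```

ravel only avoids the copy when the input layout allows it; for non-contiguous input it falls back to copying, so the saving depends on how the array was produced.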

- Python
Published by ghuls over 3 years ago

pyscenic -

Updates:

  • Only databases in Feather v2 format are supported now (ctxcore >= 0.2), which allows the use of recent versions of pyarrow (>=8.0.0) instead of very old ones (<0.17). Databases in the new format can be downloaded from https://resources.aertslab.org/cistarget/databases/ and end with *.genes_vs_motifs.rankings.feather or *.genes_vs_tracks.rankings.feather.
  • Support clustered motif databases.
  • Use custom multiprocessing instead of dask, by default.
  • Docker image uses python 3.10 and contains only the pySCENIC dependencies needed for CLI usage.
  • Remove unneeded scripts and notebooks for unused/deprecated database formats.

- Python
Published by ghuls over 3 years ago

pyscenic - 0.11.2

Major changes:

  • Split some core cisTarget functions out into a separate repository, ctxcore. This is now a required package for pySCENIC.
  • Documentation updates

- Python
Published by cflerin almost 5 years ago

pyscenic - 0.11.1

  • Fix bug in motif url construction (#275)
  • Fix for export2loom with sparse dataframe (#278)
  • Fix sklearn t-SNE import (#285)
  • Updates to Docker image (expose port 8787 for Dask dashboard)

- Python
Published by cflerin almost 5 years ago

pyscenic - 0.11.0

Major features:

  • Updated Arboreto release (GRN inference step) includes:

    • Support for sparse matrices (using the --sparse flag in pyscenic grn, or passing a sparse matrix to grnboost2/genie3).
    • Fixes to avoid dask metadata mismatch error
  • Updated cisTarget:

    • Fix for metadata mismatch in ctx prune2df step
    • Support for databases in Apache Parquet format
    • Faster loading from feather databases
    • Bugfix: loading genes from a database (previously missing the last gene name in the database)
  • Support for Anndata input and output

  • Package updates:

    • Upgrade to newer pandas version
    • Upgrade to newer numba version
    • Upgrade to newer versions of dask, distributed
  • Input checks and more descriptive error messages.

    • Check that regulons loaded are not empty.
  • Bugfixes:

    • In the regulons output from the cisTarget step, the gene weights were incorrectly assigned to their respective target genes (PR #254).
    • Motif url construction fixed when running ctx without pruning
    • Compression of intermediate files in the CLI steps
    • Handle loom files with non-standard gene/cell attribute names
    • Reformat the genesig gmt input/output
    • Fix AUCell output to loom with non-standard loom attributes

- Python
Published by cflerin about 5 years ago

pyscenic - 0.10.4

Updates:

  • Included new (optional) CLI option to add correlation information to the GRN adjacencies file. This can be called with pyscenic add_cor. (vib-singlecell-nf/vsn-pipelines/issues/254)
  • The correlation calculation is subsequently skipped if using this adjacencies + correlations file as the input into pyscenic ctx.

- Python
Published by cflerin about 5 years ago

pyscenic - 0.10.3

Updates:

  • Fix bug in motif url construction (#158)
  • Integrate arboreto multiprocessing script into pySCENIC CLI
  • cisTarget step: Check for modules with zero db overlap and skip them (#158, #177, #132, #85)
  • Bugfix in TF-gene correlation calculation. Quit with error if there is a mismatch between the genes present in the GRN and the expression matrix (#103, #149)
  • Error message when regulons file is empty (#133)

- Python
Published by cflerin over 5 years ago

pyscenic - 0.10.2

Updates:

  • Bugfix for CLI grn step

- Python
Published by cflerin over 5 years ago

pyscenic - 0.10.1

Updates:

  • CLI: file compression (optionally) enabled for intermediate files for the major steps: grn (adjacencies matrix), ctx (regulons), and aucell (auc matrix). Compression is used when the file name argument has a .gz ending.
  • Restrict packages (pyarrow, pandas) for compatibility.
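
The extension-based compression described above can be sketched with the standard library. This is an illustration of the idea only, not the pySCENIC CLI code; the helper names open_for_output and is_gzip_file are hypothetical:

```python
import gzip
import os
import tempfile


def open_for_output(path):
    """Open path for text writing, gzip-compressed if the name ends in .gz."""
    if path.endswith(".gz"):
        return gzip.open(path, "wt", encoding="utf-8")
    return open(path, "w", encoding="utf-8")


def is_gzip_file(path):
    """Check the two-byte gzip magic number at the start of the file."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"


with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "adjacencies.tsv")
    packed = os.path.join(tmp, "adjacencies.tsv.gz")
    for target in (plain, packed):
        with open_for_output(target) as out:
            out.write("TF\ttarget\timportance\n")
    plain_is_gzip = is_gzip_file(plain)
    packed_is_gzip = is_gzip_file(packed)

print(plain_is_gzip, packed_is_gzip)
```

Keying the behaviour on the file name keeps the command line unchanged: the caller opts in to compression simply by choosing an output name.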

- Python
Published by cflerin almost 6 years ago

pyscenic -

Updates:

  • Added a helper script scripts/arboreto_with_multiprocessing.py that runs the Arboreto GRN algorithms (GRNBoost2, GENIE3) without Dask for compatibility.
  • Initial support for the use of sparse expression matrices (currently applies only to the GRN step in the CLI). Using sparse matrices with the GRN step requires a patch to Arboreto (tmoerman/arboreto#20).
  • AUCell uses a random sampling of the expression matrix to break ties in the ranking step. The CLI parameter --seed or the aucell function parameter seed sets a fixed seed for this step. The regulon thresholds also depend on random sampling in the binarization step (bimodality test: np.random.uniform), and the seed parameters apply here as well.
  • Fixed typo in regulon threshold labeling.
  • Security patch: Bump bleach from 3.1.0 to 3.1.1.
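
Why a seed matters for the ranking step can be shown with a small sketch. This is not the AUCell implementation; it assumes a hypothetical function rank_with_random_ties that ranks genes by expression and lets a seeded shuffle decide the order of tied genes:

```python
import random


def rank_with_random_ties(expression, seed=None):
    """Rank genes from highest to lowest expression, breaking ties randomly.

    expression maps gene name to expression value (hypothetical input format).
    """
    rng = random.Random(seed)
    genes = list(expression)
    rng.shuffle(genes)  # the shuffled order decides how ties are resolved
    # list.sort is stable, so tied genes keep their shuffled relative order.
    genes.sort(key=lambda gene: expression[gene], reverse=True)
    return genes


expression = {"A": 5.0, "B": 0.0, "C": 0.0, "D": 0.0, "E": 2.0}
first_run = rank_with_random_ties(expression, seed=42)
second_run = rank_with_random_ties(expression, seed=42)
print(first_run == second_run)
```

With a fixed seed, repeated runs give the same ranking; without one, the order of the tied genes B, C and D can differ between runs, which is what makes downstream scores vary slightly.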

- Python
Published by bramvds almost 6 years ago

pyscenic -

BugFixes: #100 and #51.

- Python
Published by bramvds over 6 years ago

pyscenic -

In the modules_from_adjacencies function, the default value of rho_mask_dropouts is changed to False. This now matches the behavior of the R version of SCENIC. Since this is likely to change the final output regulons slightly, warnings have been added to the logging system. The CLI version has an additional option to turn dropout masking back on (--mask_dropouts).
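
The effect of dropout masking on a TF-target correlation can be illustrated with a short sketch. It assumes that masking means excluding cells in which either gene has zero expression; the function names are hypothetical and this is not the pySCENIC code:

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation; returns NaN when it is undefined."""
    n = len(xs)
    if n < 2:
        return math.nan
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return math.nan
    return cov / math.sqrt(var_x * var_y)


def tf_target_correlation(tf_expr, target_expr, mask_dropouts=False):
    """Correlate a TF with a target gene across cells.

    Assumption: with mask_dropouts=True, cells where either value is 0 are skipped.
    """
    pairs = list(zip(tf_expr, target_expr))
    if mask_dropouts:
        pairs = [(t, g) for t, g in pairs if t != 0 and g != 0]
    xs = [t for t, _ in pairs]
    ys = [g for _, g in pairs]
    return pearson(xs, ys)


tf_expr = [0, 0, 1, 2, 3, 4]
target_expr = [3, 0, 4, 3, 2, 1]
unmasked = tf_target_correlation(tf_expr, target_expr)
masked = tf_target_correlation(tf_expr, target_expr, mask_dropouts=True)
print(unmasked, masked)
```

On this toy data the two settings give clearly different correlations, which is why changing the default can alter which targets end up in a module.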

- Python
Published by bramvds over 6 years ago

pyscenic -

In the modules_from_adjacencies function, the default value of rho_mask_dropouts is changed to False. This now matches the behavior of the R version of SCENIC. Since this is likely to change the final output regulons slightly, warnings have been added to the logging system. The CLI version has an additional option to turn dropout masking back on (--mask_dropouts).

- Python
Published by bramvds over 6 years ago

pyscenic -

BugFix: minor bugfix in calculation of Regulon Specificity Score (RSS).

- Python
Published by bramvds over 6 years ago

pyscenic -

New features: multiprocessing support for the binarization function.

- Python
Published by bramvds over 6 years ago

pyscenic -

  • BugFix: issue #87 - Binarization fails with regulon of few genes.
  • BugFix: Always strip off extensions of filenames with os.path.splitext()[0] instead of using split(".")[0], which wouldn't work properly for filenames which contain embedded dots.
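
The difference between the two approaches in the second fix is easy to see on a file name with embedded dots. The file name below is a made-up example:

```python
import os.path

filename = "sample.v2.expression.loom"  # hypothetical file name with embedded dots

naive_stem = filename.split(".")[0]         # cuts at the first dot
robust_stem = os.path.splitext(filename)[0]  # strips only the final extension

print(naive_stem)   # sample
print(robust_stem)  # sample.v2.expression
```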

- Python
Published by bramvds over 6 years ago

pyscenic -

  • New functionality: Support for Regulon Specificity Score (RSS) and plotting functions for AUCell distributions and RSS values for a cell type.
  • Bug Fixes: #70 and #81

- Python
Published by bramvds over 6 years ago

pyscenic -

  • New functionality: new algorithm to define binarization threshold on AUC values of a regulon. Hartigan's Dip Test (HDT) is used to decide if the distribution of AUC values deviates from unimodality. If this is the case, a bimodal gaussian mixture model is fit to capture the two modes of the distribution. The trough between these two modes is the threshold and is derived by minimization on the kernel density smoothed histogram.

- Python
Published by bramvds over 6 years ago

pyscenic -

BugFix: CLI grn - output of list of adjacencies is TSV or CSV (default) based on provided file extension.
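
Choosing the delimiter from the file extension, with CSV as the default, might look like the sketch below. The helper name separator_for is hypothetical and the CLI may implement this differently:

```python
import os.path


def separator_for(path):
    """Return a tab for .tsv files and a comma (the default) for anything else."""
    extension = os.path.splitext(path)[1].lower()
    return "\t" if extension == ".tsv" else ","


print(repr(separator_for("adjacencies.tsv")))
print(repr(separator_for("adjacencies.csv")))
```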

- Python
Published by bramvds over 6 years ago

pyscenic -

New functionality: support for multiple embeddings when exporting to a loom file.

- Python
Published by bramvds almost 7 years ago

pyscenic -

New functionality: minimal integration of SCENIC with scanpy (https://scanpy.readthedocs.io/en/latest/). The approach is fully explained in the Jupyter notebook: https://github.com/aertslab/pySCENIC/blob/master/notebooks/pySCENIC%20-%20Integration%20with%20scanpy.ipynb

- Python
Published by bramvds almost 7 years ago

pyscenic -

  • BugFix: Changed load_adjacencies to use fixed column types to avoid type errors (e.g. 'nan' as gene name in D. melanogaster incorrectly got interpreted as a floating point number).
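
The kind of type error this fix addresses can be reproduced without pandas. The sketch below is an assumption-laden illustration, not the load_adjacencies code: the column names, the target gene name and the infer_type helper are all hypothetical.

```python
import csv
import io

# One adjacency row whose TF is the gene named "nan" (illustrative data).
raw = "TF\ttarget\timportance\nnan\tsmall\t1.5\n"


def infer_type(value):
    """Naive type inference: anything that parses as a float becomes a float."""
    try:
        return float(value)
    except ValueError:
        return value


# Fixed column types: gene names stay strings, only importance is numeric.
COLUMN_TYPES = {"TF": str, "target": str, "importance": float}

row = next(csv.DictReader(io.StringIO(raw), delimiter="\t"))
inferred = {name: infer_type(value) for name, value in row.items()}
typed = {name: COLUMN_TYPES[name](value) for name, value in row.items()}

print(type(inferred["TF"]).__name__, type(typed["TF"]).__name__)
```

Declaring the types up front removes the guesswork: the gene name "nan" is kept as text instead of being read as a missing numeric value.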

- Python
Published by bramvds almost 7 years ago

pyscenic -

  • Added new Singularity file incorporating version tags.
  • Small correction in documentation on Docker.
  • BugFix: Incorrect warning message about number of genes being present in expression matrix when calculating recovery (based on #57).

- Python
Published by bramvds almost 7 years ago

pyscenic -

Pinned the versions of pandas and dask because of these GitHub issues:

  • https://github.com/aertslab/pySCENIC/issues/51
  • https://github.com/aertslab/pySCENIC/issues/45

Fixed an issue where the regulon assignment matrix was not being added to row attributes (loom output).

Changed lp.connect to read-only mode when importing expression matrix from a loom file.

- Python
Published by bramvds about 7 years ago

pyscenic -

This release fixes an issue where the container images weren't properly updating after a new pySCENIC version was released.

- Python
Published by bramvds about 7 years ago

pyscenic -

  • CLI: ability to choose algorithm used for gene regulatory network reconstruction.
  • CLI: When loom file requested as output of aucell step, the regulon AUC values are appended as column attributes of a newly created loom file.

- Python
Published by bramvds about 7 years ago

pyscenic -

Minor updates for compatibility with SCope.

- Python
Published by bramvds about 7 years ago

pyscenic -

  • CLI: BugFix for issue https://github.com/aertslab/pySCENIC/issues/39

- Python
Published by bramvds about 7 years ago

pyscenic -

  • Command Line Interface (CLI): support for more file formats (e.g. loom and GMT format).
  • Command Line Interface (CLI): new 'csv2loom' utility to convert an expression matrix as csv to loom.
  • Differentiation of dependencies for CLI and Jupyter notebook usage: matplotlib dependency removed from the former.

- Python
Published by bramvds about 7 years ago

pyscenic -

Removed support for direct pruning of modules to regulons without the enriched motif table as intermediate.

- Python
Published by bramvds about 7 years ago

pyscenic -

  • pySCENIC requires python 3.6 because of loompy (http://linnarssonlab.org/loompy/installation/index.html)
  • Updated support for loom file export (SCope).

- Python
Published by bramvds about 7 years ago

pyscenic -

BugFix: TypeError when invoking pyscenic from the command line.

- Python
Published by bramvds about 7 years ago

pyscenic -

BugFix: Signature of Loompy.create() changed which resulted in an error when invoking export2loom.

- Python
Published by bramvds about 7 years ago

pyscenic -

Avoid degradation of performance by better thread allocation (i.e. better cooperation between dask framework and numpy - via MKL extensions or OpenBLAS).

- Python
Published by bramvds about 7 years ago

pyscenic -

Clean up of package dependencies.

- Python
Published by bramvds over 7 years ago

pyscenic -

Fixed installation issue.

- Python
Published by bramvds over 7 years ago

pyscenic -

BugFix in command line interface: change functionality of all_modules option.

- Python
Published by bramvds over 7 years ago

pyscenic -

  • Overall more consistent CLI.
  • Possibility for the user to keep negative regulons in the analysis via the CLI.

- Python
Published by bramvds over 7 years ago

pyscenic -

  • support for dask 0.18+
  • modules_from_adjacencies: by default keep only modules in which the expression of the TF is positively correlated with its target genes.
  • Command Line Interface: support for JSON-serialized regulon output in cisTarget/prune step.

- Python
Published by bramvds over 7 years ago

pyscenic -

Loom export functionality (for SCope): sequence logo is now part of a regulon's metadata.

- Python
Published by bramvds over 7 years ago

pyscenic -

  • Changed dependency: the arboretum package was renamed to arboreto. Changed the dependency requirements accordingly.
  • Support for export to loom format to be able to explore cellular scatter plots and activity of regulons/gene signatures in the SCope tool.
  • Support for exporting selected regulons to Cytoscape to visualise Gene Regulatory Networks.

- Python
Published by bramvds over 7 years ago

pyscenic -

BugFix for NameError: free variable 'module_chunksize' referenced before assignment in enclosing scope.

- Python
Published by bramvds almost 8 years ago

pyscenic -

  • Faster implementation of the sole remaining speed bottleneck, i.e. modules_from_adjacencies: from 23 minutes to less than 3 minutes on the benchmark.

- Python
Published by bramvds almost 8 years ago

pyscenic -

BugFix: AUCell - When there is a complete mismatch between a gene signature/regulon and the genes in the expression matrix, AUCell does not abort anymore with an assertion error but warns the end-user and continues with calculations for the other supplied regulons.

- Python
Published by bramvds almost 8 years ago

pyscenic -

  • Several optimisations for computing on clusters using dask.distributed.
  • Installation: version of pandas should be at least 0.20.1 (df2regulons uses groupby with an index column) - this dependency is enforced.

- Python
Published by bramvds almost 8 years ago

pyscenic -

  • Easier and more robust Jupyter notebook API:
    • Removed nomenclature attribute from all functions.
    • Changed name of parameter num_cores to num_workers for aucell function to make it more consistent with pruning for cis-regulatory footprints (prune function).
    • In modules_from_adjacencies: the expression matrix is always converted to floating point numbers. This requirement might be violated when dealing with raw counts as input.
    • In modules_from_adjacencies: removing duplicate genes in the expression matrix to avoid errors when looking up correlations between genes.
  • Better default values:
    • Adjusted default setting for threshold based modules: now percentile based instead of based on an absolute threshold. 75th and 90th percentiles are the new defaults.
    • Masking of dropouts for calculation of Pearson correlation between a TF and its target genes based on expression levels across cells is the new default.
  • BugFixes:
    • Incorrect validation of IP-address when using dask distributed scheduler.
    • AUC calculation based on weighted recovery without weighted recovery being used for target gene selection.
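
The switch from an absolute to a percentile-based threshold can be sketched as follows. This is a simplified illustration under stated assumptions: adjacencies are (TF, target, importance) tuples, the percentile is taken over all importances, and edges at or above the cut-off are kept. The function name and the inclusive quantile method are choices made for this example, not pySCENIC's.

```python
import statistics


def filter_by_percentile(adjacencies, percentile):
    """Keep edges whose importance is at or above the given percentile (1-99)."""
    importances = [importance for _, _, importance in adjacencies]
    # quantiles with n=100 returns the 99 cut points between percentiles.
    cut_points = statistics.quantiles(importances, n=100, method="inclusive")
    threshold = cut_points[percentile - 1]
    return [edge for edge in adjacencies if edge[2] >= threshold]


# Toy network: one hypothetical TF with 100 targets of increasing importance.
adjacencies = [("TF1", f"gene{i}", float(i)) for i in range(1, 101)]
kept_75 = filter_by_percentile(adjacencies, 75)
kept_90 = filter_by_percentile(adjacencies, 90)
print(len(kept_75), len(kept_90))
```

A percentile adapts to the scale of the importance scores in each data set, whereas a fixed absolute cut-off can keep almost everything in one data set and almost nothing in another.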

- Python
Published by bramvds almost 8 years ago

pyscenic -

  • Support for Drosophila melanogaster.
  • Experimental - Support for region-based databases: instead of ranking genes based on the score of a motif we rank candidate regulatory regions (i.e. enhancers) and map genes to their putative regulatory regions. Regulons hereby gain enhancer-resolution.
  • Experimental - Support for loom file format export.

- Python
Published by bramvds almost 8 years ago