Recent Releases of pyhmmer
pyhmmer - v0.11.1
Added
- `DigitalMSA.identity_filter` method to remove overly similar sequences from a multiple sequence alignment (#84).
- `reference`, `model_mask`, `secondary_structure`, `surface_accessibility`, `posterior_probabilities` properties of `MSA` to get and set additional column annotation for a sequence alignment.
- `MSA.select` to select a subset of columns and rows of a sequence alignment given an iterable of indices.
- New parallelization strategy in `hmmsearch` suitable for low query counts to parallelize on target chunks rather than individual queries (more similar to original HMMER).
- `TextSequence.sample` and `DigitalSequence.sample` constructors to generate random sequences for testing.
- `DigitalMSA.sample` constructor to generate a random multiple sequence alignment for testing.
- `MSA.indexed` and `SequenceBlock.indexed` to get the sequences of an alignment or block by name.
- `DigitalMSA.alignment` to get the rows of an alignment in digital mode as `VectorU8` objects.
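The idea of parallelizing on target chunks rather than on queries can be pictured with a small standard-library sketch. This is not pyhmmer's implementation: `chunked`, `toy_score` and `search_by_target_chunks` are hypothetical names, and the scoring function is only a stand-in for a real search pipeline.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, n_chunks):
    """Split a list into at most n_chunks contiguous, near-equal chunks."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(items[start:end])
        start = end
    return chunks


def toy_score(query, target):
    # Hypothetical stand-in for a real search: count identical positions.
    return sum(a == b for a, b in zip(query, target))


def search_by_target_chunks(query, targets, workers=4):
    """Score one query against all targets, one worker per target chunk."""
    chunks = chunked(list(targets), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map keeps the chunk order, so hits stay in target order.
        partial = pool.map(
            lambda chunk: [(t, toy_score(query, t)) for t in chunk], chunks
        )
        return [hit for part in partial for hit in part]
```

With a single query, every worker still has something to do, which is the situation where per-query parallelism would leave most threads idle.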
Changed
- Remove some test data from `pyhmmer.tests.data` to reduce distribution size.
- Use fused types to specialize queries of `Pipeline` search methods.
- Improve error reporting message in `Builder.build_msa` on input format errors.
- Deprecate positional arguments in `TextSequence`, `DigitalSequence`, `TextMSA` and `DigitalMSA` constructors.
- Ensure workers are poisoned before yielding the last result in `pyhmmer.hmmer` dispatchers.
- Allow passing a seed integer or `None` to `HMM.sample`.
- Use pure-Python `collections.abc.Sequence` classes to expose `MSA.sequences` instead of Cython class.
- Use pure-Python `collections.abc.Sequence` classes to expose `TextMSA.alignment` instead of generating a `tuple` on demand.
- Relax `psutil` dependency to support both `6.0` and `7.0`.
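"Poisoning" workers refers to the common pattern of sending a sentinel value that tells each background worker to exit. A minimal sketch of doing this before the last result is handed to the caller, using only the standard library, could look as follows. The names `_POISON`, `_worker` and `squares` are hypothetical, and the squaring step stands in for real work; this is not the dispatcher code from `pyhmmer.hmmer`.

```python
import queue
import threading

_POISON = object()  # sentinel telling a worker to exit


def _worker(tasks, results):
    while True:
        item = tasks.get()
        if item is _POISON:
            break
        results.put(item * item)  # toy computation standing in for a search


def squares(values, n_workers=2):
    """Yield the square of each value (in completion order, not input order)."""
    values = list(values)
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=_worker, args=(tasks, results), daemon=True)
        for _ in range(n_workers)
    ]
    for thread in threads:
        thread.start()
    for value in values:
        tasks.put(value)

    remaining = len(values)
    while remaining > 1:
        yield results.get()
        remaining -= 1

    # Fetch the last result, then stop the workers *before* yielding it, so
    # no thread is left running if the caller never resumes the generator.
    last = results.get() if remaining == 1 else None
    for _ in threads:
        tasks.put(_POISON)
    for thread in threads:
        thread.join()
    if remaining == 1:
        yield last
```

If the sentinel were only sent after the final `yield`, a caller that stops iterating right after receiving the last item would leave the workers blocked on the task queue.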
Fixed
- Deployment to Arch User Repository.
- Use of deprecated API in documentation.
- `Pipeline` methods not raising errors on unsupported targets (#87).
- `Matrix.__getitem__` raising `IndexError` when slicing with a slice without upper bound.
- `TopHits.merge` not properly merging book-keeping attributes when merging `TopHits` with zero hits.
- Relax equality comparison in `TopHits.merge` when the query was an `OptimizedProfile`.
- Some uncaught connection errors in `pyhmmer.hmmer` background workers.
- Broken assertion in `TextMSA.sequences` due to Easel bug in `esl_sq_FetchFromMSA` (EddyRivasLab/easel#80).
- Fused Cython methods not being properly documented in Sphinx documentation.
- Cython
Published by github-actions[bot] 8 months ago
pyhmmer - v0.11.0
Added
- Missing type annotations to specific options of `LongTargetsPipeline`.
- `__len__` implementation for `HMMPressedFile` using the entries in the SSI index.
- Setters for residue-wise annotation properties in `HMM`.
- Support for process-based parallelism in `pyhmmer.hmmer` in addition to thread-based.
- Read-only buffer protocol implementation for `TextSequence` and `DigitalSequence` classes.
Changed
- Drop support for Python 3.6.
- Use CMake and `scikit-build-core` to build the package instead of `setuptools`.
- Use `TypedDict` API to mark allowed keyword arguments in `pyhmmer.hmmer` functions.
- Reorganize detection and handling of alphabets for arbitrary queries in `pyhmmer.hmmer`.
- Allow passing a `SequenceFile` to most `pyhmmer.hmmer` functions.
- Use faster `PyUnicode_FromStringAndSize` function to decode strings of known lengths in several `plan7` classes.
- Make `SequenceFile` and `MSAFile` generic on the individual sequence and MSA types.
- Reorganize `pyhmmer.hmmer` into different submodules.
Fixed
- Type annotations not using `typing-extensions` for Python 3.8 to 3.10.
- Logic error causing out-of-bounds memory access in `TopHits.__getstate__`.
- Avoid creating a new `Pipeline` object when running single-threaded searches in `pyhmmer.hmmer`.
- Detect the appropriate `SequenceFile` type based on the `digital` flag value (#72).
Removed
- Deprecated properties of `TopHits` (`query_name`, `query_length`, `query_accession`).
Published by github-actions[bot] 12 months ago
pyhmmer - v0.10.15
Added
- `query` property of `TopHits` referencing the original object used to create the `TopHits` (#76).
Changed
- Require the query object to create a `TopHits` object.
- Make `TopHits` generic over its `query` property.
- Deprecate old query properties of `TopHits` (`query_name`, `query_length`, `query_accession`).
Removed
- Detection of SSE flush from `setup.py` (#71).
Published by althonos about 1 year ago
pyhmmer - v0.10.13
Changed
- Allow the `AlphabetMismatch` error to account for an unknown actual alphabet.
- Make `HMMFile` and `HMMPressedFile` raise `AlphabetMismatch` on files with mixed alphabets.
Fixed
- Avoid calling `fclose` with null pointers in `Sequence.write` and `MSA.write`.
Published by github-actions[bot] over 1 year ago
pyhmmer - v0.10.8
Added
- Getter to access the strand of a `Domain` produced by a `LongTargetsPipeline`.
Changed
- Display model and cutoff names in `MissingCutoffs` error message, if any.
- Allow `LongTargetsPipeline` to be configured with window length and beta parameters.
- Make `nhmmer` use the window length and beta from the options when creating a `Builder`.
Fixed
- `nhmmer` not computing E-values for non-default window lengths (moshi4/pybarrnap#2).
- `SequenceFile` and `MSAFile` crashing with a segmentation fault when given the path to a folder rather than a file.
Published by github-actions[bot] almost 2 years ago
pyhmmer - v0.10.7
Added
- Pre-compiled wheels for PyPy 3.10.
Fixed
- Invalid pointer cast in `__getbuffer__` method of `Matrix` and `Vector` objects.
- Remaining tests failing to run on missing `importlib-resources`.
- `pyhmmer.hmmer` dispatchers possibly dead-locking on background thread errors (#60).
Published by github-actions[bot] almost 2 years ago
pyhmmer - v0.10.6
Added
- `armv7` and `aarch64` to the `PKGBUILD` architectures.
Changed
- `SSIReader` and `SSIWriter` constructors now accept path-like objects.
- Skip tests depending on `importlib.resources.files` when it is not available on the host machine.
Fixed
- Memory leak caused by alphabet allocation in `Pipeline._scan_loop_file`.
Published by github-actions[bot] almost 2 years ago
pyhmmer - v0.10.5
Added
- `Alignment` properties to get the original lengths of the sequence and HMM being stored.
- `Hit.length` property storing the length of the hit sequence (or HMM).
- `TopHits.query_length` storing the length of the hit HMM (or query).
- `Alignment.posterior_probabilities` property showing an encoded representation of posteriors (#59, by @arajkovic).
- `Trace.score` method to compute a trace score from a given profile and sequence.
- `Alignment.__sizeof__` implementation leveraging `p7_alidisplay_SizeOf`.
Fixed
- `Cutoffs` proxy objects not recording their owner to prevent deallocation.
- Avoid GIL re-acquisition in `GeneticCode.translate`.
- Query metadata not being recorded in `Hits` obtained from `daemon.Client`.
- Empty `MatrixU8` creation attempting zero-allocation.
- `VectorU8.zeros` allocating 4x more memory than required.
- Memory leak caused by string duplication in `__getbuffer__` methods of `Matrix` and `Vector` types.
Published by github-actions[bot] almost 2 years ago
pyhmmer - v0.10.4
Added
- `residue_markups` argument to `TextSequence` and `DigitalSequence` constructors.
- `__reduce__` implementation to `TextSequence`, `DigitalSequence`, `TextSequenceBlock` and `DigitalSequenceBlock`.
Changed
- Handling of `easel` I/O methods to avoid implicit GIL acquisition for error checking.
Fixed
- Syntax errors in type annotation files.
Published by github-actions[bot] about 2 years ago
pyhmmer - v0.10.3
Added
- Out-of-band pickle serialization of `Bitfield` objects.
- Getters for `float` attributes and forward/backward parameters of `OptimizedProfile`.
- `InvalidHMM` error raised by `HMM.validate`.
Changed
- Mark `HMM.zero` method as `noexcept`.
- Increase size of buffer for the query queue in the `hmmer` dispatcher.
Fixed
- Unneeded semaphore in `pyhmmer.hmmer` message passing implementation.
- Broken assertion in `Bitfield._from_raw_bytes`.
- Relax tolerance of HMM validation in `TraceAligner.align_traces`.
Published by github-actions[bot] about 2 years ago
pyhmmer - v0.10.1
Added
- `HMM.set_consensus` method to set the consensus for a model or compute it from the emission probabilities.
Fixed
- Platform detection for MacOS and Armv7 platforms in `setup.py`.
- `pyhmmer.plan7.HMM` constructor setting a consensus string forcefully.
Published by github-actions[bot] over 2 years ago
pyhmmer - v0.10.0
Added
- Support for compiling wheels for Aarch64 and NEON-enabled Arm platforms.
Changed
- Updated HMMER to `v3.4`.
- Updated Easel to `v0.49`.
- Use `cibuildwheel` to build wheel distributions.
Fixed
- Patch missing `PyInterpreterState_GetID` preventing the package from working on PyPy 3.9.
Published by github-actions[bot] over 2 years ago
pyhmmer - v0.8.2
Added
- Bracket-style `repr` implementation to `HMM`, `Profile` and `OptimizedProfile` showing model alphabet, length and name.
- `MissingCutoffs` and `InvalidParameter` exceptions inheriting `ValueError`.
Changed
- Replace `pthread` locks with `PyThread` API for synchronizing models in `OptimizedProfileBlock`.
Fixed
- Sequence length extraction in `LongTargetsPipeline.search_hmm` (#42).
- `LongTargetsPipeline.search_msa` not building a HMM with `Builder.build_msa`.
Published by github-actions[bot] over 2 years ago
pyhmmer - v0.8.0
PyHMMER has been accepted for publication in Bioinformatics. Paper accessible here: doi:10.1093/bioinformatics/btad214.
Added
- `pyhmmer.hmmer.jackhmmer` function to run several JackHMMER iterative searches in parallel using multithreading (#35, by @zdk123).
- `HMM.to_profile` shortcut method to allocate and configure a new `Profile` object.
Fixed
- Type annotations of `Pipeline.iterate_seq` and `Pipeline.iterate_hmm`.
- Potential memory leak on exceptions raised by `HMMPressedFile.read`.
- `Offsets.profile` not recording offsets properly, causing `pyhmmer.hmmer.hmmpress` to produce invalid pressed files (#37).
Changed
- `HMM.__init__` and `HMM.sample` now take the `Alphabet` as the first argument, for consistency with the rest of the API.
- `HMM` now requires a `name` argument.
Removed
- Deprecated `ignore_gaps` argument in `SequenceFile.__init__`.
- Deprecated `Sequence.taxonomy_id` property.
Published by github-actions[bot] over 2 years ago
pyhmmer - v0.7.4
Added
- Recipes page to the documentation with code example for loading multiple HMM files (#24, by @zdk123).
Fixed
- `TraceAligner` methods causing a segfault when passed an uninitialized HMM (#36).
Changed
- `HMM` default constructor now always creates a valid HMM (with respect to probability arrays).
- `TraceAligner` now validates the input `HMM` before calling the HMMER code.
- Use stack allocation for all error buffers instead of creating empty `bytearray` objects where applicable.
Published by github-actions[bot] over 2 years ago
pyhmmer - v0.7.2
Added
- `easel.GeneticCode` class wrapping an `ESL_GENCODE` struct for configuring translation.
- `DigitalSequence.translate` method to translate a nucleotide sequence to a protein sequence. Metadata is copied from the source sequence to its translation (#31, by @valentynbez).
Deprecated
- `Sequence.taxonomy_id` property, as it is not used by Easel and implementation is not consistent (see EddyRivasLab/easel#68).
Published by github-actions[bot] almost 3 years ago
pyhmmer - v0.7.0
Added
- `Bitfield.zeros` and `Bitfield.ones` classmethods for constructing an empty bitfield of known size.
- `Bitfield.copy` method to copy a bitfield object.
- `SequenceBlock` and `OptimizedProfileBlock` classes to store Python objects next to a contiguous array of pointers for iterating with the GIL released.
- `SequenceFile.read_block` method to read a whole sequence block from a file.
- `HMM.sample` class method to generate a HMM at random given a `Randomness` source.
- `hmmscan` function to scan a profile database with sequence queries.
- `deepcopy` implementations to `HMM`, `Profile` and `OptimizedProfile` classes of `plan7`.
- `rewind` method to `HMMFile`, `HMMPressedFile` and `SequenceFile` to reset a file back to its initial position.
- `name` attribute to `HMMFile`, `HMMPressedFile`, `MSAFile` and `SequenceFile` to expose the path of a file (when it was created from path).
- `local` property to `Profile` and `OptimizedProfile`, indicating whether a profile is in local or global mode.
- `multihit` property to `Profile` and `OptimizedProfile`, indicating whether a profile is in unihit or multihit mode, with a setter taking care of the reconfiguration.
- `Domain.included` and `Domain.reported` settable properties to report the inclusion and reporting status of a single domain.
- `TopHits.included` and `TopHits.reported` sized iterators to iterate only on included and reported hits.
- `Domains.included` and `Domains.reported` sized iterators to iterate only on included and reported domains.
Changed
- `Bitfield`, `Vector` and `Matrix` can now be created from an iterable.
- `Pipeline` search methods now expect a `DigitalSequenceBlock` or a `SequenceFile` for the target sequence database.
- `Pipeline` scan methods now expect an `OptimizedProfileBlock` or a `HMMPressedFile` for the target profile database.
- `TraceAligner` now expects a `DigitalSequenceBlock` for the sequences to align to the HMM.
- `Profile.configure` now uses a default value of 400 for the `L` argument.
- `hmmsearch`, `nhmmer` and `phmmer` support being given a single query instead of requiring an iterable.
- `HMMPressedFile` can now be created, closed and used as a context manager directly without having to manage the source `HMMFile`.
- Renamed `Profile.optimized` method to `Profile.to_optimized`.
- Replaced `Randomness.is_fast` method with the `Randomness.fast` property.
- Rewrite handling of `Hit` flags using settable properties (`Hit.included`, `Hit.reported`, `Hit.new`, `Hit.dropped`, `Hit.duplicate`) instead of methods.
Fixed
- Memory leak in the `LongTargetsPipeline` search loop.
- PyPy behaviour change of `readinto` methods now expecting `unsigned char*` instead of `char*` memoryview.
- NULL-pointer dereference in `Pipeline.search_hmm` when given a query without name.
- `LongTargetsPipeline` not recording the query name and accession.
- Memory leak caused by using a non-default prior scheme when constructing a `Builder`.
Removed
- `PipelineSearchTargets`, replaced in functionality with `easel.DigitalSequenceBlock`.
- `is_local` and `is_multihit` methods of `Profile` and `OptimizedProfile`, replaced with equivalent properties.
- `Hit.manually_drop` and `Hit.manually_include` methods, replaced with the different `Hit` properties.
Published by github-actions[bot] about 3 years ago
pyhmmer - v0.6.3
Fixed
- Error not being raised on alphabet detection failure in `SequenceFile` or `MSAFile`.
- Add check in `DigitalSequence` constructor to make sure encoded characters are in valid range (#25).
Added
- `SequenceFile.guess_alphabet` and `MSAFile.guess_alphabet` to guess the alphabet from an open file.
- `Alphabet.encode` and `Alphabet.decode` to convert raw sequences between digital and text format.
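Converting between text and digital format amounts to mapping each symbol to its rank in the alphabet and back. The sketch below illustrates the idea with a hypothetical `ToyAlphabet` class limited to four nucleotide symbols; it is an assumption-laden illustration, not the Easel alphabet, which also encodes gaps and degenerate symbols.

```python
class ToyAlphabet:
    """Hypothetical stand-in mapping each symbol to its index in `symbols`."""

    def __init__(self, symbols="ACGT"):
        self.symbols = symbols
        self._index = {symbol: rank for rank, symbol in enumerate(symbols)}

    def encode(self, text):
        """Convert a text sequence to its digital form (one byte per residue)."""
        try:
            return bytes(self._index[char] for char in text)
        except KeyError as err:
            raise ValueError(f"invalid symbol: {err.args[0]!r}") from None

    def decode(self, codes):
        """Convert a digital sequence back to text."""
        return "".join(self.symbols[code] for code in codes)
```

Encoding rejects symbols outside the alphabet with a `ValueError`, which mirrors the range check on encoded characters listed under Fixed for this release.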
Published by github-actions[bot] over 3 years ago
pyhmmer - v0.6.2
Changed
- `hmmsearch`, `phmmer` and `nhmmer` functions will reduce the requested number of threads to the number of queries, if it can be detected using `operator.length_hint`.
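As a rough illustration of that rule, the sketch below caps a requested thread count by the length hint of the query iterable. The helper name `effective_threads` and the convention that `requested=0` means "use all CPUs" are assumptions made for this example, not the actual pyhmmer code.

```python
import operator
import os


def effective_threads(queries, requested=0):
    """Cap the thread count by the number of queries when it is known."""
    cpus = requested if requested > 0 else (os.cpu_count() or 1)
    # length_hint returns 0 when the object exposes neither __len__ nor
    # __length_hint__ (e.g. a generator), in which case nothing is capped.
    hint = operator.length_hint(queries)
    if hint > 0:
        return min(cpus, hint)
    return cpus
```

Lists and list iterators expose a length hint, while plain generators do not, so the cap only applies when the number of queries can be cheaply estimated.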
Added
- Documentation for loading all HMMs from an `HMMFile` object at once (#23).
- List of projects depending on PyHMMER to the `Examples` page of the documentation.
Published by github-actions[bot] over 3 years ago
pyhmmer - v0.6.1
Added
- `pickle` protocol support for `TopHits` objects, using the HMMER network serialization.
- `TopHits.write` method to write hits to a file in tabular format.
- `query_name` and `query_accession` properties to `TopHits` objects to access the name and accession of the query that produced the hits.
Fixed
- Extraction of filename from file-like objects in the `HMMFile` constructor.
- Use `os.cpu_count` instead of `multiprocessing.cpu_count` where applicable to preserve OS scheduling.
- Wrong return type in docstring of `HMM.insert_emissions`.
- `TopHits.searched_nodes` returning the searched number of residues instead of the searched number of model nodes.
- Unsound decoding of pickled `MatrixF` or `VectorF` when data comes from a source of different endianness.
Changed
- Rewrite `pyhmmer.hmmer` threading code using `Deque` instead of `collections.Queue` to store the queries and results.
- Reduce memory consumption of `pyhmmer.hmmer` by reducing the number of semaphores and event flags used concurrently.
- Make `pyhmmer.hmmer` main threads block on query insertion rather than result retrieval to make sure worker threads are never idling.
Published by github-actions[bot] over 3 years ago
pyhmmer - v0.6.0
Added
- `pyhmmer.daemon` module with a client implementation to communicate with a `hmmpgmd` server.
- `Pipeline.arguments` method to get a list of CLI arguments from the parameters used to initialize the `Pipeline`.
- Setters for `name`, `accession` and `description` properties of `plan7.Hit`.
- Constructor for individual `plan7.Trace` objects outside a `plan7.Traces` list.
- `plan7.Trace.from_sequence` constructor to create a faux trace from a single sequence.
- `manually_include` and `manually_drop` methods to `plan7.Hit` for manually selecting the inclusion status of a `Hit` in a `TopHits` instance.
- `compare_ranking` method to `plan7.TopHits` for comparing the order of the hits to a previous run on the same targets stored in an `easel.KeyHash` object.
- `Pipeline.iterate_seq` and `Pipeline.iterate_hmm` to run iterative queries like JackHMMER.
- `repr` implementations for `easel.MSAFile`, `easel.SequenceFile` and `easel.HMMFile` showing the path or file object they were created from.
- `repr` implementation for `easel.Randomness` showing the seed and the RNG algorithm in use.
- `str` implementation for `plan7.Alignment` using HMMER original code to display a domain alignment like in search/scan results.
Changed
- `plan7.Trace.posterior_probabilities` property may now be `None` in case no memory is allocated for the posteriors in the `P7_TRACE` struct.
- `TopHits.to_msa` can now add additional sequences passed as arguments to the alignment.
- `plan7.HMMPressedFile` now raises an exception on attempts to create a new instance manually.
- `ignore_gaps` argument of `easel.SequenceFile` is now deprecated.
- `repr` implementations for `easel` types now use the fully qualified class name.
Fixed
- `easel.SequenceFile.readinto` docstring not rendering properly in documentation.
- Type annotations of `hits_included` and `hits_reported` of `plan7.TopHits` marking these properties as `bool` instead of `int`.
- Setters of `name`, `accession`, `description` and `author` properties of `easel.MSA` crashing when given `None` values.
- Exception value raised from Easel code not being properly extracted.
- Plain strings being used in example for `easel.TextSequence` and `easel.TextMSA` constructors where byte strings are expected (#20).
Published by github-actions[bot] over 3 years ago
pyhmmer - v0.5.0
Added
- `plan7.PipelineSearchTargets` to reduce the overhead when searching the same sequences several times with different query profiles.
- `TopHits.copy` method to duplicate a `TopHits` instance.
- `TopHits.merge` method to merge hits obtained with the same query on different targets.
- Buffer protocol implementation for `pyhmmer.easel.Bitfield`.
Changed
- Renamed `TopHits.included` and `TopHits.reported` properties to `TopHits.hits_included` and `TopHits.hits_reported`.
- `MSAFile` and `SequenceFile` are now directly in digital mode if they are instantiated with `digital=True`.
- `SequenceFile.parse` can now return a sequence in digital mode.
- Reorganized tests to make them runnable from a site install.
Fixed
- Usage of `memcpy` in contexts where it may have had undefined behaviour.
- `VectorF.__eq__` crashing when comparing two empty objects.
- `SequenceFile` and `MSAFile` not closing file handles when raising an error in `__init__`.
Published by github-actions[bot] almost 4 years ago
pyhmmer - 0.4.11
Added
- `plan7.HMMFile.read` method to read a single `plan7.HMM` from a `plan7.HMMFile` (instead of using `next`).
- `closed` property on `easel.SequenceFile`, `easel.MSAFile` and `plan7.HMMFile` to mark whether a file object is closed.
- `plan7.HMMFile.is_pressed` method to check whether a HMM file has associated pressed data.
- `plan7.HMMFile.optimized_profiles` method to read the `plan7.OptimizedProfile` entries in a `plan7.HMMFile` if there is associated pressed data available.
- Getters for the `name`, `accession`, `description`, `consensus`, `consensus_structure`, `evalue_parameters` and `cutoffs` properties of a `plan7.OptimizedProfile`.
- `plan7.OptimizedProfile.__eq__` implementation to compare two optimized profiles.
- `__sizeof__` implementations for `plan7.OptimizedProfile` and `plan7.Profile` to get the allocated size of a profile.
Fixed
- Double-free caused by the Cython cycle breaking feature on several view types (`easel.Randomness`, `easel.Vector`, `easel.Matrix`, `plan7.Cutoffs`, `plan7.EvalueParameters`, `plan7.Offsets`, `plan7.Trace`).
- `plan7.Hit.description` using the pointer to the accession string erroneously, causing occasional NULL dereference.
- `plan7.OptimizedProfile.copy` performing a shallow copy instead of a deep copy as expected.
Changed
- `pyhmmer.hmmer` type annotations now explicitly support `plan7.Profile` or `plan7.OptimizedProfile` inputs where applicable.
Published by althonos about 4 years ago
pyhmmer - 0.4.10
Added
- `entropy` and `relative_entropy` methods to `easel.VectorF` to compute the Shannon entropy of a vector and the Kullback-Leibler divergence of two vectors.
- `mean_match_entropy`, `mean_match_information` and `mean_match_relative_entropy` methods to `plan7.HMM` to get information statistics of an HMM model.
- `match_occupancy` method to `plan7.HMM` to compute the occupancy for each match state as an `easel.VectorF`.
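For reference, the two quantities named above can be written in a few lines of plain Python. This is a sketch of the textbook definitions, with the result expressed in bits (base-2 logarithm) as an assumption; the function names are illustrative and the code does not call into Easel.

```python
import math


def shannon_entropy(p):
    """H(p) = -sum_i p_i * log2(p_i), skipping zero-probability entries."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) = sum_i p_i * log2(p_i / q_i)."""
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)
```

A uniform distribution over four symbols has an entropy of 2 bits, and the divergence of a distribution from itself is zero.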
Fixed
- `plan7.Builder.build_msa` using the gap-open and gap-extend probabilities instead of the MSA itself to compute the transition probabilities for the new HMM.
Changed
- `plan7.Builder.build` will now only load the score system once and reuse it unless a different score system is requested between calls.
Published by althonos about 4 years ago
pyhmmer - 0.4.9
Added
- `plan7.ScoreData` class to store the substitution scores and maximal extensions for a long target search.
- `plan7.LongTargetsPipeline` to run searches on targets longer than 100,000 residues.
- `Alphabet` methods to check whether an `Alphabet` object is a DNA, RNA, nucleotide or protein alphabet.
- `window_length` and `window_beta` arguments to `plan7.Builder` to set the max length of nucleotide `HMM` created by builder objects.
Changed
- `pyhmmer.hmmer.nhmmer` now uses a `LongTargetsPipeline` instead of a `Pipeline` to search the target sequences.
- `pyhmmer.hmmer.nhmmer` now supports `HMM` queries in addition to `DigitalSequence` and `DigitalMSA` queries.
- `pyhmmer.hmmer.phmmer` now always assumes protein queries.
- `Z` and `domZ` attributes of `plan7.TopHits` objects are now read-only.
Fixed
- `nhmmer` now uses DNA as the default alphabet instead of amino acid alphabet like it did before (#12).
Published by althonos about 4 years ago
pyhmmer - 0.4.8
Added
- Constructor arguments and properties to `plan7.Pipeline` to support bit score thresholds as an alternative way to filter top hits.
- Support for creating a `SequenceFile` and an `MSAFile` using a Python file-like object instead of only supporting filenames.
- Support for reading individual sequences from an MSA file with `SequenceFile`.
- `TextMSA.alignment` to access the actual alignment as a tuple of strings.
- Subtraction and division support for `easel.Vector` subclasses.
Changed
- `plan7.Cutoffs` now supports setting the bit score cutoffs, but requires both to be set or cleared at the same time.
- `easel.Vector` will always allocate some memory when created manually to avoid having a special empty case in every vector method.
- `pyhmmer.easel.AllocationError` now stores the size it failed to allocate, and the number of elements when allocating an array.
Fixed
- `TextSequence.digitize` will now raise a `ValueError` when the sequence contains invalid characters for the alphabet (previously was an `UnexpectedError`).
Published by althonos about 4 years ago
pyhmmer - 0.4.7
Added
- `TraceAligner`, `Trace` and `Traces` classes to `pyhmmer.plan7` to get tracebacks after aligning several sequences against an HMM.
- `pyhmmer.hmmalign` function with the same features as the `hmmalign` binary from HMMER3.
- Support for out-of-band pickling in `easel.Vector` and `easel.Matrix`.
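Out-of-band pickling is a feature of pickle protocol 5 that lets large buffers travel separately from the pickle stream instead of being copied into it. The sketch below shows the mechanism with a hypothetical `Buffer` class wrapping a `bytearray`; it follows the pattern from the Python documentation and makes no claim about how the Easel classes implement it.

```python
import pickle


class Buffer:
    """Hypothetical stand-in for a vector-like class owning raw memory."""

    def __init__(self, data):
        self.data = bytearray(data)

    def __reduce_ex__(self, protocol):
        if protocol >= 5:
            # Expose the memory through a PickleBuffer so that it can be
            # handed to `buffer_callback` instead of being copied in-band.
            return (Buffer, (pickle.PickleBuffer(self.data),))
        return (Buffer, (bytes(self.data),))


# The callback collects the buffers that are kept out of the pickle stream.
buffers = []
payload = pickle.dumps(Buffer(b"abc"), protocol=5, buffer_callback=buffers.append)

# The same buffers must be supplied again when loading.
restored = pickle.loads(payload, buffers=buffers)
```

When no `buffer_callback` is given, the same `__reduce_ex__` still works and the data is simply serialized in-band.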
Changed
- Allow creating an empty `Vector` or `Matrix` by calling their constructor without arguments.
Fixed
- Potential unreported exceptions in `plan7.OptimizedProfile.write` and several `plan7.SSIWriter` methods.
Published by althonos over 4 years ago
pyhmmer - 0.4.6
Added
- `pickle` protocol for `easel.Alphabet`, `easel.Bitfield`, `easel.KeyHash`, `easel.Vector`, `easel.Matrix` and `plan7.HMM`.
- `taxonomy_id` and `residue_markups` properties to `easel.Sequence`.
- `sum_score` property to `plan7.Hit`.
- `plan7.EvalueParameters` class to expose the e-value parameters of a `plan7.HMM` or a `plan7.Profile`.
- Equality checks and slicing for `easel.Matrix` and `easel.Vector`.
- Support for creating and manipulating zero-sized `easel` matrices and vectors.
- `plan7.Cutoffs` class to expose the Pfam score cutoffs of a `plan7.HMM` or a `plan7.Profile`.
- Keyword arguments to configure E-value thresholds when creating a `plan7.Pipeline` object.
- Support for using model-specific thresholding options in `plan7.Pipeline`.
Changed
- Use the replace error handler when decoding error messages to skip potential decoding issues when already building an exception.
- Improve `pyhmmer.hmmer` to ensure background threads exit on a `KeyboardInterrupt`.
- `easel.VectorU8.__eq__` accepts any object implementing the buffer protocol.
- `plan7.HMM.creation_time` now takes and returns a `datetime.datetime` object, assuming the field is only ever set with `asctime`.
- Refactor `easel.Vector` and `easel.Matrix` and mark exposed memory as C-contiguous.
Fixed
- `easel.Alphabet` not reporting potential allocation errors.
- Potential buffer overflow in `easel.Matrix` and `easel.Vector` when calling `__init__` more than once.
Published by althonos over 4 years ago
pyhmmer - 0.4.5
Added
- `OptimizedProfile.convert` method to configure an optimized profile from a `Profile` without reallocating a new `P7_OPROFILE` struct.
Changed
- Rewrite the `plan7.Pipeline` search loop to avoid reacquiring the GIL between reference sequences.
- Require the reference sequences to be stored in a collection (instead of an iterable) when passing them to the `search_hmm`, `search_msa` and `search_seq` methods of `plan7.Pipeline`.
- Avoid reallocating a new `OptimizedProfile` every time a new HMM is passed to `Pipeline.search_hmm`.
- Release the GIL while sorting and thresholding `TopHits` in `Pipeline` search methods.
Published by althonos over 4 years ago
pyhmmer - 0.4.4
Added
- `ignore_gaps` parameter to `pyhmmer.plan7.SequenceFile`, allowing to skip the gap characters when reading a sequence from an ungapped format.
- `__sizeof__` implementation for some
- Dedicated check for sequence length before running the platform-specific code in `pyhmmer.plan7.Pipeline`.
Fixed
- Score system not being set in `pyhmmer.plan7.Builder.build_msa`.
- Alphabet not being checked after the first sequence in `Pipeline` search and scan methods.
Published by althonos over 4 years ago
pyhmmer - 0.4.2
Added
- `pyhmmer.easel.Randomness` class exposing a deterministic random number generator.
- `pyhmmer.plan7.Builder.randomness` and `pyhmmer.plan7.Pipeline.randomness` attributes exposing the internal random number generator used by each object.
- `pyhmmer.plan7.Hit.best_domain` property mapping to the highest scoring domain of a hit.
- `pyhmmer.plan7.OptimizedProfile.rbv` property exposing match scores.
- `pyhmmer.plan7.Domain.pvalue` and `pyhmmer.plan7.Hit.pvalue` reporting the p-value for a domain or hit bitscore.
Fixed
- Dimensions of the `pyhmmer.plan7.OptimizedProfile.sbv` matrix not being properly set.
Published by althonos over 4 years ago
pyhmmer - 0.4.1
Fixed
- Main buffer not being freed in `MatrixF.__dealloc__` and `MatrixU8.__dealloc__` when created without owner.
Added
- Additional configuration values for `pyhmmer.plan7.Pipeline` as both constructor arguments and mutable properties.
- `consensus`, `consensus_structure` and `offsets` properties to `pyhmmer.plan7.Profile` objects.
Changed
- Make `OptimizedProfile.ssv_filter` check the alphabet of the given sequence.
Published by althonos over 4 years ago
pyhmmer - 0.4.0
Added
- Linear algebra primitives to expose 1D (`Vector`) and 2D (`Matrix`) contiguous buffers containing numerical values to `pyhmmer.easel`.
- Documentation for the `Z` and `domZ` parameters of the `pyhmmer.plan7.Pipeline` constructor.
- `pyhmmer.errors.AlphabetMismatch` exception deriving from `ValueError` to specifically report mismatching Easel alphabets where applicable.
- `scale` and `normalize` methods to `pyhmmer.plan7.HMM` objects.
- Property to access `pyhmmer.plan7.Background` residue frequencies as a `VectorF` object.
- Property to access `pyhmmer.plan7.HMM` mean residue composition as a `VectorF` object.
- Property to access `pyhmmer.plan7.HMM` probabilities and emissions as `MatrixF` objects.
- `ssv_filter` methods to `pyhmmer.plan7.OptimizedProfile` to get the SSV filter score of the profile for a given sequence.
- Several additional properties to access the `pyhmmer.plan7.OptimizedProfile` internals.
Removed
- Unused `report_e` parameter of `pyhmmer.plan7.Pipeline` constructor.
- `pyhmmer.plan7.TopHits.clear` method which could lead to segfault if it was called while a `Hit` is being held.
Changed
- Multithreaded loop in `pyhmmer.hmmer` to reduce memory consumption while still yielding hits in order.
- `pyhmmer.easel.DigitalSequence.sequence` property is now a `VectorU8`.
Fixed
- Type annotations in `pyhmmer.hmmer`.
- Potential double free in `pyhmmer.plan7.HMM.command_line` property setter.
- Minor floating-point precision issues in `pyhmmer.plan7.Builder` constructor.
- Segfault in `TextMSA.digitize` caused by `esl_msa_Copy` not digitizing on-the-fly like `esl_sq_Copy`.
- Exceptions not being raised in some methods of `pyhmmer.plan7.Profile` and `pyhmmer.plan7.TopHits`.
Published by althonos over 4 years ago
pyhmmer - 0.3.1
Added
- `Pipeline.scan_seq` method to query a database of profiles with one or more sequences.
- `transition_probabilities`, `match_emissions`, `insert_emissions` properties to the `HMM` class, providing access to the numerical parameters of the HMM.
- `consensus_structure` and `consensus_accessibility` properties to the `HMM` class to get consensus lines from the source alignment if the HMM was created from a MSA.
- `nseq` and `nseq_effective` properties to the `HMM` class to get the number of training sequences and effective sequences used to build the HMM.
Changed
- `HMM.checksum` is now `None` if the `p7H_CHKSUM` flag is not set.
- `Builder` methods will now record `sys.argv` when creating a HMM.
Fixed
- `HMM.write(..., binary=False)` crashing on HMMs without a consensus line (#5). Fixed upstream in EddyRivasLab/HMMER#236.
- `Pipeline.reset` mishandling the `Z` and `domZ` values if those were detected from the number of targets.
- `pyhmmer.hmmer` functions will not block until all results have been collected anymore when run in multithreaded mode.
Published by althonos over 4 years ago
pyhmmer - 0.3.0
Added
- `easel.MSAFile` to read from a file containing
- `accession`, `author`, `name` and `description` properties to `easel.MSA` objects.
- `plan7.Builder.build_msa` to build a pHMM from a sequence alignment.
- Additional methods to `easel.KeyHash`, allowing to use it as a `dict`/`set` hybrid.
- `Sequence.write` and `MSA.write` methods to format a sequence or an alignment to a file handle.
- `plan7.TopHits.to_msa` method to convert all the top hits of a query against a database into a multiple sequence alignment.
- `easel.MSA.sequences` attribute to access individual sequences of an alignment using the `collections.abc.Sequence` interface.
- `easel.DigitalMSA.textize` method to convert a multiple sequence alignment in digital mode to its text-mode counterpart.
- Read-only `name`, `accession` and `description` properties to `plan7.Profile` showing attributes inherited from the HMM it was configured with.
- `plan7.HMM.consensus` property, allowing to access the consensus sequence of a pHMM.
- `plan7.HMM` equality implementation, using zero tolerance.
- `plan7.Pipeline.search_msa` to query a MSA against a sequence database.
- `easel.Sequence.reverse_complement` method allowing to reverse-complement inplace or to build a copy.
- `errors.AlphabetMismatch` exception for use in cases where an alphabet is expected but not matched by the input.
- `hmmer.nhmmer` function with the same behaviour as `hmmer.phmmer`, except it expects inputs with a DNA alphabet.
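For readers unfamiliar with the operation, reverse-complementing a DNA sequence means complementing each base and reversing the result. A plain-Python sketch on text strings is given below; the function name and the restriction to unambiguous `ACGT` symbols are assumptions of this example, which does not use the Easel implementation.

```python
# Translation table swapping each base with its complement (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return a reverse-complemented copy of a DNA string."""
    return sequence.translate(_COMPLEMENT)[::-1]
```

Applying the function twice gives back the original sequence, which makes for a convenient sanity check.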
Fixed
- `plan7.Builder.copy` not copying some parameters correctly, causing `pyhmmer.hmmer.phmmer` to give inconsistent results in multithreaded mode.
- `easel.Bitfield` not properly handling index overflows.
- Documentation not rendering for the `__init__` method of all classes.
Changed
- `plan7.Builder` gap-open and gap-extend probabilities are now set on instantiation and depend on the alphabet type.
- Constructors for `easel.TextMSA` and `easel.DigitalMSA`, which can now be given an iterable of `easel.Sequence` objects to store in the alignment.
Removed
- Unimplemented `easel.SequenceFile.fetch` and `easel.SequenceFile.fetchinto` methods.
Published by althonos almost 5 years ago
pyhmmer - 0.2.0
Added
pyhmmer.plan7.Builderclass to handle building a HMM from a sequence.Pipeline.search_seqto query a sequence against a sequence database.psutildependency to detect the most efficient thread count forhmmsearchbased on the number of physical CPUs.pyhmmer.hmmer.phmmerfunction to run a search of query sequences against a sequence database.
Changed
- Pipeline.search was renamed to Pipeline.search_hmm for disambiguation.
- libeasel.random sequences do not require the GIL anymore.
- Public API now has proper signature annotations.
Fixed
- Inaccurate exception messages in Pipeline.search_hmm.
- Unneeded RNG reallocation, replaced with re-initialisation where possible.
- SequenceFile.__next__ not working after being set in digital mode.
- sequences argument of hmmsearch now only requires a typing.Collection[DigitalSequence] instead of a typing.Collection[Sequence] (no more __getitem__ needed).
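The remark about `__getitem__` no longer being needed can be seen with the standard abstract base classes: a Collection only requires `__len__`, `__iter__` and `__contains__`, while a Sequence additionally requires indexing. The snippet below uses plain strings as stand-ins for DigitalSequence objects.

```python
from collections.abc import Collection, Sequence

as_list = ["seq1", "seq2"]     # indexable: both a Sequence and a Collection
as_set = frozenset(as_list)    # sized and iterable, but has no __getitem__

list_is_sequence = isinstance(as_list, Sequence)
set_is_collection = isinstance(as_set, Collection)
set_is_sequence = isinstance(as_set, Sequence)
```

An API typed against Collection therefore accepts sets and other non-indexable containers, which one typed against Sequence would reject.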
Removed
- hits argument to Pipeline.search_hmm to reduce risk of issues with TopHits reuse.
- Broken alignment coordinates on Domain classes.
Published by althonos almost 5 years ago
pyhmmer - 0.1.4
Added
- DigitalSequence.textize to convert a digital sequence to a text sequence.
- DigitalSequence.__init__ method to create a digital sequence from any object implementing the buffer protocol.
- Alignment.hmm_accession property to retrieve the accession of the HMM in an alignment.
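As a rough illustration of what "any object implementing the buffer protocol" covers, the sketch below reads bytes, bytearray and array.array inputs through a memoryview. `ordinals_from_buffer` is a hypothetical helper, not part of pyhmmer, and it makes no attempt to reproduce Easel's digital encoding.

```python
from array import array


def ordinals_from_buffer(obj) -> list:
    """Read unsigned bytes from any object exposing the buffer protocol."""
    # memoryview() raises TypeError for objects without a buffer (e.g. str);
    # cast("B") views the contiguous buffer as unsigned bytes.
    view = memoryview(obj).cast("B")
    return view.tolist()


from_bytes = ordinals_from_buffer(b"\x00\x01\x02")
from_bytearray = ordinals_from_buffer(bytearray([3, 2, 1]))
from_array = ordinals_from_buffer(array("B", [0, 3]))
```

Going through memoryview avoids copying the input until `tolist()` is called, which is the main appeal of buffer-based constructors.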
Published by althonos almost 5 years ago
pyhmmer - 0.1.1
Fixed
- HMMFile calling file.peek without arguments, causing it to crash when passed some types, e.g. gzip.GzipFile.
- HMMFile failing to work with PyPy file objects because of a bug with their implementation of readinto.
- C/Python file object implementation using strcpy instead of memcpy, causing issues when null bytes were read.
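The file.peek issue comes from a difference in the standard library itself: `io.BufferedReader.peek` has an optional size argument, whereas `gzip.GzipFile.peek` requires one. The sketch below reproduces that difference with in-memory files; the `b"HMMER3/f\n"` payload is only a placeholder, not a valid HMM file.

```python
import gzip
import io

raw = b"HMMER3/f\n"  # placeholder payload for illustration

# io.BufferedReader.peek accepts a call without arguments...
plain = io.BufferedReader(io.BytesIO(raw))
plain_head = plain.peek()

# ...but gzip.GzipFile.peek requires an explicit size.
gz = gzip.GzipFile(fileobj=io.BytesIO(gzip.compress(raw)))
try:
    gz.peek()
    peek_without_arg_failed = False
except TypeError:
    peek_without_arg_failed = True

# Passing a size works for both kinds of file object.
gz_head = gz.peek(8)
```

Code that has to accept arbitrary binary file objects is therefore safer always passing a size to `peek`.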
Published by althonos about 5 years ago
pyhmmer - 0.1.0
Initial beta release.
Fixed
- TextSequence uses the sequence argument it's given on instantiation.
- Segmentation fault in Sequence.__eq__ caused by implicit type conversion.
- Segmentation fault on SequenceFile.read failure.
- Missing type annotations for the pyhmmer.easel module.
Published by althonos about 5 years ago
pyhmmer - 0.1.0-a5
Added
- Sequence.__len__ magic method so that len(seq) returns the number of letters in seq.
- Python file-handle support when opening a pyhmmer.plan7.HMMFile.
- Context manager protocol to pyhmmer.easel.SSIWriter.
- Type annotations for pyhmmer.easel.SSIWriter.
- add_alias to pyhmmer.easel.SSIWriter.
- write method to pyhmmer.plan7.OptimizedProfile to write an optimized profile in binary format.
- offsets property to interact with the disk offsets of a pyhmmer.plan7.OptimizedProfile instance.
- pyhmmer.hmmer.hmmpress emulating the hmmpress binary from HMMER.
- M property to pyhmmer.plan7.HMM exposing the number of nodes in the model.
Changed
- Bumped vendored Easel to v0.48.
- Bumped vendored HMMER to v3.3.2.
- pyhmmer.plan7.HMMFile will raise an EOFError when given an empty file.
- Renamed length property to L in pyhmmer.plan7.Background.
Fixed
- Segmentation fault when close method of pyhmmer.easel.SSIWriter was called more than once.
- close method of pyhmmer.easel.SSIWriter not writing the index contents.
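Both SSIWriter fixes concern the same contract: closing must flush the index exactly once, and further calls must be harmless. A minimal pure-Python sketch of that contract is given below. `IndexWriter` and `add_key` are hypothetical names, and the newline-joined output is an invented format unrelated to the real SSI layout.

```python
import io


class IndexWriter:
    """Hypothetical writer that flushes its index on the first close()."""

    def __init__(self, handle):
        self._handle = handle
        self._keys = []
        self._closed = False

    def add_key(self, key: str) -> None:
        if self._closed:
            raise ValueError("I/O operation on closed writer")
        self._keys.append(key)

    def close(self) -> None:
        if self._closed:  # second and later calls are no-ops
            return
        self._handle.write("\n".join(self._keys).encode())
        self._closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions


out = io.BytesIO()
with IndexWriter(out) as writer:
    writer.add_key("seq1")
    writer.add_key("seq2")
writer.close()  # safe: the with block already closed the writer
```

Guarding `close` with a flag is what makes it safe to combine an explicit call with the context manager protocol.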
Published by althonos about 5 years ago
pyhmmer - 0.1.0-a4
Added
- MSA, TextMSA and DigitalMSA classes representing a multiple sequence alignment to pyhmmer.easel.
- Methods and protocol to copy a Sequence and an MSA.
- pyhmmer.plan7.OptimizedProfile wrapping a platform-specific optimized profile.
- SSIReader and SSIWriter classes interacting with sequence/subsequence indices to pyhmmer.easel.
- Exception handler using Python exceptions to report Easel errors.
Changed
- pyhmmer.hmmsearch returns an iterator of TopHits, with one instance per HMM in the input.
- pyhmmer.hmmsearch properly raises errors happening in the background threads without deadlock.
- pyhmmer.plan7.Pipeline recycles memory between Pipeline.search calls.
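One common way to surface worker errors in the caller without hanging is sketched below. It swaps in `concurrent.futures` from the standard library rather than reproducing pyhmmer's own thread handling; `search` and `search_all` are hypothetical stand-ins for a per-query search and its dispatcher.

```python
from concurrent.futures import ThreadPoolExecutor


def search(query: str) -> str:
    """Stand-in for a per-query search; fails on an empty query."""
    if not query:
        raise ValueError("empty query")
    return query.upper()


def search_all(queries):
    """Yield one result per query, re-raising worker errors in the caller."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(search, q) for q in queries]
        for future in futures:
            # result() re-raises any exception raised inside the worker
            yield future.result()


results = list(search_all(["abc", "def"]))
```

Because the exception travels through the future instead of staying inside the worker, the consumer sees it at the point where the failing result would have been yielded.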
Fixed
- Missing type annotations for the pyhmmer.errors module.
Removed
- Unneeded or private methods from pyhmmer.plan7.
Published by althonos about 5 years ago
pyhmmer - 0.1.0-a3
Added
- TextSequence and DigitalSequence representing a Sequence in a given mode.
- E-value properties to Hit and Domain.
- TopHits now stores a reference to the pipeline it was obtained from.
- Pipeline.Z and Pipeline.domZ properties.
- Experimental pickling support to Alphabet.
- Experimental freelist to Sequence class to avoid allocation bottlenecks when iterating on a SequenceFile without recycling sequence buffers.
Changed
- Made Sequence an abstract base class.
- Additional Pipeline parameters can be passed as keyword arguments to pyhmmer.hmmsearch.
- SequenceFile.read can now be configured to skip reading the metadata or the content of a sequence.
Removed
- Redundant SequenceFile methods.
Fixed
- doctest loader crashing on Python 3.5.
- TopHits.threshold segfaulting when being called without prior TopHits.sort call.
- Unknown format argument to SequenceFile constructor not raising the right error.
Published by althonos about 5 years ago
pyhmmer - 0.1.0-a2
Added
- Support for compilation on PowerPC big-endian platforms.
- Type annotations and stub files for Cython modules.
Changed
- distutils is now used to compile the package, instead of calling autotools and letting HMMER configure itself.
- Bitfield.count now allows passing an argument (for compatibility with collections.abc.Sequence).
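The compatibility point behind the Bitfield.count change is that `collections.abc.Sequence.count` takes the value to look for. A minimal sketch of a bitfield honouring that signature is shown below. `TinyBitfield` is a hypothetical pure-Python class, not pyhmmer's Bitfield, and the default of `True` for `value` is an assumption about how an argument-free call could be kept working.

```python
from collections.abc import Sequence


class TinyBitfield(Sequence):
    """Hypothetical fixed-size bitfield backed by a Python int."""

    def __init__(self, bits):
        self._length = len(bits)
        self._value = 0
        for i, bit in enumerate(bits):
            if bit:
                self._value |= 1 << i

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if isinstance(index, slice):
            picked = range(*index.indices(self._length))
            return TinyBitfield([self[i] for i in picked])
        if index < 0:
            index += self._length
        if not 0 <= index < self._length:
            raise IndexError("bitfield index out of range")
        return bool((self._value >> index) & 1)

    def count(self, value=True):
        # Sequence.count(value) receives the value to look for; only
        # True/1 and False/0 can ever occur in a bitfield.
        ones = bin(self._value).count("1")
        if value == 1:
            return ones
        if value == 0:
            return self._length - ones
        return 0


bits = TinyBitfield([1, 0, 1, 1, 0])
set_bits = bits.count()         # number of set bits
unset_bits = bits.count(False)  # number of unset bits
```

Subclassing Sequence also provides `__iter__`, `__contains__`, `index` and `__reversed__` as mixin methods built on `__len__` and `__getitem__`.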
Published by althonos about 5 years ago