Recent Releases of SeqArray

SeqArray - Bioconductor Release 3.16

CHANGES IN VERSION 1.38.0

UTILITIES

  • new option 'ext_nbyte' in seqGet2bGeno()
  • seqAlleleCount() and seqGetAF_AC_Missing() return NA instead of zero when all genotypes are missing at a site
  • seqGDS2VCF() does not output the FORMAT column if there is no selected sample (e.g., site-only VCF files)
  • seqGetData(, "$chrom_pos2") is similar to seqGetData(, "$chrom_pos"), except that duplicates are given a suffix ("1", "2" or >2)
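
The NA-instead-of-zero behaviour for sites where every genotype is missing can be illustrated without SeqArray. This is a minimal base-R sketch, not the package's implementation: the helper `ref_allele_count()` and the toy genotype matrices are hypothetical, and genotypes are assumed to be coded as allele indices (0 = reference) with NA for a missing call.

```r
# Hypothetical helper (not part of SeqArray): count reference alleles at one
# site, returning NA when every genotype call at the site is missing.
ref_allele_count <- function(geno) {
  if (all(is.na(geno))) return(NA_integer_)
  sum(geno == 0L, na.rm = TRUE)
}

# Toy data: 2 alleles (rows) x 3 samples (columns) per site
site1 <- matrix(c(0L, 1L, 0L, 0L, NA, NA), nrow = 2)  # one sample missing
site2 <- matrix(NA_integer_, nrow = 2, ncol = 3)      # all samples missing

counts <- c(ref_allele_count(site1), ref_allele_count(site2))
print(counts)
```

Returning NA rather than zero keeps "no reference alleles observed" distinct from "nothing observed at all" in downstream filtering.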

NEW FEATURES

  • seqGDS2BED() can convert to PLINK BED files with the best-guess genotypes when there are only numeric dosages in the GDS file
  • seqEmptyFile() outputs an empty GDS file
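
As a rough illustration of what a "best-guess genotype" derived from a numeric dosage can look like, here is a minimal base-R sketch. It assumes the best guess is the dosage rounded to the nearest integer and clamped to 0..2; the release notes do not state which rule seqGDS2BED() applies, and the helper `best_guess()` is hypothetical.

```r
# Assumption: best guess = nearest integer dosage, clamped to the range 0..2.
# Missing dosages stay missing.
best_guess <- function(dosage) {
  as.integer(pmin(pmax(round(dosage), 0), 2))
}

dosage <- c(0.02, 0.97, 1.8, NA, 1.2)
geno <- best_guess(dosage)
print(geno)
```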

- C++
Published by zhengxwen over 3 years ago

SeqArray - Bioconductor Release 3.15

CHANGES IN VERSION 1.36.0

NEW FEATURES

  • new functions seqUnitCreate(), seqUnitSubset() and seqUnitMerge()
  • new functions seqFilterPush() and seqFilterPop()
  • new functions seqGet2bGeno() and seqGetAF_AC_Missing()
  • new function seqGetData(, "$dosage_sp") for a sparse matrix of dosages
  • the first argument 'gdsfile' can be a file name in seqAlleleFreq(), seqAlleleCount(), seqMissing()
  • new function seqMulticoreSetup() for setting a multicore cluster according to a numeric value assigned to the argument 'parallel'

UTILITIES

  • allow opening a duplicated GDS file ('allow.duplicate=TRUE') when the input is a file name instead of a GDS object in seqGDS2VCF(), seqGDS2SNP(), seqGDS2BED(), seqVCF2GDS(), seqSummary(), seqCheck() and seqMerge()
  • remove the deprecated '.progress' in seqMissing(), seqAlleleCount() and seqAlleleFreq()
  • add summary.SeqUnitListClass()
  • seqSNP2GDS() creates no genotype and phase data nodes if the input is a SNP dosage GDS file

BUG FIXES

  • seqUnitApply() works correctly with selected samples if 'parallel' is a non-fork cluster
  • seqVCF2GDS() and seqVCF_Header() work correctly if the VCF header has white space
  • seqGDS2BED() works correctly with selected samples for sex and phenotype information
  • bug fix in seqGDS2VCF() if there is no integer genotype

Published by zhengxwen about 4 years ago

SeqArray - Bioconductor Release v3.13

CHANGES IN VERSION 1.32.0

NEW FEATURES

  • new option 'ret.idx' in seqSetFilter() for unsorted sample and variant indices
  • new option 'ret.idx' in seqSetFilterAnnotID() for unsorted variant index
  • rewrite the function seqSetFilterPos(): new options 'ref' and 'alt', 'multi.pos=TRUE' by default
  • new option 'packed.idx' in seqAddValue() for packing an indexing variable
  • new option 'warn' in seqSetFilter() to enable or disable the warning
  • new functions seqNewVarData() and seqListVarData() for variable-length data

UTILITIES

  • allow no variant in seqApply() and seqBlockApply()
  • the list object returned from seqGetData() always has names if there is more than one input variable name

BUG FIXES

  • seqGDS2VCF() should output "." instead of NA in the FILTER column
  • seqGetData() should support factor when '.padNA=TRUE' or '.tolist=TRUE'
  • fix seqGDS2VCF() with factor variables
  • seqSummary(gds, "$filter") should return a data frame with zero row if 'annotation/filter' is not a factor
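
The FILTER fix above reflects the VCF convention that a missing value is written as ".". A minimal base-R sketch of that substitution follows; the vector `filter` is toy data, and this is not the code used inside seqGDS2VCF().

```r
# Toy FILTER values; NA marks a variant with no filter information
filter <- c("PASS", NA, "q10")

# Write "." for missing entries, as the VCF format expects
filter_out <- ifelse(is.na(filter), ".", filter)
print(filter_out)
```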

Published by zhengxwen almost 5 years ago

SeqArray - Bioconductor Release 3.10

CHANGES IN VERSION 1.26.0

NEW FEATURES

  • new function seqAddValue()

UTILITIES

  • RLE chromosome coding in seqBED2GDS()
  • change the file name "vignettes/R_Integration.Rmd" to "vignettes/SeqArray.Rmd", so vignette("SeqArray") can work directly
  • correct Estimated remaining Time to Complete (ETC) for load balancing in seqParallel()
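
Run-length encoding (RLE) suits chromosome codes because variants are usually sorted by chromosome, so the same code repeats in long runs. The following base-R sketch shows the general idea with rle() and inverse.rle(); it uses a toy vector and does not show how seqBED2GDS() stores the encoding internally.

```r
# Toy chromosome codes for six variants, sorted by chromosome
chrom <- c("1", "1", "1", "2", "2", "X")

# Encode: one value and one run length per stretch of identical codes
enc <- rle(chrom)
print(enc$values)
print(enc$lengths)

# Decode back to the per-variant vector
decoded <- inverse.rle(enc)
```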

BUG FIXES

  • seqBED2GDS(, verbose=FALSE) should have no display

CHANGES

  • use an SVG file instead of a PNG in vignettes

Published by zhengxwen over 6 years ago

SeqArray - Bioconductor Release 3.9

CHANGES IN VERSION 1.24.0

NEW FEATURES

  • a new function seqResetVariantID()
  • a new option in seqRecompress(, compress="none") to uncompress all data
  • seqGetData() allows a GDS file name in the first argument

Published by zhengxwen about 7 years ago

SeqArray - Bioconductor Release v3.8

CHANGES IN VERSION 1.22.0

UTILITIES

  • avoid duplicated meta-information lines in seqVCF2GDS() and seqVCF_Header()
  • require >= R_v3.5.0, since reading from connections in text mode is buffered
  • seqDigest() requires the digest package
  • optimization in reading genotypes from a subset of samples (according to gdsfmt_1.17.5)

NEW FEATURES

  • seqSNP2GDS() imports dosage GDS files
  • seqVCF_Header() allows a BCF file as an input
  • a new function seqRecompress()
  • a new function seqCheck() for checking the data integrity of a SeqArray GDS file
  • seqGDS2SNP() exports dosage GDS files

BUG FIXES

  • seqVCF2GDS() and seqVCF_Header() are able to import site-only VCF files (i.e., VCF with no sample)
  • fix seqVCF2GDS() and seqBCF2GDS() since reading from connections in text mode is buffered for R >= v3.5.0

Published by zhengxwen over 7 years ago

SeqArray - backward compatibility for ≤ R_3.4.4

Reading from connections in text mode is buffered for R >= 3.5.0. This release does not use the buff fields found in the new version (>=3.5.0) of R_ext/Connections.h:

```c
struct Rconn {
    ...
    unsigned char *buff;
    size_t buff_len, buff_stored_len, buff_pos;
};
```

Install (in R):

    library(devtools)
    install_github("zhengxwen/SeqArray", ref="1d5ab05fa8ae8b754feab62f41ab00a182d54793")

Published by zhengxwen over 7 years ago

SeqArray - Bioconductor Release (v3.5)

Published by zhengxwen about 9 years ago

SeqArray - SeqArray_1.12.8

Published by zhengxwen over 9 years ago

SeqArray - SeqArray package with a slow version of seqVCF2GDS()

  • SeqArray v1.11.18 is backward compatible with R v2.15.0
  • later versions require R (>= v3.3.0), whose official C API R_GetConnection() is used to accelerate text import and export

Install (in R):

    library("devtools")
    install_github("zhengxwen/SeqArray", ref="v1.11.18")

Published by zhengxwen about 10 years ago

SeqArray - SeqArray_1.10.0 (BioC 3.2 Release)

  • new functions seqGDS2SNP(), seqOptimize(), seqMissing(), seqAlleleFreq(), seqNumAllele(), seqSetFilterChrom(), seqSNP2GDS(), seqBED2GDS(), seqAlleleCount() and seqResetFilter()
  • supported by the SNPRelate package
  • support seqApply(..., margin="by.sample")
  • "intersection" and "push+intersection" in seqSetFilter()
  • a new function seqExport()
  • new argument ".useraw" in seqApply()
  • improve access speed (+50%, benchmark on calling seqApply(..., FUN=function(x) {}))
  • seqCompress.Option is renamed to seqStorage.Option
  • "ZIPRA" is the default value in seqStorage.Option() and other functions instead of "ZIPRA.max"
  • seqSetFilter() becomes an S4 method

Published by zhengxwen over 10 years ago