Recent Releases of stringi
stringi - stringi_1.8.6
1.8.6 (2025-03-26)
[BUGFIX] Fixed build warnings.
[BUGFIX] #512: Fixed PROTECT stack imbalance in `stri_encode_from_marked`.
- C++
Published by gagolews 11 months ago
stringi - stringi_1.8.4
1.8.4 (2024-05-06)
- [BUILD TIME] [BUGFIX] #508: Fixed build errors on Windows (thanks to @jeroen and @kalibera).
Published by gagolews almost 2 years ago
stringi - stringi_1.8.3
1.8.3 (2023-12-10)
[BUILD TIME] [BUGFIX] Fixed the *format string is not a string literal
(potentially insecure)* warnings.
Published by gagolews about 2 years ago
stringi - stringi_1.8.2
1.8.2 (2023-11-22)
[BUILD TIME] [BUGFIX] #501: Fixed failing build on 32-bit Windows (Windows API `ResolveLocaleName` function not available).
[BUILD TIME] [BUGFIX] #502: `PKG_CPPFLAGS` are now considered before other `CPPFLAGS` (the same with other flag types) in the `configure` script to make it compatible with what happens in `Makevars`.
[BUILD TIME] [BUGFIX] Support for ICU's double conversion on Loongarch has been restored (see #463).
Published by gagolews over 2 years ago
stringi - stringi_1.8.1
1.8.1 (2023-11-09)
[GENERAL] ICU bundle updated to version 74.1 (Unicode 15.1, CLDR 44).
[BACKWARD INCOMPATIBILITY] [BUILD TIME] Support for Solaris has now been dropped. The package is no longer shipped with the very outdated ICU55 bundle. A compiler supporting at least C++11 as well as ICU >= 61 are now required.
[BACKWARD INCOMPATIBILITY] #469: Missing date-time fields in `stri_datetime_parse` and `stri_datetime_create` now default to today's midnight local time.
[BACKWARD INCOMPATIBILITY] Removed the long-deprecated and defunct `fallback_encoding` parameter of `stri_read_lines` and the ellipsis parameter of `stri_opts_collator`, `stri_opts_regex`, `stri_opts_fixed`, and `stri_opts_brkiter`.
[BUILD TIME] As per the suggestion of Prof. Brian Ripley, `icudt74l` (ICU data - little endian) is now included in the source tarball (compressed with xz to save space). This allows for building stringi on systems with no internet access.
[NEW FEATURE] #476: In break iterator-, date-time-, and collator-based operations (e.g., `stri_sort`), a warning is emitted when the root ICU resource bundle is returned when using an explicitly requested locale. This might happen when we pass an 'unknown' `locale` argument to these functions. Note that when relying on the default `locale=NULL` argument, no warning is emitted. In such a case, checking if the default locale as returned by `stri_locale_get` is amongst those listed in `stri_locale_list` is recommended.
[NEW FEATURE] The `C` locale identifier now resolves to `en_US_POSIX`.
[BUGFIX] #469: `stri_datetime_parse` did not reset the `Calendar` object when parsing multiple dates.
[BUGFIX] #487: Some functions did not accept ASCII strings longer than 858993457 characters on input.
Published by gagolews over 2 years ago
stringi - stringi_1.7.12
1.7.12 (2023-01-09)
[BUGFIX] Fixed some potential problems reported by `rchk`.
[NOTE] [BACKWARD INCOMPATIBLE CHANGE IF ICU >= 72] If building against ICU >= 72, note a backward incompatible change: `@` is no longer a word break; see https://github.com/unicode-org/cldr/pull/2256 for more details.
Published by gagolews about 3 years ago
stringi - stringi_1.7.7
1.7.7 (2022-07-02)
[DOCUMENTATION] Paper on stringi has been published in the Journal of Statistical Software, see https://dx.doi.org/10.18637/jss.v103.i02.
[BUGFIX] #473, #397: Fixed buffer overflow in `stri_dup`. `stri_dup`, `stri_paste`, ... fail more gracefully on attempts to generate strings of length >= 2^31 each.
[BUILD TIME] #480: Using `Rf_isNull` instead of `isNull`.
[DOCUMENTATION] #462: That the `numeric=TRUE` collator does not handle negative numbers correctly is now mentioned in the manual.
Published by gagolews over 3 years ago
stringi - stringi_1.7.6
1.7.6 (2021-11-29)
[BUILD TIME] #463: Added loongarch support in ICU's double conversion (@liuxiang88).
[BUGFIX] #467: The UCRT build on Windows was not marking strings as `latin1`.
Published by gagolews about 4 years ago
stringi - stringi_1.7.5
1.7.5 (2021-10-04)
[DOCUMENTATION] Paper on stringi has been accepted for publication in the Journal of Statistical Software, see https://stringi.gagolewski.com/_static/vignette/stringi.pdf for a draft version.
[DOCUMENTATION] The stringi website at https://stringi.gagolewski.com now features a comprehensive tutorial based on the aforementioned paper.
[DOCUMENTATION] The ICU Project site has been moved to https://icu.unicode.org/.
[BUILD TIME] #457: The `autoconf` macros `AC_LANG_CPLUSPLUS` and `AC_TRY_COMPILE` were obsolete.
[BUGFIX] #458: Passing ALTREP objects no longer yields 'embedded nul in string' errors.
Published by gagolews over 4 years ago
stringi - stringi_1.7.4
1.7.4 (2021-08-12)
[BUGFIX] #449: Fixed segfaults generated by `stri_sprintf`.
[BUILD TIME] No longer defining `USE_RINTERNALS` and `R_NO_REMAP`.
Published by gagolews over 4 years ago
stringi - stringi_1.7.3
[BUGFIX] Fixed the previous patch of ICU55 causing a build failure on, amongst others, CRAN's Solaris-based target.
Published by gagolews over 4 years ago
stringi - stringi_1.7.2
- [BUGFIX] Workaround for a bug in `tools::checkFF` failing when `NA_character_` is passed to `.Call`.
Published by gagolews over 4 years ago
stringi - stringi_1.7.1
What Is New in stringi
1.7.1 (2021-07-14)
[BACKWARD INCOMPATIBILITY] `%s$%` and `%stri$%` now use the new `stri_sprintf` (see below) function instead of `base::sprintf`.
[BACKWARD INCOMPATIBILITY, NEW FEATURE] In `stri_sub<-` and `stri_sub_all<-`, providing a negative `length` from now on does not result in the corresponding input string being altered.
[BACKWARD INCOMPATIBILITY, NEW FEATURE] In `stri_sub` and `stri_sub_all`, negative `length` results in the corresponding output being `NA` or not extracted at all, depending on the setting of the new argument `ignore_negative_length`.
[BACKWARD INCOMPATIBILITY, BUGFIX, NEW FEATURE] In `stri_subset*` and their replacement versions, `pattern` and `value` cannot be longer than `str` (but now they are recycled if necessary).
[BACKWARD INCOMPATIBILITY, NEW FEATURE] `stri_sub*` now accept the `from` argument being a matrix like `cbind(from, length=length)`. Unnamed columns or any other names are still interpreted as `cbind(from, to)`. Also, the new argument `use_matrix` can be used to disable the special treatment of such matrices.
[DOCUMENTATION] It has been clarified that the syntax of `*_charclass` (e.g., used in `stri_trim*`) differs slightly from regex character classes.
[NEW FEATURE] #420: `stri_sprintf` (alias: `stri_string_format`) is a Unicode-aware replacement for and enhancement of the base `sprintf`: it adds a customised handling of `NA`s (on demand), computing field size based on code point width, outputting substrings of at most given width, variable width and precision (both at the same time), etc. Moreover, `stri_printf` can be used to display formatted strings conveniently.
[NEW FEATURE] #153: `stri_match_*_regex` now extract capture group names.
[NEW FEATURE] #25: `stri_locate_*_regex` now have a new argument, `capture_groups`, which allows for extracting positions of matches to parenthesised subexpressions.
[NEW FEATURE] `stri_locate_*` now have a new argument, `get_length`, whose setting may result in generating from-length matrices (instead of from-to ones).
[NEW FEATURE] #438: `stri_trans_general` now supports rule-based as well as reverse-direction transliteration.
[NEW FEATURE] #434: `stri_datetime_format` and `stri_datetime_parse` are now vectorised also with respect to the `format` argument.
[NEW FEATURE] `stri_datetime_fstr` has a new argument, `ignore_special`, which defaults to `TRUE` for backward compatibility.
[NEW FEATURE] `stri_datetime_format`, `stri_datetime_add`, and `stri_datetime_fields` now call `as.POSIXct` more eagerly.
[NEW FEATURE] `stri_trim*` now have a new argument, `negate`.
[NEW FEATURE] `stri_replace_rstr` converts `gsub`-style replacement strings to `stri_replace`-style.
[INTERNAL] `stri_prepare_arg*` have been refactored; buffer overruns in the exception handling subsystem are now avoided.
[BUGFIX] Few functions (`stri_length`, `stri_enc_toutf32`, etc.) did not throw an exception on an invalid UTF-8 byte sequence (and merely issued a warning instead).
[BUGFIX] `stri_datetime_fstr` did not honour `NA_character_` and did not parse format strings such as `"%Y%m%d"` correctly. It has now been completely rewritten (in C).
[BUGFIX] `stri_wrap` did not recognise the width of certain Unicode sequences correctly.
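A few of the 1.7.1 additions can be tried out as follows. This is only a minimal sketch, assuming stringi >= 1.7.1 is installed; the input strings and variable names are made up for illustration.

```r
library(stringi)

# stri_sprintf: NAs can be rendered as a chosen string on demand
msg <- stri_sprintf("%s has %d items", c("cart", NA), 3L, na_string = "<NA>")

# stri_locate_*: get_length=TRUE requests from-length instead of from-to
loc <- stri_locate_first_fixed("hello world", "world", get_length = TRUE)

# stri_match_*_regex: capture group names become column names
m <- stri_match_first_regex("key=value", "(?<k>\\w+)=(?<v>\\w+)")

# stri_sub: a negative length yields NA unless ignore_negative_length=TRUE
s <- stri_sub("abcdef", from = 2, length = -1)
```

Named columns such as `m[, "k"]` make downstream code less dependent on the position of a group within the pattern.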
Published by gagolews over 4 years ago
stringi - stringi_1.6.2
[BACKWARD INCOMPATIBILITY] In `stri_enc_list()`, `simplify` now defaults to `TRUE`.
[NEW FEATURE] #425: The outputs of `stri_enc_list()`, `stri_locale_list()`, `stri_timezone_list()`, and `stri_trans_list()` are now sorted.
[NEW FEATURE] #428: In `stri_flatten`, `na_empty=NA` now omits missing values.
[BUILD TIME] #431: Pre-4.9.0 GCC has `::max_align_t`, but not `std::max_align_t`; added a (possible) workaround, see the INSTALL file.
[BUGFIX] #429: `stri_width()` misclassified the width of certain code points (including grave accent, Eszett, etc.); General category Sk (Symbol, modifier) is no longer of width 0, `UCHAR_EAST_ASIAN_WIDTH` of `U_EA_AMBIGUOUS` is no longer of width 2.
[BUGFIX] #354: `ALTREP` `CHARSXP`s were not copied, and thus could have been garbage collected in the so-called meanwhile (with thanks to @jimhester).
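The three settings of `na_empty` in `stri_flatten` can be compared side by side. A minimal sketch, assuming stringi >= 1.6.2; the vector `x` is an arbitrary example.

```r
library(stringi)

x <- c("a", NA, "b")

# Default (na_empty=FALSE): a single missing value makes the result missing
default <- stri_flatten(x, collapse = ",")

# na_empty=TRUE: missing values are treated as empty strings
as_empty <- stri_flatten(x, collapse = ",", na_empty = TRUE)

# na_empty=NA: missing values are omitted altogether
omitted <- stri_flatten(x, collapse = ",", na_empty = NA)
```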
Published by gagolews almost 5 years ago
stringi - stringi_1.6.1
What Is New in stringi
1.6.1 (2021-05-05)
[GENERAL] #401: stringi is now bundled with ICU4C 69.1 (upgraded from 61.1), which is used on most Windows and OS X builds as well as on *nix systems not equipped with system ICU. However, if the C++11 support is disabled, stringi will be built against the battle-tested ICU4C 55.1. The update to ICU brings Unicode 13.0 and CLDR 39 support.
[DOCUMENTATION] A draft version of a paper on `stringi` is now available at https://stringi.gagolewski.com/_static/vignette/stringi.pdf
[GENERAL] stringi now requires R >= 3.1 (`CXX_STD` of `CXX11` or `CXX1X`).
[NEW FEATURE] #408: `stri_trans_casefold()` performs case folding; this is different from case mapping, which is locale-dependent. Folding makes two pieces of text that differ only in case identical. This can come in handy when comparing strings.
[NEW FEATURE] #421: `stri_rank()` ranks strings in a character vector (e.g., for ordering data frames with regards to multiple criteria, the ranks can be passed to `order()`, see #219).
[NEW FEATURE] #266: `stri_width()` now supports emojis.
[NEW FEATURE] `%s$%` and `%stri$%` are now vectorised with respect to both arguments.
[BUGFIX] `stri_sort_key()` now outputs `bytes`-encoded strings.
[BUGFIX] #415: `locale=''` was not equivalent to `locale=NULL` in `stri_opts_collator()`.
[INTERNAL] #414: Use `LEVELS(x)` macro instead of accessing `(x)->sxpinfo.gp` directly (@lukaszdaniel).
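Case folding and ranking, as described above, might be used along these lines. This is an illustrative sketch, assuming stringi >= 1.6.1; the data frame `df` and its column names are hypothetical.

```r
library(stringi)

# Case folding makes strings that differ only in case identical
folded <- stri_trans_casefold(c("Hello", "HELLO", "hello"))
same <- length(unique(folded)) == 1L

# Ranks from stri_rank() can be passed to order() as one of several criteria
df <- data.frame(group = c(2, 1, 2, 1),
                 name = c("b", "a", "a", "c"),
                 stringsAsFactors = FALSE)
r <- stri_rank(df$name, locale = "en")
sorted <- df[order(df$group, r), ]
```

Tied strings receive the same rank, so further columns can still be used to break ties inside `order()`.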
Published by gagolews almost 5 years ago
stringi - stringi_1.5.3
1.5.3 (2020-09-04) CRAN
[NEW FEATURE] #400: `%s$%` and `%stri$%` are now binary operators that call base R's `sprintf()`.
[NEW FEATURE] #399: The `%s*%` and `%stri*%` operators can be used in addition to `stri_dup()`, for the very same purpose.
[NEW FEATURE] #355: `stri_opts_regex()` now accepts the `time_limit` and `stack_limit` options so as to prevent malformed or malicious regexes from running for too long.
[NEW FEATURE] #345: `stri_startswith()` and `stri_endswith()` are now equipped with the `negate` parameter.
[NEW FEATURE] #382: Incorrect regexes are now reported to ease debugging.
[DEPRECATION WARNING] #347: Any unknown option passed to `stri_opts_fixed()`, `stri_opts_regex()`, `stri_opts_coll()`, and `stri_opts_brkiter()` now generates a warning. In the future, the `...` parameter will be removed, so that will be an error.
[DEPRECATION WARNING] `stri_duplicated()`'s `fromLast` argument has been renamed `from_last`. `fromLast` is now its alias scheduled for removal in a future version of the package.
[DEPRECATION WARNING] `stri_enc_detect2()` is scheduled for removal in a future version of the package. Use `stri_enc_detect()` or the more targeted `stri_enc_isutf8()`, `stri_enc_isascii()`, etc., instead.
[DEPRECATION WARNING] `stri_read_lines()`, `stri_write_lines()`, `stri_read_raw()`: use `con` argument instead of `fname` now. The argument `fallback_encoding` is scheduled for removal and is no longer used. `stri_read_lines()` does not support `encoding="auto"` anymore.
[DEPRECATION WARNING] `nparagraphs` in `stri_rand_lipsum()` has been renamed `n_paragraphs`.
[NEW FEATURE] #398: Alternative, British spelling of function parameters has been introduced, e.g., `stri_opts_coll()` now supports both `normalization` and `normalisation`.
[NEW FEATURE] #393: `stri_read_bin()`, `stri_read_lines()`, and `stri_write_lines()` are no longer marked as draft API.
[NEW FEATURE] #187: `stri_read_bin()`, `stri_read_lines()`, and `stri_write_lines()` now support connection objects as well.
[NEW FEATURE] #386: New function `stri_sort_key()` for generating locale-dependent sort keys which can be ordered at the byte level and return an equivalent ordering to the original string (@DavisVaughan).
[BUGFIX] #138: `stri_encode()` and `stri_rand_strings()` now can generate strings of much larger lengths.
[BUGFIX] `stri_wrap()` did not honour `indent` correctly when `use_width` was `TRUE`.
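Some of the operators and parameters introduced in this release series can be exercised as below. A minimal sketch, assuming stringi >= 1.5.3; the inputs and the particular `time_limit` value are arbitrary choices for illustration.

```r
library(stringi)

# %s*% duplicates strings, just like stri_dup()
dup_op <- "ab" %s*% 3
dup_fn <- stri_dup("ab", 3)

# negate=TRUE selects the strings that do NOT start with the pattern
not_http <- stri_startswith_fixed(c("http://a", "ftp://b"), "http", negate = TRUE)

# from_last is the new name of stri_duplicated()'s fromLast argument
dups <- stri_duplicated(c("a", "b", "a"), from_last = TRUE)

# A time limit guards against regexes that run for too long
ok <- stri_detect_regex("aaa", "a+", opts_regex = stri_opts_regex(time_limit = 1000L))
```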
Published by gagolews over 5 years ago
stringi - stringi_1.5.2
1.5.2 (2020-09-01) CRAN
[NEW FEATURE] #400: `%s$%` and `%stri$%` are now binary operators that call base R's `sprintf()`.
[NEW FEATURE] #399: The `%s*%` and `%stri*%` operators can be used in addition to `stri_dup()`, for the very same purpose.
[NEW FEATURE] #355: `stri_opts_regex()` now accepts the `time_limit` and `stack_limit` options so as to prevent malformed or malicious regexes from running for too long.
[NEW FEATURE] #345: `stri_startswith()` and `stri_endswith()` are now equipped with the `negate` parameter.
[NEW FEATURE] #382: Incorrect regexes are now reported to ease debugging.
[DEPRECATION WARNING] #347: Any unknown option passed to `stri_opts_fixed()`, `stri_opts_regex()`, `stri_opts_coll()`, and `stri_opts_brkiter()` now generates a warning. In the future, the `...` parameter will be removed, so that will be an error.
[DEPRECATION WARNING] `stri_duplicated()`'s `fromLast` argument has been renamed `from_last`. `fromLast` is now its alias scheduled for removal in a future version of the package.
[DEPRECATION WARNING] `stri_enc_detect2()` is scheduled for removal in a future version of the package. Use `stri_enc_detect()` or the more targeted `stri_enc_isutf8()`, `stri_enc_isascii()`, etc., instead.
[NEW FEATURE] #398: Alternative, British spelling of function parameters has been introduced, e.g., `stri_opts_coll()` now supports both `normalization` and `normalisation`.
[NEW FEATURE] #393: `stri_read_bin()`, `stri_read_lines()`, and `stri_write_lines()` are no longer marked as draft API. `stri_read_lines()` does not support `encoding="auto"` anymore.
[NEW FEATURE] #187: `stri_read_bin()`, `stri_read_lines()`, and `stri_write_lines()` now support connection objects as well.
[NEW FEATURE] #386: New function `stri_sort_key()` for generating locale-dependent sort keys which can be ordered at the byte level and return an equivalent ordering to the original string (@DavisVaughan).
[BUGFIX] #138: `stri_encode()` and `stri_rand_strings()` now can generate strings of much larger lengths.
[BUGFIX] `stri_wrap()` did not honour `indent` correctly when `use_width` was `TRUE`.
Published by gagolews over 5 years ago
stringi - stringi_1.5.1
1.5.1 (2020-08-31)
[NEW FEATURE] #400: `%s$%` and `%stri$%` are now binary operators that call base R's `sprintf()`.
[NEW FEATURE] #399: The `%s*%` and `%stri*%` operators can be used in addition to `stri_dup()`, for the very same purpose.
[NEW FEATURE] #355: `stri_opts_regex()` now accepts the `time_limit` and `stack_limit` options so as to prevent malformed or malicious regexes from running for too long.
[NEW FEATURE] #345: `stri_startswith()` and `stri_endswith()` are now equipped with the `negate` parameter.
[NEW FEATURE] #382: Incorrect regexes are now reported to ease debugging.
[DEPRECATION WARNING] #347: Any unknown option passed to `stri_opts_fixed()`, `stri_opts_regex()`, `stri_opts_coll()`, and `stri_opts_brkiter()` now generates a warning. In the future, the `...` parameter will be removed, so that will be an error.
[DEPRECATION WARNING] `stri_duplicated()`'s `fromLast` argument has been renamed `from_last`. `fromLast` is now its alias scheduled for removal in a future version of the package.
[DEPRECATION WARNING] `stri_enc_detect2()` is scheduled for removal in a future version of the package. Use `stri_enc_detect()` or the more targeted `stri_enc_isutf8()`, `stri_enc_isascii()`, etc., instead.
[NEW FEATURE] #398: Alternative, British spelling of function parameters has been introduced, e.g., `stri_opts_coll()` now supports both `normalization` and `normalisation`.
[NEW FEATURE] #393: `stri_read_bin()`, `stri_read_lines()`, and `stri_write_lines()` are no longer marked as draft API. `stri_read_lines()` does not support `encoding="auto"` anymore.
[NEW FEATURE] #187: `stri_read_bin()`, `stri_read_lines()`, and `stri_write_lines()` now support connection objects as well.
[NEW FEATURE] #386: New function `stri_sort_key()` for generating locale-dependent sort keys which can be ordered at the byte level and return an equivalent ordering to the original string (@DavisVaughan).
[BUGFIX] #138: `stri_encode()` and `stri_rand_strings()` now can generate strings of much larger lengths.
[BUGFIX] `stri_wrap()` did not honour `indent` correctly when `use_width` was `TRUE`.
Published by gagolews over 5 years ago
stringi - stringi_1.4.6
CRAN release v1.4.5
[NEW FEATURE] #369: `stri_c()` now returns an empty string when input is empty and `collapse` is set.
[BUGFIX] #370: fixed an issue in `stri_prepare_arg_POSIXct()` reported by rchk.
[BUGFIX] #372: documented arguments not in `\usage` in documentation object `stri_datetime_format`: `...`
Published by gagolews about 6 years ago
stringi - stringi_1.3.1
1.3.1 (2019-02-10) CRAN
[BACKWARD INCOMPATIBILITY] #335: A fix to #314 (by design) prevented the use of the system ICU if the library had been compiled with `U_CHARSET_IS_UTF8=1`. However, this is the default setting in `libicu`>=61. From now on, in such cases the system ICU is used more eagerly, but `stri_enc_set()` issues a warning stating that the default (UTF-8) encoding cannot be changed.
[NEW FEATURE] #232: All `stri_detect_*` functions now have the `max_count` argument that allows for, e.g., stopping at first pattern occurrence.
[NEW FEATURE] #338: `stri_sub_replace()` is now an alias for `stri_sub<-()` which makes it much more easily pipable (@yutannihilation, @BastienFR).
[NEW FEATURE] #334: Added missing `icudt61b.dat` to support big-endian platforms (thanks to Dimitri John Ledkov @xnox).
[BUGFIX] #296: Out-of-the-box build used to fail on CentOS 6, upgraded `./configure` to `--disable-cxx11` more eagerly at an early stage.
[BUGFIX] #341: Fixed possible buffer overflows when calling `strncpy()` from within ICU 61.
[BUGFIX] #325: Made `./configure` more portable so that it works under `/bin/dash` now.
[BUGFIX] #319: Fixed overflow in `stri_rand_shuffle()`.
[BUGFIX] #337: Empty search patterns in search functions (e.g., `stri_split_regex()` and `stri_count_fixed()`) used to raise too many warnings.
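The `max_count` argument and the pipe-friendly `stri_sub_replace()` could be used roughly as follows. A sketch only, assuming stringi >= 1.3.1; the example vectors are invented.

```r
library(stringi)

# Inspection stops once max_count matches have been found;
# the elements that were not inspected are reported as NA
hits <- stri_detect_fixed(c("x", "a", "b", "a"), "a", max_count = 1)

# stri_sub_replace() returns the modified string instead of assigning in place
replaced <- stri_sub_replace("abcdef", from = 1, to = 3, value = "XYZ")
```

Because `stri_sub_replace()` returns its result, it can sit in the middle of a pipeline where `stri_sub<-` would need an intermediate variable.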
Published by gagolews about 7 years ago
stringi - stringi_1.1.6
CHANGELOG:
* [WINDOWS SPECIFIC] #270: Strings marked with `latin1` encoding are now converted internally to UTF-8 using the WINDOWS-1252 codec. This fixes problems with - among others - displaying the Euro sign.
* [NEW FEATURE] #263: Add support for custom rule-based break iteration, see `?stri_opts_brkiter`.
* [NEW FEATURE] #267: `omit_na=TRUE` in `stri_sub<-` now ignores missing values in any of the arguments provided.
* [BUGFIX] fixed unPROTECTed variable names and stack imbalances as reported by rchk
Published by gagolews over 8 years ago
stringi - stringi_0.5-3
- [BACKWARD INCOMPATIBILITY] `stri_install_check` and `stri_install_icudt` are now deprecated. From now on they are supposed to be used only by the `stringi` installer.
- [BUGFIX] #176: a patch for `sys/feature_tests.h` no longer included (the original file was copyrighted by Sun Microsystems); fixed the *Compiler or options invalid for pre-UNIX 03 X/Open applications and pre-2001 POSIX applications* error by forcing `_XPG6` conformance.
- [BUGFIX] #174: `stri_paste()` did not generate any warning when the recycling rule is violated and `sep==""`.
- [BUGFIX] #170: `setDataDirectory` no longer called if our ICU src bundle is not used (this used to cause build problems on openSUSE).
- [BUILD TIME] #169: `./configure` now tries to switch to the "standard" C++ compiler if a C++11 one is not properly configured.
- [BUILD TIME] `configure.win` (`Biarch: TRUE`) now mimics `autoconf`'s `AC_SUBST` and `AC_CONFIG_FILES` so that the build process is now more similar across different platforms.
- [NEW FEATURE] `stri_info()` now also gives information on which ICU4C is used (system or bundle).
Published by gagolews over 10 years ago
stringi - stringi_0.5-2
- [NEW FUNCTIONS] #137: date-time formatting/parsing:
    - `stri_timezone_list()` - lists all known time zone identifiers
    - `stri_timezone_set()`, `stri_timezone_get()` - manage current default time zone
    - `stri_timezone_info()` - basic information on a given time zone
    - `stri_datetime_symbols()` - localizable date-time formatting data
    - `stri_datetime_fstr()` - convert a `strptime`-like format string to an ICU date/time format string
    - `stri_datetime_format()` - convert date/time to string
    - `stri_datetime_parse()` - convert string to date/time object
    - `stri_datetime_create()` - construct date-time objects from numeric representations
    - `stri_datetime_now()` - return current date-time
    - `stri_datetime_fields()` - get values for date-time fields
    - `stri_datetime_add()` - add specific number of date-time units to a date-time object
- [BUGFIX] #168: Build now fails if `icudt` is not available.
- [BACKWARD INCOMPATIBILITY] The second argument to `stri_pad_*()` has been renamed `width`.
- [GENERAL] #69: `stringi` is now bundled with ICU4C 55.1.
- [NEW FUNCTIONS] `stri_extract_*_boundaries()` extract text between text boundaries.
- [NEW FUNCTION] #46: `stri_trans_char()` is a `stringi`-flavoured `chartr()` equivalent.
- [NEW FUNCTION] #8: `stri_width()` approximates the width of a string in a more Unicodish fashion than `nchar(..., "width")`.
- [NEW FEATURE] #149: `stri_pad()` and `stri_wrap()` are now by default based on code point widths instead of the number of code points. Moreover, the default behavior of `stri_wrap()` is now such that it does not get rid of non-breaking, zero width, etc. spaces.
- [NEW FEATURE] #133: `stri_wrap()` silently allows for `width <= 0` (for compatibility with `strwrap()`).
- [NEW FEATURE] #139: `stri_wrap()` gained a new argument: `whitespace_only`.
- [GENERAL] #144: Performance improvements in handling ASCII strings (these affect `stri_sub()`, `stri_locate()` and other string index-based operations).
- [GENERAL] #143: Searching for short fixed patterns (`stri_*_fixed()`) now relies on the current `libC`'s implementation of `strchr()` and `strstr()`. This is very fast e.g. on `glibc` utilizing the `SSE2/3/4` instruction set.
- [GENERAL] #141: a local copy of `icudt*.zip` may be used on package install; see the `INSTALL` file for more information.
- [GENERAL] #165: the `./configure` option `--disable-icu-bundle` forces the use of system ICU when building the package.
- [BUGFIX] locale specifiers are now normalized in a more intelligent way: e.g. `@calendar=gregorian` expands to `DEFAULT_LOCALE@calendar=gregorian`.
- [BUGFIX] #134: `stri_extract_all_words()` did not accept `simplify=NA`.
- [BUGFIX] #132: incorrect behavior in `stri_locate_regex()` for matches of zero lengths.
- [BUGFIX] stringr/#73: `stri_wrap()` returned `CHARSXP` instead of `STRSXP` on empty string input with `simplify=FALSE` argument.
- [BUGFIX] #164: libicu-dev usage used to fail on Ubuntu.
- [BUGFIX] #135: C++11 is now used by default (see the `INSTALL` file, however) to build stringi from sources. This is because ICU4C uses the `long long` type which is not part of the C++98 standard.
- [BUGFIX] #154: Dates and other objects with a custom class attribute were not coerced to the character type correctly.
- [BUGFIX] Force ICU `u_init()` call on stringi dynlib load.
- [BUGFIX] #157: many overfull hboxes in the package PDF manual have been corrected.
Published by gagolews over 10 years ago
stringi - stringi_0.4-1
CHANGELOG:
- [IMPORTANT CHANGE] n_max argument in stri_split_*() has been renamed n.
- [IMPORTANT CHANGE] simplify=TRUE in stri_extract_all_*() and
stri_split_*() now calls stri_list2matrix() with fill="".
fill=NA_character_ may be obtained by using simplify=NA.
- [IMPORTANT CHANGE, NEW FUNCTIONS] #120: stri_extract_words has been
renamed stri_extract_all_words and stri_locate_boundaries -
stri_locate_all_boundaries as well as stri_locate_words -
stri_locate_all_words. New functions are now available:
stri_locate_first_boundaries, stri_locate_last_boundaries,
stri_locate_first_words, stri_locate_last_words,
stri_extract_first_words, stri_extract_last_words.
- [IMPORTANT CHANGE] #111: opts_regex, opts_collator, opts_fixed, and
opts_brkiter can now be supplied individually via ....
In other words, you may now simply call e.g.
stri_detect_regex(str, pattern, case_insensitive=TRUE) instead of
stri_detect_regex(str, pattern, opts_regex=stri_opts_regex(case_insensitive=TRUE)).
- [NEW FEATURE] #110: Fixed pattern search engine's settings can
now be supplied via opts_fixed argument in stri_*_fixed(),
see stri_opts_fixed(). A simple (not suitable for natural language
processing) yet very fast case_insensitive pattern matching can be
performed now. stri_extract_*_fixed is again available.
- [NEW FEATURE] #23: stri_extract_all_fixed, stri_count, and
stri_locate_all_fixed may now also look for overlapping pattern
matches, see ?stri_opts_fixed.
- [NEW FEATURE] #129: stri_match_*_regex gained a cg_missing argument.
- [NEW FEATURE] #117: stri_extract_all_*(), stri_locate_all_*(),
stri_match_all_*() gained a new argument: omit_no_match.
Setting it to TRUE makes these functions compatible with their
stringr equivalents.
- [NEW FEATURE] #118: stri_wrap() gained indent, exdent, initial,
and prefix arguments. Moreover Knuth's dynamic word wrapping algorithm
now assumes that the cost of printing the last line is zero, see #128.
- [NEW FEATURE] #122: stri_subset() gained an omit_na argument.
- [NEW FEATURE] stri_list2matrix() gained an n_min argument.
- [NEW FEATURE] #126: stri_split() now is also able to act
just like stringr::str_split_fixed().
- [NEW FEATURE] #119: stri_split_boundaries() now has
n, tokens_only, and simplify arguments. Additionally,
stri_extract_all_words() is now equipped with a simplify argument.
- [NEW FEATURE] #116: stri_paste() gained a new argument:
ignore_null. Setting it to TRUE makes this function more compatible
with paste().
- [NEW FEATURE] #114: stri_paste(): ignore_null arg has been added.
- [OTHER] #123: useDynLib is used to speed up symbol look-up in
the compiled dynamic library.
- [BUGFIX] #94: Run-time errors on Solaris caused by setting
-DU_DISABLE_RENAMING=1 -- memory allocation errors in i.a. ICU's
UnicodeString. This setting also caused some ABSan sanity check
failures within ICU code.
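The effect of the simplify, ..., omit_no_match, and ignore_null changes listed above can be checked with a few calls. A minimal sketch, assuming a current stringi installation; the inputs and variable names are made up for illustration.

```r
library(stringi)

parts <- c("a,b", "c")
as_list <- stri_split_fixed(parts, ",")                  # a list (the default)
filled <- stri_split_fixed(parts, ",", simplify = TRUE)   # matrix padded with ""
with_na <- stri_split_fixed(parts, ",", simplify = NA)    # matrix padded with NA

# Search engine options can be passed directly via ...
short <- stri_detect_regex("ABC", "abc", case_insensitive = TRUE)
long <- stri_detect_regex("ABC", "abc",
                          opts_regex = stri_opts_regex(case_insensitive = TRUE))

# omit_no_match=TRUE yields an empty vector instead of NA when nothing matches
none <- stri_extract_all_regex("abc", "[0-9]", omit_no_match = TRUE)

# ignore_null=TRUE makes stri_paste() skip NULL arguments, as paste() does
joined <- stri_paste("a", NULL, "b", ignore_null = TRUE)
```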
Published by gagolews about 11 years ago