Recent Releases of DataFrames

DataFrames - v1.7.1

DataFrames v1.7.1

Diff since v1.7.0

Ecosystem changes:
- CompatHelper: bump compat for DataStructures to 0.19, (keep existing compat) (#3503) (@github-actions[bot])
- Bump codecov/codecov-action from 4 to 5 (#3481) (@dependabot[bot])

Documentation changes:
- Updated Basic Usage of Manipulation Functions (#3360) (@nathanrboyer)
- docs for aggregation over grouped array-like elements (#3425) (@huangyxi)
- Stabilize random number reproducibility in doctests (#3472) (@nathanrboyer)
- Docs: Fix typo (#3474) (@agdestein)
- dcast instead of SDcols (#3475) (@tdhock)
- typo, df was d (#3477) (@rOsemium)
- compare stack/unstack to data.table melt/dcast (#3478) (@tdhock)
- Small formatting tweaks to #3360 after reviewing online (#3483) (@nathanrboyer)
- Update querying_frameworks.md adding TidierData on introduction (#3488) (@indymnv)
- Document DataFrame definition in code file using CSV.jl (#3501) (@MagicMuscleMan)
- Update categorical.md after CategoricalArrays.jl release (#3504) (@bkamins)

- Julia
Published by github-actions[bot] 10 months ago

DataFrames - v1.7.0

DataFrames v1.7.0

Diff since v1.6.1

Merged pull requests:
- allow push!/pushfirst!/append!/prepend! with multiple values (#3372) (@bkamins)
- add cols kwarg to rename/rename! (#3380) (@bkamins)
- Add JSS citation information (#3381) (@bkamins)
- fix typos (#3384) (@spaette)
- Fix @spawn_or_run_task with interactive threads (#3385) (@nalimilan)
- add cols to mapcols and mapcols! (#3386) (@bkamins)
- add example of using Tables.dictcolumntable (#3387) (@bkamins)
- fix nonunique bug (#3393) (@bkamins)
- remove unnecessary @time in tests (#3394) (@bkamins)
- fix first and last for negative row count (#3402) (@bkamins)
- Fix eachrow and eachcol indexing with CartesianIndex (#3413) (@bkamins)
- Update for Documenter.jl v1 and Julia v1.10 (#3416) (@hyrodium)
- Change big to BigInt calls (#3419) (@bkamins)
- Update docs on Juliacon (#3420) (@hyrodium)
- Import groupby from DataAPI, remove by and aggregate (#3422) (@bkamins)
- Advanced transformation examples (#3433) (@bkamins)
- disambiguate allunique signature (#3434) (@bkamins)
- do not pass empty vector to Tables.columntable (#3435) (@bkamins)
- Explain the role of querying frameworks for DataFrames.jl (#3438) (@bkamins)
- Typo fix (#3439) (@nathanrboyer)
- Add TidierData to frameworks docs page (#3447) (@drizk1)
- add ? suffix to show on all return paths (#3448) (@adienes)
- Update ci.yml (#3449) (@ViralBShah)
- Create dependabot.yml (#3450) (@ViralBShah)
- Bump julia-actions/cache from 1 to 2 (#3453) (@dependabot[bot])
- fix vcat type piracy (#3457) (@bkamins)
- Remove REPL dependency (#3459) (@topolarity)
- Update filter docs, Fixes #3460 (#3461) (@sprig)
- fix tests on nightly and 32-bit (#3463) (@bkamins)
- Improve names docs (#3464) (@bkamins)
- CompatHelper: add new compat entry for Statistics at version 1, (keep existing compat) (#3465) (@github-actions[bot])
- Fix codecov badge in README.md (#3466) (@ViralBShah)

Closed issues:
- rand(::GroupedDataFrame) sampler? (#2097)
- Investigate performance of innerjoin between large tables (#2974)
- Make row lookup easier (#3051)
- website: https://juliadata.org (#3338)
- Feature Request: Allow naming function in rename operation pairs. (#3361)
- Would adding support for JLD2.jl allow Type preservation? (#3364)
- Add support for multiple positional arguments in push!/pushfirst!/append!/prepend! (#3371)
- "RowNumber by Partition" function (#3374)
- Not with non-existing columns (#3375)
- leftjoin! is actually copying reference instead of value?! (#3379)
- Tests of describe and multithreading fail in Julia-1.10.0-beta3 (#3383)
- error when unique! a empty dataframe (#3392)
- combine on grouped df return empty df when args is empty (#3399)
- Inconsistent Mean Calculation in Grouped DataFrame Compared to Overall DataFrame (#3405)
- What is the best way to write large DataFrames efficiently and with high performance in Julia while minimizing memory usage? (#3406)
- Segmentation Fault when reading compressed file (#3407)
- Better error message when forming a DataFrame from a vector of dictionaries with missing data. (#3410)
- describe is slow (#3411)
- CartesianIndex error in Julia 1.11 (#3412)
- DataFrame(x=Int[], y=Int) (#3414)
- unique fails with column-type FixedDecimal (#3418)
- Grouped DataFrame with array elements fails to combine (#3424)
- error when combining a grouped empty dataframe using first (#3426)
- Short circuit && on subset? (#3427)
- Document custom generation of column names in manual (#3430)
- using propertynames on GroupedDataFrame (#3443)
- Very slow to convert DBInterface (DuckDB) result (#3444)
- Add Tidier.jl to docs/src/man/querying_frameworks.md (#3446)
- Type piracy of reduce(vcat) (#3456)
- filter performance (#3460)
- [POSSIBLE REGRESSION] DataFrames.jl Currently Failing on Nightly? (#3467)

Published by github-actions[bot] almost 2 years ago

DataFrames - v1.6.1

DataFrames v1.6.1

Diff since v1.6.0

Closed issues:
- sort missing data placement (#2267)
- Dependency on DataStructures should be explicit and versioned (#3358)

Merged pull requests:
- Improve error message when pushing/appending with promote=true (#3356) (@bkamins)
- more descriptive error message for only (#3357) (@ssfrr)
- Bk/fix faster orderings (#3359) (@bkamins)
- Add vector of names method to rename docstring (#3362) (@nathanrboyer)

Published by github-actions[bot] almost 3 years ago

DataFrames - v1.6.0

DataFrames v1.6.0

Diff since v1.5.0

Closed issues:
- sort! to give warning if resulting sorting order is not fully determined (#2159)
- More flexible Not column selector (#3288)
- DataFrame not print correctly (#3292)
- transpose method errors (#3295)
- juliadata.org website pointing to random blog about martial arts? (#3296)
- When partitioned, partition might lose the missingness eltype (in Tables.schema) (#3298)
- transform should expand a data frame when it has 0 rows. (#3301)
- Base.reduce(::typeof(vcat), ...) on DataFrames does not support init (#3309)
- DimensionMismatch when checking if the cell value (not) belong to a collection (#3316)
- Rename SubDataFrame columns (#3317)
- Accepting array element in rows specificed by named tuples, in combine (#3335)
- unstack error message for missing values (#3339)
- Bounds error when sorting a column after select (#3340)
- Don't print all data in huge columns (#3343)
- Show problem columns for "ArgumentError: missing values in key columns are not allowed when matchmissing == :error" (#3345)
- Don't truncate UUID columns (#3346)
- Cannot vcat DataFrames with ReadStatTables.LabeledArrays (#3351)
- Join memory usage workaround issues (#3355)

Merged pull requests:
- Fix typo in the manual (#3287) (@bkamins)
- Use pkgdir instead of pathof (#3289) (@rikhuijzer)
- Update README.md (#3297) (@aramirezreyes)
- add Iterators.partition for DataFrameRows (#3299) (@bkamins)
- add support for Not with multiple positional indices (#3302) (@bkamins)
- add :sum to describe (#3303) (@alecloudenback)
- deleteat! where drop is a column (#3304) (@gustafsson)
- Correct documentation typos (#3305) (@Naunet)
- Fix some typos (#3308) (@goggle)
- add init kwarg to vcat (#3310) (@bkamins)
- add nrow, ncol, and Tables.subset for eachcol and eachrow (#3311) (@bkamins)
- Simple uniqueness checks for sorting-related functions (#3312) (@alonsoC1s)
- Document use of isequal for comparisons (#3313) (@knuesel)
- Add support for renamecols keyword argument in crossjoin (#3314) (@bkamins)
- Update reshape.jl (#3319) (@alancummings)
- Allow to always pass column names in DataFrame constructor (#3320) (@bkamins)
- Allow CI failure on Julia nightly (#3321) (@bkamins)
- Use DataAPI.rownumber instead of DataFrames' rownumber (#3322) (@VEZY)
- copy more constructors from type doc to getting started (#3323) (@xgdgsc)
- [@ref] => (@ref) (#3325) (@likanzhan)
- SnoopPrecompile -> PrecompileTools (#3326) (@timholy)
- Update documentation of how to disable precompilation (#3329) (@bkamins)
- Stop using internal [inv]permute!! as sentinel (#3330) (@LilithHafner)
- optimize reverse! for small data frames and factor out foreachunique_column (#3332) (@LilithHafner)
- Add "Julia for Data Analysis" reference in manual (#3333) (@bkamins)
- Add test for issue #3340 which exposed upstream issues with the use of TimSort (#3341) (@LilithHafner)
- fix dispatch errors in tests on Julia 1.10 (#3342) (@bkamins)
- improve unstack error messages (#3344) (@bkamins)
- Do not crop columns with type Base.UUID (#3347) (@ronisbr)
- Correctly handle Tables.AbstractRow in operation specficiation (#3348) (@bkamins)
- improve error messages in joins (#3349) (@bkamins)
- Fix typo (#3350) (@ronisbr)
- Prepare for 1.6 release (#3352) (@bkamins)
- fix tests on 32-bit (#3353) (@bkamins)

Published by github-actions[bot] almost 3 years ago

DataFrames - v1.5.0

DataFrames v1.5.0

Diff since v1.4.4

Closed issues:
- New contents about handing missing values in DataFrame (#1662)
- Functions taking collections of column names always require them to be in AbstractVectors (#1769)
- Stack/Melt over multiple sets of variables (#1839)
- Allow unstack to take multiple columns to unstack on (#2148)
- Feature request: unstack multiple :values columns (#2215)
- Add all keyword argument to nonunique (#2238)
- special case percentage in combine (#2272)
- Add a pushfirst! method (#2275)
- add filter example to docs on taking subsets (#2318)
- Some code blocks missing syntax highlighting in docs (#2319)
- Stacking multiple groups of columns (#2414)
- Add more keyword arguments to stack and unstack (#2422)
- Add reverse and reverse! functions similar to sort and sort! (#2438)
- Allow keeping first or last observation with unique function (#2443)
- Add insert! (#2446)
- Improve inline documentation of select to include examples of multiple columns not to be included (#2513)
- Transposing DataFrame (#2743)
- add a keyword to allow specifying target row order in joins (#2753)
- Improve flatten (slightly breaking) (#2767)
- Add manual part for indexing and selection (#2887)
- a new method of the flatten function in DataFrames (#2890)
- Generalization of the value parameter in the unstack function (#3066)
- resolve circular reference issue when printing (#3148)
- Support allunique with column selectors? (#3205)
- Add support for Tables.AbstractRow to functions that take row (#3244)
- Stack Overflow during type inference with large dataframes (#3246)
- innerjoin fast path where join column is allequal? (#3247)
- Invalidations when loading CSV (#3248)
- Improve groupby sort (#3251)
- improve performance of dropmissing (#3254)
- Let DataFrame behave more like GroupedDataFrame with one zero-key group (#3257)
- Lifecycle annotations (#3259)
- String display quotation missing (#3261)
- Bool columns are printed as 0/1 in HTML, but not in plain (#3265)
- sum doesn't work with Missing column (#3267)
- Views of DataFrame design issue (#3272)
- Multi-threading hangs combine on Julia nightly (#3275)
- Check CompatHelper setup (#3278)
- Add get function for AbstractDataFrame (#3281)
- Rename Iterators.partition (#3284)

Merged pull requests:
- add Iterators.partition (#3212) (@bkamins)
- add an option to intersect arguments passed to Cols (#3224) (@bkamins)
- Add allunique and improve nonunique and describe (#3232) (@bkamins)
- Add an option in joins to specify row order (#3233) (@bkamins)
- Improve examples in the manual in basics.md (#3236) (@bkamins)
- Add hints to use macro packages for new users (#3238) (@bkamins)
- improve error message when used selector is incorrect (#3242) (@bkamins)
- add support for Tables.AbstractRow in push!, pushfirst!, and insert! (#3245) (@bkamins)
- fix deleteat! and subset! performance (#3249) (@bkamins)
- Fix typo in documentation (#3250) (@bkamins)
- Mention ReadStatTables.jl in documentation (#3252) (@junyuan-chen)
- Add sorting options to groupby (#3253) (@bkamins)
- Improve performance of dropmissing (#3256) (@svilupp)
- add keep to nonunique, unique, and unique! (#3260) (@bkamins)
- document breaking change policy (#3262) (@bkamins)
- improve error message in operation specification syntax (#3263) (@bkamins)
- Fix bug in subset[!] when handling no conditions case (#3264) (@bkamins)
- Fix error in fast aggregation of missing only columns for sum and mean (#3268) (@bkamins)
- add information about TableMetadaTools.jl to docs (#3269) (@bkamins)
- Update TagBot.yml (#3271) (@bkamins)
- correctly index into a SubDataFrame with no columns (#3273) (@bkamins)
- Reduce size of multi-threading enablement to 100_000 (#3274) (@bkamins)
- Improve allcombinations docstring + minor cleanups after #3256 (#3276) (@bkamins)
- Allow to pass multiple predicates in Cols and mix them with other selectors (#3279) (@bkamins)
- update CompatHelper.jl setup (#3280) (@bkamins)
- add haskey and get support for DataFrameColumns (#3282) (@bkamins)
- Add scalar keyword argument to flatten (#3283) (@bkamins)
- improve precompilation coverage (#3285) (@bkamins)

Published by github-actions[bot] over 3 years ago

DataFrames - v1.4.4

DataFrames v1.4.4

Diff since v1.4.3

Closed issues:
- Segmentation fault Julia 1.8.2, DataFrames v1.4.3 (#3227)
- sizeof() not working correctly with Dataframes (#3229)
- subset / subset! AbstractVector restriction inconvenient (#3230)

Merged pull requests:
- Explain column-independent operations (#3225) (@bkamins)
- Fix unstack docstring (#3226) (@bkamins)
- fix select bug with copycols=false on SubDataFrame (#3231) (@bkamins)
- fix markdown tests (#3234) (@bkamins)

Published by github-actions[bot] over 3 years ago

DataFrames - v1.4.3

DataFrames v1.4.3

Diff since v1.4.2

Closed issues:
- docs for groupindices has wrong example (#3210)
- (Possible) Bug with shuffle when shuffling DataFrame rows (#3211)
- Improve combine documentation (#3214)
- ERROR: AssertionError: length(res) > 0 (#3217)
- Column metadata anchored to wrong column after insertion of new colums (#3218)

Merged pull requests:
- Make sure we use MIME when calling repr in GroupedDataFrame printing (#3213) (@bkamins)
- add default style to metadata! and colmetadata! (#3216) (@bkamins)
- fix insertcols! bug (not shifting column metadata) (#3220) (@bkamins)
- fix HTML printing tests after PrettyTables.jl 2.2 release (#3221) (@bkamins)
- make aggregation of empty GroupedDataFrame correct with AsTable (#3222) (@bkamins)

Published by github-actions[bot] over 3 years ago

DataFrames - v1.4.2

DataFrames v1.4.2

Diff since v1.4.1

Closed issues:
- Make docstrings method specific (#2015)
- Additional functions supported for DataFrame.jl (#2088)
- OffsetArray Compatibility (#2123)
- Return data frame unaltered when Not only includes columns that are not in data frame (#2197)
- Kwarg to choose missing values for unstack (#2205)
- Allow DF() as a selector in select and combine (#2220)
- no method matching InvertedIndex(::String, ::String) (#2227)
- add view::Bool kwarg to first and last (#2845)
- Inconsistency in push!ing an empty row into a DataFrame (#2953)
- Flatten errors on empty dataframe (#3197)
- 10 seconds to show(df) of size (120764, 22) (#3202)
- Ignoring ENV["LINES"] in 1.4.x (#3203)
- JET.JL problem with v1.4.1 (#3204)
- Speed of filter (#3208)
- Allow end to select last column. (#3209)

Merged pull requests:
- Mention DataFrameMacros.jl in the docs (#3195) (@jkrumbiegel)
- make sure flatten works corretly on a data frame with zero rows (#3198) (@bkamins)
- improve manual entry of assignment to a data frame (#3201) (@bkamins)

Published by github-actions[bot] over 3 years ago

DataFrames - v1.4.1

DataFrames v1.4.1

Diff since v1.4.0

Closed issues:
- Filtering of eachrow(df) not working in 1.4.0 (#3191)

Merged pull requests:
- make sure getindex on DataFrameRows does not alias passed selector (#3192) (@bkamins)
- Add missing triple quotes around docstrings (#3194) (@bkamins)

Published by github-actions[bot] over 3 years ago

DataFrames - v1.4.0

DataFrames v1.4.0

Diff since v1.3.6

Closed issues:
- Metadata for columns and/or DataFrames (#35)
- What metadata should be (#2276)
- Add metadata (#2961)
- Add precompilation for PooledArray for all allowed ref types (#3013)
- update precompilation for 1.4 release (#3080)
- Require Julia 1.6 (#3136)
- Metadata: follow-up notes (#3168)
- Add references to names documentation (#3171)
- sync Tables.subset (#3180)
- change valuestransform in unstack (#3184)
- better handling of corner cases of GroupedDataFrame printing (#3186)
- Version incompatibility with PrettyTables.jl (#3188)

Merged pull requests:
- Metadata on data frame and column level (#3055) (@bkamins)
- Use PrettyTables.jl as HTML backend (#3096) (@ronisbr)
- Improved REPL printing for GroupedDataFrames (#3107) (@Jollywatt)
- 1-arg permutedims(df) (#3115) (@anandijain)
- Require Julia 1.6 (#3145) (@bkamins)
- synch NEWS.md between 1.4 and 1.3 branches (#3164) (@bkamins)
- add ShiftedArrays 2.x support (#3165) (@bkamins)
- improve error message when column is not found (#3166) (@bkamins)
- Improve metadata documentation (#3169) (@bkamins)
- Reduce memory use in threading correctness tests (#3172) (@yakir12)
- Fix typos in metadata docs (#3174) (@nalimilan)
- fix metadata handling in permutedims (#3176) (@bkamins)
- Add better error message on error when pushing rows to a data frame (#3177) (@bkamins)
- improve names docstring (#3178) (@bkamins)
- Avoid method dispatch ambiguities in DataFrames.jl (#3179) (@bkamins)
- switch from view to viewhint in Tables.subset (#3181) (@bkamins)
- precompilation for 1.4 release (#3182) (@bkamins)
- enable multithreading tests of joins only on 64 bit machines (#3183) (@bkamins)
- rename valuestransform to combine in unstack (#3185) (@bkamins)
- improve printing of GroupedDataFrame in corner cases (#3187) (@bkamins)
- Sync metadata implementation with DataAPI.jl 1.12.0 (#3189) (@bkamins)
- Fix deprecation warning when sorting data frame with no columns (#3190) (@bkamins)

Published by github-actions[bot] over 3 years ago

DataFrames - v1.3.6

DataFrames v1.3.6

- Fix type assertion in filterhelper (#3155) (@bkamins)

Published by github-actions[bot] almost 4 years ago

DataFrames - v1.3.5

DataFrames v1.3.5

Diff since v1.3.4

Add support for Compat.jl 4.x

Published by github-actions[bot] almost 4 years ago

DataFrames - v1.3.4

DataFrames v1.3.4

Diff since v1.3.3

Closed issues:
- stack not catching invalid value of keyword variable_eltype (#3042)

Merged pull requests:
- Fix handling of variable_eltype in stack (#3043) (@bkamins)

Published by github-actions[bot] about 4 years ago

DataFrames - v1.3.3

DataFrames v1.3.3

Closed issues:
- outerjoin: keyword augument matchmissing not correctly passed (#3039)

Merged pull requests:
- make sure we correctly pass matchmissing in joins (#3040) (@bkamins)

Published by github-actions[bot] about 4 years ago

DataFrames - v1.3.2

DataFrames v1.3.2

Diff since v1.3.1

Closed issues:
- Variance in runtime reduction functions (#2956)
- use of map in ByRow (#2957)
- Replace and Missing Values (#2976)
- Subset and Missing Values (#2977)
- copying of columns in select! and transform! (#2978)
- Unexpected Behavior of Combined Column Selection (#2980)

Merged pull requests:
- Add a note about df.col .= v broadcasting changes (#2971) (@bkamins)
- Update workingwithdataframes.md (#2973) (@alfaromartino)
- Clean up join code (#2975) (@bkamins)
- Add links to docs, rephrase a bit (#2979) (@nalimilan)
- fix aliasing detection in sort! (#2981) (@bkamins)
- make sure ByRow invokes generic map (#2982) (@bkamins)
- make sure we use source column only once (#2983) (@bkamins)
- Update subset to handle large number of selectors better (#2989) (@bkamins)

Published by github-actions[bot] over 4 years ago

DataFrames - v1.3.1

DataFrames v1.3.1

Diff since v1.3.0

Closed issues:
- Decide if we want to rename All to Cols (#2203)
- Creating new columns on a view should fill in missings everywhere else. (#2211)
- Consider allowing to sort! a SubDataFrame (#2300)
- Locate the problem in disallowmissing error (#2965)
- Arrow Notation within Column Selection is Inconsistent (#2969)

Merged pull requests:
- fix: change "dont" to "don't" (#2962) (@Mo-Gul)
- better disallowmissing error message (#2966) (@bkamins)
- fix issues with parameter type printing in doctests (#2967) (@bkamins)
- update docs for join on (left, right) tuple (#2968) (@visr)
- fix getindex with vector of Pairs (#2970) (@bkamins)

Published by github-actions[bot] over 4 years ago

DataFrames - v1.3.0

DataFrames v1.3.0

Diff since v1.2.2

Closed issues:
- Port pqr benchmarks (#298)
- Memory efficiency of join (#1334)
- Selections.jl + DataFrames.jl (#1936)
- Add support for All, Between and Not broadcating (#2171)
- filter(df, :x => f) would be useful to have (#2187)
- allow selector => fun1 => fun2 in select and combine (#2207)
- add a leftjoin! (or match! or merge! or whatever it should be called) (#2259)
- Provide a syntax to perform row aggregations fast (#2439)
- Investigate performance of aggregations (#2440)
- Rework the manual (#2595)
- Add after keyword argument to insertcols! (#2613)
- control fill value for missing cells in unstack (#2698)
- Allow selecting columns based on predicate on column contents (#2747)
- Fast row aggregation in DataFrames.jl (#2768)
- Add a method to add/insert empty columns (#2783)
- Assignment to SubDataFrame (#2785)
- DataFrameMacros.jl and DataFramesMeta.jl (#2793)
- DataFrames not threadsafe (#2795)
- Better documentation for combine(gd, fun => :x) (#2830)
- AsTable in combine seems to require at least one column (#2832)
- implement Tables.materializer(::Type{<:AbstractDataFrame})? (#2833)
- Should ByRow use map or not (#2834)
- Error for unstacking an empty dataframe (#2841)
- The test/show.jl tests fail when Julia is started with julia --color=no (#2846)
- Faster count (#2849)
- AsTable docstring doesn't mention it can be used as a target for select etc. (#2850)
- delete! in DataFrames.jl (#2853)
- Import nrow and ncol from DataAPI.jl (#2855)
- Support the Case of Matrix{Any} as Data and Vector{Any} as Header (#2858)
- Allow DataFrame(matrix, names, copycols=false) (#2860)
- Displaying DateTime columns (#2861)
- update docs to CSV.jl 0.9 (#2864)
- Better error messages when frame is empty (#2867)
- Add "Filtering" section to the documentation User Guide. (#2871)
- Add documentation for transformation functions without the Split-Apply-Combine strategy to User Guide. (#2872)
- Make Cols more flexible (#2875)
- In src => fun => dst allow transformation function in dst (#2876)
- Ambiguity error between CategoricalArrays and SentinelArrays (#2883)
- ByRow and transform not working (#2884)
- Avoid mixing standard and scientific floats in output (#2885)
- Updating ClassImbalance.jl; Needed help debugging (#2886)
- mixing :x => :y and :x => f => :y syntax in vector to select errors (#2888)
- Trimming variables in a data frame (#2891)
- renamecols function for transform (#2893)
- TableOperations.joinpartitions doesn't work properly (#2895)
- Correct isiterable(DataFrame) (#2896)
- Strange behaviour with non-ASCII column names (#2901)
- tf keyword argument from PrettyTables.jl does not work in DataFrames.jl show function. (#2903)
- Aggregate function with multiple output columns of different types (#2905)
- Recommend PooledArrays to pool data (#2908)
- update DataFramesMeta.jl docs (#2910)
- Add contributing opportunities to the contributing guide (#2912)
- Default show truncates too soon (#2913)
- DataFrames logo banner (#2917)
- Regenerate precompile statements for 1.3 release (#2921)
- subset doesn't accept a vector of transformations (#2924)
- Printing of data frames in try-catch (#2925)
- Modifying transformations with grouped dataframes (#2927)
- Improve filter docs (#2930)
- Improve sort docs (#2931)
- DataFrames errors on loading with --depwarn=error (#2935)
- Add AsTable([:a, :b]) => AsTable (#2939)
- Grouped describe fails or "clashes" with StatsBase (#2952)

Merged pull requests:
- Add standard deviation and 25% and 75% quantiles to describe :detailed (#2459) (@nalimilan)
- Support adding columns to views (#2794) (@bkamins)
- Add muli-threading support description to the manual (#2823) (@bkamins)
- feat: unstack receives kwarg fillvalue (#2828) (@pstorozenko)
- feat: insertcols! receives kwarg after (#2829) (@pstorozenko)
- explain that fun => target does not work in general (#2836) (@bkamins)
- more careful test of ByRow for PooledArray (#2837) (@bkamins)
- fix transformation minilanguage docs (#2838) (@bkamins)
- add Tables.materializer for types methods (#2839) (@bkamins)
- Fix typo math => match (#2840) (@Nosferican)
- Fix empty unstack on empty data frame (#2842) (@bkamins)
- Bk/add leftjoin! (#2843) (@bkamins)
- Fix tests broken by Julia Base changes (#2844) (@bkamins)
- Disable color testing when color is not supported (#2847) (@bkamins)
- Improve docstring of AsTable (#2851) (@bkamins)
- Fix three uses of "data table" (#2852) (@nalimilan)
- deprecate delete!, define deleteat! (#2854) (@bkamins)
- use nrow and ncol from DataAPI.j (#2856) (@bkamins)
- Fix signature of constructor in docstring (#2857) (@nalimilan)
- make DataFrame constructor more flexible (#2859) (@bkamins)
- fix transpose error message and clean up code (#2862) (@bkamins)
- Update to latest GA for docs (#2863) (@quinnj)
- update docs following CSV.jl 0.9 release (#2865) (@bkamins)
- code cleanup to improve error messages (#2868) (@bkamins)
- Add fast reductions (#2869) (@bkamins)
- fix: typo (#2873) (@kunzaatko)
- fix: do not copy syntax is with ! (#2874) (@kunzaatko)
- Allow constructing Matrix from empty dataframe (#2878) (@jakobnissen)
- Fix typo in NEWS.md (#2880) (@bkamins)
- Allow predicate in Cols (#2881) (@bkamins)
- Improve docstring for names() (#2882) (@xluo127)
- avoid not specialized Pair issue (#2889) (@bkamins)
- Specify why leftjoin! needs at most one match (#2894) (@rikhuijzer)
- allow transformation destination to be a function (#2897) (@bkamins)
- improve docs alignment (#2898) (@bkamins)
- improve missings documentation (#2899) (@bkamins)
- add filter and subset to documentation (#2900) (@bkamins)
- Try to detect unicode normalization issues in column names (#2904) (@bkamins)
- Faster computation of quantiles in describe (#2909) (@nalimilan)
- add info about PooledArrays (#2911) (@bkamins)
- Add more guidance for new contributors (#2914) (@bkamins)
- Update Querying frameworks DataFramesMeta.jl docs (#2915) (@pdeffebach)
- hardening haskey (#2916) (@bkamins)
- Add broadcasting of selectors to the minilanguage (#2918) (@bkamins)
- Add general fast aggregation for wide tables with collect (#2920) (@bkamins)
- Fix tests of names (#2922) (@bkamins)
- Update ci.yml in preparation of Julia 1.6 LTS (#2923) (@bkamins)
- Allow passing multiple columns to subset (#2926) (@bkamins)
- docs: fix typo and add some newlines in tutorial (#2932) (@rfourquet)
- mention ClipData.jl (#2933) (@Datseris)
- Correctly handle functors when auto-generating column names (#2934) (@bkamins)
- plan for a change in broadcasting rules in Julia 1.7 (#2937) (@bkamins)
- Change join tests to reduce memory consumption (#2938) (@bkamins)
- Improve Docstrings for sort and sort! (#2940) (@Chandu-4444)
- Add examples for issorted docstrings. (#2941) (@Chandu-4444)
- Add row indexing to filter docstring and examples. (#2942) (@nathanrboyer)
- Reduce test memory usage (#2943) (@bkamins)
- Add reverse prototype (#2944) (@Chandu-4444)
- Define sort! for AbstractDataFrame and fix issues of kwargs in sorting functions (#2946) (@bkamins)
- Make transformation docstring more precise (#2948) (@bkamins)
- Catch OutOfMemoryError (#2949) (@bkamins)
- clean up source code (#2950) (@bkamins)
- Add view kwarg to first and last (#2951) (@Chandu-4444)
- Generate precompile statements for Julia 1.7 (#2955) (@bkamins)

Published by github-actions[bot] over 4 years ago

DataFrames - v1.2.2

DataFrames v1.2.2

Diff since v1.2.1

Closed issues:
- Add method to filter on Bool column symbols (#2465)
- Enable documenter doctests (#2702)
- Extend => renaming syntax (#2728)
- add a keyword to specify group order in groupby (#2762)
- subset with grouped data frame has worse compile times than transform (#2806)
- Performance Issues with filter and subset (#2821)
- Extremely slow GroupBy behaviour on a small table (#2822)
- Is there any Julia alternatives to to_dict function in pandas? (#2824)

Merged pull requests:
- making a new top level section to work with DataFrames (#2717) (@RohitRathore1)
- make sort kwarg in groupby more flexible (#2812) (@bkamins)
- add a link to JuliaCon2021 tutorial to docs (#2817) (@bkamins)
- review of the DataFrames.jl tutorial (#2825) (@bkamins)
- correct signature of merge for AbstractIndex (#2826) (@bkamins)

Published by github-actions[bot] almost 5 years ago

DataFrames - v1.2.1

DataFrames v1.2.1

Diff since v1.2.0

Closed issues:
- "transform" function not available (#2815)
- How to change the value of a cell to a different data type? (#2816)
- dropmissing! creates weird memory bugs/errors on 1.7 and 1.6 (#2819)

Merged pull requests:
- Document GroupedDataFrame consistency check (#2811) (@bkamins)
- fix delete! for versions of Julia 1.6.2 or earlier (#2820) (@bkamins)

Published by github-actions[bot] almost 5 years ago

DataFrames - v1.2.0

DataFrames v1.2.0

Diff since v1.1.1

Closed issues:
- Add matchmissing = :notequal option (#2650)
- Implement pushfirst! to allow appending rows in the beginning of a DataFrame (#2678)
- Review comparisons with R/Python (#2737)
- Slow sorts in columns with Union{<:Any, missing} even if no missing values in the column (#2745)
- Display complex numbers - alignment (#2754)
- Slow row aggregation in presence of missings (#2757)
- Convert column from string to float (#2761)
- Improve SubDataFrame creation for AbstractVector{Bool} (#2765)
- Flatten in case column contains string and array (#2766)
- Question: Small Delimited file into DataFrame (#2772)
- transform(df, :x => AsTable) should probably work (#2779)
- missing method combine(gd::GroupedDataFrame, ::Matrix) (#2781)
- Sync with DataAPI.jl 1.7 release (#2788)
- inconsistency of groupby() for -0.0 (#2790)
- Clean up precompile statements (#2792)
- Test failures when using julia --color=no (#2796)
- Differently typed columns when using DataFrame(myVector) vs DataFrame(x = myVector) (#2798)
- DataFrame(table) != DataFrame(table, copycols=true) (#2799)
- html dataframe representation includes invalid placement of tag (#2800)
- subset!(gd::GroupedDataFrame, ...) should make sure gd still works after (#2808)

Merged pull requests:
- Matchmissing == :notequal (#2724) (@pstorozenko)
- Update comparisons with data.table info (#2725) (@eloualiche)
- Run findall(rows) only if rows are not all true (#2727) (@pstorozenko)
- Fix type instability in sort for few columns case and fix issorted bug (#2746) (@bkamins)
- Cover corner case of compactype (wide name and CategoricalValue) (#2751) (@bkamins)
- Update docs URLs in README (#2752) (@ViralBShah)
- reviewed and fixed (#2755) (@RohitRathore1)
- Alignment of complex numbers (#2756) (@ronisbr)
- audit more master -> main (#2758) (@Moelf)
- make "Edit on Github" points to main branch (#2759) (@Moelf)
- Mark outdated docs (#2760) (@pfitzseb)
- update NEWS.md (#2763) (@bkamins)
- add findall for AbstractVector{Bool} and use it in internal functions (#2769) (@bkamins)
- Explicit loop in `findall` to avoid allocations (#2771) (@pstorozenko)
- add information how DelimitedFiles can be used (#2773) (@bkamins)
- Put longer type into th title argument in HTML show (#2774) (@mortenpi)
- Deprecate AbstractVector in hcat (#2777) (@bkamins)
- remove escape in Char (#2778) (@bkamins)
- allow :col => AsTable and :col => cols (#2780) (@bkamins)
- allow Matrices in transformations of GroupedDataFrame (#2782) (@bkamins)
- Use latest Documenter.jl (#2786) (@bkamins)
- Fix float grouping (#2791) (@bkamins)
- Use standard Tables.Schema constructor instead of constructing directly (#2797) (@quinnj)
- move summary outside of a in text/html (#2801) (@bkamins)
- Add some clarifying comments on copycols for Tables.jl inputs (#2805) (@quinnj)
- up DataAPI.jl to 1.7 and CategoricalArrays.jl to 0.10.0 (#2807) (@bkamins)
- improve subset! for GroupedDataFrame (#2809) (@bkamins)
- update precompilation and .gitignore (#2810) (@bkamins)

- Julia
Published by github-actions[bot] almost 5 years ago

DataFrames - v1.1.1

DataFrames v1.1.1

Diff since v1.1.0

Closed issues: - DataFrames with many columns are too slow (because of show()) (#2739) - Unable to install DataFrames: error regarding ComposedFunction (#2748)

Merged pull requests: - Optimize completecases to process only missingable columns (#2726) (@pstorozenko) - fix performance issue in multirow split-apply-combine (#2749) (@bkamins) - use dict to cache eltype names (#2750) (@bkamins)

- Julia
Published by github-actions[bot] about 5 years ago

DataFrames - v1.1.0

DataFrames v1.1.0

Diff since v1.0.2

Merged pull requests: - require AbstractVector from subset selectors (#2744) (@bkamins)

- Julia
Published by github-actions[bot] about 5 years ago

DataFrames - v1.0.2

DataFrames v1.0.2

Diff since v1.0.1

Closed issues: - requesting new feature which covers stack, unstack and permutedims in a simpler way (at least conceptually) (#2732) - Significant regression of groupby when threading (#2735) - DataFrames with many columns are too slow (because of show()) (#2739) - .==(0) (#2741)

Merged pull requests: - Fix typo in depwarn (#2734) (@bkamins) - fix rowgroupslots_threading (#2736) (@bkamins) - Redirect https://dataframes.juliadata.org/ to https://dataframes.juliadata.org/stable/ (#2742) (@fredrikekre)

- Julia
Published by github-actions[bot] about 5 years ago

DataFrames - v1.0.1

DataFrames v1.0.1

Diff since v1.0.0

Closed issues: - describe fails on a categorical array column from RDatasets dataframe (#2672)

Merged pull requests: - fix PooledArray performance bottleneck (#2733) (@bkamins)

- Julia
Published by github-actions[bot] about 5 years ago

DataFrames - v1.0.0

DataFrames v1.0.0

Diff since v0.22.7

Merged PRs since 0.22.0 release

API enhancements:

  • error when using one dimension for indexing (#2553)
  • Implement firstindex, lastindex, axes, size, and ndims consistently (#2584)
  • Add subset (#2496)
  • add predicate support for names and more tests (#2417)
  • add vcat with source; deprecate indicator in joins in favor of source (#2649)
  • check for data frame corruption in delete! (#2690)
  • make Tables.columns(df) return eachcol(df) (#2680)
  • add ==, isequal, <, and isless for DataFrameRow and GroupKey (#2669)
  • add support for getproperty broadcasting (#2655)
  • clean up outstanding convert (#2675)
  • deprecate map on GroupedDataFrame (#2662)
  • Deprecate unnecessary convert methods (#2671)

Bug fixes:

  • Fix size of float columns without eltypes (#2542)
  • fix display problem with Turing.jl (#2583)
  • Fix groupreduce with var and std for Unitful types (#2601)
  • add firstindex check and fix a bug in mapcols! (#2594)
  • fix repeat and some other minor issues (#2648)
  • make renaming perform copy in transform and transform! (#2721)
  • check if column order is correct in aggregations (#2682)

Performance and compilation latency:

  • Improve inference in vcat (#2559)
  • Avoid performance bottleneck corner case in grouping (#2592)
  • Enable multithreading with several operations in combine/select/transform (#2574)
  • Use refpool optimized method for integer grouping (#2610)
  • Multithreaded custom grouped operations with single-row result (#2588)
  • implement faster innerjoin (#2612)
  • left, right and outerjoin rewrite (#2622)
  • Add faster semi and antijoin (#2641)
  • use multithreading in basic operations (#2647)
  • additional precompilation for all=true (#2719)
  • update precompilation statements (#2718)
  • Despecialization part 2 (#2709)
  • Replace tforeach with @spawn_for_chunks (#2716)
  • Use fast path for single column aggs (#2687)
  • Enable multithreading in hashrows_col! (#2685)
  • do not use collect in describe (#2694)
  • Fix grouping aggregation slowdown (#2708)
  • Inference improvements (#2691)
  • Support threading in joined column creation (#2664)
  • Use multithreading in rowgroupslots refarray method (#2661)

Documentation and maintenance:

  • Zero after decimal (#2548)
  • Optional args style (#2547)
  • Spaces after commas (#2546)
  • fix two small typos in docs (#2551)
  • Switch from travis to GitHub Actions for CI testing (#2552)
  • Add link to CI status badge (#2555)
  • add links to curated DataFrames.jl materials (#2557)
  • additional tests to improve coverage (#2560)
  • remove deprecations and CategoricalArrays.jl dependency (#2554)
  • Update DataFramesMeta.jl manual entry (#2561)
  • Make GroupedDataFrame show tests more robust to context (#2571)
  • Fix Markdown in test (#2572)
  • Add Apache Arrow reference in the manual (#2575)
  • Update ci.yml from master to main (#2585)
  • Add tests for column selection and renaming for GroupedDataFrame (#2586)
  • bump compat for "Reexport" to "1.0" (#2587)
  • Mention number of groups in performance note (#2578)
  • Update deploydocs (#2599)
  • Fix DataFrame(Any[]) test (#2609)
  • Use PrettyTables.jl's alignment anchor for numbers (#2608)
  • Support new PooledArrays 1.0 release (#2614)
  • Fix missing mention of transform in manual (#2615)
  • fix docstring (#2617)
  • Fix CategoricalArrays.jl in Project.toml (#2646)
  • fix split-apply-combine tests (#2644)
  • fix broadcasting tests (#2645)
  • Respect copycols when building DataFrame from Tables.CopiedColumns (#2656)
  • update describe, unique and nonunique docs (#2652)
  • Normalize references to LINQ in docs (#2657)
  • improve new contributors guide (#2723)
  • Bk/clean exports (#2720)
  • Improve structure of join benchmarks (#2707)
  • Update PrettyTables to v0.12 (#2715)
  • Fix default copycols for Tables.jl constructor (#2710)
  • Update composer.jl (#2711)
  • Enable doctests (#2703)
  • promoted documentation on working with DataFrames (#2700)
  • Fix typo in unstack docstring (#2701)
  • 0.22.7 release notes (#2695)
  • remove test memory consumption (#2686)
  • remove deprecations for 1.0 release (#2679)
  • Update CONTRIBUTING.md (#2684)
  • Omit vertical lines with showrownumbers = false (#2674)
  • Fix small typo in docs (#2670)
  • Prepare for 1.0 release (#2729)

Merged PR contributors:

@cjalmeida, @EarthGoddessDude, @eirikbrandsaas, @kescobo, @nalimilan, @pdeffebach, @quinnj, @RohitRathore1, @ronisbr, @sa-, @timholy, @waldyrious

- Julia
Published by github-actions[bot] about 5 years ago

DataFrames - v0.22.7

DataFrames v0.22.7

Diff since v0.22.6

  • Respect copycols when building DataFrame from Tables.CopiedColumns (#2656)

- Julia
Published by github-actions[bot] about 5 years ago

DataFrames - v0.22.6

DataFrames v0.22.6

Diff since v0.22.5

Merged pull requests:

  • deprecate map on GroupedDataFrame (#2662) (@bkamins)
  • Deprecate unnecessary convert methods (#2671) (@bkamins)

- Julia
Published by github-actions[bot] over 5 years ago

DataFrames - v0.22.5

DataFrames v0.22.5

Diff since v0.22.4

Merged pull requests: - Support new PooledArrays 1.0 release (#2614) (@quinnj)

- Julia
Published by github-actions[bot] over 5 years ago

DataFrames - v0.22.4

DataFrames v0.22.4

Diff since v0.22.3

Add support of PrettyTables 0.11

- Julia
Published by github-actions[bot] over 5 years ago

DataFrames - v0.22.3

DataFrames v0.22.3

Diff since v0.22.2

Merged pull requests:

  • Fix groupreduce with var and std for Unitful types (#2601) (@nalimilan)

- Julia
Published by github-actions[bot] over 5 years ago

DataFrames - v0.22.2

DataFrames v0.22.2

Diff since v0.22.1

Closed issues: - Error when showing dataframe in terminal (#2582)

Merged pull requests: - fix display problem with Turing.jl (#2583) (@bkamins)

- Julia
Published by github-actions[bot] over 5 years ago

DataFrames - v0.22.1

DataFrames v0.22.1

Diff since v0.22.0

Closed issues: - eltype width taken into account in display even if it is not shown (#2540) - Final ellipsis appears on next row (#2544) - clarify the interface for crossjoin when makeunique=true (#2545) - Two small typos in docs (#2550)

Merged pull requests: - Fix size of float columns without eltypes (#2542) (@ronisbr) - Spaces after commas (#2546) (@kescobo) - Optional args style (#2547) (@kescobo) - Zero after decimal (#2548) (@kescobo) - issue #2550 fix two small typos in docs (#2551) (@roualdes) - Switch from travis to GitHub Actions for CI testing (#2552) (@quinnj) - error when using one dimension for indexing (#2553) (@bkamins) - Add link to CI status badge (#2555) (@nalimilan)

- Julia
Published by github-actions[bot] over 5 years ago

DataFrames - v0.22.0

DataFrames v0.22.0

Diff since v0.21.8

DataFrames v0.22 Release Notes

Breaking changes

  • the rules for transformations passed to select/select!, transform/transform!, and combine have been made more flexible; in particular now it is allowed to return multiple columns from a transformation function (#2461 and #2481)
  • CategoricalArrays.jl is no longer reexported: call using CategoricalArrays to use it (#2404). In the same vein, the categorical and categorical! functions have been deprecated in favor of transform(df, cols .=> categorical .=> cols) and similar syntaxes (#2394). stack now creates a PooledVector{String} variable column rather than a CategoricalVector{String} column by default; pass variable_eltype=CategoricalValue{String} to get the previous behavior (#2391)
  • isless for DataFrameRows now checks column names (#2292)
  • DataFrameColumns is now not a subtype of AbstractVector (#2291)
  • nunique is not reported now by describe by default (#2339)
  • stop reordering columns of the parent in transform and transform!; always generate columns that were specified to be computed even for GroupedDataFrame with zero rows (#2324)
  • improve the rule for automatically generated column names in combine/select(!)/transform(!) with composed functions (#2274)
  • :nmissing in describe now produces 0 if the column does not allow missing values; earlier nothing was produced in this case (#2360)
  • fast aggregation functions for GroupedDataFrame now correctly choose the fast path only when it is safe; this resolves inconsistencies with what the same functions produce when not using the fast path (#2357)
  • joins now return PooledVector not CategoricalVector in indicator column (#2505)
  • GroupKeys now supports in for GroupKey, Tuple, NamedTuple and dictionaries (#2392)
  • in describe the specification of custom aggregation is now function => name; old name => function order is now deprecated (#2401)
  • in joins passing NaN or real or imaginary -0.0 in on column now throws an error; passing missing throws an error unless matchmissing=:equal keyword argument is passed (#2504)
  • unstack now produces row and column keys in the order of their first appearance and has two new keyword arguments allowmissing and allowduplicates (#2494)
  • PrettyTables.jl is now the default back-end to print DataFrames to text/plain; the print option splitcols was removed and the output format was changed (#2429)
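
A few of the breaking changes above can be sketched as follows. This is a minimal illustration, assuming DataFrames.jl 0.22 or later; the data frames and the column names (g, x, id, lo, hi, total) are made up for the example and do not come from the release notes.

```julia
using DataFrames

df = DataFrame(g = ["a", "a", "b"], x = [1.0, 2.0, 5.0])

# A transformation may return several columns at once: the NamedTuple
# returned per group is spread into columns lo and hi via the AsTable target.
ranges = combine(groupby(df, :g),
                 :x => (x -> (lo = minimum(x), hi = maximum(x))) => AsTable)

# Custom statistics in describe are written as function => name
# (the older name => function order is the deprecated one).
stats = describe(df, :nmissing, sum => :total; cols = :x)

# Joining on a key column that contains missing requires an explicit opt-in.
left   = DataFrame(id = [1, missing], l = ["p", "q"])
right  = DataFrame(id = [1, missing], r = ["u", "v"])
joined = innerjoin(left, right; on = :id, matchmissing = :equal)
```

Without matchmissing = :equal, the same innerjoin call is expected to throw, as described in the join entry above.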

New functionalities

  • add filter to GroupedDataFrame (#2279)
  • add empty and empty! function for DataFrame that remove all rows from it, but keep columns (#2262)
  • make indicator keyword argument in joins allow passing a string (#2284, #2296)
  • add new functions to GroupKey API to make it more consistent with DataFrameRow (#2308)
  • allow column renaming in joins (#2313 and #2398)
  • add rownumber to DataFrameRow (#2356)
  • allow passing column name to specify the position where a new column should be inserted in insertcols! (#2365)
  • allow GroupedDataFrames to be indexed using a dictionary, which can use Symbol or string keys and are not dependent on the order of keys. (#2281)
  • add isapprox method to check for approximate equality between two dataframes (#2373)
  • add columnindex for DataFrameRow (#2380)
  • names now accepts Type as a column selector (#2400)
  • select, select!, transform, transform! and combine now allow renamecols keyword argument that makes it possible to avoid adding transformation function name as a suffix in automatically generated column names (#2397)
  • filter, sort, dropmissing, and unique now support a view keyword argument which if set to true makes them return a SubDataFrame view into the passed data frame.
  • add only method for AbstractDataFrame (#2449)
  • passing empty sets of columns in filter/filter! and in select/transform/combine with ByRow is now accepted (#2476)
  • add permutedims method for AbstractDataFrame (#2447)
  • add support for Cols from DataAPI.jl (#2495)
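
Some of the new functionalities listed above, shown as a short sketch. It assumes DataFrames.jl 0.22 or later; the data frame and its columns a and b are hypothetical and chosen only for illustration.

```julia
using DataFrames

df = DataFrame(a = [3, 1, 2], b = ["x", "y", "z"])

# view = true returns a SubDataFrame into df instead of a copy.
v = filter(:a => >(1), df; view = true)

# renamecols = false keeps the source column name (a) instead of
# appending the function name as a suffix (a_sum).
s = select(df, :a => sum; renamecols = false)

# names accepts a type and returns the names of columns whose element type matches.
int_cols = names(df, Int)

# only returns the single row of a one-row data frame as a DataFrameRow.
row = only(filter(:a => ==(1), df))
```

Because v is a view, it shares memory with df; use the default view = false when an independent copy is needed.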

Deprecated

  • DataFrame! is now deprecated (#2338)
  • several non-standard DataFrame constructors are now deprecated (#2464)
  • all old deprecations now throw an error (#2350)

Dependency changes

  • Tables.jl version 1.2 is now required.
  • DataAPI.jl version 1.4 is now required. It implies that All(args...) is deprecated and Cols(args...) is recommended instead. All() is still supported.

Other relevant changes

  • Documentation is now available also in Dark mode (#2315)
  • add rich display support for Markdown cell entries in HTML and LaTeX (#2346)
  • limit the maximal display width the output can use in text/plain before being truncated (in the textwidth sense, excluding ) to 32 per column by default and fix a corner case when no columns are printed in situations when they are too wide (#2403)
  • Common methods are now precompiled to improve responsiveness the first time a method is called in a Julia session. Precompilation takes up to 30 seconds after installing the package (#2456).

Closed issues: - Allow to hide row numbers (#592) - Stop printing row numbers in show(io, df)? (#864) - Show a (kind of) transposed DataFrame (#2065) - Improve text/plain show for AbstractDataFrame (#2146) - Showing of very wide data frames (#2302) - Add PrettyTables.jl as an alternative backend for display in DataFrames.jl (#2337) - add transpose(df, srcnamescol, dstnamescol) (#2420) - Deprecate DataFrame(::AbstractMatrix) (#2433) - Always use ? for Union{T, Missing} (#2480) - Stop supporting broadcasting + against whole DataFrames (#2483) - clean-up unstack (#2485) - Join on index with compatible Unitful types (#2486) - ERROR: UndefVarError: ByRow not defined (#2493) - Explicitly handling missingness in join columns (#2499) - sort with by accepts tuples still (#2500) - innerjoin not working if one df is a SubDataFrame or item of GroupedDataFrame (#2502) - remaining dependencies on CategoricalArrays (#2506) - Immutable DataFrames (#2507) - general principles of data manipulation for discussion (#2509) - create maprow to be complementary with mapcol (#2510) - insertcols!(df, values => :name ) (#2512) - [Feature request] Support for converting single-column dataframes to Vectors (#2526) - Sync tests with Tables 1.2 (#2529) - select does not have method to handle Pair? (#2531) - Warning: getindex(df::DataFrame, col_ind::ColumnIndex) is deprecated (#2532) - ERROR: The following package names could not be resolved: (#2534)

Merged pull requests: - remove dependency on CategoricalArrays.jl in legacy show (#2427) (@bkamins) - [BREAKING] Add PrettyTables.jl backend for printing DataFrames (#2429) (@ronisbr) - Implement permutedims (#2447) (@kescobo) - Enable precompilation (#2456) (@nalimilan) - [BREAKING] deprecate DataFrame constructors (#2464) (@bkamins) - [BREAKING] Multicolumn transformations for GroupedDataFrame (#2481) (@bkamins) - [BREAKING] Refactor unstack (#2494) (@bkamins) - add Cols support (#2495) (@bkamins) - avoid allocation when negating BitArray (#2497) (@OkonSamuel) - make sure by isa Function or a vector of functions (#2501) (@bkamins) - Remove type parameters in DataFrameJoiner (#2503) (@bkamins) - [BREAKING] add matchmissing kwarg to joins (#2504) (@bkamins) - [BREAKING] remove CategoricalArrays dependency from joins (#2505) (@bkamins) - fix deprecated tests in reshape (#2511) (@bkamins) - require DataAPI.jl version 1.4 (#2514) (@bkamins) - move All(args...) tests to deprecated.jl (#2515) (@bkamins) - make hashrows_col! not depend on CategoricalArrays.jl (#2518) (@bkamins) - avoid CategoricalArrays dependency in aggregates (#2519) (@bkamins) - Switch from Coveralls to Codecov (#2520) (@nalimilan) - Allow CategoricalArrays 0.9 (#2521) (@nalimilan) - update manual and docstrings to PrettyTables.jl (#2522) (@bkamins) - Update Categorical test (#2523) (@bkamins) - fix coverage badge (#2524) (@pdeffebach) - Update TagBot.yml (#2527) (@quinnj) - Update tests to Tables.jl v1.2 (#2530) (@bkamins) - Add StatsKit to the ecosystem section (#2535) (@nalimilan) - code layout improvements (#2536) (@bkamins) - Improve floating point alignment (#2537) (@ronisbr) - update deprecated tests (#2538) (@bkamins)

- Julia
Published by github-actions[bot] over 5 years ago

DataFrames - v0.21.8

DataFrames v0.21.8

Fix a bug in select/select!/transform/transform! in case when a GroupedDataFrame containing reordered groups is processed.

- Julia
Published by github-actions[bot] over 5 years ago

DataFrames - v0.21.7

DataFrames v0.21.7

Merged pull requests: - Update README.md with JuliaAcademy reference (#2294) (@logankilpatrick) - Don't match on AbstractVector type parameter due to compiler crash (#2383) (@quinnj)

- Julia
Published by github-actions[bot] almost 6 years ago

DataFrames - v0.21.6

DataFrames v0.21.6

Diff since v0.21.5

Merged pull requests: - make colnames type stable in broadcasting (#2331) (@bkamins) - Don't specialize groupreduce! on result array element type (#2335) (@quinnj) - Add check if input is a table before trying alternative constructors (#2341) (@quinnj) - Update outdated documentation (#2343) (@bkamins) - Change Travis from Julia 1.4 to 1.5 (#2345) (@bkamins) - Update dataframerow.jl (#2353) (@tbeason)

- Julia
Published by github-actions[bot] almost 6 years ago

DataFrames - v0.21.5

DataFrames v0.21.5

Diff since v0.21.4

Closed issues: - How to save dataframe to CSV file (#2312) - Single row manipulation leads to wrong results (#2332)

Merged pull requests: - Up Documenter.jl (#2315) (@bkamins) - fix deprecated example in the docs for innerjoin (#2320) (@bkamins) - fix DataFrameRow setindex! (#2333) (@bkamins)

- Julia
Published by github-actions[bot] almost 6 years ago

DataFrames - v0.21.4

DataFrames v0.21.4

Diff since v0.21.3

Closed issues: - findfirst/findlast/nextind/prevind not working for eachcol in v0.21.0 (#2229) - What happened to by()? (#2306) - Error in showing DataFrame with Distribution column (#2310)

Merged pull requests: - move lock inside if clause (#2303) (@bkamins) - update DataFrameRow documentation (#2307) (@bkamins) - fix show of UnionAll (#2311) (@bkamins)

- Julia
Published by github-actions[bot] almost 6 years ago

DataFrames - v0.21.3

DataFrames v0.21.3

Diff since v0.21.2

Closed issues: - When join(..., validate=(true,true)) fails, it should include a list of non-unique joinkeyrows in the error) (#1732) - Unify error messages for setting index of subdataframe (#2277) - indicator column in joins should allow Strings (#2283) - no method matching iterate(::InvertedIndex{BitArray{1}}), trying to any() a BitArray (#2285) - Fatal error: ERROR: UndefVarError: identifier not defined (#2286) - show all columns at HTML in Jupyter notebook (#2293) - unable to touch doc website (#2295) - select/transform: oldcolumn => fun => newcolumn_name syntax (#2301)

Merged pull requests: - Add .editorconfig and comment on GitHub Actions (#2163) (@jonas-schulze) - Fix various bugs in split/apply/combine in 0.21 release (#2280) (@bkamins) - improve error message when validating (#2282) (@bkamins) - let indicator allow strings (#2284) (@pdeffebach) - add eltypes to show docstring (#2288) (@bkamins) - add handling to cases when all is not optimized out (#2290) (@bkamins) - improve handling of indicator in joins (#2296) (@bkamins) - fixes to follow breaking changes in printing in 1.6 (#2299) (@bkamins)

- Julia
Published by github-actions[bot] about 6 years ago

DataFrames - v0.21.2

DataFrames v0.21.2

Diff since v0.21.1

Closed issues: - diag(::DataFrame)? (#2268)

Merged pull requests: - Fix position of backticks in disallowmissing docstring (#2270) (@nalimilan) - fix bugs in append! and push! corner cases (#2273) (@bkamins)

- Julia
Published by github-actions[bot] about 6 years ago

DataFrames - v0.21.1

DataFrames v0.21.1

Diff since v0.21.0

Closed issues: - Standardizing working with multiple columns (#2016) - In docs, note subsets are copies (unless of columns)? (#2224) - first/last/etc. documentation problem (#2232) - Make DataFrame's BoundsError message more informative and similar to that of Base.Matrix (#2234) - Problems in groupreduceinit (#2241) - Tables.columns should return a DataFrameColumns object (#2244) - map on DataFrameColumns should return DataFrameColumns (#2245) - rename(uppercase, df) doesn't work anymore (#2252) - update docs at pkg.julialang.org (#2255) - allow [:a,:b,:c,:d] => fun => newcolumn_name syntax (#2256) - ENV["COLUMNS"] not working as expected in Jupyter Lab (#2266)

Merged pull requests: - add note on views / copies in getting started -- #2224 (#2226) (@nickeubank) - Fix several references in documentation (#2231) (@dmolina) - Fix docstring references in the manual (#2233) (@mortenpi) - better deprecations of by and map (#2249) (@bkamins) - fix bug in reduceinit (#2263) (@bkamins)

- Julia
Published by github-actions[bot] about 6 years ago

DataFrames - v0.21.0

DataFrames v0.21.0

Diff since v0.20.2

Closed issues: - Output format of reshape functions (#645) - Grouping API consistency and improvements (#1256) - by with arrays of inputs and broadcasting (#1615) - Group Indices function (#1704) - push! which promotes type (#1716) - API for groupwise column transformation (#1727) - aggregate function: Add option to NOT re-name column (#1756) - Two regex-related items on a wish list (#1849) - How about DimensionMismatch rather than ArgumentError? (#1879) - Handling of strings for column indexing (#1926) - How to perform by on two variables? Should we auto-splat? (#1935) - Row-wise vs. whole vector functions (#1952) - API for aggregate (#1953) - Unify push!, append! and vcat implementation. (#2032) - Add an easy way to get a number of rows in by (#2035) - Column naming in combine (#2071) - Implicit broadcasting rules (#2086) - Add "begin" tests for Julia 1.4 (#2089) - redesign of eachcol (#2090) - Reconsider overloading Base.join? (#2092) - Using the public API should be safe (#2094) - Speed up key lookup in GroupedDataFrame (#2095) - Add Tables.namedtupleiterator implementation (#2100) - Decide if we want to copy levels of CategoricalValue if we do Tables.allocatecolumn (#2104) - Add transform function (#2110) - Should we export Tables (#2114) - Optionally remove type from heading (#2116) - Disallow passing zero columns to aggregate functions in combine/by (#2118) - DataFrame constructor from Dict (#2119) - Add a wrapper type for passing named tuples to functions when transforming (#2121) - Constructor behavior on nested array vs array of tuple (#2124) - Unexpected BoundsError message (#2125) - map(DataFrame, groups) does not return a collection of DataFrames (#2126) - Bad deprecation warning for df.C = "c" when C is a new column (#2129) - Data Manipulation - Map categorical value (#2130) - ERROR: BoundsError: attempt to access String (#2134) - Circular reference in DataFrame bug (#2135) - Groupby + count append 0 if not exists (#2136) - Fix bounds in registry (#2137) - Pandas like MultiIndex function (#2138) - ⍰ character in header output (#2139) - review consistency of @view semantics (#2143) - Precompilation error with latest release on julia v1.4-rc2 (#2149) - add missing columns when push! ing? (#2150) - Performance of allunique (#2153) - Redesign of combine (#2156) - Emulating Stata's rowtotal (#2161) - Automatically fill in select and combine for scalars? (#2162) - [BREAKING] Making combine more flexible (#2166) - Sync API of append! with push! (#2173) - Add a method to insert a column to last index (#2175) - sort and sort! API (#2178) - push! with cols=:subset is not allowed if there is no missing in previous data (#2179) - unrelated error message when trying to access 0 index (#2182) - Base.depwarn stopped printing warnings on Julia 1.5 (#2184) - New name for names and rename! (#2185) - Shall mapcols be deprecated? (#2186) - Why does categorical!(df::DataFrame, ...) exist? (#2192) - Extend categoric values by category data (#2198) - Tag a new release to reflect [compat] CategoricalArrays = "0.8" update (#2204) - Cleaner syntax (#2206) - by does not generate correct results (#2208) - possible test failure in upcoming Julia version 1.5 (#2221)

Merged pull requests: - List missing functions in AbstractDataFrame API (#2074) (@nalimilan) - Add transformation and renaming to select and select! (#2080) (@bkamins) - add Pair to filter and filter! (#2091) (@bkamins) - Special case of Ref and 0-dimensional AbstractArray in DataFrame and insertcols! (#2093) (@bkamins) - improve performance of Tables.rowtable (#2098) (@bkamins) - add inner/left/right/outer/semi/anti/cross-join (#2101) (@bkamins) - Consistent period printing on Julia 1.5-DEV (#2107) (@omus) - Add replacement examples to gettingstarted.md (#2108) (@anandijain) - add Tables.jl integration for eachrow and eachcol (#2113) (@bkamins) - Show friendly error when trying to iterate DataFrames (#2115) (@non-Jedi) - improve speed of Tables.namedtupleiterator (#2117) (@bkamins) - [BREAKING] deprecate names=true in eachcol (#2120) (@bkamins) - export Tables and import DataAPI and Tables internally (#2122) (@bkamins) - [BREAKING] Change ArgumentError to DimensionMismatch (#2128) (@bkamins) - Flatten multiple columns at once (#2131) (@pearlzli) - use Dict{Any,Int} to speed up GroupedDataFrame lookup (#2132) (@bkamins) - change ⍰ to ? when showing a DataFrame with missing eltype (#2140) (@bkamins) - add haskey to GroupedDataFrame and GroupKey (#2141) (@bkamins) - add displaycoltypes to showrows & show for dataframes (#2142) (@ssikdar1) - Make the Jupyter Notebook documentation more precise (#2144) (@bkamins) - [BREAKING] fix eltype in stack with view=true (#2145) (@bkamins) - [BREAKING] make idvars go first in stack (#2147) (@bkamins) - finalize adding eltypes to show (#2151) (@bkamins) - allow :union as cols kwarg in push! and append! (#2152) (@bkamins) - Fix typo (#2154) (@prosoitos) - add transform and transform! and select cleanup (#2155) (@bkamins) - Fix docstring (#2157) (@jonas-schulze) - [BREAKING] sync combine with select (#2158) (@bkamins) - update to Julia 1.4 (#2160) (@bkamins) - [BREAKING] allow pseudo-broadcasting in select (#2165) (@bkamins) - [BREAKING] add groupcols and valuecols functions (#2167) (@bkamins) - add nrow to select (#2168) (@bkamins) - [BREAKING] add to combine and by: column selection, pseudo broadcasting, fix bug with unequal column lengths (#2170) (@bkamins) - [BREAKING] Deprecate aggregate (#2174) (@bkamins) - fix deprecation in append (#2176) (@bkamins) - add cols to names (#2177) (@bkamins) - [BREAKING] Deprecate passing tuple of columns to sort (#2181) (@bkamins) - add AsTable wrapper, disallow NamedTuple in ByRow (#2183) (@bkamins) - Port to CategoricalArrays 0.8 (#2188) (@nalimilan) - [BREAKING] Rename deleterows! to delete! (#2189) (@nalimilan) - clean-up reshaping code (#2190) (@bkamins) - add Not to delete! (#2191) (@bkamins) - add mapcols! and repeat!, fix corner cases of repeat (#2195) (@bkamins) - dataframerow related docstrings (#2196) (@pdeffebach) - [BREAKING] Add column indexing using strings (#2199) (@bkamins) - [BREAKING] Circular ref bug show (#2200) (@bkamins) - Fix some typos (#2201) (@jonas-schulze) - Fix building documentation on Travis (#2202) (@nalimilan) - Fix exponent in docs (#2209) (@nalimilan) - Make ByRow subtype Function (#2212) (@oxinabox) - cleanup after string PR (#2213) (@bkamins) - [BREAKING] new design of select, transform and combine (#2214) (@bkamins) - Fix exponent in docs (#2216) (@nalimilan) - errant space confused juliadoc builder (#2222) (@nickeubank) - updates required for Julia 1.5 (#2223) (@bkamins) - up version to 0.21.0 (#2225) (@bkamins)

- Julia
Published by github-actions[bot] about 6 years ago

DataFrames - v0.20.2

v0.20.2 (2020-02-13)

Diff since v0.20.1

Closed issues:

  • pkg> add DataFrames Tables installs DataFrames v0.19.4 and Tables v1.0.0 (#2112)
  • Sync with Tables.jl 1.0 release (#2109)

- Julia
Published by julia-tagbot[bot] over 6 years ago

DataFrames - v0.20.1

Add compatibility with Tables.jl v1.0.

- Julia
Published by julia-tagbot[bot] over 6 years ago

DataFrames - v0.20.0

v0.20.0 (2019-12-07)

Diff since v0.19.4

Closed issues:

  • Make describe not accept io (#2024)
  • Switch cols to kwarg from positional args (#2023)
  • Problems sorting dataFrames imported from CSV (#2019)
  • Allow rename!(df, pair::Pair{String, String}) as a signature (#2017)
  • Add an argument allowing to select columns to calculate statistics on for describe (#2014)
  • Add flatten function (#2012)
  • describe should also apply to Vector (#2010)
  • Add :equal support in append! (#2007)
  • CSV.read cannot detect "Time" type string (#2005)
  • When vcat dataframes, ordering of categorical variables is lost (#2002)
  • Allow mix of Symbol and Pair in join (#2001)
  • Documenting the difference between df[!, :col] and df[:, :col] (#1999)
  • select!(df, Not(tuple)) does not work (#1997)
  • using DataFrames in Jupyter lab and notebook hangs... (#1996)
  • [package code fancyness] Redundant code snippet (#1993)
  • Merge meanings of cols keyword arguments between push! and vcat (#1991)
  • master still on v0.19.3, though release branch already on v0.19.4 (#1990)
  • Bad performance of "by" function for random queries (#1988)
  • Warning T is deprecated, use nonmissingtype instead (#1987)
  • ERROR: ArgumentError: 'Array{UInt8,1}' iterates 'UInt8' values, which don't satisfy the Tables.jl Row-iterator interface (#1983)
  • vcat! or push!(..., columns=:union) (#1982)
  • Drop AppVeyor in favour of TravisCI (#1980)
  • 32-bit BoundsError (#1978)
  • allow by to receive keyword argument for custom output column name. (#1976)
  • dropmissing! fails on PooledStrings (#1973)
  • Fix tests to pass on Julia nightly (#1967)
  • can't join more than two dataframes? (#1962)
  • Issues using the df[!, col] syntax during broadcasts (#1959)
  • API for functions that help reduce memory usage (#1954)
  • NamedTuple backing or switchable? (#1949)
  • Sorting error using examples from docs (#1945)
  • merge names! into rename! (#1943)
  • Allow partial re-ordering for permutecols! (#1942)
  • sort! performance (#1927)
  • Add kwarg to disallowmissing that skips conversion of columns with missing values (#1922)
  • Sync the behavior of push!, vcat and append! in DataFrames.jl with Base (#1904)
  • How about raising ArgumentError rather than just calling error() in append!()? (#1869)
  • How about raise ArgumentError rather than just calling error() in insertcols!()? (#1867)
  • Improve select and select! performance with Not (#1861)
  • Make getproperty(df, col) return a full length view of the column (#1844)
  • Allow empty keys argument in by() (#1837)
  • Find a better API for stackdf and meltdf (#1736)
  • DataFrames.jl roadmap (#1678)
  • setindex!/broadcast! design (#1645)
  • Update docstrings to new conventions (#1093)

Merged pull requests:

  • Add upper bounds to versions in Project.toml (#2037) (nalimilan)
  • Enable CompatHelper (#2036) (nalimilan)
  • remove unnecessary line in DataFrame constructor (#2033) (bkamins)
  • deprecate stackdf and meltdf (#2031) (bkamins)
  • allow integer as from value in rename (#2030) (bkamins)
  • Release 0.20.0 (#2028) (bkamins)
  • deprecate io support in describe (#2027) (bkamins)
  • change Julia from 1.2 to 1.3 on Travis CI (#2026) (bkamins)
  • add Array convert for data frame and data frame row (#2025) (bkamins)
  • make SubIndex strict about duplicates (#2022) (bkamins)
  • Key in warning doesn't reflect the right entry (#2021) (laborg)
  • add cols to describe (#2020) (bkamins)
  • allow strings in rename! (#2018) (bkamins)
  • Add flatten function (#2013) (pdeffebach)
  • update documentation of select (#2011) (bkamins)
  • Update docstrings and removed unnecessary internal comments (#2009) (bkamins)
  • add names documentation (#2008) (bkamins)
  • allow mixing Symbol and Pair{Symbol,Symbol} in on for joining (#2006) (bkamins)
  • change how copycols in Vector{<:NamedTuple} constructor (#2004) (bkamins)
  • Documenting the difference between df[!, :col] and df[:, :col] (#2000) (aminya)
  • fix a bug in select! (#1995) (bkamins)
  • Add reference to DataFrames Tutorial in index.md (#1994) (dmolina)
  • Avoid recompiling combine for each new NamedTuple or Tuple type (#1992) (nalimilan)
  • deprecate names! (#1986) (bkamins)
  • Rewrite select! to improve performance (#1985) (Ellipse0934)
  • add travis on windows + x86 and remove AppVeyor (#1984) (bkamins)
  • Fix handling of hash collisions in row_group_slots (#1979) (nalimilan)
  • allow more flexible cols in groupby and in particular handle empty cols (#1977) (bkamins)
  • Allow permutation of names in rename! (#1974) (innerlee)
  • clean up permutecols! and tests (#1972) (bkamins)
  • Add InvertedIndex and in() examples (#1971) (nilshg)
  • wrap outstanding tests in data.jl in testset (#1970) (bkamins)
  • update documentation of column selection (#1969) (bkamins)
  • Fix documentation (#1966) (kojix2)
  • allow passing multiple columns arguments to select and select! (#1964) (bkamins)
  • allow joins for more than two data frames (#1963) (bkamins)
  • allow to create columns using : as row selector (#1961) (bkamins)
  • correctly use dotview instead of maybeview in broadcasting contexts (#1960) (bkamins)
  • Add :setequal to cols kwarg in vcat, push! and append! (#1958) (bkamins)
  • fix Vector for DataFrameRow eltype detection (#1957) (bkamins)
  • add error to disallowmissing and disallowmissing! (#1956) (bkamins)
  • disallow Base.Generator in setindex! to DataFrameRow (#1951) (bkamins)
  • make sure mapcols does not reuse source vectors (#1950) (bkamins)
  • fix documentation examples (#1946) (bkamins)
  • Fix path to iris.csv (#1944) (nalimilan)
  • Use nonmissingtype instead of Missings.T (#1941) (nalimilan)
  • Deprecate eltypes (#1940) (nalimilan)
  • add skipmissing to by (#1939) (bkamins)
  • Stop using makeunique=true for grouping keys in combine (#1938) (nalimilan)
  • Make push! more strict (#1937) (bkamins)
  • Make sort and sort! type-stable to improve performance (#1934) (nalimilan)
  • Cleanup tests (#1931) (bkamins)
  • Add @ref links to groupby docstring (#1930) (asinghvi17)
  • Get values of grouped columns (#1908) (jlumpe)
  • target setindex implementation (#1899) (bkamins)
  • Throw more specific error, ArgumentError, rather than a general Error() (#1872) (petershintech)
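
Several of the pull requests above add user-facing functionality (flatten, joins of more than two data frames, and the :setequal option for cols). The following is a minimal sketch of how these look, assuming a current DataFrames.jl 1.x release; it uses the present-day innerjoin function rather than the join function that shipped at the time, and all data and variable names are illustrative.

```julia
using DataFrames

# Each element of column :b is a vector; flatten expands it to one row per element.
df = DataFrame(a = [1, 2], b = [[10, 11], [20, 21]])
flat = flatten(df, :b)

# Joining more than two data frames in a single call.
x = DataFrame(id = [1, 2, 3], u = ["a", "b", "c"])
y = DataFrame(id = [2, 3, 4], v = [0.2, 0.3, 0.4])
z = DataFrame(id = [3, 4, 5], w = [true, false, true])
joined = innerjoin(x, y, z, on = :id)

# cols = :setequal accepts the same set of column names in a different order.
stacked = vcat(DataFrame(a = 1, b = 2), DataFrame(b = 3, a = 4), cols = :setequal)
```

Here flat has four rows, joined keeps only the id shared by all three tables, and stacked matches columns by name rather than by position.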

- Julia
Published by julia-tagbot[bot] over 6 years ago

DataFrames - v0.19.4

Make DataFrames.jl depend on Missings.jl version 0.4.2. Stop using Missings.T internally and use nonmissingtype instead.

- Julia
Published by julia-tagbot[bot] almost 7 years ago

DataFrames - v0.19.3

  • fixed a bug in deprecation code for when setindex! was passed a 1-row DataFrame as the right hand side (note though this syntax is not recommended to be used);
  • DataFrameRows and DataFrameColumns now support getproperty and propertynames and have custom printing that shows them similarly to data frames
  • categorical and categorical! now accept Type as cols argument, so that the user can flexibly decide which columns are converted to categorical based on their type
  • columnindex function from Tables.jl is now exported
  • All and Between can now be used for indexing columns of a data frame
  • passing a Tuple as the on keyword argument is deprecated (use Pair instead)
  • minor documentation and build system improvements
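
As a rough illustration of the column-selection items above, the sketch below uses Between, columnindex, and property access on the result of eachcol. It assumes a current DataFrames.jl 1.x release, and the data frame is made up for the example.

```julia
using DataFrames

df = DataFrame(a = 1:2, b = 3:4, c = 5:6, d = 7:8)

# Between selects a contiguous range of columns by name.
middle = df[:, Between(:b, :c)]

# columnindex reports the position of a column.
idx = columnindex(df, :c)

# The column iterator supports property access and propertynames.
cols = eachcol(df)
first_col = cols.a
col_names = propertynames(cols)
```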

- Julia
Published by julia-tagbot[bot] almost 7 years ago

DataFrames - v0.19.2

New features

  • added disallowmissing, allowmissing and categorical functions
  • unstack now accepts renamecols keyword argument
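
A small sketch of the features listed above, assuming a current DataFrames.jl 1.x release (the categorical function is omitted here because it lives in CategoricalArrays.jl in recent versions). The data and the "v_" prefix are illustrative.

```julia
using DataFrames

df = DataFrame(a = [1, 2])

# allowmissing returns a new data frame whose columns accept missing values.
loose = allowmissing(df)

# disallowmissing converts back when no missing values are present.
strict = disallowmissing(loose)

# renamecols maps each unique key value to the name of the column created for it.
long = DataFrame(id = [1, 1, 2, 2],
                 variable = ["x", "y", "x", "y"],
                 value = [1, 2, 3, 4])
wide = unstack(long, :id, :variable, :value, renamecols = x -> "v_" * x)
```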

Minor changes

  • documentation has been updated to reflect new indexing rules
  • setindex! deprecation warnings were improved and now take into account new rules of broadcasting into 0-row data frames
  • :eltype column in describe now contains the true element type of a data frame column (previously, if the type was a union with Missing, the Missing part was stripped, which sometimes led to user confusion)
  • broadcasting over GroupedDataFrame is now disallowed (it was never intended to work; in the future this might be allowed, but the target design is not decided upon yet)
  • append! now throws ArgumentError instead of ErrorException when the column names of the appended data frame do not match the target

Bug fixes

  • fixed a typo in append! error message
  • fixed a bug in categorical! function when a Colon as column selector was passed (the behavior was inconsistent with the documentation); now only categorical!(::DataFrame) changes columns whose eltype is <:Union{AbstractString, Missing} to categorical; any valid categorical!(::DataFrame, cols) call changes all columns selected by cols to categorical

- Julia
Published by nalimilan almost 7 years ago

DataFrames - v0.19.1

v0.19.1 (2019-07-25)

Changes summary

  • correctly handle broadcasting into a single cell of a data frame; now df[row, col] .= v broadcasts into the object held in df[row, col] cell
  • we now allow broadcasting into empty data frame (data frame df for which isempty(df) is true); in particular we allow column creation in empty DataFrame in which case we always create a 0-row vector;
  • push! and append! now make sure that the result of the operation did not corrupt the data frame (which mostly happens when there are column aliasing issues) and throw an error if it happens (this introduces a small overhead but greatly reduces a number of possible bugs in user code)
  • join, groupby and show-related functions now check if the data frames passed to them are consistent (have the same number of rows in each column and do not have corrupted index)
  • improved loading time by replacing StatsBase.jl by DataAPI.jl dependency
  • fixed documentation generation issues
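
The first item above is easiest to see with a column whose cells are themselves mutable containers. This is a minimal sketch, assuming a current DataFrames.jl 1.x release and illustrative data: broadcasting into df[row, col] modifies the object stored in that cell in place rather than replacing it.

```julia
using DataFrames

# Column :v stores one vector per row.
df = DataFrame(id = 1:2, v = [[1, 2], [3, 4]])
cell = df[1, :v]          # the vector held in the cell

# Broadcast into the contents of the cell; the other row is untouched.
df[1, :v] .= 0
```

After this runs, cell and df[1, :v] are still the same vector, now filled with zeros.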

Diff since v0.19.0

Closed issues:

  • Explanation of the deprecation of df[col] and df[cols] (#1897)
  • Basics docs are broken on master (#1891)
  • Preventing problems with aliased columns (#1885)
  • dropna (#1884)
  • Release 0.19.0 (#1883)
  • describe StatsBase vs DataAPI (#1882)
  • Issue when converting Excelfile with missing data to DataFrame (#1878)
  • select and deletecols for SubDataFrame and DataFrameRow (#1825)
  • Surprising setindex/setproperty behaviour (#1815)
  • Convenience methods for getproperty and setproperty in DataFrames with new ownership rules (#1753)
  • pairs outputs warnings (#1751)
  • Add checks of DataFrame consistency before expensive operations (#1744)
  • DataFrames should be indexable by CartesianIndex{2} (#1610)
  • allow vcat to widen columns (#1574)
  • Creating an empty DataFrame is unwieldy and has unexpected behavior (#1569)
  • Make a DataFrame not iterable (#1513)
  • broadcasted setindex not working as expected (#1507)
  • Writing to latex: omitting column with row index (#1381)
  • functional interface for deleting columns (#1378)
  • Implement a from_records constructor (#1191)
  • Markdown display (#1167)
  • implement view(::DataFrame, ...) to support broadcasted assignment (#1019)
  • JSON to dataframe input and output (#873)
  • Implement Base.cor for DataFrame (#583)
  • Remove nrow/ncol (#406)

Merged pull requests:

  • Improve documentation generation (#1892) (bkamins)
  • allow scalar broadcasting into an empty data frame (#1890) (bkamins)
  • First proposal of consistency checks (#1887) (bkamins)
  • Extend describe from DataAPI to allow removing StatsBase dependency (#1818) (quinnj)

- Julia
Published by julia-tagbot[bot] almost 7 years ago

DataFrames - v0.19.0

API changes:

  • allow Regex indexing of columns
  • allow Not from InvertedIndices.jl indexing of rows and columns
  • add ! indexing of rows of AbstractDataFrame
  • deprecate indexing with column or columns only (like df[:a] or df[1:2])
  • define target rules for getindex, getproperty, setindex!, and setproperty! for AbstractDataFrame and DataFrameRow (in this release old behavior is deprecated; in the next release it will get replaced by target functionality)
  • add indexing using CartesianIndex{2} for AbstractDataFrame
  • full support of broadcasting for AbstractDataFrame
  • support for broadcasting assignment for DataFrameRow
  • keys(::DataFrameRow) now returns a Tuple of column names
  • added get and map methods for DataFrameRow
  • categorical! now accepts columns that contain missing values
  • get and haskey for AbstractDataFrame are now deprecated
  • empty! for DataFrame is deprecated now
  • add hasproperty for AbstractDataFrame
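
The indexing forms introduced in this list can be sketched as follows, assuming a current DataFrames.jl 1.x release (which re-exports Not); the column names are illustrative.

```julia
using DataFrames

df = DataFrame(a = 1:3, x1 = 4:6, x2 = 7:9)

# Regex selects every column whose name matches.
xs = df[:, r"^x"]

# Not inverts a column or row selection.
no_a = df[:, Not(:a)]
no_first_row = df[Not(1), :]

# ! as the row selector returns the stored column without copying.
stored = df[!, :a]

# CartesianIndex{2} addresses a single cell as (row, column).
cell = df[CartesianIndex(2, 3)]

has_x1 = hasproperty(df, :x1)
```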

Fixes:

  • improved showing of DataFrameRow with zero columns
  • fix combine with aggregation when skipmissing=true

Minor changes:

  • improvements in error messages and types of thrown exceptions on error
  • various documentation improvements
  • improved getindex speed for vector of Bool indexing
  • remove InteractiveUtils.jl dependency

- Julia
Published by julia-tagbot[bot] almost 7 years ago

DataFrames - v0.18.4

Changes since last release:

  • Fix combine with aggregation when skipmissing=true
  • Remove InteractiveUtils load

- Julia
Published by julia-tagbot[bot] almost 7 years ago

DataFrames - v0.18.3

Changes since last release:

  • improved DataFrame constructor to avoid allocations when it is not necessary
  • updated compat bounds
  • the custom dump method for AbstractDataFrame is removed; use summary to get a brief description of a data frame instead
  • fixed errors in map on GroupedDataFrame in a situation when 0 rows are produced

- Julia
Published by julia-tagbot[bot] about 7 years ago

DataFrames - v0.18.2

This is a patch release. Changes:

  • integration with Tables.jl is more flexible in accepting what is considered a table
  • fixed a bug that occurred when a DataFrame was created with zero rows but some columns
  • reduce(vcat, ...) now accepts cols argument and correctly handles data frames with zero columns
  • improved documentation; major changes: added instructions for setting up the Jupyter notebook to work with DataFrames.jl and explained the policy on what is considered public API of DataFrames.jl

- Julia
Published by julia-tagbot[bot] about 7 years ago

DataFrames - v0.18.1

  • DataFrame constructor now correctly handles corner cases of objects that follow Tables.jl interface (see https://github.com/JuliaData/DataFrames.jl/issues/1788 for an example problem that had to be fixed)
  • DataFrame constructor for GroupedDataFrame and Vector{<:NamedTuple} now disallows passing copycols keyword argument

- Julia
Published by julia-tagbot[bot] about 7 years ago

DataFrames - v0.18.0

Breaking changes:

  • functions that create a new DataFrame copy passed columns by default; this can be overridden by copycols keyword argument or by using the DataFrame! function that does not copy passed columns
  • in eachcol now names keyword argument defaults to false
  • make dropmissing and dropmissing! disallow missing values in column by default
  • removed long deprecated uses of by, nullable!, keys, values, pool, pool!, complete_cases, complete_cases!, sub, rename!, rename and vcat
  • removed dependency on DataStreams.jl and WeakRefStrings.jl

Deprecations:

  • several DataFrame constructors not taking source data were deprecated
  • colwise is now deprecated

Enhancements:

  • add cols keyword argument to vcat, make it ignore data frames with no columns, and support efficient reduce
  • allow passing a data frame with no columns to append!
  • allow push! to a data frame with no columns and add cols keyword argument to it
  • improvements of showing data frames, grouped data frames and data frame rows for CSV, TSV, HTML and LaTeX MIME types
  • optimized grouping methods for PooledArrays
  • DataFrame constructor can now take tuples of column vectors and column names
  • added compress keyword argument to the categorical! function
  • describe now supports passing custom functions
  • make allowmissing! and disallowmissing! accept vector of Bool
  • add select, select! and deletecols functions

Bug fixes:

  • combine now has a better handling of combining incompatible return values

Miscellaneous:

  • started testing against Julia 1.1 and stopped testing against Julia 0.7
  • migration to JuliaRegistrator
  • numerous documentation improvements
  • removed custom deepcopy implementation
  • removed dependency on CodecZlib.jl and TranscodingStreams.jl
  • improved error message when column is not found in a data frame
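
The copy-by-default rule is the change in this release most likely to affect existing code. A minimal sketch of it, assuming a current DataFrames.jl 1.x release (where the copycols keyword is still available) and illustrative data:

```julia
using DataFrames

x = [1, 2, 3]

# By default the constructor copies the vectors it is given.
copied = DataFrame(a = x)

# copycols = false reuses the passed vector as the column.
aliased = DataFrame(a = x, copycols = false)

# Mutating x is visible only through the aliased data frame.
x[1] = 100

# select builds a new data frame from the chosen columns.
wide = DataFrame(a = 1:2, b = 3:4, c = 5:6)
picked = select(wide, :a, :c)

# push! into a data frame with no columns creates them from the row.
empty_df = DataFrame()
push!(empty_df, (a = 1, b = "x"))
```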

- Julia
Published by julia-tagbot[bot] about 7 years ago

DataFrames - Version 0.17.1

API changes

  • add DataFrame constructor from GroupedDataFrame
  • add groupvars and groupindices functions for GroupedDataFrame
  • DataFrameRow(::AbstractDataFrame, ::Integer) constructor is no longer deprecated
  • combine(::GroupedDataFrame) (without passing a function to use for combining groups) is now deprecated

Bug fixes

  • fixed getindex for GroupedDataFrame
  • fixed DataFrame constructor from Vector{NamedTuple}
  • GroupedDataFrame constructor now correctly allows AbstractVectors for specification of grouping columns

Documentation

  • added append! documentation
  • improved unstack examples in the documentation

Internal

  • fromcolumns is now a bit more efficient
  • Improved error messages for getindex function and improved handling of @inbounds with getindex
  • Clean up Travis CI script
  • various improvements in test code organization

- Julia
Published by bkamins over 7 years ago

DataFrames - Version 0.17.0

  • further performance improvements in split-apply-combine functions and fix a bug in map on GroupedDataFrame grouping column selection;
  • now view of a DataFrame retains parent DataFrame information for SubDataFrame and DataFrameRow
    • calling a view on a subset of columns of a DataFrame is much faster now
    • selecting all columns of a parent with a colon (e.g. view(df, 1:2, :)) makes a view always have all columns of a parent DataFrame even if it is mutated;
  • you can now convert a SubDataFrame to a DataFrame;
  • push! now accepts DataFrameRow as an argument and does not accept Dict with string keys (Symbols are required)
  • DataFrame constructor accepts SubDataFrame or DataFrameRow as an argument returning a freshly allocated DataFrame;
  • improved displaying of AbstractDataFrame, DataFrameRow and GroupedDataFrame for all supported MIMEs;
  • added parent method for GroupedDataFrame;
  • any AbstractDataFrame (not only DataFrame as before) now supports iterable table interface and other small improvements in integration with Tables.jl and TableTraits; calling Tables.materializer on a DataFrame returns a DataFrame;
  • various documentation and test coverage improvements

- Julia
Published by bkamins over 7 years ago

DataFrames - Version 0.16.0

  • Remove deprecations for getindex, view, collect(::DataFrameRow); key highlights on the current functionality are:

    1. using @view on getindex always consistently returns the same range of values as getindex (but in a view)
    2. selecting a single row of a data frame always returns a DataFrameRow which is a view (using view or getindex); you can copy a DataFrameRow to get a NamedTuple containing the data from this row;
    3. iterating DataFrameRow yields the values only (similarly to a NamedTuple);
    4. using a row selector in getindex, e.g. df[:, col], always copies the returned columns; use df[col] to avoid making a copy (and get the column itself if df is a DataFrame or an appropriate view of the column if df is a SubDataFrame or DataFrameRow).
  • Fix bug with constructor with TableTraits 0.4.1.

  • Respect output limit when printing to LaTeX.

  • Avoid computing unused statistics in describe.

  • Add disallowmissing keyword argument to dropmissing and dropmissing!.

  • Make unstack use all columns other than :variable and :value as rowkeys.

  • Improve error message for deletecols!.

  • Add summary keyword argument to HTML show method.

  • Fully implement parentindices and parent.

  • Deprecate some convert methods and improve others.

  • Implement more ndims methods.

  • Show group values when printing grouped data frame.

  • Correctly handle ranges in deletecols!.
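
The row and column semantics highlighted at the top of this list can be sketched as follows. The example assumes a current DataFrames.jl 1.x release, so it does not use the df[col] form mentioned above, and the data is illustrative.

```julia
using DataFrames

df = DataFrame(a = [1, 2], b = ["x", "y"])

# Selecting a single row returns a DataFrameRow, which is a view into df.
row = df[1, :]
row.a = 100               # writes through to the parent data frame

# copy materializes the row as a NamedTuple.
nt = copy(row)

# Iterating a DataFrameRow yields its values only.
vals = collect(row)

# A row selector of : returns a copy of the column.
col_copy = df[:, :a]
```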

- Julia
Published by nalimilan over 7 years ago

DataFrames - Version 0.15.2

  • fix unique! and unique signatures to sync with Julia 1.1
  • avoid printing deprecation warnings by dump, show and mapcols with SubDataFrame
  • A minor documentation improvement

- Julia
Published by bkamins over 7 years ago

DataFrames - Version 0.15.1

  • Avoid deprecation warning in eachcol.

- Julia
Published by nalimilan over 7 years ago

DataFrames - Version 0.15.0

  • Make indexing more consistent (introducing many deprecations to be removed in the next release).
  • Introduce new API for grouping, making it dramatically more efficient. Improve naming of columns created from anonymous functions.
  • Deprecate length, delete!, insert! and merge! to make the API consistent with the definition of data frames as collections of rows (rather than of columns).
  • Finish deprecation period of the makeunique argument.
  • Deprecate head and tail in favor of first and last for consistency with Julia Base.
  • Deprecate eachcol(df) in favor of eachcol(df, true) in order to change the default behavior in the future. Add a mapcols function to apply an operation to each column and return a data frame.
  • Improve performance of TableTraits sink.
  • Allow specifying columns to completecases, dropmissing and dropmissing!.
  • Fix show methods for CSV/TSV. Add dimensions to HTML output.
  • Add conversion from a DataFrameRow to a Vector.
  • Improve documentation.
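
A short sketch of the row-oriented helpers named above (first and last in place of head and tail, mapcols, and column-restricted missing handling), assuming a current DataFrames.jl 1.x release and illustrative data:

```julia
using DataFrames

df = DataFrame(a = [1, missing, 3], b = [missing, 2.0, 3.0])

# first and last take a row count, as in Julia Base.
top = first(df, 2)
bottom = last(df, 1)

# mapcols applies a function to each column and returns a data frame.
doubled = mapcols(col -> 2 .* col, df)

# Restrict the missing-value check to selected columns.
keep = completecases(df, :a)
only_a = dropmissing(df, :a)
```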

- Julia
Published by nalimilan over 7 years ago

DataFrames - Version 0.14.1

  • Improve printing to HTML, GroupedDataFrame and type formatting.
  • Improve code to avoid printing deprecation warnings.
  • Add haskey method for DataFrameRow.
  • Add repeat method for AbstractDataFrame.
  • Make push! of NamedTuple to DataFrame use field names.
  • Improve documentation.

- Julia
Published by bkamins over 7 years ago

DataFrames - Version 0.14.0

  • Use new Tables.jl interface.
  • Improve printing.
  • Improve documentation.
  • Omit prefix when using function name to create column names with aggregate.
  • Fix readtable.
  • Avoid copying a vector in stackdf.
  • Raise errors when passing a single Boolean to getindex.
  • Fix stack overflow with vcat and by/aggregate.
  • Deprecate iterating over DataFrameRow in favor of pairs.
  • Add DataFrame constructor accepting a DataFrame.

- Julia
Published by nalimilan almost 8 years ago

DataFrames - Version 0.13.1

- Julia
Published by quinnj almost 8 years ago

DataFrames - Version 0.13.0

- Julia
Published by quinnj almost 8 years ago

DataFrames - v0.12.0

- Julia
Published by ararslan almost 8 years ago

DataFrames - Version 0.11.7

- Julia
Published by nalimilan almost 8 years ago

DataFrames - Version 0.11.6

- Julia
Published by nalimilan about 8 years ago

DataFrames - Version 0.11.5

  • Fix combining DataFrames with a column with element type Missing.
  • Test non-matching joins.

- Julia
Published by nalimilan over 8 years ago

DataFrames - Version 0.11.4

  • Add filter and filter! methods.
  • Deprecate automatic deduplication of column names, add makeunique keyword argument to enable it.
  • Improve DataFrame constructors and conversions for Vector and Matrix.
  • Check categorical argument length in constructor.
  • Redesign getindex and fix stack/melt StackOverflowError.
  • Fix a stack method calling stackdf instead of stack.
  • Fix readtable with zipped files.
  • Add deprecation for vcat(Vector{<:AbstractDataFrame}).
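
To illustrate filter and the makeunique keyword from the list above, here is a minimal sketch assuming a current DataFrames.jl 1.x release, where duplicate column names raise an error unless makeunique = true is passed. The data is made up for the example.

```julia
using DataFrames

df = DataFrame(a = 1:4, b = ["w", "x", "y", "z"])

# filter keeps the rows for which the predicate returns true.
big = filter(row -> row.a > 2, df)

# Both inputs have a column named :a; makeunique = true renames the clash.
left = DataFrame(a = 1:2)
right = DataFrame(a = 3:4)
wide = hcat(left, right, makeunique = true)
```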

- Julia
Published by andreasnoack over 8 years ago

DataFrames - Version 0.11.3

  • Fix similar and allowmissing!.
  • Support joining dataframes on columns with different left/right names.
  • Improve unstack.
  • Keep columns which do not allow for missing values as such in join.
  • Fix hash when passed an initial hash value.
  • Make boolean indexing stricter to match behavior used in Julia Base.
  • Julia 0.7 fixes.

- Julia
Published by ararslan over 8 years ago

DataFrames - Version 0.11.2

  • Fix readtable in the presence of missing values.
  • Use CodecZlib instead of GZip for readtable.
  • Add one-argument version of allowmissing! to change all columns to accept missing values.
  • Deprecate pipelining with |> and currying functions.
  • Correct docstring for combine.
  • Fix unstack in some special cases.
  • Fix hcat between a vector as first argument and a DataFrame as second argument.
  • Fix Julia 0.7 deprecations.

- Julia
Published by nalimilan over 8 years ago

DataFrames - Version 0.11.1

  • Fix DataStreams allocate with CategoricalString.

- Julia
Published by quinnj over 8 years ago

DataFrames - Version 0.11.0

This is a breaking release. Deprecations are in place where possible, but code will need to be adjusted to work with the new framework. Changes:

  • NA (from the DataArrays package) has been replaced with missing (from the Missings package, and soon in Julia Base).
  • Columns are no longer converted to DataArray: do it manually if you want to use this type, or use Vector{Union{T, Missing}}. PooledDataArray should be replaced with either CategoricalArray or PooledArray.
  • Modeling features have been moved to the StatsModels package.
  • readtable and writetable have been deprecated in favor of CSV.read and CSV.write from the CSV package.
  • Joining and grouping algorithm has been improved to be faster and no longer fails when number of groups is too high.
  • complete_cases(!) functions have been reworked and new dropnull(!) functions added.
  • Concatenation code has been reworked, with better error messages.
  • The manual has been improved.
  • Printing has been improved.
  • And many other fixes and improvements.
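
As a rough picture of the missing-based representation described above, the sketch below stores a column as Vector{Union{Int, Missing}} and handles missing values with functions that are part of Julia Base in current versions. It assumes a current DataFrames.jl 1.x release and illustrative data.

```julia
using DataFrames

# A column that allows missing values is a plain Vector with a Union eltype.
df = DataFrame(x = Union{Int, Missing}[1, missing, 3])

col_type = eltype(df.x)

# skipmissing iterates over the non-missing values only.
total = sum(skipmissing(df.x))

# coalesce replaces missing with a fallback value.
filled = coalesce.(df.x, 0)
```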

- Julia
Published by nalimilan over 8 years ago

DataFrames - Version 0.6.6

  • Deprecates array(df, ...) in favor of convert(Array, df, ...) ([#806])
  • Deprecates DataArray(df, T) in favor of convert(DataArray{T}, df) ([#806])

- Julia
Published by rofinn almost 9 years ago

DataFrames - Version 0.10.1

- Julia
Published by ararslan almost 9 years ago

DataFrames -

- Julia
Published by ararslan about 9 years ago

DataFrames -

- Julia
Published by ararslan about 9 years ago

DataFrames - v0.9.0

Support for Julia 0.6. Julia 0.4 support dropped.

- Julia
Published by ararslan over 9 years ago

DataFrames - DataFrames v0.6.2

- Julia
Published by garborg over 11 years ago

DataFrames - DataFrames v0.6.1

- Julia
Published by garborg over 11 years ago

DataFrames - DataFrames v0.6.0

- Julia
Published by garborg over 11 years ago