Changelog
Source: NEWS.md
vroom 1.6.5
- Internal changes requested by CRAN around format specification (#524).
vroom 1.6.4
CRAN release: 2023-10-02
It is now possible (again?) to read from a list of connections (@bairdj, #514).
Internal change for compatibility with cpp11 >= 0.4.6 (@DavisVaughan, #512).
vroom 1.6.1
CRAN release: 2023-01-22
str() now works in a colorized context in the presence of a column of class integer64, i.e. parsed with col_big_integer() (@bart1, #477).
The embedded implementation of the Grisu algorithm for printing floating point numbers now uses snprintf() instead of sprintf() and likewise for vroom’s own code (@jeroen, #480).
vroom 1.6.0
CRAN release: 2022-09-30
vroom(col_select=) now handles column selection by numeric position when an id column is provided (#455).
vroom(id = "path", col_select = a:c) is treated like vroom(id = "path", col_select = c(path, a:c)). If an id column is provided, it is automatically included in the output (#416).
vroom_write(append = TRUE) does not modify an existing file when appending an empty data frame. In particular, it does not overwrite (delete) the existing contents of that file (https://github.com/tidyverse/readr/issues/1408, #451).
vroom::problems() now defaults to .Last.value for its primary input, similar to how readr::problems() works (#443).
The warning that indicates the existence of parsing problems has been improved, which should make it easier for the user to follow up (https://github.com/tidyverse/readr/issues/1322).
vroom() reads more reliably from filepaths containing non-ascii characters, in a non-UTF-8 locale (#394, #438).
vroom_format() and vroom_write() only quote values that contain a delimiter, quote, or newline. Specifically, values that are equal to the na string (or that start with it) are no longer quoted (#426).
Fixed segfault when reading in multiple files and the first file has only a header row of column names, but subsequent files have at least one row (#430).
Fixed segfault when vroom_format() is given an empty data frame (#425).
Fixed a segfault that could occur when the final field of the final line is missing and the file also does not end in a newline (#429).
Fixed recursive garbage collection error that could occur during vroom_write() when output_column() generates an ALTREP vector (#389).
vroom_progress() uses rlang::is_interactive() instead of base::interactive().
col_factor(levels = NULL) honors the na strings of vroom() and its own include_na argument, as described in the docs, and now reproduces the behaviour of readr’s first edition parser (#396).
vroom 1.5.7
CRAN release: 2021-11-30
Jenny Bryan is now the official maintainer.
Fix uninitialized bool detected by CRAN’s UBSAN check (https://github.com/tidyverse/vroom/pull/386)
Fix buffer overflow when trying to parse an integer field that is over 64 characters long (https://github.com/tidyverse/readr/issues/1326)
Fix subset indexing when indexes span a file boundary multiple times (#383)
vroom 1.5.6
CRAN release: 2021-11-10
vroom(col_select=) now works if col_names = FALSE as intended (#381)
vroom(n_max=) now correctly handles cases when reading from a connection and the file does not end with a newline (https://github.com/tidyverse/readr/issues/1321)
vroom() no longer issues a spurious warning when the parsing needs to be restarted due to the presence of embedded newlines (https://github.com/tidyverse/readr/issues/1313)
Fix performance issue when materializing subsetted vectors (#378)
vroom_format() now uses the same internal multi-threaded code as vroom_write(), improving its performance in most cases (#377)
vroom_fwf() no longer omits the last line if it does not end with a newline (https://github.com/tidyverse/readr/issues/1293)
Empty files or files with only a header line and no data no longer cause a crash if read with multiple files (https://github.com/tidyverse/readr/issues/1297)
Files with a header but no contents, or an empty file if col_names = FALSE, no longer cause a hang when progress = TRUE (https://github.com/tidyverse/readr/issues/1297)
Commented lines with comments at the end of lines no longer hang R (https://github.com/tidyverse/readr/issues/1309)
Comment lines containing unpaired quotes are no longer treated as unterminated quotations (https://github.com/tidyverse/readr/issues/1307)
Values with only an Inf or NaN prefix but additional data afterwards, like Inform, are no longer inappropriately guessed as doubles (https://github.com/tidyverse/readr/issues/1319)
Time types now support %h format to denote hour durations greater than 24, like readr (https://github.com/tidyverse/readr/issues/1312)
vroom 1.5.5
CRAN release: 2021-09-14
vroom() now supports files with only carriage return newlines (\r). (#360, https://github.com/tidyverse/readr/issues/1236)
vroom() now parses single digit datetimes more consistently as readr has done (https://github.com/tidyverse/readr/issues/1276)
vroom() now parses Inf values as doubles (https://github.com/tidyverse/readr/issues/1283)
vroom() now parses NaN values as doubles (https://github.com/tidyverse/readr/issues/1277)
VROOM_CONNECTION_SIZE is now parsed as a double, which supports scientific notation (#364)
vroom() now works around specifying a \n as the delimiter (#365, https://github.com/tidyverse/dplyr/issues/5977)
vroom() no longer crashes if given a col_name and col_type both less than the number of columns (https://github.com/tidyverse/readr/issues/1271)
vroom() no longer hangs if given an empty value for locale(grouping_mark=) (https://github.com/tidyverse/readr/issues/1241)
Fix performance regression when guessing with large numbers of rows (https://github.com/tidyverse/readr/issues/1267)
vroom 1.5.4
CRAN release: 2021-08-05
vroom(col_types=) now accepts column type names like those accepted by utils::read.table, e.g. vroom::vroom(col_types = list(a = "integer", b = "double", c = "skip"))
vroom() now respects the quote parameter properly in the first two lines of the file (https://github.com/tidyverse/readr/issues/1262)
vroom_write() now always correctly writes its output including column names in UTF-8 (https://github.com/tidyverse/readr/issues/1242)
vroom_write() now creates an empty file when given an input without any columns (https://github.com/tidyverse/readr/issues/1234)
vroom 1.5.3
CRAN release: 2021-07-14
vroom(col_types=) now truncates the column types if the user passes too many types. (#355)
vroom() now always includes the last row when guessing (#352)
vroom(trim_ws = TRUE) now trims field content within quotes as well as without (#354). Previously vroom explicitly retained field content inside quotes regardless of the value of trim_ws.
vroom 1.5.2
CRAN release: 2021-07-08
vroom() now supports inputs with unnamed column types that are less than the number of columns (#296)
vroom() now outputs the correct column names even in the presence of skipped columns (#293, tidyverse/readr#1215)
vroom_fwf(n_max=) now works as intended when the input is a connection.
vroom() and vroom_write() now automatically detect the compression format regardless of the file extension for bzip2, xzip, gzip and zip files (#348)
vroom() and vroom_write() now automatically support many more archive formats thanks to the archive package. These include new support for writing zip files, reading and writing 7zip, tar and ISO files.
vroom(num_threads = 1) will now not spawn any threads. This can be used as a workaround on systems without full thread support.
Threads are now automatically disabled on non-macOS systems compiling against clang’s libc++. Most non-macOS systems use the more common gcc libstdc++, so this should not affect most users.
vroom 1.5.0
CRAN release: 2021-06-14
Major improvements
New vroom(show_col_types=) argument to more simply control when column types are shown.
vroom(), vroom_fwf() and vroom_lines() now support multi-byte encodings such as UTF-16 and UTF-32 by converting these files to UTF-8 under the hood (#138)
vroom() now supports skipping comments and blank lines within data, not just at the start of the file (#294, #302)
vroom() now uses the tzdb package when parsing date-times (@DavisVaughan, #273)
vroom() now emits a warning of class vroom_parse_issue if there are non-fatal parsing issues.
vroom() now emits a warning of class vroom_mismatched_column_name if the user supplies a column type that does not match the name of a read column (#317).
The vroom package now uses the MIT license, as part of systematic relicensing throughout the r-lib and tidyverse packages (#323)
Minor improvements and fixes
vroom() correctly reads double values with comma as decimal separator (@kent37 #313)
vroom() now correctly skips lines with only one quote if the format doesn’t use quoting (https://github.com/tidyverse/readr/issues/991#issuecomment-616378446)
vroom() and vroom_lines() now handle files with mixed windows and POSIX line endings (https://github.com/tidyverse/readr/issues/1210)
vroom() now outputs a tibble with the expected number of columns and types based on col_types and col_names even if the file is empty (#297).
vroom() no longer mis-indexes files read from connections with windows line endings when the two line endings fall on separate sides of the read buffer (#331)
vroom() no longer crashes if n_max = 0 and col_names is a character (#316)
vroom() now preserves the spec attribute when vroom and readr are both loaded (#303)
vroom() now allows specifying column names in col_types that have been repaired (#311)
vroom() no longer inadvertently calls .name_repair functions twice (#310).
vroom() is now more robust to quoting issues when tracking the CSV state (#301)
vroom() now registers the S3 class with methods::setOldClass() (r-dbi/DBI#345)
col_datetime() now supports ‘%s’ format, which represents decimal seconds since the Unix epoch.
col_numeric() now supports grouping_mark and decimal_mark that are unicode characters, such as U+00A0 which is commonly used as the grouping mark for numbers in France (https://github.com/tidyverse/readr/issues/796).
vroom_fwf() gains a skip_empty_rows argument to skip empty lines (https://github.com/tidyverse/readr/issues/1211)
vroom_fwf() now respects n_max, as intended (#334)
vroom_lines() gains a na argument.
vroom_write_lines() no longer escapes or quotes lines.
vroom_write_lines() now works as intended (#291).
vroom_write(path=) has been deprecated, in favor of file, to match readr.
vroom_write_lines() now exposes the num_threads argument.
problems() now prints the correct row number of parse errors (#326)
problems() now throws a more informative error if called on a readr object (#308).
problems() now de-duplicates identical problems (#318)
Fix an inadvertent performance regression when reading values (#309)
n_max argument is correctly respected in edge cases (#306)
Factors with implicit levels now work when fields are quoted, as intended (#330)
Guessing double types no longer unconditionally ignores leading whitespace. Now whitespace is only ignored when trim_ws is set.
vroom 1.4.0
CRAN release: 2021-02-01
Major changes and new functions
vroom now tracks indexing and parsing errors like readr. The first time an issue is encountered a warning will be signaled. A tibble of all found problems can be retrieved with vroom::problems(). (#247)
Data with newlines within quoted fields will now automatically revert to using a single thread and be properly read (#282)
NUL values in character data are now permitted, with a warning.
New vroom_write_lines() function to write a character vector to a file (#291)
vroom_write() gains an eol= parameter to specify the end of line character(s) to use. Use vroom_write(eol = "\r\n") to write a file with Windows style newlines (#263).
Minor improvements and fixes
Datetime formats used when guessing now match those used when parsing (#240)
Quotes are now only valid next to newlines or delimiters (#224)
vroom() now signals an R error for invalid date and datetime formats, instead of crashing the session (#220).
vroom(comment = ) now accepts multi-character comments (#286)
vroom_lines() now works with empty files (#285)
Vectors are now subset properly when given invalid subscripts (#283)
vroom_write() now works when the delimiter is empty, e.g. delim = "" (#287).
vroom_write() now works with all ALTREP vectors, including string vectors (#270)
An internal call to new.env() now correctly uses the parent argument (#281)
vroom 1.3.0
CRAN release: 2020-08-14
The Rcpp dependency has been removed in favor of cpp11.
vroom() now handles cases when id is set and a column is skipped (#237)
vroom() now supports column selections when there are some empty column names (#238)
vroom() argument n_max now works properly for files with windows newlines and no final newline (#244)
Subsetting vectors now works with View() in RStudio if there are no rows to subset (#253).
Subsetting datetime columns now works with NA indices (#236).
vroom 1.2.1
CRAN release: 2020-05-12
vroom() now writes the column names if given an input with no rows (#213)
vroom() no longer truncates the last value in a file if the file contains windows newlines but no final newline (#219).
vroom() now works when the na argument is encoded in non ASCII or UTF-8 locales and the file encoding is not the same as the native encoding (#233).
vroom_fwf() now verifies that the positions are valid, namely that the begin value is always less than the previous end (#217).
vroom_lines() gains a locale argument so you can control the encoding of the file (#218)
vroom_write() now supports the append argument with R connections (#232)
vroom 1.2.0
CRAN release: 2020-01-13
Breaking changes
- vroom_altrep_opts() and the argument vroom(altrep_opts =) have been renamed to vroom_altrep() and altrep respectively. The prior names have been deprecated.
New Features
vroom() now supports reading Big Integer values with the bit64 package. Use col_big_integer() or the “I” shortcut to read a column as big integers. (#198)
cols() gains a .delim argument and vroom() now uses it as the delimiter if it is provided (#192)
vroom() now supports reading from stdin() directly, interpreted as the C-level standard input (#106).
Minor improvements and fixes
col_date now parses single digit month and day (@edzer, #123, #170)
fwf_empty() now uses the skip parameter, as intended.
vroom() can now read single line files without a terminal newline (#173).
vroom() now correctly copies string data for factor levels (#184)
vroom() no longer crashes when files have trailing fields, windows newlines and the file is not newline or null terminated.
vroom() now includes a spec object with the col_types class, as intended.
vroom() now better handles floating point values with very large exponents (#164).
vroom() now uses better heuristics to guess the delimiter and now throws an error if a delimiter cannot be guessed (#126, #141, #167).
vroom() now has an improved error message when a file does not exist (#169).
vroom() now outputs its messages on stdout() rather than stderr(), which avoids the text being red in RStudio and in the Windows GUI.
vroom() no longer overflows when reading files with more than 2B entries (@wlattner, #183).
vroom_fwf() is now more robust if not all lines are the expected length (#78)
vroom_fwf() and fwf_empty() now support passing Inf to guess_max().
vroom_str() now works with S4 objects.
vroom_fwf() now handles files with dos newlines properly.
vroom_write() now does not try to write anything when given empty inputs (#172).
Dates, times, and datetimes now properly consider the locale when parsing.
Added benchmarks with wide data for both numeric and character data (#87, @R3myG)
The delimiter used for parsing is now shown in the message output (#95 @R3myG)
vroom 1.0.2
CRAN release: 2019-06-28
New Features
- The column created by id is now stored as a run length encoded Altrep vector, which uses less memory and is much faster for large inputs. (#111)
Minor improvements and fixes
vroom_lines() now properly respects the n_max parameter (#142)
vroom() and vroom_lines() now support reading files which do not end in newlines by using a file connection (#40).
vroom_write() now works with the standard output connection stdout() (#106).
vroom_write() no longer crashes non-deterministically when used on Altrep vectors.
The integer parser now returns NA values for invalid inputs (#135)
Fix additional UBSAN issue in the mio project reported by CRAN (#97)
Fix indexing into connections with quoted fields (#119)
Move example files for vroom() out of \dontshow{}.
Fix missing columns and windows newlines (#114)
Throw an error message when writing a zip file, which is not supported (@metaOO, #145)
Default message output from vroom() now uses Rows and Cols (@meta00, #140)
vroom 1.0.1
CRAN release: 2019-05-14
New Features
- vroom_lines() function added, to (lazily) read lines from a file into a character vector (#90).
Minor improvements and fixes
Fix for a hang on Windows caused by a race condition in the progress bar (#98)
Remove accidental runtime dependency on testthat (#104)
Fix to actually return non-Altrep character columns on R 3.2, 3.3 and 3.4.
Disable colors in the progress bar when running in RStudio, to work around an issue where the progress bar would be garbled (https://github.com/rstudio/rstudio/issues/4777)
Fix for UBSAN issues reported by CRAN (#97)
Fix for rchk issues reported by CRAN (#94)
The progress bar now only updates every 10 milliseconds.
Getting started vignette index entry now more informative (#92)