Problem
When working with large .sav files, I want to both zip them (to reduce their storage space) and read them in with col_select (to reduce their space in memory). Unfortunately, combining the two (reading a zipped file through a unz() connection while also passing col_select) makes haven::read_sav fail with an "invalid connection" error.
Reprex
data <- tibble::tibble(x = 1:5, y = letters[1:5])
haven::write_sav(data, path="test.sav")
zip("test.zip", "test.sav")
# this works
test_works <- haven::read_sav(
  unz("test.zip", "test.sav")
)
# but this fails
test_fails <- haven::read_sav(
  unz("test.zip", "test.sav"),
  col_select = c(x)
)
# Error in isOpen(con) : invalid connection
Workaround
There's a simple workaround:
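# extract the file to disk first (unz() only creates a connection, it does not extract)
unzip("test.zip", files = "test.sav")
# reading from a plain path works with col_select
haven::read_sav(
  "test.sav",
  col_select = c(x)
)
# remove the extracted copy afterwards
file.remove("test.sav")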
but I do this often enough that I'd appreciate it working natively in haven without a function implementing the above workaround.
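For reference, the kind of wrapper I'd like to avoid maintaining looks roughly like the sketch below. read_sav_zipped is a hypothetical name and is not part of haven. It assumes the archive member can be extracted to a temporary directory, and it forwards every other argument (including col_select) to haven::read_sav unchanged.

```r
# Hypothetical helper (not part of haven): extract one member of a zip
# archive to a temporary directory, read it from disk, then clean up.
read_sav_zipped <- function(zipfile, member, ...) {
  exdir <- tempfile("sav_unzip_")
  dir.create(exdir)
  # Delete the extracted copy when the function exits, even on error.
  on.exit(unlink(exdir, recursive = TRUE), add = TRUE)

  # unzip() returns the path(s) of the files it extracted.
  extracted <- utils::unzip(zipfile, files = member, exdir = exdir)

  # Arguments in ... (e.g. col_select) are passed straight to read_sav().
  haven::read_sav(extracted, ...)
}

# Usage, with the files created in the reprex above:
# test <- read_sav_zipped("test.zip", "test.sav", col_select = c(x))
```

Extracting into a temporary directory rather than the working directory avoids overwriting or deleting an existing test.sav, which the inline workaround above would do.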