read_sav fails on zipped files when using col_select #720

@jokroese

Problem

When working with large .sav files, I want to both zip them (to reduce their size on disk) and read them in with col_select (to reduce their memory footprint). Unfortunately, when doing both (reading a zipped file with col_select), haven::read_sav fails with an error.

Reprex

data <- tibble::tibble(x = 1:5, y = letters[1:5])

haven::write_sav(data, path="test.sav")

zip("test.zip", "test.sav")

# this works
test_works <- haven::read_sav(
  unz("test.zip", "test.sav")
)

# but this fails
test_fails <- haven::read_sav(
  unz("test.zip", "test.sav"),
  col_select = c(x)
)
# Error in isOpen(con) : invalid connection

Workaround

There's a simple workaround:

# extract the file to disk first (unz() only opens a connection; unzip() extracts)
unzip("test.zip", files = "test.sav")
haven::read_sav(
  "test.sav",
  col_select = c(x)
)
file.remove("test.sav")

but I do this often enough that I'd appreciate it working natively in haven without a function implementing the above workaround.
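
For anyone hitting the same error in the meantime, the workaround above can be wrapped up roughly as follows. This is only a sketch: `read_sav_zipped` is a hypothetical helper name, not part of haven, and it assumes the archive member can be extracted to a temporary directory with `utils::unzip()`. Extra arguments such as `col_select` are forwarded to `haven::read_sav()` unchanged.

```r
# Hypothetical helper (not part of haven): extract one member of a zip
# archive to a temporary directory, read it with haven::read_sav(), and
# remove the temporary copy when the function exits.
read_sav_zipped <- function(zipfile, member, ...) {
  tmp_dir <- tempfile("sav_unzip_")
  dir.create(tmp_dir)
  # Clean up the extracted file even if reading fails.
  on.exit(unlink(tmp_dir, recursive = TRUE), add = TRUE)

  # utils::unzip() returns the paths of the files it extracted.
  extracted <- utils::unzip(zipfile, files = member, exdir = tmp_dir)
  if (length(extracted) != 1L) {
    stop("Could not extract '", member, "' from '", zipfile, "'")
  }

  # Arguments in ... (for example col_select) are passed through as-is.
  haven::read_sav(extracted, ...)
}

# Usage, assuming test.zip from the reprex above exists:
# test_ok <- read_sav_zipped("test.zip", "test.sav", col_select = c(x))
```

Extracting into a fresh temporary directory avoids overwriting or deleting a test.sav that may already exist in the working directory, which the inline workaround would do.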

Labels: bug (an unexpected problem or unintended behavior)