[R] Bug in internal 'tar' implementation?

David Engster
Tue Jan 31 08:38:53 CET 2023


I think I found a bug in the internal implementation of 'tar', but
before bothering the R maintainers, I was advised to ask here to make
sure I'm not missing something.

Fortunately, it can be very easily reproduced on a Linux system. In an
empty temporary directory, execute the following code:

cat("foobar", file="test.txt")
file.symlink("test.txt", "test_link.txt")
tar("test.tar", c("test_link.txt", "test.txt"), tar="internal")
system2("tar", c("tf", "test.tar"))

This code creates a file "test.txt" and a symbolic link "test_link.txt"
pointing to that file. Both are then put into "test.tar" using R's
internal tar implementation, and the system's 'tar' binary (usually
GNU tar) is used to list the contents of that archive.

On my system (Debian 11, GNU tar 1.34), this gives me the following
output:

[1] TRUE
test_link.txt
tar: Skipping to next header
tar: Exiting with failure status due to previous errors

Note that *extracting* the archive with 'tar xf' (fortunately) works
fine; only listing its contents fails. After looking at a hexdump of
'test.tar' and at R's internal tar() code, I found the reason: the wrong
size for the link is written to the tar header. It should be zero, but
the size of the linked file is put there instead. This makes 'tar tf'
skip too many blocks after displaying the link's filename, so it misses
the next header and aborts.
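
For anyone who wants to check the header without a hexdump, here is a
minimal sketch that decodes a few fields of the first 512-byte block.
The helper names read_tar_header() and data_blocks() are made up for
this illustration and are not part of R. The byte offsets assume the
standard ustar header layout, shifted to R's 1-based indexing.

```r
# Parse selected fields from one 512-byte tar header block (a raw vector).
# Offsets assume the ustar layout: name 0-99, size 124-135, typeflag 156,
# linkname 157-256 (0-based), converted here to 1-based indices.
read_tar_header <- function(hdr) {
  stopifnot(is.raw(hdr), length(hdr) >= 512L)
  field <- function(from, to) {
    bytes <- hdr[from:to]
    bytes <- bytes[bytes != as.raw(0)]   # drop NUL padding
    trimws(rawToChar(bytes))
  }
  list(
    name     = field(1L, 100L),
    size     = strtoi(field(125L, 136L), base = 8L),  # octal byte count
    typeflag = field(157L, 157L),                     # "2" marks a symlink
    linkname = field(158L, 257L)
  )
}

# Number of 512-byte data blocks a reader skips for an entry of this size.
data_blocks <- function(size) ceiling(size / 512)

# Only runs if the archive from the example above is in the working directory.
if (file.exists("test.tar")) {
  h <- read_tar_header(readBin("test.tar", what = "raw", n = 512L))
  str(h)
  cat("data blocks a lister would skip:", data_blocks(h$size), "\n")
}
```

If the analysis above is right, the first entry should show typeflag "2"
together with a non-zero size, so a lister would skip one data block that
was never written and land past the header of "test.txt".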

While I'm aware that the 'tar()' help says to avoid links for
portability reasons, it also says that symbolic links are supported on
OSes that support them, which Linux of course does. So do you agree this
should be fixed? (It's a very simple one-line change.)

Best,
David