[Rd] tar R command

Henrik Bengtsson hb at biostat.ucsf.edu
Mon Nov 29 05:35:47 CET 2010


First, if you look carefully, then you see that argument 'files'
should specify *filepaths*, i.e. directories and not specific files.
Thus, if you for instance place your files in directory "foo/" and
then call

tar("foo.tar", files="foo/");

you would do the right thing.

HOWEVER, looking at the internals of base::tar(), it seems to be
designed for a non-Windows platform, i.e. it will not work on Windows
as it stands (more below).  A workaround that also illustrates the
problem is the following patch:

# PATCH for file.info() such that tar() works on Windows
tar <- utils::tar; environment(tar) <- globalenv();
file.info <- function(...) {
  fi <- base::file.info(...);
  fi[setdiff(c("uid", "gid", "uname", "grname"), names(fi))] <- NA;
  fi;
} # file.info()

Example:

dir.create("foo/");
cat(file="foo/foo.txt", rep(letters, times=100));
tar("foo.tar", files="foo/");
str(file.info("foo.tar"));

'data.frame':   1 obs. of  11 variables:
 $ size  : num 7680
 $ isdir : logi FALSE
 $ mode  :Class 'octmode'  int 438
 $ mtime : POSIXct, format: "2010-11-28 20:24:05"
 $ ctime : POSIXct, format: "2010-11-28 20:03:56"
 $ atime : POSIXct, format: "2010-11-28 20:07:40"
 $ exe   : chr "no"
 $ uid   : logi NA
 $ gid   : logi NA
 $ uname : logi NA
 $ grname: logi NA

This seems to generate a valid foo.tar file.
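
One way to check that claim is to list the archive contents with
utils::untar(list = TRUE).  The following is only a sketch: it works in
a temporary directory rather than the current one, and it passes
tar = "internal" to both calls (my assumption, to force R's own
implementation instead of any external tar program).

```r
# Work in a scratch directory so nothing in the current directory is touched.
work <- tempfile("tar-check-")
dir.create(work)
old <- setwd(work)

# Same layout as in the example above: one directory holding one text file.
dir.create("foo")
cat(file = "foo/foo.txt", rep(letters, times = 100))

# Create the archive, then list its entries without extracting them.
utils::tar("foo.tar", files = "foo", tar = "internal")
listing <- utils::untar("foo.tar", list = TRUE, tar = "internal")
print(listing)

setwd(old)
```

If the archive is valid, the listing should contain an entry for
foo/foo.txt.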


PROBLEMS:
Here are a few problems I have identified with tar().

PROBLEM #1:
The default for argument files=NULL is documented "to archive all
files under the current directory".  In reality it gives:

  Error in list.files(files, recursive = TRUE, all.files = TRUE,
full.names = TRUE: invalid 'directory' argument

because list.files(NULL) is invalid.  The default should instead be files=".".
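
Until the default is changed, a thin wrapper can supply it.  This is a
minimal sketch, not a patch to utils::tar(); the name tar_default() is
hypothetical.

```r
# Hypothetical wrapper (not part of utils): substitute "." when 'files'
# is NULL, so that the documented default is what actually happens.
tar_default <- function(tarfile, files = NULL, ...) {
  if (is.null(files)) files <- "."
  utils::tar(tarfile, files = files, ...)
}
```

Usage: tar_default("/some/other/dir/all.tar") archives the current
directory.  Write the tar file outside the directory being archived so
that it does not end up including itself.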

PROBLEM #2:
If a non-existing path is passed (argument 'files'), then tar()
generates an invalid *.tar file of size 1024 bytes (not empty, as the
OP says).  It would be better to assert that each of the requested
paths really exists and is a directory, e.g. using
file.info()$isdir.
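
Such an assertion could look roughly like the sketch below.  The helper
name assert_dirs() is hypothetical, and it relies on the 'isdir' column
of file.info(), which is NA for paths that do not exist.

```r
# Hypothetical helper: stop early unless every requested path exists
# and is a directory.
assert_dirs <- function(paths) {
  isdir <- file.info(paths)$isdir      # NA for non-existing paths
  bad <- paths[is.na(isdir) | !isdir]
  if (length(bad) > 0L) {
    stop("Not an existing directory: ", paste(sQuote(bad), collapse = ", "))
  }
  invisible(paths)
}
```

Calling assert_dirs(files) before tar() would turn the silent 1024-byte
archive into an error that names the offending paths.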

PROBLEM #3:
tar() assumes that file.info() returns a data.frame with fields 'uid',
'gid' and 'uname'.  That is not the case for file.info() on Windows.
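
A quick way to see which of these fields are missing on a given
platform (a sketch; the result depends on the OS, so no output is
shown):

```r
# Fields that tar() expects, according to the description above.
expected <- c("uid", "gid", "uname")

# Columns that file.info() actually returns on this platform.
fi <- file.info(".")

# Any names printed here are absent from file.info() on this system.
missing_fields <- setdiff(expected, names(fi))
print(missing_fields)
```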


> sessionInfo()
R version 2.12.0 Patched (2010-11-24 r53656)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

My $0.20

/Henrik


On Sun, Nov 28, 2010 at 7:00 PM, Dario Strbenac
<D.Strbenac at garvan.org.au> wrote:
> Hello,
>
> The documentation for the tar command leads me to think there is an internal implementation when the command can't be found in the OS.
>
> However, it doesn't seem to be the case, as I get an empty .tar file generated on a small example I made :
>
>> dir(pattern = "jpg")
> [1] "MA56237502_635.jpg"
>> file.info("MA56237502_635.jpg")
>                     size isdir mode               mtime               ctime               atime exe
> MA56237502_635.jpg 229831 FALSE  666 2010-11-29 13:05:49 2010-11-29 13:00:36 2010-11-29 13:00:36  no
>> tar("example.tar", files = dir(pattern = "jpg"))
>> file.info("example.tar")
>            size isdir mode               mtime               ctime               atime exe
> example.tar 1024 FALSE  666 2010-11-29 13:43:29 2010-11-29 13:42:30 2010-11-29 13:42:30  no
>
> Is this an unimplemented feature ?
>
>> sessionInfo()
> R version 2.12.0 (2010-10-15)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
> ...                ...               ...
>
> Thanks,
>       Dario.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


