[R] How to reach the column names in a huge .RData file without loading it

Richard M. Heiberger rmh at temple.edu
Wed Mar 16 18:38:40 CET 2016


Barry's solution works on Windows without Cygwin.
You do need Rtools, available from the Windows page on CRAN.

Rtools does not have "gunzip", but that is just an abbreviation for "gzip -d".

x:\HOME\rmh\HH-R.package>path
PATH=c:\Progra~2\Rtools\bin;c:\Progra~2\Rtools\gcc-4.6.3\bin;c:\progra~1\R\R-3.2.3\bin\x64;c:\Progra~1\MikTeX~1.9\miktex\bin\x64;c:\windows;c:\windows\system32

x:\HOME\rmh\HH-R.package>gzip -d -c c:\Users\rmh.DESKTOP-60G4CCO\test.RData | strings -t d
      0 RDX2
     35 mydataframe
    230 names
    251 mylongnamehere
    273 anotherlongname
    314 aasdkjhasdkjhaskdj
    347 row.names
    389 class
    410 data.frame
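
The filtering step Barry describes below (take the strings that follow
"names") can be sketched as a small shell function. This is only a sketch
under assumptions that are not from the thread: the file holds one plain
data frame saved with the default gzip compression, rdata_colnames is a
made-up helper name, and `tr` is swapped in for `strings` so that printable
runs of any length are kept, which lets one-character names such as z and y
through.

```shell
#!/bin/sh
# Sketch only: print the strings stored between the "names" and "row.names"
# markers of a gzip-compressed .RData file that holds a single data frame.
# rdata_colnames is a hypothetical helper name, not part of R or Rtools.
rdata_colnames() {
  gzip -d -c "$1" |
    LC_ALL=C tr -c '[:print:]' '\n' |   # every non-printable byte ends a string
    awk '
      $0 == "row.names"      { exit }      # next attribute reached: names are done
      seen && length($0) > 0 { print }     # keep strings of any length
      $0 == "names"          { seen = 1 }  # start collecting after this marker
    '
}

# Run on the file given as first argument, if any.
if [ "$#" -gt 0 ]; then
  rdata_colnames "$1"
fi
```

Saved under a name such as colnames.sh (also made up), it would be run as
`sh colnames.sh test.RData` from any POSIX shell that has gzip, tr and awk.
The whole file is still decompressed, since the offsets above suggest the
names are stored after the column data, but nothing is loaded into R. Names
of 32 or more characters may pick up a stray leading character, because the
length byte stored in front of them is itself printable.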

On Wed, Mar 16, 2016 at 1:17 PM, Barry Rowlingson
<b.rowlingson at lancaster.ac.uk> wrote:
> You *might* be able to get them from the raw file...
>
> First, I don't quite know what "colnames" of an .RData file means.
> "colnames" are the column names of a matrix (or data frame), so I'll
> assume your .RData file contains exactly one data frame and you want
> its column names.
>
> So let's create one of those:
>
>
> mydataframe = data.frame(mylongnamehere=runif(3),
> anotherlongname=runif(3), z=runif(3), y=runif(3),
> aasdkjhasdkjhaskdj=runif(3))
> save(mydataframe, file="./test.RData")
>
> Now I'm going to use some Unix utilities to see if there are any
> identifiable strings in the file. .RData files are by default
> compressed using `gzip`, so I'll `gunzip` the file and pipe the result
> into `strings`:
>
> $ gunzip -c test.RData | strings -t d
>       0 RDX2
>      35 mydataframe
>     230 names
>     251 mylongnamehere
>     273 anotherlongname
>     314 aasdkjhasdkjhaskdj
>     347     row.names
>     389 class
>     410 data.frame
>
>
>   - that's found the object name (mydataframe) and most of the column
> names. The short ones (z and y) are missing because they are too short
> for `strings` to recognise. But if your names are long enough (4 or
> more chars, I think) they'll show up.
>
>  Of course you'll have to filter them out from all the other string
> output, but they should all appear shortly after the word "names",
> since the colnames of a data frame are the "names" attribute of the
> data.
>
>  If you don't have a Unix or Mac machine handy you can get these
> utilities on Windows via Cygwin but that's another story...
>
>  Barry
>
> On Wed, Mar 16, 2016 at 3:59 PM, Lida Zeighami <lid.zigh at gmail.com> wrote:
>> Hi,
>> I have a huge .RData file and I need just to get the colnames of it. so is
>> there any way to reach the column names without loading or reading the
>> whole file?
>> Since the file is so big and I need to repeat this process several times,
>> so it takes so long to load the file first and then take the colnames!
>>
>> Thanks
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.


