[R] saving and reading csv and rda files

R. Michael Weylandt michael.weylandt at gmail.com
Mon Aug 27 22:04:03 CEST 2012


On Mon, Aug 27, 2012 at 9:56 AM, Alok Bohara, PhD <bohara at unm.edu> wrote:
> Hi:
>
> I am trying to understand the link between ".csv" and  ".rda" files.    Is
> there any easy to follow tutorial on this?
>
> (I could do some of the operations below, but I got lost in the details.)
>
> 1.  Reading .rda file ?
>
> data <- load("profit.rda")         # supposed to have four variable --y x1
> x2 state_name
>
> --how do I find out about the variable names,
> --take the log of y and x1
> --extract y and calculate mean etc..
>
> (I want to use it in lm regression lny = f(lnx1,x2))

I find load() a little dangerous: it restores objects under the names
they had when you saved them, silently overwriting any existing
objects with those names. I'd recommend saveRDS()/readRDS() instead,
since readRDS() returns the object and lets you assign it to whatever
name you like (which is what I think you expect load() to do). That
said,

data <- load(...) doesn't put the variables in data; it restores them
into the calling environment (the global environment, at top level)
under their original names and puts those names, as strings, into
data. I'd guess that if you type data, you'll find something like

> data
[1] "y"          "x1"         "x2"         "state_name"
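
To see this for yourself, here's a minimal sketch. I don't have your
profit.rda, so it fakes one from four made-up vectors (the values and
the names rda_file and scratch are my assumptions), and it loads into
a separate environment so nothing in your workspace gets overwritten:

```r
# Hypothetical stand-in for profit.rda: four vectors saved under their own names.
y <- c(10, 20, 40)
x1 <- c(1, 2, 4)
x2 <- c(0, 1, 0)
state_name <- c("NM", "TX", "AZ")
rda_file <- tempfile(fileext = ".rda")
save(y, x1, x2, state_name, file = rda_file)
rm(y, x1, x2, state_name)

# Load into a scratch environment instead of the workspace.
scratch <- new.env()
loaded_names <- load(rda_file, envir = scratch)

loaded_names             # the object names, as strings (not the data)
ls(scratch)              # the same names, listed from the environment
mean(scratch$y)          # the objects themselves live in 'scratch'
log_y <- log(scratch$y)  # take logs of the restored vectors
log_x1 <- log(scratch$x1)
```

If your file turns out to hold a single data frame instead, names()
and str() on that data frame will show you its columns.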

You probably don't want to use "data" as a name in real code, because
it's the name of an important function used to, not surprisingly, load
data.
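
For comparison, a rough sketch of the saveRDS()/readRDS() route
mentioned above, using a made-up data frame and a temporary file
(both are assumptions for illustration):

```r
# Made-up data standing in for the real thing.
profit <- data.frame(y = c(10, 20, 40), x1 = c(1, 2, 4))

rds_file <- tempfile(fileext = ".rds")
saveRDS(profit, file = rds_file)   # stores one object, without its name

# readRDS() returns the object, so you pick the name when you assign it.
profit_data <- readRDS(rds_file)
all.equal(profit, profit_data)
```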

As far as the model fitting code, that should be covered in most basic
R tutorials, but you're looking for something like

lm(log(y) ~ log(x1) + x2)
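
A slightly fuller sketch, with invented numbers standing in for your
data and assuming the variables sit in a data frame called profit (my
name, not yours):

```r
# Invented values; only the column names follow the original question.
profit <- data.frame(
  y  = c(12, 25, 31, 48, 60, 77),
  x1 = c(1.0, 2.1, 2.9, 4.2, 5.1, 6.3),
  x2 = c(0, 1, 0, 1, 0, 1)
)

# Logs can be taken inside the formula; no need to create lny or lnx1 first.
fit <- lm(log(y) ~ log(x1) + x2, data = profit)
coef(fit)
summary(fit)

# Extracting a single column and summarising it.
mean(profit$y)
```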

Finally, I think you're a little tripped up by the multiple uses of
the term "names". "Name" most commonly refers to the variable name of
an R object, but you seem to be confusing that with column names,
which record what the variables in a data set are. The data set as a
unit has one "name", and each of the recorded variates has its own
column name. When you write a csv file, or read one with header =
TRUE, the text keeps the column names, not the R object names.
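
A small sketch of the distinction, using a made-up data frame and a
temporary file (the values and the object names poverty and anything
are assumptions):

```r
# The object is called 'poverty'; its columns are Q, L, K, country_name.
poverty <- data.frame(Q = c(5, 7), L = c(2, 3), K = c(1, 4),
                      country_name = c("Nepal", "Peru"))

csv_file <- tempfile(fileext = ".csv")
write.csv(poverty, file = csv_file, row.names = FALSE)

# The first line of the file holds the column names; "poverty" appears nowhere.
header_line <- readLines(csv_file, n = 1)
header_line

# Read it back under a completely different object name.
anything <- read.csv(csv_file, header = TRUE)
names(anything)
```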


>
> ****************
>
> 2. How could I save this profit.rda file as a csv file  with the variable
> names attached?
> I tried doing this:
>
> profit_data <- load("profit.rda")

Again, this doesn't work because load() doesn't return the loaded
objects, only their names.

>
> #Could I do this?
> write.csv(profit_data, file="profit.csv", col.names=TRUE)
> data2 <- read.csv("profit.csv", head = TRUE)

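This will run, but not usefully: profit_data is only the character
vector of names returned by load(), so that is all that gets written
and read back, not the contents of profit.rda. (write.csv() also
ignores col.names, with a warning; it always writes the header.)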
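
What you probably want is to write out the loaded objects themselves.
A sketch, again assuming the file holds four equal-length vectors (I
fake the file here; obj_names, scratch and the values are mine):

```r
# Fake an .rda file containing four vectors.
y <- c(10, 20, 40)
x1 <- c(1, 2, 4)
x2 <- c(0, 1, 0)
state_name <- c("NM", "TX", "AZ")
rda_file <- tempfile(fileext = ".rda")
save(y, x1, x2, state_name, file = rda_file)

# Load into a scratch environment and collect the objects into a data frame.
scratch <- new.env()
obj_names <- load(rda_file, envir = scratch)
profit_data <- as.data.frame(mget(obj_names, envir = scratch))

# Write the data (not the names vector) to csv, then read it back.
csv_file <- tempfile(fileext = ".csv")
write.csv(profit_data, file = csv_file, row.names = FALSE)
data2 <- read.csv(csv_file, header = TRUE)
names(data2)
```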

>
> # saving .rda file without the header?
> write.csv(profit_data, file="profit2.csv", col.names=FALSE)
>
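
If you really do want a csv with no header line, write.csv() won't do
it, since it ignores col.names. One way, sketched with a made-up data
frame and a temporary file, is write.table() with a comma separator:

```r
# Made-up data frame standing in for the real data.
profit_data <- data.frame(y = c(10, 20), x1 = c(1, 2))

csv_file <- tempfile(fileext = ".csv")
write.table(profit_data, file = csv_file, sep = ",",
            col.names = FALSE, row.names = FALSE)

file_lines <- readLines(csv_file)
file_lines                      # data rows only, no header

# Reading it back needs header = FALSE; R then invents names V1, V2, ...
no_header <- read.csv(csv_file, header = FALSE)
names(no_header)
```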



> ***********
>
> 3. Creating a .rda file  from a csv file  using the save command
>
> data2 <- read.csv("poverty.csv", head = T, sep = ",")   # poverty.csv file
> has 4 variables -- Q L K  country_name
> save(data2, file="poverty.rda")
>
> --How do I  attach names from the csv file to this .rda file?

Take a look at data2: it probably has "Q","L","K","country_name" as
_column_ names (not the same as object names). If it does, the
save/load cycle will preserve them.
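
A quick way to check, sketched with a tiny made-up poverty.csv written
to a temporary file (the rows are invented):

```r
# Fake a small csv with the four columns from the question.
csv_file <- tempfile(fileext = ".csv")
writeLines(c("Q,L,K,country_name",
             "5,2,1,Nepal",
             "7,3,4,Peru"), csv_file)

data2 <- read.csv(csv_file, header = TRUE)
names(data2)                     # column names come from the csv header

rda_file <- tempfile(fileext = ".rda")
save(data2, file = rda_file)
rm(data2)

load(rda_file)                   # restores an object called data2
names(data2)                     # same column names as before
```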

Cheers,
Michael

>
>
>
> Best,
> Alok Bohara
> UNM
>



