[R] dataframe question

David Winsemius dwinsemius at comcast.net
Mon Feb 8 05:38:22 CET 2010


On Feb 7, 2010, at 11:15 PM, Vadlamani, Satish {FLNA} wrote:

> David:
> Thanks for the idea. Both the one that you suggested and the one  
> that Bill Venables suggested are very good. Unfortunately, this  
> statement is creating out of memory issues like below (system  
> limitations).
>
> When I had padded white space before the number, read.csv.sql is  
> correctly treating it as a factor. I am going to take out the  
> padding so that it treats it as numeric and then I can proceed with  
> further steps.
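
A side note on the factor issue, as a minimal sketch with a made-up column (f is hypothetical, not taken from the data in this thread): calling as.numeric() directly on a factor returns the internal level codes, so converting through as.character() or the levels is the usual way to recover the values.

```r
# Hypothetical column: padded numbers that ended up as a factor
f <- factor(c(" 10", " 200", " 10", " 35"))

codes  <- as.numeric(f)                 # internal level codes, not the values
values <- as.numeric(as.character(f))   # the values themselves
same   <- as.numeric(levels(f))[f]      # same values, each level converted once

print(codes)
print(values)
```

The same applies inside a loop over columns: if a column is a factor, wrap it in as.character() before as.numeric().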

Idea: Write the dataframe to a csv or tab-delimited file, and save any
other useful data as well. Exit without saving the workspace. Restart
and read the data in with the correct format using the colClasses
argument.

three_wk_out <- read.csv(file= "somename.csv", colClasses =  
rep("numeric", 209) )
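
The round trip might look like the sketch below. The two-column data frame and the temporary file are stand-ins I made up so that it runs as is; the real call would use your own file and rep("numeric", 209). Using quote = FALSE is my choice here so that the values are written unquoted, since as far as I know quoted fields are not accepted for numeric columns.

```r
# Small stand-in for three_wk_out (the real one has columns wk1..wk209)
three_wk_out <- data.frame(wk1 = c("1", "22"), wk2 = c("3", "4"),
                           stringsAsFactors = TRUE)

csv_file <- tempfile(fileext = ".csv")   # hypothetical file location
write.csv(three_wk_out, file = csv_file, row.names = FALSE, quote = FALSE)

# ... quit R without saving the workspace, then start a fresh session ...

# Read the file back, declaring every column numeric up front
three_wk_out <- read.csv(file = csv_file,
                         colClasses = rep("numeric", 2))
str(three_wk_out)
```

Declaring the classes up front avoids building the factor versions of the columns at all, which is where the extra memory goes.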

Of course if it's that big, you may have problems doing anything  
useful with it in the space you have available. Details of your  
machine would be helpful, especially if you are using one of the  
Windows variants and have 4 GB of physical memory. There is information  
about this condition in the R-Win FAQ.
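
For a rough idea of whether the data can fit, here is a back-of-the-envelope estimate. The row count is purely hypothetical (the thread does not state it); the 8 bytes per value is the size of an R double.

```r
n_rows <- 1e6            # hypothetical row count, not given in the thread
n_cols <- 209            # wk1 .. wk209
bytes_per_double <- 8    # storage for one numeric value

# Approximate size of the numeric data alone, in megabytes
est_mb <- n_rows * n_cols * bytes_per_double / 2^20
print(est_mb)

# Sanity check against a small real object
x <- numeric(1000)
print(object.size(x), units = "Kb")
```

With that assumed row count the numeric data alone would already be larger than the 1535Mb limit in the warning, before counting any copies R makes during conversion.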

-- 
David.


>
> Satish
>
> Out of memory warning
> Reached total allocation of 1535Mb: see help(memory.size)
> 34: In ans[[i]] <- tmp :
>  Reached total allocation of 1535Mb: see help(memory.size)
>
>>> Bill Venables' suggestion below
>
> week_list <- paste("wk", 1:209, sep="")
> ### no need for c(...)
>
> for(week in week_list)
> 	three_wk_out[[week]] <- as.numeric(three_wk_out[[week]])
>
> ### no need for '{...}'
>
> Bill Venables
> CSIRO/CMIS Cleveland Laboratories
>
>
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net]
> Sent: Sunday, February 07, 2010 8:51 PM
> To: Vadlamani, Satish {FLNA}
> Cc: r-help at r-project.org help
> Subject: Re: [R] dataframe question
>
>
> On Feb 7, 2010, at 8:14 PM, David Winsemius wrote:
>
>>
>> On Feb 7, 2010, at 7:51 PM, Vadlamani, Satish {FLNA} wrote:
>>
>>> Folks:
>>> Good day. Please see the code below. three_wk_out is a dataframe
>>> with columns wk1 through wk209. I want to change the format of the
>>> columns. I am trying the code below but it does not work.  I need
>>> $week in the for loop interpreted as wk1, wk2, etc. Could you
>>> please help? Thanks.
>>> Satish
>>>
>>> R code below
>>> week_list <- paste("wk",c(1:209),sep="")
>>
>>
>> Or more "functionally":
>>
>> three_wk_out <- as.data.frame( lapply(three_wk_out, some_function) )
>
> Or if you wanted to just change the particular columns that matched
> the "wk" pattern:
>
> idx <- grep("wk", names(three_wk_out))
> three_wk_out[, idx ] <- apply( three_wk_out[, idx ], 2, as.numeric)
>
>
> (I probably should have used apply( ___ , 2,  fn) in the prior effort
> rather than coercing a list back to a dataframe.)
>
>
>>
>> E.g.:
>>
>>> df
>> a b c x
>> 1 1 0 0 1
>> 2 2 3 2 4
>> 3 1 2 1 5
>> 4 2 0 3 2
>>
>>> df <- as.data.frame(lapply(df, "^", 2))
>>> df
>>   a b c  x
>> 1 1 0 0  1
>> 2 4 9 4 16
>> 3 1 4 1 25
>> 4 4 0 9  4
>>
>>
>>> for (week in week_list)
>>> {
>>>      three_wk_out$week <- as.numeric(three_wk_out$week)
>>> }
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>


