[R] OK - I got the data - now what? :-)

Mark Knecht markknecht at gmail.com
Mon Jul 6 01:58:38 CEST 2009


On Sun, Jul 5, 2009 at 1:44 PM, hadley wickham<h.wickham at gmail.com> wrote:
>>   I think the root cause of a number of my coding problems in R right
>> now is my lack of skills in reading and grabbing portions of the data
>> out of arrays. I'm new at this. (And not a programmer) I need to find
>> some good examples to read and test on that subject. If I could locate
>> which column was called C1, then read row 3 from C1 up to the last
>> value before a 0, I'd have proper data to plot for one line. Repeat as
>> necessary through the array and I get all the lines. Doing the lines
>> one at a time should allow me the opportunity to apply color or not
>> plot based on values in the first few columns.
>>
>> Thanks,
>> Mark
>>
>> test <- data.frame(A=1:10, B=100, C1=runif(10), C2=runif(10),
>> C3=runif(10), C4=runif(10), C5=runif(10), C6=runif(10))
>> test<-round(test,2)
>>
>> #Make array ragged
>> test$C3[2]<-0;test$C4[2]<-0;test$C5[2]<-0;test$C6[2]<-0
>> test$C4[3]<-0;test$C5[3]<-0;test$C6[3]<-0
>> test$C6[7]<-0
>> test$C4[8]<-0;test$C5[8]<-0;test$C6[8]<-0
>>
>> #Print array
>> test
>
> Are the zeros always going to be arranged like this? i.e. for each
> experiment, is there a point at which all later values are zero?  If
> so, the following is a much simpler way of getting to the core of your
> data, without fussing with overly complicated matrix indexing:
>
> library(reshape)
> testm <- melt(test, id = c("A", "B"))
> subset(testm, value > 0)
>
> I suspect you will also find this form easier to plot and analyse.
>
> Hadley
>
> --
> http://had.co.nz/
>

Hi Hadley,
   I wanted to look at reshape.

   Yes, there exists a point in each row (unless I get to the end with
all numbers) where I get to a zero and everything to the right is
zero.
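
Given that layout, pulling out the usable part of one row can be sketched in base R as below. This is only a minimal sketch: the small data frame uses made-up values in the same shape as `test`, and `row_values` and its `first` argument are hypothetical names, not part of any package.

```r
# Small fixed example in the same shape as 'test' (values are made up).
test <- data.frame(A = 1:3, B = 100,
                   C1 = c(0.50, 0.20, 0.90),
                   C2 = c(0.40, 0.70, 0.10),
                   C3 = c(0.30, 0.00, 0.60),
                   C4 = c(0.80, 0.00, 0.00))

# Hypothetical helper: values of one row, from column 'first' up to
# (but not including) the first zero.
row_values <- function(df, row, first = "C1") {
  start <- match(first, names(df))          # position of the first data column
  vals  <- unlist(df[row, start:ncol(df)])  # one row as a named numeric vector
  zero  <- match(0, vals)                   # index of the first zero, NA if none
  if (is.na(zero)) vals else vals[seq_len(zero - 1)]
}

row_values(test, 2)  # the two values before the first zero in row 2
row_values(test, 1)  # row 1 has no zeros, so all four values come back
```

Looking up the column by name with `match()` means the ID columns in front can change without breaking the indexing.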

   I'm looking at reshape. It's interesting but I clearly don't
understand it yet, so I'm reading your "Reshaping data with the
reshape package" from 11/07. Interesting.

   I know so little about R that I'm sort of drowning at this point,
so it's hard for me to understand why this would make plotting
easier. Analysis, possibly. That's just the way it goes when you get
started with something new.
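
One way to see what the long form buys is the sketch below. It builds the same kind of table that `melt(test, id = c("A", "B"))` would give, but with base R only, so it runs without the reshape package; that substitution is an assumption made for the sake of a self-contained example, and the data values are made up. The column names `variable` and `value` follow what `melt` produces.

```r
# Small fixed example in the same shape as 'test' (values are made up).
test <- data.frame(A = 1:3, B = 100,
                   C1 = c(0.50, 0.20, 0.90),
                   C2 = c(0.40, 0.70, 0.10),
                   C3 = c(0.30, 0.00, 0.60),
                   C4 = c(0.80, 0.00, 0.00))

data_cols <- c("C1", "C2", "C3", "C4")

# Long form: one line per (row ID, column) pair instead of one wide row.
long <- data.frame(A        = rep(test$A, times = length(data_cols)),
                   variable = rep(data_cols, each = nrow(test)),
                   value    = unlist(test[data_cols], use.names = FALSE))

# Dropping the padding zeros is now a single filter ...
long <- subset(long, value > 0)

# ... and "one line per original row" is a single split on the ID.
by_row <- split(long$value, long$A)
by_row[["2"]]  # the values that make up the line for A == 2
```

Each element of `by_row` is a vector that could be handed to `lines()` as is, which is roughly why the long form tends to be easier to plot.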

   In reshape lingo I think I have IDs. They cover things like time,
date, success/failure and a few other things of interest. Once the
data starts on a row it is all data from there on to the end of the
row.

   My initial goal is to make a line plot of the data on a single row.
All the data points should connect together. There is no real
interaction planned with data on other rows, at least at this time.
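
That single-row line plot might look something like the following in base graphics. It is a sketch under assumptions: the data values are made up, `plot_row` is a hypothetical helper name, and it assumes the row has at least one value before its first zero.

```r
# Small fixed example in the same shape as 'test' (values are made up).
test <- data.frame(A = 1:3, B = 100,
                   C1 = c(0.50, 0.20, 0.90),
                   C2 = c(0.40, 0.70, 0.10),
                   C3 = c(0.30, 0.00, 0.60),
                   C4 = c(0.80, 0.00, 0.00))

# Hypothetical helper: draw one row as connected points, stopping at the
# first zero. Extra arguments (e.g. col = "red") are passed on to plot().
plot_row <- function(df, row, first = "C1", ...) {
  vals <- unlist(df[row, match(first, names(df)):ncol(df)])
  keep <- cumsum(vals == 0) == 0   # TRUE until the first zero appears
  y <- vals[keep]
  plot(seq_along(y), y, type = "b", xaxt = "n",
       xlab = "column", ylab = "value", main = paste("Row", row), ...)
  axis(1, at = seq_along(y), labels = names(y))  # label points C1, C2, ...
  invisible(y)                     # return the plotted values quietly
}
```

Calling `plot_row(test, 3)` draws the three points of row 3 joined by a line; `plot_row(test, 3, col = "red")` does the same in red, which is one place a color chosen from the ID columns could be passed in.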

   Thanks for the pointers and the code stub. I'll be looking at this.

Cheers,
Mark



