[R] Read data in sequences

Peter Ehlers ehlers at ucalgary.ca
Sat Apr 10 19:17:27 CEST 2010


On 2010-04-09 10:04, Dieter Menne wrote:
>
>
> RockO wrote:
>>
>> I searched the list for a solution, but could not find one. I
>> would like to read a .txt file with, let's say, three variables, two of
>> which have repeated values across a number of columns.
>>
>> The variables: Treat, x1, x2.
>> The values:
>> A 2.5 3.4 2.7 5.6 5.7 5.4 10.1 9.4 ...
>> B 5.3 5.4 6.5 7.5 1.3 4.5 10.5 4.1 ...
>> ...
>>
>> In the first column, the letters represent the variable "Treat", and the
>> sequence of numbers on the same line represents pairs of values for "x1" and
>> "x2".
>>
>>
>
> Looks like SAS is quite elegant here (don't kill me, I could not afford
> to use SAS; R has saved my retirement fund).
>
> I would first read it in "as usual", and do the reformatting later.
>
> library(reshape)
> wide = read.table("wideseq.txt",sep=" ")
> # renames columns
> names(wide) =  c("varname",rep(c("x1","x2"),ncol(wide)%/%2))
> str(wide)
> melt(wide)
>
> Now you have the long format, which is not exactly what you want, but
> typically much more useful in R than the format you require. You might use
> one of the functions in the reshape package to get your format.

Almost. But you're a bit too clever with the names(): the column names
are no longer unique, so melt() will just re-read the first x1, x2 values.

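library(reshape)
# read the file as above, but skip the names() step
wide <- read.table("wideseq.txt", sep = " ")
dm <- melt(wide)
# replace dm$variable with the appropriate factor; melt() stacks the
# columns one after another, so each label covers nrow(wide) rows
dm$variable <- gl(2, nrow(wide), nrow(dm), c('x1', 'x2'))
# choose names for the columns
names(dm) <- c('Treat', 'x', 'value')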
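
For anyone who wants to try the idea without the data file or the reshape
package, here is a minimal base-R sketch. It is only an illustration, not
the code above: it swaps melt() for plain vector stacking, the two rows are
the values shown in the original post (cut off where the post has "..."),
and the names txt, vals, n_trt, n_col, pair and long are made up for this
example. It assumes an even number of value columns.

```r
# Example rows copied from the original post (only the values shown there).
txt <- "A 2.5 3.4 2.7 5.6 5.7 5.4 10.1 9.4
B 5.3 5.4 6.5 7.5 1.3 4.5 10.5 4.1"

wide <- read.table(text = txt, sep = " ")

vals  <- as.matrix(wide[, -1])   # numeric part, one row per treatment
n_trt <- nrow(vals)
n_col <- ncol(vals)

# Stack the columns one after another, so that each block of n_trt rows
# comes from a single original column.
long <- data.frame(
  Treat = rep(wide[[1]], times = n_col),
  # columns 1-2 form pair 1, columns 3-4 form pair 2, and so on
  pair  = rep((seq_len(n_col) + 1) %/% 2, each = n_trt),
  # odd columns hold x1, even columns hold x2
  x     = gl(2, n_trt, n_trt * n_col, labels = c("x1", "x2")),
  value = as.vector(vals)
)

str(long)
```

The extra pair column is a design choice of this sketch: it records which
(x1, x2) pair each value came from, which the x1/x2 label alone does not.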

I agree that this 'long' format is usually most useful for
further analysis.
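
If one row per (x1, x2) pair is wanted after all, as in the original
question, something along these lines might do. Again this is only a
base-R sketch under assumptions: the data are the values shown in the
original post, the number of value columns is taken to be even, and the
names txt, vals, odd, even and paired are invented for the example.

```r
# Example rows copied from the original post (only the values shown there).
txt <- "A 2.5 3.4 2.7 5.6 5.7 5.4 10.1 9.4
B 5.3 5.4 6.5 7.5 1.3 4.5 10.5 4.1"

wide <- read.table(text = txt, sep = " ")
vals <- as.matrix(wide[, -1])          # numeric part, one row per treatment

odd  <- seq(1, ncol(vals), by = 2)     # columns holding x1
even <- seq(2, ncol(vals), by = 2)     # columns holding x2

# Transposing before as.vector() keeps all pairs of one treatment together.
paired <- data.frame(
  Treat = rep(wide[[1]], each = length(odd)),
  x1    = as.vector(t(vals[, odd,  drop = FALSE])),
  x2    = as.vector(t(vals[, even, drop = FALSE]))
)

print(paired)
```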

  -Peter

-- 
Peter Ehlers
University of Calgary


