[R] read.table input array

David Winsemius dwinsemius at comcast.net
Sat Oct 23 07:39:44 CEST 2010


On Oct 22, 2010, at 8:40 PM, Balpo wrote:

> Hello Jim
>
> How can I ensure this when reading it from the file?
> The thing is, I cannot use textConnection(), because the real
> file has millions of rows.

The advice is not to use textConnection(); that was only there to make
the example self-contained. Put a file="filename" argument in its place
and keep the other read.table() arguments, in particular as.is = TRUE,
which is what stops the last column from being read as a factor.
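
A minimal sketch of the same call reading from a file rather than a
text connection. The temporary file and the column name avglen are
stand-ins chosen for illustration; the data rows are the ones from
Jim's example quoted below.

```r
# Hypothetical stand-in for the real file: write a few rows to a temp file.
path <- tempfile(fileext = ".dat")
writeLines(c(
  "ktot attractors pctstatesinattractors t lengths",
  "1.0 2.0 3.8146973E-4 17 c(2,2)",
  "1.0 1.0 5.722046E-4 28 c(2)",
  "1.0 2.0 9.536743E-4 18 c(2,2)",
  "1.0 1.0 0.0010490417 14 c(1)"
), path)

# as.is = TRUE keeps the 'lengths' column as character instead of factor.
x <- read.table(file = path, header = TRUE, as.is = TRUE)

# Parse each string such as "c(2,2)" as an R expression and take its mean.
x$avglen <- sapply(x$lengths, function(a) mean(eval(parse(text = a))),
                   USE.NAMES = FALSE)
str(x)
```

If you would rather keep an explicit colClasses vector, using
"character" instead of "factor" for the fifth column should have the
same effect, and spelling out the column classes generally helps
read.table() on large files.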

-- 
David.

>
> Thank you,
>
> Balpo
>
> On 22/10/10 21:34, jim holtman wrote:
>> You need to make sure that your data is read in as characters and
>> not factors:
>>
>>> x<- read.table(textConnection("ktot attractors pctstatesinattractors t lengths
>> + 1.0 2.0 3.8146973E-4 17 c(2,2)
>> + 1.0 1.0 5.722046E-4 28 c(2)
>> + 1.0 2.0 9.536743E-4 18 c(2,2)
>> + 1.0 1.0 0.0010490417 14 c(1)"), as.is = TRUE, header = TRUE)
>>> closeAllConnections()
>>> str(x)
>> 'data.frame':   4 obs. of  5 variables:
>>  $ ktot                 : num  1 1 1 1
>>  $ attractors           : num  2 1 2 1
>>  $ pctstatesinattractors: num  0.000381 0.000572 0.000954 0.001049
>>  $ t                    : int  17 28 18 14
>>  $ lengths              : chr  "c(2,2)" "c(2)" "c(2,2)" "c(1)"
>>> x
>>   ktot attractors pctstatesinattractors  t lengths
>> 1    1          2          0.0003814697 17  c(2,2)
>> 2    1          1          0.0005722046 28    c(2)
>> 3    1          2          0.0009536743 18  c(2,2)
>> 4    1          1          0.0010490417 14    c(1)
>>> x$varList<- lapply(x$lengths, function(a) mean(eval(parse(text=a))))
>>> x
>>   ktot attractors pctstatesinattractors  t lengths varList
>> 1    1          2          0.0003814697 17  c(2,2)       2
>> 2    1          1          0.0005722046 28    c(2)       2
>> 3    1          2          0.0009536743 18  c(2,2)       2
>> 4    1          1          0.0010490417 14    c(1)       1
>>
>> On Fri, Oct 22, 2010 at 10:19 PM, Balpo<balpo at gmx.net>  wrote:
>>> Hello again Jim (and everyone)
>>> I am having a weird problem here with the same parsing thing.
>>> For example, for the first row I have the following 5 columns.
>>>
>>> 1.0    2.0        3.8146973E-4        17    c(2,2)
>>>
>>> I need to convert that c(2,2) into a list and get its mean, in this
>>> particular case mean=2. My program does:
>>>
>>> t1<- read.table(file="file.dat", header=T, colClasses=c("numeric",
>>> "numeric", "numeric", "numeric", "factor"))
>>> t1$lengthz<- lapply(t1$lengths, function(a) eval(parse(text=a))) # As Jim taught me
>>> t1$avglen<- as.vector(mode="numeric", lapply(t1$lengthz, function(i)
>>> mean(i)))
>>>
>>> but the 6th column is strangely getting 780 instead of 2.
>>> This solution used to work! :-(
>>> Do you have any idea about what is going on?
>>>
>>> I attach file.dat.
>>>
>>> Thank you for your support.
>>>
>>> Balpo
>>>
>>>
>>> On 19/07/10 16:38, Balpo wrote:
>>>>  Thank you a lot, Jim.
>>>> Issue solved.
>>>>
>>>> Balpo
>>>>
>>>> On 16/07/10 11:27, jim holtman wrote:
>>>>> Here is a way of creating a separate list of variable length  
>>>>> vectors
>>>>> that you can use in your processing:
>>>>>
>>>>>> # read into a dataframe
>>>>>> x<- read.table(textConnection("A    B    C    T    Lengths
>>>>> + 1    4.0    0.0015258789    18    c(1,2,3)
>>>>> + 1    4.0    0.0015258789    18    c(1,2,6,7,8,3)
>>>>> + 1    4.0    0.0015258789    18    c(1,2,3,1,2,3,4,5,6,7,9)
>>>>> + 1    4.0    0.0015258789    18    c(1,2,3)
>>>>> + 1    1.0    0.0017166138    24    c(1,1,4)"), header=TRUE)
>>>>>> # create a  'list' with the variable length vectors
>>>>>> # assuming that the "Lengths" are legal R expressions using 'c'
>>>>>> x$varList<- lapply(x$Lengths, function(a) eval(parse(text=a)))
>>>>>>
>>>>>> x
>>>>>   A B           C  T                  Lengths                         varList
>>>>> 1 1 4 0.001525879 18                 c(1,2,3)                         1, 2, 3
>>>>> 2 1 4 0.001525879 18           c(1,2,6,7,8,3)                1, 2, 6, 7, 8, 3
>>>>> 3 1 4 0.001525879 18 c(1,2,3,1,2,3,4,5,6,7,9) 1, 2, 3, 1, 2, 3, 4, 5, 6, 7, 9
>>>>> 4 1 4 0.001525879 18                 c(1,2,3)                         1, 2, 3
>>>>> 5 1 1 0.001716614 24                 c(1,1,4)                         1, 1, 4
>>>>>> str(x)
>>>>> 'data.frame':   5 obs. of  6 variables:
>>>>>  $ A      : int  1 1 1 1 1
>>>>>  $ B      : num  4 4 4 4 1
>>>>>  $ C      : num  0.00153 0.00153 0.00153 0.00153 0.00172
>>>>>  $ T      : int  18 18 18 18 24
>>>>>  $ Lengths: Factor w/ 4 levels "c(1,1,4)","c(1,2,3)",..: 2 4 3 2 1
>>>>>  $ varList:List of 5
>>>>>   ..$ : num  1 2 3
>>>>>   ..$ : num  1 2 6 7 8 3
>>>>>   ..$ : num  1 2 3 1 2 3 4 5 6 7 ...
>>>>>   ..$ : num  1 2 3
>>>>>   ..$ : num  1 1 4
>>>>> On Fri, Jul 16, 2010 at 10:51 AM, Balpo<balpo at gmx.net>    wrote:
>>>>>>  Hello to all!
>>>>>> I am new with R and I need your help.
>>>>>> I'm trying to read a file whose contents are similar to this:
>>>>>> A    B    C    T    Lengths
>>>>>> 1    4.0    0.0015258789    18    c(1,2,3)
>>>>>> 1    1.0    0.0017166138    24    c(1,1,4)
>>>>>>
>>>>>> So all the columns are numeric values, except Lengths, which is
>>>>>> supposed to be a variable-length array of integers.
>>>>>> How can I make R read them as arrays of integers? Or otherwise,
>>>>>> convert the character array to an array of integers.
>>>>>> When I read the file, I do it like this
>>>>>> t1 = read.table(file=paste("./borrar.dat",sep=""), header=T,
>>>>>> colClasses=c("numeric", "numeric", "numeric", "numeric",  
>>>>>> "array"))
>>>>>> But the 5th column is treated as an array of characters, and
>>>>>> when trying to convert it to another class of data, I either
>>>>>> get two strings "c(1,2,3)" and "c(1,1,4)" or, using a toRaw
>>>>>> converter, I get the corresponding ASCII ¿? values.
>>>>>> Should the input be modified in order to be able to read it as
>>>>>> an array of integers?
>>>>>>
>>>>>> Thank you for your help.
>>>>>> Balpo
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible  
>>>>>> code.
>>>>>>
>>>>>
>>
>>
>