[R] Problem in 'Apply' function: does anybody have other solution

David Winsemius dwinsemius at comcast.net
Wed Jun 17 17:02:37 CEST 2009


On Jun 17, 2009, at 9:27 AM, jim holtman wrote:

> Do an 'str' of your object.  It looks like one of the columns is probably
> character/factor since there are quotes around the 'numbers'.  You can also
> explicitly convert the offending columns to numeric if you want to.  Also use
> colClasses on the read.csv to define the class of the data in each column.
> This should show you where the error is.
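
The check suggested above might be sketched as follows. The small data frame is made up for illustration (it is not the poster's data) and `bad` is a hypothetical helper name. The idea is to list the class of each column and then pick out the entries that cannot be parsed as numbers.

```r
# Toy data (hypothetical): column b holds one entry that is not a number
d <- data.frame(a = c(1, 2, 3),
                b = c("10", "2x", "30"),
                stringsAsFactors = FALSE)

# Class of every column; anything other than numeric/integer is suspect
sapply(d, class)

# Entries that are not NA but cannot be parsed as numbers
bad <- function(x) {
  x <- as.character(x)
  x[!is.na(x) & is.na(suppressWarnings(as.numeric(x)))]
}
lapply(d, bad)
```

Any column that reports a non-empty result from `bad` is a likely reason for read.csv having produced a character or factor column.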

One function that might be of use is data.matrix, which converts every  
column of a data frame to numeric (factors are replaced by their  
internal integer codes). I hope this is not beating a dead horse, but  
see if these examples are helpful in any way:

 > ?data.matrix
 > df <- data.frame(x=1:10,y=as.character(1:10))
 > df
     x  y
1   1  1
2   2  2
3   3  3
4   4  4
5   5  5
6   6  6
7   7  7
8   8  8
9   9  9
10 10 10    # .... not all is as it seems
 > apply(df,1,I)
   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
x " 1" " 2" " 3" " 4" " 5" " 6" " 7" " 8" " 9" "10"
y "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"
 > df2 <- data.frame(x=1:10,y=1:10)
 > apply(df2,1,I)
   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
x    1    2    3    4    5    6    7    8    9    10
y    1    2    3    4    5    6    7    8    9    10
 > str(df)
'data.frame':	10 obs. of  2 variables:
  $ x: int  1 2 3 4 5 6 7 8 9 10
  $ y: Factor w/ 10 levels "1","10","2","3",..: 1 3 4 5 6 7 8 9 10 2

# so that's weird. y isn't even a character vector !?!? Such are the
# strange beasts called factors.

# solution? or at least one strategy

 > apply(data.matrix(df), 1, I)
   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
x    1    2    3    4    5    6    7    8    9    10
y    1    3    4    5    6    7    8    9   10     2
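
Note that y comes back as 1 3 4 ... 2 in the last result: data.matrix returns the internal integer codes of a factor, not the numbers its labels spell out. A minimal sketch of one way to recover the actual values, assuming every label is a valid number, is to go through as.character first. Here `df3` is only an illustrative name, and stringsAsFactors = TRUE is spelled out so that y is a factor even in R versions where that is no longer the default.

```r
df <- data.frame(x = 1:10, y = as.character(1:10), stringsAsFactors = TRUE)

# Convert factor (or character) columns via their labels, not their codes
df3 <- df
df3[] <- lapply(df3, function(col) {
  if (is.factor(col) || is.character(col)) as.numeric(as.character(col)) else col
})

str(df3)            # both columns are now numeric
apply(df3, 1, max)  # row-wise functions now work on numbers
```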

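For the original question (the per-row range, max minus min, ignoring NA values), one possible sketch converts every column to numeric through as.character and then applies max and min by row. The data frame `sp` is invented for illustration, with one column stored as text the way a stray non-numeric entry in a CSV file would leave it; `to_num` is a hypothetical helper.

```r
# Made-up stand-in for the poster's data; V3 was read as character
sp <- data.frame(V2 = c(57543, 45, 14),
                 V3 = c("55938", "41", NA),
                 V4 = c(47175, NA, 14),
                 stringsAsFactors = FALSE)

# Convert via labels so that factor columns are handled correctly as well
to_num <- function(col) as.numeric(as.character(col))
m <- sapply(sp, to_num)   # numeric matrix, one column per variable

# Row-wise range, ignoring NA values
Range.j <- apply(m, 1, max, na.rm = TRUE) - apply(m, 1, min, na.rm = TRUE)
Range.j
```

Entries that cannot be parsed become NA (with a warning) and are then skipped by na.rm = TRUE, so it is worth checking them before trusting the result.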

>
> On Wed, Jun 17, 2009 at 7:41 AM, suparna mitra <
> mitra at informatik.uni-tuebingen.de> wrote:
>
>> Dear All,
>> Just to add some more lines to my previous query, I am writing this. I was
>> checking with several datasets. In the cases where the apply function is
>> working, part of the result looks like:
>>
>>> apply(Species.all[1:10,],1,max,na.rm=TRUE)
>>   1     2     3     4     5     6     7     8     9    10
>> 22392    45    45    45    14    25    25   753   101    10
>>
>> and with the problematic data it looks like:
>>
>>> apply(Species.all[1:10,],1,max,na.rm=TRUE)
>>    1      2      3      4      5      6      7      8      9     10
>> "7286" "3258" "1024"  " 45"  " 45"  " 45"   " 9"  " 25"  " 25" " 753"
>>
>> But all my datasets are in CSV format. I am reading those datasets with
>> read.csv or read.delim.
>> Can anybody please suggest how to solve this problem?
>> Thanks and regards,
>> Suparna.
>>
>>
>> On Wed, Jun 17, 2009 at 1:14 PM, suparna mitra <suparna.mitra at googlemail.com> wrote:
>>
>>> Dear All,
>>> I am having some problem with the apply function.
>>> I have some data like below. I want to get a range vector (which is the
>>> max-min value for each row, ignoring NA values).
>>>> Species.all[1:10,]
>>>       V2     V3     V4     V5    V6   V7    V8   V9
>>> 1   57543  55938  47175  54922 36032 5785 29497 7286
>>> 2   42364  40472  29887  40107 19723 2691 14445 3258
>>> 3   19461  19646  18538  22392  6744  794  4919 1024
>>> 4      45     41     28     34    33   NA    26   NA
>>> 5      45     41     28     34    33   NA    26   NA
>>> 6      45     41     28     34    33   NA    26   NA
>>> 7      14      9     14     14     7   NA    10   NA
>>> 8      20     25     10     15    21   NA    10   NA
>>> 9      20     25     10     15    21   NA    10   NA
>>> 10    578    566    478    753   361  150   262  170
>>>> dim(Species.all)
>>> [1] 1862    8
>>>
>>> I used the apply function like below. I used this same function for some
>>> other data, and there it worked. But here it's not working (giving an error
>>> message).
>>>
>>>> Range.j=apply(Species.all,1,max,na.rm = TRUE)-apply(Species.all,1,min,na.rm = TRUE)
>>> Error in apply(Species.all, 1, max, na.rm = TRUE) - apply(Species.all,  :
>>>    non-numeric argument to binary operator
>>>
>>> When I tried to check, you can see from the steps that it is giving totally
>>> wrong results.
>>>
>>>> apply(Species.all[1:10,],1,max)
>>>     1      2      3      4      5      6      7      8      9     10
>>> "7286" "3258" "1024"     NA     NA     NA     NA     NA     NA "  753"
>>>> apply(Species.all[1:10,],1,min)
>>>       1        2        3        4        5        6        7        8        9       10
>>> " 47175" " 29887" " 18538"       NA       NA       NA       NA       NA       NA  "  262"
>>>
>>>
>>> The main problem is that this code works in some cases, but not in all. Does
>>> anybody have an idea why this is so? Or can anyone show me some other way to
>>> do the same?
>>> Thanks in advance,
>>> With best regards,
>>> Suparna
>>>
>


David Winsemius, MD
Heritage Laboratories
West Hartford, CT



