[R] row-wise conditional update in dataframe

Jon Erik Ween jween at klaru-baycrest.on.ca
Tue Jan 22 03:10:58 CET 2008


Thanks Jim

That got me there. I suppose R prefers absolute field references in
scripts rather than macro substitution of field names, as you would
do in Perl or shell scripts?
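
On referring to fields by name: R can index data frame columns with a character vector, so names held in a vector such as "vars" can be used directly, without any macro substitution. A minimal sketch, assuming a small made-up data frame (the column names and values here are hypothetical and not taken from the real 490 x 221 data):

```r
# Hypothetical data: three task columns plus the target column.
Dataset <- data.frame(
  id         = 1:4,
  field_a    = c(1, NA, 3, NA),
  field_b    = c(NA, NA, 2, 5),
  field_c    = c(7, 8, 9, NA),
  ABCtaskNum = 0
)

# Column names kept in a character vector, as in the original question.
vars <- c("field_a", "field_b", "field_c")

# Dataset[vars] selects those columns by name; is.na() returns a logical
# matrix, and rowSums() counts the TRUE values (non-missing fields) per row.
Dataset$ABCtaskNum <- rowSums(!is.na(Dataset[vars]))

print(Dataset$ABCtaskNum)
```

A single column can be picked the same way with Dataset[[vars[1]]], which is probably the closest R equivalent to substituting a field name held in a variable.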

Anyway, thanks for your help.

Cheers

Jon


Soli Deo Gloria

Jon Erik Ween, MD, MS
Scientist, Kunin-Lunenfeld Applied Research Unit
Director, Stroke Clinic, Brain Health Clinic
     Baycrest Centre for Geriatric Care
Assistant Professor, Dept. of Medicine, Div. of Neurology
     University of Toronto Faculty of Medicine

Posluns Building, 6th Floor, Room 644
Baycrest Centre for Geriatric Care
3560 Bathurst Street
Toronto, Ontario M6A 2E1
Canada

Phone: 416-785-2500 x3636
Fax: 416-785-2484
Email: jween at klaru-baycrest.on.ca


Confidential: This communication and any attachment(s) may contain  
confidential or privileged information and is intended solely for the  
address(es) or the entity representing the recipient(s). If you have  
received this information in error, you are hereby advised to destroy  
the document and any attachment(s), make no copies of same and inform  
the sender immediately of the error. Any unauthorized use or  
disclosure of this information is strictly prohibited.


On 21-Jan-08, at 8:57 PM, jim holtman wrote:

> If you only want a subset, then use that in the function:
>
> Dataset.target <- apply(x,1,function(.row) sum(is.na(.row[3:8])))
>
> This will put it back in column 1:
>
>> x <- matrix(1,10,10)
>> x[sample(1:100,10)] <- NA
>> x[,1] <- 0  # make sure column 1 has no NAs so sums are correct
>> x
>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>  [1,]    0    1   NA    1    1    1    1   NA    1     1
>  [2,]    0    1    1    1    1    1    1    1    1     1
>  [3,]    0    1    1    1    1    1    1    1    1    NA
>  [4,]    0    1    1    1    1    1    1   NA    1     1
>  [5,]    0    1    1   NA    1    1    1    1    1     1
>  [6,]    0    1    1    1    1    1    1    1    1     1
>  [7,]    0    1    1    1    1    1    1    1    1     1
>  [8,]    0   NA    1   NA   NA    1   NA    1    1    NA
>  [9,]    0    1    1    1    1    1    1    1    1     1
> [10,]    0    1    1    1    1    1    1    1    1     1
>> # get the sums of NA in 3:8 and put in column 1
>> x[,1] <- apply(x, 1, function(.row) sum(is.na(.row[3:8])))
>>
>>
>> x
>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>  [1,]    2    1   NA    1    1    1    1   NA    1     1
>  [2,]    0    1    1    1    1    1    1    1    1     1
>  [3,]    0    1    1    1    1    1    1    1    1    NA
>  [4,]    1    1    1    1    1    1    1   NA    1     1
>  [5,]    1    1    1   NA    1    1    1    1    1     1
>  [6,]    0    1    1    1    1    1    1    1    1     1
>  [7,]    0    1    1    1    1    1    1    1    1     1
>  [8,]    3   NA    1   NA   NA    1   NA    1    1    NA
>  [9,]    0    1    1    1    1    1    1    1    1     1
> [10,]    0    1    1    1    1    1    1    1    1     1
>>
>
>
> On Jan 21, 2008 8:47 PM, Jon Erik Ween <jween at klaru-baycrest.on.ca>  
> wrote:
>> Thanks Jim
>>
>> I see how this works. Problem is, I need to interrogate only a subset
>> of fields. In your example, I need to count the "NA" fields among
>> fields 3..8 only, excluding 1, 2, 9 and 10. Also, I don't see how
>> the method inserts the sum into a particular field in a row. I guess
>> you could do
>>
>> Dataset.target <- apply(x,1,function(.row) sum(is.na(.row)))
>>
>> Thanks
>>
>> Jon
>>
>>
>> On 21-Jan-08, at 8:28 PM, jim holtman wrote:
>>
>>> You need to use 'is.na(x)' instead of "x == NA". Here is a way of
>>> doing it:
>>>
>>>> x <- matrix(1,10,10)
>>>> x[sample(1:100,10)] <- NA
>>>> x
>>>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>>>  [1,]    1    1    1    1    1    1    1    1    1     1
>>>  [2,]    1    1    1    1    1    1   NA    1    1     1
>>>  [3,]    1    1    1    1    1    1    1    1    1     1
>>>  [4,]    1    1    1    1    1    1    1    1    1     1
>>>  [5,]    1    1    1    1    1    1    1    1    1     1
>>>  [6,]   NA    1    1    1    1    1    1    1   NA     1
>>>  [7,]    1    1   NA   NA    1   NA    1    1    1    NA
>>>  [8,]    1    1    1    1    1   NA    1    1    1     1
>>>  [9,]    1    1    1    1    1    1    1    1   NA     1
>>> [10,]    1   NA    1    1    1    1    1    1    1     1
>>>>
>>>> apply(x,1,function(.row) sum(is.na(.row)))
>>>  [1] 0 1 0 0 0 2 4 1 1 1
>>>>
>>>
>>>
>>> On Jan 21, 2008 7:23 PM, Jon Erik Ween <jween at klaru-baycrest.on.ca>
>>> wrote:
>>>> Hi!
>>>>
>>>> I need to conditionally update a dataframe field based on values in
>>>> other fields and can't even find how to search for this properly.
>>>> Sorry if this has been asked before.
>>>>
>>>> But, specifically, I have a 490 x 221 dataframe and need to count, by
>>>> row, how many fields in Dataframe$field_a...Dataframe$field_zz are
>>>> non-null and enter this value in Dataset$ABCtaskNum. I have field
>>>> name definitions in a vector "vars" and tried writing a custom
>>>> function to handle the within-row calculation
>>>>
>>>> myfunct <- function () {for (i in 1:length(vars)) {if (vars[i] != NA)
>>>> {Dataset$ABCtaskNum <- Dataset$ABCtaskNum + 1}}}
>>>>
>>>> and then use "apply" to handle the row to row calculation
>>>>
>>>> Dataset <- apply(Dataset, 1, myfunc)
>>>>
>>>> where Dataset already has field Dataset$ABCtaskNum set to 0 in all
>>>> rows.
>>>>
>>>> But that didn't work. Doesn't help if I declare variables (vars and
>>>> ABCtaskNum) in the function declaration either, but then I haven't
>>>> quite figured out how best to do variable substitutions in R.
>>>>
>>>> Thanks for any help. Cheers
>>>>
>>>> Jon
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>>
>>> --
>>> Jim Holtman
>>> Cincinnati, OH
>>> +1 513 646 9390
>>>
>>> What is the problem you are trying to solve?
>>>
>>
>>
>
>
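
For anyone reading the thread later, the two points in Jim's replies can be checked directly: a comparison with NA never gives TRUE or FALSE, which is why "vars[i] != NA" cannot drive an if(), and the apply() call over columns 3:8 should give the same counts as a vectorised rowSums(). A minimal sketch, assuming the same kind of 10 x 10 test matrix used above (the seed and the variable names are my own additions for repeatability):

```r
# Comparing with NA yields NA for every element; is.na() is the real test.
v <- c(1, NA, 3)
eq_result <- v == NA      # NA NA NA
na_result <- is.na(v)     # FALSE TRUE FALSE

# if() cannot branch on NA, so a condition such as `v[1] != NA` stops
# with an error instead of choosing a branch.
if_outcome <- tryCatch(
  if (v[1] != NA) "branch taken" else "branch skipped",
  error = function(e) "error"
)

# Same kind of matrix as in the thread; the seed is only for repeatability.
set.seed(1)
x <- matrix(1, 10, 10)
x[sample(1:100, 10)] <- NA

# Row-wise NA counts restricted to columns 3..8, computed two ways.
by_apply   <- apply(x, 1, function(.row) sum(is.na(.row[3:8])))
by_rowsums <- rowSums(is.na(x[, 3:8]))

print(all(by_apply == by_rowsums))
```

On a 490-row data frame either form should be fast enough; rowSums() simply avoids the per-row function call.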


