[R] Saving misclassified records into dataframe within a loop

David Winsemius dwinsemius at comcast.net
Fri May 13 00:06:55 CEST 2011


On May 12, 2011, at 5:41 PM, John Dennison wrote:

> Having poked at the problem a couple more times, it appears my issue is
> that the object I save within the loop is not available after the
> function ends. I have no idea why it is acting in this manner.
>
>
> library(rpart)
>
> # grow tree
> fit <- rpart(Kyphosis ~ Age + Number + Start,
> method="class", data=kyphosis)
> #predict
> prediction<-predict(fit, kyphosis)
>
> #misclassification index function
>
> results<-as.data.frame(1)
>
> predict.function <- function(x){
>  j<-0
> for (i in 1:length(kyphosis$Kyphosis)) {
> if (((kyphosis$Kyphosis[i]=="absent")==(prediction[i,1]==1)) == 0 ){
>
> j<-j+1
> results[j,]<-row.names(testing[c(i),])

Are we supposed to know where to find 'testing' (and if we cannot
find it, how is the R interpreter going to find it)?


> print( row.names(kyphosis[c(i),]))
> } }
> {
> print(results)
> save(results, file="results") } }
>
>
> I can load results from the file and my output is there. However, if I
> just type results I get the original 1. What in the Lord's name is
> occurring?
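
The behaviour described above is consistent with R's scoping rules: an assignment to 'results' inside a function creates a local copy, and the object of the same name at top level is left alone. A minimal sketch, using made-up helper names (modify_local, modify_and_return) and a toy 'results' object rather than the poster's data:

```r
# Hypothetical stand-in for the 'results' object created at top level.
results <- as.data.frame(1)

# Assigning to 'results' inside a function creates a local copy;
# the top-level object is left untouched.
modify_local <- function() {
  results[1, ] <- 99
  invisible(NULL)
}
modify_local()
global_after_local <- results[1, 1]    # still the original value

# Returning the modified object and assigning the result keeps the change.
modify_and_return <- function(res) {
  res[1, ] <- 99
  res
}
results <- modify_and_return(results)
global_after_return <- results[1, 1]   # now the updated value
```

Passing the object in and returning it, rather than reaching for a global from inside the function, also avoids depending on whatever happens to be in the workspace.
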
>
> Thanks
>
> John
>
>
>
> On Thu, May 12, 2011 at 1:50 PM, Phil Spector <spector at stat.berkeley.edu> wrote:
>
>> John -
>>  In your example, the misclassified observations (as defined by
>> your predict.function) will be
>>
>> kyphosis[kyphosis$Kyphosis == 'absent' & prediction[,1] != 1,]
>>
>> so you could start from there.
>>                                       - Phil Spector
>>                                        Statistical Computing Facility
>>                                        Department of Statistics
>>                                        UC Berkeley
>>                                        spector at stat.berkeley.edu
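
A vectorized selection of this kind can be collected straight into a data frame with no loop. The sketch below uses toy vectors (the names 'actual', 'predicted' and 'dat' are made up for illustration) and treats a record as misclassified whenever the predicted class differs from the observed one, which may or may not be the exact criterion wanted here. With an rpart fit, predicted classes can be obtained with predict(fit, type = "class").

```r
# Toy data (hypothetical), standing in for kyphosis$Kyphosis and the
# predicted class of each record.
actual    <- factor(c("absent", "present", "absent", "present", "absent"),
                    levels = c("absent", "present"))
predicted <- factor(c("absent", "absent", "present", "present", "absent"),
                    levels = c("absent", "present"))
dat <- data.frame(actual = actual,
                  row.names = c("r1", "r2", "r3", "r4", "r5"))

# A record is misclassified when the predicted class differs from the
# actual one; the comparison is done on all rows at once.
wrong <- as.character(dat$actual) != as.character(predicted)

# Collect the row names of the misclassified records in a data frame.
misclassified <- data.frame(id = row.names(dat)[wrong],
                            stringsAsFactors = FALSE)
```

The resulting one-column data frame is the sort of object that could then be written to a database table.
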
>>
>>
>>
>> On Thu, 12 May 2011, John Dennison wrote:
>>
>>> Greetings R world,
>>>
>>> I know some version of this question has been asked before, but I need
>>> to save the output of a loop into a data frame, to eventually be written
>>> to a Postgres database with dbWriteTable. Some background: I have
>>> developed classification models to help identify problem accounts. The
>>> logic is this: if the model classifies the record as including variable X
>>> and it turns out that the record does not have X, then it should be
>>> reviewed (i.e., I need the row number/ID saved to a database). Generally,
>>> I want to look at the misclassified records. This is a bit of a hack, I
>>> know; if anyone has a better idea, please let me know. Here is an example:
>>>
>>> library(rpart)
>>>
>>> # grow tree
>>> fit <- rpart(Kyphosis ~ Age + Number + Start,
>>> method="class", data=kyphosis)
>>> #predict
>>> prediction<-predict(fit, kyphosis)
>>>
>>> #misclassification index function
>>>
>>> predict.function <- function(x){
>>> for (i in 1:length(kyphosis$Kyphosis)) {
>>> # the idea is that if the record is "absent" but the prediction is
>>> # otherwise, then show me that record
>>> if (((kyphosis$Kyphosis[i]=="absent")==(prediction[i,1]==1)) == 0 ){
>>> #THIS WORKS
>>> print( row.names(kyphosis[c(i),]))
>>> }
>>> } }
>>>
>>> predict.function(x)
>>>
>>> Now my issue is that I want to save these IDs to a data.frame so I can
>>> later save them to a database. Is this an incorrect approach? Can I save
>>> each ID to the Postgres instance as it is found? I have an ignorant fear
>>> of lapply, but it seems it may hold the key.
>>>
>>>
>>> I've tried
>>>
>>> predict.function <- function(x){
>>> results<-as.data.frame(1)
>>> for (i in 1:length(kyphosis$Kyphosis)) {
>>> # the idea is that if the record is "absent" but the prediction is
>>> # otherwise, then show me that record
>>> if (((kyphosis$Kyphosis[i]=="absent")==(prediction[i,1]==1)) == 0 ){
>>> #THIS WORKS
>>> results[i,]<- as.data.frame(row.names(kyphosis[c(i),]))
>>> }
>>> } }
>>>
>>> This does not work: the results object does not get saved. Any help
>>> would be greatly appreciated.
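
One likely reason nothing is "saved" in the attempt above is that 'results' exists only inside the function, and a function whose last expression is a for loop returns NULL. A loop-based sketch that returns the collected IDs instead (the function name collect_misclassified and its arguments are made up for illustration, and the toy inputs are not the kyphosis data):

```r
# Hypothetical helper: takes the actual and predicted classes plus the
# record ids, loops as in the original attempt, and returns the ids of
# the records where the two classes disagree.
collect_misclassified <- function(actual, predicted, ids) {
  found <- character(0)
  for (i in seq_along(actual)) {
    if (actual[i] != predicted[i]) {
      found <- c(found, ids[i])   # append, so no empty rows are left behind
    }
  }
  # The value of the last expression is what the function returns.
  data.frame(id = found, stringsAsFactors = FALSE)
}

# The caller has to assign the returned value to keep it.
results <- collect_misclassified(
  actual    = c("absent", "present", "absent"),
  predicted = c("absent", "absent", "present"),
  ids       = c("a", "b", "c")
)
```

Appending to a vector and building the data frame once at the end also sidesteps indexing rows by the loop counter, which would leave gaps for the records that were classified correctly.
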
>>>
>>>
>>> Thanks
>>>
>>> John Dennison
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>

David Winsemius, MD
West Hartford, CT


