[R] adding factor scores back to an incomplete dataset...

Mark Difford mark_difford at yahoo.co.uk
Sat Aug 29 19:21:35 CEST 2009


Hi David, Phil,

Phil Spector wrote:
>> David - 
>> Here's the easiest way I've been able to come up with.

Easiest? You are making unnecessary work for yourselves and seem not to
understand the purpose of ?naresid (i.e. na.action = na.exclude). Why not
take the simple route that I gave, which really is R's + factanal's route?
With na.exclude, the scores are padded with NA for the incomplete rows, so
they line up with the original data. Using Phil's data as example:

##
dat = data.frame(matrix(rnorm(100),20,5)) 
dat[3,4] = NA 
dat[12,3] = NA

scrs <- factanal(~X1+X2+X3+X4+X5, data=dat,factors=2,scores='regression',
     na.action=na.exclude)$scores
TrueDat <- merge(dat,scrs,by=0,all.x=TRUE,sort=FALSE)
TrueDat
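
A minimal sketch to check the claim, under stated assumptions: the synthetic
six-variable data with a built-in two-factor structure, the seed, and the
object names fit and scored are illustrative and not from the thread. If
na.exclude pads the score matrix back to the original row count, then a plain
cbind() should do as well as merge() and keep the original row order.

```r
# Illustrative data (an assumption): two latent factors, three indicators each.
set.seed(42)
n  <- 200
f1 <- rnorm(n)
f2 <- rnorm(n)
dat <- data.frame(
  X1 = f1 + rnorm(n, sd = 0.5),
  X2 = f1 + rnorm(n, sd = 0.5),
  X3 = f1 + rnorm(n, sd = 0.5),
  X4 = f2 + rnorm(n, sd = 0.5),
  X5 = f2 + rnorm(n, sd = 0.5),
  X6 = f2 + rnorm(n, sd = 0.5)
)
dat[3, 4]  <- NA
dat[12, 3] <- NA

# Formula interface, so that na.action is honoured by model.frame().
fit <- factanal(~ X1 + X2 + X3 + X4 + X5 + X6, data = dat, factors = 2,
                scores = "regression", na.action = na.exclude)

# The score matrix should have one row per original observation,
# with NA in the rows that were excluded from the fit.
dim(fit$scores)
scored <- cbind(dat, fit$scores)
scored[c(2, 3, 12), ]
```

Rows 3 and 12 should show NA for Factor1 and Factor2, and every other row a
score, with no reordering step needed.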

Regards, Mark.


David G. Tully wrote:
> 
> Thanks, Prof Spector. Your first solution works well for me.
> 
> Phil Spector wrote:
>> David -
>>    Here's the easiest way I've been able to come up with. I'll provide 
>> some sample data to make things clearer (hint, hint):
>>
>>> dat = data.frame(matrix(rnorm(100),20,5))
>>> dat[3,4] = NA
>>> dat[12,3] = NA
>>> scrs = factanal(na.omit(dat),factors=2,scores='regression')$scores
>>> rownames(scrs) = rownames(na.omit(dat))
>>> newdat = merge(dat,scrs,by=0,all.x=TRUE,sort=FALSE)
>>
>> This will result in the observations with missing values being
>> at the end of the data frame.  If you want the original order
>> (assuming default row names), you could use
>>
>> newdat[order(as.numeric(newdat$Row.names)),]
>>
>> A somewhat more complicated approach is, in some sense, more direct:
>>
>>> dat$Factor1 = NA
>>> dat$Factor2 = NA
>>> dat[rownames(na.omit(dat[,-c(6,7)])),c('Factor1','Factor2')] = 
>> +    factanal(na.omit(dat[,-c(6,7)]),factors=2,scores='regression')$scores
>>
>> The order of the data is preserved.
>>                     - Phil Spector
>>                      Statistical Computing Facility
>>                      Department of Statistics
>>                      UC Berkeley
>>                      spector at stat.berkeley.edu
>>
>> On Tue, 25 Aug 2009, David G. Tully wrote:
>>
>>> I am sure there is a simple way to do the following, but I haven't 
>>> been able to find it. I am hoping a merciful soul on R-help could 
>>> point me in the right direction.
>>>
>>> I am doing a factor analysis on survey data with missing values. To 
>>> do this, I run:
>>>
>>> FA1<-factanal(na.omit(DATA), factors = X, rotation = 'oblimin', 
>>> scores = 'regression')
>>>
>>> Now that I have my factors and factor scores, I want to add those 
>>> scores back to my original dataset so I can plot factor scores by 
>>> demographics. However, when I try to add the scores back to the 
>>> original data frame, the variables are of different lengths.
>>>
>>> Is there a way to subset from my original data set that will work 
>>> with factanal() and preserve the original rows or that will allow me 
>>> to append the factor scores back onto the original dataset with the 
>>> proper rows and NAs where there could be no data?
>>>
>>> Again, I apologize if I am missing something basic. I am a 
>>> self-taught R user and couldn't find an answer to this question.
>>>
>>> Thanks in advance,
>>>  David
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide 
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
