[R] generate list of variable names

Jon Erik Ween jween at klaru-baycrest.on.ca
Wed Jun 9 19:02:51 CEST 2010


Thanks Erik

I can't figure out how to use the various x_apply functions in this setting, nor post datasets to reproduce. But anyhow: the table structure is something like this:

id (integer), handedness (R, L, A), gender (M, F), cat1 (patient, control), cat2 (stroke, MS, dement, control), accuracy (integer), reaction time (numeric), ...
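Since the real data can't be posted, a toy data frame with roughly this layout may serve as a reproducible stand-in. This is only a sketch: the name `dat`, the column values, and the decision to treat `id` as a non-response column are all assumptions for illustration. It also shows one way to split the column names by class, with factors as classifiers and numeric or integer columns as responses.

```r
# Toy stand-in for the real table; every value below is invented.
set.seed(1)
n <- 40
dat <- data.frame(
  id            = seq_len(n),
  handedness    = factor(sample(c("R", "L", "A"), n, replace = TRUE)),
  gender        = factor(sample(c("M", "F"), n, replace = TRUE)),
  cat1          = factor(rep(c("patient", "control"), each = n / 2)),
  accuracy      = rpois(n, 20),
  reaction_time = rnorm(n, mean = 500, sd = 80)
)

# Flag each column by type; vapply guarantees one logical per column
is_fac <- vapply(dat, is.factor, logical(1))
is_num <- vapply(dat, is.numeric, logical(1))  # TRUE for integer and double

classifiers <- names(dat)[is_fac]
# Assumption: id is numeric but is an identifier, not a response variable
responses   <- setdiff(names(dat)[is_num], "id")

classifiers
responses
```

The two character vectors can then drive a loop or an `lapply` call, so no external variable list file is needed.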

so, I want to extract the factor levels from cat1, cat2, etc and run, say, ANOVAs or ROCs on each of the response variables (accuracy, reaction_time, etc) extracting F-values, AUCs, etc, sticking the results in a table of results. Here is an example script I wrote for ROCR:


#######
library(ROCR) # Load stats package to use if not standard
varslist<-scan("/Users/jween/Desktop/INCAS/INCASvars.txt","list") # Read variable list
results<-as.data.frame(array(,c(3,length(varslist)))) # Initialize results array, one type of stat at a time for now

for (i in 1:length(varslist)){ # Loop through the variables you want to process. Determined by varslist
	j<-noquote(varslist[i])
	vars<-c(varslist[i],"Issue_class") # Variables to be analyzed
	temp<-na.omit(MSsmv[vars]) # Have to subset to get rid of NA values causing ROCR to choke
	n<-nrow(temp) # Record how many cases the analysis is based on. Need to figure out how to calc cases/controls
	#.table<-table(temp$SubjClass)  # Maybe for later figure out cases/controls
	results[1,i]<-j # Name particular results column
	results[2,i]<-n # Number of subjects in analysis
	test<-try(aucval(i,j),silent=TRUE) # Error handling in case procedure craps out so loop can continue. Suppress annoying error messages
	if(inherits(test,"try-error")) next else # Run procedure only if OK, otherwise skip
	pred<-prediction(MSsmv[[j]], MSsmv$Issue_cat); # Procedure
	perf<-performance(pred,"auc");
	results[3,i]<-as.numeric(perf@y.values) # Enter result into appropriate row
	
}
write.table(results,"/Users/jween/Desktop/IncasRres_MSsmv.csv",sep=",",col.names=FALSE,row.names=FALSE) # Write results to table
rm(aucval,i,n,temp,vars,results,pred,perf,j,varslist) # Clean up test,

aucval<-function(i,j){ # Function to trap errors; must be defined before the loop above is run. Should be the same as real procedure above
	pred<-prediction(MSsmv[[j]], MSsmv$Issue_cat); # Don't put any real results here, they don't seem to be passed back
	perf<-performance(pred,"auc");
}
#######
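
For comparison, the same kind of loop might be written with `lapply` and `tryCatch`, so that a failing variable yields NA instead of needing a separate trap function. This is a rough sketch under stated assumptions, not a replacement for the ROCR script: `dat`, `one_var`, `auc_rank` and the `all_missing` column are made-up names, the data are simulated, and the AUC is computed from ranks (the Mann-Whitney form) in base R rather than with ROCR so the example runs without extra packages. For a two-level class it should match what `performance(pred,"auc")` reports, but that is not checked here.

```r
# Simulated data; all_missing is included only to exercise the error path
set.seed(1)
n <- 40
dat <- data.frame(
  cat1          = factor(rep(c("control", "patient"), each = n / 2)),
  accuracy      = rpois(n, 20),
  reaction_time = c(rnorm(n / 2, 500, 80), rnorm(n / 2, 560, 80)),
  all_missing   = NA_real_
)
responses <- c("accuracy", "reaction_time", "all_missing")

# Rank-based AUC: chance that a random positive case outscores a random negative one
auc_rank <- function(score, is_pos) {
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  (sum(rank(score)[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# One row of results per response variable; errors become NA rather than stopping
one_var <- function(v) {
  tmp <- na.omit(dat[c(v, "cat1")])           # drop incomplete cases for this variable
  stats <- tryCatch({
    fit <- anova(lm(tmp[[v]] ~ tmp$cat1))     # one-way ANOVA on the class factor
    c(F_value = fit[["F value"]][1],
      AUC     = auc_rank(tmp[[v]], tmp$cat1 == "patient"))
  }, error = function(e) c(F_value = NA_real_, AUC = NA_real_))
  data.frame(variable = v, n = nrow(tmp),
             F_value = stats[["F_value"]], AUC = stats[["AUC"]])
}

results <- do.call(rbind, lapply(responses, one_var))
results
```

Because each call returns a one-row data frame, the results table is built with variables in rows rather than columns, which may be easier to pass on to `write.table`.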

Cheers


Jon

Soli Deo Gloria

Jon Erik Ween, MD, MS
Scientist, Kunin-Lunenfeld Applied Research Unit 
Director, Stroke Clinic, Brain Health Clinic, Baycrest Centre
Assistant Professor, Dept. of Medicine, Div. of Neurology
    University of Toronto Faculty of Medicine

Kimel Family Building, 6th Floor, Room 644 
Baycrest Centre
3560 Bathurst Street 
Toronto, Ontario M6A 2E1
Canada 

Phone: 416-785-2500 x3648
Fax: 416-785-2484
Email: jween at klaru-baycrest.on.ca



On 2010-06-09, at 12:20 PM, Erik Iverson wrote:

> 
> 
> Jon Erik Ween wrote:
>> Hi!
>> Would anyone know how to generate a list of variable names from a
>> data frame by the class of the variable?
> 
> a start...
> 
> df <- data.frame(f1 = factor(1:10),
>                 f2 = factor(1:10),
>                 n1 = 1:10,
>                 n2 = 1:10)
> 
> 
> sapply(df, class)
> 
>> I have large tables with different numbers of columns and am trying
>> to script some rote analyses. There are several categorizing
>> variables (factors) and many response variables (integers and
>> numeric). I want to extract a list of classifier column names in one
>> list and response variable names in another list, then run for-loops
>> to calculate various statistics on the response variables in terms of
>> the classifier variables. I thought something like this might work
>> (but didn't):
> 
> Reproducible example needed.  All this can surely be done more elegantly with lapply/mapply instead of for-loops.


