[R] drop rare factors

Sam Steingold sds at gnu.org
Wed Jan 18 23:25:22 CET 2012


I have a data frame with some factor columns.
I want to drop the rows with rare factor values
(and remove the factor values from the factors).
E.g.,  frame$MyFactor takes values
A 1,000 times,
B 2,000 times,
C 30 times and
D 4 times.
I want to remove all rows whose factor value is rare (<1% of the rows), i.e., those with C or D.
In other words, something like
frame <- frame[frame$MyFactor %in% c("A","B"), ]
except that I want c("A","B") to be generated automatically from frame$MyFactor
and the number 0.01 (1%).
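
To make the request concrete, here is a minimal sketch of one way the selection could be automated in base R. The example data, the extra column x, and the variable names threshold, freq and keep are assumptions made up for illustration; only frame, MyFactor, the counts and the 1% cutoff come from the description above.

```r
# Hypothetical example data matching the counts in the question.
frame <- data.frame(
  MyFactor = factor(rep(c("A", "B", "C", "D"), times = c(1000, 2000, 30, 4))),
  x = seq_len(3034)  # made-up second column so the frame has more than one
)

threshold <- 0.01  # minimum share of rows a level needs in order to be kept

# Relative frequency of each level of the factor.
freq <- prop.table(table(frame$MyFactor))

# Names of the levels at or above the threshold.
keep <- names(freq)[freq >= threshold]

# Keep only the rows with a common level, then drop the unused levels
# from every factor column of the result.
frame <- droplevels(frame[frame$MyFactor %in% keep, , drop = FALSE])
```

With these counts C accounts for 30 of 3034 rows, just under 1%, so only A and B survive. Rows where MyFactor is NA would also be dropped by this approach, since %in% returns FALSE for them.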

Thanks!
-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000


