[R] Case weighting

David Winsemius dwinsemius at comcast.net
Sat Mar 3 20:16:25 CET 2012


You might want to look at the various wtd.* functions in the Hmisc  
package:

require(Hmisc)
?wtd.stats

'wtd.mean' is just one of the functions supplied. Harrell's code for it
is short and is not hidden, so it is worth reading. Typing the function
name without parentheses prints its source:

wtd.mean
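
As a rough sketch of the underlying idea, with made-up numbers rather than
anything from this thread: a weighted mean is sum(weights * x) / sum(weights).
For whole-number weights this matches the mean of the replicated cases, so
replication is not needed, and it does not extend to fractional weights. The
objects x, w, w2, d and g below are hypothetical, and only base R is used.

```r
# Hypothetical numeric scores and integer case weights (not from the thread)
x <- c(4, 7, 1, 9)
w <- c(10, 1, 1, 1)

# Weighted mean written out from its definition
m_formula <- sum(w * x) / sum(w)

# Base R's weighted.mean() gives the same number
m_base <- weighted.mean(x, w)

# Replicating each case w[i] times and taking a plain mean also agrees,
# but this only works when the weights are whole numbers
m_rep <- mean(rep(x, times = w))

# Fractional weights need no replication at all
w2 <- c(14.8579, 61.9528, 3.0420, 2.9929)
m_frac <- weighted.mean(x, w2)

# If a replicated data frame is still wanted, row indexing does it directly
d <- data.frame(x = x, g = factor(c("No", "Yes", "Yes", "Yes")))
d_rep <- d[rep(seq_len(nrow(d)), times = w), , drop = FALSE]

# Weighted counts for a factor, again without replicating rows
counts <- xtabs(w ~ g, data = d)
```

For factor columns such as f1 and f2, weighted counts like the xtabs() call
are usually the quantity of interest, since a mean of a factor is not
meaningful.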

-- 
David.


On Mar 3, 2012, at 2:04 PM, Hed Bar-Nissan wrote:

> Following David's example, if I just wanted to compute means,
> would multiplying the cases according to their weights do the job?
>
>
> Something like this on a data.frame
> (There must be a simpler way to do it in R - the sapply scoping confused me)
>
>
> weightBy <- function(origDataFrame,weightVector)
> {
>     case_Number_After_Weighting = sum(weightVector);
>     #print ( "case_Number_After_Weighting  = ");#print(case_Number_After_Weighting );
>
>     data.weighted.local = data.frame( 1:case_Number_After_Weighting );
>     assign("data.weighted.tmp",data.weighted.local,env=globalenv())
>
>     sapply(1:NCOL(origDataFrame),
>         function(colNo) {
>             #print ( "dealing with column ");#print(colNo);
>             data.weighted.tmp[,colNo] =
>                  unlist(
>                     sapply(1:NROW(origDataFrame),
>                         function(x) rep(origDataFrame[x,colNo], times=weightVector[x] )
>                     )
>                 )
>             names(data.weighted.tmp)[colNo] <- names(origDataFrame)[colNo]
>             assign("data.weighted.tmp",data.weighted.tmp,env=globalenv())
>             #print (data.weighted.tmp);
>         }
>     )
>     data.weighted.local = data.weighted.tmp;
>     rm(data.weighted.tmp, envir=globalenv());
>     return(data.weighted.local);
> }
>
>
>
> data.recieved <- data.frame(
>     f1 = factor(c(2,1,1,1), labels = c("Yes", "No")),
>     f2 = factor(c(1,2,3,4), labels = c("One", "Two","Three","Four"))
> );
>
> weight=c(10, 1, 1, 1)
>
>
> weightBy(data.recieved,weight);
>
>
>
> On Fri, Feb 24, 2012 at 8:03 AM, Thomas Lumley <tlumley at uw.edu> wrote:
> > On Fri, Feb 24, 2012 at 9:40 AM, David Winsemius <dwinsemius at comcast.net> wrote:
> >
> > On Feb 23, 2012, at 3:27 PM, Hed Bar-Nissan wrote:
> >
> >> It's really weighting - it's just that my simplified example was too
> >> simplified
> >> Here is my real weight vector:
> >> > sc$W_FSCHWT
> >>  [1]  14.8579  61.9528   3.0420   2.9929   5.1239  14.7507   2.7535
> >> 2.2693   3.6658   8.6179   2.5926   2.5390   1.7354   2.9767   9.0477
> >> 2.6589   3.4040   3.0519
> >> ....
> >
> >
> > You should always convey the necessary complexity of the problem.
> >
> >>
> >>
> >> And still it should somehow set the case weight.
> >> I could multiply all by 10000 and use maybe your method but it would
> >> create such a bloated dataframe
> >>
> >> working with numeric only i could probably create weighted means
> >>
> >> But something simple as WEIGHTED BY would be nice.
> >
> >
> > The survey package by Thomas Lumley provides for a wide variety of weighted
> > analyses.
>
> Yes.  It doesn't do everything that SPSS WEIGHTED BY will do, but it
> does a lot.  SPSS is more general partly because it cheats -- it
> doesn't always compute the right standard errors if the weights are
> sampling weights   [SPSS now has some proper survey analysis commands,
> which do get the right standard errors, but are more limited]
>
>  - thomas
>
> --
> Thomas Lumley
> Professor of Biostatistics
> University of Auckland
>
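
For the survey package mentioned in the quoted exchange, a minimal sketch of
a weighted mean with a design-based standard error might look like the
following. The data frame d and its columns score and wt are made up for
illustration, and ids = ~1 assumes a design without clustering, which may not
match how a real sample was drawn.

```r
# Needs the survey package; the block is skipped if it is not installed
if (requireNamespace("survey", quietly = TRUE)) {
  # Hypothetical data: one score and one sampling weight per case
  d <- data.frame(
    score = c(4, 7, 1, 9, 5, 6),
    wt    = c(14.8579, 61.9528, 3.0420, 2.9929, 5.1239, 14.7507)
  )

  # Declare the design: no clusters (ids = ~1), weights taken from column wt
  des <- survey::svydesign(ids = ~1, weights = ~wt, data = d)

  # Weighted mean, with a standard error that treats wt as sampling weights
  est <- survey::svymean(~score, des)
  print(est)
}
```

The point estimate equals the ordinary weighted mean of score; what the
design object adds is the standard error.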

David Winsemius, MD
West Hartford, CT


