[R] Case weighting

David Winsemius dwinsemius at comcast.net
Thu Feb 23 18:43:05 CET 2012


On Feb 23, 2012, at 10:49 AM, Hed Bar-Nissan wrote:

> The need comes from the PISA data. (http://www.pisa.oecd.org)
>
> In the data there are many cases, and each of them carries a numeric
> variable that signifies its weight.
> In SPSS the command would be "WEIGHT BY".
>
> In simpler words, here is an R sample (what I get vs. what I want to
> get):
>
>
>> data.recieved <- data.frame(
> + kindergarten_attendance = factor(c(2,1,1,1), labels = c("Yes", "No")),
> + weight=c(10, 1, 1, 1)
> + );
>> data.recieved;
>  kindergarten_attendance weight
> 1                      No     10
> 2                     Yes      1
> 3                     Yes      1
> 4                     Yes      1
>>
>>
>>
>> data.weighted <- data.frame(
> + kindergarten_attendance = factor(c(2,2,2,2,2,2,2,2,2,2,1,1,1),
> + labels = c("Yes", "No")) );

What you want is "case repetition" rather than "case weighting"; I
would reserve the latter term for estimation problems:

 > ( data.weighted <- unlist(sapply(1:NROW(data.recieved),
 +     function(x) rep(data.recieved[x,1], times=data.recieved[x,2]))) )
  [1] No  No  No  No  No  No  No  No  No  No  Yes Yes Yes
Levels: Yes No
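
The same repetition can probably be written without the sapply() loop,
because rep() accepts one count per element. The sketch below is only an
illustration on the toy data from the original post: the names
attendance.rep and row.index are made up here, and it assumes the
weights are whole numbers (rep() truncates fractional counts towards
zero, so real survey weights would need rounding or a weighted method
instead).

```r
# Toy data from the original post (identifier spelling kept as posted).
data.recieved <- data.frame(
  kindergarten_attendance = factor(c(2, 1, 1, 1), labels = c("Yes", "No")),
  weight = c(10, 1, 1, 1)
)

# Repeat each factor value as many times as its weight says.
# attendance.rep is a hypothetical name used only in this sketch.
attendance.rep <- rep(data.recieved$kindergarten_attendance,
                      times = data.recieved$weight)

# Repeat whole rows instead: build a repeated row index, then subset.
# drop = FALSE keeps the result a data frame with a single column.
row.index <- rep(seq_len(nrow(data.recieved)), times = data.recieved$weight)
data.weighted <- data.recieved[row.index, "kindergarten_attendance",
                               drop = FALSE]
rownames(data.weighted) <- NULL

# Frequency table of the expanded column.
table(data.weighted$kindergarten_attendance)
```

With a data frame built this way, the expression
data.weighted$kindergarten_attendance from the original post refers to
an actual column.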

>>
>>
>> par(mfrow=c(1,2));
>> plot(data.recieved$kindergarten_attendance,main="What i get");
>> plot(data.weighted$kindergarten_attendance,main="What i want to get");

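This seems to work with the factor vector. Note that data.weighted, as
I built it, is a factor rather than a data frame, so the second call
would be plot(data.weighted, main="What i want to get") without the
"$". I didn't replicate data frame rows, but I guess you could.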
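
If only the weighted bar chart is needed, the cases may not have to be
repeated at all: summing the weights per level should give the same bar
heights. A minimal sketch under that assumption, again on the toy data
from the original post (weighted.counts is a name chosen here, not
something from the thread):

```r
# Toy data from the original post (identifier spelling kept as posted).
data.recieved <- data.frame(
  kindergarten_attendance = factor(c(2, 1, 1, 1), labels = c("Yes", "No")),
  weight = c(10, 1, 1, 1)
)

# Sum the weight column within each level of the factor.
# weighted.counts is a hypothetical name used only in this sketch.
weighted.counts <- xtabs(weight ~ kindergarten_attendance,
                         data = data.recieved)

# Left panel: raw case counts; right panel: sums of weights.
par(mfrow = c(1, 2))
plot(data.recieved$kindergarten_attendance, main = "What i get")
barplot(weighted.counts, main = "What i want to get")
```

Unlike repetition with rep(), this also accepts non-integer weights,
since the weights are summed rather than used as repeat counts.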

>>
>
> tnx in advance
> Hed
>

David Winsemius, MD
West Hartford, CT


