[R] Unique?

Francisco J. Zagmutt gerifalte28 at hotmail.com
Thu May 11 19:10:15 CEST 2006


Hi Cameron

It helps to be more specific when you ask a question, so you can get a
better answer.  Anyhow, when you say that you want to retain all the other
variables, do you mean that you want to add a new column to the dataset
that contains the calculated sum?  If that is the case, you can use a
construction like:

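set.seed(1)  # make the simulated data reproducible
step4 <- data.frame(TRIPID = rep(c(111, 222, 333), 3), CONVUNIT = rpois(9, 40))
# Total CONVUNIT for each TRIPID
result <- tapply(step4$CONVUNIT, INDEX = step4$TRIPID, FUN = sum)
# Look up each row's TRIPID total and store it in a new column
step4[, "SUM"] <- result[match(step4[, "TRIPID"], names(result))]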
step4
  TRIPID CONVUNIT SUM
1    111       36 122
2    222       48 121
3    333       48 129
4    111       42 122
5    222       30 121
6    333       43 129
7    111       44 122
8    222       43 121
9    333       38 129
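
A shorter route to the same column, offered only as a sketch of an alternative and not what was run above, is ave(), which applies a function within groups and returns a vector as long as its input.  The column name SUM2 is made up here just so the two results can be compared side by side.

```r
# Rebuild the example data from above
set.seed(1)
step4 <- data.frame(TRIPID = rep(c(111, 222, 333), 3),
                    CONVUNIT = rpois(9, 40))

# tapply + match, as in the construction above
result <- tapply(step4$CONVUNIT, INDEX = step4$TRIPID, FUN = sum)
step4$SUM <- as.vector(result[match(step4$TRIPID, names(result))])

# ave() computes the group sum and repeats it on every row of that group
# (SUM2 is a hypothetical column name used only for this comparison)
step4$SUM2 <- ave(step4$CONVUNIT, step4$TRIPID, FUN = sum)

# The two columns should agree
all.equal(step4$SUM, step4$SUM2)
```

With ave() there is no intermediate table to match against, which can be easier to read when only the per-row total is needed.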


Cheers

Francisco
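
P.S. If what you want instead is one row per TRIPID, with the other variables taken from the first row of each trip and CONVUNIT replaced by the trip total (which is what your by()/rbind code appears to do), a vectorized sketch could look like the following.  The small data set and the GEAR values are made up for illustration, and I have not timed this on 446,000 rows.

```r
# Small made-up data set: TRIPID, one extra variable (GEAR), and CONVUNIT
step4 <- data.frame(TRIPID   = c(111, 111, 222, 333, 333),
                    GEAR     = c("A", "A", "B", "C", "C"),
                    CONVUNIT = c(10, 5, 7, 2, 3))

# Sum CONVUNIT within each TRIPID
sums <- tapply(step4$CONVUNIT, INDEX = step4$TRIPID, FUN = sum)

# Keep the first row seen for each TRIPID, with all of its other columns
step5 <- step4[!duplicated(step4$TRIPID), ]

# Overwrite CONVUNIT with the trip total, matched by TRIPID
step5$CONVUNIT <- as.vector(sums[match(step5$TRIPID, names(sums))])
step5
```

This avoids building and rbind-ing one small data frame per trip, which is likely where the by() approach spends most of its time.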

>From: "Guenther, Cameron" <Cameron.Guenther at MyFWC.com>
>To: "Francisco J. Zagmutt" <gerifalte28 at hotmail.com>
>Subject: RE: [R] Unique?
>Date: Thu, 11 May 2006 12:08:31 -0400
>
>It is close but not quite what I want.  I need to retain all of the
>other variables as well.
>
>
>Cameron Guenther, Ph.D.
>Associate Research Scientist
>FWC/FWRI, Marine Fisheries Research
>100 8th Avenue S.E.
>St. Petersburg, FL 33701
>(727)896-8626 Ext. 4305
>cameron.guenther at myfwc.com
>-----Original Message-----
>From: Francisco J. Zagmutt [mailto:gerifalte28 at hotmail.com]
>Sent: Wednesday, May 10, 2006 6:06 PM
>To: Guenther, Cameron; r-help at stat.math.ethz.ch
>Subject: RE: [R] Unique?
>
>If you only care about the sum of CONVUNIT by each TRIPID then you can
>use tapply i.e.:
>
>step4<-data.frame(TRIPID=rep(c(111,222,333),3),CONVUNIT=rpois(9,40))
>result<-tapply(step4$CONVUNIT,INDEX=step4$TRIPID,FUN=sum)
>result
>111 222 333
>115 107 123
>
>Is this what you wanted to do?  I can't think of anything faster than
>tapply for your problem.
>
>I hope this helps
>
>Francisco
>
>
>
>
> >From: "Guenther, Cameron" <Cameron.Guenther at MyFWC.com>
> >To: <r-help at stat.math.ethz.ch>
> >Subject: [R] Unique?
> >Date: Wed, 10 May 2006 17:02:33 -0400
> >
> >
> >Hello,
> >I have a sample data set that looks like:
> >
> >YEAR MONTH DAY CONTINUE SPL       TIMEFISH TIMEUNIT AREA COUNTY DEPTH DEPUNIT GEAR    TRIPID      CONVUNIT
> >1992 1     26  1        SP0073928 8        H        7    25     4     NA      1000000 02163399054 161
> >1992 1     26  1        SP0073928 8        H        7    25     4     NA      1000000 02163399054 8
> >1992 1     26  2        SP0004228 8        H        7    25     4     NA      1000000 02163399054 161
> >1992 1     26  2        SP0004228 8        H        7    25     4     NA      1000000 02163399054 8
> >1992 1     25  NA       SP0052652 8        H        7    25     4     NA      1000000 02163399057 85
> >1992 1     26  NA       SP0037940 8        H        7    25     4     NA      1000000 02163399058 70
> >1992 1     27  NA       SP0072357 8        H        7    25     4     NA      1000000 02163399059 15
> >1992 1     27  NA       SP0072357 8        H        7    25     4     NA      1000000 02163399059 20
> >1992 1     27  NA       SP0026324 8        H        7    25     4     NA      1000000 02163399060 8
> >1992 1     28  1        SP0072357 8        H        7    25     4     NA      1000000 02163399062 200
> >
> >How can I use unique to extract the rows that have repeated TRIPIDs
> >only, not a unique value for each variable but only for TRIPID?  I then
> >want to condense the unique values by summing the CONVUNIT for each
> >unique value of TRIPID.  I posted a similar question last week and
> >received a sufficient answer of how to do this without using unique.
> >The solution below worked just fine on this sample data set, but the
> >full data set has 446,000 rows of data and my computer and R simply
> >cannot handle the following code on data this large.
> >
> >conds<-by(Step4,Step4$TRIPID,function(x)
> >replace(x[1,],"CONVUNIT",sum(x$CONVUNIT)))
> >Step5<-do.call(rbind,conds)
> >
> >Thank you,
> >
> >Cameron Guenther, Ph.D.
> >Associate Research Scientist
> >FWC/FWRI, Marine Fisheries Research
> >100 8th Avenue S.E.
> >St. Petersburg, FL 33701
> >(727)896-8626 Ext. 4305
> >cameron.guenther at myfwc.com
> >



