[R] a more elegant way to get percentages?

Dimitris Rizopoulos dimitris.rizopoulos at med.kuleuven.be
Thu Mar 13 14:45:42 CET 2008


Try the following:

x <- read.table(textConnection("locat val
1      a   5
2      b   5
3      b  15
4      c   5
5      c  20
6      c   5
7      c  10
8      d   5
9      d  15
10     d  10"), header = TRUE)

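# within each location, express every value as a percentage of that
# location's total ('v' holds the values of one location)
x$percent1 <- unlist(tapply(x$val, x$locat, function(v) {
    round(100 * v / sum(v), 2)
}))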
x


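However, tapply() returns its results grouped by the levels of the 
factor 'x$locat', so this assignment is only correct when the rows of 
'x' are already sorted by 'x$locat' in the order of its levels; check 
this before relying on the result.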
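
A minimal sketch of a variant that does not depend on the row order, 
assuming base R's ave() is acceptable here. The data frame 'x2' is a 
hypothetical, deliberately shuffled copy of the example data and is not 
part of the original question:

```r
# Hypothetical reshuffled copy of the example data (same values, rows mixed)
x2 <- data.frame(
    locat = c("c", "a", "d", "b", "c", "d", "b", "c", "d", "c"),
    val   = c(5, 5, 5, 5, 20, 15, 15, 5, 10, 10)
)

# ave() applies FUN within each group of 'locat' and puts every result
# back in the position of the row it came from, so no sorting is needed
x2$percent1 <- ave(x2$val, x2$locat, FUN = function(v) {
    round(100 * v / sum(v), 2)
})
x2
```

Because ave() keeps the results aligned with the input rows, the same 
call should also work on data that is already sorted.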

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message ----- 
From: "Monica Pisica" <pisicandru at hotmail.com>
To: <r-help at r-project.org>
Sent: Thursday, March 13, 2008 2:36 PM
Subject: [R] a more elegant way to get percentages?


>
> Hi,
>
> I am trying to get percentages in a more elegant way. I have a 
> data.frame with locations and values (counts) of species at each 
> location. Each location is repeated once for every species I have 
> values for, and I would like to get the percentage of each species 
> at that location. I am not sure my explanation is clear, so I will 
> paste my code below:
>
> #####################
>
>> x
>   locat val
> 1      a   5
> 2      b   5
> 3      b  15
> 4      c   5
> 5      c  20
> 6      c   5
> 7      c  10
> 8      d   5
> 9      d  15
> 10     d  10
>> loc1 <- x$locat
>> n <- length(loc1)
>> locuniq1 <- unique(loc1)
>> m <- length(locuniq1)
>> counts <- seq(1:m)
>>
>> for (i in 1:m) {
> + count <- 0
> + for (j in 1:n) {
> + if (loc1[j]==locuniq1[i]) count <- count+1
> + counts[i] <- count
> + }
> + }
>>
>> percent1 <- rep(0,n)
>> j <- 0
>> for (i in 1:m) {
> +
> + b <- x[(j+1):(j+counts[i]),]
> + total <- sum(b$val)
> + percent1[(j+1):(j+counts[i])] <- round(apply(as.matrix(b$val), 1, function(x) {x*100/total}),2)
> + j = j+counts[i]
> + }
>> x1 <- cbind(x, percent1)    # this is the result i want
>> x1
>   locat val percent1
> 1      a   5   100.00
> 2      b   5    25.00
> 3      b  15    75.00
> 4      c   5    12.50
> 5      c  20    50.00
> 6      c   5    12.50
> 7      c  10    25.00
> 8      d   5    16.67
> 9      d  15    50.00
> 10     d  10    33.33
>>
> ################
>
> I am wondering if there is a way to do this more efficiently, 
> especially since the first loop, which counts how many times each 
> location is present in the data.frame, is slow when the data.frame 
> is larger than these 10 rows.
>
> Thanks for any input and sorry if the email is on the long side,
>
> Monica

