[R] Subsetting a data.frame

arun smartpink111 at yahoo.com
Thu Oct 17 20:33:46 CEST 2013


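You may try:
# ave() gives every row the size of its basel_asset_class group; keep rows whose group has more than 2 records
mydat[with(mydat, ave(seq_along(basel_asset_class), basel_asset_class, FUN = length) > 2), ]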
#  basel_asset_class defa_frequency
#2                 8          0.070
#3                 8          0.030
#4                 8          0.001


#or
library(plyr)
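# ddply() returns rows grouped by basel_asset_class, so the logical index lines up with mydat only if mydat is already sorted by that column.
mydat[ddply(mydat, .(basel_asset_class), mutate, L = length(defa_frequency))[, 3] > 2, ]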
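
If the rows may not be sorted, the ave() idea above can be wrapped in a small helper that never reorders the data. This is only a sketch: the function name keep_large_groups and its min_n argument are made up for illustration and are not part of base R or plyr.

```r
# Hypothetical helper (not from the thread): keep rows whose group
# has more than `min_n` rows, without relying on the row order.
keep_large_groups <- function(df, group_col, min_n = 2) {
  # ave() returns, for every row, the size of that row's group,
  # aligned with the original row order
  group_size <- ave(seq_len(nrow(df)), df[[group_col]], FUN = length)
  df[group_size > min_n, , drop = FALSE]
}

# Same values as mydat in the thread, but deliberately unsorted
mydat_unsorted <- data.frame(
  basel_asset_class = c(8, 4, 8, 8),
  defa_frequency    = c(0.07, 0.15, 0.03, 0.001)
)

result <- keep_large_groups(mydat_unsorted, "basel_asset_class")
print(result)
```

Changing min_n to 1 would keep groups with at least two records, if that turns out to be the intended threshold.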

A.K.




On Thursday, October 17, 2013 1:59 PM, Katherine Gobin <katherine_gobin at yahoo.com> wrote:
I am sorry, perhaps I was not able to put the question properly. I am not looking for the subset of the data.frame where basel_asset_class is > 2; I agree that would have been a basic requirement. Let me try to put the question again.

I have a data frame as 

mydat = data.frame(basel_asset_class = c(4, 8, 8 ,8), defa_frequency = c(0.15, 0.07, 0.03, 0.001))

# Please note I have changed the basel_asset_class to 4 from 2, to avoid confusion.

> mydat
  basel_asset_class defa_frequency
1                 4          0.150
2                 8          0.070
3                 8          0.030
4                 8          0.001



This is just a representative example. In reality, I may have any number of basel asset classes. 4, 8, etc. are IDs and can be anything, so I can't hard-code it as subset(mydat, mydat$basel_asset_class > 2).


What I need is to select only those records for which there are more than two default frequencies (defa_frequency). Here there is only one default frequency (0.150) for basel_asset_class = 4, whereas there are three default frequencies for basel asset class 8; similarly, there could be another basel asset class having, say, 5 default frequencies. Thus, I need to take the subset of the data.frame such that the number of corresponding defa_frequencies is greater than 2.
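
To make the requirement concrete, the number of default frequencies per class can be tabulated first and the filter applied to those counts rather than to the class IDs. This is a rough sketch in base R; the names counts, min_n and big_classes are illustrative only.

```r
mydat <- data.frame(
  basel_asset_class = c(4, 8, 8, 8),
  defa_frequency    = c(0.15, 0.07, 0.03, 0.001)
)

# Number of default frequencies recorded for each class ID
counts <- table(mydat$basel_asset_class)
print(counts)

# Class IDs whose count exceeds the threshold.
# names(counts) are character strings, hence as.character() below.
min_n <- 2
big_classes <- names(counts)[counts > min_n]

# Keep only the rows belonging to those classes
mydat_a <- mydat[as.character(mydat$basel_asset_class) %in% big_classes, ]
print(mydat_a)
```

The class IDs themselves never enter the comparison, so nothing is hard-coded to 4 or 8.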

The idea is that we try to fit an exponential curve Y = A exp(BX) for each of the basel asset classes, and to estimate the values of A and B one mathematically needs at least two values of X.
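
For illustration only, one common way to estimate A and B is to take logs, log(Y) = log(A) + B X, and fit a straight line per class with lm(). The sketch below assumes made-up columns x and y (the thread does not say what X is) and requires y > 0; it is not necessarily the fitting method intended here, and nls() would be an alternative that fits on the original scale.

```r
# Hypothetical data: `x` and `y` are illustrative columns, not from the thread.
# y is generated exactly from A * exp(B * x) so the fit can be checked by eye.
dat <- data.frame(
  basel_asset_class = c(4, 4, 4, 8, 8, 8, 8),
  x = c(1, 2, 3, 1, 2, 3, 4),
  y = c(2.0 * exp(-0.5 * c(1, 2, 3)),
        0.3 * exp(-1.2 * c(1, 2, 3, 4)))
)

# Fit log(y) = log(A) + B * x by least squares and back-transform the intercept
fit_exponential <- function(d) {
  fit <- lm(log(y) ~ x, data = d)
  c(A = exp(unname(coef(fit)[1])), B = unname(coef(fit)[2]))
}

# One fit per class; rows of `fits` are class IDs, columns are A and B
fits <- t(sapply(split(dat, dat$basel_asset_class), fit_exponential))
print(fits)
```

A class with a single record would leave B undetermined (lm() returns NA for the slope), which is why the small classes have to be filtered out beforehand.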

I hope I have been able to express my requirement. It's not that I need the subset of mydat such that the basel asset class is > 2 (now 4 in the revised example), but the subset such that the number of default frequencies is greater than or equal to 2. This 2 is not the same as basel asset class 2.

Kindly guide

With warm regards

Katherine Gobin





On Thursday, 17 October 2013 9:33 PM, Bert Gunter <gunter.berton at gene.com> wrote:

"Kindly guide" ...

This is a very basic question, so the kindest guide I can give is to read An Introduction to R (ships with R) or an R web tutorial of your choice so that you can learn how R works instead of posting to this list.

Cheers,
Bert




On Wed, Oct 16, 2013 at 11:55 PM, Katherine Gobin <katherine_gobin at yahoo.com> wrote:

Dear Forum,
>
>I have a data frame as 
>
>mydat = data.frame(basel_asset_class = c(2, 8, 8 ,8), defa_frequency = c(0.15, 0.07, 0.03, 0.001))
>
>> mydat
>  basel_asset_class defa_frequency
>1                 2          0.150
>2                 8          0.070
>3                 8          0.030
>4                 8          0.001
>
>
>I need to get the subset of this data.frame where the number of records for the given basel_asset_class is > 2, i.e. I need to obtain the subset of the above data.frame shown below (since there is only 1 record against basel_asset_class = 2, I want to filter it out)
>
>> mydat_a
>  basel_asset_class defa_frequency
>1                 8          0.070
>2                 8          0.030
>3                 8          0.001
>
>Kindly guide
>
>Katherine
>
>
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
>
>


-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374