[R] Subsetting a data.frame
arun
smartpink111 at yahoo.com
Thu Oct 17 20:33:46 CEST 2013
You may try:
mydat[with(mydat,ave(seq_along(basel_asset_class),basel_asset_class,FUN=length)>2),]
# basel_asset_class defa_frequency
#2 8 0.070
#3 8 0.030
#4 8 0.001
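To see why this works, the one-liner can be unpacked step by step. This is only a sketch using the revised mydat from the question; the intermediate names n_per_class, keep and mydat_a are introduced here for illustration.

```r
# Data from the question (revised example)
mydat <- data.frame(basel_asset_class = c(4, 8, 8, 8),
                    defa_frequency = c(0.15, 0.07, 0.03, 0.001))

# ave() applies FUN within each group and returns a vector as long as
# its input, so every row receives the size of its own class group.
n_per_class <- ave(seq_along(mydat$basel_asset_class),
                   mydat$basel_asset_class, FUN = length)

# Logical index: TRUE for rows whose class occurs more than twice
keep <- n_per_class > 2

# Keep only the rows of the sufficiently large classes
mydat_a <- mydat[keep, ]
```

Because ave() returns its result in the original row order, this version does not depend on how mydat is sorted.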
#or
library(plyr)
mydat[ddply(mydat,.(basel_asset_class),mutate,L=length(defa_frequency))[,3] >2,] # assumes mydat is sorted by basel_asset_class
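If the data may not be sorted, a base R variant that counts the classes first avoids relying on row order. A minimal sketch, again using the revised mydat from the question; the names counts, big_classes and mydat_b are hypothetical.

```r
# Data from the question (revised example)
mydat <- data.frame(basel_asset_class = c(4, 8, 8, 8),
                    defa_frequency = c(0.15, 0.07, 0.03, 0.001))

# Number of records per class
counts <- table(mydat$basel_asset_class)

# Labels of the classes with more than two records
big_classes <- names(counts[counts > 2])

# Keep the rows whose class is among those labels; the match is done
# on the class value itself, so the row order of mydat does not matter.
mydat_b <- mydat[as.character(mydat$basel_asset_class) %in% big_classes, ]
```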
A.K.
On Thursday, October 17, 2013 1:59 PM, Katherine Gobin <katherine_gobin at yahoo.com> wrote:
I am sorry, perhaps I was not able to put the question properly. I am not looking for the subset of the data.frame where basel_asset_class is > 2; I agree that would have been a basic requirement. Let me try to put the question again.
I have a data frame as
mydat = data.frame(basel_asset_class = c(4, 8, 8 ,8), defa_frequency = c(0.15, 0.07, 0.03, 0.001))
# Please note I have changed the basel_asset_class to 4 from 2, to avoid confusion.
> mydat
basel_asset_class defa_frequency
1 4 0.150
2 8 0.070
3 8 0.030
4 8 0.001
This is just a representative example. In reality, I may have any number of basel asset classes. The IDs (4, 8, etc.) can be anything, so I can't hard-code it as subset(mydat, mydat$basel_asset_class > 2).
What I need is to select only those records for which there are more than two default frequencies (defa_frequency). Here there is only one default frequency (0.150) w.r.t. basel_asset_class = 4, whereas there are three default frequencies w.r.t. basel asset class 8; similarly, there could be another basel asset class having, say, 5 default frequencies. Thus, I need to take the subset of the data.frame s.t. the number of corresponding defa_frequencies is greater than 2.
The idea is that we try to fit an exponential curve Y = A exp(BX) for each of the basel asset classes. To estimate the values of A and B, mathematically one needs to have at least two values of X.
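For illustration, taking logs turns the curve into a straight line, log(Y) = log(A) + B X, so A and B can be estimated with lm(). This is only a sketch under stated assumptions: the example data has no X column, so the x and y values below are made up (generated from A = 2, B = 0.5), and the names fit_dat, A_hat and B_hat are hypothetical.

```r
# Hypothetical data: y generated exactly from A = 2, B = 0.5
fit_dat <- data.frame(x = 1:3)
fit_dat$y <- 2 * exp(0.5 * fit_dat$x)

# log(y) = log(A) + B * x is linear in x, so ordinary least squares applies
fit <- lm(log(y) ~ x, data = fit_dat)

# Back-transform the intercept to get A; the slope is B
A_hat <- exp(coef(fit)[["(Intercept)"]])
B_hat <- coef(fit)[["x"]]
```

Note that this minimises the error on the log scale, which is not the same as a least-squares fit on the original scale; it is one common choice, not the only one.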
I hope I have been able to express my requirement. It's not that I need the subset of mydat s.t. basel asset class is > 2 (now 4 in the revised example), but the subset s.t. the number of default frequencies is greater than or equal to 2. This 2 is not the same as basel asset class 2.
Kindly guide
With warm regards
Katherine Gobin
On Thursday, 17 October 2013 9:33 PM, Bert Gunter <gunter.berton at gene.com> wrote:
"Kindly guide" ...
This is a very basic question, so the kindest guide I can give is to read An Introduction to R (ships with R) or an R web tutorial of your choice so that you can learn how R works instead of posting to this list.
Cheers,
Bert
On Wed, Oct 16, 2013 at 11:55 PM, Katherine Gobin <katherine_gobin at yahoo.com> wrote:
>Dear Forum,
>
>I have a data frame as
>
>mydat = data.frame(basel_asset_class = c(2, 8, 8 ,8), defa_frequency = c(0.15, 0.07, 0.03, 0.001))
>
>> mydat
> basel_asset_class defa_frequency
>1 2 0.150
>2 8 0.070
>3 8 0.030
>4 8 0.001
>
>
>I need to get the subset of this data.frame where the number of records for the given basel_asset_class is > 2, i.e. I need to obtain the subset of the above data.frame shown below (since there is only 1 record against basel_asset_class = 2, I want to filter it out)
>
>> mydat_a
> basel_asset_class defa_frequency
>1 8 0.070
>2 8 0.030
>3 8 0.001
>
>Kindly guide
>
>Katherine
>
>
