[R] Odp: Finding pairs

Wed Aug 25 16:05:27 CEST 2010

Hi

well, I will add some explanation

r-help-bounces at r-project.org wrote on 25.08.2010 11:24:38:

> Dear Mr Petr PIKAL
> After reading the R code provided by you, I realized that I would have never
> figured out how this could have been done. I am going to re-read again and
> again your code to understand the logic and the commands you have provided.
> Thanks again from the heart for your kind advice.
> Regards
> Mike
> 
> --- On Wed, 25/8/10, Petr PIKAL <petr.pikal at precheza.cz> wrote:
> 
> From: Petr PIKAL <petr.pikal at precheza.cz>
> Subject: Re: [R] Odp:  Finding pairs
> To: "Mike Rhodes" <mike_simpson07 at yahoo.co.uk>
> Cc: r-help at r-project.org
> Date: Wednesday, 25 August, 2010, 9:01
> 
> Hm
> 
> r-help-bounces at r-project.org wrote on 25.08.2010 09:43:26:
> 
> > Dear Mr Petr Pikal
> > 
> > I am extremely sorry for the manner in which I raised the query. Actually
> > that was my first post to this R forum, and I was also a bit confused
> > while drafting the query, for which I owe an apology to all for consuming
> > your precious time. Perhaps I will try to redraft my query in a better way
> > as follows.
> > 
> > I have two datasets "A" and "B" containing the names of branch offices of a
> > particular bank, say XYZ plc bank. The XYZ bank has a number of main branch
> > offices (say Parent) and some small branch offices falling under the purview
> > of these main branch offices (say Child).
> > 
> > The data lists "A" and "B" consist of these main branch office names as well
> > as small branch office names. B is a subset of A and these branch names are
> > coded. Thus we have two datasets A and B as follows (again I am using only a
> > portion of a large database just to give some idea)
> > 
> > 
> > A                         B
> > 144                      
>                        ^^^^what is here in B? Empty space?, 
> > 145                       
> > 146                       
> > 147                  144                        
> 
> How do you know that 144 from B relates to 147 in A? Is it according to 
> its positions? I.e. 4th item in B belongs to 4.th item in A?
> 
> > 148                  145  
> > 
> > 149                  147
> > 151                  148
> > 
> > 
> > 
> > Now the branch 144 appears in A as well as in B, and in B it is mapped with
> > 147. This means branch 147 comes under the purview of main branch 144. Again,
> > 147 is controlling the branch 149 (since 147 also appears in B and is
> > mapped with 149 of A).
> > 
> > Similarly, branch 145 is controlling branch 148, which further controls
> > operations of bank branch 151, and likewise.
> 
> Well as you did not say anything about structure of your data
> A<-144:151
> B<-c(NA,NA,NA,144:148)
> data.frame(A,B)
>     A   B
> 1 144  NA
> 2 145  NA
> 3 146  NA
> 4 147 144
> 5 148 145
> 6 149 146
> 7 150 147
> 8 151 148
> DF<-data.frame(A,B)

This just builds a data frame with 2 columns so that there is some data to
play with.

> main<-DF$A[is.na(DF$B)]

The line above picks the codes from A whose value in B is NA, i.e. the main
branches.

> branch1<-DF[!is.na(DF$B),]

The line above is the data frame of the remaining rows (everything other than
main).

> selected.branch1<-branch1$A[branch1$B%in%main]

The line above picks the codes from column A whose B value is one of the main
codes.

> branch2<-branch1[!branch1$B%in%main,]

These are the rows that have not been selected yet.

> selected.branch2<-branch2$A[branch2$B%in%selected.branch1]

and this selects the values from column A whose B value is one of the
selected.branch1 values.
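
To make the steps above easier to follow, here they are once more as one
script that can be pasted into R. It is only a sketch on the toy data: B is
written with explicit NAs so that it has the same length as A, and the last
three lines (office1, office2, result) are my addition, not part of the
original code. They assume every branch controls at most one other branch.

```r
# Toy data: A holds all branch codes, B the controlling branch (NA = none)
A <- 144:151
B <- c(NA, NA, NA, 144:148)
DF <- data.frame(A, B)

main <- DF$A[is.na(DF$B)]                         # 144 145 146
branch1 <- DF[!is.na(DF$B), ]                     # rows that have a controller
selected.branch1 <- branch1$A[branch1$B %in% main]              # 147 148 149
branch2 <- branch1[!branch1$B %in% main, ]        # rows not selected so far
selected.branch2 <- branch2$A[branch2$B %in% selected.branch1]  # 150 151

# Addition (assumption: at most one controlled branch per code):
# match() keeps each branch on the same row as the branch controlling it.
# incomparables = NA stops an NA from matching the NA entries in B.
office1 <- DF$A[match(main, DF$B, incomparables = NA)]
office2 <- DF$A[match(office1, DF$B, incomparables = NA)]
result <- data.frame(main, office1, office2)
result
```

On this data the rows of result read 144/147/150, 145/148/151 and 146/149/NA.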

This works for this particular data, but I am not sure how it behaves with
duplicates or other irregularities. It also depends on how your data is
organised.
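
If the chains can be longer than two levels, the same lookup can be repeated
in a loop until no further controlled branch is found. The following is a
rough sketch only, using the seven codes from the redrafted question. It
assumes that the codes in A are unique and that each branch controls at most
one other branch; the names chain, parent and child are made up for the
illustration.

```r
# Sample data from the redrafted question: branch A is controlled by B
DF <- data.frame(A = c(144, 145, 146, 147, 148, 149, 151),
                 B = c(NA, NA, NA, 144, 145, 147, 148))

# Level 0: the main branches, i.e. those without a controlling branch
chain <- list(DF$A[is.na(DF$B)])

repeat {
  parent <- chain[[length(chain)]]
  # For every parent find the branch it controls (NA if there is none).
  # incomparables = NA keeps an NA parent from matching the NAs in B.
  child <- DF$A[match(parent, DF$B, incomparables = NA)]
  if (all(is.na(child))) break      # nothing left to follow
  chain[[length(chain) + 1]] <- child
}

names(chain) <- c("main", paste0("office", seq_len(length(chain) - 1)))
result <- as.data.frame(chain)
result
```

For this data result has three columns and the rows 144/147/149, 145/148/151
and 146/NA/NA. With duplicated codes or several controlled branches per code,
match() returns only the first hit, so the data would need a different
treatment.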

And since you are reading up anyway, you could also look at setdiff, merge,
perhaps the sqldf package, and the R Data Import/Export manual.
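
As an illustration of how setdiff and merge could be applied to this kind of
data, here is a small sketch on the seven codes from the redrafted question.
It is one possible way only, not a tested solution for the full database; the
object names links, leaves, mains, level1 and level2 are invented for the
example.

```r
DF <- data.frame(A = c(144, 145, 146, 147, 148, 149, 151),
                 B = c(NA, NA, NA, 144, 145, 147, 148))

# Keep only rows that describe a "controlled by" link. Dropping the NA rows
# matters because merge() lets NA keys match each other by default.
links <- DF[!is.na(DF$B), ]

# setdiff: codes that are in A but never in B control no other branch
leaves <- setdiff(DF$A, DF$B)

mains <- data.frame(main = DF$A[is.na(DF$B)])

# all.x = TRUE keeps main branches that control nothing (filled with NA)
level1 <- merge(mains, links, by.x = "main", by.y = "B", all.x = TRUE)
names(level1)[names(level1) == "A"] <- "office1"

level2 <- merge(level1, links, by.x = "office1", by.y = "B", all.x = TRUE)
names(level2)[names(level2) == "A"] <- "office2"

result <- level2[order(level2$main), c("main", "office1", "office2")]
result
```

Unlike the match() approach, merge() returns one row per link, so a branch
that controls several others would appear on several rows.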

Regards
Petr

> 
> and for cbinding your data which has uneven number of values see Jim 
> Holtman's answer to this
> 
> How to cbind DF:s with differing number of rows?
> 
> Regards
> Petr
> 
> 
> > 
> > So in the end I need an output something like -
> > 
> > Main Branch    Branch office1    Branch office2
> > 144            147               149
> > 145            148               151
> > 146            NA                NA
> > ...
> > 
> >  
> > I understand that again I have not been able to put forward my query
> > properly. But I must thank all of you for giving a patient reading to my
> > query and for replying earlier. Thanks once again.
> > 
> > With warmest regards
> > 
> > Mike 
> > 
> > 
> > --- On Wed, 25/8/10, Petr PIKAL <petr.pikal at precheza.cz> wrote:
> > 
> > From: Petr PIKAL <petr.pikal at precheza.cz>
> > Subject: Odp: [R] Finding pairs
> > To: "Mike Rhodes" <mike_simpson07 at yahoo.co.uk>
> > Cc: r-help at r-project.org
> > Date: Wednesday, 25 August, 2010, 6:39
> > 
> > Hi
> > 
> > without other details it is probably impossible to give you any reasonable
> > advice. Do you have your data already in R? What is their form? Are they
> > in 2 columns in a data frame? How did you get them paired?
> > 
> > So without some more information probably nobody will invest his time, as
> > it does not seem trivial to me.
> > 
> > Regards
> > Petr
> > 
> > r-help-bounces at r-project.org wrote on 24.08.2010 20:28:42:
> > 
> > > 
> > > 
> > > 
> > > 
> > > Dear R Helpers,
> > > 
> > > 
> > > I am a newbie and recently got introduced to R. I have a large database
> > > containing the names of bank branch offices along with other details. I am
> > > into Operational Risk as envisaged by the BASEL II Accord.
> > > 
> > > 
> > > I am trying to express my problem and I am using only indicative data,
> > > which comes in coded format.
> > > 
> > > 
> > > 
> > > 
> > > A (branch)    B (controlled by)
> > > 
> > > 144
> > > 145
> > > 146
> > > 147           144
> > > 148           145
> > > 149           147
> > > 151           146
> > > ......        .......
> > > 
> > > 
> > > where 144 etc. are branch codes in a given city and B is a subset of A.
> > > 
> > > 
> > > 
> > > 
> > > If a branch code appearing in "A" also appears in "B" (where it is paired
> > > with some other element of A, e.g. 144 appearing in A also appears in "B"
> > > and is paired with 147 of "A", and likewise), then that means 144 is
> > > controlling operations of bank office 147. Again, 147 itself appears in B
> > > and is paired with bank branch coded 149. Thus, 149 is controlled by 147 and
> > > 147 is controlled by 144. Likewise there are more than 700 hundred branch
> > > name codes available.
> > > 
> > > 
> > > My objective is to group them as follows -
> > > 
> > > 
> > > Bank Branch
> > > 
> > > 
> > > 144      147    149 
> > > 
> > > 
> > > 145
> > > 
> > > 
> > > 146       151  
> > > 
> > > 
> > > 148
> > > .....
> > > 
> > > 
> > > or even the following output will do.
> > > 
> > > 
> > > 144
> > > 147
> > > 149
> > > 
> > > 
> > > 145
> > > 
> > > 
> > > 146
> > > 151
> > > 
> > > 
> > > 148
> > > 151
> > > ......
> > > 
> > > 
> > > I understand I should be writing some R code to begin with, which I had
> > > tried also, but as of now I am helpless. Please guide me.
> > > 
> > > 
> > > Mike
> > > 
> > > 
> > > 
> > > 
> > > 
> > >    [[alternative HTML version deleted]]
> > > 
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide 
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.