[R] how to merge 5 data frames by one column

David Winsemius
Tue Dec 3 21:42:29 CET 2019


On 12/3/19 12:16 PM, Ana Marija wrote:
> would this make sense for the previous:
> mt=na.omit(m, cols = c("V1.1","V1.2","V1.3","V1.4","V1.5"))
>
> On Tue, Dec 3, 2019 at 2:09 PM Ana Marija <sokovic.anamarija using gmail.com>
> wrote:
>
>> I can perhaps do this:
>>
>> m=Reduce(function(x, y) merge(x, y, all=TRUE), list(s11, s22, s33,s44,s55))
>>
>> but then the output for one SNP (just as an example) looks like this:
>>
>>> head(m)
>>          rs            V1.1        V3.1     V4.1 V1.2 V3.2 V4.2            V1.3
>> 6 rs1029829 ENSG00000154803 1.02519e-11 0.469402 <NA>   NA   NA ENSG00000141030
>>          V3.3     V4.3 V1.4 V3.4 V4.4 V1.5 V3.5 V4.5
>> 6 3.06126e-28 0.726948 <NA>   NA   NA <NA>   NA   NA


It's a very simple matter when using gmail to adhere to the Posting
Guide policy of plaintext submission to rhelp. Failing to adhere to that
rule is making your successive postings less and less readable.

>> ...
>>
>> but how to filter out this output (m) in order to remove all rows where I
>> have NA in any of these columns: V1.1,V1.2,V1.3,V1.4,V1.5

The complete.cases function returns a logical vector suitable for 
selecting a subset.
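
A minimal sketch of that approach, using a small made-up stand-in for the
merged data frame m (the gene labels and values below are invented for
illustration; the column names follow the V1.1 ... V1.5 pattern from the
question). As far as I know, the cols argument belongs to the data.table
method for na.omit; the data.frame method in base R takes no such argument
and would drop rows with an NA in any column, so restricting
complete.cases to the columns of interest is the safer route here.

```r
# Toy stand-in for the merged data frame 'm' (values are made up)
m <- data.frame(
  rs   = c("rs1", "rs2", "rs3"),
  V1.1 = c("geneA", "geneA", NA),
  V1.2 = c("geneB", "geneB", "geneB"),
  V1.3 = c("geneC", NA, "geneC"),
  V1.4 = c("geneD", "geneD", "geneD"),
  V1.5 = c("geneE", "geneE", "geneE"),
  V4.1 = c(NA, 0.2, 0.3),   # an NA outside the columns being checked
  stringsAsFactors = FALSE
)

# Columns that must all be non-missing: V1.1, V1.2, ..., V1.5
keep_cols <- paste0("V1.", 1:5)

# TRUE for rows with no NA in any of the selected columns
ok <- complete.cases(m[keep_cols])

# Keep only those rows; all other columns are carried along unchanged
mt <- m[ok, ]
```

Only the row for rs1 survives in mt, even though its V4.1 value is NA,
because V4.1 is not among the columns passed to complete.cases.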


-- 

David.

>> On Tue, Dec 3, 2019 at 1:48 PM Ana Marija <sokovic.anamarija using gmail.com>
>> wrote:
>>
>>> The desired output would look like this (the example shows just two genes;
>>> it should include all 5 genes from all 5 data frames):
>>>
>>> The example assumes that only 5 rs IDs are shared between those two genes;
>>> the values after each rs ID come from the V4 column for each gene.
>>>
>>> GENES ENSG00000001629 ENSG00000127914
>>> rs1208998 -0.0337989326337439  -0.00106024397995199
>>> rs4729008 0.0630831868839983  0.00890783698397027
>>> rs11772754 0.181375539335959  0.0012636115921931
>>> rs10257459 0.0369962603988132  0.00509887844657462
>>> rs17164876 0.0307882763321834  -0.00188979524322732
>>>
>>> On Tue, Dec 3, 2019 at 1:40 PM Ana Marija <sokovic.anamarija using gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I have 5 dataframes (s11,s22,s33,s44,s55) that look like this:
>>>>
>>>>> head(s11)
>>>>              V1.1          rs        V3.1      V4.1
>>>> 1 ENSG00000154803  rs12940868 3.80175e-05 -0.519565
>>>> 2 ENSG00000154803   rs4383187 8.92772e-05 -0.367303
>>>> 3 ENSG00000154803   rs4404112 9.32402e-05 -0.366634
>>>> 4 ENSG00000154803   rs7214091 8.38003e-05  0.337576
>>>> 5 ENSG00000154803  rs35871790 9.67028e-05 -0.305755
>>>> 6 ENSG00000154803 rs112532541 1.08341e-04 -0.305493
>>>>
>>>>> head(s22)
>>>>                V1.2          rs        V3.2      V4.2
>>>> 602 ENSG00000264589  rs62065452 1.34475e-17 -0.695948
>>>> 603 ENSG00000264589 rs377004743 1.26272e-17 -0.695627
>>>> 630 ENSG00000264589   rs1724390 1.01129e-17 -0.693518
>>>> 643 ENSG00000264589 rs367637729 4.05726e-17 -0.682833
>>>> 653 ENSG00000264589 rs376183404 1.13177e-17 -0.697646
>>>> 673 ENSG00000264589 rs112327620 1.59840e-17 -0.707904
>>>>
>>>> Each data frame has a single unique value in its respective V1 column.
>>>>
>>>> I am trying to merge all 5 data frames at once by the "rs" column.
>>>>
>>>> Can you please help with this,
>>>> Ana