[R] how do I remove entries in data frame from a vector

Rolf Turner r.turner at auckland.ac.nz
Thu Oct 22 00:46:24 CEST 2020


On Wed, 21 Oct 2020 16:15:22 -0500
Ana Marija <sokovic.anamarija using gmail.com> wrote:

> Hello,
> 
> I have a data frame with one column:
> 
> > remove
> 
>                                 V1
> 
> 1 ABAFT_g_4RWG569_BI_SNP_A10_35096
> 2 ABAFT_g_4RWG569_BI_SNP_B12_35130
> 3 ABAFT_g_4RWG569_BI_SNP_E09_35088
> 4 ABAFT_g_4RWG569_BI_SNP_E12_35136
> 5 ABAFT_g_4RWG569_BI_SNP_F11_35122
> 6 ABAFT_g_4RWG569_BI_SNP_F12_35138
> 7 ABAFT_g_4RWG569_BI_SNP_G07_35060
> 8 ABAFT_g_4RWG569_BI_SNP_G12_35140
> 
> I want to remove the 8 entries listed in the "remove" data frame from
> this vector, which looks like this:
> 
> > head(celFiles)
> 
> [1]
> "/GOKIND/75327/PhenoGenotypeFiles/RootStudyConsentSet_phs000018.GAIN_GoKinD.v2.p1.c1.DS-T1DCR-IRB/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A01_34952.CEL"
> [2]
> "/GOKIND/75327/PhenoGenotypeFiles/RootStudyConsentSet_phs000018.GAIN_GoKinD.v2.p1.c1.DS-T1DCR-IRB/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A02_34968.CEL"
> [3]
> "/GOKIND/75327/PhenoGenotypeFiles/RootStudyConsentSet_phs000018.GAIN_GoKinD.v2.p1.c1.DS-T1DCR-IRB/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A03_34984.CEL"
> [4]
> "GOKIND/75327/PhenoGenotypeFiles/RootStudyConsentSet_phs000018.GAIN_GoKinD.v2.p1.c1.DS-T1DCR-IRB/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A04_35000.CEL"
> [5]
> "/GOKIND/75327/PhenoGenotypeFiles/RootStudyConsentSet_phs000018.GAIN_GoKinD.v2.p1.c1.DS-T1DCR-IRB/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A05_35016.CEL"
> [6]
> "/GOKIND/75327/PhenoGenotypeFiles/RootStudyConsentSet_phs000018.GAIN_GoKinD.v2.p1.c1.DS-T1DCR-IRB/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A06_35032.CEL"
> ...
> 
> I tried doing this:
> 
> b= celFiles[!basename(celFiles) %in% as.character(remove$V1)]
> 
> but none of the 8 entries in the "remove" data frame have been removed.
> 
> Please advise,
> Ana

I would advise you to *look* at basename(celFiles)!!!

The entries end in ".CEL"; the names in remove$V1 do not.  So %in%
finds no matches.  Perhaps:

    b <- celFiles[!basename(celFiles) %in%
                 paste0(as.character(remove$V1),".CEL")]

Note that, for the data that you have presented, none of the entries of
celFiles "match up" with "remove" so it is *still* the case that (for
the data shown) none of the entries will be removed.  So your example
was bad.
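
A minimal sketch of the point, using made-up data. The directory
"/some/dir" and the choice of three file names are assumptions, picked so
that two entries actually match; they are not your real files. The last
step uses tools::file_path_sans_ext() as one possible alternative to
pasting on ".CEL".

```r
# Hypothetical stand-ins for the real objects; "/some/dir" is invented.
remove <- data.frame(V1 = c("ABAFT_g_4RWG569_BI_SNP_A10_35096",
                            "ABAFT_g_4RWG569_BI_SNP_B12_35130"),
                     stringsAsFactors = FALSE)
celFiles <- file.path("/some/dir",
                      c("ABAFT_g_4RWG569_BI_SNP_A01_34952.CEL",
                        "ABAFT_g_4RWG569_BI_SNP_A10_35096.CEL",
                        "ABAFT_g_4RWG569_BI_SNP_B12_35130.CEL"))

# basename() keeps the ".CEL" suffix, so nothing matches remove$V1.
print(basename(celFiles))
bad <- celFiles[!basename(celFiles) %in% remove$V1]
length(bad)    # 3: nothing was removed

# Append ".CEL" to the names before comparing.
b <- celFiles[!basename(celFiles) %in% paste0(remove$V1, ".CEL")]
length(b)      # 1: only the A01 file is left

# Alternative: strip the extension from the file names instead.
b2 <- celFiles[!tools::file_path_sans_ext(basename(celFiles)) %in% remove$V1]
identical(b, b2)
```

Either form should work as long as every file really ends in ".CEL"; if
the extensions vary in case, the comparison would need adjusting.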

cheers,

Rolf Turner

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276


