[R] Removing variables from data frame with a wild card

Andrew Simmons <akwsimmo using gmail.com>
Sun Feb 12 23:30:15 CET 2023


drop = FALSE means that if the indexing selects exactly one column, the
result is still a data frame with one column, rather than the object stored
in that column. It's usually not necessary, but I've messed up some data
before by assuming the indexing always returns a data frame when it doesn't,
so drop = FALSE lets me be sure that I will always get a data frame.

```
x <- data.frame(V1 = 1:5, V2 = letters[1:5])
x[, "V2"]
x[, "V2", drop = FALSE]
```

You'll notice that the first returns a character vector, a through e, whereas
the second returns a data frame with one column whose contents are that same
character vector.
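
Here is a rough sketch of why this matters for the grepl() line quoted
below: if only one column survives the filter, leaving out drop = FALSE
quietly hands back a bare vector. The data frame d and its column names are
made up for illustration.

```
# hypothetical data: "id" is the only column that does not start with "yr"
d <- data.frame(id = 1:3, yr3 = c(0, 1, 0), yr4 = c(1, 0, 0))
keep <- !grepl("^yr", colnames(d))

a <- d[, keep]                # one column left, so it is dropped to a vector
b <- d[, keep, drop = FALSE]  # still a data frame with one column

is.data.frame(a)  # FALSE
is.data.frame(b)  # TRUE
ncol(a)           # NULL, because a plain vector has no dim
ncol(b)           # 1
```

Any later code that calls ncol(), colnames() or $ on the result would then
behave differently depending on how many columns happened to be left.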

You could alternatively use

x["V2"]

which should be identical to x[, "V2", drop = FALSE], but some people don't
like that because it doesn't look like matrix indexing anymore.
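
If you want to check that for yourself, something like this should do it
(x is redefined so the snippet runs on its own):

```
x <- data.frame(V1 = 1:5, V2 = letters[1:5])

# list-style indexing versus matrix-style indexing with drop = FALSE
identical(x["V2"], x[, "V2", drop = FALSE])  # TRUE

# for comparison, both of these extract the column itself
identical(x[["V2"]], x[, "V2"])              # TRUE
```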


On Sun, Feb 12, 2023, 17:18 Steven T. Yen <styen using ntu.edu.tw> wrote:

> In the line suggested by Andrew Simmons,
>
> mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
>
> what does drop=FALSE do? Thanks.
>
> On 1/14/2023 8:48 PM, Steven Yen wrote:
>
> Thanks to all. Very helpful.
>
> Steven from iPhone
>
> On Jan 14, 2023, at 3:08 PM, Andrew Simmons <akwsimmo using gmail.com> wrote:
>
> You'll want to use grep() or grepl(). By default, grep() uses extended
> regular expressions to find matches, but you can also use perl regular
> expressions and globbing (after converting to a regular expression).
> For example:
>
> grepl("^yr", colnames(mydata))
>
> will tell you which 'colnames' start with "yr". If you'd rather use
> globbing:
>
> grepl(glob2rx("yr*"), colnames(mydata))
>
> Then you might write something like this to remove the columns starting
> with yr:
>
> mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
>
> On Sat, Jan 14, 2023 at 1:56 AM Steven T. Yen <styen using ntu.edu.tw> wrote:
>
>
> I have a data frame containing variables "yr3",...,"yr28".
>
>
> How do I remove them with a wild card, something similar to "del yr*"
> in Windows/DOS? Thank you.
>
>
> colnames(mydata)
>   [1] "year"       "weight"     "confeduc"   "confothr"   "college"
>   [6] ...
>  [41] "yr3"        "yr4"        "yr5"        "yr6"        "yr7"
>  [46] "yr8"        "yr9"        "yr10"       "yr11"       "yr12"
>  [51] "yr13"       "yr14"       "yr15"       "yr16"       "yr17"
>  [56] "yr18"       "yr19"       "yr20"       "yr21"       "yr22"
>  [61] "yr23"       "yr24"       "yr25"       "yr26"       "yr27"
>  [66] "yr28"...