[R] problem for strsplit function

Greg Minshall m|n@h@|| @end|ng |rom um|ch@edu
Fri Jul 9 20:56:38 CEST 2021


Kai,

> one more question, how can I know if the function is for column
> manipulations or for vector?

i still stumble around R code.  but, i'd say the following (and look
forward to being corrected! :):

1.  a column, when extracted from a data frame, *is* a vector.

2.  maybe your question is "is a given function for a vector, or for a
    data frame/matrix/array?".  if so, i think the only way is reading
    the help information (?foo).

3.  sometimes, extracting the column as a vector from a data frame-like
    object might be non-intuitive.  you might find reading ?"[" and
    ?"[.data.frame" useful (as well as ?"[.data.table" if you use that
    package).  also, the str() command can be helpful in understanding
    what is happening.  (the lobstr:: package's sxp() function, as well
    as the more verbose .Internal(inspect(x)), can also give you insight.)

    with the data.table:: package, for example, if "DT" is a data.table
    object, with "x2" as a column, adding or leaving off quotation marks
    for the column name can make all the difference between ending up
    with a vector, or with a (much reduced) data table:
----
> is.vector(DT[, x2])
[1] TRUE
> str(DT[, x2])
 num [1:9] 32 32 32 32 32 32 32 32 32
>
> is.vector(DT[, "x2"])
[1] FALSE
> str(DT[, "x2"])
Classes ‘data.table’ and 'data.frame':  9 obs. of  1 variable:
 $ x2: num  32 32 32 32 32 32 32 32 32
 - attr(*, ".internal.selfref")=<externalptr>
----

    a second level of indexing may or may not help, mostly depending on
    the use of '[' versus '[['.  this can sometimes cause confusion
    when you are learning the language.
----
> str(DT[, "x2"][1])
Classes ‘data.table’ and 'data.frame':  1 obs. of  1 variable:
 $ x2: num 32
 - attr(*, ".internal.selfref")=<externalptr>
> str(DT[, "x2"][[1]])
 num [1:9] 32 32 32 32 32 32 32 32 32
----

    the tibble:: package (used in, e.g., the dplyr:: package) also
    (always?) returns a single column as a non-vector.  again, a
    second indexing with double '[[]]' can produce a vector.
----
> DP <- tibble(DT)
> is.vector(DP[, "x2"])
[1] FALSE
> is.vector(DP[, "x2"][[1]])
[1] TRUE
----

    but, note that a list of lists is also a vector:
----
> is.vector(list(list(1), list(1,2,3)))
[1] TRUE
> str(list(list(1), list(1,2,3)))
List of 2
 $ :List of 1
  ..$ : num 1
 $ :List of 3
  ..$ : num 1
  ..$ : num 2
  ..$ : num 3
----

    etc.
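
for comparison, here is a small base-R sketch of the same '[' versus
'[[' distinction on a plain data frame.  the data frame "df" and its
columns are made up for illustration (they are not Kai's data), and the
factor at the end is just my own example of why is.vector() alone can
mislead: it is FALSE for anything carrying attributes other than names.

```r
# hypothetical data frame, invented for this illustration
df <- data.frame(x1 = letters[1:3], x2 = c(32, 32, 32))

# three ways that give back the column itself (a plain numeric vector)
v1 <- df$x2
v2 <- df[["x2"]]
v3 <- df[, "x2"]                 # base data frames drop a single column by default

# two ways that keep the data frame wrapper
d1 <- df["x2"]
d2 <- df[, "x2", drop = FALSE]

print(is.vector(v1))             # TRUE
print(is.vector(v3))             # TRUE
print(is.data.frame(d1))         # TRUE
print(is.vector(d2))             # FALSE

# is.vector() is strict about attributes: a factor is atomic, but it
# carries "levels" and "class", so is.vector() says FALSE
f <- factor(c("a", "b"))
print(is.vector(f))              # FALSE
print(is.atomic(f))              # TRUE
```

so, on a plain data frame, df[, "x2"] behaves like DT[, x2] above
rather than like DT[, "x2"]; str() on the result remains the quickest
way to check what you actually got.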

hth.  good luck learning!

cheers, Greg


