[R] problem for strsplit function

Jeff Newmiller
Fri Jul 9 23:51:16 CEST 2021

"Strictly speaking", Greg is correct, Bert.


Lists in R are vectors. What we colloquially refer to as "vectors" are more precisely referred to as "atomic vectors". And without a doubt, this "vector" nature of lists is a key underlying concept that explains why adding a dim attribute creates a matrix that can hold data frames. It is also a stumbling block for programmers from other languages that have things like linked lists.
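
A minimal base-R sketch of both points (the object names x and m are made up for illustration):

```r
# A list is a vector in R's sense; it is just not an *atomic* vector.
x <- list(1L, "a", TRUE)
is.vector(x)   # TRUE
is.atomic(x)   # FALSE

# Giving a list a dim attribute turns it into a matrix whose cells can
# hold anything, including whole data frames.
m <- list(data.frame(a = 1:2), data.frame(b = 3:4), letters, 1:3)
dim(m) <- c(2, 2)
is.matrix(m)       # TRUE
is.list(m)         # TRUE
class(m[[1, 1]])   # "data.frame"
is.vector(m)       # FALSE: is.vector() rejects attributes other than names
```

The last line is worth noting: is.vector() returns FALSE as soon as an object carries any attribute other than names, which is also why it returns FALSE for a data frame even though a data frame is a list.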

On July 9, 2021 2:36:19 PM PDT, Bert Gunter <bgunter.4567 using gmail.com> wrote:
>"1.  a column, when extracted from a data frame, *is* a vector."
>Strictly speaking, this is false; it depends on exactly what is meant
>by "extracted." e.g.:
>> d <- data.frame(col1 = 1:3, col2 = letters[1:3])
>> v1 <- d[,2] ## a vector
>> v2 <- d[[2]] ## the same, i.e.
>> identical(v1,v2)
>[1] TRUE
>> v3 <- d[2] ## a data.frame
>> v1
>[1] "a" "b" "c"  ## a character vector
>> v3
>  col2
>1    a
>2    b
>3    c
>> is.vector(v1)
>[1] TRUE
>> is.vector(v3)
>[1] FALSE
>> class(v3)  ## data.frame
>[1] "data.frame"
>## but
>> is.list(v3)
>[1] TRUE
>which is simply explained in ?data.frame (where else?!) by:
>"A data frame is a **list** [emphasis added] of variables of the same
>number of rows with unique row names, given class "data.frame". If no
>variables are included, the row names determine the number of rows."
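
A short sketch of what "a data frame is a list" means in practice, assuming a small data frame like the one quoted above (stringsAsFactors = FALSE is set explicitly so the result does not depend on the R version; col_classes and plain are illustrative names):

```r
d <- data.frame(col1 = 1:3, col2 = letters[1:3], stringsAsFactors = FALSE)

is.list(d)   # TRUE
length(d)    # 2, the number of columns (list elements)
names(d)     # "col1" "col2"

# List tools therefore work column by column.
col_classes <- vapply(d, function(col) class(col)[1], character(1))
col_classes  # col1: "integer", col2: "character"

# Stripping the class leaves a plain named list plus a row.names attribute.
plain <- unclass(d)
is.data.frame(plain)   # FALSE
is.list(plain)         # TRUE
```
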
>"2.  maybe your question is "is a given function for a vector, or for a
>    data frame/matrix/array?".  if so, i think the only way is reading
>    the help information (?foo)."
>Indeed! Is this not what the Help system is for?! But note also that
>the S3 class system may somewhat blur the issue: foo() may work
>appropriately and differently for different (S3) classes of objects. A
>detailed explanation of this behavior can be found in appropriate
>resources or (more tersely) via ?UseMethod .
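
As a rough illustration of S3 dispatch, here is a sketch with a hypothetical generic called describe() (not a real R function; it and its methods exist only for this example):

```r
# Hypothetical generic: UseMethod() picks a method based on class(x).
describe <- function(x, ...) UseMethod("describe")

describe.default <- function(x, ...) "something else"
describe.character <- function(x, ...) {
  paste("character vector of length", length(x))
}
describe.data.frame <- function(x, ...) {
  paste("data frame with", ncol(x), "column(s)")
}

d <- data.frame(col1 = 1:3, col2 = letters[1:3], stringsAsFactors = FALSE)

describe(d[[2]])   # "character vector of length 3"
describe(d[2])     # "data frame with 1 column(s)"
describe(1:3)      # "something else"
```

The same call, describe(), behaves differently depending on whether the column was extracted with '[[' or with '[', which is the kind of blurring described above.
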
>"you might find reading ?"[" and  ?"[.data.frame" useful"
>Not just "useful" -- **essential** if you want to work in R, unless
>one gets this information via any of the numerous online tutorials,
>courses, or books that are available. The Help system is accurate and
>authoritative, but terse. I happen to like this mode of documentation,
>but others may prefer more extended expositions. I stand by this claim
>even if one chooses to use the "Tidyverse", data.table package, or
>other alternative frameworks for handling data. Again, others may
>disagree, but R is structured around these basics, and imo one remains
>ignorant of them at their peril.
>Bert Gunter
>"The trouble with having an open mind is that people keep coming along
>and sticking things into it."
>-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>On Fri, Jul 9, 2021 at 11:57 AM Greg Minshall <minshall using umich.edu> wrote:
>> Kai,
>> > one more question, how can I know if the function is for column
>> > manipulations or for vector?
>> i still stumble around R code.  but, i'd say the following (and look
>> forward to being corrected! :):
>> 1.  a column, when extracted from a data frame, *is* a vector.
>> 2.  maybe your question is "is a given function for a vector, or for a
>>     data frame/matrix/array?".  if so, i think the only way is reading
>>     the help information (?foo).
>> 3.  sometimes, extracting the column as a vector from a data
>>     object might be non-intuitive.  you might find reading ?"[" and
>>     ?"[.data.frame" useful (as well as ?"[.data.table" if you use that
>>     package).  also, the str() command can be helpful in seeing
>>     what is happening.  (the lobstr:: package's sxp() function, as well
>>     as more verbose .Internal(inspect()) can also give you insight.)
>>     with the data.table:: package, for example, if "DT" is a data.table
>>     object, with "x2" as a column, adding or leaving off quotation marks
>>     for the column name can make all the difference between ending up
>>     with a vector, or with a (much reduced) data table:
>> ----
>> > is.vector(DT[, x2])
>> [1] TRUE
>> > str(DT[, x2])
>>  num [1:9] 32 32 32 32 32 32 32 32 32
>> >
>> > is.vector(DT[, "x2"])
>> [1] FALSE
>> > str(DT[, "x2"])
>> Classes ‘data.table’ and 'data.frame':  9 obs. of  1 variable:
>>  $ x2: num  32 32 32 32 32 32 32 32 32
>>  - attr(*, ".internal.selfref")=<externalptr>
>> ----
>>     a second level of indexing may or may not help, mostly depending on
>>     the use of '[' versus '[['.  this can sometimes cause confusion
>>     when you are learning the language.
>> ----
>> > str(DT[, "x2"][1])
>> Classes ‘data.table’ and 'data.frame':  1 obs. of  1 variable:
>>  $ x2: num 32
>>  - attr(*, ".internal.selfref")=<externalptr>
>> > str(DT[, "x2"][[1]])
>>  num [1:9] 32 32 32 32 32 32 32 32 32
>> ----
>>     the tibble:: package (used in, e.g., the dplyr:: package) also
>>     (always?) returns a single column as a non-vector.  again, a
>>     second indexing with double '[[]]' can produce a vector.
>> ----
>> > DP <- tibble(DT)
>> > is.vector(DP[, "x2"])
>> [1] FALSE
>> > is.vector(DP[, "x2"][[1]])
>> [1] TRUE
>> ----
>>     but, note that a list of lists is also a vector:
>> > is.vector(list(list(1), list(1,2,3)))
>> [1] TRUE
>> > str(list(list(1), list(1,2,3)))
>> List of 2
>>  $ :List of 1
>>   ..$ : num 1
>>  $ :List of 3
>>   ..$ : num 1
>>   ..$ : num 2
>>   ..$ : num 3
>>     etc.
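
For comparison, the same '[' versus '[[' distinction can be sketched with a plain base-R data frame, with no data.table or tibble involved (d and the column names here are invented for the example):

```r
d <- data.frame(x1 = 1:3, x2 = c(32, 32, 32))

a <- d[, "x2"]                 # one column, drop = TRUE by default: a vector
b <- d[, "x2", drop = FALSE]   # keeps the data frame structure
e <- d[["x2"]]                 # '[[' extracts the column itself

is.vector(a)       # TRUE
is.vector(b)       # FALSE
is.data.frame(b)   # TRUE
identical(a, e)    # TRUE

# A second '[[' recovers the vector from the one-column data frame.
identical(b[[1]], a)   # TRUE
```

str() on each of these objects shows the difference at a glance.
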
>> hth.  good luck learning!
>> cheers, Greg
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> and provide commented, minimal, self-contained, reproducible code.

Sent from my phone. Please excuse my brevity.
