[R] problem for strsplit function

Jeff Newmiller jdnewm|| @end|ng |rom dcn@d@v|@@c@@u@
Fri Jul 9 23:51:16 CEST 2021


"Strictly speaking", Greg is correct, Bert.

https://cran.r-project.org/doc/manuals/r-release/R-lang.html#List-objects

Lists in R are vectors. What we colloquially refer to as "vectors" are more precisely referred to as "atomic vectors". And without a doubt, this "vector" nature of lists is a key underlying concept that explains why adding a dim attribute creates a matrix that can hold data frames. It is also a stumbling block for programmers from other languages that have things like linked lists.
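
A minimal sketch of the point above, using made-up objects (x, l, df_a, df_b and m are illustrative names, not anything from this thread). It shows that a list is a vector but not an atomic one, and that giving a list a dim attribute produces a matrix whose cells can hold data frames. It relies only on base R.

```r
# Atomic vectors and lists are both "vectors" in R's sense.
x <- c(1.5, 2, 3)          # atomic (double) vector
l <- list(1, "a", TRUE)    # list, i.e. a generic vector

is.vector(x)   # TRUE
is.vector(l)   # TRUE
is.atomic(x)   # TRUE
is.atomic(l)   # FALSE

# Hypothetical data frames, used only to fill the cells of the matrix.
df_a <- data.frame(id = 1:2, y = c(10, 20))
df_b <- data.frame(id = 1:3, y = c(1, 2, 3))

# A list with a dim attribute is a matrix; cells are filled column by column.
m <- list(df_a, df_b, "text", 1:5)
dim(m) <- c(2, 2)

is.matrix(m)     # TRUE
is.list(m)       # TRUE
m[[2, 1]]        # the df_b data frame
is.vector(m)     # FALSE, because of the dim attribute
```

The last line is worth noting: is.vector() returns TRUE only for objects with no attributes other than names, which is also why is.vector(v3) is FALSE in Bert's example even though a data frame is a list.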

On July 9, 2021 2:36:19 PM PDT, Bert Gunter <bgunter.4567 using gmail.com> wrote:
>"1.  a column, when extracted from a data frame, *is* a vector."
>Strictly speaking, this is false; it depends on exactly what is meant
>by "extracted." e.g.:
>
>> d <- data.frame(col1 = 1:3, col2 = letters[1:3])
>> v1 <- d[,2] ## a vector
>> v2 <- d[[2]] ## the same, i.e.:
>> identical(v1,v2)
>[1] TRUE
>> v3 <- d[2] ## a data.frame
>> v1
>[1] "a" "b" "c"  ## a character vector
>> v3
>  col2
>1    a
>2    b
>3    c
>> is.vector(v1)
>[1] TRUE
>> is.vector(v3)
>[1] FALSE
>> class(v3)  ## data.frame
>[1] "data.frame"
>## but
>> is.list(v3)
>[1] TRUE
>
>which is simply explained in ?data.frame (where else?!) by:
>"A data frame is a **list** [emphasis added] of variables of the same
>number of rows with unique row names, given class "data.frame". If no
>variables are included, the row names determine the number of rows."
>
>"2.  maybe your question is "is a given function for a vector, or for a
>    data frame/matrix/array?".  if so, i think the only way is reading
>    the help information (?foo)."
>
>Indeed! Is this not what the Help system is for?! But note also that
>the S3 class system may somewhat blur the issue: foo() may work
>appropriately and differently for different (S3) classes of objects. A
>detailed explanation of this behavior can be found in appropriate
>resources or (more tersely) via ?UseMethod .
>
>"you might find reading ?"[" and  ?"[.data.frame" useful"
>
>Not just "useful" -- **essential** if you want to work in R, unless
>one gets this information via any of the numerous online tutorials,
>courses, or books that are available. The Help system is accurate and
>authoritative, but terse. I happen to like this mode of documentation,
>but others may prefer more extended expositions. I stand by this claim
>even if one chooses to use the "Tidyverse", data.table package, or
>other alternative frameworks for handling data. Again, others may
>disagree, but R is structured around these basics, and imo one remains
>ignorant of them at their peril.
>
>Cheers,
>Bert
>
>
>Bert Gunter
>
>"The trouble with having an open mind is that people keep coming along
>and sticking things into it."
>-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>On Fri, Jul 9, 2021 at 11:57 AM Greg Minshall <minshall using umich.edu>
>wrote:
>>
>> Kai,
>>
>> > one more question, how can I know if the function is for column
>> > manipulations or for vector?
>>
>> i still stumble around R code.  but, i'd say the following (and look
>> forward to being corrected! :):
>>
>> 1.  a column, when extracted from a data frame, *is* a vector.
>>
>> 2.  maybe your question is "is a given function for a vector, or for a
>>     data frame/matrix/array?".  if so, i think the only way is reading
>>     the help information (?foo).
>>
>> 3.  sometimes, extracting the column as a vector from a data frame-like
>>     object might be non-intuitive.  you might find reading ?"[" and
>>     ?"[.data.frame" useful (as well as ?"[.data.table" if you use that
>>     package).  also, the str() command can be helpful in understanding
>>     what is happening.  (the lobstr:: package's sxp() function, as well
>>     as more verbose .Internal(inspect()) can also give you insight.)
>>
>>     with the data.table:: package, for example, if "DT" is a data.table
>>     object, with "x2" as a column, adding or leaving off quotation marks
>>     for the column name can make all the difference between ending up
>>     with a vector, or with a (much reduced) data table:
>> ----
>> > is.vector(DT[, x2])
>> [1] TRUE
>> > str(DT[, x2])
>>  num [1:9] 32 32 32 32 32 32 32 32 32
>> >
>> > is.vector(DT[, "x2"])
>> [1] FALSE
>> > str(DT[, "x2"])
>> Classes ‘data.table’ and 'data.frame':  9 obs. of  1 variable:
>>  $ x2: num  32 32 32 32 32 32 32 32 32
>>  - attr(*, ".internal.selfref")=<externalptr>
>> ----
>>
>>     a second level of indexing may or may not help, mostly depending on
>>     the use of '[' versus '[['.  this can sometimes cause confusion
>>     when you are learning the language.
>> ----
>> > str(DT[, "x2"][1])
>> Classes ‘data.table’ and 'data.frame':  1 obs. of  1 variable:
>>  $ x2: num 32
>>  - attr(*, ".internal.selfref")=<externalptr>
>> > str(DT[, "x2"][[1]])
>>  num [1:9] 32 32 32 32 32 32 32 32 32
>> ----
>>
>>     the tibble:: package (used in, e.g., the dplyr:: package) also
>>     (always?) returns a single column as a non-vector.  again, a
>>     second indexing with double '[[]]' can produce a vector.
>> ----
>> > DP <- tibble(DT)
>> > is.vector(DP[, "x2"])
>> [1] FALSE
>> > is.vector(DP[, "x2"][[1]])
>> [1] TRUE
>> ----
>>
>>     but, note that a list of lists is also a vector:
>> > is.vector(list(list(1), list(1,2,3)))
>> [1] TRUE
>> > str(list(list(1), list(1,2,3)))
>> List of 2
>>  $ :List of 1
>>   ..$ : num 1
>>  $ :List of 3
>>   ..$ : num 1
>>   ..$ : num 2
>>   ..$ : num 3
>>
>>     etc.
>>
>> hth.  good luck learning!
>>
>> cheers, Greg
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Sent from my phone. Please excuse my brevity.


