[R] Problem with filling dataframe's column

Richard O'Keefe r@oknz @end|ng |rom gm@||@com
Thu Jun 15 04:34:20 CEST 2023


Consider

  m <- list(foo=c(1,2),"B'ar"=matrix(1:4,2,2),"!*#"=c(FALSE,TRUE))

It is a collection of elements of different types/structures, accessible
via string keys (and also by position).  Entries can be added:

  m[["fred"]] <- 47

Entries can be removed:

  m[["!*#"]] <- NULL

How much more like a Python dictionary do you need it to be?
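
To make the comparison concrete, here is a minimal sketch of the usual
dictionary operations on a named list. The list contents and key names are
arbitrary illustrations, and matrix() is used so that the second entry is
really 2 by 2.

```r
# Illustrative named list; the keys and values are arbitrary.
m <- list(foo = c(1, 2), "B'ar" = matrix(1:4, 2, 2), "!*#" = c(FALSE, TRUE))

m[["fred"]] <- 47        # add an entry under a new key
m[["!*#"]]  <- NULL      # remove an entry

keys     <- names(m)               # the keys, in insertion order
has_fred <- "fred" %in% names(m)   # membership test
cell     <- m[["B'ar"]][2, 1]      # look up by key, then index into the value
first    <- m[[1]]                 # positional access still works
missing  <- m[["no such key"]]     # an absent key gives NULL, not an error

# Walk over the key/value pairs.
for (k in names(m)) cat(k, ":", class(m[[k]])[1], "\n")
```

One difference from Python worth knowing: assigning NULL with [[<- deletes
the entry, so a NULL value has to be stored as m["key"] <- list(NULL).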



On Wed, 14 Jun 2023 at 11:25, <avi.e.gross using gmail.com> wrote:

> Bert,
>
> I stand corrected. What I said may have once been true but apparently the
> implementation seems to have changed at some level.
>
> I did not factor that in.
>
> Nevertheless, whether you use an index as a key or as an offset into an
> attached vector of labels, it seems to work the same, and I think my comment
> applies well enough that changing a few labels instead of scanning lots of
> entries can sometimes be a good thing. As far as I can tell, the external
> interface seems the same for now.
>
> One issue with R for a long time was how they did not do something more
> like a Python dictionary and it looks like …
>
> ABOVE
>
> From: Bert Gunter <bgunter.4567 using gmail.com>
> Sent: Tuesday, June 13, 2023 6:15 PM
> To: avi.e.gross using gmail.com
> Cc: javad bayat <j.bayat194 using gmail.com>; R-help using r-project.org
> Subject: Re: [R] Problem with filling dataframe's column
>
> Below.
>
>
> On Tue, Jun 13, 2023 at 2:18 PM <avi.e.gross using gmail.com> wrote:
> >
> >
> > Javad,
> >
> > There may be nothing wrong with the methods people are showing you and
> if it satisfied you, great.
> >
> > But I note you have lots of data in over a quarter million rows. If much
> of the text data is redundant, and you want to simplify some operations
> such as changing some of the values to others in multiple ways, have you
> done any learning about an R feature very useful for dealing with
> categorical data called "factors"?
> >
> > If you have a vector or a column in a data.frame that contains text,
> then it can be replaced by a factor that often takes way less space as it
> stores a sort of dictionary of all the unique values and just records
> numbers like 1,2,3 to tell which one each item is.
>
> -- This is false. It used to be true a **long time ago**, but R has for
> quite a while used hashing/global string tables to avoid this problem. See
> here <
> https://stackoverflow.com/questions/50310092/why-does-r-use-factors-to-store-characters>
> for details/references.
> As a result, I think many would argue that working with strings *as
> strings,* not factors, is often a better default, though of course there
> are still situations where factors are useful (e.g. in ordering results by
> factor levels where the desired level order is not alphabetical).
>
> **I would appreciate correction/ clarification if my claims are wrong or
> misleading! **
>
> In any case, please do check such claims before making them on this list.
>
> Cheers,
> Bert