[R] as.vector with mode="list" and POSIXct

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Wed May 22 03:44:10 CEST 2013


I recommend that you not plan on waiting for the hash package to be redesigned to meet your expectations. Also, your response to discovering this feature of the hash package seems illogical.

From a computer science perspective, the hash mechanism is an implementation trick that is intended to improve lookup speed. It does not actually represent a fundamental data structure like a vector or a set does. You can always put your keys in a vector and search through them (e.g. vector indexing by string) to get an equivalent data retrieval. If the hash package is not improving the speed of your data access, adding an extra layer of data structure is hardly an appropriate solution.
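
As a minimal sketch of that kind of vector-based lookup (the key names and timestamps here are made up for illustration), a named POSIXct vector can stand in for the hash. Indexing by string retrieves one entry, and min() over the whole vector keeps the POSIXct class:

```r
# Hypothetical keys and timestamps, for illustration only.
stamps <- as.POSIXct(c("2013-05-21 09:54:03",
                       "2013-05-21 09:54:07",
                       "2013-05-21 09:54:11"), tz = "UTC")
names(stamps) <- c("a", "b", "c")

# Lookup by key is plain string indexing; the result is still POSIXct.
one <- stamps["b"]

# Summaries work directly on the vector, with no unclassing.
earliest <- min(stamps)

# Logical indexing recovers the key(s) holding the earliest time.
first_key <- names(stamps)[stamps == earliest]
```

Nothing here depends on the hash package, so there is no values() step that could drop the class.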

Why are you not using normal vectors or data frames and accessing with string or logical indexing?

If you are avoiding vectors because they seem slow in loops, perhaps you just need to preallocate the vectors you will store your results in before your loop to regain acceptable speed. Or, perhaps the duplicated() or merge() functions could save you from this mess of incremental data processing.
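
A rough sketch of the preallocation idea, assuming the results are POSIXct values. The count and the one-minute offsets are invented for illustration; the point is only that the result vector exists at full length, already as POSIXct, before the loop fills it in:

```r
n <- 5                                   # hypothetical number of results
base_time <- as.POSIXct("2013-05-21 09:54:00", tz = "UTC")

# Preallocate a POSIXct vector of length n instead of growing it.
# The initial values are placeholders that the loop overwrites.
results <- rep(base_time, n)

for (i in seq_len(n)) {
  # Assigning a POSIXct value into a POSIXct vector keeps the class.
  results[i] <- base_time + 60 * i
}

latest <- max(results)

# When the computation allows it, the loop can be dropped entirely.
vectorized <- base_time + 60 * seq_len(n)
```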
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

Alexandre Sieira <alexandre.sieira at gmail.com> wrote:

>You are absolutely right.
>
>I am storing POSIXct objects into a hash (from the hash package).
>However, if I try to get them out as a vector using the values()
>function, they are unclassed. And that breaks my (highly vectorized)
>code. Take a look at this:
>
>
>> h = hash()
>> h[["a"]] = Sys.time()
>> str(h[["a"]])
> POSIXct[1:1], format: "2013-05-20 16:54:28"
>> str(values(h))
> Named num 1.37e+09
> - attr(*, "names")= chr "a"
>
>
>I have reported this to the hash package maintainers. In the meantime,
>however, I am storing, for each key, a list containing a single
>POSIXct. Then, when I extract all using values(), I get a list
>containing all POSIXct entries with class preserved. 
>
>
>> h = hash()
>> h[["a"]] = list( Sys.time() )
>> h[["b"]] = list( Sys.time() )
>> h[["c"]] = list( Sys.time() )
>> values(h)
>$a
>[1] "2013-05-21 09:54:03 BRT"
>
>$b
>[1] "2013-05-21 09:54:07 BRT"
>
>$c
>[1] "2013-05-21 09:54:11 BRT"
>
>> str(values(h))
>List of 3
> $ a: POSIXct[1:1], format: "2013-05-21 09:54:03"
> $ b: POSIXct[1:1], format: "2013-05-21 09:54:07"
> $ c: POSIXct[1:1], format: "2013-05-21 09:54:11"
>
>
>However, the next thing I need to do is a min() over that list, so I
>need to convert the list into a vector again.
>
>I agree completely with you that this is horrible for performance, but
>it is a temporary workaround until values() is "fixed".
>
>-- 
>Alexandre Sieira
>CISA, CISSP, ISO 27001 Lead Auditor
>
>"The truth is rarely pure and never simple."
>Oscar Wilde, The Importance of Being Earnest, 1895, Act I
>On 20 May 2013 at 19:40:14, Jeff Newmiller
>(jdnewmil at dcn.davis.ca.us) wrote:
>I don't know what you plan to do with this list, but lists are quite a
>bit less efficient than fixed-mode vectors, so you are likely losing a
>lot of computational speed by using this list. I don't hesitate to use
>simple data frames (lists of vectors), but processing lists is on par
>with for loops, not vectorized computation. It may still support a
>simpler model of computation, but that is an analyst comprehension
>benefit rather than a computational efficiency benefit.  
>
>Alexandre Sieira <alexandre.sieira at gmail.com> wrote:  
>
>>I was trying to convert a vector of POSIXct into a list of POSIXct.
>>However, I ran into a problem that I wanted to share with you.
>>  
>>Works fine with, say, numeric:  
>>  
>>  
>>> v = c(1, 2, 3)  
>>> v  
>>[1] 1 2 3  
>>> str(v)  
>> num [1:3] 1 2 3  
>>> l = as.vector(v, mode="list")  
>>> l  
>>[[1]]  
>>[1] 1  
>>  
>>[[2]]  
>>[1] 2  
>>  
>>[[3]]  
>>[1] 3  
>>  
>>> str(l)  
>>List of 3  
>> $ : num 1  
>> $ : num 2  
>> $ : num 3  
>>  
>>If you try it with POSIXct, on the other hand…  
>>  
>>  
>>> v = c(Sys.time(), Sys.time())  
>>> v  
>>[1] "2013-05-20 18:02:07 BRT" "2013-05-20 18:02:07 BRT"  
>>> str(v)  
>> POSIXct[1:2], format: "2013-05-20 18:02:07" "2013-05-20 18:02:07"  
>>> l = as.vector(v, mode="list")  
>>> l  
>>[[1]]  
>>[1] 1369083728  
>>  
>>[[2]]  
>>[1] 1369083728  
>>  
>>> str(l)  
>>List of 2  
>> $ : num 1.37e+09  
>> $ : num 1.37e+09  
>>  
>>The POSIXct values are coerced to numeric, which is unexpected.  
>>  
>>The documentation for as.vector says: "The default method handles 24  
>>input types and 12 values of type: the details of most coercions are  
>>undocumented and subject to change." It would appear that treatment for
>>POSIXct is either missing or needs adjustment.
>>  
>>unlist() (for the reverse) is documented as converting to base types, so
>>I can't complain. Just wanted to share that I ended up giving up on
>>vectorization and writing the two following functions:
>>  
>>  
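>>unlistPOSIXct <- function(x) {
>>  # preallocate a POSIXct vector, then overwrite each slot
>>  retval = rep(Sys.time(), length(x))
>>  for (i in seq_along(x)) retval[i] = x[[i]]
>>  return(retval)
>>}
>>
>>listPOSIXct <- function(x) {
>>  # preallocate the list; x[i] keeps the POSIXct class
>>  retval = vector("list", length(x))
>>  for (i in seq_along(x)) retval[[i]] = x[i]
>>  return(retval)
>>}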
>>  
>>Is there a better way to do this (other than using *apply instead of  
>>for above) that better leverages vectorization? Am I missing something
>>here?
>>  
>>Thanks!  
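
On the question of a better way: as far as I know, base R already has class-preserving conversions in both directions. as.list() dispatches to a POSIXct method, and do.call(c, ...) combines a list of POSIXct values through the POSIXct method of c(). A minimal sketch with made-up timestamps, assuming a reasonably current R:

```r
v <- as.POSIXct(c("2013-05-20 18:02:07", "2013-05-20 18:05:00"), tz = "UTC")

# Vector -> list: as.list() keeps each element as POSIXct,
# unlike as.vector(v, mode = "list").
l <- as.list(v)

# List -> vector: c() on POSIXct arguments returns POSIXct,
# unlike unlist(l), which drops the class.
v2 <- do.call(c, l)

# min() then works on the vector without any manual re-classing.
earliest <- min(v2)
```

Depending on the R version, c() may drop the tzone attribute, so the printed time zone of v2 can differ from v even though the instants are the same.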