[R] [External] Re: R 3.6.1 and apcluster package

Tierney, Luke luke-tierney at uiowa.edu
Thu Jul 18 23:39:06 CEST 2019


On Thu, 18 Jul 2019, Jan Galkowski wrote:

> I have confirmed that a complete workaround to these problems is available if, as Bill Dunlap suggested, "version=2" is used in all *save* incantations.

That will mask this particular symptom, but the real problem is that
the C++ code in the package is mutating an object that it should
duplicate first. I'm not sure if the problem should be addressed in
the package or in Rcpp, but one of the two should be fixed.

The specific problem is in the source file
apcluster/src/aggExClusterC.cpp lines 160-161:

         IntegerVector newClust = concat(actClust[I], actClust[J]);
         newClust.names() = CharacterVector(newClust);

The call CharacterVector(newClust) produces a deferred string object
for newClust and marks newClust as immutable, but the assignment
ignores that and mutates newClust anyway. I have CC'd the maintainers
of apcluster and Rcpp.
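
The explanation above can be illustrated with a toy model in plain C++.
This is only a sketch, not Rcpp or R internals: IntVec, DeferredStrings,
set_names_in_place, set_names_from_copy and walk_terminates are all
hypothetical names made up for the illustration. It assumes the failure
arises because the deferred strings keep a reference to their source, so
attaching them as names to that same source makes the object refer to
itself.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct DeferredStrings;

// Toy stand-in for an integer vector with an optional names attribute.
struct IntVec {
    std::vector<int> data;
    std::shared_ptr<DeferredStrings> names;
};

// Toy stand-in for a deferred string conversion: it keeps a reference to
// its source and converts elements only when asked.
struct DeferredStrings {
    std::shared_ptr<IntVec> source;
    std::string at(std::size_t i) const {
        assert(i < source->data.size());
        return std::to_string(source->data[i]);
    }
};

// Unsafe: the names end up referring to the object they are attached to.
void set_names_in_place(const std::shared_ptr<IntVec>& v) {
    auto deferred = std::make_shared<DeferredStrings>();
    deferred->source = v;
    v->names = deferred;  // v -> names -> source -> v (a cycle)
}

// Safer: duplicate first, so the deferred strings refer to a private copy.
void set_names_from_copy(const std::shared_ptr<IntVec>& v) {
    auto copy = std::make_shared<IntVec>();
    copy->data = v->data;  // copy the values only, no attributes
    auto deferred = std::make_shared<DeferredStrings>();
    deferred->source = copy;
    v->names = deferred;
}

// Follows names -> source links the way a naive recursive writer might,
// giving up after max_depth steps. Returns true if the walk reaches an end.
bool walk_terminates(const std::shared_ptr<IntVec>& v, int max_depth = 100) {
    const IntVec* cur = v.get();
    for (int depth = 0; depth < max_depth; ++depth) {
        if (!cur->names) return true;
        cur = cur->names->source.get();
    }
    return false;
}
```

In this model a vector named via set_names_from_copy can be walked to
an end, while one named via set_names_in_place cannot, which is the
kind of unbounded descent a "C stack usage" error would be consistent
with. With shared_ptr the in-place variant also leaks unless the cycle
is broken by resetting names.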

We may be able to make serialize() and .Internal(inspect()) a bit more
robust to this sort of misbehavior in package space, but as more
optimizations are added, package authors who use native code will need
to be more careful about adhering to the rules for when objects can be
safely modified.

Best,

luke

> Thanks Bill!
>
> - Jan
>
> On Thu, Jul 18, 2019, at 10:39, William Dunlap wrote:
>> Note that you can reproduce this in R-3.5.1 if you specify serialization version 3 (which became the default in 3.6.0).
>>
>> > save(apresX, file="351-2.RData", version=2)
>> > save(apresX, file="351-2.RData", version=3)
>> Error: C stack usage 7969184 is too close to the limit
>> > version$version.string
>> [1] "R version 3.5.1 (2018-07-02)"
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>>
>> On Thu, Jul 18, 2019 at 12:46 AM Jan Galkowski <bayesianlogic.1 using gmail.com> wrote:
>>> > # Test for saving. Jan Galkowski, 17th July 2019.
>>> > # produceProtectionFault.R
>>> >
>>> > library(apcluster)
>>> > cl1 <- cbind(rnorm(100, 0.2, 0.05), rnorm(100, 0.8, 0.06))
>>> > cl2 <- cbind(rnorm(50, 0.7, 0.08), rnorm(50, 0.3, 0.05))
>>> > x <- rbind(cl1, cl2)
>>> >
>>> > ## compute similarity matrix and run affinity propagation
>>> > ## (p defaults to median of similarity)
>>> > simil<- negDistMat(x, r=2)
>>> > apres <- apcluster(s=simil, details=TRUE)
>>> > apresX<- aggExCluster(s=simil, x=apres)
>>> >
>>> > show(apres)
>>> > show(apresX)
>>> >
>>> > saveRDS(object=apresX, file="foo.rds", compress=TRUE)
>>> >
>>> > #save(apresX, file="bar.data", compress=TRUE)
>>> >
>>> > #save.image("crazy.RData")
>>>
>>>  The example is from the apcluster documentation. Leaving any one of the "save"s uncommented produces said fault.
>>>
>>>  - Jan
>>>
>>>  On Wed, Jul 17, 2019, at 08:18, Jeff Newmiller wrote:
>>> > It would never make sense for such messages to reflect normal and expected operation, so hypothesizing about intentionally changing stack behavior doesn't make sense.
>>> >
>>> > The default format for saveRDS changed in 3.6.0. There may be bugs associated with that, but rolling back to 3.6.0 would just trade bugs.
>>> >
>>> > https://cran.r-project.org/doc/manuals/r-devel/NEWS.html
>>> >
>>> > On July 16, 2019 8:56:28 PM CDT, Jan Galkowski <bayesianlogic.1 using gmail.com> wrote:
>>> >>Did something seriously change in R 3.6.1 at least for Windows in terms
>>> >>of stack impacts?
>>> >>
>>> >>I'm encountering many problems with the 00UNLOCK, needing to disable
>>> >>locking during installations.
>>> >>
>>> >>And I'm encountering
>>> >>
>>> >>> Error: C stack usage 63737888 is too close to the limit
>>> >>
>>> >>for cases I did not before, even when all I'm doing is serializing an
>>> >>object to be saved with *saveRDS* or even *save.image(.)*.
>>> >>
>>> >>Yes, I know, I did not append a minimally complete example. Just wanted
>>> >>to see if it was just me, or if anyone else was seeing this.
>>> >>
>>> >>It's on Windows 7 HE and I've run *R* here for years.
>>> >>
>>> >>My inclination is to drop back to 3.6.0 if it is just me or if no one
>>> >>knows about this problem.
>>> >>
>>> >>Thanks,
>>> >>
>>> >> - Jan Galkowski.
>>>  [snip]
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


