[Rd] conflict between rJava and data.table

Simon Urbanek simon.urbanek at r-project.org
Fri Mar 1 21:19:36 CET 2013


On Mar 1, 2013, at 11:40 AM, Matthew Dowle wrote:

> On 01.03.2013 16:13, Simon Urbanek wrote:
>> On Mar 1, 2013, at 8:03 AM, Matthew Dowle wrote:
>> 
>>> 
>>> Simon Urbanek wrote :
>>>> Can you elaborate on the details as to where this will be a problem? Packages
>>>> should not be affected since they should be importing the namespaces from the
>>>> packages they use, so the only problem would be in a package that uses both
>>>> data.table and rJava --  and this is easily resolved in the namespace of such
>>>> package. So there is no technical reason why you can't have multiple
>>>> definitions of J - that's what namespaces are for.
>>> 
>>> Right. It's users using J() in their own code, iiuc. rJava's manual says "J is
>>> the high-level access to Java."  When they use J() on its own they probably
>>> want the rJava one, but if data.table is higher they get that one.
>>> They don't want to have to write out rJava::J(...).
>>> 
>>> It is not just rJava but package XLConnect, too. If there's a better way I would
>>> be interested, but I didn't mind removing J from data.table.
>>> 
>> 
>> For packages there is really no issue - if something breaks in
>> XLConnect then the authors are probably importing the wrong function
>> in their namespace (I still didn't see a reproducible example,
>> though). The only difference is for interactive use so not having
>> conflicting J() [if possible] would be actually useful there, since
>> J() in rJava is primarily intended for interactive use.
> 
> Yes that's what I wrote above isn't it? i.e.
> 
>> It's users using J() in their own code, iiuc.
>> "J is the high-level access to Java."
> 
> Not just interactive use (i.e. at the R prompt) but inside their functions and scripts, too.
> Although, I don't know the rJava package at all. So why J() might be used for interactive
> use but not in functions and scripts isn't clear to me.
> Any use of J from example(J) will serve as a reproducible example; e.g.,
> 
>    library(rJava)          # load rJava first
>    library(data.table)     # then data.table
>    J("java.lang.Double")
> 
> There is no error or warning, but the user would be returned a 1 row 1 column
> data.table rather than something related to Java. Then the errors/warnings follow from there.
> 
> The user can either load the packages the other way around, or, use ::
> 
>    library(rJava)                  # load rJava first
>    library(data.table)             # then data.table
>    rJava::J("java.lang.Double")    # ok now
> 

Matt,

there are two entirely separate uses:

a) interactive use
b) use in packages

You are describing a) and, as I said in the latter part above, J() in rJava is meant for that, so it would be useful not to have a conflict there.

However, in the first part of my e-mail I was referring to b), where there is no conflict, because packages define which package a symbol comes from, so the user's search path plays no role. Today, all packages should be using imports, so search path pollution should no longer be an issue: the order in which the user attached packages to their search path won't affect the functionality of the packages (that's why namespaces are mandatory). Therefore, if XLConnect breaks (again, I don't know, I didn't see it) due to the order on the search path, it indicates there is a bug in its namespace, as it's apparently importing the wrong J - it should be importing it from rJava and not data.table. Is that clearer?
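
For case b), a minimal sketch of what such a NAMESPACE file might contain. It assumes a hypothetical package that needs J from rJava and only the data.table() constructor from data.table; the directives XLConnect actually uses have not been checked here.

```r
# NAMESPACE of a hypothetical package that uses both rJava and data.table.
# Selective imports pin each symbol to one source package, so the order of
# the user's search path has no effect on which J the package code sees.
importFrom(rJava, J)
importFrom(data.table, data.table)
```

With selective imports like these, J inside the package's own functions resolves to rJava's definition regardless of which package the user attached last.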

Cheers,
Simon





> 
>> 
>> Cheers,
>> Simon
>> 
>> 
>>> Bunny/Matt,
>>> 
>>> To add to Steve's reply here's some background. This is well documented in NEWS
>>> and Googling "data.table J rJava" and similar returns useful links to NEWS and
>>> datatable-help (so you shouldn't have needed to post to r-devel).
>>> 
>>> From 1.8.2 (Jul 2012) :
>>> 
>>> o  The J() alias is now deprecated outside DT[...], but will still work inside
>>>  DT[...], as in DT[J(...)].
>>>  J() is conflicting with function J() in package XLConnect (#1747)
>>>  and rJava (#2045). For data.table to change is easier, with some efficiency
>>>  advantages too. The next version of data.table will issue a warning from J()
>>>  when used outside DT[...]. The version after will remove it. Only then will
>>>  the conflict with rJava and XLConnect be resolved.
>>>  Please use data.table() directly instead of J(), outside DT[...].
>>> 
>>> From 1.8.4 (Nov 2012) :
>>> 
>>> o  J() now issues a warning (when used *outside* DT[...]) that using it
>>>  outside DT[...] is deprecated. See item below in v1.8.2.
>>>  Use data.table() directly instead of J(), outside DT[...]. Or, define
>>>  an alias yourself. J() will continue to work *inside* DT[...] as documented.
>>> 
>>> From 1.8.7 (soon to be on CRAN) :
>>> 
>>> o  The J() alias is now removed *outside* DT[...], but will still work inside DT[...];
>>>  i.e., DT[J(...)] is fine. As warned in v1.8.2 (see below in this file) and deprecated
>>>  with warning() in v1.8.6. This resolves the conflict with function J() in package
>>>  XLConnect (#1747) and rJava (#2045).
>>>  Please use data.table() directly instead of J(), outside DT[...].
>>> 
>>> Matthew
>>> 
>>> 
>>> 
> 
> 


