[Rd] proposed changes to RSiteSearch

Romain Francois romain.francois at dbmail.com
Fri May 8 16:47:33 CEST 2009

Jonathan Baron wrote:
> After reading all this, I favor doing one of two things:
> 1. Put all the search stuff, including the proposed gmane function, in
>    Spencer's new package but make it one of the default packages, like
>    utils, etc., or,
> 2. Put everything in utils, including Spencer's new package and the
>    gmane function.
> I do not know enough to choose between these.
I would tend to prefer #1 so that the functionality can incubate in a 
separate package, and then when it is mature enough, we can make a call 
about what to do with it.

Something like this:
- a generic abstract function that sets up the interface to query a 
search engine.

- implementations of this interface; here are the ones I can think of:
+ Jon's RSiteSearch for help pages
+ R Graphical Manual
+ gmane, markmail for mail archives
+ classic help.search
+ R News (not clear how to do this right now)
+ vignettes (not clear how to do this right now)
+ JSS articles (not clear how to do this right now)
+ FAQ (not clear how to do this right now)
+ ... add your own by simply registering your implementation

The point of having some sort of central generic function is that it 
can be responsible for querying all the engines and bringing all the 
results back in a single format.
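
To make the idea concrete, here is a minimal sketch of what such a central function might look like. All the names (registerEngine, searchAll, the engine names, the example.org URLs) are hypothetical and made up for illustration; none of this exists in utils or in any package. The engines are stubs that return canned results, so that the sketch runs without network access. A real engine would query Jon's site, gmane, help.search, etc. and map its hits onto the common columns.

```r
# Hypothetical registry: engine name -> function(query) returning a
# data.frame with (at least) the columns 'title' and 'url'.
.engines <- new.env(parent = emptyenv())

registerEngine <- function(name, fun) {
  stopifnot(is.character(name), length(name) == 1L, is.function(fun))
  assign(name, fun, envir = .engines)
  invisible(name)
}

# The single result format: one row per hit.
emptyResult <- function() {
  data.frame(engine = character(0), title = character(0),
             url = character(0), stringsAsFactors = FALSE)
}

# Ask every registered engine and bind the hits together.
# An engine that fails is skipped rather than aborting the whole search.
searchAll <- function(query, engines = ls(.engines)) {
  results <- lapply(engines, function(name) {
    hits <- tryCatch(get(name, envir = .engines)(query),
                     error = function(e) NULL)
    if (is.null(hits) || nrow(hits) == 0L) return(emptyResult())
    data.frame(engine = rep(name, nrow(hits)),
               title  = as.character(hits$title),
               url    = as.character(hits$url),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, c(list(emptyResult()), results))
}

# Stub engines standing in for real back ends (assumed, not real services).
registerEngine("helpPages", function(query) {
  data.frame(title = paste("help page mentioning", query),
             url = "http://example.org/help",
             stringsAsFactors = FALSE)
})
registerEngine("mailArchive", function(query) {
  data.frame(title = paste(c("first", "second"), "thread about", query),
             url = c("http://example.org/mail/1", "http://example.org/mail/2"),
             stringsAsFactors = FALSE)
})
registerEngine("broken", function(query) stop("service unavailable"))

res <- searchAll("lattice")
print(res)
```

Adding a new source is then just another call to registerEngine, and the caller only ever sees one data frame, whichever engines happen to be registered or reachable.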

This somewhat duplicates work I have been doing with the rsitesearch 
Firefox extension, but doing it in R has several advantages.

I think this is enough design to justify a separate package.

I am not sure what the requirements are for a package to be shipped with 
the distribution of R (QA, documentation, ...), but I am sure whoever 
steps up (maybe me) can make it compliant.

There is precedent for functionality that started in a package and was 
merged into utils afterwards (rcompgen), but I think it was included 
because that was necessary. I don't think these search engines __have__ 
to be in utils.

> On 05/07/09 14:42, spencerg wrote:
>>       1.  Whatever we do with the "RSiteSearch" function, it should 
>> still be available every time R starts.  If we put it in its own 
>> package, it should still be autoloaded with "base", "utils", "stats", etc. 
> Good point.
>>       2.  Sundar indicated to me that, "if Jonathan would like to remove 
>> the search capability, it would be rather simple to move RSiteSearch to 
>> nabble" for the listserve archives.  The "RSiteSearch" function could be 
>> modified to combine that with a separate search of only the help pages 
>> on Jonathan's server. 
> I do not understand "rather simple" at all.  For those who are
> interested, I've put my notes on how to manage my site (which still
> need a bit of revision, but this will give you some idea of what is
> involved) in
> http://finzi.psych.upenn.edu/~baron/notes.namazu.txt
> The problem is that I have not found a way to automate this, so I
> still spend several hours each month doing it by hand.  Too many
> little glitches come up along the way, and the main problem is the
> mailing lists.  Moreover, Namazu just doesn't work all that well for
> mailing lists of this size, because of the page footers in each post.
> (Now I remove them.  That was a bad idea.  But if we're going to get
> rid of this anyway I will not take the time to figure out how to put
> them back properly.)
> Also, Liviu Andronic argued that vignettes should be searchable
> separately from help pages.  This makes sense, but I would strongly
> prefer to move ahead on other changes and leave this until later.
> The need for this sort of modification is what makes me favor option
> #1 at the beginning (separate package) on the theory that it would be
> easier for me to make changes than if it were part of utils, but I
> don't know how this works.
> So, if someone can make a decision about how to proceed, I'll do what
> I can, as soon as I can.
> Jon

Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
