[R] Function hints

Joerg van den Hoff j.van_den_hoff at fz-rossendorf.de
Tue Jun 20 13:36:34 CEST 2006


Jonathan Baron wrote:
> On 06/19/06 13:13, Duncan Murdoch wrote:
>>> `help.search' does not allow full text search in the manpages (I can
>>> imagine why (1000 hits...), but without such a thing google, for
>>> instance, would probably not be half as useful as it is, right?) and
>>> there is no "sorting by relevance" in the `help.search' output, I think.
>>> how this sorting could be achieved is a different question, of course.
>> You probably want RSiteSearch("keyword", restrict="functions") (or even
>> without the "restrict" part).
> 
> Yes.  The restrict part will speed things up quite a bit, if you
> want to restrict to functions.
> 
> Or, alternatively, you could use Namazu (which I use to generate
> what RSiteSearch provides) to generate an index specific to your
> own installed functions and packages.  The trick is to cd to the
> directory /usr/lib/R/library, or the equivalent, and then say
> 
> mknmz -q */html
> 
> which will pick up the html version of all the man pages
> (assuming you have generated them, and I have no idea whether
> this can be done on Windows).  To update, say
> 
> mknmz --update=. -q */html
> 
> Then make a bookmark for the Namazu search page in your browser,
> as a local file.  (I haven't given all the details.  You have to
> install Namazu and follow the instructions.)
> 
> Or, if you have a web server, you could let Google do it for
> you.  But, I warn you, Google will fill up your web logs pretty
> fast if you don't exclude it with robots.txt.  I don't let it
> search my R stuff.
> 
> I think that Macs and various Linux versions also have other
> alternative built-in search capabilities, but I haven't tried
> them.  Beagle is the new Linux search tool, but I don't know what
> it does.
> 
> Jon

thanks for these tips. I was not aware of the `RSiteSearch' function 
(I did know of the existence of the web sites, though) and this helps, 
but of course it depends on web access (think of off-line laptop 
usage...) and does not know of 'local' (non-CRAN) packages (and knows of 
maybe "too many" contributed packages, which I might not want to 
consider for one reason or another).

thanks also for the hint on `Namazu'. maybe I will do as advised to get an 
index which is aware of my local configuration and private packages. 
(under MacOS there is a very good and fast full text search engine, but 
it cannot be told to search only the R documentation, for instance, so 
one gets lots of other hits as well.)

what I really would love to see would be an improved help.search():
on r-devel I found a reference to the \concept tag in .Rd files and the 
fact that it is rarely used (again: I was not aware of this :-( ...). 
this tag might serve as a keyword container suitable for improving 
help.search() results. what about changing the syntax here to something like
\concept {
    keyword = score,
    keyword = score
    ...
}
where score would be restricted to a small range of values (say, 1-3 or 
1-5). if package maintainers then chose a handful of sensible 
keywords (and scores) for a package and its functions, one could expect 
improved search results. this might be a naive idea, but could a 
sort-by-relevance in the help.search() output benefit from this?
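
to make the idea concrete, here is a minimal sketch of how such scores 
could drive a sort-by-relevance. everything in it is an assumption made for 
illustration: the `concepts' table, its keywords and scores, and the 
function `rank_topics' are hypothetical and not part of R or of the .Rd 
format.

```r
# Hypothetical scored-concept index: one row per (topic, keyword, score).
# The keywords and scores are invented; .Rd files carry no such scores.
concepts <- data.frame(
  topic   = c("lm", "lm", "nls", "nls", "loess", "optim"),
  keyword = c("fitting", "linear", "fitting", "non-linear", "fitting", "fitting"),
  score   = c(3, 3, 3, 3, 2, 1),
  stringsAsFactors = FALSE
)

# Rank topics by the summed score of every query word they are tagged with.
rank_topics <- function(query, index = concepts) {
  # split the query into lower-case words on whitespace
  words <- strsplit(tolower(query), "[[:space:]]+")[[1]]
  hits  <- index[index$keyword %in% words, , drop = FALSE]
  if (nrow(hits) == 0) {
    return(data.frame(topic = character(0), relevance = numeric(0),
                      stringsAsFactors = FALSE))
  }
  # total score per topic, highest first; ties are broken by topic name
  totals <- vapply(split(hits$score, hits$topic), sum, numeric(1))
  ord    <- order(-totals, names(totals))
  data.frame(topic = names(totals)[ord], relevance = unname(totals)[ord],
             stringsAsFactors = FALSE)
}

rank_topics("fitting")
rank_topics("non-linear fitting")
```

with scores like these, both `lm' and `nls' would lead the list for a plain 
"fitting" query, and adding "linear" or "non-linear" would push the matching 
one to the top.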

to make it short: I'm not happy with the output, for instance, of
help.search("fitting")               #1
vs.
help.search("linear fitting")        #2
vs.
help.search("non-linear fitting")    #3
I somehow feel that `lm' and `nls' should both be found in the first 
search and that they should be near the top of the lists when they are 
found.

but `lm' is found only in #1 (near the bottom of the list) and `nls' not 
at all (which is really bad). this is partly a problem, of course, of 
inconsistent nomenclature in the manpages, but it is also due to the fact 
that help.search() accepts only a single phrase as its pattern (and maybe 
to the absence of "concept" keywords including a score?)
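
as a rough sketch of what matching on several words (instead of one 
phrase) could look like: the function below splits the query into words and 
counts, per entry, how many of them occur. the `titles' vector is a small 
hand-made stand-in for a help index and `search_words' is a hypothetical 
helper; neither is how help.search() actually works.

```r
# Toy stand-in for a help index: topic name -> title (illustrative only).
titles <- c(
  lm    = "Fitting Linear Models",
  nls   = "Nonlinear Least Squares",
  loess = "Local Polynomial Regression Fitting"
)

# Count, for each entry, how many query words occur in it as substrings,
# and return the entries with at least one hit, best match first.
search_words <- function(query, docs = titles) {
  # split on anything that is not a letter or digit, so that
  # "non-linear" becomes the two words "non" and "linear"
  words <- strsplit(tolower(query), "[^[:alnum:]]+")[[1]]
  words <- words[nzchar(words)]
  n_hits <- vapply(docs, function(d) {
    sum(vapply(words, function(w) grepl(w, tolower(d), fixed = TRUE),
               logical(1)))
  }, integer(1))
  n_hits <- n_hits[n_hits > 0]
  # highest count first; ties are broken by topic name
  n_hits[order(-n_hits, names(n_hits))]
}

search_words("fitting")
search_words("non-linear fitting")
```

splitting on the hyphen is what lets "non-linear" reach a title spelled 
"Nonlinear", which is one way around the inconsistent nomenclature.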


