[R] changes in RSiteSearch() and http://finzi.psych.upenn.edu/search.html

Jonathan Baron baron at psych.upenn.edu
Wed May 31 01:37:02 CEST 2006


This is about my searchable archive.  The function RSiteSearch()
now searches this archive.  I'm considering the following
changes.  If you have comments, please write me.  Try to avoid
cc'ing the list.

The Rhelp02a directory, which now contains all list mail since
2002 (about 100 MB), is getting larger and larger.  This probably
cannot go on forever, and performance might even improve if it
got smaller now.

I've considered two solutions.  One is to start a new archive
under a new name.  This would break RSiteSearch(), which looks
for Rhelp02a.  Even if that were fixed, the fix would spread very
slowly, since it would reach users only as they upgrade R.

The second solution, which I plan to implement unless I hear a
better one, is to make this major archive include a maximum
four-year window, so that it would now start in 2003 rather than
2002, then (next year) would start in 2004, and so on.  The main
disadvantage is that references to current messages in my archive
would be lost, because the message numbers would change.  There
aren't very many of these references (244, including replies, in
the archive itself), and I will try to save the old archive.
Although the archive would start in 2003, it would still be
called Rhelp02a, so that RSiteSearch(), etc., still work.
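
As a rough sketch of the window arithmetic and of why old
references would break (Python, purely illustrative): the
function names are hypothetical, and the assumption that messages
are numbered sequentially in archive order is a simplification,
not a description of how hypermail actually assigns numbers.

```python
WINDOW_YEARS = 4  # from the plan above: keep at most four years of mail


def window_start(current_year, window_years=WINDOW_YEARS):
    """Return the first year kept when current_year is the latest year."""
    return current_year - window_years + 1


def renumber(messages, start_year):
    """Drop messages older than start_year and renumber the rest from 0.

    messages: list of (year, subject) tuples in archive order.
    Returns a dict mapping each surviving old number to its new number.
    (Sequential numbering is an assumption made for illustration.)
    """
    kept = [old for old, (year, _subject) in enumerate(messages)
            if year >= start_year]
    return {old: new for new, old in enumerate(kept)}


if __name__ == "__main__":
    # Made-up sample data, only to show the renumbering effect.
    sample = [(2002, "a"), (2002, "b"), (2003, "c"), (2004, "d")]
    print(window_start(2006))                        # 2003
    print(renumber(sample, window_start(2006)))      # {2: 0, 3: 1}
```

Under these assumptions the archive starts in 2003 this year and
in 2004 next year, and a link that pointed at old message 2 would
have to point at message 0 after the trim.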

Maybe someone experienced with hypermail or namazu will tell me
that 100 MB is nothing and I should wait until I get a GB before
getting nervous.

I also plan to change the default number of items per page from
20 to 100.  This will not change in RSiteSearch() until the next
version, but this is not so big a deal.

Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron


