[R] Newbie: Using R to analyse Apache logs

Raj Mathur raju at linux-delhi.org
Fri Feb 1 04:15:24 CET 2008



Hi Kevin,

On Thursday 31 Jan 2008, Zembower, Kevin wrote:
> Raj,
>
> I've been experimenting with R to compute simple statistics from my web
> logs somewhat similar to what you're describing. For instance, I'm
> working on trying to classify a unique IP or domain name requestor as
> 'human' or 'robot' based on the number of seconds between requests for
> pages. I've found that the easiest method of work, given my (elementary)
> knowledge of R and my (professional) knowledge of perl, is to run my
> logs through a perl program to pre-process the data, before submitting
> it to R. The output of running my Apache web log through my perl program
> looks like this tab-delimited output:
> [snip]

Coincidentally, I was planning to write a Perl script before it struck me that
R could probably do this job better.  I'd be glad to have whatever work
you've done so far, to see if I can tune it, perhaps with some help from my
academic friends.  If that doesn't work, *shrug* it's back to Perl :)
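
For what it's worth, here is a rough sketch of how the R-only route might look for the human/robot idea quoted above. It is not Kevin's Perl-based method. The sample log lines, the variable names and the 2-second cut-off are all made up for illustration, and the lines are assumed to be in Apache's common log format.

```r
# Hypothetical sample lines in Apache common log format (invented data)
log_lines <- c(
  '10.0.0.1 - - [31/Jan/2008:10:00:00 +0100] "GET /a.html HTTP/1.1" 200 512',
  '10.0.0.1 - - [31/Jan/2008:10:00:01 +0100] "GET /b.html HTTP/1.1" 200 734',
  '10.0.0.1 - - [31/Jan/2008:10:00:02 +0100] "GET /c.html HTTP/1.1" 200 120',
  '10.0.0.2 - - [31/Jan/2008:10:00:05 +0100] "GET /a.html HTTP/1.1" 200 512',
  '10.0.0.2 - - [31/Jan/2008:10:01:15 +0100] "GET /b.html HTTP/1.1" 200 734'
)

# Capture the requesting host (field 1) and the bracketed timestamp
pattern <- '^([^ ]+) [^ ]+ [^ ]+ \\[([^]]+)\\].*$'
host  <- sub(pattern, "\\1", log_lines)
stamp <- sub(pattern, "\\2", log_lines)

# Parse the timestamp; %b assumes English month abbreviations in the locale
when <- as.POSIXct(strptime(stamp, "%d/%b/%Y:%H:%M:%S %z", tz = "UTC"))

# Sort by host and time so that consecutive rows are consecutive requests
hits <- data.frame(host = host, time = when, stringsAsFactors = FALSE)
hits <- hits[order(hits$host, hits$time), ]

# Seconds between successive requests, computed separately for each host
gaps <- lapply(split(as.numeric(hits$time), hits$host), diff)
median_gap <- sapply(gaps, median)  # NA for hosts with a single request

# Assumed cut-off: a median gap under 2 seconds is treated as automated
label <- ifelse(median_gap < 2, "robot", "human")
print(data.frame(median_gap, label))
```

For a real log, readLines() on the log file would replace the hard-coded vector, and the cut-off would need tuning against known crawlers.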

Regards,

-- Raju
-- 
Raj Mathur                raju at kandalaya.org      http://kandalaya.org/
 Freedom in Technology & Software || February 2008 || http://freed.in/
       GPG: 78D4 FC67 367F 40E2 0DD5  0FEF C968 D0EF CC68 D17F
PsyTrance & Chill: http://schizoid.in/   ||   It is the mind that moves


