[R] Best Fit line trouble with rsruby

Alex Gutteridge alexg at ruggedtextile.com
Thu Nov 4 09:51:12 CET 2010


On Wed, 3 Nov 2010 18:24:43 -0700 (PDT), Deadpool <deadpool93 at comcast.net>
wrote:
> Hello, I am using R, through rsruby, to create a graph and best fit line
> for a set of data points, regarding data collected in a Chemistry class.
> The problem is that although the graph functions perfectly properly, the
> best fit line will not work.
> 
> I initially used code I pretty much copied from a website with a
> tutorial on this, which was:
> 
> graphData.png("/code/Beer's-Law Graph.png")
> concentration = p1Conc
> absorbance = p1AbsorbanceArray
> graphData.assign('x', p1Conc)
> graphData.assign('y', p1AbsorbanceArray)
> fit = graphData.lm('x ~ y')
> graphData.plot(concentration, absorbance)
> graphData.abline(fit["coefficients"]["(Intercept)"],
> fit["coefficients"]["y"])
> puts fit["coefficients"]
> graphData.eval_R("dev.off()")
> 
> (p1Conc and p1AbsorbanceArray are arrays)
> 
> This worked for the graph, but the best fit line looked (and the
> infinitesimally small slope supported) like it was based off a single
> point. The site said they had to define something in the R interpreter
> first, but didn't elaborate, so I gave it a go, and obviously it didn't
> work.

It looks to me like you have the response and explanatory variables
swapped in your model (or your plot).

Try:

fit = graphData.lm("y~x")
graphData.plot(concentration, absorbance)
graphData.abline(fit["coefficients"]["(Intercept)"],fit["coefficients"]["x"])

Or just swap the axes on your plot.
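
To illustrate why the direction of the formula matters, here is a rough sketch in plain Ruby (no R involved). The least_squares helper and the data are made up for illustration; they are not part of rsruby or of your script. Fitting "y ~ x" and "x ~ y" gives different intercept/slope pairs, and only the first pair describes a line on a plot with x on the horizontal axis.

```ruby
# Ordinary least squares for a single predictor, in plain Ruby.
# Returns [intercept, slope] for the model: response ~ predictor.
# (Hypothetical helper for illustration; Array#sum needs Ruby 2.4+.)
def least_squares(predictor, response)
  n = predictor.size.to_f
  mean_p = predictor.sum / n
  mean_r = response.sum / n
  sxy = predictor.zip(response).sum { |p, r| (p - mean_p) * (r - mean_r) }
  sxx = predictor.sum { |p| (p - mean_p)**2 }
  slope = sxy / sxx
  [mean_r - slope * mean_p, slope]
end

# Made-up data lying on the line absorbance = 0.5 * concentration.
concentration = [1, 2, 3, 4, 5]
absorbance    = [0.5, 1.0, 1.5, 2.0, 2.5]

# absorbance ~ concentration: the line that belongs on
# plot(concentration, absorbance).
right_intercept, right_slope = least_squares(concentration, absorbance)

# concentration ~ absorbance: a different slope, wrong for that plot.
swapped_intercept, swapped_slope = least_squares(absorbance, concentration)

puts "y ~ x: intercept=#{right_intercept}, slope=#{right_slope}"
puts "x ~ y: intercept=#{swapped_intercept}, slope=#{swapped_slope}"
```

The swapped fit is a perfectly good model of concentration as a function of absorbance; it is just not the line for a plot that has concentration on the horizontal axis.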

> I then tried something like this, as I thought the conversion from the
> array to the string in the assign function was causing the problem with
> the best fit line.

No - that should be fine. You aren't converting an array into a string,
just assigning a Ruby Array to an R variable (a vector) with the given
name.
 
> graphData = RSRuby.instance
> graphData.png("/code/Beer's-Law Graph.png")
> concentration = graphData.c(p1Conc[0..(p1SampNum - 1)])
> absorbance = graphData.c(p1AbsorbanceArray[0..(p1SampNum - 1)])
> fit = graphData.lm(concentration ~ absorbance)
> graphData.plot(concentration, absorbance)
> graphData.abline(fit["coefficients"]["(Intercept)"],
> fit["coefficients"][absorbance])
> puts fit["coefficients"]
> print "\n"
> graphData.eval_R("dev.off()")
> 
> Basically trying to bypass that, and feed the numbers straight from the
> array into the best fit line, but the program was giving me an error,
> saying it didn't know what ~ was for an array (should note I tried it
> first without doing the graphData.c thing, but that didn't work and as
> the .c function didn't seem to store things as an array, I thought that
> might work, it didn't, as it does store data as an array).

RSRuby doesn't know about R formulas, and Ruby only has '~' as a unary
operator, so a bare '~' between two Ruby values is an error in Ruby. You
must pass the model specification as a string, as you did the first time.
Unfortunately this means you either have to use the .assign() workaround
to get the data into variables R can see, or pass the data via the 'data'
argument to lm. See this irb session for an example of the second
technique:

wsp00614206:~ GUTTEA$ irb
>> require 'rsruby'
=> true
>> r = RSRuby.instance
=> #<RSRuby:0x101176c20 @class_table={}, @default_mode=-1, @caching=true,
@cache={"get"=>#<RObj:0x101176798>, "helpfun"=>#<RObj:0x101172cd8>,
"help"=>#<RObj:0x101172cd8>, "NaN"=>NaN, "FALSE"=>false, "TRUE"=>true,
"F"=>false, "NA"=>-2147483648, "eval"=>#<RObj:0x101175230>, "T"=>true,
"parse"=>#<RObj:0x1011757a8>}, @proc_table={}>
>> x = (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> y = (11..20).to_a
=> [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
>> fit = r.lm("y~x",:data=>{'x' => x, 'y' => y})
=> {"model"=>{"x"=>[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], "y"=>[11, 12, 13, 14,
15, 16, 17, 18, 19, 20]}, "qr"=>{"qr"=>[[-3.16227766016838,
-17.3925271309261], [0.316227766016838, 9.08295106229247],
[0.316227766016838, 0.15621147358221], [0.316227766016838,
0.0461150970695743], [0.316227766016838, -0.0639812794430617],
[0.316227766016838, -0.174077655955698], [0.316227766016838,
-0.284174032468334], [0.316227766016838, -0.39427040898097],
[0.316227766016838, -0.504366785493606], [0.316227766016838,
-0.614463162006242]], "pivot"=>[1, 2], "rank"=>2, "tol"=>1.0e-07,
"qraux"=>[1.31622776601684, 1.26630785009485]}, "assign"=>[0, 1],
"rank"=>2, "residuals"=>{"6"=>3.24300739408472e-16,
"7"=>3.20784180753863e-16, "8"=>-3.4886619267584e-16,
"9"=>-1.01851656610554e-15, "1"=>-3.63520003547369e-15,
"2"=>1.72099416959944e-15, "3"=>1.22302883507243e-15,
"10"=>-3.55899309985059e-16, "4"=>9.97467671492785e-16,
"5"=>7.71906507913144e-16}, "df.residual"=>8,
"effects"=>{""=>1.77635683940025e-15, "x"=>9.08295106229247,
"(Intercept)"=>-49.0153037326099}, "xlevels"=>{},
"fitted.values"=>{"6"=>16.0, "7"=>17.0, "8"=>18.0, "9"=>19.0, "1"=>11.0,
"2"=>12.0, "3"=>13.0, "10"=>20.0, "4"=>14.0, "5"=>15.0},
"call"=>#<RObj:0x10111c810>, "terms"=>#<RObj:0x10111c798>,
"coefficients"=>{"x"=>1.0, "(Intercept)"=>10.0}}
>> fit["coefficients"]
=> {"x"=>1.0, "(Intercept)"=>10.0}
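
As a side note on the '~' error itself, the sketch below (plain Ruby, no rsruby needed) shows roughly what Ruby makes of the operator. The tilde_failure and formula_string helpers are hypothetical names I made up for illustration. Integer defines a unary '~' (bitwise complement), Array does not define it at all, and the formula only reaches R intact if it travels as a string.

```ruby
# Apply Ruby's unary '~' to a value.
# Returns nil if the value supports it, or the error class if it does not.
# (Hypothetical helper for illustration only.)
def tilde_failure(value)
  ~value
  nil
rescue NoMethodError => e
  e.class
end

# Build an R model formula as a plain Ruby string.
# (Hypothetical helper; the names are just R variable names.)
def formula_string(response, predictor)
  "#{response} ~ #{predictor}"
end

puts ~5                                  # Integer: bitwise complement
puts tilde_failure(5).inspect            # nil, Integer supports '~'
puts tilde_failure([1, 2, 3]).inspect    # NoMethodError, Array does not
puts formula_string("y", "x")            # the string to hand to lm
```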

> So basically I'm stuck. Not sure if anyone has any experience with
> rsruby, but any help would be appreciated. I'm pretty sure the fit =
> graphData.lm(etcetera) line is where the trouble is, but not sure how to
> handle it.

You got pretty close!

-- 
Alex Gutteridge


