[BioC] Qspline (again)

Paul Boutros pcboutro@engmail.uwaterloo.ca
Thu, 7 Nov 2002 16:45:35 -0500 (EST)


A note to anybody following this discussion.  I found that an alternative
to Laurent's idea of setting k <- 5 as a default is to alter the samples
parameter.  Depending on the dataset size, a different value of samples
may be optimal.

For instance:
~200   data points use samples = 0.33
~2000  data points use samples = 0.05
~10000 data points use samples = 0.02

This suggests that the default sampling rate was simply too large for
the data set I was using.
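
The rule of thumb above could be wrapped in a small helper.  This is only
a sketch: pick_samples() is a hypothetical name, not part of the affy
package, and interpolating between the three figures on a log10 scale
(and clamping outside them) is an assumption, not something that was
tested.  Treating the number of rows as the number of data points is
also an assumption.

```r
# Hypothetical helper, not part of the affy package.
# Interpolates the three rough figures above on a log10 scale of the
# number of data points; sizes outside 200..10000 are clamped to the
# nearest figure (rule = 2).
pick_samples <- function(n) {
  sizes <- c(200, 2000, 10000)
  rates <- c(0.33, 0.05, 0.02)
  approx(log10(sizes), rates, xout = log10(n), rule = 2)$y
}

pick_samples(2000)

# Possible use, assuming 'data2' is the matrix passed to normalize.qspline:
# normalize.qspline(data2, samples = pick_samples(nrow(data2)))
```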

Hope this helps!
Paul

On Tue, 5 Nov 2002, Paul Boutros wrote:

> Hi again,
> 
> This works great now.  Thanks for your help!
> Paul
> 
> On Tue, 5 Nov 2002, Laurent Gautier wrote:
> 
> > On Wed, Oct 30, 2002 at 04:15:52PM -0500, Paul Boutros wrote:
> > > Hi again,
> > > 
> > > > > > > The error message I get is:
> > > > > > > Error in rep(data, t1) : invalid number of copies in "rep"
> > > > > > > 
> > > > > > 
> > > > > > Try 'traceback()' just after the error message is fired. It may give
> > > > > > us a hint about where this happened.
> > > > > 
> > > > > I get:
> > > > > 3: rep(data, t1)
> > > > > 2: array(y.offset, (k - 1))
> > > > > 1: normalize.qspline(data2)
> > > > > 
> > > > > Does this help at all?
> > > 
> > > > It does. I had a hard time with the 'k' thing (and no answer from the
> > > > corresponding author about my questions/remarks/suggestions). I suspect
> > > > that for some reason you do not have the check added.  Does the code
> > > > > for the function look as follows?
> > > > -------------------
> > > >   if (y.offset <= min.offset) { 
> > > >     y.offset <- min.offset;
> > > >     k <- round(py.inds[1]/min.offset)
> > > >   }
> > > > 
> > > >   ## here is the part I suspect missing ##
> > > >   if (k < 1) {
> > > > >     warning("qspline cannot be performed (insufficient number of arrays)")
> > > >     return(x)
> > > >   }
> > > >   ## ---- ##
> > > > 
> > > >   y.offset <- c(0, array(y.offset, (k-1)))
> > > >   y.order <- order(target)
> > > > -------------------
> > > 
> > > This is exactly how my code looks -- the check on array number is not
> > > present.
> > > However, I should note that I am doing this analysis with three arrays.
> > > The code I use looks like this:
> > > 
> > > > data1 <- read.table("c:\\docume~1\\paul\\dev\\testvals.txt");
> > > > data2 <- data.matrix(data1);
> > > > data2;
> > >          qry1.S635 qry1.S532 qry2.S635 qry2.S532 qry3.S635 qry3.S532
> > > H3001A10     613.0     602.5     483.0     633.0      61.0      75.0
> > > H3001A12    1208.5    1019.0    1209.0    1187.0      91.0     153.5
> > > H3001B01     337.0     353.0     619.0     529.5     132.5     153.5
> > > H3001C02     495.0     497.5      64.5     117.5      10.5      59.0
> > > H3001C03     199.5     152.5     301.0     222.0      63.5      95.5
> > > H3001C04    1855.5    1447.0    1969.0    1721.0     286.0     349.0
> > > H3001C07    4765.5    3643.0    4889.0    4401.5    1064.0    1148.0
> > > H3001C09     720.5     894.5     347.0     602.0     188.0     326.5
> > > H3001C11     630.0     536.0     814.5     899.5      20.0      48.5
> > > > c <- normalize.qspline(data2);
> > > Error in rep(data, t1) : invalid number of copies in "rep"
> > > > c <- normalize.qspline(data1);
> > > Error in rep(data, t1) : invalid number of copies in "rep"
> > > 
> > > I tried adding the code you indicated as missing into the affy.R file.
> > > I now get:
> > > 
> > >          qry1.S635 qry1.S532 qry2.S635 qry2.S532 qry3.S635 qry3.S532
> > > H3001A10     613.0     602.5     483.0     633.0      61.0      75.0
> > > H3001A12    1208.5    1019.0    1209.0    1187.0      91.0     153.5
> > > H3001B01     337.0     353.0     619.0     529.5     132.5     153.5
> > > H3001C02     495.0     497.5      64.5     117.5      10.5      59.0
> > > H3001C03     199.5     152.5     301.0     222.0      63.5      95.5
> > > H3001C04    1855.5    1447.0    1969.0    1721.0     286.0     349.0
> > > H3001C07    4765.5    3643.0    4889.0    4401.5    1064.0    1148.0
> > > H3001C09     720.5     894.5     347.0     602.0     188.0     326.5
> > > H3001C11     630.0     536.0     814.5     899.5      20.0      48.5
> > > 
> > > Warning message: 
> > > qspline cannot be performed (insufficient number of arrays) in:
> > > normalize.qspline(data2)
> > > 
> > > This indicates to me that my input data is in an inappropriate format
> > > somehow?
> > 
> > No. The format seems to be appropriate.
> > 
> > 
> > > Any ideas where I go from here?
> > > 
> > 
> > > I do. As mentioned earlier, the variable 'k' is estimated in the code
> > > through an iterative procedure (which has the annoying property of
> > > giving a value that does not make sense in some cases).
> > > I contacted the corresponding author about that but got no answer.
> > > I inserted the error message "insufficient blahblablah" since I
> > > mainly observed this when it was applied to a small number of arrays.
> > > One way to get something done, until the day I have time to
> > > go into the details, would be to set a default value for k when the
> > > iterative procedure does not behave nicely.
> > 
> > Try to replace what you inserted with :
> >  
> > ## ----------- ##
> >    if (k < 1) {
> >      warning("'k' things did not work. Set to default value")
> >      k <- 5 
> >    }
> > ## ---- ##
> > 
> > 
> > Hopin' it helps,
> > 
> > 
> > 
> > 
> > Laurent
> > 
> > 
> > 
> > 
> 
>