[R] Beginner's query - segmentation fault

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Oct 7 15:04:38 CEST 2003


On Tue, 7 Oct 2003, Laura Quinn wrote:

> thanks, have used
> 
> temp [temp==0]<- NA
> 
> and this seems to have worked, though it won't let me access individual
> columns (i.e. temp$t1 etc.) to work on - is there any real advantage in using
> a matrix, or would I be better advised to deal with dataframes? (I have
> double checked and temp is currently a matrix).

Operations will be a lot faster on a numerical matrix than on a data
frame; the advantage of data frames is that the columns can be of
different types.

BTW, you should really use temp[, "t1"], which works for both a data frame
and a matrix. temp$t1 works only for data frames, `by the back door', and
has a number of bugs (including failing to detect errors which corrupt the
data frame) in versions prior to 1.8.0 (to be).
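
To illustrate the point about column access, here is a minimal sketch. The
small matrix m and its data-frame copy d are made-up stand-ins for temp, not
the poster's actual data; only the column names follow the t1, t2, ... pattern
described in the thread.

```r
# Hypothetical stand-in for the poster's data: 5 rows, columns t1..t3
m <- matrix(c(1.5, -999, 3.2, 0.7, -999,
              2.1, 4.4, -999, 0.3, 1.0,
              -999, 5.5, 6.6, 7.7, 8.8),
            ncol = 3, dimnames = list(NULL, c("t1", "t2", "t3")))
d <- as.data.frame(m)

# Indexing by column name uses the same syntax for both classes
col_m <- m[, "t1"]
col_d <- d[, "t1"]
identical(col_m, col_d)

# `$` is intended for lists, and a data frame is a list of columns
d$t1
```

Code written with temp[, "t1"] therefore keeps working if temp is later
converted from one class to the other.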


> 
> 
> 
> On Tue, 7 Oct 2003, Prof Brian Ripley wrote:
> 
> > On Tue, 7 Oct 2003, Laura Quinn wrote:
> >
> > > I am dealing with a huge matrix in R (20 columns, 54000 rows) and have
> > > lots of missing values within the dataset which are currently displayed as
> > > the value "-999.00". I am trying to create a new matrix (or change the
> > > existing one) to display these values as "NA" so that I can then perform
> > > the necessary analysis on the columns within the matrix.
> > >
> > > The matrix name is temp and the column names are t1 to t20 inclusive.
> > >
> > > I have tried the following command:
> > >
> > > temp$t1[temp$t1 == -999.00] <- NA
> > >
> > > and it returns a segmentation fault, can someone tell me what I am doing
> > > wrong?
> >
> > Well, R should not segfault, so there is a bug here somewhere.  However, I
> > don't think what you have described can actually work. Is temp really a
> > matrix?  If so temp$t1 will return NULL, and you should get an error
> > message.
> >
> >
> > If temp is a matrix
> >
> > temp[temp == -999.00] <- NA
> >
> > will do what you want.
> >
> >
> > If, as is more likely, temp is a data frame with all columns numeric,
> > there are several ways to do this, e.g.
> >
> > temp[] <- lapply(temp, function(x) ifelse(x == -999, NA, x))
> >
> > temp[as.matrix(temp) == -999] <- NA  # only in recent versions of R
> >
> > as well as explicit looping over columns.
> >
> > --
> > Brian D. Ripley,                  ripley at stats.ox.ac.uk
> > Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> > University of Oxford,             Tel:  +44 1865 272861 (self)
> > 1 South Parks Road,                     +44 1865 272866 (PA)
> > Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> >
> >
> 
> 
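
The replacements suggested in the quoted message can be tried side by side on
a toy example. This is only a sketch: the two-column objects m and d are
invented for illustration, with -999 as the missing-value code from the
original question.

```r
# Hypothetical small data set using -999 as the missing-value code
m <- matrix(c(1.5, -999, 3.2,
              -999, 4.4, 0.3),
            ncol = 2, dimnames = list(NULL, c("t1", "t2")))
d <- as.data.frame(m)

# Matrix: logical indexing over all cells at once
m[m == -999] <- NA

# Data frame, column by column; assigning into d1[] keeps d1 a data frame
d1 <- d
d1[] <- lapply(d1, function(x) ifelse(x == -999, NA, x))

# Data frame, indexed by a logical matrix
d2 <- d
d2[as.matrix(d2) == -999] <- NA

# Compare the results of the three approaches
all.equal(as.matrix(d1), m)
all.equal(as.matrix(d2), m)

# Column summaries can then skip the missing values
colMeans(m, na.rm = TRUE)
```

Once the sentinel values are NA, most summary functions accept na.rm = TRUE,
which is what makes the recoding worthwhile.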
