[R] deleting invariant rows and cols in a matrix

Patrick McKnight pem at theriver.com
Sun May 12 00:49:50 CEST 2002


Greetings,

I couldn't find an existing function that scans a matrix and eliminates
invariant (zero-variance) rows and columns, so I have started to write a
simple routine from scratch.  The following code fails because the array
index goes out of bounds: rows and columns are deleted while the loops
are still counting up to the original dimensions.

Start with some data:

x <- read.table("myex.dat", header=TRUE)
x

  v1 v2 v3 v4 v5 id
1  1  0  1  2  4  1
2  1  1  1  1  1  2
3  1  2  3  1  4  3
4  1  3  4  2  4  4
5  2  2  2  2  2  5
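
Since myex.dat is not attached, the same data can be recreated inline. This is only a minimal sketch, assuming the table printed above is the complete contents of the file:

```r
# Inline copy of the table printed above, so the example runs without myex.dat.
# Assumption: the printed rows are the whole data set.
x <- data.frame(
  v1 = c(1, 1, 1, 1, 2),
  v2 = c(0, 1, 2, 3, 2),
  v3 = c(1, 1, 3, 4, 2),
  v4 = c(2, 1, 1, 2, 2),
  v5 = c(4, 1, 4, 4, 2),
  id = 1:5
)
x
```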

Here's my function:

---- begin R code ----

elimnovar <- function(x,first.item=NULL,nitems=NULL,responses=NULL){

# Data prep - store as matrix, strip off id's, get variable names

dat <- as.matrix(x)                                      
item.dat <- dat[,first.item:(first.item + nitems - 1)]   
inames <- dimnames(item.dat)[[2]]

# Eliminate zero variance items and persons

# Store data in temp name and keep original

clean <- item.dat

# Initialize the stop variable

stp <- 0

# Start cleanup process for both cols and rows

while (stp != 1){
  stp.row <- rep(0,nrow(clean))
  stp.col <- rep(0,ncol(clean))

# Start with rows

  for (i in 1:nrow(clean)){
    sdrow <- sd(clean[i,])
    if (sdrow==0) clean <- clean[i * -1,]
    if (sdrow==0) stp.row[i] <- 1
  }

# Next check columns

  for (j in 1:ncol(clean)){
    sdcol <- sd(clean[,j])
    if (sdcol==0) clean <- clean[,j * -1]
    if (sdcol==0) stp.col[j] <- 1
  }

# Do we need to continue with the process?

  if (sum(stp.row)==0 && sum(stp.col)==0) stp <- 1
}

# Output cleaned data to new dataset name

cleaned <<- clean

}

---- end R code ----
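
The failure can be reproduced in isolation. The following is a minimal sketch on a small made-up matrix (not the data above); the point is that 1:nrow(m) is evaluated once, before any row is deleted:

```r
# Made-up 3 x 2 matrix: only the first row is invariant.
m <- rbind(c(1, 1), c(2, 3), c(4, 5))

# try() captures the error so the script keeps running.
res <- try({
  for (i in 1:nrow(m)) {               # loop still runs over 1, 2, 3
    # Deleting inside the loop shrinks m to 2 rows after i = 1 ...
    if (sd(m[i, ]) == 0) m <- m[-i, ]
  }                                    # ... so m[3, ] fails at i = 3
}, silent = TRUE)

inherits(res, "try-error")             # TRUE when the loop failed
```

Note also that after row 1 is deleted, the old row 2 becomes row 1 and is never checked, so deleting inside the loop can skip rows even when it does not fail.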


So my questions are:

1.  How can I build a vector of row and column numbers to delete later?
I realize that the code above runs into problems because the for loop
indexes rows/cols that no longer exist once earlier ones have been
deleted.  The deletion must happen after the loop.  I know how to drop a
row or a column inside the for loop, but storing the row and column
numbers and deleting them after the loop escapes me.  Any suggestions?

2.  Is there a more efficient way to complete this task?  I don't
claim to be a programmer (a hack at best), but I can't imagine that
there isn't a simpler method for eliminating invariant rows and
columns.

Thanks in advance for any and all suggestions.


-- 
Cheers,

Patrick