[R] removing only rows/columns with "na" value from square ( symmetrical ) matrix.

Mon May 21 15:03:36 CEST 2012

Hi

You can do it by hand: repeatedly remove the row/column pair with the largest number of NA values.

# remove the row/column pair with the most NA values
rem<-which.max(colSums(is.na(M)))
M1<-M[-rem, -rem]
# repeat on the reduced matrix
rem<-which.max(colSums(is.na(M1)))
M2<-M1[-rem, -rem]
M2
     1   2  3   4  5   7   8  10  11  12
1    0 143 92 134 42 123  40 107  49  93
2  143   0 77   6 99  46  47 114 138  82
3   92  77  0   2 89  24  62  59  97  52
4  134   6  2   0 71  23  43  80  35  86
5   42  99 89  71  0  68  95  27  55  14
7  123  46 24  23 68   0 124  18  53 101
8   40  47 62  43 95 124   0 126  11 129
10 107 114 59  80 27  18 126   0  31  13
11  49 138 97  35 55  53  11  31   0  75
12  93  82 52  86 14 101 129  13  75   0

I believe this can be turned into a loop that tests for NA values on each
pass: it stops as soon as none remain, and never starts if the matrix has
no NA values to begin with.
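
A minimal sketch of such a loop, applied to the 10 x 10 example from the original question. The function name drop_na_pairs is made up for illustration, and the approach is greedy, so it is not guaranteed to keep the largest possible NA-free submatrix in every case.

```r
# Hypothetical helper (name not from the thread): greedily drop the
# row/column pair with the most NA values until no NA is left.
drop_na_pairs <- function(M) {
  stopifnot(is.matrix(M), nrow(M) == ncol(M))
  while (any(is.na(M))) {
    # column with the most NAs; which.max() returns the first one on ties
    rem <- which.max(colSums(is.na(M)))
    M <- M[-rem, -rem, drop = FALSE]
  }
  M
}

# the 10 x 10 example from the original question
M <- matrix(1:2, 10, 10)
M[6, 1:2] <- NA
M[10, 9] <- NA
M <- as.matrix(as.dist(M))

M2 <- drop_na_pairs(M)
rownames(M2)
```

Because which.max() takes the first maximum, a tie such as "9 vs. 10" is resolved by dropping the lower index, so this keeps 1 to 5, 7, 8 and 10 rather than 9. If a different tie-breaking rule is wanted, it has to be coded explicitly.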

Regards
Petr

> Yes, the matrix is symmetric.
> Gabor provided a partial solution:
> Try this:
> 
> ix <- na.action(na.omit(replace(M, upper.tri(M), 0)))
> M[-ix, -ix]
> 
> However, this removes all rows containing an NA in the lower half of the 
> matrix, even if the corresponding column has also been removed.
> 
> I have revised the example to show this.
> 
> Thanks all for your help.
> 
> In the case below I would like to retain rows and columns
> [c(1:5,7,8,10:12),c(1:5,7,8,10:12)]:
> M<-matrix(sample(144),12,12)
> M[10,9]<-NA
> M<-as.matrix(as.dist(M))
> N=M
> #the above rows are to create the symmetric matrix M and a copy N
> M[6,]<-NA
> M[,6]<-NA
> #above two rows - make corresponding row and column NA
> print (M)
> ix <- na.action(na.omit(replace(M, upper.tri(M), 0)))
> M<-M[-ix, -ix]
> print (M)
> 
> print ("however what I would like to retain is the maximum amount of data while removing rows or columns containing NA  ie:")
> print(N [c(1:5,7,8,10:12),c(1:5,7,8,10:12)])
> 
> -- 
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
> 
> thanks to all
> On 21/05/2012, at 1:10 AM, peter dalgaard wrote:
> 
> > 
> > On May 20, 2012, at 16:37 , Bert Gunter wrote:
> > 
> >> Your problem is not well-defined. In your example below, why not
> >> remove rows 1,2,6, and 10, all of which contain NA's? Is the matrix
> >> supposed to be symmetric?
> YES
> 
> >> Do NA's always occur symmetrically?
> YES
> > 
> > ...and even if they do, how do you decide whether to remove row/col 9 or
> > row/col 10 in the example? (Or, for that matter, between (1 and 2) and 6.
> > In that case you might choose to remove the smallest no. of row/cols but in
> > "9 vs. 10", the situation is completely symmetric.) 
> > 
> >> 
> >> You either need to rethink what you want to do or clarify your
> >> statement of it.
> >> 
> >> -- Bert
> >> 
> >> On Sun, May 20, 2012 at 7:17 AM, Nevil Amos <nevil.amos at monash.edu> wrote:
> >>> I have some square matrices with na values in corresponding rows and
> >>> columns.
> >>> 
> >>> M<-matrix(1:2,10,10)
> >>> M[6,1:2]<-NA
> >>> M[10,9]<-NA
> >>> M<-as.matrix(as.dist(M))
> >>> print (M)
> >>> 
> >>>     1  2 3 4 5  6 7 8  9 10
> >>> 1   0  2 1 2 1 NA 1 2  1  2
> >>> 2   2  0 1 2 1 NA 1 2  1  2
> >>> 3   1  1 0 2 1  2 1 2  1  2
> >>> 4   2  2 2 0 1  2 1 2  1  2
> >>> 5   1  1 1 1 0  2 1 2  1  2
> >>> 6  NA NA 2 2 2  0 1 2  1  2
> >>> 7   1  1 1 1 1  1 0 2  1  2
> >>> 8   2  2 2 2 2  2 2 0  1  2
> >>> 9   1  1 1 1 1  1 1 1  0 NA
> >>> 10  2  2 2 2 2  2 2 2 NA  0
> >>> 
> >>> 
> >>> How do I remove just the row/column pair (in this trivial example row 6
> >>> and 10 and column 6 and 10) containing the NA values?
> >>> 
> >>> so that I end up with all rows/ columns that are not NA - e.g.
> >>> 
> >>>   1 2 3 4 5 7 8 9
> >>> 1 0 2 1 2 1 1 2 1
> >>> 2 2 0 1 2 1 1 2 1
> >>> 3 1 1 0 2 1 1 2 1
> >>> 4 2 2 2 0 1 1 2 1
> >>> 5 1 1 1 1 0 1 2 1
> >>> 7 1 1 1 1 1 0 2 1
> >>> 8 2 2 2 2 2 2 0 1
> >>> 9 1 1 1 1 1 1 1 0
> >>> 
> >>> 
> >>> If I use na.omit I lose rows 1, 2, 6, and 9,
> >>> which is not what I want.
> >>> 
> >>> thanks
> >>> --
> >>> Nevil Amos
> >>> Molecular Ecology Research Group
> >>> Australian Centre for Biodiversity
> >>> Monash University
> >>> CLAYTON VIC 3800
> >>> Australia
> >>> 
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >> 
> >> 
> >> 
> >> -- 
> >> 
> >> Bert Gunter
> >> Genentech Nonclinical Biostatistics
> >> 
> >> Internal Contact Info:
> >> Phone: 467-7374
> >> Website:
> >> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> >> 
> > 
> > -- 
> > Peter Dalgaard, Professor,
> > Center for Statistics, Copenhagen Business School
> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> > Phone: (+45)38153501
> > Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com