[R] Removing the rows where all the elements are 0

arun smartpink111 at yahoo.com
Mon Aug 5 20:00:19 CEST 2013


Not sure I understand the problem.
dat1<- read.table(text="
gene    ZPT.1   ZPT.0   ZPT.2   ZPT.3   PDGT.1  PDGT.0
XLOC_000001 3516    626 1277    770 4309    9030
XLOC_000002 342 82  185 72  835 1095
XLOC_000003 2000    361 867 438 454 687
XLOC_000004 143 30  67  37  90  236
XLOC_000005 0   0.21   0.1   0   0   0
XLOC_000006 0   0.1   0   0.01   0   0
XLOC_000007 0   0   0   0   1   3
XLOC_000008 0   0   0   0   0.15   0
XLOC_000009 0   0   0.12   0   0   0
XLOC_000010 7   1   5   3   0   1
XLOC_000011 63  10  19  15  92  228
",sep="",stringsAsFactors=FALSE,header=TRUE)
mat1<- as.matrix(dat1[,-1])
 row.names(mat1)<- dat1[,1]


 mat1[rowSums(mat1<=0.2)!=ncol(mat1),]
#            ZPT.1  ZPT.0  ZPT.2 ZPT.3 PDGT.1 PDGT.0
#XLOC_000001  3516 626.00 1277.0   770   4309   9030
#XLOC_000002   342  82.00  185.0    72    835   1095
#XLOC_000003  2000 361.00  867.0   438    454    687
#XLOC_000004   143  30.00   67.0    37     90    236
#XLOC_000005     0   0.21    0.1     0      0      0 ##row is kept because at least one of its elements is > 0.2
#XLOC_000007     0   0.00    0.0     0      1      3
#XLOC_000010     7   1.00    5.0     3      0      1
#XLOC_000011    63  10.00   19.0    15     92    228
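
To see why this condition behaves the way it does, here is a minimal sketch on a made-up matrix (the name m and its values are only for illustration, not part of the data above). It splits the one-liner into its intermediate steps.

```r
# Toy matrix (hypothetical values, not the gene data above)
m <- matrix(c(5, 0,   3,
              0, 0.1, 0.05,
              0, 0.3, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("s1", "s2", "s3")))

low   <- m <= 0.2          # logical matrix: TRUE where a value is at or below the cutoff
n_low <- rowSums(low)      # TRUE counts as 1, so this is the number of low values per row
keep  <- n_low != ncol(m)  # FALSE only when every value in the row is low

# drop = FALSE is an addition here: it keeps the result a matrix
# even if only a single row survives the filter
kept <- m[keep, , drop = FALSE]
```

With this toy input, n_low is 1, 3 and 2 for rows a, b and c, so only row b (all values <= 0.2) is dropped. Row c stays because of its single 0.3, which is the same reason XLOC_000005 stays in the result above.
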
# indices of the rows that were dropped, i.e. rows where every value is <= 0.2
as.vector(which(rowSums(mat1<=0.2)==ncol(mat1)))
#[1] 6 8 9

mat1[c(6,8,9),]
#            ZPT.1 ZPT.0 ZPT.2 ZPT.3 PDGT.1 PDGT.0
#XLOC_000006     0   0.1  0.00  0.01   0.00      0
#XLOC_000008     0   0.0  0.00  0.00   0.15      0
#XLOC_000009     0   0.0  0.12  0.00   0.00      0
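
"Rows which have values less than 0.2" can be read two ways, so here is a hedged sketch of both readings, again on a made-up matrix (m, cutoff and the keep_* names are assumptions for illustration only).

```r
# Toy matrix (hypothetical values); cutoff is the 0.2 from the question
m <- matrix(c(5, 0,   3,
              0, 0.1, 0.05,
              0, 0.3, 0,
              4, 2,   7),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("a", "b", "c", "d"), NULL))
cutoff <- 0.2

# Reading 1: drop a row only when ALL of its values are at or below the cutoff
keep_some_high <- rowSums(m > cutoff) > 0

# Reading 2: drop a row when ANY of its values is at or below the cutoff
keep_all_high <- rowSums(m > cutoff) == ncol(m)

res_some <- m[keep_some_high, , drop = FALSE]  # keeps rows a, c, d
res_all  <- m[keep_all_high,  , drop = FALSE]  # keeps row d only
```

Reading 1 is equivalent to the rowSums(mat1<=0.2)!=ncol(mat1) condition used above. If the matrix can contain NA, rowSums() returns NA for those rows; passing na.rm=TRUE is one way to deal with that, depending on how missing values should be treated.
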



A.K.



________________________________
From: Vivek Das <vd4mmind at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Monday, August 5, 2013 1:05 PM
Subject: Re: Removing the rows where all the elements are 0



Hi Arun,

This seems to work only if the values are exactly 0. If there are values in the rows like 0.01, 0.08 and 0.05, and I use the command
 res2<-mat1[rowSums(mat1<=0.2)!=ncol(mat1),]

then it does not work. Can you tell me why? Let's say I want to remove the rows which have values less than 0.2 for the columns; what should the condition be?


----------------------------------------------------------

Vivek Das
PhD Student in Computational Biology
Giuseppe Testa's Lab
European School of Molecular Medicine
IFOM-IEO Campus
Via Adamello, 16
Milan, Italy

emails: vivek.das at ieo.eu
            vchris_05 at yahoo.co.in
            vd4mmind at gmail.com



On Mon, Aug 5, 2013 at 2:31 PM, arun <smartpink111 at yahoo.com> wrote:

Hi Vivek,
>
>dat1<- read.table(text="
>
>gene    ZPT.1   ZPT.0   ZPT.2   ZPT.3   PDGT.1  PDGT.0
>XLOC_000001 3516    626 1277    770 4309    9030
>XLOC_000002 342 82  185 72  835 1095
>XLOC_000003 2000    361 867 438 454 687
>XLOC_000004 143 30  67  37  90  236
>XLOC_000005 0   0   0   0   0   0
>XLOC_000006 0   0   0   0   0   0
>XLOC_000007 0   0   0   0   1   3
>XLOC_000008 0   0   0   0   0   0
>XLOC_000009 0   0   0   0   0   0
>XLOC_000010 7   1   5   3   0   1
>XLOC_000011 63  10  19  15  92  228
>",sep="",stringsAsFactors=FALSE,header=TRUE)
>
>res<- dat1[rowSums(dat1[,-1]==0)!=(ncol(dat1)-1),]
>res
>#          gene ZPT.1 ZPT.0 ZPT.2 ZPT.3 PDGT.1 PDGT.0
>#1  XLOC_000001  3516   626  1277   770   4309   9030
>#2  XLOC_000002   342    82   185    72    835   1095
>#3  XLOC_000003  2000   361   867   438    454    687
>#4  XLOC_000004   143    30    67    37     90    236
>#7  XLOC_000007     0     0     0     0      1      3
>#10 XLOC_000010     7     1     5     3      0      1
>#11 XLOC_000011    63    10    19    15     92    228
>
>If it is a matrix:
>mat1<- as.matrix(dat1[,-1])
> row.names(mat1)<- dat1[,1]
>
>
> res2<-mat1[rowSums(mat1==0)!=ncol(mat1),]
> res2
>#            ZPT.1 ZPT.0 ZPT.2 ZPT.3 PDGT.1 PDGT.0
>#XLOC_000001  3516   626  1277   770   4309   9030
>#XLOC_000002   342    82   185    72    835   1095
>#XLOC_000003  2000   361   867   438    454    687
>#XLOC_000004   143    30    67    37     90    236
>#XLOC_000007     0     0     0     0      1      3
>#XLOC_000010     7     1     5     3      0      1
>#XLOC_000011    63    10    19    15     92    228
>
>
>#I don't have an account on Stack Overflow.  So, it must be somebody else.
>A.K.
>
>
>
>________________________________
>From: Vivek Das <vd4mmind at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Monday, August 5, 2013 6:31 AM
>Subject: Removing the rows where all the elements are 0
>
>
>
>
>Hi Arun,
>I am using a matrix of gene expression frag counts to calculate differentially expressed genes. I would like to know how to remove the rows in which all the values are 0. My data set will then be more compact, and the downstream analysis I do with this matrix will give fewer spurious results.
>
>Input
>gene        ZPT.1 ZPT.0 ZPT.2 ZPT.3 PDGT.1 PDGT.0
>XLOC_000001  3516   626  1277   770   4309   9030
>XLOC_000002   342    82   185    72    835   1095
>XLOC_000003  2000   361   867   438    454    687
>XLOC_000004   143    30    67    37     90    236
>XLOC_000005     0     0     0     0      0      0
>XLOC_000006     0     0     0     0      0      0
>XLOC_000007     0     0     0     0      1      3
>XLOC_000008     0     0     0     0      0      0
>XLOC_000009     0     0     0     0      0      0
>XLOC_000010     7     1     5     3      0      1
>XLOC_000011    63    10    19    15     92    228
>
>Desired output
>gene        ZPT.1 ZPT.0 ZPT.2 ZPT.3 PDGT.1 PDGT.0
>XLOC_000001  3516   626  1277   770   4309   9030
>XLOC_000002   342    82   185    72    835   1095
>XLOC_000003  2000   361   867   438    454    687
>XLOC_000004   143    30    67    37     90    236
>XLOC_000007     0     0     0     0      1      3
>XLOC_000010     7     1     5     3      0      1
>XLOC_000011    63    10    19    15     92    228
>
>For now I only want to remove those rows where all the frag count columns are 0. If some values in a row are 0 and others are non-zero, I would like to keep that row intact, as you can see in my example above.
>Please let me know how to do this.
>
>Hey Arun, I did not understand the command you wrote on the R Stack Overflow forum. Can you please write it here and help me out?
>


