[R] selecting the COLUMNS in a dataframe function of the numerical values in a ROW

Bogdan Tanasa tanasa at gmail.com
Fri Nov 2 05:07:26 CET 2018


Dear Bill, and Bill,

many thanks for taking the time to advise, and for your suggestions. I
believe that I should rephrase my question a bit, with a better example :
thank you again in advance for your help.

Let's assume that we start from a data frame :

x = data.frame(TTT=c(0,1,0,0),
               TTA=c(0,1,1,0),
               ATA=c(1,0,0,0),
               ATT=c(0,0,0,0),
               row.names=c("gene1", "gene2", "gene3", "gene4"))

If we select "gene2", we would like to end up with ONLY the COLUMNS
where "gene2" is NON-ZERO. In other words, the output contains only the
first 2 columns :

output = data.frame(TTT=c(0,1,0,0),
                    TTA=c(0,1,1,0),
                    row.names=c("gene1", "gene2", "gene3", "gene4"))
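
For reference, here is a minimal sketch of one way the desired output could be produced in base R. It assumes the data frame x exactly as defined above; the variable name keep is only illustrative and is not taken from any of the replies.

```r
# Rebuild the example data frame so the sketch is self-contained
x <- data.frame(TTT = c(0, 1, 0, 0),
                TTA = c(0, 1, 1, 0),
                ATA = c(1, 0, 0, 0),
                ATT = c(0, 0, 0, 0),
                row.names = c("gene1", "gene2", "gene3", "gene4"))

# One logical value per column: TRUE where the "gene2" row is non-zero.
# unlist() turns the one-row data frame into a plain named vector.
keep <- unlist(x["gene2", ]) != 0

# drop = FALSE keeps the result a data frame even if only one column survives
output <- x[, keep, drop = FALSE]
print(output)
```

Selecting by row name rather than by position means the same line should work for any other gene, e.g. by replacing "gene2" with "gene3".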

 with much appreciation,

-- bogdan

On Thu, Nov 1, 2018 at 6:34 PM William Michels <wjm1 using caa.columbia.edu>
wrote:

> Hi Bogdan,
>
> Are you saying you want to drop columns that sum to zero? If so, I'm
> not sure you've given us a good example dataframe, since all your
> numeric columns give non-zero sums.
>
> Otherwise, what you're asking for is trivial. Below is an example
> dataframe ("ygene") with an example "AGA" column that gets dropped:
>
> > xgene <- data.frame(TTT=c(0,1,0,0),
> +                TTA=c(0,1,1,0),
> +                ATA=c(1,0,0,0),
> +                gene=c("gene1", "gene2", "gene3", "gene4"))
> >
> > xgene[ , colSums(xgene[,1:3]) > 0 ]
>   TTT TTA ATA  gene
> 1   0   0   1 gene1
> 2   1   1   0 gene2
> 3   0   1   0 gene3
> 4   0   0   0 gene4
> >
> > ygene <- data.frame(TTT=c(0,1,0,0),
> +                 TTA=c(0,1,1,0),
> +                 AGA=c(0,0,0,0),
> +                 gene=c("gene1", "gene2", "gene3", "gene4"))
> >
> > ygene[ , colSums(ygene[,1:3]) > 0 ]
>   TTT TTA  gene
> 1   0   0 gene1
> 2   1   1 gene2
> 3   0   1 gene3
> 4   0   0 gene4
>
>
> HTH,
>
> Bill.
>
> William Michels, Ph.D.
>
>
> On Thu, Nov 1, 2018 at 5:45 PM, Bogdan Tanasa <tanasa using gmail.com> wrote:
> > Dear all, please may I ask for a suggestion :
> >
> > considering a dataframe  that contains the numerical values for gene
> > expression, for example :
> >
> >  x = data.frame(TTT=c(0,1,0,0),
> >                TTA=c(0,1,1,0),
> >                ATA=c(1,0,0,0),
> >                gene=c("gene1", "gene2", "gene3", "gene4"))
> >
> > how could I select only the COLUMNS where the value of a GENE (a ROW) is
> > non-zero ?
> >
> > thank you !
> >
> > -- bogdan
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
