[R] Calculating frequencies of multiple values in 200 columns

Eric Berger ericjberger at gmail.com
Fri Nov 10 17:28:44 CET 2017


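How about this workaround: add 1 to the vector, so that the 0s are counted
in bin 1 instead of being silently dropped. The result then gives the
counts of 0, 1 and 2, in that order.
x <- c(1,0,2,1,0,2,2,0,2,1)
tabulate(x)
# [1] 3 4
tabulate(x+1)
# [1] 3 3 4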
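
To carry this over to all the columns at once, here is a minimal sketch.
It assumes every column holds only the codes 0, 1 and 2; the small Values
data frame and the helper count_codes() are made up for illustration and
are not from the thread. Passing nbins = 3 keeps one slot per code even
when a code never occurs in a column.

```r
# Hypothetical stand-in for the 200-column 'Values' data set in the thread.
Values <- data.frame(
  A = c(0, 1, 2, 2, 0),
  B = c(1, 1, 1, 2, 2),
  C = c(0, 0, 0, 0, 1)
)

# Hypothetical helper: shift by 1 so that 0 lands in bin 1, and fix the
# number of bins at 3 so absent codes are reported as 0 rather than dropped.
count_codes <- function(x) {
  counts <- tabulate(x + 1, nbins = 3)
  names(counts) <- c("0", "1", "2")
  counts
}

# One column of counts per column of Values (rows are the codes 0, 1, 2).
freqs <- sapply(Values, count_codes)
print(freqs)

# Assumption: maf() accepts a vector of counts, as in the original question.
# It is not defined in the thread, so this step is left commented out.
# maf_values <- apply(freqs, 2, maf)
```

table(factor(x, levels = 0:2)) would give the same counts with names
attached, at the cost of building a factor for each column.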


On Fri, Nov 10, 2017 at 4:34 PM, Marc Schwartz <marc_schwartz at me.com> wrote:

> Hi,
>
> To clarify the default behavior that Boris is referencing below, note the
> definition of the 'bin' argument to the tabulate() function:
>
> bin: a numeric vector ***(of positive integers)***, or a factor. Long
> vectors are supported.
>
> I added the asterisks for emphasis.
>
> This is also noted in the examples used for the function in ?tabulate at
> the bottom of the help page.
>
> The second argument, 'nbins', which defaults to max(1, bin, na.rm = TRUE),
> also affects the output:
>
> > tabulate(c(2, 3, 5))
> [1] 0 1 1 0 1
>
> In this case, each element of the returned vector indicates how many
> 1's, 2's, 3's, 4's and 5's are present in the source vector.
>
> Compare that to:
>
> > tabulate(c(2, 3, 5), nbins = 3)
> [1] 0 1 1
>
> In the above example, 5 is ignored.
>
> Note also that tabulate(), unlike table(), does not return a named vector,
> just the frequencies.
>
> While tabulate() is used within the table() function, reviewing the code
> for the latter reveals how the default behavior of tabulate() is modified
> and preceded/wrapped in other code for use there.
>
> Regards,
>
> Marc Schwartz
>
>
> > On Nov 10, 2017, at 8:43 AM, Boris Steipe <boris.steipe at utoronto.ca> wrote:
> >
> > |> x <- sample(0:2, 10, replace = TRUE)
> > |> x
> > [1] 1 0 2 1 0 2 2 0 2 1
> > |> tabulate(x)
> > [1] 3 4
> > |> table(x)
> > x
> > 0 1 2
> > 3 3 4
> >
> >
> >
> > B.
> >
> >
> >
> >> On Nov 10, 2017, at 4:32 AM, Allaisone 1 <allaisone1 at hotmail.com> wrote:
> >>
> >>
> >>
> >> Thank you for your effort, Bert.
> >>
> >>
> >> I know what the problem is now: the values (1, 2, 3) were only an
> >> example. The values I have are 0, 1 and 2. The tabulate() function
> >> seems to ignore the 0 values when calculating frequencies, and this is
> >> exactly my problem, as the frequency of the 0 values must also be
> >> counted for the maf to be calculated correctly.
> >>
> >> ________________________________
> >> From: Bert Gunter <bgunter.4567 at gmail.com>
> >> Sent: 09 November 2017 23:51:35
> >> To: Allaisone 1; R-help
> >> Subject: Re: [R] Calculating frequencies of multiple values in 200 columns
> >>
> >> "For example, if I have the values 1, 2, 3 in each column, applying
> >> tabulate() would calculate the frequency of 1 and 2 without 3"
> >>
> >> Huh??
> >>
> >> > x <- sample(1:3, 10, TRUE)
> >> > x
> >> [1] 1 3 1 1 1 3 2 3 2 1
> >> > tabulate(x)
> >> [1] 5 2 3
> >>
> >> Cheers,
> >> Bert
> >>
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming
> >> along and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >> On Thu, Nov 9, 2017 at 3:44 PM, Allaisone 1 <allaisone1 at hotmail.com> wrote:
> >>
> >> Thank you so much for your reply.
> >>
> >>
> >> Actually, I tried the apply() function but struggled to write the
> >> appropriate function inside it to calculate the frequency of the 3
> >> values. The tabulate() function is a good start, but the problem is
> >> that it calculates the frequency of only two values per column. This
> >> means that when I apply the maf() function, the maf value will be
> >> calculated using the frequency of these 2 values only, without
> >> considering the frequency of the 3rd value. For example, if I have the
> >> values 1, 2, 3 in each column, applying tabulate() would calculate the
> >> frequency of 1 and 2 without 3. I need a way to calculate the
> >> frequencies of all 3 values so that the calculation of maf is correct,
> >> as it will then consider all 3 frequencies and not only 2.
> >>
> >>
> >> Regards
> >>
> >> Allahisone
> >>
> >> ________________________________
> >> From: Bert Gunter <bgunter.4567 at gmail.com>
> >> Sent: 09 November 2017 20:56:39
> >> To: Allaisone 1
> >> Cc: r-help at R-project.org
> >> Subject: Re: [R] Calculating frequencies of multiple values in 200 columns
> >>
> >> This is not a good way to do things! R has many powerful built-in
> >> functions to do this sort of thing for you. Searching -- e.g. at
> >> rseek.org or even a plain old Google search -- can help you find them.
> >> Also, it looks like you need to go through a tutorial or two to learn
> >> more about R's basic functionality.
> >>
> >> In this case, something like (no reproducible example given, so
> >> can't confirm):
> >>
> >> apply(Values, 2, function(x) maf(tabulate(x)))
> >>
> >> should be close to what you want.
> >>
> >>
> >> Cheers,
> >> Bert
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming
> >> along and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >> On Thu, Nov 9, 2017 at 11:44 AM, Allaisone 1 <allaisone1 at hotmail.com> wrote:
> >>
> >> Hi All
> >>
> >>
> >> I have a dataset of 200 columns and 1000 rows; each column contains 3
> >> repeated values (7, 8, 10). I want to calculate the frequency of each
> >> value in each column and then apply the function maf() to those
> >> frequencies. I can do the analysis step by step like this:
> >>
> >>
> >> > Values
> >>
> >>      A    B    C   ... 200
> >> 1    7   10    7
> >> 2    7    8    7
> >> 3   10    8    7
> >> 4    8    7   10
> >> .
> >> .
> >> .
> >>
> >>
> >>
> >>
> >> For column A : I calculate the frequency for the 3 values as follows :
> >>
> >> count7 <- length(which(Values$A == 7))
> >>
> >> count8 <- length(which(Values$A == 8))
> >>
> >> count10 <- length(which(Values$A == 10))
> >>
> >>
> >> count7 = 2, count8 = 1, count10 = 1.
> >>
> >>
> >> Then I create a vector and type the frequencies manually:
> >>
> >>
> >> Freq <- c(count7 = 2, count8 = 1, count10 = 1)
> >>
> >>
> >> Then I apply the function maf():
> >>
> >> maf(Freq)
> >>
> >>
> >> This gives me the result I need for column A. Could you please help
> >> me perform the analysis for all of the 200 columns at once?
> >>
> >>
> >> Regards
> >>
> >> Allahisone
