[R] Getting minimum value of a column according to a factor column of a data frame

Rui Barradas ruipbarradas using sapo.pt
Thu Aug 25 13:44:25 CEST 2022


To keep the original order, try

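# split() orders the groups by the levels of Code, so res comes out sorted by Code
res <- lapply(split(df1, df1$Code), \(x) x[which.min(x$Q),])
res <- do.call(rbind, res)
# order(i) inverts that sort, restoring the order in which the codes first appear in df1
i <- order(unique(df1$Code))
res[order(i), ]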
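
Another way to keep the original order, as a minimal sketch only: pick the row number of the minimum of Q within each Code, then sort those row numbers. The small data frame below is a made-up stand-in (hypothetical values, not your bilan2), with the codes deliberately out of sorted order. Note that this orders the result by where each minimum row sits in the data, which matches the order of first appearance of the codes only when each code's rows are contiguous.

```r
# Stand-in data (hypothetical values); Code is deliberately not sorted.
df1 <- data.frame(
  Code = factor(c(41009, 41009, 41003, 41003, 41005)),
  Q    = c(0.22, 0.218, 0.16, 0.197, 0.21)
)

# For each Code, the row number in df1 of the first minimum of Q.
idx <- tapply(seq_len(nrow(df1)), df1$Code,
              function(i) i[which.min(df1$Q[i])])

# Sorting the row numbers puts the selected rows back in data order.
res <- df1[sort(as.vector(idx)), ]
res
```

Because the rows are taken directly from df1, the original row names are kept as well.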

Hope this helps,

Rui Barradas

At 08:53 on 25/08/2022, javad bayat wrote:
> Dear Rui;
> Thank you very much. Both of your code snippets worked correctly. Now I can see the
> whole row's values.
> But I found a problem in the results. When I run your code, the results
> are shown in a sorted table. I do not know why the results have been sorted
> by the "Code" column, smallest to largest. Is there any way to
> get the results in the same order as in the first data frame (bilan2)? I used
> your code as follows:
>> bilan3 <- lapply(split(bilan2, bilan2$Code), \(x) x[which.min(x$Q),])
>> bilan3 = data.frame(do.call(rbind, bilan3))
> Sincerely
> On Thu, Aug 25, 2022 at 11:52 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>> Hello,
>> OK, what about
>> res <- lapply(split(df1, df1$Code), \(x) x[which.min(x$Q),])
>> do.call(rbind, res)
>> #         Code  Y  M  D     Q    N    O
>> #  41003 41003 81  1 19 0.160 7.17 2.50
>> #  41005 41005 79  8 17 0.210 5.50 7.20
>> #  41009 41009 79  2 21 0.218 5.56 4.04
>> #  41017 41017 79 10 20 0.240 5.30 7.10
>> A dplyr solution.
>> suppressPackageStartupMessages(library(dplyr))
>> df1 %>%
>>     group_by(Code) %>%
>>     slice_min(Q) %>%
>>     slice_head(n = 1)
>> #  # A tibble: 4 × 7
>> #  # Groups:   Code [4]
>> #    Code      Y     M     D     Q     N     O
>> #    <fct> <int> <int> <int> <dbl> <dbl> <dbl>
>> #  1 41003    81     1    19 0.16   7.17  2.5
>> #  2 41005    79     8    17 0.21   5.5   7.2
>> #  3 41009    79     2    21 0.218  5.56  4.04
>> #  4 41017    79    10    20 0.24   5.3   7.1
>> Hope this helps,
>> Rui Barradas
>> At 05:56 on 25/08/2022, javad bayat wrote:
>>> Dear all,
>>> Many thanks for your suggested methods and code, but unfortunately they
>>> did not give the desired results.
>>> All the code you provided is correct, but it does not return the
>>> whole row that corresponds to the minimum of "Q".
>>> The code must return 4 rows, with the minimum value of "Q" and the other
>>> column values, as below:
>>> Code   Y   M   D   Q      N     O
>>> 41003  81  1   19  0.16   7.17  2.5
>>> 41005  79  8   17  0.21   5.5   7.2
>>> 41009  79  2   21  0.218  5.56  4.04
>>> 41017  79  10  20  0.24   5.3   7.1
>>> Sincerely
>>> On Wed, Aug 24, 2022 at 9:24 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>>> Hello,
>>>> Here are two options, the 1st outputs a vector, the 2nd a data.frame.
>>>> x<-'41003 81 1 19 0.16 7.17 2.5
>>>> 41003 77 9 22 0.197 6.8 2.2
>>>> 41003 79 7 28 0.21 4.7 6.2
>>>> 41005 79 8 17 0.21 5.5 7.2
>>>> 41005 80 10 30 0.21 6.84 2.6
>>>> 41005 80 12 20 0.21 6.84 2.4
>>>> 41005 79 6 14 0.217 5.61 3.55
>>>> 41009 79 2 21 0.218 5.56 4.04
>>>> 41009 79 5 27 0.218 6.4 3.12
>>>> 41009 80 11 29 0.22 6.84 2.8
>>>> 41009 78 5 28 0.232 6 3.2
>>>> 41009 81 8 20 0.233 6.39 1.6
>>>> 41009 79 9 30 0.24 5.6 7.5
>>>> 41017 79 10 20 0.24 5.3 7.1
>>>> 41017 80 7 30 0.24 6.73 2.6'
>>>> df1 <- read.table(textConnection(x))
>>>> names(df1) <- scan(what = character(),
>>>>                       text = 'Code Y M D Q N O')
>>>> df1$Code <- factor(df1$Code)
>>>> # 1st option
>>>> with(df1, tapply(Q, Code, min))
>>>> #  41003 41005 41009 41017
>>>> #  0.160 0.210 0.218 0.240
>>>> # 2nd option
>>>> aggregate(Q ~ Code, df1, min)
>>>> #     Code     Q
>>>> #  1 41003 0.160
>>>> #  2 41005 0.210
>>>> #  3 41009 0.218
>>>> #  4 41017 0.240
>>>> Hope this helps,
>>>> Rui Barradas
>>>> At 08:44 on 24/08/2022, javad bayat wrote:
>>>>> Dear all;
>>>>> I am trying to get the minimum value of a column based on a factor column
>>>>> of the same data frame. My data frame is like the one below:
>>>>> Code Y M D Q N O
>>>>> 41003 81 1 19 0.16 7.17 2.5
>>>>> 41003 77 9 22 0.197 6.8 2.2
>>>>> 41003 79 7 28 0.21 4.7 6.2
>>>>> 41005 79 8 17 0.21 5.5 7.2
>>>>> 41005 80 10 30 0.21 6.84 2.6
>>>>> 41005 80 12 20 0.21 6.84 2.4
>>>>> 41005 79 6 14 0.217 5.61 3.55
>>>>> 41009 79 2 21 0.218 5.56 4.04
>>>>> 41009 79 5 27 0.218 6.4 3.12
>>>>> 41009 80 11 29 0.22 6.84 2.8
>>>>> 41009 78 5 28 0.232 6 3.2
>>>>> 41009 81 8 20 0.233 6.39 1.6
>>>>> 41009 79 9 30 0.24 5.6 7.5
>>>>> 41017 79 10 20 0.24 5.3 7.1
>>>>> 41017 80 7 30 0.24 6.73 2.6
>>>>> I want to get the minimum value of the "Q" column with the whole row
>>>>> values, according to the "Code" column, which is a factor. Overall it will
>>>>> give me 4 rows, with the value of "Q". Below is the code that I used, but it
>>>>> did not give me what I wanted.
>>>>>> x[which(x$Q == min(x$Q)),]
>>>>> Sincerely
