[R] max & min values within dataframe

R. Michael Weylandt michael.weylandt at gmail.com
Mon Nov 14 17:36:29 CET 2011


I took a stab at this using ddply() from the plyr package. How's this
look to you?

x <- textConnection("Col   Patient Region Score Time
1        1      X    19   28
2        1      X    20  126
3        1      X    22  100
4        1      X    25  191
5        2      Y    12    1
6        2      Y    12    2
7        2      Y    25    4
8        2      Y    26    7
9        3      X     6    1
10       3      X     6    4
11       3      X    21   31
12       3      X    22   68
13       3      X    23   31
14       3      X    24   38
15       3      X    21   15
16       3      X    22   24
17       3      X    23   15
18       3      X    24  243
19       3      X    25   77
20       4      Y     6    5
21       4      Y    22   28
22       4      Y    23   75
23       4      Y    24   19
24       5      Y    23    3
25       5      Y    24    1
26       5      Y    23   33
27       5      Y    24   13
28       5      Y    25   42
29       5      Y    26   21
30       5      Y    27    4
31       6      Y    24    4
32       6      Y    32    8")
V <- read.table(x, header = TRUE)[, -1]  # drop the row-number column "Col"
close(x)  # close only this connection rather than all open ones
rm(x)
# Everything above just reads the data in.

library(plyr)

# One row per Patient/Region pair, with the max and min Score in each group.
R <- ddply(V, c("Patient", "Region"), function(d) {
  c(max = max(d$Score), min = min(d$Score))
})
R

  Patient Region max min
1       1      X  25  19
2       2      Y  26  12
3       3      X  25   6
4       4      Y  24   6
5       5      Y  27  23
6       6      Y  32  24
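
If you would rather not load plyr, here is a rough base-R sketch of the same idea. It groups on both Patient and Region in a single aggregate() call, so Region does not have to be matched up separately, and then unpacks the two-column matrix that aggregate() returns when FUN is range() (the quirk Josh describes below). The names scores, agg and flat are just ones I made up for illustration, and the data frame is retyped from the table above so that the snippet runs on its own.

```r
# Same Patient/Region/Score values as the table above, rebuilt so this runs alone.
n <- c(4, 4, 11, 4, 7, 2)  # number of rows per patient
scores <- data.frame(
  Patient = rep(1:6, times = n),
  Region  = rep(c("X", "Y", "X", "Y", "Y", "Y"), times = n),
  Score   = c(19, 20, 22, 25, 12, 12, 25, 26,
              6, 6, 21, 22, 23, 24, 21, 22, 23, 24, 25,
              6, 22, 23, 24, 23, 24, 23, 24, 25, 26, 27, 24, 32)
)

# Group on both keys; range() returns c(min, max) for each group.
agg <- aggregate(Score ~ Patient + Region, data = scores, FUN = range)

# agg$Score is a two-column matrix, so pull its columns out explicitly.
flat <- data.frame(Patient = agg$Patient,
                   Region  = agg$Region,
                   Min     = agg$Score[, 1],
                   Max     = agg$Score[, 2])

# aggregate() orders its groups by Region first here, so re-sort by Patient.
flat <- flat[order(flat$Patient), ]
rownames(flat) <- NULL
flat
```

This should agree with the ddply() result above, apart from the column names and column order.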

Michael

On Mon, Nov 14, 2011 at 11:32 AM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
> Hi Laura,
>
> You were close.  Just use range() instead of min/max:
>
> ## your data (read in and then pasted the output of dput() to make it easy)
> dat <- structure(list(Patient = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L,
> 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L,
> 5L, 5L, 5L, 5L, 5L, 6L, 6L), Region = structure(c(1L, 1L, 1L,
> 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("X",
> "Y"), class = "factor"), Score = c(19L, 20L, 22L, 25L, 12L, 12L,
> 25L, 26L, 6L, 6L, 21L, 22L, 23L, 24L, 21L, 22L, 23L, 24L, 25L,
> 6L, 22L, 23L, 24L, 23L, 24L, 23L, 24L, 25L, 26L, 27L, 24L, 32L
> ), Time = c(28L, 126L, 100L, 191L, 1L, 2L, 4L, 7L, 1L, 4L, 31L,
> 68L, 31L, 38L, 15L, 24L, 15L, 243L, 77L, 5L, 28L, 75L, 19L, 3L,
> 1L, 33L, 13L, 42L, 21L, 4L, 4L, 8L)), .Names = c("Patient", "Region",
> "Score", "Time"), class = "data.frame", row.names = c("1", "2",
> "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14",
> "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25",
> "26", "27", "28", "29", "30", "31", "32"))
>
> tmp <- with(dat, aggregate(Score, list(Patient), range))
> tmpreg <-  with(dat, Region[!duplicated(Patient)])
>
> results <- data.frame(tmp$Group.1, tmpreg, tmp$x)
> colnames(results) <- c("Patient", "Region", "Min", "Max")
>
> Note that it is a little tricky to get the results into a data frame,
> because tmp is a bit of an odd data frame: due to the way aggregate()
> works, the first column is a regular vector, but the second column
> actually contains a two-column matrix.  To get it into regular form, I
> extracted the pieces separately when creating 'results'.
>
> Cheers,
>
> Josh
>
> On Mon, Nov 14, 2011 at 8:10 AM, B Laura <gm.spam2011 at gmail.com> wrote:
>> Dear R-team,
>>
>> I need to find the min and max values for each patient in a dataset and
>> keep the output as a data frame with the following columns:
>>  - Patient nr
>>  - Region (remains same per patient)
>>  - Min score
>>  - Max score
>>
>>
>>    Patient Region Score Time
>> 1        1      X    19   28
>> 2        1      X    20  126
>> 3        1      X    22  100
>> 4        1      X    25  191
>> 5        2      Y    12    1
>> 6        2      Y    12    2
>> 7        2      Y    25    4
>> 8        2      Y    26    7
>> 9        3      X     6    1
>> 10       3      X     6    4
>> 11       3      X    21   31
>> 12       3      X    22   68
>> 13       3      X    23   31
>> 14       3      X    24   38
>> 15       3      X    21   15
>> 16       3      X    22   24
>> 17       3      X    23   15
>> 18       3      X    24  243
>> 19       3      X    25   77
>> 20       4      Y     6    5
>> 21       4      Y    22   28
>> 22       4      Y    23   75
>> 23       4      Y    24   19
>> 24       5      Y    23    3
>> 25       5      Y    24    1
>> 26       5      Y    23   33
>> 27       5      Y    24   13
>> 28       5      Y    25   42
>> 29       5      Y    26   21
>> 30       5      Y    27    4
>> 31       6      Y    24    4
>> 32       6      Y    32    8
>>
>> So far I could find the min and max values for each patient, but the output
>> of it is not (yet) what I need.
>>
>>> Patient.nr = unique(Patient)
>>> aggregate(Score, list(Patient), max)
>>  Group.1  x
>> 1       1 25
>> 2       2 26
>> 3       3 25
>> 4       4 24
>> 5       5 27
>> 6       6 32
>>
>>> aggregate(Score, list(Patient), min)
>>  Group.1  x
>> 1       1 19
>> 2       2 12
>> 3       3  6
>> 4       4  6
>> 5       5 23
>> 6       6 24
>>
>> I would like to do the same, but write this new information (min, max
>> values) to a data frame with the following columns:
>>  - Patient nr
>>  - Region (remains same per patient)
>>  - Min score
>>  - Max score
>>
>> Can anybody help me with this?
>>
>> Thanks
>> Laura
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> Programmer Analyst II, ATS Statistical Consulting Group
> University of California, Los Angeles
> https://joshuawiley.com/
>


