[R] Adding SORT to UNIQUE

Duncan Murdoch
Tue Dec 21 17:38:35 CET 2021


On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:
> On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:
>> Thanks for the reply.
>>
>> sort(unique(Data[1]))
>> Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
>> decreasing)) :
>>      undefined columns selected
> 
> That's the wrong syntax:  Data[1] is not "column one of Data".  Use
> Data[[1]] for that, so
> 
>     sort(unique(Data[[1]]))

Actually, I'd probably recommend

   sort(unique(Data[, 1]))

instead.  This treats Data as a matrix rather than as a list. 
Dataframes are lists that look like matrices, but to me the matrix 
aspect is usually more intuitive.
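
A minimal sketch of the three indexing forms discussed here, for readers who want to see the difference side by side. The name Data follows the thread; the column names and values are invented for illustration and are not the poster's data.

```r
# Hypothetical stand-in for the poster's data; the values are invented.
Data <- data.frame(
  id  = c("10", "2", "1", "2"),
  grp = c("b", "a", "b", "a"),
  stringsAsFactors = FALSE
)

sub_df   <- Data[1]    # list-style single bracket: a one-column data frame
col_list <- Data[[1]]  # list-style double bracket: the column vector itself
col_mat  <- Data[, 1]  # matrix-style: also the column vector (drop = TRUE)

# unique() on a one-column data frame is still a data frame, which is
# why passing it on to sort() does not behave like sorting a vector.
still_df <- is.data.frame(unique(sub_df))

# Both vector forms give the same result, so either can feed sort(unique()).
same_vec <- identical(col_list, col_mat)

sorted_unique <- sort(unique(Data[, 1]))
print(sorted_unique)  # character sort: "1" "10" "2"
```

One caveat: the matrix-style form relies on base data.frame behaviour. If Data were a tibble, Data[, 1] would stay a one-column tibble, and Data[[1]] would be the safer choice.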

Duncan Murdoch

> 
> I think Rui already pointed out the typo in the quoted text below...
> 
> Duncan Murdoch
> 
>>
>> The recommended syntax did not work, as listed above.
>>
>> What I want is the distinct column values, sorted. Again, the column may
>> be text or numbers. This is a huge analysis effort with data coming at
>> me from many different sources.
>>
>>
>> *Stephen Dawson, DSL*
>> /Executive Strategy Consultant/
>> Business & Technology
>> +1 (865) 804-3454
>> http://www.shdawson.com
>>
>>
>> On 12/21/21 11:07 AM, Duncan Murdoch wrote:
>>> On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:
>>>> Thanks everyone for the replies.
>>>>
>>>> It is clear one either needs to write a function or put the unique
>>>> entries into another dataframe.
>>>>
>>>> It seems odd R cannot sort a list of unique column entries with ease.
>>>> Python and SQL can do it with ease.
>>>
>>> I've seen several responses that looked pretty simple.  It's hard to
>>> beat sort(unique(x)), though there's a fair bit of confusion about
>>> what you actually want.  Maybe you should post an example of the code
>>> you'd use in Python?
>>>
>>> Duncan Murdoch
>>>
>>>>
>>>> QUESTION
>>>> Is there a simpler means, other than the unique function, to capture
>>>> distinct column entries, then sort that list?
>>>>
>>>>
>>>>
>>>>
>>>> On 12/20/21 5:53 PM, Rui Barradas wrote:
>>>>> Hello,
>>>>>
>>>>> Inline.
>>>>>
>>>>> At 21:18 on 20/12/21, Stephen H. Dawson, DSL via R-help wrote:
>>>>>> Thanks.
>>>>>>
>>>>>> sort(unique(Data[[1]]))
>>>>>>
>>>>>> This syntax provides row numbers, not column values.
>>>>>
>>>>> This is not right.
>>>>> The syntax Data[1] extracts a sub-data.frame, the syntax Data[[1]]
>>>>> extracts the column vector.
>>>>>
>>>>> As for my previous answer, it did not address the question; I
>>>>> misinterpreted it as a question on how to sort in numeric order
>>>>> when the data is not numeric. Here is a hopefully complete answer,
>>>>> still with package stringr.
>>>>>
>>>>>
>>>>> cols_to_sort <- 1:4
>>>>>
>>>>> Data2 <- lapply(Data[cols_to_sort], \(x){
>>>>>      stringr::str_sort(unique(x), numeric = TRUE)
>>>>> })
>>>>>
>>>>>
>>>>> Or using Avi's suggestion of writing a function to do all the work and
>>>>> simplify the lapply loop later,
>>>>>
>>>>>
>>>>> unisort <- function(vec, ...) stringr::str_sort(unique(vec), ...)
>>>>> Data2 <- lapply(Data[cols_to_sort], unisort, numeric = TRUE)
>>>>>
>>>>>
>>>>> Hope this helps,
>>>>>
>>>>> Rui Barradas
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 12/20/21 11:58 AM, Stephen H. Dawson, DSL via R-help wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>
>>>>>>> Running a simple syntax set to review entries in dataframe columns.
>>>>>>> Here is the working code.
>>>>>>>
>>>>>>> Data <- read.csv("./input/Source.csv", header = TRUE)
>>>>>>> describe(Data)
>>>>>>> summary(Data)
>>>>>>> unique(Data[1])
>>>>>>> unique(Data[2])
>>>>>>> unique(Data[3])
>>>>>>> unique(Data[4])
>>>>>>>
>>>>>>> I would like to sort the unique entries. The data in the various
>>>>>>> columns are not all defined as numbers; some are text. I realize 1 and
>>>>>>> 10 will not sort properly when the column is not defined as a number,
>>>>>>> but I want to see what I have in the columns, viewed as sorted.
>>>>>>>
>>>>>>> QUESTION
>>>>>>> What is the best process to sort unique output, please?
>>>>>>>
>>>>>>>
>>>>>>> Thanks.
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>
>>>
>>>
>>
>>
>
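
The lapply approach quoted above from Rui Barradas depends on the stringr package. As a rough base-R sketch of the same idea (unique values per column, sorted so that 1 and 10 come out in numeric order when the column holds only numbers), something like the following may do. The data, the column names, and this particular definition of unisort() are assumptions made for illustration; it is not equivalent to stringr::str_sort(numeric = TRUE), which also handles numbers embedded inside text.

```r
# Hypothetical data; column names and values are invented for illustration.
Data <- data.frame(
  code = c("10", "2", "1", "2"),
  name = c("pear", "apple", "pear", "fig"),
  stringsAsFactors = FALSE
)

# unisort() is an illustrative helper, reusing the name from the quoted
# message. It drops duplicates, then orders the values numerically when
# every unique value parses as a number. Otherwise (mixed text, or any NA)
# it falls back to an ordinary character sort.
unisort <- function(vec) {
  u <- unique(as.character(vec))
  num <- suppressWarnings(as.numeric(u))
  if (!anyNA(num)) u[order(num)] else sort(u)
}

cols_to_sort <- 1:2
Data2 <- lapply(Data[cols_to_sort], unisort)

print(Data2$code)  # "1" "2" "10"
print(Data2$name)  # "apple" "fig" "pear"
```

The result is a named list rather than a data frame, because each column can have a different number of distinct values.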


