[R] Adding SORT to UNIQUE

Stephen H. Dawson, DSL
Wed Dec 22 16:55:22 CET 2021


I see.

So, we are talking about putting the output into a new dataframe. I was
hoping to have the output rendered on screen without another dataframe,
but I can live with this option if it must occur.

Am I correct that the desired vertical output must first go to a dataframe?
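
Going by the reply quoted below, a dataframe does not seem to be strictly required: cat() or writeLines() can print a vector one element per line. A minimal sketch, assuming a made-up data frame in place of the real Data (the column name colour and its values are hypothetical):

```r
# Hypothetical stand-in for the real data; column name and values are assumptions.
Data <- data.frame(colour = c("red", "green", "blue", "red"),
                   stringsAsFactors = FALSE)

# Distinct values of the first column, sorted, as a plain character vector.
v <- sort(unique(Data[[1]]))

# Print one element per line without building another dataframe.
# writeLines() needs a character vector; wrap numeric columns in as.character().
writeLines(v)

# cat() gives the same vertical display and also accepts numeric vectors.
cat(v, sep = "\n")
```

Both calls only display the values; v itself stays an ordinary vector that can be reused later.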


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com <http://www.shdawson.com>


On 12/22/21 10:47 AM, Duncan Murdoch wrote:
> On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:
>> Thanks for the reply.
>>
>> Both syntax options work to render the correct (unique) output. However,
>> the output is rendered horizontally. What needs to happen to get the
>> output to render vertically, please?
>
> The result of those expressions is a vector of the same type as the 
> column, so your question is really about how to get a vector to print 
> one element per line.
>
> Probably the simplest way is to put the vector in a dataframe (or 
> matrix, or tibble, depending on which formatting you prefer).  For 
> example,
>
> >   v <- c("red", "green", "blue")
> >   data.frame(v)
>       v
> 1   red
> 2 green
> 3  blue
>
> If you want a more minimal display, try
>
> > cat(v, sep = "\n")
> red
> green
> blue
>
> or
>
> > cat(format(v, justify = "right"), sep = "\n")
>   red
> green
>  blue
>
> If you want this to happen when you auto-print the object, you can 
> give it a class attribute and write a function to print that class, e.g.
>
> >  class(v) <- "oneperline"
> >
> >   print.oneperline <- function(x, ...) cat(format(x, justify = "right"), sep = "\n")
> >
> >   v
>   red
> green
>  blue
>
> Duncan Murdoch
>
>>
>>
>> On 12/21/21 11:38 AM, Duncan Murdoch wrote:
>>> On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:
>>>> On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:
>>>>> Thanks for the reply.
>>>>>
>>>>> sort(unique(Data[1]))
>>>>> Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
>>>>> decreasing)) :
>>>>>       undefined columns selected
>>>>
>>>> That's the wrong syntax:  Data[1] is not "column one of Data". Use
>>>> Data[[1]] for that, so
>>>>
>>>>      sort(unique(Data[[1]]))
>>>
>>> Actually, I'd probably recommend
>>>
>>>    sort(unique(Data[, 1]))
>>>
>>> instead.  This treats Data as a matrix rather than as a list.
>>> Dataframes are lists that look like matrices, but to me the matrix
>>> aspect is usually more intuitive.
>>>
>>> Duncan Murdoch
>>>
>>>>
>>>> I think Rui already pointed out the typo in the quoted text below...
>>>>
>>>> Duncan Murdoch
>>>>
>>>>>
>>>>> The recommended syntax did not work, as listed above.
>>>>>
>>>>> What I want is the sorted, distinct output of a column. Again, the
>>>>> column may be text or numbers. This is a huge analysis effort with
>>>>> data coming at me from many different sources.
>>>>>
>>>>>
>>>>> On 12/21/21 11:07 AM, Duncan Murdoch wrote:
>>>>>> On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:
>>>>>>> Thanks everyone for the replies.
>>>>>>>
>>>>>>> It is clear one either needs to write a function or put the unique
>>>>>>> entries into another dataframe.
>>>>>>>
>>>>>>> It seems odd R cannot sort a list of unique column entries with ease.
>>>>>>> Python and SQL can do it with ease.
>>>>>>
>>>>>> I've seen several responses that looked pretty simple. It's hard to
>>>>>> beat sort(unique(x)), though there's a fair bit of confusion about
>>>>>> what you actually want.  Maybe you should post an example of the code
>>>>>> you'd use in Python?
>>>>>>
>>>>>> Duncan Murdoch
>>>>>>
>>>>>>>
>>>>>>> QUESTION
>>>>>>> Is there a simpler means, other than the unique function, to capture
>>>>>>> distinct column entries and then sort that list?
>>>>>>>
>>>>>>>
>>>>>>> On 12/20/21 5:53 PM, Rui Barradas wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Inline.
>>>>>>>>
>>>>>>>> At 21:18 on 20/12/21, Stephen H. Dawson, DSL via R-help wrote:
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> sort(unique(Data[[1]]))
>>>>>>>>>
>>>>>>>>> This syntax provides row numbers, not column values.
>>>>>>>>
>>>>>>>> This is not right.
>>>>>>>> The syntax Data[1] extracts a sub-data.frame, the syntax Data[[1]]
>>>>>>>> extracts the column vector.
>>>>>>>>
>>>>>>>> As for my previous answer, it did not address the question; I
>>>>>>>> misinterpreted it as a question on how to sort in numeric order
>>>>>>>> when the data is not numeric. Here is a, hopefully, complete answer,
>>>>>>>> still with package stringr.
>>>>>>>>
>>>>>>>>
>>>>>>>> cols_to_sort <- 1:4
>>>>>>>>
>>>>>>>> Data2 <- lapply(Data[cols_to_sort], \(x){
>>>>>>>>       stringr::str_sort(unique(x), numeric = TRUE)
>>>>>>>> })
>>>>>>>>
>>>>>>>>
>>>>>>>> Or, using Avi's suggestion of writing a function to do all the work
>>>>>>>> and simplify the lapply loop later,
>>>>>>>>
>>>>>>>>
>>>>>>>> unisort2 <- function(vec, ...) stringr::str_sort(unique(vec), ...)
>>>>>>>> Data2 <- lapply(Data[cols_to_sort], unisort2, numeric = TRUE)
>>>>>>>>
>>>>>>>>
>>>>>>>> Hope this helps,
>>>>>>>>
>>>>>>>> Rui Barradas
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/20/21 11:58 AM, Stephen H. Dawson, DSL via R-help wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Running a simple set of commands to review entries in dataframe
>>>>>>>>>> columns. Here is the working code.
>>>>>>>>>>
>>>>>>>>>> Data <- read.csv("./input/Source.csv", header = TRUE)
>>>>>>>>>> describe(Data)
>>>>>>>>>> summary(Data)
>>>>>>>>>> unique(Data[1])
>>>>>>>>>> unique(Data[2])
>>>>>>>>>> unique(Data[3])
>>>>>>>>>> unique(Data[4])
>>>>>>>>>>
>>>>>>>>>> I would like to sort the unique entries. The data in the various
>>>>>>>>>> columns are not all defined as numbers; some are text. I realize
>>>>>>>>>> 1 and 10 will not sort properly when the column is not defined as
>>>>>>>>>> a number, but I want to see what I have in the columns, viewed as
>>>>>>>>>> sorted.
>>>>>>>>>>
>>>>>>>>>> QUESTION
>>>>>>>>>> What is the best process to sort unique output, please?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> ______________________________________________
>>>>>>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>>>> PLEASE do read the posting guide
>>>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>>>
>
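
The indexing forms discussed in this thread can be compared side by side. A minimal sketch, assuming a small made-up data frame in place of Source.csv (the column names id and group and all values are hypothetical), and using base R sort() rather than stringr:

```r
# Hypothetical stand-in for Source.csv; names and values are assumptions.
Data <- data.frame(id    = c("10", "1", "2", "10"),
                   group = c("b", "a", "b", "c"),
                   stringsAsFactors = FALSE)

class(Data[1])     # "data.frame": one-column sub-dataframe (the form that broke sort() above)
class(Data[[1]])   # "character":  the column itself, as a vector
class(Data[, 1])   # "character":  the same vector via matrix-style indexing

# Sorted distinct values of one column. Text sorts as text, so "10" precedes "2".
first_col <- sort(unique(Data[[1]]))

# The same for several columns at once. The result is a named list, since
# columns can have different numbers of distinct values.
cols_to_sort <- 1:2
sorted_unique <- lapply(Data[cols_to_sort], function(x) sort(unique(x)))

# Display one column's values vertically.
cat(sorted_unique$group, sep = "\n")
```

If numeric-aware ordering of text such as "1", "2", "10" is wanted, the stringr::str_sort(..., numeric = TRUE) approach quoted above is the one suggested in the thread.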


