[R] Consistent size dashes...

Noah Marconi noah.marconi at noah-marconi.com
Sat Mar 29 19:52:00 CET 2014


If you rename two or more levels to the same string, R merges them into 
a single level, so using gsub on the levels as you suggest works well. If 
more than these two dash characters are causing problems, you may need to 
expand the pattern being matched. With the example you sent, this works:


df <- structure(c(6L, 4L, 8L, 8L, 10L, 6L, 3L, 7L, 5L, 3L, 3L, 3L,
             3L, 3L, 3L, 5L, 5L, 5L, 5L, 5L, 5L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L,
             9L, 3L, 3L, 3L, 5L, 9L, 9L, 5L, 5L, 7L, 7L, 7L, 7L, 7L, 7L, 
9L,
             9L, 9L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 
9L,
             9L, 9L, 9L, 9L, 9L, 9L, 9L, 7L, 7L, 7L, 7L, 9L, 9L, 4L, 4L, 
4L,
             4L, 4L, 4L, 4L, 6L, 6L, 6L, 6L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L,
             8L, 7L, 10L, 10L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 6L, 6L, 
6L,
             6L, 5L, 5L, 5L, 5L, 5L, 5L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L,
             8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 10L, 10L, 10L, 10L, 
10L,
             10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 8L, 4L, 4L, 4L, 4L, 
4L,
             4L, 6L, 6L, 8L, 8L, 8L, 10L, 10L, 4L, 4L, 4L, 6L, 6L, 6L, 
8L,
             8L, 10L, 10L, 10L, 4L, 4L, 4L, 4L, 3L, 8L, 8L, 8L, 8L, 4L, 
4L,
             4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 6L, 6L, 6L, 
6L,
             6L, 6L, 6L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L,
             8L, 8L, 8L, 10L, 10L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 6L, 8L, 
8L,
             8L, 8L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 2L, 4L, 4L, 4L, 
4L,
             4L, 5L, 6L, 6L, 7L, 3L, 3L, 5L, 5L, 7L, 7L, 7L, 3L, 3L, 3L, 
3L,
             3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 7L, 
7L,
             7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 9L, 3L, 3L, 3L, 
3L,
             5L, 7L, 7L, 7L, 7L, 7L, 7L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L,
             9L, 9L, 11L, 3L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 7L,
             7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
1L,
             3L, 6L, 5L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 7L, 7L, 7L, 7L, 
7L,
             7L, 9L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 5L, 7L, 
7L,
             7L, 7L, 7L, 7L, 7L, 9L, 9L, 9L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 
5L,
             7L, 7L, 7L, 3L, 3L, 7L, 5L, 4L, 6L, 6L, 6L, 6L, 8L, 8L, 10L,
             3L, 4L, 5L, 7L, 3L, 3L, 3L, 5L, 5L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L,
             7L, 7L),
           .Label = c("1 - Type1", "1 – Type1", "2 - Type2", "2 – Type2",
                      "3 - Type3", "3 – Type3", "4 - Type4", "4 – Type4",
                      "5 - Type5", "5 – Type5", "Unassigned"),
           class = "factor")



levels(df) <- gsub("–", replacement="-", x=levels(df))



> table(df)
df
  1 - Type1  2 - Type2  3 - Type3  4 - Type4  5 - Type5 Unassigned
          3        116         89        147         63          1
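
If other dash variants turn up, something along these lines may help. It 
is only a sketch on a small made-up factor (x and its values are not from 
your data): it first lists the non-ASCII characters present in the levels, 
then replaces a character class of dashes in one pass. The three code 
points in the class (en dash, em dash, minus sign) are an assumption about 
what might appear, so adjust it to whatever the first step reports. It 
also assumes the strings are read in as UTF-8.

```r
# Hypothetical factor; \u2013 = en dash, \u2014 = em dash, \u2212 = minus sign
x <- factor(c("1 - Type1", "1 \u2013 Type1", "1 \u2014 Type1",
              "2 \u2212 Type2", "2 - Type2"))

# Step 1: list the non-ASCII characters that occur in the levels
chars <- unique(unlist(strsplit(levels(x), "")))
codes <- vapply(chars, utf8ToInt, integer(1))
found <- sprintf("U+%04X", codes[codes > 127L])
print(found)

# Step 2: one character class covering every dash variant found above
levels(x) <- gsub("[\u2013\u2014\u2212]", "-", levels(x))

# Levels that now share a label are merged into one
print(table(x))
```

After the replacement the five original levels reduce to two, because 
assigning duplicate labels to levels() merges them.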




On 2014-03-28 11:29, Jason Rupert wrote:
> Evidently different sized dashes were used in my data set.  Using gsub
> or some other method, is there a way to use a consistent dash?  With
> the different dash types it is difficult to build histograms, tables,
> barplots and perform other analysis. 
> 
> 
> Thanks again for your help and insights. 
> 
> 
> 1 - Type1
> 1 – Type1
> 2 - Type2
> 2 – Type2
> 3 - Type3
> 3 – Type3
> 4 - Type4
> 4 – Type4
> 5 - Type5
> 5 – Type5
> Etc.
> 



