[R] fast way to find most common value across columns dataframe

Luigi Marongiu
Sat Oct 31 09:56:01 CET 2020


Hello,
I have a large data frame (1,000,000 rows, 1,000 columns) in which each
column holds single characters. I would like to determine the most common
character in each row.
In the example below, I parse one row at a time and find the
most common character (apart from ties...). But I think this will be
very slow and memory-intensive on the full data.
Is there a way to run it more efficiently?
Thank you

```
V = c("A", "B", "C", "D")
df = data.frame(n = 1:10,
       col_01 = sample(V, 10, replace = TRUE, prob = NULL),
       col_02 = sample(V, 10, replace = TRUE, prob = NULL),
       col_03 = sample(V, 10, replace = TRUE, prob = NULL),
       col_04 = sample(V, 10, replace = TRUE, prob = NULL),
       col_05 = sample(V, 10, replace = TRUE, prob = NULL),
       stringsAsFactors = FALSE)

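# Preallocate the result instead of growing it inside the loop
q = character(nrow(df))
for (i in seq_len(nrow(df))) {
  # Values of row i across all columns except the first (n)
  x = unlist(df[i, 2:ncol(df)], use.names = FALSE)
  # table() sorts its names, so a tie goes to the alphabetically first value
  q[i] = names(which.max(table(x)))
}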
df$most = q
```
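
One possible vectorised direction, given only as a sketch: rather than tabulating each row separately, count every distinct value across all rows at once with rowSums(), then pick the column with the highest count using max.col(). The helper name row_mode is hypothetical and does not come from any package. The sketch assumes the data fit in a character matrix and that there are only a few distinct values; ties go to the alphabetically first value, as in the loop above.

```r
# Hypothetical helper (not from any package): most common value in each row
# of a character matrix m.
row_mode <- function(m, values = sort(unique(as.vector(m)))) {
  # One vectorised pass per distinct value: column j of counts holds the
  # number of times values[j] occurs in each row. NA cells are ignored.
  counts <- vapply(values,
                   function(v) rowSums(m == v, na.rm = TRUE),
                   numeric(nrow(m)))
  # vapply() returns a plain vector when m has a single row; force a matrix.
  counts <- matrix(counts, nrow = nrow(m))
  # ties.method = "first" keeps the alphabetically first of the tied values,
  # which matches names(which.max(table(x))).
  values[max.col(counts, ties.method = "first")]
}

# Small fixed example; the second row is a tie between "A" and "D".
m <- rbind(c("A", "B", "B", "C"),
           c("D", "D", "A", "A"),
           c("C", "C", "C", "A"))
res <- row_mode(m)
print(res)
```

For the data frame above this would be called as df$most = row_mode(as.matrix(df[, -1])). Each m == v comparison allocates a logical matrix the size of m, so for a table of 1,000,000 by 1,000 it would probably have to be applied to blocks of rows rather than to the whole matrix at once.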


