[R] How to Reformat a dataframe

Chris Evans
Sat Oct 28 09:32:26 CEST 2023


The tidyverse idiom looks very different but does what you want, and I have come to like it.
Which idiom of R one likes, for the mostly small datasets I handle, is largely a matter
of preference for "readability", itself very personal.  Here's my tidyverse way of doing
what you wanted:

### start of code
library(tidyverse)
# tmpDF <- structure(list(...1 = c(92.9925354, 76.0024254, 44.99547465,
### I omitted the rest of reconstructing the dataframe for brevity, easy to reconstruct

tmpDF %>%
   pivot_longer(cols = everything()) -> tmpTibLong

tmpTibLong
### showed:
# # A tibble: 1,512 × 2
#    name  value
#    <chr> <dbl>
#  1 ...1   93.0
#  2 ...2   35.0
#  3 ...3   24.0
#  4 ...4   43.0
#  5 ...5   53.0
#  6 ...6   62.0
#  7 ...7   91.0
#  8 ...8   89.0
#  9 ...9   54.0
# 10 ...10  75.0
# # ℹ 1,502 more rows
# # ℹ Use `print(n = ...)` to see more rows

### as you don't want the missing values
tmpTibLong %>%
   drop_na()
### gave:
# # A tibble: 1,509 × 2
#    name  value
#    <chr> <dbl>
#  1 ...1   93.0
#  2 ...2   35.0
#  3 ...3   24.0
#  4 ...4   43.0
#  5 ...5   53.0
#  6 ...6   62.0
#  7 ...7   91.0
#  8 ...8   89.0
#  9 ...9   54.0
# 10 ...10  75.0
# # ℹ 1,499 more rows
# # ℹ Use `print(n = ...)` to see more rows
### end
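
On the 1,512 versus 1,509 question raised below: every cell of the dataframe becomes a row in the long form, missing or not, so 126 x 12 gives 1,512, and only dropping the 3 NA values brings that down to 1,509. A small base R sketch of the same thing, on a made-up dataframe (toyDF and its values are hypothetical stand-ins, not the real data), in which the last row is only partly filled:

```r
# Toy stand-in for the real data: 3 rows x 4 columns, with the last row
# only partly filled (its two trailing cells are NA). Values are made up.
toyDF <- data.frame(a = c(1, 5, 9),
                    b = c(2, 6, 10),
                    c = c(3, 7, NA),
                    d = c(4, 8, NA))

# Every cell becomes a row when reshaping to long form, NA or not
nCells <- nrow(toyDF) * ncol(toyDF)

# Row-by-row flattening in base R, as in Iris's c(t(...)) suggestion
toyVals <- c(t(toyDF))

# Dropping the missing values leaves only the observed cells
toyObs <- toyVals[!is.na(toyVals)]

nCells          # 12 cells in total
length(toyObs)  # 10 observed values
```

Here 3 x 4 = 12 cells but only (2 * 4) + 2 = 10 observations, the same arithmetic as (125 * 12) + 9 = 1,509 for the real dataframe.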

Very best all,

Chris

On 28/10/2023 07:41, Paul Bernal wrote:

> Hi Iris,
>
> Thank you so much for your valuable feedback. I wonder why your code gives
> you 1512 rows, given that the original structure has 12 columns and 126
> rows, so I would expect (125*12)+ 9=1,509 total rows.
>
> Cheers,
> Paul
> On Fri, Oct 27, 2023 at 10:40 PM Iris Simmons <ikwsimmo using gmail.com> wrote:
>
>> You are not getting the structure you want because the indexes are
>> wrong. They should be something more like this:
>>
>> i <- 0
>> for (row in 1:nrow(alajuela_df)){
>>    for (col in 1:ncol(alajuela_df)){
>>      i <- i + 1
>>      df[i,1]=alajuela_df[row,col]
>>    }
>> }
>>
>> but I think what you are doing can be written much shorter and will run
>> faster:
>>
>> ## transpose here matches your original code
>> df <- data.frame(aportes_alajuela = c(t(alajuela_df)))
>>
>> ## but if you do not want to transpose, then do this
>> df <- data.frame(aportes_alajuela = unlist(alajuela_df, use.names = FALSE))
>>
>> However, you said you expected 1509 observations, but this gives you
>> 1512 observations. If you want to exclude the 3 NA observations, do
>> something like:
>>
>> df <- df[!is.na(df$aportes_alajuela), , drop = FALSE]
>>
>> On Fri, Oct 27, 2023 at 11:14 PM Paul Bernal <paulbernal07 using gmail.com>
>> wrote:
>>> Dear friends,
>>>
>>> I have the following dataframe:
>>> dim(alajuela_df)
>>> [1] 126  12
>>>
[dput snipped]

>>>
>>> What I want to do is, instead of having 12 observations by row, I want to
>>> have one observation by row. I want to have a single column with 1509
>>> observations instead of 126 rows with 12 columns per row.
>>>
>>> I tried the following:
>>> df = data.frame(matrix(nrow = Length, ncol = 1))
>>> colnames(df) = c("aportes_alajuela")
>>>
>>>
>>>
>>> for (row in 1:nrow(alajuela_df)){
>>>    for (col in 1:ncol(alajuela_df)){
>>>      df[i,1]=alajuela_df[i,j]
>>>    }
>>> }
>>>
>>> But I am not getting the data in the structure I want.
>>>
>>> Any help will be greatly appreciated.
>>>
>>> Best regards,
>>> Paul
>>>
>>>          [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
-- 
Chris Evans (he/him)
Visiting Professor, UDLA, Quito, Ecuador & Honorary Professor, 
University of Roehampton, London, UK.
Work web site: https://www.psyctc.org/psyctc/
CORE site: http://www.coresystemtrust.org.uk/
Personal site: https://www.psyctc.org/pelerinage2016/
Emeetings (Thursdays): 
https://www.psyctc.org/psyctc/booking-meetings-with-me/
(Beware: French time, generally an hour ahead of UK)
<https://ombook.psyctc.org/book>


