[R] Arranging column data to create plots

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Sun Jul 16 23:48:51 CEST 2017


Correction at the end.

On Sun, 16 Jul 2017, Jeff Newmiller wrote:

> On Sat, 15 Jul 2017, Michael Reed via R-help wrote:
>
>> Dear All,
>> 
>> I need some help arranging data that was imported.
>
> It would be helpful if you were to use dput to give us the sample data since 
> you say you have already imported it.
>
>> The imported data frame looks something like this (the actual file is huge, 
>> so this is example data)
>> 
>> DF:
>> IDKey  X1  Y1  X2  Y2  X3  Y3  X4  Y4
>> Name1  21  15  25  10
>> Name2  15  18  35  24  27  45
>> Name3  17  21  30  22  15  40  32  55
>
> The values for X3 etc. are simply absent in this listing, but they would be NA in
> an actual data frame, so I don't know whether my workaround matches yours. Output
> from dput() would have clarified the starting point.
>
>> I would like to create a new data frame with the following
>> 
>> NewDF:
>> IDKey   X   Y
>> Name1  21  15
>> Name1  25  10
>> Name2  15  18
>> Name2  35  24
>> Name2  27  45
>> Name3  17  21
>> Name3  30  22
>> Name3  15  40
>> Name3  32  55
>> 
>> With the data like this I think I can do the following
>> 
>> ggplot(NewDF, aes(x=X, y=Y, color=IDKey) + geom_line
>
> You are missing parentheses. If you use the reprex package to test your 
> examples before posting them, you can be sure your simple errors don't send 
> us off on wild goose chases.
>
>> and get 3 lines with the various number of points.
>> 
>> The point is that each of the XY pairs is a data point tied to NameX. I 
>> would like to rearrange the data so I can plot the points/lines by the 
>> IDKey.  There will be at least 2 points, but the number of points for each 
>> IDKey can be as many as 4.
>> 
>> I have tried using the gather() function from the tidyverse package, but
>
> The tidyverse package is a meta-package that pulls in many packages; gather()
> itself is in tidyr.
>
>> I can't make it work.  The issue is that I believe I need two separate 
>> gather statements (one for X, another for Y) to consolidate the data. This 
>> causes the pairs to not stay together and the data becomes jumbled.
>
> No, what you need is a gather-spread.
>
> ######
> library(dplyr)
> library(tidyr)
>
> DF <- read.table( text=
> "IDKey  X1  Y1  X2  Y2  X3  Y3  X4  Y4
> Name1   21  15  25  10  NA  NA  NA  NA
> Name2   15  18  35  24  27  45  NA  NA
> Name3   17  21  30  22  15  40  32  55
> ", header=TRUE, as.is=TRUE )
>
> NewDF <- (   dta
>         %>% gather( XY, value, -IDKey )
>         %>% separate( XY, c( "Coord", "Num" ), 1 )
>         %>% spread( Coord, value )
>         %>% filter( !is.na( X ) & !is.na( Y ) )
>         )
> ######

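Sorry, should have practiced what I preached: the code above pipes from `dta`, 
but the data frame is named `DF`. Corrected version, with its output: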

##########
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
library(tidyr)

DF <- structure(list(IDKey = c("Name1", "Name2", "Name3"), X1 = c(21L, 
15L, 17L), Y1 = c(15L, 18L, 21L), X2 = c(25L, 35L, 30L), Y2 = c(10L, 24L, 
22L), X3 = c(NA, 27L, 15L), Y3 = c(NA, 45L, 40L), X4 = c(NA, NA, 32L), Y4 
= c(NA, NA, 55L)), .Names = c("IDKey", "X1", "Y1", "X2", "Y2", "X3", "Y3", 
"X4", "Y4"), class = "data.frame", row.names = c(NA, -3L))

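NewDF <- (   DF
          # stack all X/Y columns into (XY, value) pairs, keeping IDKey
          %>% gather( XY, value, -IDKey )
          # split e.g. "X1" after the first character: Coord = "X", Num = "1"
          %>% separate( XY, c( "Coord", "Num" ), 1 )
          # put X and Y side by side again, one row per IDKey/Num pair
          %>% spread( Coord, value )
          # drop the rows for pairs that were never recorded
          %>% filter( !is.na( X ) & !is.na( Y ) )
          )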
NewDF
#>   IDKey Num  X  Y
#> 1 Name1   1 21 15
#> 2 Name1   2 25 10
#> 3 Name2   1 15 18
#> 4 Name2   2 35 24
#> 5 Name2   3 27 45
#> 6 Name3   1 17 21
#> 7 Name3   2 30 22
#> 8 Name3   3 15 40
#> 9 Name3   4 32 55
##########
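
For readers on a newer tidyr: the same reshaping can probably be done in one
step with pivot_longer(). This is only a sketch and assumes tidyr 1.0.0 or
later, which did not exist when this thread was written. The name NewDF2 and
the regular expression are illustrative choices, not part of the original
answer. It also relies on X and Y always being missing together, as they are
in the sample data.

```r
library(tidyr)

# Same sample data as in the dput() output above, re-entered by hand
DF <- data.frame(
  IDKey = c("Name1", "Name2", "Name3"),
  X1 = c(21L, 15L, 17L), Y1 = c(15L, 18L, 21L),
  X2 = c(25L, 35L, 30L), Y2 = c(10L, 24L, 22L),
  X3 = c(NA, 27L, 15L),  Y3 = c(NA, 45L, 40L),
  X4 = c(NA, NA, 32L),   Y4 = c(NA, NA, 55L),
  stringsAsFactors = FALSE
)

# ".value" sends the letter part of each column name (X or Y) to an output
# column of that name; the digit part goes into Num (as character).
# NewDF2 is a hypothetical name chosen for this sketch.
NewDF2 <- pivot_longer(
  DF,
  cols = -IDKey,
  names_to = c(".value", "Num"),
  names_pattern = "([XY])(\\d+)",
  values_drop_na = TRUE   # drop pairs where both X and Y are NA
)
NewDF2
```

The result is a tibble rather than a plain data frame, with the same IDKey,
Num, X and Y columns as NewDF.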

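To close the loop on the plot that was asked for, here is a sketch of the
ggplot call with the parentheses balanced. It assumes ggplot2 is installed,
and NewDF is typed in by hand from the output above so the block runs on its
own. Whether geom_line() or geom_path() is wanted depends on how the points
should be joined, which the original question does not say.

```r
library(ggplot2)

# NewDF as printed above, re-entered by hand
NewDF <- data.frame(
  IDKey = c("Name1", "Name1", "Name2", "Name2", "Name2",
            "Name3", "Name3", "Name3", "Name3"),
  Num   = c("1", "2", "1", "2", "3", "1", "2", "3", "4"),
  X     = c(21, 25, 15, 35, 27, 17, 30, 15, 32),
  Y     = c(15, 10, 18, 24, 45, 21, 22, 40, 55),
  stringsAsFactors = FALSE
)

# geom_line() joins the points of each IDKey in order of X;
# swap in geom_path() to join them in row (Num) order instead
p <- ggplot(NewDF, aes(x = X, y = Y, color = IDKey)) +
  geom_line() +
  geom_point()
print(p)
```

Note that Name3 has X values that are not increasing (17, 30, 15, 32), so the
two geoms draw different lines for that group.
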
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


