[R] Removing NAs from dataframe (for use in Vioplot)

Mike Smith mike at hsm.org.uk
Sun May 1 09:15:44 CEST 2016


>>> On Apr 30, 2016, at 12:58 PM, Mike Smith <mike at hsm.org.uk> wrote:

>>> Hi

>>> First post and a relative R newbie....

>>> I am using the vioplot library to produce some violin plots.

DW> It's a package,  .... not a library.

>>> I have an input CSV with columns of irregular length that contain NAs. I want to strip the NAs out and produce a multiple violin plot automatically labelled using the headers. At the moment I do this

>>> Code: 
>>> ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
>>> library(vioplot)
>>> y6<-na.omit(ds1$y6)
>>> y5<-na.omit(ds1$y5)
>>> y4<-na.omit(ds1$y4)
>>> y3<-na.omit(ds1$y3)
>>> y2<-na.omit(ds1$y2)
>>> y1<-na.omit(ds1$y1)
>>> vioplot(y6, y5, y4,y3,y2,y1,horizontal=TRUE, names=c("Y6", "Y5","Y4","Y3","Y2","Y1"), col = "lightblue")


>>> Two queries:

>>> 1. Is there a more elegant way of automatically stripping the NAs, passing the columns to the function along with the header names??


>> ds2 <- lapply( ds1, na.omit)


Fantastic - that does the trick! Easy when you know how!! 

Follow-on: is there a way to feed all the lists from ds2 to vioplot? It is now a series of lists (rather than a dataframe - is that right?). So this works,

library(vioplot)
ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
ds2 <- lapply( ds1, na.omit)
vioplot(ds2$y1,ds2$y2)

but this doesn't

library(vioplot)
ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
ds2 <- lapply( ds1, na.omit)
vioplot(ds2)
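
One possible way to hand every element of ds2 to vioplot in a single call is do.call(), which calls a function with its arguments taken from a list. This is only a sketch, assuming vioplot wants each group as a separate unnamed vector (as in the working call above). The small data frame is made up to stand in for spelling.csv, and vio_args is a name chosen for illustration. The list is unnamed first so that the column names are not treated as argument names.

```r
# Made-up stand-in for the CSV: columns of unequal length padded with NA
ds1 <- data.frame(y1 = c(3, 5, 4, 6, NA, NA),
                  y2 = c(7, 8, 6, 9, 7, NA),
                  y3 = c(10, 9, 11, 12, 10, 11))

# lapply() returns a plain list of vectors (one per column), NAs dropped
ds2 <- lapply(ds1, na.omit)

# Argument list: unnamed data vectors first, then the named options
vio_args <- c(unname(ds2),
              list(names = names(ds2), horizontal = TRUE, col = "lightblue"))

# Only plot if the vioplot package is installed
if (requireNamespace("vioplot", quietly = TRUE)) {
  do.call(vioplot::vioplot, vio_args)
}
```
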

>>> 2. Can I easily add the sample size to each violin plotted??

>>> ?violplot
>> No documentation for ‘violplot’ in specified packages and libraries:
>> you could try ‘??violplot’

DW> I see that I misspelled that _package_ name. However, after loading
DW> it I realized that I had no way of replicating what you are
DW> seeing, because you didn't provide that file (or even something
DW> that resembles it). It's rather unclear how you wanted this information presented.

The original code *should* have worked, as the CSV was online. There doesn't seem to be any option in vioplot to add the sample size (these are all small samples, which I wanted to highlight), so I don't know if this is easily done elsewhere.
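
One workaround that needs no extra vioplot option might be to put the sample size into the label text passed through names. A sketch only: the data frame is made up to stand in for spelling.csv, and the "(n=...)" label format and the variable names n and labels are just illustrations.

```r
# Made-up stand-in for the CSV: columns of unequal length padded with NA
ds1 <- data.frame(y1 = c(3, 5, 4, 6, NA, NA),
                  y2 = c(7, 8, 6, 9, 7, NA),
                  y3 = c(10, 9, 11, 12, 10, 11))

# Drop the NAs column by column, then count what is left in each
ds2 <- lapply(ds1, na.omit)
n <- sapply(ds2, length)

# Build one label per violin, e.g. "Y1 (n=4)"
labels <- paste0(toupper(names(ds2)), " (n=", n, ")")

# Only plot if the vioplot package is installed
if (requireNamespace("vioplot", quietly = TRUE)) {
  do.call(vioplot::vioplot,
          c(unname(ds2),
            list(names = labels, horizontal = TRUE, col = "lightblue")))
}
```

With horizontal=TRUE the longer labels may need a wider left margin, which can be set with par(mar = ...) before plotting.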

Thanks again!!
---
Mike Smith


