[R] Problems with Boxplot

gug guygreen at netvigator.com
Wed Sep 2 14:23:42 CEST 2009


Hello,

I have been having difficulty getting boxplot to give the output I want -
probably a result of the way I have been handling the data.

The data is arranged in columns: each date has two sets of data.  The number
of data points varies with the date, so each column is of different length. 
I want to get a series of boxplots with the date along the x-axis, with
alternating colors, so that it is easy to see the difference between the
results within each date, as well as across dates.

testdata <- c("C:\\Files\\R\\Sample R code\\Post trial data.csv")
data_headings <- read.table(testdata, skip = 0, sep = ",",
                            header = FALSE)[1, ]
my_data <- read.table(testdata, skip = 1, sep = ",", na.strings = "na",
                      header = FALSE)
boxplot(my_data * 100, names = data_headings, outline = FALSE, range = 0.3,
        border = c(2, 4))

The result is a boxplot, but it does not show the date along the bottom (the
"names = data_headings" bit achieves nothing).  I can alternatively try
this:

new_data <- read.table(testdata, skip = 0, sep = ",", na.strings = "na",
                       header = TRUE)
boxplot(new_data, outline = FALSE, range = 0.3, border = c(2, 4))

This takes all the data and plots it, but I then lose the ability to
multiply by 100 (I'm trying to show percentages: e.g. 10% as "10", rather
than as "0.1").
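
For reference, a minimal sketch of one way the two attempts might be combined. It assumes the file has a header row of dates (each date appearing twice) and columns padded with "na"; the small temporary file and the names csv_lines, tmp and pct_data are hypothetical stand-ins, not the posted data. A data frame whose columns are all numeric can be multiplied by 100 directly, and the headings are passed to boxplot as a plain character vector.

```r
# Hypothetical stand-in for "Post trial data.csv": two columns per date,
# unequal lengths padded with "na" (an assumption about the file layout).
csv_lines <- c(
  "01/08/2009,01/08/2009,08/08/2009,08/08/2009",
  "0.10,0.12,0.08,0.11",
  "0.15,0.09,0.07,0.13",
  "0.11,na,0.12,na"
)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# check.names = FALSE keeps the date headings as written instead of
# converting them to syntactic names such as "X01.08.2009".
new_data <- read.table(tmp, sep = ",", na.strings = "na",
                       header = TRUE, check.names = FALSE)

# A data frame of numeric columns can be multiplied directly.
pct_data <- new_data * 100

# Pass the headings explicitly as a character vector of labels.
boxplot(pct_data, names = names(new_data), outline = FALSE,
        range = 0.3, border = c(2, 4), ylab = "%")
```

Whether this matches the real file depends on how its header row is laid out, so it should be read as a starting point only.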

1) My first question is: is there a simple way of getting both dates along
the x-axis and the "*100" calculation (or percentages)?

2) Next, how can I add a legend somewhere to show that red is "data set 1"
and blue is "data set 2"?

3) Is it possible to get the date label to straddle the two boxes it
covers?  As it is, one tick has the date and the other does not.

4) Is it possible to show both the median and the mean with boxplot?

5) Finally, the code works as described above (i.e. up to a point) with the
"Post trial data.csv" file I have posted.  However, when I try with a larger
file ("Larger trial.csv", also posted), I get the following message when I
get to the "data_headings" line:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 145 did not have 50 elements

I have no idea why R is seeing a difference between these two files.

Post+trial+data.csv: http://www.nabble.com/file/p25256461/Post%2Btrial%2Bdata.csv
Larger+trial.csv: http://www.nabble.com/file/p25256461/Larger%2Btrial.csv

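
For reference, a minimal sketch of how the "did not have 50 elements" message might be investigated. read.table decides the number of columns from the first five lines of input, so a later line with a different number of fields stops the read. The temporary file below is a hypothetical stand-in with one deliberately short line; the actual cause in "Larger trial.csv" is not known here, and the names csv_lines, n_fields, bad_lines and padded are illustrative only.

```r
# Hypothetical stand-in for "Larger trial.csv": the third line is short
# (an assumption; the real cause in the posted file is not known here).
csv_lines <- c(
  "d1,d1,d2,d2",
  "0.10,0.12,0.08,0.11",
  "0.15,0.09,0.07",
  "0.11,na,0.12,na"
)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# count.fields() reports how many fields each line has for a given separator.
n_fields <- count.fields(tmp, sep = ",")
table(n_fields)

# Line numbers whose field count differs from that of the first line.
bad_lines <- which(n_fields != n_fields[1])
bad_lines

# fill = TRUE pads short lines with NA instead of stopping with an error.
padded <- read.table(tmp, sep = ",", na.strings = "na",
                     header = TRUE, fill = TRUE)
```

If the short lines turn out to be missing trailing commas, fill = TRUE may be enough; if fields are missing from the middle of a line, the padded values would land in the wrong columns, so the lines listed in bad_lines are worth inspecting first.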
Thanks for any suggestions,

Guy Green
 

-- 
View this message in context: http://www.nabble.com/Problems-with-Boxplot-tp25256461p25256461.html
Sent from the R help mailing list archive at Nabble.com.



