[R] Plot of a subset of a data.frame()

David Winsemius dwinsemius at comcast.net
Mon Jul 26 14:30:01 CEST 2010


On Jul 26, 2010, at 7:38 AM, Steffen Uhlig wrote:

> Hello,
>
> my data.frame is a collection of process values, i.e. a huge
> run chart. It consists of a time-stamp in the first column (date as
> string), factors in the following columns (used for
> subset-filtering), and some process-data columns.
> Below are two examples showing the problem that occurs during
> plotting:
>
> First, the example that works fine:
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> a = c(1:10) 		# create a vector of integers
> b = rep(c("a","b"),5)	# create a vector of chars, used
> 			# as factor-levels	
> d = rnorm(10)		# some random numbers
> e = data.frame(a,b,d)	# connect to a data.frame

You've gotten several answers, but none has addressed an aspect of R
behavior that took me longer to appreciate than it perhaps should
have. The "b" column inside the "e" data.frame is now a factor column.
I mention that because you later referred to it as a "string", which it
is not. It is an integer vector with an associated character vector of
levels that the integer codes index. Many of the functions that you
might expect to "work" on "strings" will give either errors or
unexpected results when applied to factors.
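
A minimal sketch of how to check this for yourself. It assumes the
character column is converted to a factor, which I force here with
stringsAsFactors = TRUE because the default depends on the R version:

```r
# Rebuild the example data.frame; stringsAsFactors = TRUE makes the
# character-to-factor conversion explicit rather than relying on a default.
a <- 1:10
b <- rep(c("a", "b"), 5)
d <- rnorm(10)
e <- data.frame(a, b, d, stringsAsFactors = TRUE)

is.factor(e$b)     # the column is a factor, not a character vector
typeof(e$b)        # the underlying storage is integer
levels(e$b)        # the character labels live in the "levels" attribute
as.integer(e$b)    # the integer codes that index into levels(e$b)

# Convert explicitly before applying string functions.
b.chr <- as.character(e$b)
nchar(b.chr)
```

If you need the labels as text, as.character() is the safe conversion;
as.integer() returns the codes, not the labels.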


>
> e.1 = subset(e, b=="a")	# create two subsets
> e.2 = subset(e, b=="b")
> plot(d~a, e.1, pch=3, col=2) # plot first data-subset
> points(d~a, e.2, pch=4, col=3) # plot the 2nd one
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> All looks fine in these plots.
>
>
> However, after changing the content of vector "a" to a set of
> strings, the following happens:
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> a = c("a","b","c","d","e","f","g","h","i","j")
> e = data.frame(a,b,d)       # re-build data.frame
>
> e.1 = subset(e, b=="a")     # create two subsets
> e.2 = subset(e, b=="b")
> plot(d~a, e.1, pch=3, col=2)
> points(d~a, e.2, pch=4, col=3)
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> The plot command produces horizontal lines instead of points. This
> seems to happen when the x-axis contains strings rather than
> numbers. Is there a way out?
>
> Best regards,
> /Steffen
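
On the horizontal lines themselves: when "a" is a factor, plot(d ~ a, ...)
draws box plots rather than a scatterplot, and with a single observation
per level each box collapses to a line. One possible workaround, offered
as a sketch rather than the only way: plot against the integer codes of
the factor and label the axis yourself. The column name "x" is mine, not
from your code, and I again force the factor conversion with
stringsAsFactors = TRUE.

```r
set.seed(1)                                   # only to make the sketch reproducible
a <- c("a","b","c","d","e","f","g","h","i","j")
b <- rep(c("a", "b"), 5)
d <- rnorm(10)
e <- data.frame(a, b, d, stringsAsFactors = TRUE)

e$x <- as.integer(e$a)                        # hypothetical helper column: factor codes

e.1 <- subset(e, b == "a")                    # the same two subsets as before
e.2 <- subset(e, b == "b")

# Numeric x gives a scatterplot; suppress the default axis and fix the
# limits so that both subsets fit in the plotting region.
plot(d ~ x, e.1, pch = 3, col = 2, xaxt = "n",
     xlim = range(e$x), ylim = range(e$d), xlab = "a")
points(d ~ x, e.2, pch = 4, col = 3)

# Put the original labels back on the x-axis.
axis(1, at = seq_along(levels(e$a)), labels = levels(e$a))
```

This assumes the order of levels(e$a) (alphabetical by default) is the
order you want along the axis; for real time-stamps, converting the
strings to a date-time class would probably be the better route.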
-- 

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


