[R] qplot: plotting precipitation data

Winston Chang winstonchang1 at gmail.com
Thu Sep 20 23:30:44 CEST 2012


I had some trouble following the thread, so I'm not sure exactly what
problem you're having, but I think it has to do with the ticks on the x
axis not appearing in the correct order?

It's probably related to this issue:
https://github.com/hadley/ggplot2/issues/577
I believe it happens because the x scale gets "trained" separately on
the xmin and xmax values. If that happens when one of them _doesn't_
have all the factor levels, the factor levels get recomputed and end up
in lexicographical order.
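
As a rough base-R illustration of that ordering (an analogy only, not
ggplot2's internal code; the variable names are made up for this example),
sorting the positions as character strings gives a different order than
sorting them as numbers:

```r
# Two of the start positions from the example data
pos <- c(5291000L, 10988025L)

# Numeric sort: the smaller number comes first
num_order <- sort(pos)

# Levels built from character strings are sorted as text, so a string
# starting with "1" comes before one starting with "5"
lex_levels <- levels(factor(as.character(pos)))

print(num_order)
print(lex_levels)
```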

One workaround is to use scale_x_discrete(drop=FALSE).

-Winston

library(ggplot2)

mydata <- structure(list(chrom = structure(c(3L, 3L, 3L, 3L, 3L, 3L), .Label =
c("chr1", "chr10", "chr11", "chr12", "chr13", "chr14", "chr15", "chr16",
"chr17", "chr18", "chr19", "chr2", "chr3", "chr4", "chr5", "chr6", "chr7",
"chr8", "chr9", "chrX"), class = "factor"), start = c(5291000L, 10988025L,
11767950L, 11840900L, 12267450L, 12276675L), end = c(5291926L, 10988526L,
11768676L, 11841851L, 12268076L, 12277051L), peak = c(8L, 7L, 8L, 8L, 12L, 7L)),
.Names = c("chrom", "start", "end", "peak" ), row.names = c(NA, -6L), class =
"data.frame")

# Continuous x axis - rects are very, very narrow, but they're there
ggplot(mydata) +
  geom_rect(aes(xmin = start, xmax = end, ymin = 0, ymax = peak))

# Continuous x axis and geom_bar: Width of each bar is determined by
# resolution of the data. Still narrow, but not as much.
ggplot(mydata) +
  geom_bar(aes(x = start, y = peak), stat="identity")


# Inspect the distinct start and end values: the two sets do not overlap
unique(mydata$start)
unique(mydata$end)

# Convert start and end to factors with the same set of levels
levels <- sort(unique(c(mydata$start, mydata$end)))
mydata$start <- factor(mydata$start, levels = levels)
mydata$end <- factor(mydata$end, levels = levels)

# X ticks appear in wrong order
ggplot(mydata) +
  geom_rect(aes(xmin = start, xmax = end, ymin = 0, ymax = peak))

# Can specify the limits directly
ggplot(mydata) +
  geom_rect(aes(xmin = start, xmax = end, ymin = 0, ymax = peak)) +
  xlim(as.character(levels))

# Or can use scale_x_discrete(drop = FALSE)
ggplot(mydata) +
  geom_rect(aes(xmin = start, xmax = end, ymin = 0, ymax = peak)) +
  scale_x_discrete(drop = FALSE)
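
To see why keeping unused levels matters, here is a small base-R sketch (an
analogy only, not what ggplot2 does internally; start_f, end_f, and shared
are names made up for this example). Each factor declares all of the shared
levels but uses only some of them, so anything that rebuilds the levels from
the values actually present can lose the common ordering:

```r
# A two-row subset of the start and end positions
start <- c(5291000L, 10988025L)
end   <- c(5291926L, 10988526L)

# Same construction as above: one sorted set of levels for both columns
shared  <- sort(unique(c(start, end)))
start_f <- factor(start, levels = shared)
end_f   <- factor(end, levels = shared)

# Both factors declare the same four levels ...
same_levels <- identical(levels(start_f), levels(end_f))

# ... but each one actually uses only two of them
used_start <- levels(droplevels(start_f))
used_end   <- levels(droplevels(end_f))

# Rebuilding levels from character values sorts them as text
rebuilt <- levels(factor(as.character(c(start, end))))

print(same_levels)
print(used_start)
print(used_end)
print(rebuilt)
```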


On Thu, Sep 20, 2012 at 7:19 AM, Hadley Wickham <hadley at rice.edu> wrote:
> Hmmm, I'm not sure what the problem is, but I suspect it's related to
> the fact the xmin and xmax have different factors levels and there are
> some bugs in ggplot2 related to combining factors in some situations
> (it's basically impossible to always do it right).
>
> Explicitly ensuring the levels were the same doesn't help, but setting
> the xlims does.  Winston, is this related to some of the other bugs
> we've been working on lately?
>
> mydata <- structure(list(chrom = structure(c(3L, 3L, 3L, 3L, 3L, 3L), .Label =
> c("chr1", "chr10", "chr11", "chr12", "chr13", "chr14", "chr15", "chr16",
> "chr17", "chr18", "chr19", "chr2", "chr3", "chr4", "chr5", "chr6", "chr7",
> "chr8", "chr9", "chrX"), class = "factor"), start = c(5291000L, 10988025L,
> 11767950L, 11840900L, 12267450L, 12276675L), end = c(5291926L, 10988526L,
> 11768676L, 11841851L, 12268076L, 12277051L), peak = c(8L, 7L, 8L, 8L, 12L, 7L)),
> .Names = c("chrom", "start", "end", "peak" ), row.names = c(NA, -6L), class =
> "data.frame")
>
> ggplot(mydata) +
>   geom_rect(aes(xmin = start, xmax = end, ymin = 0, ymax = peak))
>
> unique(mydata$start)
> unique(mydata$end)
>
> levels <- sort(unique(c(mydata$start, mydata$end)))
> mydata$start <- factor(mydata$start, levels = levels)
> mydata$end <- factor(mydata$end, levels = levels)
>
> ggplot(mydata, aes(x = start)) +
>   geom_rect(aes(xmin = start, xmax = end, ymin = 0, ymax = peak)) +
>   xlim(as.character(levels))
>
>
> On Sun, Sep 16, 2012 at 11:11 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>> Maybe a bug in ggplot2::geom_rect?
>>
>> I'm CCing this to Hadley Wickham; maybe he has an answer.
>>
>> Rui Barradas
>>
>> On 16-09-2012 17:04, John Kane wrote:
>>>>
>>>> -----Original Message-----
>>>> From: ruipbarradas at sapo.pt
>>>> Sent: Sun, 16 Sep 2012 13:13:47 +0100
>>>> To: jrkrideau at inbox.com
>>>> Subject: Re: [R] qplot: plotting precipitation data
>>>>
>>>> Hello,
>>>>
>>>> Regarding the OP's "request" for rectangles, I don't understand it.
>>>
>>> Neither am I, really. I just googled a couple of sites for possible
>>> "chromatin precipitation" graphs and, since the OP was not sure of the
>>> name of the geom, assumed that they wanted a bar chart, as it seemed
>>> like the simplest graph matching the "rectangles" statement.  I was
>>> assuming a terminology or language problem here, and I could not see any
>>> reason the OP wanted purely rectangles.
>>>
>>>> In your plot using geom_bar, the levels of as.factor(start) are sorted
>>>> ascending. If both
>>>>
>>>>   > as.factor(mydata$start)
>>>> [1] 5291000  10988025 11767950 11840900 12267450 12276675
>>>> Levels: 5291000 10988025 11767950 11840900 12267450 12276675
>>>>   > as.factor(mydata$end)
>>>> [1] 5291926  10988526 11768676 11841851 12268076 12277051
>>>> Levels: 5291926 10988526 11768676 11841851 12268076 12277051
>>>>
>>>> also are, why isn't geom_rect plotting them in that order?
>>>>
>>>> p2 <- ggplot(mydata, aes(x = as.factor(start), y = peak))
>>>> p2 + geom_rect(aes(xmin = as.factor(start), xmax = as.factor(end), ymin
>>>> = 0, ymax = peak))
>>>>
>>>> The level 5291926 is placed last. Shouldn't it be expected to plot
>>>> first?
>>>
>>> This is far beyond my knowledge of ggplot but I would certainly think it
>>> should.
>>>   as.numeric(as.factor(mydata$start))
>>> [1] 1 2 3 4 5 6
>>>
>>> so why would we get something like 2 3 4 5 6 1  if I am reading this
>>> correctly?
>>>
>>>
>>>> Rui Barradas
>>>>
>>>> On 16-09-2012 00:20, John Kane wrote:
>>>>>
>>>>> Thanks for the data. It makes things much easier.
>>>>>
>>>>> Do you want a bar chart (i.e. geom  = bar in qplot or geom_bar in
>>>>> ggplot)? That sounds like what you mean when you speak of rectangles.
>>>>>
>>>>> If so, try this ggplot() command -- I almost never use qplot(), so I am
>>>>> not quite sure how to specify it there.
>>>>>
>>>>> p <- ggplot(mydata, aes(as.factor(start), peak)) +
>>>>>   geom_bar(stat = "identity")
>>>>> p
>>>>>
>>>>>
>>>>> John Kane
>>>>> Kingston ON Canada
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: hnorpois at googlemail.com
>>>>>> Sent: Sat, 15 Sep 2012 18:39:54 +0200
>>>>>> To: r-help at r-project.org
>>>>>> Subject: [R] qplot: plotting precipitation data
>>>>>>
>>>>>> Dear list,
>>>>>>
>>>>>> I wish to plot chromatin precipitation data: I would like to have
>>>>>> rectangles (x: end-start, y: peak), but I do not know how to define x
>>>>>> (in terms of qplot syntax) or how to choose the correct geom.
>>>>>> mydata is a subset of a larger file.
>>>>>>
>>>>>>> mydata
>>>>>>
>>>>>>   chrom    start      end peak
>>>>>> 1 chr11  5291000  5291926    8
>>>>>> 2 chr11 10988025 10988526    7
>>>>>> 3 chr11 11767950 11768676    8
>>>>>> 4 chr11 11840900 11841851    8
>>>>>> 5 chr11 12267450 12268076   12
>>>>>> 6 chr11 12276675 12277051    7
>>>>>>>
>>>>>>> dput(mydata)
>>>>>>
>>>>>> structure(list(chrom = structure(c(3L, 3L, 3L, 3L, 3L, 3L), .Label =
>>>>>> c("chr1",
>>>>>> "chr10", "chr11", "chr12", "chr13", "chr14", "chr15", "chr16",
>>>>>> "chr17", "chr18", "chr19", "chr2", "chr3", "chr4", "chr5", "chr6",
>>>>>> "chr7", "chr8", "chr9", "chrX"), class = "factor"), start = c(5291000L,
>>>>>> 10988025L, 11767950L, 11840900L, 12267450L, 12276675L), end =
>>>>>> c(5291926L,
>>>>>> 10988526L, 11768676L, 11841851L, 12268076L, 12277051L), peak = c(8L,
>>>>>> 7L, 8L, 8L, 12L, 7L)), .Names = c("chrom", "start", "end", "peak"
>>>>>> ), row.names = c(NA, -6L), class = "data.frame")
>>>>>> Thanks for some instructions.
>>>>>>
>>>>>> Hermann Norpois
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>
>>
>
>
>
> --
> RStudio / Rice University
> http://had.co.nz/



