[R] Plotting question

Mon Aug 1 22:48:19 CEST 2011

On 11-08-01 11:48 AM, Bert Gunter wrote:
> IMHO:
>
> On Mon, Aug 1, 2011 at 7:51 AM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>> On 11-08-01 5:44 AM, Andrew McCulloch wrote:
>>>
>>> Hi,
>>>
>>> I use R to draw my graphs. I have 100 points on a simple xy-plot. The
>>> points are distinguished by a third variable which is categorical with
>>> 10 levels. I have been plotting x against y and using gray scales to
>>> distinguish the level of the categorical variable for each point. It
>>> looks ok to me but a journal reviewer says this is not any use. I cannot
>>> afford to pay for colour prints. Any ideas on what is the best way to
>>> distinguish 10 groups on an xy scatter plot?
>>
>> Plot digits or letters or other symbols.
>>
>> Duncan Murdoch
>>
> No, this does not work.

You have amazing perception to know that it doesn't work in Andrew's 
graph.  But then you go on to suggest that sometimes it does, and then 
suggest using symbols.

Obviously you need to see the graph to know what works.  If the 10 
categories are ordered, then something like thermometer plots would 
work.  If they are grouped into a small number of variations on a small 
number of groups, then digits or letters combined with shading might 
work, especially if the groups are well separated, or there are clear 
patterns.
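
For what it's worth, here is a rough sketch of those two options in base
graphics. It uses simulated data, since I haven't seen the real graph.
The variable names (lev, group, variation), the symbol sizes, and the
split of the 10 levels into 2 groups of 5 variations are all assumptions
made for illustration, not anything taken from Andrew's data.

```r
# Simulated stand-in for the real data: 100 points, 10 ordered levels.
set.seed(1)
n   <- 100
lev <- sample(1:10, n, replace = TRUE)   # hypothetical ordered category, 1..10
x   <- rnorm(n, mean = lev / 2)
y   <- rnorm(n, mean = lev / 3)

# Option 1: thermometer symbols, filled in proportion to the level.
# Columns of the matrix are width, height, and proportion filled.
therm <- cbind(width = 1, height = 3, fill = lev / 10)

# Option 2: treat the 10 levels as 2 groups x 5 variations (an assumption),
# plotting a letter for the variation and a shade for the group.
group     <- ifelse(lev <= 5, 1, 2)
variation <- (lev - 1) %% 5 + 1
shade     <- c("black", "gray50")[group]

op <- par(mfrow = c(1, 2))               # two panels side by side
symbols(x, y, thermometers = therm, inches = 0.25,
        fg = "black", main = "Thermometers")
plot(x, y, type = "n", main = "Letters + shading")
text(x, y, labels = LETTERS[variation], col = shade, cex = 0.8)
par(op)                                  # restore previous settings
```

Whether either panel is readable depends on how much the points overlap
and how well the groups separate, so it needs checking against the
actual data.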

I'd agree with the reviewer that 10 levels of shading is probably too 
many to distinguish, and I'd agree with you that digits 0-9 in equal 
quantities in an unstructured scatterplot are probably not a good 
presentation, but I wouldn't want to give specific advice about plotting 
a dataset without seeing it.

Duncan Murdoch

> See Cleveland's books (e.g. "Visualizing
> Data"). 10 is too many symbols to constantly refer to a legend to keep
> straight, and digits or letters do not allow you to readily perceive
> the pattern. (Caveat: If "most" of the data are only 2 or 3 of the
> symbols, then these can work).
>
> I think the OP's idea of using gray scales was better. I would dispute
> the reviewer and refer them to appropriate references. Alternatively,
> thermometer plots (aka "filled rectangle" plots) would be best. Again,
> Cleveland's books provide scientific justification rather than merely
> the (possibly uninformed) aesthetic opinion of a reviewer. Presumably,
> the journal editor would accept hard data and psychological research
> in preference to opinions.
>
>>>
>>>
>>>
>>> If all else fails I can just remove the graph and give them a table of
>>> regression coefficients.
>
> No. I think your attempt to use a graph is a much better way to go.
> Try to resist poor practices such as just publishing summary
> statistics.
>
> Cheers,
> Bert
>>>
>>>
>>> Thanks.
>>>
>>> Yours Sincerely
>>> Andrew McCulloch
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.