[R] Changing the label name in the plot

Subhamitra Patra @ubh@m|tr@@p@tr@ @end|ng |rom gm@||@com
Wed Jun 12 06:36:27 CEST 2019


Hello Sir,

Thank you very much for your help, for which I shall always be grateful.

Concerning your question,
"*1) Are the states in column 3 the same as those in column 1? As
you initially named the data frame "ts", perhaps the values in columns 2
and 4 are taken at different times. If not, perhaps they are measured in
another set of countries, as yet unknown. Perhaps "DMs" and "EMs" are codes
that will resolve this.*"
the answer is that the states in column 3 are not the same as those in
column 1. The data points in columns 2 and 4 are values measured for
different sets of countries, so the two data columns cannot share one label
per point. In particular, column 2 consists of 45 data points measured for
45 countries (state names in column 1), whereas column 4 contains 42 data
points measured for another set of 42 countries (state names in column 3).

I tried clustering columns 2 and 4 separately, each along with its
respective name column, but could not, because k-means clusters only
numeric data and does not accept non-numeric columns (i.e. the state
names). I therefore clustered both value columns together; after removing
the NAs from the data table, each column has 42 data points. As a result,
the clustered plot shows the observation numbers rather than the state
names. I am now stuck on how to set different labels (taken from columns 1
and 3) for the data points of columns 2 and 4.
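
One possible way around the problem described above is to split the table
into two (name, value) pairs, pass only the numeric column to kmeans(), and
keep the names in the data frame for labelling. The following is a minimal
sketch, not a tested solution for the real data: the column names State1 and
State2, the state codes, the values, and the helper cluster_one_set() are all
hypothetical.

```r
# Hypothetical four-column table: two unrelated sets of states, each with
# its own values; the shorter set is padded with NA.
ts4 <- data.frame(
  State1 = c("JP", "CH", "AT", "CL", "BH", "KW"),
  DMs    = c(2.071, 2.055, 2.054, 2.047, 1.801, 1.668),
  State2 = c("IN", "BR", "CN", "ZA", "MX", NA),
  EMs    = c(2.038, 2.017, 1.230, 0.941, 0.387, NA),
  stringsAsFactors = FALSE
)

# Drop NAs per pair, so the longer set does not lose rows
dm <- na.omit(ts4[, c("State1", "DMs")])
em <- na.omit(ts4[, c("State2", "EMs")])

# Hypothetical helper: cluster on the numeric column only (column 2),
# then store the cluster id next to the names
cluster_one_set <- function(df, centers = 2) {
  km <- kmeans(df[[2]], centers = centers, nstart = 25)
  df$cluster <- km$cluster
  df
}

set.seed(1)
dm <- cluster_one_set(dm)
em <- cluster_one_set(em)

# One panel per set; points are drawn as state names, coloured by cluster
op <- par(mfrow = c(1, 2))
for (d in list(dm, em)) {
  plot(seq_len(nrow(d)), d[[2]], type = "n",
       xlab = "Index", ylab = names(d)[2])
  text(seq_len(nrow(d)), d[[2]], labels = d[[1]], col = d$cluster)
}
par(op)
```

Because each set is clustered on its own, the 45 and 42 observations would
both be kept, and every point carries the name from its own name column.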

I hope this answers your question.

Thank you.




On Wed, Jun 12, 2019 at 6:04 AM Jim Lemon <drjimlemon using gmail.com> wrote:

> Hi Subhamitra,
> It is time to admit that I had the wrong idea about what you wanted to
> do, due to the combination of trying to solve two problems at once
> while I was very tired. I appreciate your patience.
>
> From your last email, you have a data frame with four columns. The
> first and third are cryptic names for political states and the second
> and fourth are values that I assume are measured in those states.
> 1) Are the states in column 3 the same as those in column 1? As you
> initially named the data frame "ts", perhaps the values in columns 2
> and 4 are taken at different times. If not, perhaps they are measured
> in another set of countries, as yet unknown. Perhaps "DMs" and "EMs"
> are codes that will resolve this.
>
> I assumed that "DMs" and "EMs" should be used as the X and Y values on
> a scatterplot, as your initial example seemed to indicate. If so, they
> are different values for the same country and only require one label
> for each point. Proceeding from this, you can do something like this:
>
> spdf<-read.table(text="State DMs EMs
> JP 2.071 2.038
> CH 2.0548 2.017
> AT 2.0544 2.007
> CL 2.047 1.963
> ES 2.033 1.947
> PT 2.0327 1.942
> PL 2.0321 1.932
> FR 2.031 1.924
> SE 2.0293 1.913
> DE 2.0291 1.906
> DK 2.027 1.892
> UK 2.022 1.877
> TW 1.9934 1.869
> NL 1.993 1.849
> HK 1.989 1.848
> LU 1.988 1.836
> CA 1.987 1.835
> NZ 1.9849 1.819
> US 1.9842 1.798
> AU 1.981 1.771
> MY 1.978 1.762
> HU 1.968 1.717
> LT 1.96 1.707
> SG 1.958 1.688
> FI 1.955 1.683
> CR 1.953 1.671
> BY 1.952 1.664
> IL 1.95 1.646
> EE 1.948 1.633
> NO 1.945 1.624
> IE 1.937 1.621
> SI 1.913 1.584
> LV 1.901 1.487
> SK 1.871 1.482
> BH 1.801 1.23
> SK 1.761 1.129
> AE 1.751 1.168
> IS 1.699 0.941
> BM 1.687 0.591
> KW 1.668 0.387
> CY 1.633 0.16
> AP 1.56 0.0002",
> header = TRUE,stringsAsFactors=FALSE)
> library(cluster)
> k2 <- kmeans(spdf[,c(2,3)], centers = 2, nstart = 25)
> plot(spdf[,c(2,3)],col=k2$cluster,pch=19,xlim=c(1.55,2.1))
> # Alternate the label offsets, one offset per row of spdf
> xoff <- rep(c(0.02,-0.02), length.out=nrow(spdf))
> yoff <- rep(c(-0.05,0.05), length.out=nrow(spdf))
> text(spdf[,2]+xoff, spdf[,3]+yoff, spdf[,1], col=k2$cluster)
> segments(spdf[,2], spdf[,3], spdf[,2]+xoff, spdf[,3]+yoff, col=k2$cluster)
>
> I took the liberty of replacing your abbreviations with internet top
> level domains. As I hope you can see, you have a problem with crowded
> points and labels, even with the trick of spreading the labels out.
> You could modify the X and Y offsets by hand and get a much more
> readable plot.
>
> If this is not what you want, a bit more explanation of what you do
> want may get you there.
>
> Jim
>
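
The hand-tuned X and Y offsets suggested in the quoted message could be
sketched roughly as follows. This uses a small subset of the quoted points;
the columns dx and dy and the offset values are arbitrary illustrations, not
recommended settings.

```r
# Small subset of the quoted data, chosen because the points are crowded
pts <- data.frame(State = c("PT", "PL", "FR", "SE"),
                  DMs = c(2.0327, 2.0321, 2.031, 2.0293),
                  EMs = c(1.942, 1.932, 1.924, 1.913),
                  stringsAsFactors = FALSE)

# Start from one default offset for every label ...
pts$dx <- 0.002
pts$dy <- 0
# ... then override individual labels by hand where they collide
pts$dx[pts$State == "PL"] <- -0.002
pts$dy[pts$State == "FR"] <- -0.004

plot(pts$DMs, pts$EMs, pch = 19,
     xlim = c(2.025, 2.037), ylim = c(1.90, 1.95),
     xlab = "DMs", ylab = "EMs")
# Leader lines from each point to its label position
segments(pts$DMs, pts$EMs, pts$DMs + pts$dx, pts$EMs + pts$dy)
text(pts$DMs + pts$dx, pts$EMs + pts$dy, labels = pts$State)
```

Keeping the offsets as columns of the data frame makes it easy to adjust one
label at a time and redraw.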


-- 
*Best Regards,*
*Subhamitra Patra*
*Phd. Research Scholar*
*Department of Humanities and Social Sciences*
*Indian Institute of Technology, Kharagpur*
*INDIA*
