[R] merge and polylist

Roger Bivand Roger.Bivand at nhh.no
Mon Oct 9 16:20:36 CEST 2006


On Mon, 9 Oct 2006, Mihai Nica wrote:

> The answer to all your questions is simple. By the time I get a little
> grasp on things, they become "deprecated" :-). But my programming skills
> are so low, that I find this normal.

> My problem comes from the last two #comment lines :

> #(if data row names do not match polygon IDs, will reorder, or fail if 
> # any differ or absent

> The two data row names differ and some are absent, that's why I used
> "merge" (and posted on this list, not on r-sig-geo :-)). How else can I
> intersect the two data.frames?

Yes, use merge, but do look carefully at both of the input data frames, 
the by.x= and by.y= columns, and the output data frame. The sp classes 
give an ID to each polygon:

sapply(slot(object, "polygons"), function(x) slot(x, "ID"))

and that has to be identical to the row names of the output data frame - 
but can be in a different order. Constructing the SpatialPolygonsDataFrame 
will put the data frame rows in the order of the polygons.

Roger


> Thanks so much for your help!


----- Original Message ----
From: Roger Bivand <Roger.Bivand at nhh.no>
To: Mihai Nica <mihai.p.nica at gmail.com>
Cc: r-help at stat.math.ethz.ch
Sent: Sunday, October 8, 2006 3:10:16 PM
Subject: Re: [R] merge and polylist

On Sat, 7 Oct 2006, Mihai Nica wrote:

> Greetings:
> 
> I would like to kindly ask for a little help. The rough code is:
> 

Maybe R-sig-geo would be a more relevant list. Comments inline.

> #________________________________________________________
> 
> dat=data.frame(read.delim(file="all.txt", header = TRUE, sep = "\t",
> quote="\"", dec=".",na.strings = "NA"))

We do not know if dat is in the same order as the shapefile. If dat$cod is 
malformed wrt. nc$att.data$AREA (very strange choice, in ESRI generated 
files often the polygon area), you are asking for trouble.

> 
> nc=read.shape("astae.shp", dbf.data=TRUE, verbose=TRUE)
> 
> mappolys=Map2poly(nc)
> 
> submap <- subset(mappolys, nc$att.data$NAME!="Honolulu, HI")
> 
> nc$att.data=subset(nc$att.data, nc$att.data$NAME!="Honolulu, HI")
> 

In situations like this, overwriting the input is not advisable, bercause
you destroy your ability to check that the output corresponds to your
intentions.
  
> nc$att.data[,1]=as.numeric(paste(nc$att.data$MSACMSA))
> 
> #attributes(nc$att.data)
> 
> nc$att.data=merge(nc$att.data, dat, by.x="AREA", by.y="cod", all.x=TRUE,
> sort=FALSE)

Ditto.

> 
> #attributes(nc$att.data)
> 
> tmp=file("tmp")
> 
> write.polylistShape(submap, nc$att.data, "tmp")
> 

Any good reason for not using the sp class framework? The objects you are 
using here are very low-level and messy.

library(maptools)
nc_1 <- readShapePoly(system.file("shapes/sids.shp", package="maptools")[1],
  ID="FIPS") # put shapefile in SpatialPolygonsDataFrame
nc_2 <- nc_1[coordinates(nc_1)[,1] < -80,] # subset with "[" method
row.names(as(nc_2, "data.frame"))
as.character(nc_2$FIPS)
tmpfl <- paste(tempfile(), "dbf", sep=".")
download.file("http://spatial.nhh.no/misc/nc_xtra.dbf";, tmpfl, mode="wb")
nc.df <- read.dbf(tmpfl) # extra data keyed on CNTY_ID
nc_2$CNTY_ID
nc.df$CNTY_ID
nc_df2 <- merge(as(nc_2, "data.frame"), nc.df, by="CNTY_ID", sort=FALSE)
all.equal(nc_df2$CNTY_ID, nc_2$CNTY_ID)
row.names(nc_df2) <- nc_df2$FIPS # re-instate IDs
nc_3 <- SpatialPolygonsDataFrame(as(nc_2, "SpatialPolygons"), data=nc_df2)
# (if data row names do not match polygon IDs, will reorder, or fail if 
# any differ or absent
writePolyShape(nc_3, "nc_3")

This still isn't as tidy as it could be, but gives much more control than 
the original old-style classes.

Roger


> #_________________________________________________________
> 
> All works fine, but merge() changes the rownames and the link between the
> polygons and the corresponding rows is lost. I tried numerous other
> solutions (such as to "paste" back the old rownames), to no avail. After a
> few days, here I am. Please, if you have a moment, send a tip.
> 
>  Thanks,
> 
>  mihai
> 
> 
> 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.








-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no



More information about the R-help mailing list