[R] Average distance in kilometers between subsets of points with ggmap /geosphere

Eric Berger er|cjberger @end|ng |rom gm@||@com
Mon Sep 23 09:32:18 CEST 2019


Hi Malte,
I only skimmed your question and looked at the desired output.
I wondered if the apply function could meet your needs.
Here's a small example that might help you:

m <- matrix(1:9, nrow = 3)
# MARGIN = 1 says to apply the function row-wise; the row means become column 4
m <- cbind(m, apply(m, MARGIN = 1, mean))
m

#      [,1] [,2] [,3] [,4]
# [1,]    1    4    7    4
# [2,]    2    5    8    5
# [3,]    3    6    9    6
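
A minimal sketch of how the same row-wise idea might carry over to the
distance question. In the posted code, Mean_Dist(df, df$V1, ...) receives
whole columns, so all addresses are geocoded together and one overall mean
comes back; calling a per-row function through apply avoids that. Everything
named here (coords, haversine_km, mean_dist_row, df_toy) is hypothetical: the
coordinates are made-up toy values standing in for geocode() results, and the
base-R haversine helper stands in for geosphere::distm() so that the example
runs without an API key.

```r
# Toy stand-in for geocode(): a hypothetical lookup table with made-up
# coordinates (lon, lat in degrees), not real geocoding results.
coords <- rbind(
  A = c(lon = 0, lat = 0),
  B = c(lon = 1, lat = 0),
  C = c(lon = 0, lat = 1)
)

# Great-circle (haversine) distance in km between two lon/lat points.
# Base-R stand-in for geosphere::distm(), assuming a spherical Earth
# with a radius of 6371 km.
haversine_km <- function(p, q, r = 6371) {
  to_rad <- pi / 180
  dlat <- (q[2] - p[2]) * to_rad
  dlon <- (q[1] - p[1]) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(p[2] * to_rad) * cos(q[2] * to_rad) * sin(dlon / 2)^2
  unname(2 * r * asin(sqrt(a)))
}

# Mean pairwise distance for ONE row of labels; NA entries are dropped.
mean_dist_row <- function(labels) {
  labels <- labels[!is.na(labels)]
  if (length(labels) < 2) return(NA_real_)
  pairs <- combn(labels, 2)  # each column is one unordered pair
  mean(apply(pairs, 2, function(p) {
    haversine_km(coords[p[1], ], coords[p[2], ])
  }))
}

df_toy <- data.frame(
  V1 = c("A", "A"),
  V2 = c("B", "C"),
  V3 = c("C", NA),
  stringsAsFactors = FALSE
)

# MARGIN = 1: call mean_dist_row() once per row, not once for the whole table.
df_toy$V4 <- apply(df_toy[, c("V1", "V2", "V3")], MARGIN = 1, mean_dist_row)
df_toy
```

With real addresses, the lookup in coords would presumably be replaced by one
geocode() call per row (or a single call whose results are split by row), and
haversine_km() by distm().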

HTH,
Eric


On Mon, Sep 23, 2019 at 10:18 AM Malte Hückstädt <
deaddatascientists using gmail.com> wrote:

> I would like to determine the geographical distances from a number of
> addresses and determine the mean value (the mean distance) from these.
>
> In case the dataframe has only one row, I have found a solution:
>
> ```r
> # Load packages
> library(readxl)
> library(openxlsx)
> library(googleway)
> #library(sf)
> library(tidyverse)
> library(geosphere)
> library("ggmap")
>
> # Set the API key
> set_key("")
> api_key <- ""
> register_google(key=api_key)
>
> # Data
> df <- data.frame(
>   V1 = c("80538 München, Germany", "01328 Dresden, Germany",
>          "80538 München, Germany", "07745 Jena, Germany",
>          "10117 Berlin, Germany"),
>   V2 = c("82152 Planegg, Germany", "01069 Dresden, Germany",
>          "82152 Planegg, Germany", "07743 Jena, Germany",
>          "14195 Berlin, Germany"),
>   V3 = c("85748 Garching, Germany", "01069 Dresden, Germany",
>          "85748 Garching, Germany", NA,
>          "10318 Berlin, Germany"),
>   V4 = c("80805 München, Germany", "01187 Dresden, Germany",
>          "80805 München, Germany", "07745 Jena, Germany",
>          NA),
>   stringsAsFactors = FALSE
> )
>
> # Replace NA with empty strings for the geocode function
> df[is.na(df)] <- ""
>
> # Keep only row 5
> df1 <- slice(df, 5:5)
>
> # Longitude/latitude information; rows that could not be geocoded are dropped
> df_2 <- geocode(c(df1$V1, df1$V2, df1$V3, df1$V4)) %>% na.omit()
>
> # Convert to a matrix
> mat_df <- as.matrix(df_2)
>
> # Pairwise distance matrix (in meters)
> dist_mat <- distm(mat_df)
>
> # Mean distance for row 5, converted to kilometers
> mean(dist_mat[lower.tri(dist_mat)]) / 1000
> ```
>
> Unfortunately, I have not managed to write a function that runs this code
> for an entire data set. My current problem is that the function does not
> calculate the average distance row by row, but instead calculates a single
> average over all rows of the data set.
>
> ```r
> # Function
>
> Mean_Dist <- function(df,w,x,y,z) {
>
>   # for (row in 1:nrow(df)) {
>   #   dist_mat <- geocode(c(w, x, y, z))
>   #
>   # }
>
>   # Extract the lon/lat information from the addresses
>   df <- geocode(c(w, x, y, z)) %>% na.omit()
>
>   mat_df <- as.matrix(df) # write it into a matrix
>
>   dist_mat <- distm(mat_df)
>
>   dist_mean <- mean(dist_mat[lower.tri(dist_mat)])
>
>   return(dist_mean)
> }
>
> df %>%  mutate(lon =  Mean_Dist(df,df$V1, df$V2,df$V3, df$V4)/1000)
>
> ```
> Do you have any idea what mistake I made?
>
> To clarify my question: what I'm trying to create is a data frame like
> this one (with the new column V5):
>
> ```r
>   V1                     V2                     V3                      V4                     V5
>   <chr>                  <chr>                  <chr>                   <chr>                  <numeric>
> 1 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany 80805 München, Germany Mean_Dist_row1
> 2 01328 Dresden, Germany 01069 Dresden, Germany 01069 Dresden, Germany  01187 Dresden, Germany Mean_Dist_row2
> 3 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany 80805 München, Germany Mean_Dist_row3
> 4 07745 Jena, Germany    07743 Jena, Germany    07745 Jena, Germany     07745 Jena, Germany    Mean_Dist_row4
> 5 10117 Berlin, Germany  14195 Berlin, Germany  10318 Berlin, Germany   14476 Potsdam, Germany Mean_Dist_row5
> ```
>
> i.e. the average of the pairwise distances within each row.
