[R] organizing data in a matrix avoiding loop

Duncan Murdoch murdoch.duncan at gmail.com
Fri May 26 14:20:16 CEST 2017


On 26/05/2017 7:46 AM, A M Lavezzi wrote:
> Dear R-Users
>
> I have data on bilateral trade flows among countries in the following form:
>
>> head(dataTrade)
>
>   iso_o iso_d year FLOW
> 1   ABW   AFG 1985   NA
> 2   ABW   AFG 1986   NA
> 3   ABW   AFG 1987   NA
> 4   ABW   AFG 1988   NA
> 5   ABW   AFG 1989   NA
> 6   ABW   AFG 1990   NA
>
> where:
> iso_o: code of country of origin
> iso_d: code of country of destination
> year: 1985:2015
> FLOW: amount of trade (values are "NA", 0s, or positive numbers)
>
> I have 215 countries. I would like to create a 215x215 matrix , say M, in
> which element M(i,j) is the total trade between countries i and j between
> 1985 and 2015 (i.e. the sum of annual amounts of trade).
>
> After collecting the country codes in a variable named "my_iso", I can
> obtain M in a straightforward way using a loop such as:
>
> for (i in my_iso){
>   for(j in my_iso)
>     if(i!=j){
>       M[seq(1:length(my_iso))[my_iso==i],seq(1:length(my_iso))[my_iso==j]] <-
>         sum(dataTrade[dataTrade$iso_o==i & dataTrade$iso_d==j,"FLOW"],na.rm=TRUE)
>     }
> }
>
> However, it takes ages.
>
> Is there a way to avoid these loops?

Assuming that each combination of the first 3 columns (origin, 
destination, year) appears only once, you could do something like this:

# Put all the data into an array, indexed by origin, destination, year:

dataMatrix <- as.matrix(dataTrade)  # Converts everything to character

dataArray <- array(0, c(215, 215, 31))
dimnames(dataArray) <- list(unique(dataMatrix[,1]), 
unique(dataMatrix[,2]), unique(dataMatrix[,3]))

dataArray[dataMatrix[,1:3]] <- dataTrade$FLOW

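# Sum across years, i.e. over the third dimension, keeping dimensions
# 1 and 2 (na.rm = TRUE skips the NA flows, as in your loop)

M <- apply(dataArray, 1:2, sum, na.rm = TRUE)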

I haven't tried this (you didn't give a reproducible example...), so you 
may need to tweak it a bit.
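
For what it's worth, here is a small self-contained sketch of the same 
idea on made-up data (three countries, two years; the FLOW values are 
invented purely for illustration). It is only a sketch under a couple of 
assumptions: one shared vector of codes, my_iso, is used for both the row 
and column dimnames so that origins and destinations line up, and the 
array dimensions are computed from the data instead of being hard-coded.

```r
# Made-up stand-in for dataTrade: 3 countries, 2 years, no self-trade rows
dataTrade <- data.frame(
  iso_o = rep(c("ABW", "ABW", "AFG", "AFG", "AGO", "AGO"), each = 2),
  iso_d = rep(c("AFG", "AGO", "ABW", "AGO", "ABW", "AFG"), each = 2),
  year  = rep(1985:1986, times = 6),
  FLOW  = c(1, 10,  0, 30,  NA, 5,  3, 40,  2, NA,  4, 20),
  stringsAsFactors = FALSE
)

# One common set of country codes for rows and columns
my_iso <- sort(unique(c(as.character(dataTrade$iso_o),
                        as.character(dataTrade$iso_d))))
years  <- as.character(sort(unique(dataTrade$year)))

# Array indexed by origin, destination, year; pairs with no data stay 0
dataArray <- array(0,
                   dim = c(length(my_iso), length(my_iso), length(years)),
                   dimnames = list(my_iso, my_iso, years))

# A character matrix with one column per dimension indexes by dimnames
idx <- cbind(as.character(dataTrade$iso_o),
             as.character(dataTrade$iso_d),
             as.character(dataTrade$year))
dataArray[idx] <- dataTrade$FLOW

# Keep dimensions 1 and 2, sum over the years, ignoring NA flows
M <- apply(dataArray, 1:2, sum, na.rm = TRUE)

M["ABW", "AFG"]   # 1 + 10 = 11
M["AFG", "ABW"]   # NA + 5 = 5 once the NA is dropped
```

On the full data, rowSums(dataArray, dims = 2, na.rm = TRUE) should give 
the same totals and is likely faster than apply(), though I have not 
timed it on anything of that size.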

Duncan Murdoch


