Ashley Patton ashley.patton at aol.co.uk
Wed Mar 22 15:05:20 CET 2017

Good afternoon,

I was wondering if someone could help me with what is probably a really simple problem; I cannot work out what I have done wrong. I have searched the forums and Google, but the only examples I can find look the same as what I have already done. I suspect the problem is in how I have named things, but I don't know what is causing it.

I have 53 columns of temperature data, one per site, recorded continuously for a year at 48 readings a day (half-hourly). I also have one column containing the average air temperature for a city over the same period. I would like to see whether my combined site temperature data correlates with the city air temperature, so I have attempted to combine the 53 site columns into one using the code below, repeat the air temperature column 53 times to match, and then run a Pearson's correlation. My data looks something like this:

Site   BHCS306   BH9OB1U   BHCS276   BHCS207...   AirTempC
       12.2      12.4      12.2      12.7         15.3
       12.2      12.5      12.3      12.7         16.2
       12.3      12.5      12.5      12.8         16.1...
repeating for 53 sites recording every half hour for a year

The code I used was this:

#String together data from all 53 sites into one column
AllTemps <- c(data[,"BHCS306"], data[,"BH9OB1U"], data[,"BHCS276"], data[,"BHCS207AL"],
              data[,"BHCS178AL"], data[,"BHCS159AL"], data[,"BHCS318"], data[,"BHCS211"],
              data[,"BH7OB1L"], data[,"BHCS274B"], data[,"BHCS337"], data[,"BH2PB1"],
              data[,"BHCS038"], data[,"BHCS074AL"], data[,"BH9OB1L"], data[,"Site 5"],
              data[,"BH6PB4"], data[,"BH6PB1"], data[,"BHCS329"], data[,"BH5PB1T"],
              data[,"BH4PB1T"], data[,"BHCS233T"], data[,"BHCS229"], data[,"BHCS272T"],
              data[,"BHCS217T"], data[,"BHCS283"], data[,"BHCS248"], data[,"BHCS002A"],
              data[,"BHCS245B"], data[,"BH4PB2T"], data[,"BH6PB2"], data[,"BH5PB1B"],
              data[,"BH4PB1B"], data[,"BHCS233B"], data[,"BHCS313L"], data[,"BHCS272B"],
              data[,"BHCS266"], data[,"BHCS217B"], data[,"BHCS241"], data[,"BH4PB2B"],
              data[,"BHCS116AL"], data[,"BHCS067A"], data[,"BHCS304L"], data[,"BH1OB1L"],
              data[,"BHCS307L"], data[,"BHCS037C"], data[,"BHCS301L"], data[,"BHCS238A"],
              data[,"BH3OB1"], data[,"BHCS308L"], data[,"BHCS278"], data[,"BHCS285"],
              data[,"BHCS133CL"], data[,"BHCS332L"])

#Copy air temp data 53 times
airTemps53 <- c(rep(AirTempC, times = 53))

#Run correlation between site temps and air temps
cor.test(AllTemps, airTemps53, alternative = "two.sided", method = "pearson")
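
For reference, the same stacking can be sketched without typing every column name. This is only a minimal illustration on a made-up toy data frame (`temps`, three sites, four rows, all values invented), and it assumes that every column except `AirTempC` is a site column:

```r
# Toy stand-in for the real data: three site columns plus air temperature.
# All values are invented for illustration.
temps <- data.frame(
  BHCS306  = c(12.2, 12.2, 12.3, 12.4),
  BH9OB1U  = c(12.4, 12.5, 12.5, 12.6),
  BHCS276  = c(12.2, 12.3, 12.5, 12.5),
  AirTempC = c(15.3, 16.2, 16.1, 16.4)
)

# Assumption: every column other than AirTempC holds site temperatures
site_cols <- setdiff(names(temps), "AirTempC")

# Stack the site columns end to end (first column, then second, ...)
all_temps <- unlist(temps[site_cols], use.names = FALSE)

# Repeat the air temperature once per site so both vectors line up
air_rep <- rep(temps$AirTempC, times = length(site_cols))

# Both vectors must have the same length before correlating
stopifnot(length(all_temps) == length(air_rep))

result <- cor.test(all_temps, air_rep,
                   alternative = "two.sided", method = "pearson")
print(result$estimate)
```

Taking the repeat count from `length(site_cols)` rather than hard-coding 53 keeps the two vectors the same length, which matters here because the hand-typed list above appears to contain 54 names rather than 53.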

The error it returned was this:

> #Copy air temp data 53 times
> airTemps53 <- c(rep(AirTempC, times = 53))
> #Run correlation between site temps and air temps
> cor.test(AllTemps, airTemps53, alternative = "two.sided", method = "pearson")
Error in cor.test(AllTemps, airTemps53, alternative = "two.sided", method = "pearson") : 
  object 'AllTemps' not found

Can anyone spot my mistake? I am very new to this so I am sure I have done something obvious and silly so please forgive me.
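
An "object not found" error for `AllTemps` means the earlier assignment never completed, for example because one of the requested column names does not exist in the data frame. One way to check, sketched here on a hypothetical miniature import (the CSV text and the `wanted` vector are invented; the real file and import call are not shown above), is to compare the names typed in the code against `names()` of the data frame. By default `read.csv()` rewrites headers into syntactic names, so a header such as `Site 5` becomes `Site.5`:

```r
# Hypothetical miniature of the import step; the real file is not shown
csv_text <- "BHCS306,Site 5,AirTempC\n12.2,12.4,15.3\n12.2,12.5,16.2\n"
temps <- read.csv(text = csv_text)   # check.names = TRUE is the default

# Column names as they actually exist after import
print(names(temps))

# Names as typed in the stacking code
wanted <- c("BHCS306", "Site 5")

# Any name listed here would make temps[, name] fail
missing_cols <- setdiff(wanted, names(temps))
print(missing_cols)

# Confirm whether the stacked vector was ever created in this session
print(exists("AllTemps"))
```

If `missing_cols` is not empty, either the names in the code need to match the imported ones, or the file could be read with `check.names = FALSE` so the original headers are kept.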

Additionally, I was wondering if there is an easy way to offset the data in R to see whether, for example, there is a lag between changes in air temperature and the corresponding changes in temperature at my sites, or do I need to offset the data manually in Excel first?
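
To make the lag question concrete, here is a minimal sketch of offsetting inside R rather than in Excel. It uses invented series in which the site responds exactly 3 readings after the air temperature, and `lagged_cor` is a hypothetical helper written for this illustration, not a function from any package. Base R also provides `ccf()` in the stats package for cross-correlation, which may be worth a look.

```r
# Correlation between x and y when y is read `lag` steps later than x.
# Hypothetical helper, not from any package.
lagged_cor <- function(x, y, lag) {
  n <- length(x)
  cor(x[1:(n - lag)], y[(1 + lag):n], use = "complete.obs")
}

set.seed(1)
n <- 500

# Invented air temperature series
air <- 15 + 5 * sin(seq(0, 20, length.out = n)) + rnorm(n, sd = 0.5)

# Invented site series: follows the air temperature 3 readings later
site <- c(rep(NA, 3), air[1:(n - 3)])

# Try lags of 0 to 10 readings and keep the correlation for each
lags <- 0:10
cors <- sapply(lags, function(k) lagged_cor(air, site, k))

best_lag <- lags[which.max(cors)]
print(best_lag)       # lag in readings
print(best_lag / 2)   # lag in hours, at two readings per hour
```

With real data the same loop could be run on one site column at a time against the air temperature column, since stacking all sites into one vector would mix the end of one site's year with the start of the next.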

Many thanks,

