[R] Finding values in a dataframe at a specified hour

Alexandra Catena amc5981 at gmail.com
Sat Apr 11 01:24:11 CEST 2015


Hi Jim,

Thanks for the response, but unfortunately it results in the same
error.  I think something is wrong with the if statement.  I checked
the first row and hour being tested by hand, and indeed the wind speed
is not higher than the 5*sigma value.  Since it is not higher, I would
expect the loop to simply move on to the next iteration, yet it
doesn't. I will keep trying!
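
For reference, here is a minimal sketch of one way the "argument is of
length zero" error can arise, followed by a vectorized alternative to the
double loop. The toy values are made up, and the column names ws, hour and
sigma5 are assumptions based on the descriptions in this thread; the real
dataframes may differ (for example, if spring has no column called hour,
or stores hours in a different type).

```r
# Hypothetical toy data: column names follow the thread, values are made up.
windHW <- data.frame(ws   = c(5.2, 600, 7.1, NA, 80.3),
                     hour = c(0, 0, 1, 1, 1))
spring <- data.frame(hour = 0:23, sigma5 = rep(79.6, 24))

# 1. One way the error can arise: if the lookup matches no row, the
#    right-hand side has length zero, so the comparison has length zero
#    too, and if() cannot decide on a zero-length condition.
no_match <- spring[spring$hour == 99, "sigma5"]   # no row has hour 99
cond <- windHW[1, "ws"] >= no_match               # zero-length logical
length(cond)

# 2. A vectorized alternative: look up each row's threshold by hour with
#    match(), then compare the whole column at once. Rows with a missing
#    wind speed or no matching threshold are left untouched.
threshold <- spring$sigma5[match(windHW$hour, spring$hour)]
too_high  <- !is.na(windHW$ws) & !is.na(threshold) & windHW$ws >= threshold
windHW$ws[too_high] <- NA
```

This sketch applies the spring thresholds to every row; with one dataframe
per season, the same lookup would presumably be run on the subset of windHW
belonging to each season.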

Thanks,
Alexandra

On Fri, Apr 10, 2015 at 3:43 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
> Hi Alexandra,
> The error probably comes from the first iteration of i in 0:23. As indexing
> in R begins at 1, there is no element 0. Try using:
>
> for(i in 1:24) {
> ...
>
> and see what happens.
>
> Jim
>
>
> On Sat, Apr 11, 2015 at 7:06 AM, Alexandra Catena <amc5981 at gmail.com> wrote:
>>
>> Update:
>>
>> I have this so far.  * The first column of windHW is the wind speed.
>> The 5th column of the dataframe spring is the 5*sigma value for each
>> hour.  hourRow holds the indices of all rows of windHW at a given hour.
>>
>> for (i in 0:23){
>>   hourRow = which(windHW$hour==i,arr.ind=TRUE)
>>   for (h in hourRow){
>>     if (windHW[h,1]>=spring[spring$hour==i,5]){
>>       windHW[h,1]<-NA}
>>   }
>> }
>>
>> This then gives the error: Error in if (windHW[h, 1] >=
>> spring[spring$hour == i, 5]) { : argument is of length zero
>>
>> *Note: The dataframe for each season has 24 rows, one for each
>> hour of the day (0:23).
>>
>> Thanks,
>> Alexandra
>>
>>
>> On Fri, Apr 10, 2015 at 1:07 PM, Alexandra Catena <amc5981 at gmail.com>
>> wrote:
>> > Hello,
>> >
>> > I have a large dataframe (windHW) of wind speeds (ws) at each hour
>> > from many days over a set of years.  Some of these values are
>> > obviously wrong (600 m/s) and I want to get rid of all the values that
>> > are larger than 5*sigma for each hour.  The 5*sigma (variable name
>> > sigma5) values are located in different dataframes for each season,
>> > with each dataframe titled as a season.  For example, in the
>> > dataframe, spring, the 5*sigma value is 79.6 m/s for hour 1.
>> >
>> > So my question is as follows: how can I find all the wind speed
>> > values in the dataframe windHW at a specific hour that are higher
>> > than the 5*sigma value for that hour?
>> > For example, I would like to find if any of the wind speed values at
>> > hour 1 are higher than 79.6 m/s, and if so, then replace that value
>> > with NA.
>> >
>> > I have something like this but I can't seem to figure out how to get
>> > it for specific hours:
>> >
>> > windHW$ws[windHW$ws>=spring$sigma5] <- NA
>> >
>> > I imported the data using readLines into the dataframe windHW.  I
>> > am using R version 3.1.1.
>> >
>> > Any help would be appreciated!
>> >
>> > Thanks,
>> > Alexandra
>>
>
>


