[R] find number of consecutive days in NC files by R

Jim Lemon drj|m|emon @end|ng |rom gm@||@com
Fri Aug 7 07:17:07 CEST 2020


Hi Ahmet,
Here is a way to get the result you ask for for one geographic grid
cell. You may want more detail or something, but this is a
"reproducible example".

# retrieved from
ftp://ftp2.psi.noaa.gov/Datasets/ncep.renalysis.dailyavgs/surface_gauss/soilw.1-10cm.gauss.1949.nc
library(ncdf4)
soilm<-nc_open("soilw.0-10cm.gauss.1949.nc")
soil_moist<-ncvar_get(soilm)
# find a suitable grid cell
for(i in 1:192) {
 for(j in 1:94) {
  minmoist<-min(soil_moist[i,j,],na.rm=TRUE)
  if(minmoist < 0.3) cat(i,j,minmoist,"\n")
 }
}
# data is 3D numeric array use cell 159,66
soil_moisture<-soil_moist[159,66,]
dates<-as.Date(as.Date("1949-01-01"):as.Date("1949-12-31"),
 origin=as.Date("1970-01-01"))
# get a logical vector for your condition
under.28<-soil_moisture < 0.28
plot(dates,soil_moisture,
 main="Soil moisture for grid cell 159,66 1949",
 col=under.28+3)
abline(h=0.28)
sm.rle<-rle(soil_moisture < 0.28)
day_of_year<-1
cat("Consecutive days below 0.28\n")
for(run in 1:length(sm.rle$lengths)) {
 if(sm.rle$values[run]) {
  cat("Day",day_of_year,"-",day_of_year+sm.rle$lengths[run],
   "-",sm.rle$lengths[run],"days\n")
  day_of_year<-day_of_year+sm.rle$lengths[run]
 }
}

Jim

On Fri, Aug 7, 2020 at 11:54 AM ahmet varlı <varli61 using windowslive.com> wrote:
>
>  Many thanks for your answer
>
> > nc
>
> File soilw_1949.nc (NC_FORMAT_NETCDF4_CLASSIC):
>
>
>
>      1 variables (excluding dimension variables):
>
>         float soilw[lon,lat,time]
>
>             long_name: mean Daily Volumetric Soil Moisture between 0-10 cm Below Ground Level
>
>             units: fraction
>
>             precision: 4
>
>             least_significant_digit: 3
>
>             GRIB_id: 144
>
>             GRIB_name: SOILW
>
>             var_desc: Volumetric Soil Moisture
>
>             dataset: NCEP Reanalysis Daily Averages
>
>             level_desc: Between 0-10 cm BGL
>
>             statistic: Mean
>
>             parent_stat: Individual Obs
>
>             missing_value: -9.96920996838687e+36
>
>             actual_range: 0.100000143051147
>
>              actual_range: 0.434000015258789
>
>             valid_range: 0
>
>              valid_range: 1
>
>
>
>      3 dimensions:
>
>         lon  Size:192
>
>             units: degrees_east
>
>             long_name: Longitude
>
>             actual_range: 0
>
>              actual_range: 358.125
>
>             standard_name: longitude
>
>             axis: X
>
>         lat  Size:94
>
>             units: degrees_north
>
>             actual_range: 88.5419998168945
>
>              actual_range: -88.5419998168945
>
>             long_name: Latitude
>
>             standard_name: latitude
>
>             axis: Y
>
>         time  Size:365   *** is unlimited ***
>
>             long_name: Time
>
>             delta_t: 0000-00-01 00:00:00
>
>             avg_period: 0000-00-01 00:00:00
>
>             standard_name: time
>
>             axis: T
>
>             units: hours since 1800-01-01 00:00:0.0
>
>             actual_range: 1306104
>
>              actual_range: 1314840
>
>
>
>     7 global attributes:
>
>         Conventions: COARDS
>
>         title: mean daily NMC reanalysis (1949)
>
>         description: Data is from NMC initialized reanalysis
>
> (4x/day).  It consists of T62 variables interpolated to
>
> pressure surfaces from model (sigma) surfaces.
>
>         platform: Model
>
>         history: created 99/05/29 by Hoop (netCDF2.3)
>
> Converted to chunked, deflated non-packed NetCDF4 2014/09
>
>         dataset_title: NCEP-NCAR Reanalysis 1
>
>         References: http://www.psl.noaa.gov/data/gridded/data.ncep.reanalysis.html
>
>
>
> I swiched nc to array to calculate threshold it is a 3d matrix and there is no date in files
>
>
>
>
>
> > dim(a)
>
>
>
> [1]  94 192 365
>
>
>
> >a
>
> , , 1
>
>
>
>       [,169]    [,170]    [,171]    [,172]    [,173]    [,174]    [,175]    [,176]    [,177]    [,178]    [,179]    [,180]    [,181]    [,182]
>
>  [1,] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>
>  [2,] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>
>  [3,] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>
>  [4,] 0.3580000 0.3580000 0.3580000 0.3580000 0.3580000 0.3580000 0.3580000 0.3577001 0.3577001 0.3577001 0.0000000 0.0000000 0.0000000 0.0000000
>
>  [5,] 0.3580000 0.3575001 0.3572001 0.3572001 0.3572001 0.3570001 0.3570001 0.3575001 0.3577001 0.3580000 0.3580000 0.3580000 0.3580000 0.3580000
>
>
>
>
>
>
>
> [ reached getOption("max.print") -- omitted 89 row(s) and 364 matrix slice(s) ]
>
>
>
>
>
> Windows 10 için Posta ile gönderildi
>
>
>
> Kimden: Jim Lemon
> Gönderilme: 7 Ağustos 2020 Cuma 02:17
> Kime: ahmet varlı; r-help mailing list
> Konu: Re: [R] find number of consecutive days in NC files by R
>
>
>
> Hi Ahmet,
> I think what you are looking for can be done using run length encoding (rle).
>
> # make up some data
> soil_moisture<-sin(seq(0,4*pi,length.out=730))+1.1
> dates<-as.Date(as.Date("2018-01-01"):as.Date("2019-12-31"),
>  origin=as.Date("1970-01-01"))
> # get a logical vector for your condition
> under.28<-soil_moisture < 0.28
> # show the soil moisture against time
> plot(dates,soil_moisture,pch=".",col=under.28+3,cex=2)
> abline(h=0.28)
> # use rle to get  the runs of low soil moisture
> sm.rle<-rle(soil_moisture < 0.28)
> cat("Consecutive days below 0.28",
>  paste(1:sum(sm.rle$values),sm.rle$lengths[sm.rle$values==TRUE],sep="-"),
>  "\n")
>
> Jim
>
> On Fri, Aug 7, 2020 at 3:33 AM ahmet varlı <varli61 using windowslive.com> wrote:
> >
> > Hi all,
> >
> >
> > There are 365 days of soil moisture NC files and I am trying to find out how many days the values are below and above this certain threshold are repeated by R. However, I couldn't reach exactly what I wanted. For example, Daily soil moisture is below 0.3 without interrupting how many days in 365 days. NC file contains annual soil moisture values daily
> >
> > nctoarray <- function(ncfname, varid = NA) {   nc <- nc_open(ncfname)
> >
> > a <- aperm(ncvar_get(nc), c(2,1,3))   nc_close(nc)   a }
> >
> >
> >
> > function(x, threshold = 0.28, below = TRUE) {
> >
> >     if (below) {
> >
> >         y <- ifelse(x < threshold,1,0)
> >
> >     } else y <- ifelse(x > threshold,1,0)
> >
> >
> >
> >     y2 <- rle(y)
> >
> >     sel <- which(y2$values == 1)
> >
> >     max(y2$lengths[sel])
> >
> > }
> >
> >
> >
> > m1 <- suppressWarnings(apply(a,c(1,2), consechours, 0.3, TRUE))
> >
> >
> >
> > m2 <- suppressWarnings(apply(a,c(1,2), consechours, 0.4, FALSE))
> >
> >
> >
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>



More information about the R-help mailing list