[R] netcdf data precision or least significant digit

Ismail SEZEN sezenismail at gmail.com
Fri Jul 8 01:49:36 CEST 2016


Thank you Roy. 

I use NCEP/NCAR Reanalysis 2 data [1], more precisely the u-wind data for the year 2015 [2]. I am also pretty sure that attributes like scale_factor or add_offset should be exact values like 0.01 or 187.65, but somehow (I hope this is not an issue caused by me) they are not, and neither are the data. Let me also note that I have already contacted the author of the ncdf4 package and sent an email to ESRL too, but no luck yet.
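
For what it is worth, the odd scale_factor can be reproduced in base R without any netcdf file, which suggests it is simply the nearest 32-bit float to 0.01 widened to a double. This is only a sketch: to_float32 is a helper name I made up (it is not part of ncdf4), the two packed integers are values I back-computed myself, and I am assuming the unpacking is done in double arithmetic.

```r
# Round-trip a double through a 4-byte float, to mimic a value that is
# stored as a 32-bit float in the file and then widened to an R double.
to_float32 <- function(x) {
  readBin(writeBin(x, raw(), size = 4), what = "numeric",
          n = length(x), size = 4)
}

scale_factor <- to_float32(0.01)
add_offset   <- to_float32(187.65)
print(scale_factor, digits = 15)   # 0.00999999977648258, same as ncdf4 shows

# ASSUMPTION: packed short integers back-computed for two u-wind values;
# they are not read from the actual file.
packed <- c(-17845L, -19685L)

# Usual netcdf unpacking: packed * scale_factor + add_offset
u <- packed * scale_factor + add_offset
print(u, digits = 8)
```

If this is right, the extra digits in the data come from the float representation of the two attributes and not from the model output itself.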

For vector data, the u components of the wind at opposite longitudes at the pole must be equal in magnitude. For instance, at “2015-01-01 00 GMT”, u-wind at longitude=0 and latitude=90 is 9.1999979 m/s and u-wind at longitude=180 and latitude=90 is -9.2000017 m/s. The minus sign comes from the positive north direction. Physically, their absolute values must be equal.
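
What I would like to end up with looks roughly like the sketch below. It assumes that rounding to the number of decimals given by the precision attribute (2) is the right choice, which is exactly the point I am unsure about. The vector is the one from my earlier message.

```r
# The two pole values quoted above
u0   <-  9.1999979   # longitude = 0,   latitude = 90
u180 <- -9.2000017   # longitude = 180, latitude = 90

# ASSUMPTION: round to the decimals given by the `precision` attribute
digits <- 2
round(abs(u0), digits) == round(abs(u180), digits)   # TRUE after rounding

# Values as read from the file (first and last should be physically equal)
data <- c(9.1999979, 8.7999979, 7.9999979, 3.0999980,
          6.1000018, 10.1000017, 10.4000017, 9.2000017)
rounded <- round(data, digits)

# With equal end points the periodic spline should no longer warn
# about differing first and last y values.
fit <- spline(seq_along(rounded), rounded, method = "periodic")
```

Using digits <- 1 instead would correspond to trusting least_significant_digit; for these particular values both choices make the end points equal.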

1- http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis2.html
2- ftp://ftp.cdc.noaa.gov/Datasets/ncep.reanalysis2.dailyavgs/pressure/uwnd.2015.nc



> On 08 Jul 2016, at 02:27, Roy Mendelssohn - NOAA Federal <roy.mendelssohn at noaa.gov> wrote:
> 
> Hi Ismail:
> 
> Can you point me to a particular netcdf file you are working with?  I would like to play with it for a while.  I am pretty certain the scale factor is 0.01 and what you are seeing is rounding error (or, more precisely, I should say problems with the representation of floating point numbers), but I would like to see if there is a way around this.
> 
> Thanks,
> 
> -Roy
> 
>> On Jul 7, 2016, at 4:16 PM, Ismail SEZEN <sezenismail at gmail.com> wrote:
>> 
>> Thank you very much Jeff.  I think I failed to explain myself clearly. Perhaps this is the wrong list for this question, but I sent it in the hope that there is someone here who has a deep understanding of netcdf data and uses R. Let me tell the story more simply. Assume that you read a numeric vector of data from a netcdf file:
>> 
>> data <- c(9.1999979, 8.7999979, 7.9999979, 3.0999980, 6.1000018, 10.1000017, 10.4000017, 9.2000017)
>> 
>> you know that the values above are a model output and also you know that, physically, first and last values must be equal but somehow they are not.
>> 
>> And now, you want to use “periodic” spline for the values above.
>> 
>> spline(1:8, data, method = "periodic")
>> 
>> Voila! spline method throws a warning message: “spline: first and last y values differ - using y[1] for both”. Then I go on digging and discover 2 attributes in netcdf file: “precision = 2” and “least_significant_digit = 1”. And I also found their definitions at [1].
>> 
>> precision -- number of places to right of decimal point that are significant, based on packing used. Type is short.
>> least_significant_digit -- power of ten of the smallest decimal place in unpacked data that is a reliable value. Type is short.
>> 
>> Please do not condemn me, English is not my main language :). At this point, as a scientist, what would you do according to the explanations above? I think I didn’t exactly understand the difference between precision and least_significant_digit. One says “significant” and the other says “reliable”. Should I round the numbers to 2 decimal places or 1 decimal place after the decimal point?
>> 
>> Thanks,
>> 
>> 1- http://www.esrl.noaa.gov/psd/data/gridded/conventions/cdc_netcdf_standard.shtml
>> 
>> 
>>> On 08 Jul 2016, at 01:29, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>>> 
>>> Correction:
>>> 
>>> ?options (not par)
>>> -- 
>>> Sent from my phone. Please excuse my brevity.
>>> 
>>> On July 7, 2016 3:26:06 PM PDT, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>>>> Same as with any floating point numeric computation environment... you
>>>> don't. There is always uncertainty in any floating point number... it
>>>> is just larger in this data than you might be used to.
>>>> 
>>>> Once you get to the stage where you want to output values, read up on
>>>> 
>>>> ?round
>>>> ?par (digits)
>>>> 
>>>> and don't worry about the incidental display of extra digits prior to
>>>> presentation (output). 
>>>> -- 
>>>> Sent from my phone. Please excuse my brevity.
>>>> 
>>>> On July 7, 2016 12:50:54 AM PDT, Ismail SEZEN <sezenismail at gmail.com>
>>>> wrote:
>>>>> Hello,
>>>>> 
>>>>> I use ncdf4 and ncdf4.helpers packages to get wind data from ncep/ncar
>>>>> reanalysis netcdf files. But data is in the form of (9.199998,
>>>>> 8.799998, 7.999998, 3.099998, -6.8000018, …). I’m aware of precision
>>>>> and least_significant_digit attributes of ncdf4 object [1]. For uwnd
>>>>> data, precision = 2 and least_significant_digit = 1. My question is
>>>>> whether I should round the data to 2 decimal places or 1 decimal place
>>>>> after the decimal point.
>>>>> 
>>>>> The same issue applies to some header info.
>>>>> 
>>>>> Output of ncdf4 object:
>>>>> 
>>>>> 
>>>>> Output of ncdump on terminal:
>>>>> 
>>>>> 
>>>>> For instance, ncdump's scale factor is 0.01f but ncdf4 object’s
>>>>> scale_factor is 0.00999999977648258. You can notice the same issue for
>>>>> actual_range and add_offset. A similar issue also exists for the data.
>>>>> How can I truncate those extra insignificant decimal places or round
>>>>> the numbers to significant decimal places?
>>>>> 
>>>>> 1 -
>>>>> http://www.esrl.noaa.gov/psd/data/gridded/conventions/cdc_netcdf_standard.shtml
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>> 
>>> 
>> 
>> 
> 
> **********************
> "The contents of this message do not reflect any position of the U.S. Government or NOAA."
> **********************
> Roy Mendelssohn
> Supervisory Operations Research Analyst
> NOAA/NMFS
> Environmental Research Division
> Southwest Fisheries Science Center
> ***Note new address and phone***
> 110 Shaffer Road
> Santa Cruz, CA 95060
> Phone: (831)-420-3666
> Fax: (831) 420-3980
> e-mail: Roy.Mendelssohn at noaa.gov www: http://www.pfeg.noaa.gov/
> 
> "Old age and treachery will overcome youth and skill."
> "From those who have been given much, much will be expected" 
> "the arc of the moral universe is long, but it bends toward justice" -MLK Jr.
> 


