[R] R: R: gstat problem with lidar data

Alessandro alessandro.montaghi at unifi.it
Thu Jul 17 01:46:51 CEST 2008


Ciao Dylan,

Thanks for your help. When I reach the step "V <- variogram(z~1,
d.small)", this error appears:

 Error in gstat(formula = object, locations = locations, data = data) : 
  l'argomento "data" non è specificato e non ha un valore predefinito
(in English: argument "data" is missing, with no default)

My code is below. I hope to improve it, because I believe R is a good fit
for this new kind of data (lidar). For ecological, hydrological and other
applications, it is important to study several processing approaches and to
test a range of software and procedures.

Thank you again; your help is very important to me.

Ale

***************************** R *****************************

> testground <- read.table(file="c:/work_LIDAR_USA/R_kriging/ground26841492694149.txt",
+                          header=TRUE, sep=" ")
> library (sp)
> class (testground) 
[1] "data.frame"
> coordinates (testground)=~X+Y
> library (gstat)
> class (testground)
[1] "SpatialPointsDataFrame"
attr(,"package")
[1] "sp"
> x <- 1:100000
> sample(x, 100)
  [1] 38465 18997 98968 56905 31535  5297 91034 57374 56148  4407 16033 74842
 [13] 49516 91422 31812 94924 44332 30412 21990 61698 53816 51227 24848 26824
 [25] 95203 20714 28172 60565 61309 24883 14063 19545 45505 24654 99649 92476
 [37] 84208 73181 13319  1559 67268 13935 57486  4162 49480 68167 38897 33295
 [49] 83067 47544 73390  9646 73967 81101 97055 96514 28011 99185 95511 98106
 [61] 86564  9635 58078 72627  2634 77933 80923 19056 13540 30066 66614 35185
 [73] 28856 61629 90387 30456 78108 18232 64321 68473  9021 15150 74326 17764
 [85] 98459 38203 62364 86437 65911 14058 27638 86792 82157 13721 15988 62189
 [97] 47190   912 33741 95151
> d <- data.frame(x=rnorm(100), y=rnorm(100), z=rnorm(100))
> rand_rows <- sample(1:nrow(d), 10)
> d.small <- d[rand_rows, ]
> summary (d.small)
       x                 y                 z          
 Min.   :-1.9838   Min.   :-1.7096   Min.   :-1.8724  
 1st Qu.:-0.5412   1st Qu.:-0.3629   1st Qu.:-1.3087  
 Median : 0.1373   Median : 0.3014   Median :-0.6858  
 Mean   :-0.1825   Mean   : 0.0811   Mean   :-0.5395  
 3rd Qu.: 0.5796   3rd Qu.: 0.8645   3rd Qu.: 0.1156  
 Max.   : 1.1075   Max.   : 0.9342   Max.   : 1.4642  
>



****************************************************
-----Original Message-----
From: Dylan Beaudette [mailto:dylan.beaudette at gmail.com] 
Sent: Wednesday, 16 July 2008 14:23
To: Alessandro
Cc: r-help at r-project.org
Subject: Re: R: [R] gstat problem with lidar data

On Wednesday 16 July 2008, Alessandro wrote:
> Hey Dylan,
>
> Thank you. For my PhD I want to test TIN (done, with ArcMap), IDW (done,
> with ArcMap) and kriging (and other models, if possible) to create a DSM,
> a DEM, and a DCM (DSM-DEM). I tried gstat in IDRISI, but my PC ran out of
> memory.
> I would like to use gstat in R to produce surface maps (in grid format for
> IDRISI or ArcMap). Unfortunately I have the same problem in R (out of
> memory), because the dataset is big. Therefore I want to create a random
> subsample of the more than 5,000,000 points.
> My code is below (sorry, I am brand new to R).
>
> Data type (in *.txt format)
>
> X		Y		Z
> .......	.......	........
> .......	.......	........
>
> testground <- read.table(
>   file="c:/work_LIDAR_USA/R_kriging/ground26841492694149.txt", header=T,
>   sep=" ")
> summary (testground)
> plot(testground[,1],testground[,2])
> library (sp)
> class (testground)
> coordinates (testground)=~X+Y
> library (gstat)
> class (testground)
> V <- variogram(z~1, testground)
>
> When I reach this step, "out of memory" appears.
>
> I would be very grateful for your help, because my work is stalled.
>
> Ale
>

Hi Ale. Please remember to CC the list next time.

Since R is memory-bound (for the most part), you should be summarizing your 
data first, then loading into R. 

If you can install GRASS, I would highly recommend using the r.in.xyz command 
to pre-grid your data to a reasonable cell size, such that the resulting 
raster will fit into memory.

If you cannot, and can somehow manage to get the raw data into R, sampling 
random rows would work.

# make some data:
x <- 1:100000

# just some of the data
sample(x, 100)

# use this idea to extract x,y,z triplets
# from some fake data:
d <- data.frame(x=rnorm(100), y=rnorm(100), z=rnorm(100))

# select 10 random rows:
rand_rows <- sample(1:nrow(d), 10)

# just the selected rows:
d.small <- d[rand_rows, ]

Keep in mind you will need enough memory to hold the original data AND your 
subset data. Trash the original data with rm() once you have the subset.
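
A minimal sketch of the whole sequence (subset, drop the original, compute the
variogram), assuming the sp and gstat packages are installed. The fake data,
the subset size n_keep and the seed are placeholders for illustration only,
not values from the real lidar file. Note that the subset is still a plain
data.frame, so it has to be given coordinates before variogram() will accept
it.

```r
library(sp)     # provides coordinates<-
library(gstat)  # provides variogram()

set.seed(42)    # assumption: fixed seed, only to make the sketch reproducible

# fake x,y,z point cloud standing in for the real lidar table
d <- data.frame(x = runif(10000), y = runif(10000), z = rnorm(10000))

# keep a random subset of rows (n_keep is a hypothetical size)
n_keep <- 500
d.small <- d[sample(nrow(d), n_keep), ]

# drop the full table and ask R to release the freed memory
rm(d)
invisible(gc())

# promote the plain data.frame to a SpatialPointsDataFrame,
# so that variogram() knows where the points are
coordinates(d.small) <- ~ x + y

# empirical variogram of z on the subset
V <- variogram(z ~ 1, d.small)
head(V)
```

An alternative that leaves d.small as a data.frame is to pass the coordinates
as a formula: variogram(z ~ 1, locations = ~ x + y, data = d.small).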

As for the statistical implications of randomly sampling a point cloud for 
variogram analysis-- someone smarter than I may be helpful.

Cheers,

Dylan



>
>
> -----Original Message-----
> From: Dylan Beaudette [mailto:dylan.beaudette at gmail.com]
> Sent: Wednesday, 16 July 2008 12:45
> To: r-help at r-project.org
> Cc: Alessandro
> Subject: Re: [R] gstat problem with lidar data
>
> On Wednesday 16 July 2008, Alessandro wrote:
> > Hey,
> >
> >
> >
> > I am a PhD student in forestry science, and I am brand new to R. I am
> > working with lidar data (point clouds with X, Y and Z values). I wish to
> > create a spatial map by kriging from the point cloud. My problem is the
> > big data set (over 5,000,000 points): I always run out of memory.
> >
> >
> >
> > Is there a script to create a subset, or to modify the radius of the
> > variogram?
>
> Do you have any reason to prefer kriging over some other, less intensive
> method such as RST (regularized splines with tension)?
>
> Check out GRASS or GMT for ideas on how to grid such a massive point set.
> Specifically the r.in.xyz and v.surf.rst modules from GRASS.
>
> Cheers,



-- 
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341


