[R] Unwanted Levels in R

Don MacQueen macq at llnl.gov
Wed May 22 16:05:13 CEST 2002


If this is a one-time-only job, and the number of data sets that have
been combined isn't too large, simply use a text editor to split the
file into separate files, one for each set, and read each one
separately with read.table, using its skip argument to pass over the
header lines.

Then rbind() them in R.
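
As a rough sketch of what I mean (the file names and the "set" column
are made up for illustration, and the rows are copied from the sample
quoted below), assuming each piece starts with the five header lines
A through E:

```r
# Stand-ins for the files produced by splitting in a text editor.
# File names are hypothetical.
tmp <- tempdir()
f1 <- file.path(tmp, "oden91.dat")
f2 <- file.path(tmp, "ao1994.dat")

writeLines(c(
  "A  900003024 ODEN     SWEDEN          ODEN91          NSIDC.ORG/PROJE",
  "B     900003     -9  1 NAN OBS         0",
  "C 1991  9  7 13 -9 XX   90.0000     .0000 XX",
  "D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV",
  "E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000",
  "    25.0   25.3 -1.7050 -1.7054 31.4970 25.3313 34.8074 43.8571 -9.0000  8.630",
  "    50.0   50.6 -1.7400 -1.7408 32.3660 26.0382 35.5010 44.5377 -9.0000  8.280"
), f1)

writeLines(c(
  "A  900002034 LOUIS ST: LAURENT   UNITED STATES   AO1994   NSIDC.ORG/PROJE",
  "B     900002         -9  1 NAN OBS         0",
  "C 1994  8 20 22 -9 XX   89.0167  137.1517 XX",
  "D    36   13.0   13.1 4075.0 4159.4 4075.0 Z 13  0 LASTLE",
  "E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000",
  "   13.0   13.1 -1.7650 -1.7652 32.9160 26.4856 35.9403 44.9690 -9.0000  8.580"
), f2)

# skip = 5 passes over the A-E header lines in each file
set1 <- read.table(f1, skip = 5)
set2 <- read.table(f2, skip = 5)

# Record which file each row came from before stacking them
set1$set <- "ODEN91"
set2$set <- "AO1994"

combined <- rbind(set1, set2)
```

The extra column is optional, but once the rows are stacked it is the
only thing that says which set a row came from.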

Or leave them separate, for that matter. The other solutions appear to
discard the information about which values belong to which location.
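
If the file is too big to split by hand, the readLines approach Peter
suggests below can be made to keep that information. This is only a
sketch under some assumptions: header lines start with a letter A-E in
the first column, every "A" line begins a new data set, and the second
field of the "A" line is usable as an identifier (I don't know what
that field actually means in this format). The variable names are mine.

```r
# In practice: raw <- readLines("/home/mattb/xxx.dat")
# Here a few lines copied from the sample quoted below are used instead.
raw <- c(
  "A  900003024 ODEN     SWEDEN          ODEN91          NSIDC.ORG/PROJE",
  "B     900003     -9  1 NAN OBS         0",
  "C 1991  9  7 13 -9 XX   90.0000     .0000 XX",
  "D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV",
  "E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000",
  "    25.0   25.3 -1.7050 -1.7054 31.4970 25.3313 34.8074 43.8571 -9.0000  8.630",
  "    50.0   50.6 -1.7400 -1.7408 32.3660 26.0382 35.5010 44.5377 -9.0000  8.280",
  "A  900002034 LOUIS ST: LAURENT   UNITED STATES   AO1994   NSIDC.ORG/PROJE",
  "B     900002         -9  1 NAN OBS         0",
  "C 1994  8 20 22 -9 XX   89.0167  137.1517 XX",
  "D    36   13.0   13.1 4075.0 4159.4 4075.0 Z 13  0 LASTLE",
  "E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000",
  "   13.0   13.1 -1.7650 -1.7652 32.9160 26.4856 35.9403 44.9690 -9.0000  8.580"
)

is_a    <- grepl("^A", raw)                    # first line of each header
is_head <- grepl("^[A-E]", raw)                # any header line
is_data <- !is_head & nzchar(trimws(raw))      # non-blank, non-header lines

# Running count of "A" lines: tells which data set each line belongs to
set_no <- cumsum(is_a)

# Second whitespace-separated field of each "A" line, used as a label
id <- vapply(strsplit(raw[is_a], " +"), function(x) x[2], character(1))

# Parse only the data lines; fill = TRUE pads any short rows with NA
values <- read.table(text = raw[is_data], fill = TRUE)

# Attach the label of the header that precedes each data row
values$id <- id[set_no[is_data]]
```

split(values, values$id) would then give one data frame per set, if
you would rather not have them all in one.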

-Don

At 10:15 AM -0400 5/21/02, MATT BORKOWSKI wrote:
>To clarify:
>The lines beginning with A, B, C, D, and E are part of a header.  Below the
>header are lines that contain the corresponding values.  The problem is that
>a number of data sets are combined, so the header repeats after a varying
>number of data lines.  Would it solve the problem to simply treat the lines
>that begin with A, B, C, D, or E differently?  If so, how do they need to be
>treated?  I've copied a bit more of the data below to demonstrate more
>clearly how the data is arranged within the file.
>
>A  900003024 ODEN     SWEDEN          ODEN91          NSIDC.ORG/PROJE
>B     900003     -9  1 NAN OBS         0
>C 1991  9  7 13 -9 XX   90.0000     .0000 XX
>D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV
>E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000
>    25.0   25.3 -1.7050 -1.7054 31.4970 25.3313 34.8074 43.8571 -9.0000  8.630 
>    50.0   50.6 -1.7400 -1.7408 32.3660 26.0382 35.5010 44.5377 -9.0000  8.280 
>    89.0   90.0 -1.6550 -1.6566 32.8530 26.4320 35.8807 44.9043 -9.0000  7.430 
>    109.0  110.3 -1.5420 -1.5444 33.8830 27.2659 36.6893 45.6886 -9.0000  7.360
>...
>...
>...
>A  900002034 LOUIS ST: LAURENT   UNITED STATES   AO1994   NSIDC.ORG/PROJE
>B     900002         -9  1 NAN OBS         0
>C 1994  8 20 22 -9 XX   89.0167  137.1517 XX
>D    36   13.0   13.1 4075.0 4159.4 4075.0 Z 13  0 LASTLE
>E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000
>   13.0   13.1 -1.7650 -1.7652 32.9160 26.4856 35.9403 44.9690 -9.0000  8.580
>
>Matt
>
>
>On Tue, 21 May 2002 15:46:58 +0200, Peter Dalgaard BSA 
><p.dalgaard at biostat.ku.dk> wrote:
>
>>  "MATT BORKOWSKI" <mpb170 at psu.edu> writes:
>>
>>  > is there any way to overcome it?  Here are a few lines of the data I'm
>>  > attempting to read in:
>>  >
>>  > A  900003024 ODEN   SWEDEN  ODEN91 NSIDC.ORG/PROJE
>>  > B     900003         -9  1 NAN OBS         0
>>  > C 1991  9  7 13 -9 XX   90.0000     .0000 XX
>>  > D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV
>>  > E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000
>>  >    10.0   10.1 -1.6970 -1.6971 31.4940 25.3287 34.8044 43.8535 -9.0000
>>  >
>>  > Here are the commands I have tried using to read in the data:
>>  >
>>  > >alldata <- read.table("/home/mattb/xxx.dat", fill = TRUE, quote = "")
>>  >
>>  > >alldata <- as.list(read.table("/home/mattb/xxx.dat", fill = TRUE, quote = ""))
>>
>>  As far as I can see, there is no connection between values in the same
>>  position in different lines? If so, trying to make a data frame out of
>>  the file is simply inappropriate and you should rather use readLines
>>  and postprocess the lines according to whatever logic they are
>>  supposed to obey.
>>
>>  --
>>     O__  ---- Peter Dalgaard             Blegdamsvej 3
>>    c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
>>   (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
>>  ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
>
>
>-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
>r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
>Send "info", "help", or "[un]subscribe"
>(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
>_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._


-- 
--------------------------------------
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
--------------------------------------