[R] Gobbling up a repeating, irregular list of data

Peter Langfelder peter.langfelder at gmail.com
Fri Nov 11 05:53:17 CET 2016


It's not clear whether your numbers are tab- or space-separated; I will
assume space-separated. My low-tech (and non-R) solution would be to
dump the output into a text file (call it data.in), then run sed to
first remove the two leading spaces from each line, then replace any
remaining leading spaces with 4 (if I count correctly) tabs, and
finally replace every contiguous block of spaces with a single tab,
something like

sed 's/^  //' data.in | sed 's/^  */\t\t\t\t/' | sed 's/  */\t/g' > data.txt
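
As a quick sanity check, the same three substitutions can be tried on
two sample lines. This is only a sketch: sample.in and sample.txt are
made-up file names, and the tab is passed in through a shell variable
because some sed implementations do not treat \t in the replacement as
a tab.

```shell
# Two sample lines: one full 6-value record and one continuation line
# (37 leading spaces, so the values line up under columns 5 and 6).
printf '  1    1.00E+00  1.24E+03  7.79E+00  1.925E-01  1.88E-01\n' > sample.in
printf '%37s3.850E-01  1.88E-01\n' '' >> sample.in

# A literal tab character, so the commands do not depend on sed
# understanding the \t escape.
TAB=$(printf '\t')

# The same three substitutions, given to a single sed call.
sed -e 's/^  //' \
    -e "s/^  */${TAB}${TAB}${TAB}${TAB}/" \
    -e "s/  */${TAB}/g" sample.in > sample.txt

# Count tab-separated fields per line; every line should report 6.
awk -F'\t' '{ print NF }' sample.txt
```

If any line reports something other than 6, the spacing in the real
file probably differs from what is assumed here (tabs instead of
spaces, for example).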

This should produce a regular 6-column table that should be readable
using standard read.delim or read.table (with sep = "\t" and
header = FALSE, since the file is tab-separated and has no header
line). You will then have to figure out how to deal with the empty
cells in R.
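
If it turns out to be easier to fill the empty cells before the file
reaches R, one possibility is to carry the values of columns 1-4 down
with awk. This is a sketch under assumptions: the sample lines stand in
for the output of the sed command above, and data_filled.txt is a
made-up file name.

```shell
# Small tab-separated sample shaped like the sed output: the first line
# of each block has 6 fields, continuation lines have columns 1-4 empty.
printf '1\t1.00E+00\t1.24E+03\t7.79E+00\t1.925E-01\t1.88E-01\n'  > data.txt
printf '\t\t\t\t3.850E-01\t1.88E-01\n'                          >> data.txt
printf '1\t2.00E+00\t1.26E+03\t7.80E+00\t1.925E-01\t2.80E-01\n' >> data.txt
printf '\t\t\t\t1.732E+00\t2.80E-01\n'                          >> data.txt

# For each of columns 1-4: remember the value when it is present,
# otherwise reuse the value remembered from the block's first line.
awk 'BEGIN { FS = OFS = "\t" }
{
    for (i = 1; i <= 4; i++) {
        if ($i != "") last[i] = $i
        else          $i = last[i]
    }
    print
}' data.txt > data_filled.txt

cat data_filled.txt
```

Every row then carries its own time step, depth and thickness, so the
table can be split by column 2 once it is in R.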

Peter

On Thu, Nov 10, 2016 at 8:26 PM, Morway, Eric <emorway at usgs.gov> wrote:
> What would be the sophisticated R method for reading the data shown below
> into a list?  The data is output from a numerical model.  Pasting the
> second block of example R commands (at the end of the message) results in a
> failure ("Error in scan...line 2 did not have 6 elements").  I no doubt
> could cobble together some script for reading line-by-line using for loops,
> and then appending vectors with values from each line, but this strikes me
> as bad form.
>
> One final note, the lines with 6 values contain important values that
> should somehow remain associated with the data appearing in columns 5 & 6
> (the continuous data).  The first value, which is always 1, can be
> discarded, but the second value on these lines contains the time step number
> ("1.00E+00", "2.00E+00", etc.), and the 3rd and 4th values contain a depth
> and thickness, respectively. Columns 5 & 6 are a depth and water content
> pairing and should be associated with the time steps.
>
> Thanks, Eric
>
> Start of example output data (Use of an R script to read in this data below)
>
>   1    1.00E+00  1.24E+03  7.79E+00  1.925E-01  1.88E-01
>                                      3.850E-01  1.88E-01
>                                      5.775E-01  1.88E-01
>                                      7.700E-01  1.88E-01
>                                      9.626E-01  1.88E-01
>                                      1.155E+00  1.88E-01
>                                      1.347E+00  1.88E-01
>   1    2.00E+00  1.26E+03  7.80E+00  1.925E-01  2.80E-01
>                                      1.732E+00  2.80E-01
>                                      1.925E+00  2.80E-01
>                                      2.310E+00  2.93E-01
>                                      2.502E+00  2.22E-01
>                                      2.695E+00  1.88E-01
>                                      2.887E+00  1.88E-01
>   1    3.00E+00  1.28E+03  7.70E+00  1.925E-01  1.03E-01
>                                      3.850E-01  1.30E-01
>                                      5.775E-01  1.48E-01
>                                      7.701E-01  1.61E-01
>                                      9.626E-01  1.72E-01
>                                      1.155E+00  1.86E-01
>                                      1.347E+00  1.93E-01
>   1    4.00E+00  1.29E+03  7.60E+00  1.901E-01  1.80E-01
>                                      3.803E-01  1.80E-01
>                                      5.705E-01  1.38E-01
>                                      7.607E-01  1.32E-01
>                                      2.282E+00  1.86E-01
>                                      2.472E+00  1.98E-01
>                                      2.662E+00  2.00E-01
>
> Same data as above, but scan function fails.
>
> dat <- read.table(textConnection("  1    1.00E+00  1.24E+03  7.79E+00  1.925E-01  1.88E-01
>                                      3.850E-01  1.88E-01
>                                      5.775E-01  1.88E-01
>                                      7.700E-01  1.88E-01
>                                      9.626E-01  1.88E-01
>                                      1.155E+00  1.88E-01
>                                      1.347E+00  1.88E-01
>   1    2.00E+00  1.26E+03  7.80E+00  1.925E-01  2.80E-01
>                                      1.732E+00  2.80E-01
>                                      1.925E+00  2.80E-01
>                                      2.310E+00  2.93E-01
>                                      2.502E+00  2.22E-01
>                                      2.695E+00  1.88E-01
>                                      2.887E+00  1.88E-01
>   1    3.00E+00  1.28E+03  7.70E+00  1.925E-01  1.03E-01
>                                      3.850E-01  1.30E-01
>                                      5.775E-01  1.48E-01
>                                      7.701E-01  1.61E-01
>                                      9.626E-01  1.72E-01
>                                      1.155E+00  1.86E-01
>                                      1.347E+00  1.93E-01
>   1    4.00E+00  1.29E+03  7.60E+00  1.901E-01  1.80E-01
>                                      3.803E-01  1.80E-01
>                                      5.705E-01  1.38E-01
>                                      7.607E-01  1.32E-01
>                                      2.282E+00  1.86E-01
>                                      2.472E+00  1.98E-01
>                                      2.662E+00  2.00E-01"),header=FALSE)