[R] reading in csv files, some of which have column names and some of which don't

Benoit Vaillant
Wed Aug 14 07:09:43 CEST 2019


On Tue, Aug 13, 2019 at 01:59:56PM -0400, Christopher W Ryan wrote:
> But this assumes that all files have column names in their first row. In
> this case, some don't. Any advice how to handle it so that those with
> column names and those without are read in and combined properly?

It obviously depends on the data, but here is another approach (which
I hope has not been suggested yet):

1. For each file, read only the first row (and keep track of the
   file => first row mapping),
2. Count the distinct first rows. If the data rows differ enough from
   one file to the next, the first row with the highest count will be
   the header, so mark it as the header,
3. Reread the files. Thanks to the mapping kept in step 1, one knows
   for each file whether header should be TRUE or FALSE; for the files
   read with header set to FALSE, fix the column names afterwards.

This can of course fail miserably if not enough of the saved files
include a header.
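
As a rough illustration, the three steps could be sketched in base R as
follows. This is only a sketch under assumptions: read_mixed_csv is a
hypothetical helper name, the files are assumed to be non-empty,
comma-separated, and to share the same columns in the same order, and a
tie between candidate first rows is simply resolved in favour of the
first one in sorted order.

```r
# Hypothetical helper: combine CSV files, only some of which have a header row.
read_mixed_csv <- function(paths) {
  # Step 1: read only the first line of each file (position i <=> paths[i]).
  first_rows <- vapply(paths, function(p) readLines(p, n = 1L), character(1))

  # Step 2: count the distinct first lines; the most frequent one is
  # assumed to be the header (this is the heuristic, not a guarantee).
  counts <- table(first_rows)
  header_line <- names(counts)[which.max(counts)]
  col_names <- scan(text = header_line, what = character(), sep = ",",
                    quiet = TRUE)

  # Step 3: reread each file with header = TRUE or FALSE as appropriate,
  # then set the same column names on every piece before combining.
  has_header <- unname(first_rows) == header_line
  pieces <- lapply(seq_along(paths), function(i) {
    d <- read.csv(paths[[i]], header = has_header[[i]],
                  stringsAsFactors = FALSE)
    names(d) <- col_names
    d
  })
  do.call(rbind, pieces)
}
```

It could then be called as, for example,
read_mixed_csv(list.files("some_dir", pattern = "\\.csv$", full.names = TRUE)),
where "some_dir" is a placeholder for wherever the files live.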

HTH in your case, but it's definitely not generic.

