[Rd] foreign::read.dbf fails to parse dbf properly

Ezra Tucker
Sat Jul 30 19:36:17 CEST 2022


Thank you all for your thoughts, I'll definitely submit my bug report
and patch there.

With respect to the data themselves and the data formats, I came across
this document in my research and found it a helpful reference:
http://www.manmrk.net/tutorials/database/xbase/data_types.html

Forgot to mention, attempted all this on
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Fedora Linux 36 (Workstation Edition)

and also on a Windows 10 machine running R version 4.2.1 with UCRT.

To Andre's and Roger's points, great minds: I did exactly the same
things. Opening the dbf files in LibreOffice works perfectly (so does
the soffice command line tool). I also tried converting to csv using
ogr2ogr, and got EXACTLY the same problem as I'm seeing in R.

dbfopen.c says it's derived from Shapelib. The documentation here:
http://shapelib.maptools.org/dbf_api.html
states that DBFGetNativeFieldType() has support for the C, D, F, N, L
and M data types. If ogr2ogr uses the same source code from Shapelib,
it makes sense to me that it would interpret these values the same way.
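
To make that gap concrete, here is a minimal standalone sketch in C. It
is not Shapelib or foreign code, and the function name is made up for
illustration: it only checks a field-type byte against the six types
listed in the Shapelib documentation above, so a FoxPro "I" field falls
through.

```c
#include <string.h>

/* Hypothetical helper, NOT part of Shapelib or the foreign package.
   Returns 1 if `type` is one of the native field types that the
   Shapelib documentation lists (C, D, F, N, L, M), and 0 otherwise. */
int is_documented_native_type(char type)
{
    /* Reject '\0' explicitly: strchr() would otherwise match the
       string terminator and report it as found. */
    return type != '\0' && strchr("CDFNLM", type) != NULL;
}
```

Under this check 'N' is accepted and 'I' is not, which would be
consistent with R and ogr2ogr treating the column the same way.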

I'll note really quickly that if this were a simple matter of
converting the ASCII, this wouldn't be an issue, but certain characters
(mostly the control characters, backspace, delete, and a few others)
prevented me from accurately reconstructing all the original data once
they were loaded into R.
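
As a rough sketch of why the raw bytes are needed: assuming a type "I"
field is stored as a 4-byte little-endian signed integer (the byte
order is my assumption here), the value 40 is stored as the bytes
0x28 0x00 0x00 0x00. The function name below is hypothetical and is not
taken from dbfopen.c or Rdbfread.c.

```c
#include <stdint.h>

/* Hypothetical helper: decode 4 raw bytes as a little-endian signed
   32-bit integer. Assumes `buf` points to at least 4 readable bytes. */
int32_t decode_le_int32(const unsigned char *buf)
{
    /* Assemble the unsigned value, least significant byte first. */
    uint32_t u = (uint32_t)buf[0]
               | ((uint32_t)buf[1] << 8)
               | ((uint32_t)buf[2] << 16)
               | ((uint32_t)buf[3] << 24);

    /* Convert to signed without relying on implementation-defined
       out-of-range casts. */
    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;
    return (int32_t)(u - 0x80000000u) + INT32_MIN;
}
```

With the four bytes 0x28 0x00 0x00 0x00 this returns 40, whereas the
same bytes read as a NUL-terminated string are just "(".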

-Ezra

On Sat, 2022-07-30 at 08:17 -0400, Duncan Murdoch wrote:
> On 29/07/2022 4:52 p.m., Ezra Tucker wrote:
> > Dear R developers,
> > 
> > tl;dr I've been trying to read foxpro dbf files with
> > foreign::read.dbf(), they weren't being read properly, I patched the
> > foreign package to make it work, now what?
> > 
> > Long version:
> > I recently encountered unexpected behavior attempting to read dbf
> > files using foreign::read.dbf() from here:
> > 
> > https://forms.ferc.gov/f1allyears/f1_2020.zip
> > 
> > unzipped, in UPLOADERS/FORM1/working/F1_15.DBF - and as a note, this
> > is a foxpro database. I would expect the first row of the first
> > column to be 40, instead I am getting "(" (realizing that "(" has a
> > decimal ascii value of 40). The xbase docs indicate that this is a
> > field of type "I" which is a 4-byte integer unique to foxpro, and it
> > doesn't look like this case is contemplated by read.dbf()
> > 
> > I made some modifications to Rdbfread.c and dbfopen.c in the foreign
> > package (version 0.8-82) to add specific handling for field type "I".
> > 
> > I'm not currently set up to contribute directly, I don't have SVN
> > access.
> > 
> > 1. Is this patch of general interest? I'm weighing in the development
> > guidelines:
> >    - DO NOT fix exotic bugs that haven't bugged anyone
> >    - DO make small enhancements if they are badly needed
> > and I feel like this is maybe a bit of an exotic lack-of-feature
> > (wouldn't call it a bug), and I have no idea if this is badly needed
> > (by anybody, other than myself)
> > 
> > 2. if of general interest, how can I get set up with SVN credentials
> > for R-packages?
> 
> Roger addressed your first question.  I'll give some information about
> the second one.
> 
> You should automatically have read permission on the R-packages 
> repository, as with most R svn repositories.
> 
> I think to get write permission, you'd need to be invited to join R 
> Core, or to be a maintainer of the package.
> 
> The more common way to have changes accepted is to post them to the
> bugs.r-project.org web site.  See https://www.r-project.org/bugs.html
> for more details on how to get an account set up there so you can post
> things.
> 
> Duncan Murdoch


