[R] Problem reading mixed CSV file

Peter Ehlers ehlers at ucalgary.ca
Fri Mar 16 19:05:56 CET 2012


On 2012-03-16 10:48, Ashish Agarwal wrote:
> Line 10 contains both a city and a state, and the two are themselves
> separated by a comma. How can I read line 10 differently from the
> other lines?

Edit the file and put quotes around the city-state combination:
  "Raleigh, North Carol"

Also: always run count.fields() on your files before importing; any
line whose field count differs from the rest needs attention.
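
A minimal sketch of that workflow, assuming a three-line stand-in for
the real file (the temp file and the sample rows are made up for
illustration, and the quoting is done in code here rather than in an
editor):

```r
# Hypothetical stand-in for the poster's file: two well-formed rows
# plus the problem row with an unquoted comma inside the city field.
lines <- c(",,Boston,1968,13,0",
           ",,Raleigh, North Carol,1968,25,0",
           "XY,PopatLal,Boston,1967,7,2")
tmp <- tempfile(fileext = ".csv")
writeLines(lines, tmp)

# Count comma-separated fields per line; the odd one out is the culprit.
n <- count.fields(tmp, sep = ",")
bad <- which(n != 6)

# Put quotes around the city-state combination on the offending line.
lines[bad] <- sub("Raleigh, North Carol", "\"Raleigh, North Carol\"",
                  lines[bad], fixed = TRUE)
writeLines(lines, tmp)

# With the field quoted, every line parses into the same six columns.
x <- read.csv(tmp, header = FALSE, stringsAsFactors = FALSE)
```

After the fix, count.fields(tmp, sep = ",") should report the same
count for every line, and the embedded comma stays inside column V3.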

Peter Ehlers

>
> On Fri, Mar 16, 2012 at 10:59 PM, David Winsemius
> <dwinsemius at comcast.net>  wrote:
>>
>> On Mar 16, 2012, at 1:11 PM, Ashish Agarwal wrote:
>>
>>> I want to import this CSV file into R.
>>>
>>> The CSV file is
>>>
>>> ,,,1968,21,0
>>> ,,Boston,1968,13,0
>>> ,,Boston,1968,18,0
>>> ,,Chicago,1967,44,0
>>> ,,Providence,1968,17,0
>>> ,,Providence,1969,48,0
>>> ,,Binky,1968,24,0
>>> ,,Chicago,1968,23,0
>>> ,,Dally,1968,7,0
>>> ,,Raleigh, North Carol,1968,25,0
>>> Addy ABC-Dogs Stars-W8.1,,Providence,1968,38,0
>>> DEF_REQPRF/,,Dartmouth,1967,31,1
>>> PL,,,1967,38,1
>>> XY,PopatLal,,1967,5,1
>>> XY,PopatLal,,1967,6,8
>>> XY,PopatLal,,1967,7,7
>>> XY,PopatLal,,1967,9,1
>>> XY,PopatLal,,1967,10,1
>>> XY,PopatLal,,1967,13,1
>>> XY,PopatLal,Boston,1967,6,1
>>> XY,PopatLal,Boston,1967,7,11
>>> XY,PopatLal,Boston,1967,9,2
>>> XY,PopatLal,Boston,1967,10,3
>>> XY,PopatLal,Boston,1967,7,2
>>>
>>> I tried using scan and read.table but results are not visible :(
>>>
>>>> scan("D:/data/temp.csv",list("","","",0,0,0),sep=",") ->x
>>>
>>> Read 51 records
>>>>
>>>> x
>>>
>>> [[1]]
>>>   [1] "ÿþ" ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""
>>> [16] ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""
>>> [31] ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""   ""
>>> [46] ""   ""   ""   ""   ""   ""
>>> ....
>>>
>>>> read.table("D:/data/temp.csv",header=F,sep=",") ->x
>>>> x
>>>
>>>    V1 V2
>>> 1   ÿþ NA
>>> 2      NA
>>> 3      NA
>>> 4      NA
>>>
>>> Can somebody please help in importing this CSV file?
>>
>>
>> Looks like an encoding mismatch. You have not offered the requested
>> information about your setup, so any further comment would be guesswork.
>> But you can perhaps educate yourself by reading:
>>
>> ?Encoding
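
The "ÿþ" in the output above is how the bytes FF FE display in Latin-1,
and that pair is the byte order mark of a UTF-16LE file. A minimal
sketch, assuming that is what happened to this file (the sample rows and
the temp file are made up; the fileEncoding argument is the point):

```r
# Hypothetical sample standing in for the poster's file.
txt <- ",,Boston,1968,13,0\nXY,PopatLal,Boston,1967,7,2\n"
tmp <- tempfile(fileext = ".csv")

# Write the text as UTF-16LE, preceded by the FF FE byte order mark.
con <- file(tmp, open = "wb")
writeBin(as.raw(c(0xff, 0xfe)), con)
writeBin(iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]], con)
close(con)

# Inspect the first two bytes of the file to see the mark.
bom <- readBin(tmp, what = "raw", n = 2)

# Declare the encoding so the file is re-encoded while being read.
x <- read.csv(tmp, header = FALSE, fileEncoding = "UTF-16LE",
              stringsAsFactors = FALSE)
```

If the real file starts with different bytes, the encoding name would
have to change accordingly.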
>>
>> And line ten has 7 elements.
>>
>>> count.fields(textConnection(",,,1968,21,0
>> + ,,Boston,1968,13,0
>> + ,,Boston,1968,18,0
>> + ,,Chicago,1967,44,0
>> + ,,Providence,1968,17,0
>> + ,,Providence,1969,48,0
>> + ,,Binky,1968,24,0
>> + ,,Chicago,1968,23,0
>> + ,,Dally,1968,7,0
>> + ,,Raleigh, North Carol,1968,25,0
>> + Addy ABC-Dogs Stars-W8.1,,Providence,1968,38,0
>> + DEF_REQPRF/,,Dartmouth,1967,31,1
>> + PL,,,1967,38,1
>> + XY,PopatLal,,1967,5,1
>> + XY,PopatLal,,1967,6,8
>> + XY,PopatLal,,1967,7,7
>> + XY,PopatLal,,1967,9,1
>> + XY,PopatLal,,1967,10,1
>> + XY,PopatLal,,1967,13,1
>> + XY,PopatLal,Boston,1967,6,1
>> + XY,PopatLal,Boston,1967,7,11
>> + XY,PopatLal,Boston,1967,9,2
>> + XY,PopatLal,Boston,1967,10,3
>> + XY,PopatLal,Boston,1967,7,2"),sep=",")
>>   [1] 6 6 6 6 6 6 6 6 6 7 6 6 6 6 6 6 6 6 6 6 6 6 6 6
>>
>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>