[R] How to read this data correctly
jholtman at gmail.com
Sat Jan 25 14:36:01 CET 2014
Try XLConnect. It seems to read the data in just fine. Did have to
create a named region, otherwise it would just read in the area that
> wb <- loadWorkbook("C:\\Users\\Owner\\Downloads\\Book1 (1).xlsx")
> x <- readWorksheet(wb, 1, header = FALSE) # without region; only the data
> createName(wb, "read", "Sheet1!A1:F7") # create region including the data
> x <- readNamedRegion(wb, 'read', header = FALSE)
Col1 Col2 Col3 Col4 Col5 Col6
1 NA NA NA NA NA NA
2 NA NA NA NA NA NA
3 NA NA NA NA NA NA
4 NA NA NA NA NA -0.4195477
5 NA NA NA NA NA NA
6 NA NA NA NA NA NA
7 NA NA NA NA NA -0.4195477
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
On Fri, Jan 24, 2014 at 4:47 PM, Marc Schwartz <marc_schwartz at me.com> wrote:
> I don't know that it is a problem in R reading the file per se. It is more of an issue, as far as I can see, that read.xls() is not written to deal with some aspects of cell formatting of certain types. In this case, the cell is formatted using a financial format with Japanese Yen. I did not take the time to look through the Perl script included.
> The intermediate CSV file that is created by the Perl script that opens and reads the Excel file contains:
> I captured this while running read.xls() under debug(), since the CSV is created as a temp file that is deleted upon function exit. It would seem that the financial cell data is not simply read as a numeric value.
> The CSV file will then be directly converted to a data frame in R as is using read.csv() to result in:
>> read.xls("Book1.xlsx", 1, header = FALSE)
> 1 -0.419547704894512
> 2 -[$¥-411]0.42
> You may need to use alternative Excel file importing functions, such as XLConnect or similar, that provide more robust functionality. Of course, R itself does not have financial data types, thus there may yet need to be some form of post import data clean up, even with the other options depending upon how they function.
> Marc Schwartz
> On Jan 24, 2014, at 2:49 PM, Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
>> Hi Rui,
>> Thanks for your reply.
>> However why you said, 'shouldn't read properly in R'?
>> Basically I was looking for some way so that I would get -0.419547704894512
>> value in R against cell F4 & F7. Because F7 is linked with F4.
>> Ofcourse I can open Excel file then format that cell accordingly. However I
>> am looking for some way in R so to avoid any manual process.
>> Thanks and regards,
>> On Sat, Jan 25, 2014 at 1:21 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>>> Cell F7 has a formula, =F4, and when I open the file in excel, I get
>>> -¥0.42, which shouldn't read properly in R.
>>> The problem seems to be in the file, not in read.xls.
>>> Hope this helps,
>>> Rui Barradas
>>> Em 24-01-2014 19:22, Christofer Bogaso escreveu:
>>>> Hi again,
>>>> I need to read below xlsx file correctly (available here:
>>>> http://snk.to/f-ch3exae5), and used following code (say, file is saved in
>>>> F: drive)
>>>>> read.xls("f:/Book1.xlsx", 1, header = F)
>>>> 1 -0.419547704894512
>>>> 2 -[$¥-411]0.42
>>>> However please notice that, in my original excel file the cells F4 and F7
>>>> have essentially the same values. Therefore I should get
>>>> -0.419547704894512, for either cases above.
>>>> Any idea on how to achieve that, without opening the xlsx file manually
>>>> then formatting the cell before reading it in R?
>>>> Thanks for your help
> R-help at r-project.org mailing list
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help