[R] extracting pdf tables...

Sun Apr 9 21:03:36 CEST 2023

Dear Jeff,
                  Thanks for your reply.

I have the following:

> colnames(IDT[[4]])
[1] "X168"                "TATA.MOTORS.LIMITED" "TATAMOTORS"          "X4"

The above has to be the first row of IDT[[4]]. The first row is being parsed as the column names. How do I make it the first row of IDT[[4]] instead?
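
For illustration, here is a minimal base R sketch of one way to push a mis-parsed header back into the data as its first row. It is not something tabulizer provides: header_to_first_row, df4 and the V1..V4 names are hypothetical, and df4 is a made-up stand-in for IDT[[4]] that reuses only the column names shown above. The clean-up of the names (dropping the leading "X" that R adds before digits, turning "." back into a space) is a heuristic and will misfire on values that really contained a dot, such as decimals.

```r
# Made-up stand-in for IDT[[4]]: the header is really a data row.
# Only the column names come from the post; the single row is invented.
df4 <- data.frame(X168 = 169L,
                  TATA.MOTORS.LIMITED = "EXAMPLE COMPANY LTD",
                  TATAMOTORS = "EXAMPLE",
                  X4 = 4L,
                  stringsAsFactors = FALSE)

header_to_first_row <- function(df, new_names = paste0("V", seq_along(df))) {
  # Recover the header values as text and heuristically undo the name mangling.
  hdr <- names(df)
  hdr <- sub("^X(?=[0-9])", "", hdr, perl = TRUE)  # "X168" -> "168"
  hdr <- gsub(".", " ", hdr, fixed = TRUE)         # "TATA.MOTORS.LIMITED" -> "TATA MOTORS LIMITED"

  # Convert every column to character so the header row can be stacked on top.
  body  <- as.data.frame(lapply(df, as.character), stringsAsFactors = FALSE)
  first <- as.data.frame(as.list(hdr), stringsAsFactors = FALSE)

  # Give both pieces the same neutral names, then stack them.
  names(body)  <- new_names
  names(first) <- new_names
  rbind(first, body)
}

fixed4 <- header_to_first_row(df4)
print(fixed4)
```

All columns come back as character, so numeric columns would need converting afterwards (for example with as.integer or type.convert). Where re-extracting is an option, asking for a non-data.frame output and assigning column names by hand may avoid the mangling in the first place; check the extract_tables help page for the output types it supports.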

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI
________________________________
From: Jeff Newmiller <jdnewmil using dcn.davis.ca.us>
Sent: Monday, April 10, 2023 12:27 AM
To: akshay kulkarni <akshay_e4 using hotmail.com>; r-help using r-project.org <r-help using r-project.org>
Subject: Re: [R] extracting pdf tables...

Your code used cbind. My first answer was appropriate for rbind.

So you still need to figure out how to deal with the different columns in the tables, which requires more knowledge about their contents than we have.
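
As a rough sketch of two of the options mentioned earlier in the thread (renaming columns, or adding missing columns filled with NA), assuming made-up data: the list tables, the names in common, and the helper rbind_fill are all hypothetical and only stand in for the extracted tables, whose real contents are not known here.

```r
# Made-up stand-ins for two extracted tables whose column names differ.
tables <- list(
  data.frame(Sr.No = 1:2, Name = c("A LTD", "B LTD"), Symbol = c("A", "B"),
             stringsAsFactors = FALSE),
  data.frame(X3 = 4L, C.LTD = "D LTD", C = "D", stringsAsFactors = FALSE)
)

# Option 1: the tables have the same columns in the same order and only the
# names differ, so rename by position and then rbind.
common <- c("sr_no", "name", "symbol")
renamed <- lapply(tables, function(d) {
  stopifnot(ncol(d) == length(common))  # fail early if the widths differ
  setNames(d, common)
})
combined <- do.call(rbind, renamed)

# Option 2: the tables really have different columns, so add the missing
# ones filled with NA before rbind.
rbind_fill <- function(dfs) {
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(d) {
    for (col in setdiff(all_cols, names(d))) d[[col]] <- NA
    d[all_cols]  # same column order in every table
  })
  do.call(rbind, padded)
}
filled <- rbind_fill(tables)

print(combined)
print(filled)
```

Option 1 only makes sense if the header row of a later table has already been dealt with (otherwise that row of data is lost when the names are overwritten). Which option applies depends on what the columns actually contain.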

On April 9, 2023 11:43:01 AM PDT, akshay kulkarni <akshay_e4 using hotmail.com> wrote:
>Dear Jeff,
>                  I want to rbind.
>
>Thanking you,
>Yours sincerely,
>AKSHAY M KULKARNI
>________________________________
>From: R-help <r-help-bounces using r-project.org> on behalf of Jeff Newmiller <jdnewmil using dcn.davis.ca.us>
>Sent: Sunday, April 9, 2023 11:57 PM
>To: r-help using r-project.org <r-help using r-project.org>
>Subject: Re: [R] extracting pdf tables...
>
>Sorry, did not read closely enough.
>
>Did you want rbind (which has no problem with different numbers of rows) or merge (which requires that there be key columns that can be aligned by repeating data)?
>
>On April 9, 2023 10:49:09 AM PDT, Jeff Newmiller <jdnewmil using dcn.davis.ca.us> wrote:
>>Clearly the column names are different. You need to decide what to do about that. Choose the subset of dataframes where the column names are the same? Rename columns? Omit some columns? Add missing columns filled with NA?
>>
>>On April 9, 2023 10:22:32 AM PDT, akshay kulkarni <akshay_e4 using hotmail.com> wrote:
>>>Dear members,
>>>                             I am extracting a pdf table by the following code:
>>>
>>>> library(tabulizer)
>>>> IDT <- extract_tables("https://www.canmoney.in/pdf/INTRADAYLEVERAGE-20220531-latest.pdf",output = "data.frame")
>>>
>>>It returns 4 different data frames, which I want to combine into one data frame. But when I run this:
>>>
>>>> rbind(IDT[[1]],IDT[[2]],IDT[[3]],IDT[[4]])
>>> Error in match.names(clabs, names(xi)) :
>>>names do not match previous names
>>>
>>>Also:
>>>
>>>> class(IDT[[1]])
>>>[1] "data.frame"
>>>
>>>> cbind(IDT[[1]],IDT[[2]],IDT[[3]],IDT[[4]],make.row.names = FALSE)
>>> Error in data.frame(..., check.names = FALSE) :
>>>arguments imply differing number of rows: 55, 56, 30, 1
>>>
>>>Can anyone please help me to combine all these 4 different data frames?
>>>
>>>Thanking you,
>>>Yours sincerely,
>>>AKSHAY M KULKARNI
>>>
>>>      [[alternative HTML version deleted]]
>>>
>>>______________________________________________
>>>R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>and provide commented, minimal, self-contained, reproducible code.
>>
>
>--
>Sent from my phone. Please excuse my brevity.
>
