[R] List / Matrix to Data Frame

Avi Gross @v|gro@@ @end|ng |rom ver|zon@net
Fri Jul 2 06:14:56 CEST 2021


Bill,

A Matrix can only contain one kind of data. I ran your code after modifying
it to be proper and took a transpose to get it risght-side up:

t(wanted)
date         netIncome        grossProfit      
2020-09-30 "2020-09-30" "57411000000.00" "104956000000.00"
2019-09-30 "2019-09-30" "55256000000.00" "98392000000.00" 
2018-09-30 "2018-09-30" "59531000000.00" "101839000000.00"
2017-09-30 "2017-09-30" "48351000000.00" "88186000000.00"

That looks better, I think.

So I did this:

wanted <- t(wanted)

It has rownames and colnames and you can make it a data frame easily enough:

mydf <- data.frame(wanted)

But the columns are all character strings, so CONVERT them as you wish:

> mydf
date      netIncome     grossProfit
2020-09-30 2020-09-30 57411000000.00 104956000000.00
2019-09-30 2019-09-30 55256000000.00  98392000000.00
2018-09-30 2018-09-30 59531000000.00 101839000000.00

Your numbers are quite large and may or may not be meant to be integers:

> mydf$netIncome <- as.numeric(mydf$netIncome)
> mydf$grossProfit <- as.numeric(mydf$grossProfit)
> head(mydf)
date  netIncome grossProfit
2020-09-30 2020-09-30 5.7411e+10 1.04956e+11
2019-09-30 2019-09-30 5.5256e+10 9.83920e+10
2018-09-30 2018-09-30 5.9531e+10 1.01839e+11
2017-09-30 2017-09-30 4.8351e+10 8.81860e+10
2016-09-30 2016-09-30 4.5687e+10 8.42630e+10
2015-09-30 2015-09-30 5.3394e+10 9.36260e+10

The first entries may have something wrong as they become NA when I make
them integers.

The date column is the same as the rownames and is not in a normal vector
format. It shows as a list and you may want to convert it to one of several
formats R supports for dates or a more normal character string. 

So here is how I made it a character string:

> mydf <- as.data.frame(wanted)
> mydf$date <- as.character(mydf$date)
> mydf$netIncome <- as.numeric(mydf$netIncome)
> mydf$grossProfit <- as.numeric(mydf$grossProfit)
> head(mydf)
date  netIncome grossProfit
2020-09-30 2020-09-30 5.7411e+10 1.04956e+11
2019-09-30 2019-09-30 5.5256e+10 9.83920e+10
2018-09-30 2018-09-30 5.9531e+10 1.01839e+11
2017-09-30 2017-09-30 4.8351e+10 8.81860e+10
2016-09-30 2016-09-30 4.5687e+10 8.42630e+10
2015-09-30 2015-09-30 5.3394e+10 9.36260e+10

If you want a DATE, it can now be converted again using one of many methods.

Just FYI, numbers that big and rounded might work just as well measured in
millions as in what I did to grossProfit:

> mydf$grossProfit <- as.integer(mydf$grossProfit/1000000)
> mydf$netIncome <- as.integer(mydf$netIncome/1000000)
> head(mydf)
date netIncome grossProfit
2020-09-30 2020-09-30     57411      104956
2019-09-30 2019-09-30     55256       98392
2018-09-30 2018-09-30     59531      101839
2017-09-30 2017-09-30     48351       88186
2016-09-30 2016-09-30     45687       84263
2015-09-30 2015-09-30     53394       93626

Some of the numbers are negative though.

If rownames are not needed:

> rownames(mydf) <- NULL
> head(mydf)
date netIncome grossProfit
1 2020-09-30     57411      104956
2 2019-09-30     55256       98392
3 2018-09-30     59531      101839
4 2017-09-30     48351       88186
5 2016-09-30     45687       84263
6 2015-09-30     53394       93626

It may be easier to work with this, but again, if you need the dates to be
real dates, as in graphing.

Hope that helps. 





-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Bill Dunlap
Sent: Thursday, July 1, 2021 9:01 PM
To: Sparks, John <jspark4 using uic.edu>
Cc: r-help using r-project.org
Subject: Re: [R] List / Matrix to Data Frame

Does this do what you want?

> df <- data.frame(check.names=FALSE,
lapply(c(Date="date",netIncome="netIncome",`Gross Profit`="grossProfit"),
function(nm)vapply(ISY, "[[", nm, FUN.VALUE=NA_character_)))
> str(df)
'data.frame':   36 obs. of  3 variables:
 $ Date        : chr  "2020-09-30" "2019-09-30" "2018-09-30" "2017-09-30"
...
 $ netIncome   : chr  "57411000000.00" "55256000000.00" "59531000000.00"
"48351000000.00" ...
 $ Gross Profit: chr  "104956000000.00" "98392000000.00" "101839000000.00"
"88186000000.00" ...
> df$Date <- as.Date(df$Date)
> df$netIncome <- as.numeric(df$netIncome) df$`Gross Profit` <- 
> as.numeric(df$`Gross Profit`)
> str(df)
'data.frame':   36 obs. of  3 variables:
 $ Date        : Date, format: "2020-09-30" "2019-09-30" "2018-09-30"
"2017-09-30" ...
 $ netIncome   : num  5.74e+10 5.53e+10 5.95e+10 4.84e+10 4.57e+10 ...
 $ Gross Profit: num  1.05e+11 9.84e+10 1.02e+11 8.82e+10 8.43e+10 ...
> with(df, plot(Date, netIncome))

On Thu, Jul 1, 2021 at 5:35 PM Sparks, John <jspark4 using uic.edu> wrote:

> Hi R-Helpers,
>
> I am taking it upon myself to delve into the world of lists for R.  In 
> no small part because I appear to have discovered a source of data for 
> an exceptionally good price but that delivers much of that data in json
format.
>
> So over the last day or so I managed to fight the list processing 
> tools to a draw and get a list that has only selected elements 
> (actually it ends up in matrix form).  But when I try to convert that 
> to a data frame I can't get it to a form that is workable.
>
> I have visited some pages about converting a matrix to a data frame 
> but they result in highly redundant and inelegant data.
>
> I am thinking that someone who works with lists and matrices knows how 
> to do this quite easily and would be willing to provide a solution.
>
> The reproducible example is shown below.  Just to be explicit, what I 
> am trying to get to is something along the lines of a data frame like
this.
>
> Date              netIncome      Gross Profit
> 2020-09-30 57411000000 104956000000
> 2019-09-30 55256000000   98392000000
>
> .....
>
> The closest I get is a matrix that looks like this
>
> > wanted
>             2020-09-30        2019-09-30       2018-09-30
> 2017-09-30       2016-09-30       2015-09-30       2014-09-30
> date        "2020-09-30"      "2019-09-30"     "2018-09-30"
> "2017-09-30"     "2016-09-30"     "2015-09-30"     "2014-09-30"
> netIncome   "57411000000.00"  "55256000000.00" "59531000000.00"
> "48351000000.00" "45687000000.00" "53394000000.00" "39510000000.00"
> grossProfit "104956000000.00" "98392000000.00" "101839000000.00"
> "88186000000.00" "84263000000.00" "93626000000.00" "70537000000.00"
>
> Code for example
>
> library(jsonlite)
> test <- jsonlite::fromJSON("
> https://eodhistoricaldata.com/api/fundamentals/AAPL.US?api_token=OeAFF
> mMliFG5orCUuwAKQ8l4WWFQ67YX
> ")
>
> hist<-test[[13]]
> ISY<-hist$Income_Statement$yearly
> wanted<-sapply(ISY, "[", j = c("date","netIncome","grossProfit"))
>
>
> Your guidance would be much appreciated.
>
> --John J. Sparks, Ph.D.
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

	[[alternative HTML version deleted]]

______________________________________________
R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list