[R] Parsing JSON records to a dataframe

Dieter Menne dieter.menne at menne-biomed.de
Fri Jan 7 09:05:10 CET 2011



Jeroen Ooms wrote:
> 
> What is the most efficient method of parsing a dataframe-like structure
> that has been json encoded in record-based format rather than vector
> based. For example a structure like this:
> 
> [ {"name":"joe", "gender":"male", "age":41}, {"name":"anna",
> "gender":"female", "age":23} ]
> 
> RJSONIO parses this as a list of lists, which I would then have to apply
> as.data.frame to and append them to an existing dataframe, which is
> terribly slow. 
> 
> 

unlist is pretty fast. The solution below assumes that you know the structure
of your data in advance, so it is not very flexible, but it should show you
that the conversion to a data.frame is not the bottleneck: on my machine
parsing takes about 4.3 s and the conversion about 1.2 s.

# Build test data modelled on the records in the question:
# [ {"name":"joe", "gender":"male", "age":41},
#  {"name":"anna", "gender":"female", "age":23} ]
library(RJSONIO)
n = 300000
d = data.frame(name=rep(c("joe","anna"),n),
           gender=rep(c("male","female"),n),
           age = rep(c("41","23"),n))  # ages stored as strings, converted back below
dj = toJSON(d)

system.time(d1 <- fromJSON(dj))
#  user  system elapsed
#   4.06    0.26    4.32

# d1 holds one element per column, so each column is flattened with unlist
system.time(
  dd <- data.frame(
    name = unlist(d1$name),
    gender = unlist(d1$gender),
    age = as.numeric(unlist(d1$age)))
)
#   user  system elapsed
#   1.13    0.05    1.18
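
For input that really is record-based, as in the question, the same idea can be applied field by field instead of calling as.data.frame on every record. This is only a minimal sketch in base R: it assumes the parser returns one named list per record, and recs is a hand-built stand-in for that result, not actual fromJSON output.

```r
# Hypothetical stand-in for the parsed JSON: one named list per record.
recs <- list(
  list(name = "joe",  gender = "male",   age = 41),
  list(name = "anna", gender = "female", age = 23)
)

# Extract each field across all records as an atomic vector.
# vapply fails loudly if a record does not yield exactly one value
# of the expected type, which catches malformed records early.
dd <- data.frame(
  name   = vapply(recs, function(r) r[["name"]],   character(1)),
  gender = vapply(recs, function(r) r[["gender"]], character(1)),
  age    = vapply(recs, function(r) as.numeric(r[["age"]]), numeric(1)),
  stringsAsFactors = FALSE
)
```

The data.frame is built once from whole columns, so no rows are appended one at a time. As with the solution above, the field names have to be known in advance.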






