[R] purrr::map and xml2::read_xml

Ulrik Stervbo ulrik.stervbo at gmail.com
Fri Jan 6 17:25:22 CET 2017


Hi Maicel,

I'm guessing that B works on 50 files, and that A fails because there is no
function called 'read_xmlmap'. If the function that you map works well,
removing 'dplyr::sample_n(50)' from 'B' should solve the problem.
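
As a minimal sketch of that suggestion, here is B without the sampling
step. The two small XML files and the contents of 'muestra' are made up
for illustration (I am assuming a data frame with an id and a path
column, as you describe); only the pipeline itself comes from your code.

```r
library(dplyr)
library(purrr)
library(xml2)

# Hypothetical stand-in for 'muestra': two tiny XML files in a temp folder
tmp <- tempfile("papers")
dir.create(tmp)
paths <- file.path(tmp, c("a.xml", "b.xml"))
writeLines("<article><kwd> R </kwd><kwd>xml</kwd></article>", paths[1])
writeLines("<article><kwd>purrr</kwd></article>", paths[2])
muestra <- data.frame(id = 1:2, path = paths, stringsAsFactors = FALSE)

# Same as B, minus dplyr::sample_n(50): every path is read
keyword <-
  muestra %>%
  select(path) %>%
  unlist() %>%                      # data frame column -> character vector
  map(.f = function(x) {
    read_xml(x) %>%
      xml_find_all(".//kwd") %>%
      xml_text(trim = TRUE)
  })
```

The result is a list with one character vector of keywords per file, in
the same order as the rows of 'muestra'.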

If that is not the case, we need a bit more information.
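
One possible way to collect that information, assuming the problem is
that some of the files cannot be parsed, is to wrap the mapped function
in purrr::safely() so that a failing file does not stop the whole run.
This is only a sketch: the file names and the helper 'get_kwd' are
invented for the example.

```r
library(purrr)
library(xml2)

# Hypothetical files: one well-formed, one truncated on purpose
tmp <- tempfile("papers")
dir.create(tmp)
paths <- file.path(tmp, c("good.xml", "bad.xml"))
writeLines("<article><kwd>purrr</kwd></article>", paths[1])
writeLines("<article><kwd>oops", paths[2])

# Illustrative helper: read one file and return its keywords
get_kwd <- function(x) {
  xml_text(xml_find_all(read_xml(x), ".//kwd"), trim = TRUE)
}

# safely() returns list(result = , error = ) instead of stopping
res <- map(paths, safely(get_kwd))

# Paths whose parsing raised an error
failed <- paths[!map_lgl(res, function(r) is.null(r$error))]
```

Looking at 'failed', and at the error element of the matching entries
in 'res', should show which files cause trouble and why.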

HTH
Ulrik

On Fri, 6 Jan 2017 at 17:08 <maicel at infomed.sld.cu> wrote:

> Hi List, I am trying to extract the keywords from 1403 papers in XML
> format. I wrote the code below, but it does not work; it only runs
> with the modification shown in B. That variation is not the one
> I need because the 1403 XML files do not match those in my folder.
> Could you please tell me where the mistakes are in the code (A
> or B) so that I can correct them? The data frame columns are an id and
> the paths.
>
> A-Does not work, but it is the one I need.
>
> keyword <-
>    muestra %>%
>    select(path) %>%
>    read_xmlmap(.f = function(x) { read_xml(x) %>%
>         xml_find_all( ".//kwd") %>%
>         xml_text(trim=T) })
>
> B-It works but only with a small number of papers.
>
> keyword <-
>    muestra %>%
>    select(path) %>%
>     dplyr::sample_n(50) %>%
>     unlist() %>%
>    map(.f = function(x) { read_xml(x) %>%
>         xml_find_all( ".//kwd") %>%
>         xml_text(trim=T) })
>
> Thank you,
> Maicel Monzon MD, PHD
