[R] How to loop over two files ...

Rasmus Liland jr@| @end|ng |rom po@teo@no
Fri Jun 19 22:49:51 CEST 2020


On 2020-06-19 14:34 -0500, Ana Marija wrote:
> 
> server <- "http://rest.ensembl.org"
> ext <- "/ld/human/pairwise/rs6792369/rs1042779?population_name=1000GENOMES:phase_3:KHV"
> 
> r <- GET(paste(server, ext, sep = ""), content_type("application/json"))
> 
> stop_for_status(r)
> head(fromJSON(toJSON(content(r))))
>    d_prime       r2 variation1 variation2         population_name
> 1 0.975513 0.951626  rs6792369  rs1042779 1000GENOMES:phase_3:KHV
> 
> What I would like to do is to run this command for every SNP
> in one list (1g.txt) to each SNP in another list (1n.txt). Where SNP#
> is rs# and output every line of result in list.txt

Dear Ana,

I tried, but for some reason I get only a 
response for the first URL you supplied.  

I wrote this:

	# Read the two SNP lists, one rs ID per line
	files <- c("1g.txt", "1n.txt")
	files <- lapply(files, readLines)
	server <- "http://rest.ensembl.org"
	population.name <- "1000GENOMES:phase_3:KHV"
	# Build one URL for every pair (SNP from 1g.txt, SNP from 1n.txt)
	ext <- apply(expand.grid(files), 1, function(x) {
	  return(paste0(server, "/ld/human/pairwise/",
	    x[1], "/", x[2],
	    "?population_name=", population.name))
	})
	
	# Uncomment and run once to fetch all responses and cache them on disk:
	# r <- lapply(ext, function(x) {
	#   httr::GET(x, httr::content_type("application/json"))
	# })
	# names(r) <- ext
	# file <- paste0(population.name, ".rds")
	# saveRDS(object=r, compress="xz", file=file)
	
	# Read the cached responses back and parse the first four
	r <- readRDS(paste0(population.name, ".rds"))
	lapply(r[1:4], function(x) {
	  jsonlite::fromJSON(jsonlite::toJSON(httr::content(x)))
	})


If you are able to run it (uncommenting the 
block once to save the responses in that rds 
file), it yields this: 

	$`http://rest.ensembl.org/ld/human/pairwise/rs6792369/rs1042779?population_name=1000GENOMES:phase_3:KHV`
	  variation2         population_name  d_prime       r2 variation1
	1  rs1042779 1000GENOMES:phase_3:KHV 0.975513 0.951626  rs6792369
	
	$`http://rest.ensembl.org/ld/human/pairwise/rs1414517/rs1042779?population_name=1000GENOMES:phase_3:KHV`
	list()
	
	$`http://rest.ensembl.org/ld/human/pairwise/rs16857712/rs1042779?population_name=1000GENOMES:phase_3:KHV`
	list()
	
	$`http://rest.ensembl.org/ld/human/pairwise/rs16857703/rs1042779?population_name=1000GENOMES:phase_3:KHV`
	list()

For some reason, only the first URL works ...
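
To see at a glance which requests came back empty, and to get one line per result into a text file as asked in the original question, something like the following might work. It is only a sketch in base R: the `parsed` list is toy data that mimics what httr::content() gives for each response, with values copied from the output above, and the field types and the output path are assumptions.

```r
# Toy stand-in for lapply(r, httr::content): each element is a list of
# LD records, or an empty list. Values are copied from the output above;
# the real field types may differ.
parsed <- list(
  "rs6792369/rs1042779" = list(list(
    variation2 = "rs1042779",
    population_name = "1000GENOMES:phase_3:KHV",
    d_prime = 0.975513,
    r2 = 0.951626,
    variation1 = "rs6792369"
  )),
  "rs1414517/rs1042779" = list()
)

# Number of LD records per request; 0 marks an empty response.
n_records <- lengths(parsed)

# Flatten to a plain list of records, then stack them as data frame rows.
records <- unlist(parsed, recursive = FALSE)
ld <- do.call(rbind, lapply(records, function(rec) {
  as.data.frame(rec, stringsAsFactors = FALSE)
}))
rownames(ld) <- NULL

# Write one line per result (hypothetical location; use "list.txt"
# in the working directory for the file named in the question).
out_file <- file.path(tempdir(), "list.txt")
write.table(ld, out_file, quote = FALSE, sep = "\t", row.names = FALSE)
```

Pairs with `n_records == 0` are the ones to look at more closely, for example by printing httr::status_code() for the matching response.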

I am not very familiar with REST APIs, or 
with web scraping in general.  Daniel 
Cegiełka showed something relevant in a 
thread some days ago [1]: the API of 
borsaitaliana.it might be similar, and there 
you can supply headers with curl, like he 
quickly did [2].

You might be able to supply the list of SNPs 
in a header to Ensembl in httr::GET somehow 
if you read some docs on their API? 
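
Whether Ensembl accepts a whole list of SNPs in one request is not something this sketch answers; it stays with one request per pair. What it shows is how a header can be passed to httr::GET and how the HTTP status can be kept next to each parsed body. The helper names `build_ld_urls` and `fetch_ld`, the use of an Accept header via httr::accept_json(), and the pause between requests are all assumptions made for illustration.

```r
# Build one pairwise LD URL per (snp1, snp2) combination.
build_ld_urls <- function(snps1, snps2, population,
                          server = "http://rest.ensembl.org") {
  pairs <- expand.grid(v1 = snps1, v2 = snps2, stringsAsFactors = FALSE)
  paste0(server, "/ld/human/pairwise/", pairs$v1, "/", pairs$v2,
         "?population_name=", population)
}

# Fetch one URL, asking for JSON through the Accept header, and keep
# the status code so empty or failed responses can be told apart.
fetch_ld <- function(url, pause = 0.1) {
  Sys.sleep(pause)  # assumed courtesy delay, not a documented limit
  resp <- httr::GET(url, httr::accept_json())
  list(url = url,
       status = httr::status_code(resp),
       body = httr::content(resp, as = "parsed"))
}

urls <- build_ld_urls(c("rs6792369", "rs1414517"), "rs1042779",
                      "1000GENOMES:phase_3:KHV")

# Needs network access and the httr package, so left commented out:
# results <- lapply(urls, fetch_ld)
```

Other headers could be added in the same call with httr::add_headers().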

Best,
Rasmus

[1] https://marc.info/?t=159249246100002&r=1&w=2
[2] https://marc.info/?l=r-sig-finance&m=159249894208684&w=2
