[R] URLdecode problems

Hadley Wickham h.wickham at gmail.com
Tue Sep 2 05:29:10 CEST 2014


Hi Oliver,

I think you're being misled by the default behaviour of warnings: they
all get displayed at once, before control returns to the console.  If
you make them immediate, you get a slightly more informative error:

> URLdecode("0;%20@%gIL")
Warning in URLdecode("0;%20@%gIL") :
  out-of-range values treated as 0 in coercion to raw
Error in rawToChar(out) : embedded nul in string: '0; @\0L'
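
For reference, one way to make warnings immediate is the warn option.
This is a minimal sketch, assuming the default (warn = 0) was in effect
beforehand; the try() wrapper is only there so the snippet keeps
running past the error:

```r
# warn = 1 prints each warning as it occurs instead of deferring it
old <- options(warn = 1)

# any warning is now reported before the error that follows it
res <- try(URLdecode("0;%20@%gIL"), silent = TRUE)

# restore whatever setting was in place before
options(old)
```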

So the out-of-range value (%g...) is getting converted to as.raw(0),
aka a nul. Then rawToChar() chokes.
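
The two steps can be reproduced in isolation. This is only a sketch:
274 is an arbitrary stand-in for any value above 255, and the bytes
around it are chosen to mimic the string in the error message.

```r
# a value that does not fit in one byte is coerced to 00 (with a warning)
bad <- suppressWarnings(as.raw(274))

# a raw vector with a 00 byte in the middle cannot become an R string
bytes <- c(charToRaw("0; @"), bad, charToRaw("L"))
res <- try(rawToChar(bytes), silent = TRUE)

# TRUE when rawToChar() failed
inherits(res, "try-error")
```

That would also explain why different inputs end up looking the same:
any malformed escape that lands outside 0..255 collapses to the same
nul byte.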

The code for URLdecode is simple enough that I'd recommend rewriting
it yourself to better handle bad inputs.
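
One possible shape for such a rewrite is sketched below. The name
url_decode_safe is hypothetical (it is not part of base R), and
leaving malformed escapes untouched and dropping %00 are assumptions
about what "better" means for your data, not the only reasonable
choices.

```r
# Hypothetical helper: decodes only well-formed %XX escapes and leaves
# anything else (such as "%gI") in the output unchanged.
url_decode_safe <- function(url) {
  x <- charToRaw(url)
  n <- length(x)
  pc <- charToRaw("%")
  hex <- charToRaw("0123456789abcdefABCDEF")
  out <- raw(0)
  i <- 1L
  while (i <= n) {
    if (x[i] == pc && i + 2L <= n && all(x[i + 1:2] %in% hex)) {
      byte <- as.raw(strtoi(rawToChar(x[i + 1:2]), base = 16L))
      # drop an encoded nul (%00), which rawToChar() cannot represent
      if (byte != as.raw(0)) out <- c(out, byte)
      i <- i + 3L
    } else {
      # ordinary byte, or a "%" that does not start a valid escape
      out <- c(out, x[i])
      i <- i + 1L
    }
  }
  rawToChar(out)
}

url_decode_safe("0;%20@%gIL")  # "0; @%gIL"
url_decode_safe("%20%use")     # " %use"
```

It handles one string at a time; for a vector of user agents,
something like vapply(urls, url_decode_safe, character(1)) should do.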

Hadley

On Mon, Sep 1, 2014 at 11:02 AM, Oliver Keyes <okeyes at wikimedia.org> wrote:
> Hey all,
>
> So, I'm attempting to decode some (and I don't know why anyone did this)
> URL-encoded user agents. Running URLdecode over them generates the error:
>
> "Error in rawToChar(out) : embedded nul in string"
>
> Okay, so there's an embedded nul - fair enough. Presumably decoding the URL
> is exposing it in a format R doesn't like. Except when I try to dig down
> and work out what an encoded nul looks like, in order to simply remove them
> with something like gsub(), I end up with several different strings, all of
> which apparently resolve to an embedded nul:
>
>> URLdecode("0;%20@%gIL")
> Error in rawToChar(out) : embedded nul in string: '0; @\0L'
> In addition: Warning message:
> In URLdecode("0;%20@%gIL") :
>   out-of-range values treated as 0 in coercion to raw
>> URLdecode("%20%use")
> Error in rawToChar(out) : embedded nul in string: ' \0e'
> In addition: Warning message:
> In URLdecode("%20%use") :
>   out-of-range values treated as 0 in coercion to raw
>
> I'm a relative newb to encodings, so maybe the fault is simply in my
> understanding of how this should work, but - why are both strings being
> read as including nuls, despite having different values? And how would I go
> about removing said nuls?
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation



-- 
http://had.co.nz/
