[Rd] subRaw?

Spencer Graves spencer.graves at structuremonitoring.com
Fri Jul 20 18:22:20 CEST 2012


Hi, Hervé:


On 7/19/2012 10:19 PM, Hervé Pagès wrote:
> Hi Spencer,
>
> On 07/19/2012 08:29 PM, Spencer Graves wrote:
>> Hello, All:
>>
>>
>>        Do you know of any capability to substitute more than one byte in
>> an object of class "raw"?
>>
>>
>>        Consider the following:
>>
>>
>>  > let4 <- paste(letters[1:4], collapse='')
>>  > (let4Raw <- charToRaw(let4))
>> [1] 61 62 63 64
>>  > (let. <- sub('bc', '--', let4Raw))
>> [1] "61" "62" "63" "64"
>>  > # no substitution
>>  > (bc <- charToRaw('bc'))
>> [1] 62 63
>>  > (ef <- charToRaw('ef'))
>> [1] 65 66
>>  > (let. <- sub(bc, ef, let4Raw))
>> [1] "61" "65" "63" "64"
>> Warning messages:
>> 1: In sub(bc, ef, let4Raw) :
>>    argument 'pattern' has length > 1 and only the first element will be
>> used
>> 2: In sub(bc, ef, let4Raw) :
>>    argument 'replacement' has length > 1 and only the first element will
>> be used
>
> It makes no sense to use sub(), grep(), and family (i.e. all the stuff
> based on the regex code) *directly* on a raw vector because all these
> functions will start by coercing their 'x', 'text', 'pattern',
> 'replacement' args to character with as.character (if they are not
> already character).
>
> But the way as.character() operates on a raw vector won't give good
> results in that context. You'd rather do the coercion yourself first
> with rawToChar(), and coerce back the result with charToRaw():
>
>   > charToRaw(sub("bc", "--", rawToChar(let4Raw)))
>   [1] 61 2d 2d 64
>
> IMO it would make much more sense for sub(), grep(), and family to
> raise an error rather than blindly try to coerce to character, but
> these functions (like many functions in R) are too polite to tell the
> user s/he's doing something wrong.
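
The coercion described above can be seen directly. The short sketch below only illustrates that explanation: as.character() turns each byte of a raw vector into a two-character hex string, so the regex code is matching against "61", "62", ... rather than against "abcd". The variable names are chosen here for illustration.

```r
let4Raw <- charToRaw("abcd")

# as.character() on a raw vector gives one hex string per byte
hex <- as.character(let4Raw)
print(hex)
# [1] "61" "62" "63" "64"

# The pattern 'bc' never occurs in any of those strings: no substitution
noMatch <- sub("bc", "--", hex)

# Only the first elements of the coerced pattern and replacement are used,
# i.e. "62" and "65", which is why "b" became "e" in the example above
firstOnly <- sub("62", "65", hex)
print(firstOnly)
# [1] "61" "65" "63" "64"
```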


       Thanks for the reply.


       It sounds like you agree that a function "subRaw" to facilitate
this would be useful.  In my testing, charToRaw(sub(pattern,
replacement, rawToChar(x))) did NOT preserve binary codes that did not
correspond to legitimate characters.  I tried several things before
finding one that seemed to work.
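
For illustration only, a byte-wise substitution that avoids the round trip through rawToChar() might look like the minimal sketch below. The name subRawSketch, its argument order, and its behavior (literal match, first occurrence only) are assumptions made for this example; it is not the subRaw function mentioned in this thread. One known limitation it sidesteps: rawToChar() stops with an error when the raw vector contains an embedded nul (0x00) byte.

```r
# Hypothetical helper: replace the first occurrence of the byte sequence
# 'pattern' in the raw vector 'x' with the bytes in 'replacement'.
# Matching is literal (no regular expressions) and never leaves raw mode.
subRawSketch <- function(pattern, replacement, x) {
  stopifnot(is.raw(pattern), is.raw(replacement), is.raw(x))
  np <- length(pattern)
  nx <- length(x)
  if (np == 0L || np > nx) return(x)

  # Candidate start positions: wherever the first pattern byte occurs
  starts <- which(x[seq_len(nx - np + 1L)] == pattern[1L])
  for (s in starts) {
    if (all(x[s:(s + np - 1L)] == pattern)) {
      head <- x[seq_len(s - 1L)]                        # bytes before the match
      tail <- x[seq_len(nx - s - np + 1L) + s + np - 1L] # bytes after the match
      return(c(head, replacement, tail))
    }
  }
  x  # no match: return the input unchanged
}

bc <- charToRaw("bc")
ef <- charToRaw("ef")

res1 <- subRawSketch(bc, ef, charToRaw("abcd"))
print(res1)
# [1] 61 65 66 64

# Bytes that are not printable characters (including 0x00) pass through
binX <- as.raw(c(0x00, 0xff, 0x62, 0x63, 0x80))
res2 <- subRawSketch(bc, ef, binX)
print(res2)
# [1] 00 ff 65 66 80
```

Because the pattern and replacement may differ in length, the result is rebuilt with c() rather than assigned in place.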


       Best Wishes,
       Spencer

>
> Cheers,
> H.
>
>>
>>
>>        In this example, "b" was replaced by "e", but "bc" was not
>> replaced by "ef".  Do you know of any function to do this?
>>
>>
>>        I ask, because I need it.  I've written such a function, subRaw
>> for my own use.  If I don't hear that another exists, I plan to add the
>> one I've written to the oro.dicom package.
>>
>>
>>        Thanks,
>>        Spencer
>>
>>
>>  > sessionInfo()
>> R version 2.15.1 (2012-06-22)
>> Platform: x86_64-pc-mingw32/x64 (64-bit)
>>
>> locale:
>> [1] LC_COLLATE=English_United States.1252
>> [2] LC_CTYPE=English_United States.1252
>> [3] LC_MONETARY=English_United States.1252
>> [4] LC_NUMERIC=C
>> [5] LC_TIME=English_United States.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>


