spencer.graves at structuremonitoring.com
Fri Jul 20 18:22:20 CEST 2012
On 7/19/2012 10:19 PM, Hervé Pagès wrote:
> Hi Spencer,
> On 07/19/2012 08:29 PM, Spencer Graves wrote:
>> Hello, All:
>> Do you know of any capability to substitute more than one byte in
>> an object of class Raw?
>> Consider the following:
>> > let4 <- paste(letters[1:4], collapse='')
>> > (let4Raw <- charToRaw(let4))
>>  61 62 63 64
>> > (let. <- sub('bc', '--', let4Raw))
>>  "61" "62" "63" "64"
>> > # no substitution
>> > (bc <- charToRaw('bc'))
>>  62 63
>> > (ef <- charToRaw('ef'))
>>  65 66
>> > (let. <- sub(bc, ef, let4Raw))
>>  "61" "65" "63" "64"
>> Warning messages:
>> 1: In sub(bc, ef, let4Raw) :
>> argument 'pattern' has length > 1 and only the first element will be
>> used
>> 2: In sub(bc, ef, let4Raw) :
>> argument 'replacement' has length > 1 and only the first element will
>> be used
> It makes no sense to use sub(), grep(), and family (i.e. all the stuff
> based on the regex code) *directly* on a raw vector because all these
> functions will start by coercing their 'x', 'text', 'pattern',
> 'replacement' args to character with as.character (if they are not
> already character).
> But the way as.character() operates on a raw vector won't give good
> results in that context. You'd rather do the coercion yourself first
> with rawToChar(), and coerce back the result with charToRaw():
> > charToRaw(sub("bc", "--", rawToChar(let4Raw)))
>  61 2d 2d 64
> IMO it would make much more sense for sub(), grep(), and family to
> raise an error than to blindly try to coerce to character, but these
> functions (like many functions in R) are too polite to tell the
> user s/he's doing something wrong.
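To make the coercion described above concrete, here is a small sketch comparing what as.character() and rawToChar() return for a raw vector. The variable names (hexStrings, asText, result) are only for illustration and do not come from the original messages.

```r
let4Raw <- charToRaw("abcd")

# as.character() yields one two-digit hex string per byte, so a
# pattern such as "bc" has nothing to match against.
hexStrings <- as.character(let4Raw)

# rawToChar() yields the single string that the bytes encode.
asText <- rawToChar(let4Raw)

# Round trip through character, as suggested above.
result <- charToRaw(sub("bc", "--", asText))
print(hexStrings)
print(asText)
print(result)
```

This is why sub("bc", "--", let4Raw) leaves every element unchanged: it sees four separate hex strings, not the text "abcd".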
Thanks for the reply.
It sounds like you agree that a function "subRaw" to facilitate
this would be useful. In my testing, charToRaw(sub(pattern,
replacement, rawToChar(x))) did NOT preserve binary codes that did not
match legitimate characters. I tried several things before finding one
that seemed to work.
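
For illustration only, here is a minimal sketch of how a byte-level substitution could be written so that no conversion to character is needed. It is not the subRaw function mentioned in this thread; the name subRawSketch and its behaviour (replace the first match only, all three arguments raw vectors) are assumptions made for this example.

```r
# Hypothetical helper: replace the first occurrence of the raw vector
# 'pattern' in 'x' with the raw vector 'replacement', comparing bytes
# directly instead of going through rawToChar().
subRawSketch <- function(pattern, replacement, x) {
  stopifnot(is.raw(pattern), is.raw(replacement), is.raw(x))
  np <- length(pattern)
  nx <- length(x)
  if (np == 0L || np > nx) return(x)
  # Candidate starts: positions where the first byte matches and the
  # whole pattern still fits inside 'x'.
  starts <- which(x[seq_len(nx - np + 1L)] == pattern[1L])
  for (s in starts) {
    if (all(x[s:(s + np - 1L)] == pattern)) {
      head <- x[seq_len(s - 1L)]
      tail <- x[seq_len(nx - s - np + 1L) + s + np - 1L]
      return(c(head, replacement, tail))
    }
  }
  x  # no match: return the input unchanged
}

# Example input containing bytes that are awkward as text (0xff, 0x00).
x <- as.raw(c(0x61, 0x62, 0x63, 0xff, 0x00, 0x64))
out <- subRawSketch(charToRaw("bc"), charToRaw("ef"), x)
noMatch <- subRawSketch(charToRaw("zz"), charToRaw("ef"), x)
print(out)
```

Working on the bytes themselves sidesteps one known limitation of the character round trip: rawToChar() stops with an error when the input contains an embedded 0x00 byte.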
>> In this example, "b" was replaced by "e", but "bc" was not
>> replaced by "ef"? Do you know of any function to do this?
>> I ask, because I need it. I've written such a function, subRaw
>> for my own use. If I don't hear that another exists, I plan to add the
>> one I've written to the oro.dicom package.
>> > sessionInfo()
>> R version 2.15.1 (2012-06-22)
>> Platform: x86_64-pc-mingw32/x64 (64-bit)
>>  LC_COLLATE=English_United States.1252
>>  LC_CTYPE=English_United States.1252
>>  LC_MONETARY=English_United States.1252
>>  LC_NUMERIC=C
>>  LC_TIME=English_United States.1252
>> attached base packages:
>>  stats graphics grDevices utils datasets methods base