[R] strange strsplit gsub problem 0 is this a bug or a string length limitation?

tradenet nodecorum at yahoo.com
Fri Jul 10 16:07:50 CEST 2009


Thanks Marc!

I just found the ~500 character limitation myself, via an online search for
the specs of the formula class.
The rmetrics library I'm using gets its character array of assets by
parsing a formula passed as an input parameter to the portfolioBacktest
function.  Can I copy the portfolioBacktest function from the source, call
it portfolioBacktest_hack, add an additional parameter, an array of asset
names, and have my version use this argument instead of parsing the formula?
I'm fairly new to R, so I don't know whether R will find my function, or
whether my function will find the other fPortfolio functions that may be
referenced by the original, non-"_hack" version of the function.
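
To make the idea concrete, here is a rough sketch of the name handling I have
in mind. The helper name parseBacktestFormula and its assetsNames argument are
my own placeholders, not anything from fPortfolio, and I am assuming that
all.vars() from base R is an acceptable way to read the names off the formula
without going through as.character():

```r
# Hypothetical helper, not part of fPortfolio: returns the benchmark and the
# asset names, preferring an explicit vector over parsing the formula text.
parseBacktestFormula <- function(formula, assetsNames = NULL) {
  # The left-hand side of the formula is the benchmark, e.g. SPX
  benchmarkName <- all.vars(formula[[2]])
  if (is.null(assetsNames)) {
    # all.vars() walks the parsed expression itself, so no long string is
    # built from the formula and nothing is cut off
    assetsNames <- all.vars(formula[[3]])
  }
  list(benchmark = benchmarkName, assets = assetsNames)
}

# Build the 113-asset formula programmatically instead of typing it out
assets <- paste0("A", 1:113)
backtestFormula <- reformulate(assets, response = "SPX")

spec <- parseBacktestFormula(backtestFormula)
length(spec$assets)
```

If that is sound, my modified copy could either take assetsNames directly or
fall back to the formula as above.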

Warm regards,

Andrew
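
P.S. On whether a pasted-in copy can see the package's other functions: my
tentative understanding, from a toy experiment and not from the fPortfolio
sources, is that a function I define at the prompt looks names up starting
from my workspace, so exported functions are found once the package is
attached, but internal ones are not. Resetting the copy's environment seems to
fix that. In the sketch below pkgEnv and .internalHelper are made-up stand-ins
for the package namespace and one of its internal functions; with the real
package I assume the environment would be asNamespace("fPortfolio").

```r
# Toy stand-in for a package namespace; with the real package this would be
# asNamespace("fPortfolio") instead (assumption: fPortfolio is installed).
pkgEnv <- new.env()
assign(".internalHelper", function(x) x * 2, envir = pkgEnv)

# A pasted-in, modified copy defined at top level: its enclosing environment
# is where it was defined, so the internal helper is not visible from it.
portfolioBacktest_hack <- function(x, extra = 0) .internalHelper(x) + extra

# Calling it now fails because .internalHelper cannot be found
before <- tryCatch(portfolioBacktest_hack(1), error = function(e) "not found")

# Re-home the copy so that name lookup starts inside the (toy) namespace
environment(portfolioBacktest_hack) <- pkgEnv
after <- portfolioBacktest_hack(1, extra = 10)
```

Here before holds the fallback string from the failed call, and after is
computed normally once the environment has been reset.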


Marc Schwartz-3 wrote:
> 
> On Jul 10, 2009, at 7:18 AM, tradenet wrote:
> 
>>
>> I was working with the rmetrics portfolioBacktesting function and dug into
>> the code to try to find why my formula with 113 items, i.e. A1 thru A113,
>> was being truncated and I only get 85 items, not 113.
>>
>> Is it due to a string length limitation in R or is it a bug in the strsplit
>> or gsub functions, or in my string?
>>
>> I'd very much appreciate any suggestions
>>
>>
>> ============Input script:
>>
>> backtestFormula<- 
>> SPX~A1+A2+A3+A4+A5+A6+A7+A8+A9+A10+A11+A12+A13+A14+A15+A16+A17+A18+A19+A20+A21+A22+A23+A24+A25+A26+A27+A28+A29+A30+A31+A32+A33+A34+A35+A36+A37+A38+A39+A40+A41+A42+A43+A44+A45+A46+A47+A48+A49+A50+A51+A52+A53+A54+A55+A56+A57+A58+A59+A60+A61+A62+A63+A64+A65+A66+A67+A68+A69+A70+A71+A72+A73+A74+A75+A76+A77+A78+A79+A80+A81+A82+A83+A84+A85+A86+A87+A88+A89+A90+A91+A92+A93+A94+A95+A96+A97+A98+A99+A100+A101+A102+A103+A104+A105+A106+A107+A108+A109+A110+A111+A112+A113
>> benchmarkName = as.character(backtestFormula)[2]
>> print(as.character(backtestFormula)[3])
>> print(benchmarkName)
>>    assetsNames <- strsplit(gsub(" ", "", as.character(backtestFormula)[3]), "\\+")[[1]]
>>    nAssets = length(assetsNames)
>> print(nAssets)
>> list(assetsNames)
>>
>> ===============output:
>>
>>
>>> backtestFormula<- 
>>> SPX~A1+A2+A3+A4+A5+A6+A7+A8+A9+A10+A11+A12+A13+A14+A15+A16+A17+A18+A19+A20+A21+A22+A23+A24+A25+A26+A27+A28+A29+A30+A31+A32+A33+A34+A35+A36+A37+A38+A39+A40+A41+A42+A43+A44+A45+A46+A47+A48+A49+A50+A51+A52+A53+A54+A55+A56+A57+A58+A59+A60+A61+A62+A63+A64+A65+A66+A67+A68+A69+A70+A71+A72+A73+A74+A75+A76+A77+A78+A79+A80+A81+A82+A83+A84+A85+A86+A87+A88+A89+A90+A91+A92+A93+A94+A95+A96+A97+A98+A99+A100+A101+A102+A103+A104+A105+A106+A107+A108+A109+A110+A111+A112+A113
>>
>>> benchmarkName = as.character(backtestFormula)[2]
>>
>>> print(benchmarkName)
>> [1] "SPX"
>>
>>> print(as.character(backtestFormula)[3])
>> [1] "A1 + A2 + A3 + A4 + A5 + A6 + A7 + A8 + A9 + A10 + A11 + A12 + A13 +
>> A14 + A15 + A16 + A17 + A18 + A19 + A20 + A21 + A22 + A23 + A24 + A25 + A26
>> + A27 + A28 + A29 + A30 + A31 + A32 + A33 + A34 + A35 + A36 + A37 + A38 +
>> A39 + A40 + A41 + A42 + A43 + A44 + A45 + A46 + A47 + A48 + A49 + A50 + A51
>> + A52 + A53 + A54 + A55 + A56 + A57 + A58 + A59 + A60 + A61 + A62 + A63 +
>> A64 + A65 + A66 + A67 + A68 + A69 + A70 + A71 + A72 + A73 + A74 + A75 + A76
>> + A77 + A78 + A79 + A80 + A81 + A82 + A83 + A84 + A85 + "
>>
>>> assetsNames <- strsplit(gsub(" ", "", as.character(backtestFormula)[3]), "\\+")[[1]]
>>
>>> print(nAssets)
>> [1] 85
>>
>>> nAssets = length(assetsNames)
>>
>>> print(nAssets)
>> [1] 85
>>
>>> list(assetsNames)
>> [[1]]
>> [1] "A1"  "A2"  "A3"  "A4"  "A5"  "A6"  "A7"  "A8"  "A9"  "A10" "A11" "A12" "A13" "A14" "A15" "A16" "A17" "A18" "A19" "A20" "A21" "A22" "A23" "A24" "A25" "A26" "A27" "A28" "A29" "A30" "A31" "A32" "A33"
>> [34] "A34" "A35" "A36" "A37" "A38" "A39" "A40" "A41" "A42" "A43" "A44" "A45" "A46" "A47" "A48" "A49" "A50" "A51" "A52" "A53" "A54" "A55" "A56" "A57" "A58" "A59" "A60" "A61" "A62" "A63" "A64" "A65" "A66"
>> [67] "A67" "A68" "A69" "A70" "A71" "A72" "A73" "A74" "A75" "A76" "A77" "A78" "A79" "A80" "A81" "A82" "A83" "A84" "A85"
> 
> 
> 
> You appear to be bumping up against the 500 character length limit of  
> as.character() when used with R language objects.
> 
> Review the Note in ?as.character:
> 
>    "as.character truncates components of language objects to 500  
> characters (was about 70 before 1.3.1)."
> 
> 
> 
> It is not a string length limitation or a bug in strsplit():
> 
>  > paste("A", 1:113, sep = "", collapse = " + ")
> [1] "A1 + A2 + A3 + A4 + A5 + A6 + A7 + A8 + A9 + A10 + A11 + A12 +  
> A13 + A14 + A15 + A16 + A17 + A18 + A19 + A20 + A21 + A22 + A23 + A24  
> + A25 + A26 + A27 + A28 + A29 + A30 + A31 + A32 + A33 + A34 + A35 +  
> A36 + A37 + A38 + A39 + A40 + A41 + A42 + A43 + A44 + A45 + A46 + A47  
> + A48 + A49 + A50 + A51 + A52 + A53 + A54 + A55 + A56 + A57 + A58 +  
> A59 + A60 + A61 + A62 + A63 + A64 + A65 + A66 + A67 + A68 + A69 + A70  
> + A71 + A72 + A73 + A74 + A75 + A76 + A77 + A78 + A79 + A80 + A81 +  
> A82 + A83 + A84 + A85 + A86 + A87 + A88 + A89 + A90 + A91 + A92 + A93  
> + A94 + A95 + A96 + A97 + A98 + A99 + A100 + A101 + A102 + A103 + A104  
> + A105 + A106 + A107 + A108 + A109 + A110 + A111 + A112 + A113"
> 
> 
>  > nchar(paste("A", 1:113, sep = "", collapse = " + "))
> [1] 680
> 
> 
>  > strsplit(paste("A", 1:113, sep = "", collapse = " + "), " \\+ ")[[1]]
>    [1] "A1"   "A2"   "A3"   "A4"   "A5"   "A6"   "A7"   "A8"   "A9"
>   [10] "A10"  "A11"  "A12"  "A13"  "A14"  "A15"  "A16"  "A17"  "A18"
>   [19] "A19"  "A20"  "A21"  "A22"  "A23"  "A24"  "A25"  "A26"  "A27"
>   [28] "A28"  "A29"  "A30"  "A31"  "A32"  "A33"  "A34"  "A35"  "A36"
>   [37] "A37"  "A38"  "A39"  "A40"  "A41"  "A42"  "A43"  "A44"  "A45"
>   [46] "A46"  "A47"  "A48"  "A49"  "A50"  "A51"  "A52"  "A53"  "A54"
>   [55] "A55"  "A56"  "A57"  "A58"  "A59"  "A60"  "A61"  "A62"  "A63"
>   [64] "A64"  "A65"  "A66"  "A67"  "A68"  "A69"  "A70"  "A71"  "A72"
>   [73] "A73"  "A74"  "A75"  "A76"  "A77"  "A78"  "A79"  "A80"  "A81"
>   [82] "A82"  "A83"  "A84"  "A85"  "A86"  "A87"  "A88"  "A89"  "A90"
>   [91] "A91"  "A92"  "A93"  "A94"  "A95"  "A96"  "A97"  "A98"  "A99"
> [100] "A100" "A101" "A102" "A103" "A104" "A105" "A106" "A107" "A108"
> [109] "A109" "A110" "A111" "A112" "A113"
> 
> HTH,
> 
> Marc Schwartz
> 
> 
> 




