[R] StrSplit

Jeffrey Spies jspies at virginia.edu
Sat Oct 9 18:46:03 CEST 2010


Jim's solution is the ideal way to read in the data: using the sep=";"
argument in read.table.
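
For reference, here is a minimal sketch of that approach when the lines
are already in a character vector. The name x follows Jim's message
quoted below; header = TRUE and stringsAsFactors = FALSE are
assumptions added here and are not part of his code:

```r
# Header line plus two records, copied from the thread.
x <- c(
  "Scheme Code;Scheme Name;Net Asset Value;Repurchase Price;Sale Price;Date",
  "106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.0000;1001.0000;1001.0000;02-Oct-2010",
  "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010"
)

# Treat the character vector as if it were a file.
con <- textConnection(x)
myData <- read.table(con, sep = ";", header = TRUE, stringsAsFactors = FALSE)
close(con)  # close only this connection rather than all of them

str(myData)
```

With header = TRUE the column names come from the first line, and
read.table turns the spaces in them into dots (Scheme.Code and so on).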

However, if for some reason you already have a vector of strings like
the following (maybe someone gave you an .RData file instead of the raw
data file):

MF_Data <- c(
  "106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.0000;1001.0000;1001.0000;02-Oct-2010",
  "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010")

Then you can use this to get a data frame:

as.data.frame(do.call(rbind,
    lapply(MF_Data, function(x) unlist(strsplit(x, ';')))))
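
Note that the call above returns every column as text (as factors,
under the as.data.frame defaults at the time of writing). If you want
numeric columns, something along these lines may help. This is only a
sketch: the column names are hypothetical labels based on the header
line quoted below, and type.convert is one of several ways to do the
conversion:

```r
# Two records in the same shape as MF_Data (values copied from the thread).
MF_Data <- c(
  "106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.0000;1001.0000;1001.0000;02-Oct-2010",
  "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010"
)

# strsplit() is vectorised, so it returns one character vector per record.
fields <- strsplit(MF_Data, ";", fixed = TRUE)

# Stack the records into a character matrix, then convert to a data frame.
df <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)

# Hypothetical column names, loosely based on the header line in the thread.
names(df) <- c("SchemeCode", "SchemeName", "NAV",
               "RepurchasePrice", "SalePrice", "Date")

# Every column is character here; convert the numeric-looking ones.
df[] <- lapply(df, type.convert, as.is = TRUE)

str(df)
```

The Date column stays as character in this sketch; converting it with
as.Date would need a format string that matches "02-Oct-2010".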

Cheers,

Jeff.

On Sat, Oct 9, 2010 at 12:30 PM, jim holtman <jholtman at gmail.com> wrote:
> Is this what you are after:
>
>> x <- c("Scheme Code;Scheme Name;Net Asset Value;Repurchase Price;Sale Price;Date"
> + , ""
> +  ,"Open Ended Schemes ( Liquid )"
> + , ""
> + , ""
> + , "AIG Global Investment Group Mutual Fund"
> + , "106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.0000;1001.0000;1001.0000;02-Oct-2010"
> + , "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010"
> + , "106507;AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option;1001.8765;1001.8765;1001.8765;02-Oct-2010"
> + , "106503;AIG India Liquid Fund-Retail Plan-DailyDividend Option;1001.0000;1001.0000;1001.0000;02-Oct-2010")
>>
>> myData <- read.table(textConnection(x[7:10]), sep=';')
>> closeAllConnections()
>> str(myData)
> 'data.frame':   4 obs. of  6 variables:
>  $ V1: int  106506 106511 106507 106503
>  $ V2: Factor w/ 4 levels "AIG India Liquid Fund-Institutional Plan-Daily Dividend Option",..: 1 2 3 4
>  $ V3: num  1001 1210 1002 1001
>  $ V4: num  1001 1210 1002 1001
>  $ V5: num  1001 1210 1002 1001
>  $ V6: Factor w/ 1 level "02-Oct-2010": 1 1 1 1
>> myData
>       V1                                                              V2       V3       V4       V5          V6
> 1 106506  AIG India Liquid Fund-Institutional Plan-Daily Dividend Option 1001.000 1001.000 1001.000 02-Oct-2010
> 2 106511          AIG India Liquid Fund-Institutional Plan-Growth Option 1210.461 1210.461 1210.461 02-Oct-2010
> 3 106507 AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option 1001.876 1001.876 1001.876 02-Oct-2010
> 4 106503          AIG India Liquid Fund-Retail Plan-DailyDividend Option 1001.000 1001.000 1001.000 02-Oct-2010
>>
>>
>
>
> On Sat, Oct 9, 2010 at 12:18 PM, Santosh Srinivas
> <santosh.srinivas at gmail.com> wrote:
>> Newbie question ...
>>
>> I am looking for something equivalent to read.delim, but which accepts a line of text as a parameter instead of a file.
>>
>> Below is my problem: I'm unable to get the output I want, which is a simple data frame of the delimited data. I am coming quite close, though.
>>
>> I have a character vector with 10 lines called MF_Data
>>> MF_Data [1:10]
>>  [1] "Scheme Code;Scheme Name;Net Asset Value;Repurchase Price;Sale Price;Date"
>>  [2] ""
>>  [3] "Open Ended Schemes ( Liquid )"
>>  [4] ""
>>  [5] ""
>>  [6] "AIG Global Investment Group Mutual Fund"
>>  [7] "106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.0000;1001.0000;1001.0000;02-Oct-2010"
>>  [8] "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010"
>>  [9] "106507;AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option;1001.8765;1001.8765;1001.8765;02-Oct-2010"
>> [10] "106503;AIG India Liquid Fund-Retail Plan-DailyDividend Option;1001.0000;1001.0000;1001.0000;02-Oct-2010"
>>
>>
>> Now the lines below are delimited by ; ... I am using
>>
>>  tempTxt <- MF_Data[7]
>>  MF_Data_F <-   unlist(strsplit(tempTxt,";", fixed = TRUE))
>>  tempTxt <- MF_Data[8]
>>  MF_Data_F1 <-  unlist(strsplit(tempTxt,";", fixed = TRUE))
>>  MF_Data_F <- rbind(MF_Data_F,MF_Data_F1)
>>
>> But MF_Data_F is not a simple 2x6 data frame, which is what I want.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>
>


