[R] newbie - read.csv creates a (data.frame, table, array, matrix, ...) and plotting one column

Daniel Nordlund djnordlund at verizon.net
Mon Jun 29 06:51:06 CEST 2009


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Mark Knecht
> Sent: Sunday, June 28, 2009 7:32 PM
> To: Gabor Grothendieck
> Cc: r-help at r-project.org
> Subject: Re: [R] newbie - read.csv creates a (data.frame, 
> table, array, matrix, ...) and plotting one column
> 
> Thank you VERY much Gabor. I learned a LOT from this little post:
> 
> 1) I can read/modify/write data in the array using the sub function
> 2) I can address columns using $COLUMN_HEADER
> 3) I can do an XY plot using Y ~ X
> 
> Brilliant! Thanks!
> 
> Small result PDF attached.
> 
> Cheers,
> Mark
> 
> On Sun, Jun 28, 2009 at 7:04 PM, Gabor
> Grothendieck<ggrothendieck at gmail.com> wrote:
> > Try removing double quotes and commas from T1$EQUITY
> > and then convert it to numeric:
> >
> > T1$EQUITY <- as.numeric(sub('[",]', '', T1$EQUITY))
> > plot(EQUITY ~ TRADE, T1)
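
A side note on the line above, in case it helps: read.csv has already stripped the surrounding double quotes, so the only character left to remove is the thousands comma, and sub() replaces just the first match. That is enough for the values in this file, but a value with two commas would come through as NA. The sketch below uses gsub() instead, and also shows that converting a factor directly returns its level codes, which would be consistent with the 1-10 range in the original plot. The three-row data set and the names csv_text, codes, equity and equity_sub are made up for illustration, the third value is hypothetical and not in the posted file, and the code assumes an R version whose read.csv accepts the text= argument.

```r
# Hypothetical stand-in for a few rows of the TRADE and EQUITY columns.
# The third value is invented to show what happens with two commas.
csv_text <- 'TRADE,EQUITY
1,"9,854.00"
2,"10,728.00"
3,"1,234,567.00"'

# stringsAsFactors = TRUE mimics the factor column shown in this thread.
T1 <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = TRUE)

# Converting a factor directly yields its integer level codes,
# not the numbers that are printed.
codes <- as.numeric(T1$EQUITY)

# Go through character, strip every comma, then convert.
equity <- as.numeric(gsub(",", "", as.character(T1$EQUITY), fixed = TRUE))

# sub() removes only the first comma, so the two-comma value becomes NA.
equity_sub <- suppressWarnings(
  as.numeric(sub(",", "", as.character(T1$EQUITY), fixed = TRUE))
)

print(codes)
print(equity)
print(equity_sub)
```

With the real data, the same gsub() line followed by plot(EQUITY ~ TRADE, T1) should behave like the call quoted above.
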
> >
> >
> > On Sun, Jun 28, 2009 at 9:14 PM, Mark Knecht<markknecht at gmail.com> wrote:
> >> On Sun, Jun 28, 2009 at 4:02 PM, Mark Knecht<markknecht at gmail.com> wrote:
> >>> Hi all,
> >>>   Newbie alert.
> >> <SNIP>
> >>>
> >>>   The second question is about plotting one column from a data set.
> >>> I've used read.csv to read in something called PFA_VWAP. row.names,
> >>> names and dim all return sensible values. The command
> >>> PFA_VWAP_Equity<-PFA_VWAP[,10] seems to load the new variable with the
> >>> right data for the equity curve, but how do I plot it?
> >>> plot(PFA_VWAP_Equity) gives me a chart that doesn't make sense to me.
> >>> Note however that the original values that make up column 10 were
> >>> surrounded by quotes so I'm not sure how to tell what sort of data R
> >>> thinks is in column 10. Is it numeric or text? I think it's integer
> >>> from the last command.
> >> <SNIP>
> >>
> >> So I've tried to boil this down to something simple others can try
> >> and duplicate my confusion, or just see what's going on from this
> >> post.
> >>
> >> Basically I'm trying to get a handle on the difference between a list
> >> and an array, and how to take my data read with read.csv and plot it
> >> on a scatter chart.
> >>
> >> 1) Here's a small portion of one of my data files. I put this at C:\Test1.csv
> >>
> >> PORTFOLIO EQUITY TABLE
> >> TRADE,MARK-SYS,DATE/TIME,PL/SIZE,PS METHOD,POS SIZE,POS PL,DRAWDOWN,DRAWDOWN(%),EQUITY
> >> 1,1,1/9/2004 1:11:00 PM,-146.00,As Given,1,-146.00,146.00,1.460,"9,854.00"
> >> 2,1,1/12/2004 1:11:00 PM,874.00,As Given,1,874.00,0.00,0,"10,728.00"
> >> 3,1,1/13/2004 1:11:00 PM,224.00,As Given,1,224.00,0.00,0,"10,952.00"
> >> 4,1,1/28/2004 12:28:00 PM,-626.00,As Given,1,-626.00,626.00,5.716,"10,326.00"
> >> 5,1,2/9/2004 1:11:00 PM,64.00,As Given,1,64.00,562.00,5.131,"10,390.00"
> >> 6,1,2/13/2004 1:11:00 PM,-116.00,As Given,1,-116.00,678.00,6.191,"10,274.00"
> >> 7,1,2/20/2004 1:11:00 PM,364.00,As Given,1,364.00,314.00,2.867,"10,638.00"
> >> 8,1,2/23/2004 11:23:00 AM,-626.00,As Given,1,-626.00,940.00,8.583,"10,012.00"
> >> 9,1,2/24/2004 1:11:00 PM,114.00,As Given,1,114.00,826.00,7.542,"10,126.00"
> >> 10,1,2/25/2004 1:11:00 PM,444.00,As Given,1,444.00,382.00,3.488,"10,570.00"
> >>
> >> (In case of line breakage everything from TRADE to EQUITY is on line 2
> >> and there are 10 lines that follow making 12 lines total.)
> >>
> >> 2) I read this into R creating T1 using the command
> >>
> >> T1<-read.csv("C:\\Test1.csv",skip=1,header=TRUE)
> >>
> >> 3) If I type T1 and hit return then I see the data.
> >>
> >> 4) dim(T1) says 10 10 which is correct.
> >>
> >> 5) I can read the two columns I want to use to create the scatter plot using:
> >>
> >> > T1[,1]
> >>  [1]  1  2  3  4  5  6  7  8  9 10
> >> > T1[,10]
> >>  [1] 9,854.00  10,728.00 10,952.00 10,326.00 10,390.00 10,274.00 10,638.00 10,012.00 10,126.00 10,570.00
> >> Levels: 10,012.00 10,126.00 10,274.00 10,326.00 10,390.00 10,570.00 10,638.00 10,728.00 10,952.00 9,854.00
> >> >
> >>
> >> Now, here's the confusion. plot(T1[,1],T1[,10]) creates a plot, but
> >> the range on both X & Y is 1-10. I want 1-10 on the X axis but need
> >> the values in the first line of the T1[,10] return as the Y axis.
> >>
> >> How can I create that scatter plot?
> >>
> >> Thanks,
> >> Mark
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> 

Mark,

I'm glad you are making progress.  However, you will want to keep reading up on matrices, arrays, lists, and data frames.  What read.csv creates is a data frame, which is not the same as a matrix or an array: a data frame is a list of equal-length columns, each of which can have its own type, whereas a matrix or array holds values of a single type.  As you move forward, you will want to keep these structures separate in your mind.
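
To make the distinction concrete, here is a minimal sketch of how you might check what kind of object you have. The two-column T1 below is a made-up stand-in for your data (built with data.frame rather than read.csv so it runs on its own), and col_classes and M are names I chose for illustration.

```r
# Hypothetical two-column stand-in for the T1 data frame in this thread.
# stringsAsFactors = TRUE mimics the factor column reported above.
T1 <- data.frame(TRADE = 1:3,
                 EQUITY = c("9,854.00", "10,728.00", "10,952.00"),
                 stringsAsFactors = TRUE)

print(class(T1))      # "data.frame"
print(is.list(T1))    # TRUE: a data frame is a list of columns
print(is.matrix(T1))  # FALSE

# Each column keeps its own type.
col_classes <- sapply(T1, class)
print(col_classes)

# A matrix holds a single type, so coercing this data frame
# turns every cell into a character string.
M <- as.matrix(T1)
print(typeof(M))      # "character"

# str() gives a compact overview of the whole structure.
str(T1)
```

Running str() or sapply(T1, class) on the real T1 right after read.csv is a quick way to see whether a column came in as numeric, character, or factor before you try to plot it.
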

Best of luck,

Dan

Daniel Nordlund
Bothell, WA USA  



