[R] Quick R syntax question

Luke Miller millerlp at gmail.com
Mon Jun 20 18:35:11 CEST 2011


The quotes around 'Major.Gleason' and 'Minor.Gleason' are required for
accessing data frame columns by name. Alternatively, you could refer to
the columns by number if you're sure you know which column is which:

> output = paste(df[ ,1], df[ ,2], sep = '+')

The quotes are simply a requirement for selecting things by name (as
text strings) when using [ ] bracket notation. For instance, if you
wanted to print the contents of the 'Major.Gleason' column to your
terminal, you could do this:

> df[ ,'Major.Gleason']
[1] 4 5 2 3

or do it this way:

> df[ ,1]
[1] 4 5 2 3
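
As a small sketch of why the name is usually the safer handle, the
snippet below rebuilds the example data frame from earlier in this
thread and checks that both forms return the same vector. The df2
object and the reordering step are my own additions for illustration
and are not part of the original question.

```r
# Rebuild the example data frame used earlier in this thread
df <- data.frame(Major.Gleason = c(4, 5, 2, 3),
                 Minor.Gleason = c(3, 2, 4, 3))

# names() shows which position each column occupies
names(df)

# Access by name and by position return the same numeric vector
by_name   <- df[ , 'Major.Gleason']
by_number <- df[ , 1]
identical(by_name, by_number)                 # TRUE

# Hypothetical reordering: the name still finds the right column,
# but position 1 now holds the Minor.Gleason values
df2 <- df[ , c('Minor.Gleason', 'Major.Gleason')]
identical(df2[ , 'Major.Gleason'], by_name)   # TRUE
identical(df2[ , 1], by_name)                 # FALSE
```

So the numeric form is fine for a quick look, but it silently picks up
the wrong data if the column order ever changes.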

As you can see, the quotes around Major/Minor Gleason don't really
have anything to do with the paste() function; they have everything to
do with extracting the desired data from the data frame columns so that
paste() can go to work on the data.
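
To make that distinction concrete, here is a short sketch of what
happens with and without the quotes. It assumes a fresh workspace with
no variable called Major.Gleason; the tryCatch() wrapper and the
col_name variable are only there for illustration.

```r
# Rebuild the example data frame used earlier in this thread
df <- data.frame(Major.Gleason = c(4, 5, 2, 3),
                 Minor.Gleason = c(3, 2, 4, 3))

# Quoted: the string is matched against the column names of df
major <- df[ , 'Major.Gleason']

# Unquoted: R evaluates Major.Gleason as a variable name. No such object
# exists in the workspace, so the indexing fails before paste() is reached.
result <- tryCatch(df[ , Major.Gleason],
                   error = function(e) conditionMessage(e))
print(result)

# A variable that holds the column name needs no quotes at the point of use
col_name <- 'Minor.Gleason'
minor <- df[ , col_name]

# The $ operator takes the column name without quotes
identical(df$Major.Gleason, major)   # TRUE

# paste() only ever sees the two extracted vectors
output <- paste(major, minor, sep = '+')
print(output)
```

The error comes from the bracket indexing step, which is why the same
quotes are needed even when paste() is not involved at all.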

On Mon, Jun 20, 2011 at 12:21 PM, Ben Ganzfried <ben.ganzfried at gmail.com> wrote:
> Thanks!  Very glad you pointed me to the paste function, it looks very
> helpful.
>
> I have a quick follow-up after reading through the online tutorial on the
> "paste" function:
>
> Why do we need quotation marks around "Major Gleason" and "Minor Gleason"
> in: output = paste(df [,'Major.Gleason'],  df[ ,'Minor.Gleason'], sep='+')?
> The "paste" function is going to concatenate the first and second parameters
> and separate them by the "+" sign, so I'm not clear why we need to put
> quotation marks around the dataframe column headers...
>
> Thanks,
>
> Ben
>
>
> On Mon, Jun 20, 2011 at 11:58 AM, David Winsemius <dwinsemius at comcast.net>
> wrote:
>>
>> On Jun 20, 2011, at 11:47 AM, Luke Miller wrote:
>>
>>> If we assume that your data are in a data frame (data.frame() converts
>>> spaces in column names to periods by default, hence the periods in the
>>> call below):
>>>
>>>> df = data.frame(Major.Gleason = c(4,5,2,3), Minor.Gleason = c(3,2,4,3))
>>>
>>> You can paste together the contents of the two columns with a plus
>>> sign in between using the paste() function. The sep='' option at the
>>> end of the function call specifies that no spaces should be included
>>> between pasted items.
>>>
>>>> output = paste(as.character(df [,'Major.Gleason']), '+',
>>>> as.character(df[ ,'Minor.Gleason']), sep='')
>>
>> I do not think the as.character is needed. Coercion to character is
>> implicit in the use of paste(). And the sep argument could be "+".
>>
>> output = paste(df [,'Major.Gleason'],  df[ ,'Minor.Gleason'], sep='+')
>>
>> --
>> David.
>>
>>>
>>> The new object 'output' is a character vector containing the 4 strings
>>> you're after:
>>>
>>>> print(output)
>>>
>>> [1] "4+3" "5+2" "2+4" "3+3"
>>>
>>>
>>> On Mon, Jun 20, 2011 at 11:31 AM, Ben Ganzfried <ben.ganzfried at gmail.com>
>>> wrote:
>>>>
>>>> Hi --
>>>>
>>>> I had a pretty quick R question since unfortunately I have not been
>>>> able to find an answer on Google.  It shouldn't take much more than a
>>>> minute to answer.
>>>>
>>>> I'm trying to add up the major gleason grade and minor gleason grade
>>>> for an analysis of patients with prostate cancer.  One column has
>>>> values under "Major Gleason" and another column has values under
>>>> "Minor Gleason."  For example,
>>>> Major Gleason     Minor Gleason
>>>> 4                         3
>>>> 5                         2
>>>> 2                         4
>>>> 3                         3
>>>>
>>>> I want my output to be:
>>>> "4+3"
>>>> "5+2"
>>>> "2+4"
>>>> "3+3"
>>>>
>>>> The quasi-pseudocode in Java is basically:
>>>>
>>>> major = column$majorGleason
>>>> minor = column$minorGleason
>>>> for item in len(Major Gleason) {
>>>>  string s = major(item) "+" minor(item);
>>>> }
>>>> return s;
>>>>
>>>> But trying the same idea in R:
>>>>
>>>> string <- major "+" minor
>>>>
>>>> gives me an error: "unexpected string constant in..."
>>>>
>>>> I would greatly appreciate any help.
>>>>
>>>> Thanks,
>>>>
>>>> Ben
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>>
>>> --
>>> ___________________________
>>> Luke Miller
>>> Postdoctoral Researcher
>>> Marine Science Center
>>> Northeastern University
>>> Nahant, MA
>>> (781) 581-7370 x318
>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>
>



-- 
___________________________
Luke Miller
Postdoctoral Researcher
Marine Science Center
Northeastern University
Nahant, MA
(781) 581-7370 x318


