[R] Basic structure operations doubt

Joshua Wiley jwiley.psych at gmail.com
Mon Oct 18 19:53:45 CEST 2010


Hi,

Well, I am not completely sure of the R gurus' reasons for what they do,
but one explanation is that data is not thrown away unless you ask for
it to be.  Factors are categorical variables, and each level can have
meaning even when there are no cases in it (or particularly when there
are no cases in it).  You might have adults split into three age
categories, [18, 25), [25, 40), [40, 80], and then look at how many
survive a severe car crash (live vs. die) for those wearing a seat
belt and those without a seat belt:

        No Seat Belt
         [18, 25)   [25, 40)   [40, 80]
Live         0          0          0
Die         10          0          0

        Seat Belt
         [18, 25)   [25, 40)   [40, 80]
Live         0          9          5
Die          0          1          5


or should it be (Live and other ages dropped because no one is in them):

        No Seat Belt
         [18, 25)
Die       10
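
To see the difference in R itself, here is a minimal sketch.  The age
and outcome vectors are made up to match the No Seat Belt table, so
treat the names and counts as assumptions for illustration only:

```r
# Hypothetical data: ten people aged [18, 25) without a seat belt,
# none of whom survived.
age <- factor(rep("[18, 25)", 10),
              levels = c("[18, 25)", "[25, 40)", "[40, 80]"))
outcome <- factor(rep("Die", 10), levels = c("Live", "Die"))

# table() keeps every declared level, so the empty cells appear as zeros.
full <- table(outcome, age)
print(full)

# droplevels() discards the unused levels, leaving only the occupied cell.
reduced <- table(droplevels(outcome), droplevels(age))
print(reduced)
```

The first table still tells you that nobody lived and that the older
groups were empty; the second one has lost that information.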


As for your other question about assigning colNames, I am not completely
sure how you want to "assign" that one-row data frame to your other data
frame.  Do you want to add it as a row?  Do you want to replace the other
data frame's column names?  Do you want to overwrite a particular row?
See ?rbind to bind two or more data frames or matrices together by rows,
or ?colnames to see how to set a data frame's column names.
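
If what you are after is the column names, something along these lines
might do it.  This is only a sketch: the small NIFTY_INDX below is a
made-up stand-in with two columns (V1 and X are my own placeholder
names), and I am guessing at what you actually want:

```r
# Hypothetical stand-in for the imported sheet; the real header is in row 2.
NIFTY_INDX <- data.frame(
  V1 = c("", "Company Name", "", "ACC Ltd.", "Axis Bank Ltd."),
  X  = c("", "Industry", "", "CEMENT AND CEMENT PRODUCTS", "BANKS"),
  stringsAsFactors = TRUE  # mimic the factor columns shown in your post
)

Indx_Constituents <- NIFTY_INDX[4:NROW(NIFTY_INDX), ]

# as.character() on each factor column returns the label, not the integer code.
new_names <- vapply(NIFTY_INDX[2, ], as.character, character(1),
                    USE.NAMES = FALSE)
colnames(Indx_Constituents) <- new_names

# Drop the levels left over from the header and blank rows.
Indx_Constituents <- droplevels(Indx_Constituents)
str(Indx_Constituents)
```

Names containing spaces, such as "Company Name", then have to be
accessed with [["Company Name"]] or backticks rather than a bare $name.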

Just as a side note, I really appreciated that you used head() on your
data frame so there was actually some sample data to look at.  If you
want to make the people on R-help super happy, try using:
dput(head(yourdata))
and copy that into your email.  This gives us an incredibly easy way
to actually get the first few rows of your data into R, just like it
would be on your end.
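
As a rough illustration of the round trip (yourdata here is a made-up
data frame with invented values, not your index data):

```r
# Hypothetical data frame standing in for "yourdata"; the weights are invented.
yourdata <- data.frame(
  symbol = c("ACC", "AMBUJACEM", "AXISBANK", "BHEL", "BPCL", "CIPLA", "WIPRO"),
  weight = c(0.7, 0.6, 2.1, 1.9, 0.8, 1.0, 1.5),
  stringsAsFactors = FALSE
)

# Print a text representation of the first six rows to paste into an email.
dput(head(yourdata))

# A reader can rebuild the object by evaluating the pasted text.
pasted <- paste(capture.output(dput(head(yourdata))), collapse = "\n")
restored <- eval(parse(text = pasted))
all.equal(restored, head(yourdata))
```

In practice the reader would just paste the structure(...) text after
an assignment arrow; the capture.output() step only simulates that here.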

Best Regards,

Josh


On Sun, Oct 17, 2010 at 8:30 PM, Santosh Srinivas
<santosh.srinivas at gmail.com> wrote:
> Thanks Josh.
>
> At your convenience, Any pointers on why this was designed like this? i.e. shouldn’t droplevels() be the default behavior?
> I'm missing something in understanding on how these operations (manipulations) were designed to work.
>
>
> -----Original Message-----
> From: Joshua Wiley [mailto:jwiley.psych at gmail.com]
> Sent: 18 October 2010 07:47
> To: Santosh Srinivas
> Cc: r-help at r-project.org
> Subject: Re: [R] Basic structure operations doubt
>
> Hi,
>
> The easiest way to get rid of the empty levels is with droplevels().
> See ?droplevels for details.  It actually has a method for data frames
> even.  So you could just do something like:
>
> Indx_Constituents <- droplevels(Indx_Constituents)
>
> or whatever your data frame was called and it will drop any unused
> levels for you.
>
> Cheers,
>
> Josh
>
> On Sun, Oct 17, 2010 at 7:06 PM, Santosh Srinivas
> <santosh.srinivas at gmail.com> wrote:
>> I'm doing these manipulations on the data frame and wondering why does R
>> have to remember historical data on my operation and not just keep the
>> needed info.
>> Probably a basic fundamentals of the way R handles data .. Pls point me to
>> the manual if possible ..
>>
>> I have this Index data:
>>> head(NIFTY_INDX)
>>  Constituents.list.of.S.P.CNX.Nifty                          X       X.1
>> X.2          X.3
>> 1
>>
>> 2                       Company Name                   Industry    Symbol
>> Series    ISIN Code
>> 3
>>
>> 4                           ACC Ltd. CEMENT AND CEMENT PRODUCTS       ACC
>> EQ INE012A01025
>> 5                Ambuja Cements Ltd. CEMENT AND CEMENT PRODUCTS AMBUJACEM
>> EQ INE079A01024
>> 6                     Axis Bank Ltd.                      BANKS  AXISBANK
>> EQ INE238A01026
>>
>>
>> I import the section that is relevant to me:
>>
>>> Indx_Constituents <- NIFTY_INDX[4:NROW(NIFTY_INDX),]
>>> head(Indx_Constituents)
>>  Constituents.list.of.S.P.CNX.Nifty                              X
>> X.1 X.2          X.3
>> 4                           ACC Ltd.     CEMENT AND CEMENT PRODUCTS
>> ACC  EQ INE012A01025
>> 5                Ambuja Cements Ltd.     CEMENT AND CEMENT PRODUCTS
>> AMBUJACEM  EQ INE079A01024
>> 6                     Axis Bank Ltd.                          BANKS
>> AXISBANK  EQ INE238A01026
>> 7                    Bajaj Auto Ltd. AUTOMOBILES - 2 AND 3 WHEELERS
>> BAJAJ-AUTO  EQ INE917I01010
>> 8      Bharat Heavy Electricals Ltd.           ELECTRICAL EQUIPMENT
>> BHEL  EQ INE257A01018
>> 9  Bharat Petroleum Corporation Ltd.                     REFINERIES
>> BPCL  EQ INE029A01011
>>
>>
>>> colNames <- NIFTY_INDX[2,]
>>> colNames
>>  Constituents.list.of.S.P.CNX.Nifty        X    X.1    X.2       X.3
>> 2                       Company Name Industry Symbol Series ISIN Code
>>
>>
>> I want to assign the info from colNames[1,] to Indx_Constituents .... I am
>> unable to do this directly ... I can probably pull out the values and do it
>> but there should be an easier way
>>
>>
>> Now when I do this:
>>> colNames[1,1]
>> [1] Company Name
>> 52 Levels:  ACC Ltd. Ambuja Cements Ltd. Axis Bank Ltd. Bajaj Auto Ltd.
>> Bharat Heavy Electricals Ltd. Bharat Petroleum Corporation Ltd. Bharti
>> Airtel Ltd. Cairn India Ltd. Cipla Ltd. Company Name ... Wipro Ltd.
>>
>> Why does R have to remember the 52 levels?? Why can't it just have the
>> relevant data stored
>> What are the alternatives so that I can simply have my needed data in my
>> data frames?
>>
>> Thanks for your explanation.
>>
>>
>> ----------------------------------------------------------------------------
>> --------------------------
>> Thanks R-Helpers. Yes, this is a silly question and it will not be repeated!
>> :-)
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/


