[R] ncol() vs. length() on data.frames

Ivan Calandra calandra at rgzm.de
Tue Mar 31 16:41:00 CEST 2020


Thanks Matthias for the details!
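
The chain described in the quoted message below (ncol() -> dim() -> dim.data.frame() -> length()) can be checked directly. This is only a minimal sketch: the data.frame `df` is made up for illustration and is not from the thread, and .row_names_info() is a low-level base helper that one would not normally call in scripts.

```r
# Hypothetical example data; any data.frame with plain vector columns will do.
df <- data.frame(a = 1:3, b = letters[1:3], c = c(0.5, 1.5, 2.5))

# ncol() is dim(x)[2L]; for a data.frame, dim() dispatches to
# dim.data.frame(), whose second element is length(x), i.e. the number
# of list elements (columns).
n_by_ncol   <- ncol(df)
n_by_dim    <- dim(df)[2L]
n_by_length <- length(df)

# The first element of dim() comes from .row_names_info(x, 2L),
# which gives the number of rows.
n_rows <- .row_names_info(df, 2L)

# All three column counts agree, and the row count matches nrow().
stopifnot(n_by_ncol == n_by_length,
          n_by_dim == n_by_length,
          n_rows == nrow(df))
print(c(ncol = n_by_ncol, length = n_by_length, nrow = n_rows))
```

If the stopifnot() call passes silently, the two functions agree for that object, which is the case one would expect for data.frames coming from read.csv().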

Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 31/03/2020 16:30, Prof. Dr. Matthias Kohl wrote:
> I should have added: for a data.frame, dim(x)[2L] therefore reduces to
> length(x).
>
> On 31.03.20 at 16:21, Prof. Dr. Matthias Kohl wrote:
>> Dear Ivan,
>>
>> if I enter ncol in the console, I get
>>
>> function (x)
>> dim(x)[2L]
>> <bytecode: 0x5559e9429030>
>> <environment: namespace:base>
>>
>> indicating that function dim is called. Function dim has a method for
>> data.frame; see methods("dim").
>>
>> The dim-method for data.frame is
>>
>> dim.data.frame
>> function (x)
>> c(.row_names_info(x, 2L), length(x))
>> <bytecode: 0x5559eb80da40>
>> <environment: namespace:base>
>>
>> Hence, it calls length on the provided data.frame. In addition, some
>> "magic" with .row_names_info is performed, where
>>
>> base:::.row_names_info
>> function (x, type = 1L)
>> .Internal(shortRowNames(x, type))
>> <bytecode: 0x5559ece50160>
>> <environment: namespace:base>
>>
>> Best
>> Matthias
>>
>> On 31.03.20 at 16:10, Ivan Calandra wrote:
>>> Thanks Ivan for the answer.
>>>
>>> So it confirms my first thought that these two functions are equivalent
>>> when applied to a "simple" data.frame.
>>>
>>> The reason I am asking is that I have gotten used to using length() in
>>> my scripts. It works perfectly and I understand it easily. But to be
>>> honest, ncol() is more intuitive to most users (especially novices),
>>> so I was thinking about switching to it (all my data.frames are
>>> created from read.csv() or similar functions, so there should not be
>>> any issue). But before doing that, I want to be sure that it is not
>>> going to produce unexpected results.
>>>
>>> Thank you,
>>> Ivan
>>>
>>> On 31/03/2020 16:00, Ivan Krylov wrote:
>>>> On Tue, 31 Mar 2020 14:47:54 +0200
>>>> Ivan Calandra <calandra using rgzm.de> wrote:
>>>>
>>>>> On a simple data.frame (i.e. each element is a vector), ncol() and
>>>>> length() will give the same result.
>>>>> Are they just equivalent on such objects, or are there differences in
>>>>> some cases?
>>>> I am not aware of any exceptions to ncol(dataframe)==length(dataframe)
>>>> (in fact, ncol(x) is dim(x)[2L] and ?dim says that dim(dataframe)
>>>> returns c(length(attr(dataframe, 'row.names')), length(dataframe))),
>>>> but watch out for AsIs columns, which can have columns of their own:
>>>>
>>>> x <- data.frame(I(volcano))
>>>> dim(x)
>>>> # [1] 87  1
>>>> length(x)
>>>> # [1] 1
>>>> dim(x[,1])
>>>> # [1] 87 61
>>>>
>>>>
>>>
>>
>
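
The AsIs caveat from Ivan Krylov's reply above can be turned into a small check. This is a sketch only: count_flat_columns() is a hypothetical helper (it is not part of base R) that counts a matrix column by its own number of columns and every other column as one.

```r
# Data frame with a single matrix column, as in the quoted example
# (volcano is a built-in 87 x 61 matrix from the datasets package).
x <- data.frame(I(volcano))

# ncol() and length() still agree: the data.frame itself has one column.
stopifnot(ncol(x) == 1L, length(x) == 1L)

# Hypothetical helper: a column with a dim attribute (e.g. a matrix)
# contributes its own ncol(); a plain vector column contributes 1.
count_flat_columns <- function(df) {
  sum(vapply(df, function(col) {
    if (is.null(dim(col))) 1L else ncol(col)
  }, integer(1)))
}

flat_x    <- count_flat_columns(x)     # one column holding a 61-column matrix
flat_iris <- count_flat_columns(iris)  # plain vector columns only
print(c(flat_x = flat_x, flat_iris = flat_iris))
```

For data.frames that contain only plain vector columns, the helper returns the same value as ncol() and length(), so it is only of interest when matrix or AsIs columns may be present.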


