[R] Problem with filling dataframe's column

Rui Barradas ruipbarradas using sapo.pt
Tue Jun 13 15:18:32 CEST 2023


On 13/06/2023 at 17:18, javad bayat wrote:
> Dear Rui;
> Hi. I used your codes, but it seems it didn't work for me.
> 
>> pat <- c("_esmdes|_Des Section|0")
>> dim(data2)
>      [1]  281549      9
>> grep(pat, data2$Layer)
>> dim(data2)
>      [1]  281549      9
> 
> What does grep function do? I expected the function to remove 3 rows of the
> dataframe.
> I do not know the reason.
> 
> On Mon, Jun 12, 2023 at 5:16 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
> 
>> On 12/06/2023 at 23:13, javad bayat wrote:
>>> Dear Rui;
>>> Many thanks for the email. I tried your codes and found that the length of
>>> the "Values" and "Names" vectors must be equal, otherwise the results will
>>> not be useful.
>>> For some of the characters in the Layer column that I do not need to be
>>> filled in the LU column, I used "NA".
>>> But I need to delete some of the rows from the table as they are useless
>>> for me. I tried this code to delete the rows of the dataframe that
>>> contain these three values in the Layer column, but it gave me the
>>> following warning.
>>>
>>>> data3 = data2[-grep(c("_esmdes","_Des Section","0"), data2$Layer),]
>>>        Warning message:
>>>         In grep(c("_esmdes", "_Des Section", "0"), data2$Layer) :
>>>         argument 'pattern' has length > 1 and only the first element will be used
>>>
>>>> data3 = data2[!grepl(c("_esmdes","_Des Section","0"), data2$Layer),]
>>>       Warning message:
>>>       In grepl(c("_esmdes", "_Des Section", "0"), data2$Layer) :
>>>       argument 'pattern' has length > 1 and only the first element will be used
>>>
>>> How can I do this?
>>> Sincerely
>>>
>>> On Sun, Jun 11, 2023 at 5:03 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>>
>>>> On 11/06/2023 at 13:18, Rui Barradas wrote:
>>>>> On 11/06/2023 at 22:54, javad bayat wrote:
>>>>>> Dear Rui;
>>>>>> Many thanks for your email. I used one of your codes,
>>>>>> "data2$LU[which(data2$Layer == "Level 12")] <- "Park"", and it works
>>>>>> correctly for me.
>>>>>> Actually I need to expand the code so as to consider all "Levels" in the
>>>>>> "Layer" column. There are more than a hundred levels in the Layer column.
>>>>>> If I use your provided code, I have to write it hundreds of times as below:
>>>>>> data2$LU[which(data2$Layer == "Level 1")] <- "Park";
>>>>>> data2$LU[which(data2$Layer == "Level 2")] <- "Agri";
>>>>>> ...
>>>>>> ...
>>>>>> ...
>>>>>> .
>>>>>> Is there any other way to expand the code in order to consider all of the
>>>>>> levels simultaneously? Like the below code:
>>>>>> data2$LU[which(data2$Layer == c("Level 1","Level 2", "Level 3", ...))] <-
>>>>>> c("Park", "Agri", "GS", ...)
>>>>>>
>>>>>>
>>>>>> Sincerely
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Jun 11, 2023 at 1:43 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>>>>>
>>>>>>> On 11/06/2023 at 21:05, javad bayat wrote:
>>>>>>>> Dear R users;
>>>>>>>> I am trying to fill a column based on a specific value in another column of
>>>>>>>> a dataframe, but it seems there is a problem with the codes!
>>>>>>>> The "Layer" and the "LU" are two different columns of the dataframe.
>>>>>>>> How can I fix this?
>>>>>>>> Sincerely
>>>>>>>>
>>>>>>>>
>>>>>>>> for (i in 1:nrow(data2$Layer)){
>>>>>>>>               if (data2$Layer == "Level 12") {
>>>>>>>>                   data2$LU == "Park"
>>>>>>>>                   }
>>>>>>>>               }
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> There are three bugs in your code,
>>>>>>>
>>>>>>> 1) the index i is not used in the loop
>>>>>>> 2) the assignment operator is `<-`, not `==`
>>>>>>> 3) nrow() of a column vector is NULL, so loop over nrow(data2) instead
>>>>>>>
>>>>>>>
>>>>>>> Here is the loop corrected.
>>>>>>>
>>>>>>> for (i in seq_len(nrow(data2))){
>>>>>>>       if (data2$Layer[i] == "Level 12") {
>>>>>>>         data2$LU[i] <- "Park"
>>>>>>>       }
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> But R is a vectorized language, the following two ways are the idiomatic
>>>>>>> ways of doing what you want to do.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> i <- data2$Layer == "Level 12"
>>>>>>> data2$LU[i] <- "Park"
>>>>>>>
>>>>>>> # equivalent one-liner
>>>>>>> data2$LU[data2$Layer == "Level 12"] <- "Park"
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> If there are NA's in data2$Layer it's probably safer to wrap the logical
>>>>>>> index in which() (see ?which), to get a numeric one.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> i <- which(data2$Layer == "Level 12")
>>>>>>> data2$LU[i] <- "Park"
>>>>>>>
>>>>>>> # equivalent one-liner
>>>>>>> data2$LU[which(data2$Layer == "Level 12")] <- "Park"
>>>>>>>
>>>>>>>
>>>>>>> Hope this helps,
>>>>>>>
>>>>>>> Rui Barradas
>>>>>>>
>>>>>>
>>>>>>
>>>>> Hello,
>>>>>
>>>>> You don't need to repeat the same instruction 100+ times, there is a way
>>>>> of assigning all new LU values at the same time with match().
>>>>> This assumes that you have the new values in a vector.
>>>>
>>>> Sorry, this is not clear. I mean
>>>>
>>>>
>>>> This assumes that you have the new values in a vector, the vector Names
>>>> below. The vector of values to be matched is created from the data.
>>>>
>>>>
>>>> Rui Barradas
>>>>
>>>>>
>>>>>
>>>>> Values <- sort(unique(data2$Layer))
>>>>> Names <- c("Park", "Agri", "GS")
>>>>>
>>>>> i <- match(data2$Layer, Values)
>>>>> data2$LU <- Names[i]
>>>>>
>>>>>
>>>>> Hope this helps,
>>>>>
>>>>> Rui Barradas
>>>>>
>>>>> ______________________________________________
>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>
>> Hello,
>>
>> Please cc the r-help list; R-Help is threaded and this can be helpful to
>> others in the future.
>>
>> You can combine several patterns like this:
>>
>>
>> pat <- c("_esmdes|_Des Section|0")
>> grep(pat, data2$Layer)
>>
>> or, programmatically,
>>
>>
>> pat <- paste(c("_esmdes","_Des Section","0"), collapse = "|")
>>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>>
> 
Hello,

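I only posted a corrected grep statement. grep() just returns the indices
of the elements that match the pattern; it does not change data2, so the
result has to be used to subset the dataframe. The complete code should be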


pat <- "_esmdes|_Des Section|0"
data3 <- data2[-grep(pat, data2$Layer),]
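
To see what grep() does on its own, here is a minimal sketch on a toy data
frame. The values below are made up for illustration and are not your real
data; the names idx and data3b are only used in this example. It also shows
that the !grepl() form is probably the safer habit, since -grep() selects
zero rows when nothing matches.

```r
# Toy stand-in for data2 (hypothetical values, not the real data)
data2 <- data.frame(
  Layer = c("Level 1", "_esmdes", "Level 2", "_Des Section", "0"),
  LU    = NA_character_
)

pat <- "_esmdes|_Des Section|0"

# grep() only reports the positions of the matching elements
idx <- grep(pat, data2$Layer)
idx          # 2 4 5
nrow(data2)  # still 5, nothing was removed

# the rows go away only when the subset is assigned to an object
data3 <- data2[-idx, ]
nrow(data3)  # 2

# grepl() returns a logical vector; negating it gives the same rows
data3b <- data2[!grepl(pat, data2$Layer), ]
identical(data3$Layer, data3b$Layer)  # TRUE

# with no match at all, -grep() selects zero rows, !grepl() keeps every row
nrow(data2[-grep("no such text", data2$Layer), ])   # 0
nrow(data2[!grepl("no such text", data2$Layer), ])  # 5
```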


Sorry for the confusion.
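
One caveat, under an assumption I cannot check from here: if some Layer
values merely contain a 0 (say "Level 10"), the pattern "0" will match them
too, because a regular expression matches anywhere in the string. If you
only want to drop rows whose Layer is exactly one of the three values,
%in% may be the better tool. A small sketch with hypothetical values (the
vectors Layer, drop and keep exist only in this example):

```r
# Hypothetical Layer values, including one that merely contains a "0"
Layer <- c("Level 1", "Level 10", "_esmdes", "_Des Section", "0")

drop <- c("_esmdes", "_Des Section", "0")

# regular expression: "0" matches anywhere, so "Level 10" is hit as well
by_regex <- grepl(paste(drop, collapse = "|"), Layer)
by_regex  # FALSE TRUE TRUE TRUE TRUE

# exact comparison of whole values: "Level 10" is left alone
by_value <- Layer %in% drop
by_value  # FALSE FALSE TRUE TRUE TRUE

keep <- !by_value
Layer[keep]  # "Level 1" "Level 10"
```

With the real data this would read data3 <- data2[!(data2$Layer %in% drop), ].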

Hope this helps,

Rui Barradas


