[R] Dataframe Manipulation

Hemant Sain hemantsain55 at gmail.com
Wed Aug 30 13:22:53 CEST 2017


Using these two tables, we have to create a third table in this format,
where the categories are across the top and the transactions are in the rows.

On 30 August 2017 at 16:42, Hemant Sain <hemantsain55 at gmail.com> wrote:

> Hello Ulrik,
> Could you please check this code once more on the following data set? It
> does not give me the same output because the quantity is absent. In the
> previous demo data set the splitting is done on the basis of the quantity,
> and in the real data set the quantity is missing. Please use the following
> data set and help me out. Please consider this my final email; I won't
> bother you again, but this is about my job, so please help me.
>
> Note: the file I'm attaching is very confidential.
>
> On 30 August 2017 at 15:02, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
>
>> Hi Hemant,
>>
>> Does this help you along?
>>
>> table_1 <- textConnection("Item_1;Item_2;Item_3
>> 1KG banana;300ML milk;1kg sugar
>> 2Large Corona_Beer;2pack Fries;
>> 2 Lux_Soap;1kg sugar;")
>>
>> table_1 <- read.csv(table_1, sep = ";", na.strings = "",
>>                     stringsAsFactors = FALSE, check.names = FALSE)
>>
>> table_2 <- textConnection("Toiletries;Fruits;Beverages;Snacks;Vegetables;Clothings;Dairy Products
>> Soap;banana;Corona_Beer;King Burger;Pumpkin;Adidas Sport Tshirt XL;milk
>> Shampoo;Mango;Red Label Whisky;Fries;Potato;Nike Shorts Black L;Butter
>> Showergel;Oranges;grey Cocktail;cheese pizza;Tomato;Puma Jersy red M;sugar
>> Lux_Soap;;2 Large corona Beer;;Cheese;Toothpaste")
>>
>> table_2 <- read.csv(table_2, sep = ";", na.strings = "",
>>                     stringsAsFactors = FALSE, check.names = FALSE)
>>
>> library(tidyr)
>> library(dplyr)
>>
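>> # Long format: one row per (Category, Item) pair
>> table_2 <- gather(table_2, "Category", "Item")
>>
>> # Number the transactions before reshaping so that every item keeps the
>> # row (transaction) it came from; Foo only records the source column
>> table_1 <- mutate(table_1, Transaction = row_number()) %>%
>>   gather("Foo", "Item", -Transaction) %>%
>>   filter(!is.na(Item))
>>
>> # Split "1KG banana" into Quantity = "1KG" and Item = "banana"
>> table_1 <- separate(table_1, col = "Item", into = c("Quantity", "Item"),
>>                     sep = " ")
>>
>> # Look up each item's category, then restore the original label
>> table_3 <- left_join(table_1, table_2, by = "Item") %>%
>>   mutate(Item = paste(Quantity, Item)) %>%
>>   select(-Quantity)
>>
>> # One row per transaction, one column per category
>> table_3 %>%
>>   group_by(Transaction, Category) %>%
>>   summarise(Item = paste(Item, collapse = ", ")) %>%
>>   spread(key = "Category", value = "Item")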
>>
>> You need to figure out how to handle words written in different cases
>> and how to extract the quantity in a universal way. For the code above, I
>> corrected these things by hand in the example data.
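
One possible way to handle both points without hand-editing the data is sketched below in base R. It is only an illustration under stated assumptions: the helper name split_item and the Key column are hypothetical (they do not appear anywhere in this thread), and the pattern assumes that a quantity, when present, is a run of digits with an optional unit glued to it ("1KG", "300ML", "2pack"), followed by a space. Labels such as "2 Large corona Beer" would still need their own rule.

```r
# Hypothetical helper (not from the thread): split an optional leading
# quantity from an item label. Items without a quantity are kept whole.
split_item <- function(x) {
  x <- trimws(x)
  # digits + optional unit letters, then whitespace, then the item name
  pattern <- "^([0-9]+[A-Za-z]*)[[:space:]]+(.*)$"
  has_qty <- grepl(pattern, x)
  data.frame(
    Quantity = ifelse(has_qty, sub(pattern, "\\1", x), NA_character_),
    Item     = ifelse(has_qty, sub(pattern, "\\2", x), x),
    stringsAsFactors = FALSE
  )
}

items  <- c("1KG banana", "2 Lux_Soap", "Corona_Beer", "2pack Fries")
parsed <- split_item(items)

# Lower-case matching key, so "Corona_Beer" and "corona_beer" compare equal
parsed$Key <- tolower(parsed$Item)
parsed
```

If the category table gets the same lower-cased Key column, the join can be done on Key instead of Item, and rows with a missing Quantity (as in the real data set described above) pass through unchanged.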
>>
>> HTH
>> Ulrik
>>
>> On Wed, 30 Aug 2017 at 10:16 Hemant Sain <hemantsain55 at gmail.com> wrote:
>>
>>> Hey PIKAL,
>>> It's not homework, nor is that the real data set. I have signed an NDA
>>> with my company, so I cannot share the original data file. I'm actually
>>> working on a market basket analysis task, but I am not able to convert my
>>> existing data table to an appropriate format so that I can apply the
>>> Apriori algorithm using R. It is very important for me to get this done
>>> because I'm an intern, and if I don't get it done they are not going to
>>> hire me as a full-time employee.
>>> I tried everything by myself but was not able to get it done.
>>> Your precious 10-15 minutes can save my upcoming years, so please, if you
>>> can, help me through this.
>>> I want another data set based on the first two data sets I have mentioned.
>>>
>>> Thanks
>>>
>>> On 30 August 2017 at 12:49, PIKAL Petr <petr.pikal at precheza.cz> wrote:
>>>
>>> > Hi
>>> >
>>> > This looks like homework to me, and this help list has a no-homework
>>> > policy.
>>> >
>>> > What do you want to do with your table 3? It seems futile to me.
>>> >
>>> > Anyway, some combination of melt, merge, cast and regular expressions
>>> > could be employed for such a task, but it could be rather tricky.
>>> >
>>> > But be aware that
>>> >
>>> > "Suger" does not match "sugar" (and I am surprised that sugar is a
>>> > dairy product)
>>> >
>>> > and that you mix uppercase and lowercase letters, which could also be
>>> > problematic when matching words.
>>> >
>>> > Cheers
>>> > Petr
>>> >
>>> > > -----Original Message-----
>>> > > From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Hemant Sain
>>> > > Sent: Wednesday, August 30, 2017 8:28 AM
>>> > > To: r-help at r-project.org
>>> > > Subject: [R] Dataframe Manipulation
>>> > >
>>> > > I want to do a market basket analysis, and I'm trying to create a
>>> > > data set for that. I have two tables. The first contains daily
>>> > > product transactions, where each row shows the items purchased by one
>>> > > customer. The second contains the parent group that each product falls
>>> > > under; for example, the fruit category contains several fruits such as
>>> > > mango, banana and apple.
>>> > >
>>> > > I want to create a third table in which the parent groups, extracted
>>> > > from Table 2, are the column headers and each row represents one
>>> > > transaction, listing the products by name. If a transaction has nothing
>>> > > in a parent category, that cell should be filled with NA.
>>> > >
>>> > > Please help me with R or C/C++ code (R would be preferred). I'm
>>> > > attaching all three tables for reference: I have the first two tables
>>> > > and I want to get a table like Table 3.
>>> > >
>>> > > Tables are explained in the attached doc.
>>> > >
>>> > > --
>>> > > hemantsain.com
>>> >
>>>



-- 
hemantsain.com

