[R] Problem with Extracting Hash Tagged Words from Tweets

R. Michael Weylandt michael.weylandt at gmail.com
Wed May 23 00:48:45 CEST 2012


Nope -- those are still printing brackets. The double bracket tells
you which list element you're looking at, and the single bracket
indicates that you're looking at the first element of the vector
(almost everything in R is a vector).
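
A minimal sketch of the point, in base R only. It assumes the tweets
are already plain character strings held in a list, as the printed
output in this thread suggests; the two strings are shortened stand-ins
and the variable names are made up for illustration. The pattern
"#[A-Za-z0-9]+" is a simplified one, not the one used with
str_extract_all() elsewhere in the thread.

```r
# Hypothetical stand-in data: a list of tweet texts (plain strings).
tweets <- list(
  "NewsFromHampton: Photo: Man tries to steal air conditioner #Portsmouth",
  "ItsAllFooty: #Portsmouth boss confirmed a new contract #southsea"
)

# [[1]] and [1] are added by print(); they are not stored in the data.
first <- tweets[[1]]                              # the string itself
has_bracket <- grepl("[[", first, fixed = TRUE)   # FALSE: no "[[" in the text

# Flatten the list to a character vector, then pull out the hashtags.
texts <- unlist(tweets)
tags <- unlist(regmatches(texts, gregexpr("#[A-Za-z0-9]+", texts)))

# cat() writes the values one per line, without any index markers.
cat(tags, sep = "\n")
```

Here unlist() collapses the per-tweet results into one character
vector, so the [[i]] labels disappear from the printed output, and
cat() drops the [1] as well.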

Michael

On Tue, May 22, 2012 at 6:20 PM, Adedoyin-Olowe Mariam
<mariamolowe2008 at yahoo.com> wrote:
> Hi Michael,
>
> Thanks for your help.
> Sadly this is not the square bracket I'm referring to.
> The one I'm referring to is the numbering of the downloaded tweets
> highlighted in red below:
>
>> searchTwitter("#Portsmouth", n=2)
> [[1]]
> [1] "NewsFromHampton: Photo: Man tries to steal air conditioner in
> Portsmouth http://t.co/uEBqspzB #Portsmouth"
>
> [[2]]
> [1] "ItsAllFooty: #Portsmouth boss Michael Appleton confirmed Kelvin Etuhu
> is likely to be offered a two-year contract and is confident the winger will
> stay."
>
>
> Mariam
>
> ________________________________
> From: R. Michael Weylandt <michael.weylandt at gmail.com>
> To: Sarah Goslee <sarah.goslee at gmail.com>
> Cc: Adedoyin-Olowe Mariam <mariamolowe2008 at yahoo.com>;
> "r-help at r-project.org" <r-help at r-project.org>
> Sent: Tuesday, 22 May 2012, 16:08
> Subject: Re: [R] Problem with Extracting Hash Tagged Words from Tweets
>
> "The presence of these numbers in square brackets is reporting error."
>
> You mean the square brackets that show up on the left hand side when
> you do something like
>
> x <- 1:100
> print(x)
>
> ?
>
> Don't worry -- those aren't part of x -- they're just added on
> printing to make things easier for the user to see where he is in the
> vector. They won't be included in any analysis. If you need control
> over the printing to avoid them, take a look at cat()
>
> Michael
>
> On Tue, May 22, 2012 at 11:02 AM, Sarah Goslee <sarah.goslee at gmail.com>
> wrote:
>> Hi,
>>
>> On Tue, May 22, 2012 at 10:55 AM, Adedoyin-Olowe Mariam
>> <mariamolowe2008 at yahoo.com> wrote:
>>> Hi Sarah,
>>>
>>> Thanks for your help. I'm sorry my question is not clear enough.
>>> Maybe what I should ask for is how to remove the downloaded
>>> tweet numbers in
>>> x <- list
>>> (ie.[[1]], [1], [[2]], [2].....)
>>> before > sapply(x, str_extract_all, "#\\<.*?\\>").
>>
>> Those aren't part of the tweets. Those are the numbers R uses when
>> displaying portions of a list.
>>
>>> The presence of these numbers in square brackets is reporting error.
>>
>> What error? You'll need to give us an actual reproducible example,
>> since what you are describing is unclear.
>>
>> Although I suppose it's possible that you simply want:
>>> unlist(sapply(x, str_extract_all, "#\\<.*?\\>"))
>> [1] "#dayatthenews" "#pompeyhacks"  "#portsmouth"   "#southsea"
>> [5] "#Portsmouth"   "#portsmouth"
>>
>> It's impossible for me to tell precisely what the problem is.
>>
>> Sarah
>>
>>>
>>> Thanks.
>>> Mariam
>>>
>>>
>>> ________________________________
>>> From: Sarah Goslee <sarah.goslee at gmail.com>
>>> To: Adedoyin-Olowe Mariam <mariamolowe2008 at yahoo.com>
>>> Cc: "r-help at r-project.org" <r-help at r-project.org>
>>> Sent: Tuesday, 22 May 2012, 13:53
>>> Subject: Re: [R] Problem with Extracting Hash Tagged Words from Tweets
>>>
>>> Hi,
>>>
>>> A small reproducible bit of your data would have been nice, and I have
>>> no idea what "manually remove all regular expressions" might mean, but
>>> take a look at this:
>>>
>>> x <- list("marymaryw: Get an insight into how journalists operate at
>>> The News by following #dayatthenews today #pompeyhacks #portsmouth
>>> #southsea", "VouchAR_Ports: £5 instead of £60 for 1 month of unlimited
>>> fitness classes at Outdoor Fitness Leeds - get bikini...
>>> http://t.co/BUrkjtCh #Portsmouth", "BillieRaePhoto: RT @vintagesecret:
>>> My dad has just sent me this picture. Looks like @GunwharfQuays is on
>>> fire?! #portsmouth http://t.co/HbAV7Hw0")
>>>
>>>> sapply(x, str_extract_all, "#\\<.*?\\>")
>>> [[1]]
>>> [1] "#dayatthenews" "#pompeyhacks"  "#portsmouth"  "#southsea"
>>>
>>> [[2]]
>>> [1] "#Portsmouth"
>>>
>>> [[3]]
>>> [1] "#portsmouth"
>>>
>>> Sarah
>>>
>>> On Tue, May 22, 2012 at 7:00 AM, Adedoyin-Olowe Mariam
>>> <mariamolowe2008 at yahoo.com> wrote:
>>>> Hello All,
>>>> Can anyone help me solve this problem.
>>>> Am trying to extract hash-tagged words from tweets downloaded from
>>>> twitteR.
>>>>
>>>> I can extract hash-tagged words from single tweet using
>>>> (stringr) str_extract_all(tweets, "#[a-z//A-Z//0-9]+")
>>>> but cannot with more than one tweet at a time except I manually remove
>>>> all
>>>> regular expressions and tweets numbers such as [[1]] and [1.]
>>>>
>>>> I want to automatically extract all #words in large number of tweets at
>>>> a
>>>> go.
>>>> This is what I have done so far by removing all regular expressions
>>>> manually:
>>>>
>>>>> searchTwitter("#Portsmouth", n=20)
>>>> [[1]]
>>>> [1] "marymaryw: Get an insight into how journalists operate at The News
>>>> by
>>>> following #dayatthenews today #pompeyhacks #portsmouth #southsea"
>>>> [[2]]
>>>> [1] "VouchAR_Ports: £5 instead of £60 for 1 month of unlimited fitness
>>>> classes at Outdoor Fitness Leeds - get bikini... http://t.co/BUrkjtCh
>>>> #Portsmouth"
>>>> [[3]]
>>>> [1] "BillieRaePhoto: RT @vintagesecret: My dad has just sent me this
>>>> picture. Looks like @GunwharfQuays is on fire?! #portsmouth
>>>> http://t.co/HbAV7Hw0"
>>>> [[4]]
>>>> [1] "xangma: RT @vintagesecret: My dad has just sent me this picture.
>>>> Looks like @GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0"
>>>> [[5]]
>>>> [1] "vintagesecret: My dad has just sent me this picture. Looks like
>>>> @GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0"
>>>> [[6]]
>>>> [1] "i_amnik: RT @BBCRadioSolent: Can you see the #GunwharfQuays fire?
>>>> Eye-witnesses please call - 0845 30 30 961. #Portsmouth."
>>>> [[7]]
>>>> [1] "vickiredmond: RT @dan_germain: RT @MatMacAulay: Best pic of
>>>> #Gunwharf
>>>> on fire I have seen http://t.co/8LNAiqiD #portsmouth"
>>>> [[8]]
>>>> [1] "EmilieRosa: Highs of 25 degrees on the island this week!! Beach
>>>> time
>>>> after exams I think! ;) #Portsmouth"
>>>> [[9]]
>>>> [1] "MrYiff: RT @dan_germain: RT @MatMacAulay: Best pic of #Gunwharf on
>>>> fire I have seen http://t.co/8LNAiqiD #portsmouth"
>>>> [[10]]
>>>> [1] "otbsaad: RT @BBCRadioSolent: BREAKING NEWS - Reports of a large
>>>> fire
>>>> at #GunwharfQuays in #Portsmouth. Latest updates on @BBCRadioSolent
>>>> 96.1FM"
>>>> [[11]]
>>>> [1] "PN_Newsdesk: #Portsmouth: Ferryspeed looks to build on its past
>>>> successes http://t.co/CmDglDkg"
>>>> [[12]]
>>>> [1] "PN_Newsdesk: #Portsmouth: More room for stalls at top Southsea
>>>> school
>>>> - A SOUTHSEA primary school still has room for people to se...
>>>> http://t.co/ucbYWjPR"
>>>> [[13]]
>>>> [1] "VouchAR_Ports: £14 instead of £30 for a pedicure with foiled
>>>> transfer
>>>> at Forever Young, Stoke-on-Trent - get... http://t.co/P7gJBcl8
>>>> #Portsmouth"
>>>> [[14]]
>>>> [1] "TelArnott: Looking forward to #K1 today! #gym01 #portsmouth"
>>>> [[15]]
>>>> [1] "dan_germain: RT @MatMacAulay: Best pic of #Gunwharf on fire I have
>>>> seen http://t.co/8LNAiqiD #portsmouth"
>>>> [[16]]
>>>> [1] "dan_germain: RT @portsmouthnews: News: Large fire at Gunwharf Quays
>>>> -
>>>> http://t.co/s9RWpY0i #portsmouth #southsea"
>>>> [[17]]
>>>> [1] "i_amnik: RT @BBCRadioSolent: BREAKING NEWS - Reports of a large
>>>> fire
>>>> at #GunwharfQuays in #Portsmouth. Latest updates on @BBCRadioSolent
>>>> 96.1FM"
>>>> [[18]]
>>>> [1] "solentmotorcars: RT @BBCRadioSolent: BREAKING NEWS - Reports of a
>>>> large fire at #GunwharfQuays in #Portsmouth. Latest updates on
>>>> @BBCRadioSolent 96.1FM"
>>>> [[19]]
>>>> [1] "HantsChiefAlex: RT @BBCRadioSolent: BREAKING NEWS - Reports of a
>>>> large fire at #GunwharfQuays in #Portsmouth. Latest updates on
>>>> @BBCRadioSolent 96.1FM"
>>>> [[20]]
>>>> [1] "BBCRadioSolent: Can you see the #GunwharfQuays fire? Eye-witnesses
>>>> please call - 0845 30 30 961. #Portsmouth."
>>>>> tweets <-c("marymaryw: Get an insight into how journalists operate at
>>>>> The
>>>>> News by following #dayatthenews today #pompeyhacks #portsmouth
>>>>> #southsea
>>>>> VouchAR_Ports £5 instead of £60 for 1 month of unlimited fitness
>>>>> classes at
>>>>> Outdoor Fitness Leeds - get bikini... http://t.co/BUrkjtCh #Portsmouth
>>>>> BillieRaePhoto RT @vintagesecret My dad has just sent me this picture.
>>>>> Looks
>>>>> like @GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0
>>>>> xangma: RT
>>>>> @vintagesecret My dad has just sent me this picture. Looks like
>>>>> @GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0
>>>>> vintagesecret
>>>>> My dad has just sent me this picture. Looks like @GunwharfQuays is on
>>>>> fire?!
>>>>> #portsmouth http://t.co/HbAV7Hw0iamnik: RT @BBCRadioSolent Can you see
>>>>> the
>>>>> #GunwharfQuays fire? Eye-witnesses please call - 0845 30 30 961.
>>>>> #Portsmouth. vickiredmond @MatMacAulay Best pic of#Gunwharf on fire I
>>>>> have
>>>>> seen http://t.co/8LNAiqiD #portsmouth EmilieRosa: Highs of 25 degrees
>>>>> on the
>>>>> island
>>>>  this week!! Beach time after exams I think!) #Portsmouth mYiff RT
>>>> @dan_germain: RT @MatMacAulay Best pic of #Gunwharf on fire I have seen
>>>> http://t.co/8LNAiqiD #portsmouth otbsaad RT @BBCRadioSolent: BREAKING
>>>> NEWS -
>>>> Reports of a large fire at #GunwharfQuays in #Portsmouth. Latest updates
>>>> on
>>>> @BBCRadioSolent 96.1FM PN_Newsdesk #Portsmouth: Ferryspeed looks to
>>>> build on
>>>> its past successes http://t.co/CmDglDkg PN_Newsdesk #Portsmouth More
>>>> room
>>>> for stalls at top Southsea school - A SOUTHSEA primary school still has
>>>> room
>>>> for people to se... http://t.co/ucbYWjPR VouchAR_Ports £14 instead of
>>>> £30
>>>> for a pedicure with foiled transfer at Forever Young, Stoke-on-Trent -
>>>> get... http://t.co/P7gJBcl8 #Portsmouth TelArnott Looking forward to #K1
>>>> today! #gym01 #portsmouth Best pic of #Gunwharf on fire I have seen
>>>> http://t.co/8LNAiqiD #portsmouth dangermain RT @portsmouthnews News
>>>> Large
>>>> fire at Gunwharf Quays - http://t.co/s9RWpY0i #portsmouth #southsea
>>>> iamnik
>>>> RT
>>>>  @BBCRadioSolent BREAKING NEWS - Reports of a large fire at
>>>> #GunwharfQuays
>>>> in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM solentmotorcars
>>>> RT
>>>> @BBCRadioSolent: BREAKING NEWS - Reports of a large fire at
>>>> #GunwharfQuays
>>>> in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM HantsChiefAlex
>>>> RT
>>>> @BBCRadioSolent BREAKING NEWS - Reports of a large fire at
>>>> #GunwharfQuays in
>>>> #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM BBCRadioSolent Can
>>>> you
>>>> see the #GunwharfQuays fire? Eye-witnesses please call - 0845 30 30 961.
>>>> #Portsmouth")
>>>>> str_extract_all(tweets, "#[a-z//A-Z//0-9]+")
>>>> [[1]]
>>>>  [1] "#dayatthenews"  "#pompeyhacks"   "#portsmouth"    "#southsea"
>>>>  "#Portsmouth"    "#portsmouth"    "#portsmouth"
>>>>  [8] "#portsmouth"    "#GunwharfQuays" "#Portsmouth"    "#Gunwharf"
>>>>  "#portsmouth"    "#Portsmouth"    "#Gunwharf"
>>>> [15] "#portsmouth"    "#GunwharfQuays" "#Portsmouth"    "#Portsmouth"
>>>>  "#Portsmouth"    "#Portsmouth"    "#K1"
>>>> [22] "#gym01"         "#portsmouth"    "#Gunwharf"      "#portsmouth"
>>>>  "#portsmouth"    "#southsea"      "#GunwharfQuays"
>>>> [29] "#Portsmouth"    "#GunwharfQuays" "#Portsmouth"    "#GunwharfQuays"
>>>> "#Portsmouth"    "#GunwharfQuays" "#Portsmouth"
>>>>
>>>> Please I need help.
>>>>
>>>> Mariam
>>>
>>
>>
>>
>> --
>> Sarah Goslee
>> http://www.functionaldiversity.org
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>


