Processing Text for Sentiment and Other Good Stuff

textblob integrates nltk and pattern. It allows you to easily extract and derive information from a passage of text.

To install:

$ sudo pip install textblob

Based on the example, here:

import textblob
import pprint

text = '''
The titular threat of The Blob has always struck me as the ultimate movie monster: an insatiably hungry, amoeba-like mass able to penetrate virtually any safeguard, capable of--as a doomed doctor chillingly describes it--"assimilating flesh on contact. Snide comparisons to gelatin be darned, it's a concept with the most devastating of potential consequences, not unlike the grey goo scenario proposed by technological theorists fearful of artificial intelligence run rampant.
'''

blob = textblob.TextBlob(text)

# Get parts of speech.
blob.tags

# Get list of individual noun-phrases.
blob.noun_phrases

# Print sentence and sentiment polarity:
for sentence in blob.sentences:
    print(sentence)
    print('')
    print(sentence.sentiment.polarity)
    print('')
    print('--')
    print('')

# Convert to Spanish.
blob.translate(to="es")

Output:

>>> import textblob
>>> import pprint
>>> 
>>> text = '''
... The titular threat of The Blob has always struck me as the ultimate movie monster: an insatiably hungry, amoeba-like mass able to penetrate virtually any safeguard, capable of--as a doomed doctor chillingly describes it--"assimilating flesh on contact. Snide comparisons to gelatin be darned, it's a concept with the most devastating of potential consequences, not unlike the grey goo scenario proposed by technological theorists fearful of artificial intelligence run rampant.
... '''
>>> 
>>> blob = textblob.TextBlob(text)
>>>
>>> # Get parts of speech.
>>> blob.tags
[(u'The', u'DT'), (u'titular', u'JJ'), (u'threat', u'NN'), (u'of', u'IN'), (u'The', u'DT'), (u'Blob', u'NNP'), (u'has', u'VBZ'), (u'always', u'RB'), (u'struck', u'VBD'), (u'me', u'PRP'), (u'as', u'IN'), (u'the', u'DT'), (u'ultimate', u'JJ'), (u'movie', u'NN'), (u'monster', u'NN'), (u'an', u'DT'), (u'insatiably', u'RB'), (u'hungry', u'JJ'), (u'amoeba-like', u'JJ'), (u'mass', u'NN'), (u'able', u'JJ'), (u'to', u'TO'), (u'penetrate', u'VB'), (u'virtually', u'RB'), (u'any', u'DT'), (u'safeguard', u'VB'), (u'capable', u'JJ'), (u'of--as', u'JJ'), (u'a', u'DT'), (u'doomed', u'VBN'), (u'doctor', u'NN'), (u'chillingly', u'RB'), (u'describes', u'VBZ'), (u'it', u'PRP'), (u'assimilating', u'VBG'), (u'flesh', u'NN'), (u'on', u'IN'), (u'contact', u'NN'), (u'Snide', u'NNP'), (u'comparisons', u'NNS'), (u'to', u'TO'), (u'gelatin', u'NN'), (u'be', u'VB'), (u'darned', u'JJ'), (u'it', u'PRP'), (u"'", u'POS'), (u's', u'PRP'), (u'a', u'DT'), (u'concept', u'NN'), (u'with', u'IN'), (u'the', u'DT'), (u'most', u'RBS'), (u'devastating', u'JJ'), (u'of', u'IN'), (u'potential', u'JJ'), (u'consequences', u'NNS'), (u'not', u'RB'), (u'unlike', u'IN'), (u'the', u'DT'), (u'grey', u'JJ'), (u'goo', u'NN'), (u'scenario', u'NN'), (u'proposed', u'VBN'), (u'by', u'IN'), (u'technological', u'JJ'), (u'theorists', u'NNS'), (u'fearful', u'JJ'), (u'of', u'IN'), (u'artificial', u'JJ'), (u'intelligence', u'NN'), (u'run', u'VB'), (u'rampant', u'JJ')]
>>>
>>> # Get list of individual noun-phrases.
>>> blob.noun_phrases
WordList([u'titular threat', 'blob', u'ultimate movie monster', u'amoeba-like mass', 'snide', u'potential consequences', u'grey goo scenario', u'technological theorists fearful', u'artificial intelligence run rampant'])
>>>
>>> # Print sentence and sentiment polarity:
>>> for sentence in blob.sentences:
...     print(sentence)
...     print('')
...     print(sentence.sentiment.polarity)
...     print('')
...     print('--')
...     print('')
... 

The titular threat of The Blob has always struck me as the ultimate movie monster: an insatiably hungry, amoeba-like mass able to penetrate virtually any safeguard, capable of--as a doomed doctor chillingly describes it--"assimilating flesh on contact.

0.06

--

Snide comparisons to gelatin be darned, it's a concept with the most devastating of potential consequences, not unlike the grey goo scenario proposed by technological theorists fearful of artificial intelligence run rampant.


-0.341666666667

--

>>>
>>> # Convert to Spanish.
>>> blob.translate(to="es")
TextBlob("La amenaza titular de The Blob siempre me ha parecido como el último monstruo de la película: una, la masa insaciablemente hambriento ameba capaz de penetrar prácticamente cualquier salvaguardia, capaz de - como médico condenado escalofriantemente lo describe - "asimilar carne en contacto. comparaciones Snide a la gelatina ser condenados, es un concepto con el más devastador de las posibles consecuencias, no muy diferente del escenario plaga gris propuesta por los teóricos tecnológicos temerosos de la inteligencia artificial ejecutar rampante.")

Awesome, right?

Advertisements