Why is Python's "for word in words" iterating over letters instead of words?
When I enter the following code:

    def word_feats(words):
        return dict([(word, True) for word in words])

    print(word_feats("I love this sandwich."))

I get output in letters instead of words:

    {'a': True, ' ': True, 'c': True, 'e': True, 'd': True, 'I': True, 'h': True, 'l': True, 'o': True, 'n': True, 'i': True, 's': True, 't': True, 'w': True, 'v': True, '.': True}

What am I doing wrong?

Thanks!
You need to explicitly split the string on whitespace:

    def word_feats(words):
        return dict([(word, True) for word in words.split()])
This uses str.split() without arguments, which splits on arbitrary-width whitespace (including tabs and line separators). A string is otherwise a sequence of individual characters, so direct iteration will indeed loop over each character.
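To make the difference concrete, here is a small sketch comparing direct iteration with str.split(). The variable names are only illustrative:

```python
sentence = "I love this sandwich."

# Iterating over a str yields one-character strings.
chars = [ch for ch in sentence]

# str.split() with no arguments splits on runs of whitespace.
words = sentence.split()

print(chars[:6])            # ['I', ' ', 'l', 'o', 'v', 'e']
print(words)                # ['I', 'love', 'this', 'sandwich.']
print("a\tb\n  c".split())  # ['a', 'b', 'c']
```

Note that the trailing period stays attached to the last word; split() only looks at whitespace.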
Splitting into words, however, has to be an explicit operation that you perform yourself, because different use cases have different needs for how a string is split into separate parts. Does punctuation count, for example? What about parentheses or quoting: should words grouped that way perhaps not be split? And so on.
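As a rough illustration of how those choices change the result, the sketch below shows two alternative tokenizers. The function names are hypothetical, and which behaviour is "right" depends entirely on your use case; re.findall() and shlex.split() are standard-library calls, but using them here is just one possible approach:

```python
import re
import shlex


def words_without_punctuation(text):
    # Keep only runs of letters, digits and underscores, dropping punctuation.
    return re.findall(r"\w+", text)


def words_respecting_quotes(text):
    # Shell-like splitting: a quoted section stays together as one item.
    return shlex.split(text)


print(words_without_punctuation("I love this sandwich."))
# ['I', 'love', 'this', 'sandwich']
print(words_respecting_quotes('I love "this sandwich"'))
# ['I', 'love', 'this sandwich']
```

Either function could replace words.split() inside word_feats() if that tokenization suits your data better.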
If all you are doing is setting every value to True, it'll be more efficient to use dict.fromkeys() instead:

    def word_feats(words):
        return dict.fromkeys(words.split(), True)
Demo:

    >>> def word_feats(words):
    ...     return dict.fromkeys(words.split(), True)
    ...
    >>> print(word_feats("I love this sandwich."))
    {'I': True, 'this': True, 'love': True, 'sandwich.': True}