Why is Python's "for word in words" iterating on letters instead of words?


When I enter the following code:

    def word_feats(words):
        return dict([(word, True) for word in words])

    print(word_feats("I love this sandwich."))

I get output in letters instead of words:

    {'a': True, ' ': True, 'c': True, 'e': True, 'd': True, 'I': True, 'h': True, 'l': True, 'o': True, 'n': True, 'i': True, 's': True, 't': True, 'w': True, 'v': True, '.': True}

What am I doing wrong?

Thanks!

You need to explicitly split the string on whitespace:

    def word_feats(words):
        return dict([(word, True) for word in words.split()])

This uses str.split() without arguments, which splits on arbitrary-width whitespace (including tabs and line separators). A string is otherwise a sequence of individual characters, so direct iteration will indeed loop over each character.
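A minimal sketch of that difference, using only built-ins. The sample strings are made up for illustration:

```python
sentence = "I love this sandwich."

# Iterating over a str yields one-character strings.
chars = [c for c in sentence]
print(chars[:6])  # ['I', ' ', 'l', 'o', 'v', 'e']

# str.split() with no arguments splits on runs of whitespace
# and drops leading and trailing whitespace.
words = sentence.split()
print(words)  # ['I', 'love', 'this', 'sandwich.']

messy = "  I\tlove \n this   sandwich.  "
print(messy.split())  # ['I', 'love', 'this', 'sandwich.']

# With an explicit separator, consecutive separators produce empty strings.
print("a  b".split(" "))  # ['a', '', 'b']
```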

Splitting into words, however, has to be an explicit operation you perform yourself, because different use-cases have different needs on how to split a string into separate parts. Does punctuation count, for example? What about parentheses or quoting: should words grouped that way not be split, perhaps? And so on.
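To illustrate how those needs can differ, here is a sketch comparing plain whitespace splitting with two standard-library alternatives. Neither alternative comes from the answer itself, the sample text is invented, and the regular expression is only one possible definition of a "word":

```python
import re
import shlex

text = 'I love "this sandwich" today.'

# Plain whitespace splitting keeps punctuation and quotes attached.
by_whitespace = text.split()
print(by_whitespace)
# ['I', 'love', '"this', 'sandwich"', 'today.']

# Treat a word as a run of letters, digits or underscores: punctuation is dropped.
by_regex = re.findall(r"\w+", text)
print(by_regex)
# ['I', 'love', 'this', 'sandwich', 'today']

# shlex.split() applies shell-like quoting rules: the quoted phrase stays together.
by_shlex = shlex.split(text)
print(by_shlex)
# ['I', 'love', 'this sandwich', 'today.']
```

Which of these is appropriate depends on what the resulting features are meant to capture.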

If all you are doing is setting every value to True, it'll be more efficient to use dict.fromkeys() instead:

    def word_feats(words):
        return dict.fromkeys(words.split(), True)

Demo:

    >>> def word_feats(words):
    ...     return dict.fromkeys(words.split(), True)
    ...
    >>> print(word_feats("I love this sandwich."))
    {'I': True, 'this': True, 'love': True, 'sandwich.': True}
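As a quick check that the two approaches build the same dictionary, the sketch below compares a comprehension with dict.fromkeys(). The two function names are made up here purely to tell the variants apart:

```python
def word_feats_comprehension(words):
    # Build the mapping one (word, True) pair at a time.
    return {word: True for word in words.split()}


def word_feats_fromkeys(words):
    # Let dict.fromkeys() assign the same value to every key.
    return dict.fromkeys(words.split(), True)


sentence = "I love this sandwich."
a = word_feats_comprehension(sentence)
b = word_feats_fromkeys(sentence)

print(a == b)     # True
print(sorted(a))  # ['I', 'love', 'sandwich.', 'this']
```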
