python - Unicode object error in parsing XML using BeautifulSoup


Parsing the contents of the 'name' tag in XML output using BeautifulSoup gives me the following error:

AttributeError: 'unicode' object has no attribute 'get_text'

XML output:

<show>
  <stud>
    <__readonly__>
      <table_stud>
        <row_stud>
          <name>rice</name>
          <dept>chem</dept>
          .
          .
          .
        </row_stud>
      </table_stud>
    </__readonly__>
  </stud>
</show>

However, if I access the contents of other tags such as 'dept', it seems to work fine.

stud_info = output_xml.find_all('row_stud')
for eachstud in range(len(stud_info)):
    print stud_info[eachstud].dept.get_text()   # gives 'chem'
    print stud_info[eachstud].name.get_text()   # ---unicode error---

Can Python/BeautifulSoup experts help me resolve this? (I know BeautifulSoup is not ideal for parsing XML. Let's just say I'm compelled to use it.)

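tag.name is an attribute containing the tag's own name; its value here is row_stud. That is why you get a unicode string back rather than the nested <name> tag, and a string has no get_text() method.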

Attribute access to contained tags is a shortcut for .find(attributename), but it only works if there isn't already an attribute in the API with the same name. Use .find() instead:

print stud_info[eachstud].find('name').get_text() 
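To make the difference concrete, here is a minimal sketch. It assumes a shortened, hypothetical copy of the XML from the question (without the __readonly__ wrapper) and uses the built-in 'html.parser' so that it runs without lxml. It is written with Python 3 print() calls; under Python 3 the string type is str rather than unicode, so the error message would name 'str' instead.

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical sample based on the XML in the question.
sample = (
    "<table_stud><row_stud>"
    "<name>rice</name><dept>chem</dept>"
    "</row_stud></table_stud>"
)

# 'html.parser' ships with Python, so no extra parser is needed for this demo.
soup = BeautifulSoup(sample, "html.parser")
row = soup.find("row_stud")

# .name is the tag's own name (a plain string), not the nested <name> tag.
own_name = row.name
has_get_text = hasattr(own_name, "get_text")

# .dept does not clash with any Tag attribute, so the shortcut works.
dept_text = row.dept.get_text()

# .find('name') looks up the child tag explicitly and avoids the clash.
name_text = row.find("name").get_text()

print(own_name, has_get_text, dept_text, name_text)
```

The same clash can happen with any child tag whose name matches an existing Tag attribute, so .find() is the safer habit when tag names are not under your control.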

You can loop over the stud_info result list directly; there is no need to use range() here:

stud_info = output_xml.find_all('row_stud')
for eachstud in stud_info:
    print eachstud.dept.get_text()
    print eachstud.find('name').get_text()

I notice you are searching for row_stud in lower-case. If you are parsing XML with BeautifulSoup, make sure you have lxml installed and tell BeautifulSoup that it is processing XML, so that it won't HTML-ize the tags (lowercase them):

soup = BeautifulSoup(source, 'xml')
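As a rough illustration of the lowercasing, the sketch below parses a small, hypothetical mixed-case document twice. The tag names Show, Row_Stud and Name are made up for the example and are not from the question. The 'xml' branch needs lxml, so it is guarded and simply skipped when lxml is not installed.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical mixed-case document, only to show how each parser treats case.
source = "<Show><Row_Stud><Name>rice</Name></Row_Stud></Show>"

# An HTML parser lowercases tag names, so only the lowercase search matches.
html_soup = BeautifulSoup(source, "html.parser")
html_lower = html_soup.find_all("row_stud")
html_mixed = html_soup.find_all("Row_Stud")
print(len(html_lower), len(html_mixed))

# The 'xml' feature requires lxml; skip this part if it is not available.
try:
    xml_soup = BeautifulSoup(source, "xml")
except FeatureNotFound:
    xml_soup = None

if xml_soup is not None:
    # In XML mode the original case is kept, so search with the exact name.
    xml_mixed = xml_soup.find_all("Row_Stud")
    print(len(xml_mixed), xml_mixed[0].find("Name").get_text())
```

If your real tag names are already all lowercase, as in the question, both parsers will find them; the difference only shows up once the XML contains upper-case letters.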
