python - Scrapy not returning information from an HTML tag


I'm trying to scrape a website with Scrapy. When I scrape the HTML, the XPath I use on the tag that holds the data I need returns nothing.

This is the website ("http://www.exito.com/products/0000293501259261/arroz+fortificado?cid=&page="), and this is the part of the HTML I'm scraping:

<div class="pdpinfoproductprice">
    <meta itemprop="currency" content="cop">
    <h4 itemprop="price" class="price">    $5.350</h4>
</div>

I need Scrapy to read the price from the h4 tag, but when I scrape it the result comes back empty. The element has no tags nested inside it, so this should be a simple thing to do, yet I cannot get the price this way.

These are the XPath expressions I have tried on the page to get the price:

sel.xpath('//*[@id="plpcontent"]/div[3]/div[5]/h4').extract()
sel.xpath('//*[@id="atg_store_two_column_main"]/div[2]').extract()
//*[@id="mainwhitecontent"]/div[2]/div[1]/div[1]/div[1]/div[3]/div[1]/div/h4

On the first request, the web page asks for your region and stores it in a cookie.

Example dialog: http://images.jenserat.de/2014-04-23_0903.png

You can reproduce this either by deleting your cookies or by using another browser or a private browsing session.

To work around this, you have to send a cookie named selectedcity with a region code such as ar. Use this when creating the request:

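from scrapy import Request

# Send the region cookie with the request so the site does not ask for a region.
request = Request(
    url="http://www.exito.com/products/0000293501259261/arroz+fortificado?cid=&page=",
    cookies={'selectedcity': 'ar'},
)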

For the XPath expression, I'd go for

//div[@class='pdpinfoproductprice']/h4[@itemprop='price']/text() 

Also take the answer on matching HTML classes into account: Selecting a CSS class with XPath.

