python 2.7 - Using Scrapy's FormRequest, no form is submitted


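After trying Scrapy's first tutorial I was really excited about it, so I wanted to try form submission as well.

I have the following script, and if I print out response.body I just get the page with the form again, as if nothing happened. Can anyone tell me how to get to the results page?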

    # spiders/holidaytaxi.py
    import scrapy
    from scrapy.http import Request, FormRequest
    from scrapy.selector import HtmlXPathSelector, Selector


    class HolidayTaxiSpider(scrapy.Spider):
        name = "holidaytaxi"
        allowed_domains = ["holidaytaxis.com"]
        start_urls = ['http://holidaytaxis.com/en']

        def parse(self, response):
            return [FormRequest.from_response(
                response,
                formdata={
                    'bookingtypeid': 'return',
                    'airpotzgroupid_chosen': 'turkey',
                    'pickup_chosen': 'antalya airport',
                    'dropoff_chosen': 'alanya',
                    'arrivaldata': '12-07-2015',
                    'arrivalhour': '12',
                    'arrivalmin': '00',
                    'departuredata': '14-07-2015',
                    'departurehour': '12',
                    'departuremin': '00',
                    'adults': '2',
                    'children': '0',
                    'infants': '0'
                },
                callback=self.parseresponse
            )]

        def parseresponse(self, response):
            print "hello world"
            print response.status
            print response
            heading = response.xpath('//div/h2')
            print "heading: ", heading

The output is:

    2015-07-05 16:23:59 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
    2015-07-05 16:24:01 [scrapy] DEBUG: Redirecting (301) to <GET http://www.holidaytaxis.com/en> from <GET http://holidaytaxis.com/en>
    2015-07-05 16:24:02 [scrapy] DEBUG: Crawled (200) <GET http://www.holidaytaxis.com/en> (referer: None)
    2015-07-05 16:24:03 [scrapy] DEBUG: Crawled (200) <POST http://www.holidaytaxis.com/en/search> (referer: http://www.holidaytaxis.com/en)
    hello world
    200
    <200 http://www.holidaytaxis.com/en/search>
    heading:  []

The main problem is in how you are passing the booking type, country, pickup and dropoff. You need to pass the corresponding "id"s instead of the literal strings.

The following would work in your case:

    return FormRequest.from_response(
        response,
        formxpath="//form[@id='transfer_search']",
        formdata={
            'bookingtypeid': '1',
            'airportgroupid': '14',
            'pickup': '121',
            'dropoff': '1076',
            'arrivaldate': '12-07-2015',
            'arrivalhour': '12',
            'arrivalmin': '00',
            'departuredate': '14-07-2015',
            'departurehour': '12',
            'departuremin': '00',
            'adults': '2',
            'children': '0',
            'infants': '0',
            'submit': 'get quote'
        },
        callback=self.parseresponse
    )

Note that I've also fixed the arrivaldate and departuredate parameter names (the question had arrivaldata and departuredata).
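Misspelled field names are easy to miss because, as far as I know, FormRequest.from_response does not complain about keys that are not in the form; it just sends them along as extra fields. A minimal sketch of a check for that, assuming you have the list of field names from the form: unknown_fields is a hypothetical helper, and the names list is a hand-written stand-in built from the field names in this post.

```python
def unknown_fields(formdata, form_field_names):
    """Return the formdata keys that do not match any field name in the form."""
    known = set(form_field_names)
    return sorted(key for key in formdata if key not in known)


# In a spider callback the real names could be collected with something like:
#   names = response.xpath("//form[@id='transfer_search']//*[@name]/@name").extract()
# The list below is a hand-written stand-in, not scraped from the site.
names = ['bookingtypeid', 'airportgroupid', 'pickup', 'dropoff',
         'arrivaldate', 'arrivalhour', 'arrivalmin',
         'departuredate', 'departurehour', 'departuremin',
         'adults', 'children', 'infants', 'submit']

# A shortened version of the formdata from the question.
attempt = {'bookingtypeid': 'return', 'airpotzgroupid_chosen': 'turkey',
           'arrivaldata': '12-07-2015', 'adults': '2'}

print(unknown_fields(attempt, names))
```

This only catches wrong names. A correct name with a wrong value, such as 'bookingtypeid': 'return', still has to be checked against the request the browser sends.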


You may want to ask how I got these ids. Good question - I used the browser developer tools and studied the outgoing POST request issued when the search form is submitted.

Now the real problem is how to get these ids in your Scrapy code. Booking types are easy to handle - there are only 3 types, with ids from 1 to 3. The list of countries is available on the same search form page, in the select tag with id="airportgroupid" - you can construct a mapping dictionary between a country name and its internal id, e.g.:

    countries = {
        option.xpath("@label").extract()[0]: option.xpath("@value").extract()[0]
        for option in response.xpath("//select[@id='airportgroupid']//option")
    }

    country_id = countries["turkey"]
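A lookup by exact label is a bit brittle: the key has to match the option label character for character, and extract()[0] raises an IndexError for any option that lacks the attribute. Here is a minimal sketch of a more forgiving mapping, assuming you have already pulled (label, value) pairs out of the option elements. build_lookup is a hypothetical helper, and in sample_pairs only Turkey with id 14 comes from the request above; the other entries are made up.

```python
def build_lookup(pairs):
    """Map normalised labels (trimmed, lower-cased) to their option values."""
    lookup = {}
    for label, value in pairs:
        key = label.strip().lower()
        if key:  # skip placeholder options with an empty label
            lookup[key] = value
    return lookup


# Stand-in for the (label, value) pairs read from the <option> elements.
# Only Turkey -> 14 is taken from the post; the rest is invented.
sample_pairs = [('', ''), (' Turkey ', '14'), ('Exampleland', '999')]

countries = build_lookup(sample_pairs)
country_id = countries['turkey']
print(country_id)
```

If two options ever normalise to the same key, the later one wins, so this is only suitable when the labels are unique.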

It gets more difficult with the pickup and dropoff locations - they depend on the booking type and country, and are retrieved with additional XHR requests to the "http://www.holidaytaxis.com/en/search/getpickup" and "http://www.holidaytaxis.com/en/search/getdropoff" endpoints.
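However the four ids end up being fetched, it may help to keep the final payload in one place. This is a rough sketch, assuming the field names from the working request above are still what the form expects; build_formdata and its parameters are hypothetical names used only for illustration.

```python
def build_formdata(booking_type_id, country_id, pickup_id, dropoff_id,
                   arrival, departure, adults=2, children=0, infants=0):
    """Assemble the search payload; arrival and departure are (date, hour, minute)."""
    arrival_date, arrival_hour, arrival_min = arrival
    departure_date, departure_hour, departure_min = departure
    fields = {
        'bookingtypeid': booking_type_id,
        'airportgroupid': country_id,
        'pickup': pickup_id,
        'dropoff': dropoff_id,
        'arrivaldate': arrival_date,
        'arrivalhour': arrival_hour,
        'arrivalmin': arrival_min,
        'departuredate': departure_date,
        'departurehour': departure_hour,
        'departuremin': departure_min,
        'adults': adults,
        'children': children,
        'infants': infants,
        'submit': 'get quote',
    }
    # Form values are sent as text, so convert every value to a string.
    return {name: str(value) for name, value in fields.items()}


# The ids and dates below are the ones from the working request in this post.
formdata = build_formdata(1, 14, 121, 1076,
                          arrival=('12-07-2015', '12', '00'),
                          departure=('14-07-2015', '12', '00'))
print(formdata['pickup'])
```

The resulting dictionary can then be passed as the formdata argument of FormRequest.from_response in the callback that runs once the pickup and dropoff ids are known.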

