python - Scrapy crawler: caught exception reading instance data
I am new to Python and want to use Scrapy to build a web crawler. I am going through the tutorial at http://blog.siliconstraits.vn/building-web-crawler-scrapy/. The spider code looks like the following:
    from scrapy.spider import BaseSpider
    from scrapy.selector import HtmlXPathSelector
    from nettuts.items import NettutsItem
    from scrapy.http import Request

    class MySpider(BaseSpider):
        name = "nettuts"
        allowed_domains = ["net.tutsplus.com"]
        start_urls = ["http://net.tutsplus.com/"]

        def parse(self, response):
            hxs = HtmlXPathSelector(response)
            titles = hxs.select('//h1[@class="post_title"]/a/text()').extract()
            for title in titles:
                item = NettutsItem()
                item["title"] = title
                yield item
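For reference, the NettutsItem imported above would live in nettuts/items.py. A minimal sketch of that file, assuming only the "title" field the spider actually uses (the question does not show the real file), would be:

    # nettuts/items.py -- minimal sketch; only the "title" field is
    # implied by the spider code above.
    from scrapy.item import Item, Field

    class NettutsItem(Item):
        title = Field()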
When I launch the spider from the command line with scrapy crawl nettuts, I get the following error:
    [boto] DEBUG: Retrieving credentials from metadata server.
    2015-07-05 18:27:17 [boto] ERROR: Caught exception reading instance data
    Traceback (most recent call last):
      File "/anaconda/lib/python2.7/site-packages/boto/utils.py", line 210, in retry_url
        r = opener.open(req, timeout=timeout)
      File "/anaconda/lib/python2.7/urllib2.py", line 431, in open
        response = self._open(req, data)
      File "/anaconda/lib/python2.7/urllib2.py", line 449, in _open
        '_open', req)
      File "/anaconda/lib/python2.7/urllib2.py", line 409, in _call_chain
        result = func(*args)
      File "/anaconda/lib/python2.7/urllib2.py", line 1227, in http_open
        return self.do_open(httplib.HTTPConnection, req)
      File "/anaconda/lib/python2.7/urllib2.py", line 1197, in do_open
        raise URLError(err)
    URLError: <urlopen error [Errno 65] No route to host>
    2015-07-05 18:27:17 [boto] ERROR: Unable to read instance data, giving up
I really don't know what's wrong. I hope someone can help.
In the settings.py file, add the following setting. It disables Scrapy's S3 download handler, so boto no longer tries to read AWS credentials from the EC2 instance metadata server, which is what produces the "No route to host" error above:
    DOWNLOAD_HANDLERS = {'s3': None,}
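For context, a minimal settings.py carrying this fix might look like the sketch below. The BOT_NAME and module paths are assumptions based on the project name used in the tutorial, not shown in the question:

    # nettuts/settings.py -- sketch under assumed project layout.
    BOT_NAME = 'nettuts'
    SPIDER_MODULES = ['nettuts.spiders']
    NEWSPIDER_MODULE = 'nettuts.spiders'

    # Mapping the 's3' scheme to None disables that download handler,
    # so boto never probes the EC2 metadata server for credentials.
    DOWNLOAD_HANDLERS = {'s3': None,}

Since the spider only crawls http:// URLs, disabling the s3 handler loses nothing here.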