cd Desktop/projects
Create the project
projects % scrapy startproject yodobashi
Create a new spider
yodobashi % scrapy genspider desktop www.yodobashi.com/category/19531/11970/34646
Settings
/projects/yodobashi/yodobashi/settings.py
・Add: FEED_EXPORT_ENCODING = 'utf-8'
・Uncomment:
DOWNLOAD_DELAY = 3
HTTPCACHE_ENABLED = True
HTTPCACHE_EXPIRATION_SECS = 86400
HTTPCACHE_DIR = 'httpcache'
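For reference, here is a minimal sketch of how these values would look together in settings.py, with comments on what each one does. The json part at the end is only an illustration using the standard library (not Scrapy itself) and a made-up item; it shows the kind of difference FEED_EXPORT_ENCODING = 'utf-8' makes for Japanese text in JSON output.

```python
import json

# Values from the notes, as they would appear in settings.py.
FEED_EXPORT_ENCODING = "utf-8"     # keep non-ASCII text readable in exported feeds
DOWNLOAD_DELAY = 3                 # seconds to wait between requests to the same site
HTTPCACHE_ENABLED = True           # reuse cached responses while developing
HTTPCACHE_EXPIRATION_SECS = 86400  # cached responses expire after 24 hours
HTTPCACHE_DIR = "httpcache"        # cache directory name

# Illustration only: a hypothetical item serialized with the standard json module.
item = {"name": "デスクトップ"}
escaped = json.dumps(item)                       # non-ASCII becomes \uXXXX escapes
readable = json.dumps(item, ensure_ascii=False)  # non-ASCII is written as-is

print(escaped)
print(readable)
```

The first printed line contains \uXXXX escape sequences, the second contains the Japanese text itself.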
/Users/do/Desktop/projects/yodobashi/yodobashi/spiders/desktop.py
・Keep only the domain: allowed_domains = ['www.yodobashi.com']
・Use https: start_urls = ['https://www.yodobashi.com/category/19531/11970/34646/']
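The two manual fixes above (domain only, https) can be sketched with the standard library. spider_targets is a hypothetical helper, not part of Scrapy; forcing https and appending a trailing slash are assumptions chosen to match the values in these notes.

```python
from urllib.parse import urlsplit


def spider_targets(url: str) -> tuple[list[str], list[str]]:
    """Return (allowed_domains, start_urls) for a URL given with or without a scheme."""
    # urlsplit only recognizes the host when a scheme is present.
    if "://" not in url:
        url = "https://" + url
    parts = urlsplit(url)
    # Assumption: end the path with a slash, as in the start_urls value in the notes.
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    # Assumption: always use https, whatever scheme was passed in.
    start_url = f"https://{parts.netloc}{path}"
    return [parts.netloc], [start_url]


allowed_domains, start_urls = spider_targets(
    "www.yodobashi.com/category/19531/11970/34646"
)
print(allowed_domains)  # ['www.yodobashi.com']
print(start_urls)       # ['https://www.yodobashi.com/category/19531/11970/34646/']
```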
cd Desktop/projects/yodobashi/
Run the crawl
yodobashi % scrapy crawl desktop
Output to a file
yodobashi % scrapy crawl desktop -o data.json
yodobashi % scrapy crawl desktop -o "%(name)s_%(time)s.json"
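The %(name)s and %(time)s placeholders are filled in with the spider name and the time the feed is created. A rough sketch of that substitution in plain Python follows; expand_feed_uri is a hypothetical helper, and the timestamp format (ISO 8601 with colons replaced by hyphens) is an assumption about how Scrapy formats it, not copied from its source.

```python
from datetime import datetime


def expand_feed_uri(template: str, spider_name: str, now: datetime) -> str:
    """Fill %(name)s and %(time)s in a feed URI template."""
    # Assumed format: drop microseconds, then make the timestamp filename-safe.
    timestamp = now.replace(microsecond=0).isoformat().replace(":", "-")
    return template % {"name": spider_name, "time": timestamp}


# Fixed example time so the result is reproducible.
example = expand_feed_uri(
    "%(name)s_%(time)s.json", "desktop", datetime(2024, 1, 2, 3, 4, 5)
)
print(example)  # desktop_2024-01-02T03-04-05.json
```

Because the file name changes on every run, earlier output files are not overwritten.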
Deploy to Zyte
https://app.zyte.com/p/568155/deploy?state=deploy
$ pip install shub
$ shub login
API key:
$ shub deploy 568155