While writing a Scrapy spider, I ran into pages that redirect via JavaScript. Does anyone have a good way to handle this?
The answer:
[splash](https://github.com/scrapinghub/splash)
Splash is a JavaScript rendering service: a lightweight browser with an HTTP API, implemented in Python 3 using Twisted and Qt 5.
It is fast, lightweight, and stateless, which makes it easy to distribute.
Documentation is available at https://splash.readthedocs.io/
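To make the "HTTP API" part concrete, here is a minimal sketch of asking Splash for a rendered page through its `render.html` endpoint, using only the standard library. It assumes a Splash instance is already running at `http://localhost:8050` (the default port, for example when started from the Docker image); the helper names `build_render_url` and `fetch_rendered_html` are made up for this illustration, and the `wait` value is a guess that may need to be raised so a JavaScript redirect has time to fire.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a Splash instance listening locally on its default port.
SPLASH_BASE = "http://localhost:8050"


def build_render_url(page_url, wait=0.5, splash_base=SPLASH_BASE):
    """Build a URL for Splash's render.html endpoint.

    `wait` is the number of seconds Splash waits after the page loads
    before returning the HTML.
    """
    query = urlencode({"url": page_url, "wait": wait})
    return f"{splash_base}/render.html?{query}"


def fetch_rendered_html(page_url, wait=0.5, timeout=30):
    """Return the rendered HTML as text (needs a running Splash instance)."""
    with urlopen(build_render_url(page_url, wait), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `fetch_rendered_html("https://example.com/")` should return the page's HTML after its scripts have run, provided Splash is reachable at the assumed address.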
[scrapy-splash](https://github.com/scrapy-plugins/scrapy-splash)
This library provides Scrapy and JavaScript integration using Splash. The license is BSD 3-clause.
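As a rough sketch of what the integration looks like, the fragment below shows the `settings.py` entries that the scrapy-splash README asks for. The `SPLASH_URL` value is an assumption (a local Splash instance on the default port) and should point at wherever Splash actually runs; the middleware priorities are the ones the README suggests, not values specific to this post.

```python
# settings.py fragment for a Scrapy project using scrapy-splash.

# Assumption: Splash is running locally on its default port.
SPLASH_URL = "http://localhost:8050"

# Splash middlewares must run before HttpCompressionMiddleware (810).
DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}

# Avoids storing duplicate Splash arguments in the request queue.
SPIDER_MIDDLEWARES = {
    "scrapy_splash.SplashDeduplicateArgsMiddleware": 100,
}

# Splash-aware duplicate filter and HTTP cache storage.
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"
HTTPCACHE_STORAGE = "scrapy_splash.SplashAwareFSCacheStorage"
```

With these in place, a spider can yield `scrapy_splash.SplashRequest(url, callback, args={"wait": 0.5})` where it would otherwise yield `scrapy.Request`, so the callback receives the page after its JavaScript has run.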
References:
- [When writing a Scrapy spider, I ran into pages that redirect via JS; does anyone have a good solution?](https://www.v2ex.com/t/485870#)
This entry was posted on [September 4, 2018](https://c4ys.com/archives/1552 "10:04") in the [Python](https://c4ys.com/archives/category/python) category and tagged Python, scrapy, [splash](https://c4ys.com/archives/tag/splash).