Setting the Request Referer in Scrapy

Yishto 2021-08-20 21:46:43

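If you want to change the Referer header that your spider sends with its requests, you can set DEFAULT_REQUEST_HEADERS in settings.py. Headers defined there are applied to every request in the project unless a request sets the same header itself.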

Examples:

Method 1 (project-wide, in settings.py):

DEFAULT_REQUEST_HEADERS = {
    'Referer': 'http://www.google.com'
}
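
One point worth checking with Method 1: defining DEFAULT_REQUEST_HEADERS in settings.py replaces Scrapy's built-in default dictionary rather than merging with it. The fragment below is a minimal sketch that assumes you still want the stock Accept and Accept-Language headers next to the Referer. Those two values are copied from Scrapy's documented defaults and are an assumption here, so compare them against the version you have installed.

```python
# settings.py (fragment)
# Assumption: the Accept and Accept-Language values mirror Scrapy's documented
# defaults; they are not part of the original example.
DEFAULT_REQUEST_HEADERS = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en',
    # The header this article is about.
    'Referer': 'http://www.google.com',
}
```

A header passed directly to a Request, as in Method 2, takes precedence over the value set here, because the default headers are only filled in when the request does not already carry them.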

Method 2 (per request, in the spider):

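# The old scrapy.contrib.spiders path is deprecated; import from scrapy.spiders.
from scrapy.spiders import CrawlSpider
from scrapy.http import Request

class MySpider(CrawlSpider):
    name = "myspider"
    allowed_domains = ["example.com"]
    # Each URL needs a trailing comma; without it Python joins the
    # adjacent string literals into a single invalid URL.
    start_urls = [
        'http://example.com/foo',
        'http://example.com/bar',
        'http://example.com/baz',
    ]
    rules = [(...)]

    def start_requests(self):
        # Attach the Referer header to every initial request.
        requests = []
        for item in self.start_urls:  # start_urls is a class attribute, so go through self
            requests.append(Request(url=item, headers={'Referer': 'http://www.example.com/'}))
        return requests

    def parse_me(self, response):
        (...)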