How to use multiple requests and pass an item between them in Scrapy (Python)

I have an item object that I need to pass through several pages so that the data from all of them is stored in a single item.

For example, my item is:

from scrapy.item import Item, Field

class DmozItem(Item):
    title = Field()
    description1 = Field()
    description2 = Field()
    description3 = Field()

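Now, these three descriptions are on three separate pages. I want to do something like the following.

Right now this works fine for parseDescription1: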

def page_parser(self, response):
    sites = response.xpath('//div[@class="row"]')
    item = DmozItem()
    request = Request("http://www.example.com/lin1.cpp", callback=self.parseDescription1)
    request.meta['item'] = item
    return request

def parseDescription1(self, response):
    item = response.meta['item']
    item['description1'] = "test"
    return item

But I want something like this:

def page_parser(self, response):
    sites = response.xpath('//div[@class="row"]')
    item = DmozItem()
    request = Request("http://www.example.com/lin1.cpp", callback=self.parseDescription1)
    request.meta['item'] = item
    request = Request("http://www.example.com/lin1.cpp", callback=self.parseDescription2)
    request.meta['item'] = item
    request = Request("http://www.example.com/lin1.cpp", callback=self.parseDescription3)
    request.meta['item'] = item
    return request

def parseDescription1(self, response):
    item = response.meta['item']
    item['description1'] = "test"
    return item

def parseDescription2(self, response):
    item = response.meta['item']
    item['description2'] = "test2"
    return item

def parseDescription3(self, response):
    item = response.meta['item']
    item['description3'] = "test3"
    return item

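Answer:

No problem. Here is a corrected version of your code. The key change is to yield each request as it is created, instead of overwriting a single variable and returning only the last request: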

from scrapy.http import Request

def page_parser(self, response):
    # Create the item once; every request below carries the same object.
    item = DmozItem()

    # The URLs are placeholders. In practice each description lives on its own
    # page; identical URLs would be dropped by Scrapy's duplicate filter.
    request = Request("http://www.example.com/lin1.cpp", callback=self.parseDescription1)
    request.meta['item'] = item
    yield request

    # meta can also be passed directly to the Request constructor.
    request = Request("http://www.example.com/lin1.cpp", callback=self.parseDescription2, meta={'item': item})
    yield request

    yield Request("http://www.example.com/lin1.cpp", callback=self.parseDescription3, meta={'item': item})

def parseDescription1(self, response):
    item = response.meta['item']
    item['description1'] = "test"
    return item

def parseDescription2(self, response):
    item = response.meta['item']
    item['description2'] = "test2"
    return item

def parseDescription3(self, response):
    item = response.meta['item']
    item['description3'] = "test3"
    return item

Source: utcz.com/qa/407183.html
