Review: XPath syntax
Task: scrape the listing information from the 58.com second-hand housing pages (58二手房)
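As a quick refresher, the handful of XPath features this post relies on can be tried on a small inline document. This is only a sketch: the HTML in `SAMPLE_HTML` is made up for illustration and is not the real 58.com markup, and the variable names are hypothetical.

```python
from lxml import etree

# Made-up HTML used only to illustrate the syntax; NOT the real 58.com markup.
SAMPLE_HTML = """
<html><body>
  <section class="list">
    <div class="item"><h3>Listing A</h3><a href="/a.html">details</a></div>
    <div class="item"><h3>Listing B</h3><a href="/b.html">details</a></div>
  </section>
  <section class="ads"><h3>Sponsored</h3></section>
</body></html>
"""

tree = etree.HTML(SAMPLE_HTML)

# // matches nodes at any depth; [@class="..."] filters on an attribute value;
# /text() returns the text nodes instead of the elements.
titles = tree.xpath('//section[@class="list"]//h3/text()')

# /@href returns attribute values.
links = tree.xpath('//section[@class="list"]//a/@href')

# [1] selects the first matching child (XPath positions start at 1, not 0).
first_title = tree.xpath('//section[@class="list"]/div[1]/h3/text()')

# A path starting with ./ is evaluated relative to the current element.
items = tree.xpath('//div[@class="item"]')
relative_titles = [item.xpath('./h3/text()')[0] for item in items]

print(titles, links, first_title, relative_titles)
```

Note that the `h3` inside `<section class="ads">` is not matched, because the attribute predicate restricts the search to the `list` section.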
import requests
from lxml import etree

if __name__ == "__main__":
    # Send a browser-like User-Agent with the request
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36'
    }
    url = 'https://bj.58.com/ershoufang/'
    page_text = requests.get(url=url, headers=headers).text

    # Parse the page and collect the text of every <h3> under <section class="list">
    tree = etree.HTML(page_text)
    div_list = tree.xpath('//section[@class="list"]//h3/text()')
    print(div_list)

    # Write one title per line; the with-block closes the file when done
    with open('58.txt', 'w', encoding='utf-8') as fp:
        for item in div_list:
            fp.write(item + '\n')
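The parsing and saving steps of the script can also be pulled into small functions so they can be checked without hitting the network. The following is a minimal sketch under stated assumptions: `extract_titles` and `save_lines` are hypothetical helper names, the sample HTML is invented, and only the XPath expression is taken from the script above. The real page structure may differ or change over time.

```python
import os
import tempfile

from lxml import etree


def extract_titles(page_text):
    """Return the stripped, non-empty <h3> texts under <section class="list">."""
    tree = etree.HTML(page_text)
    raw = tree.xpath('//section[@class="list"]//h3/text()')
    # Drop whitespace-only text nodes and trim padding around each title.
    return [text.strip() for text in raw if text.strip()]


def save_lines(lines, path):
    """Write one item per line, UTF-8 encoded."""
    with open(path, 'w', encoding='utf-8') as fp:
        for line in lines:
            fp.write(line + '\n')


# Invented sample markup (an assumption, not copied from 58.com).
sample = (
    '<section class="list">'
    '<div><h3> Title one </h3></div>'
    '<div><h3>Title two</h3></div>'
    '<div><h3>  </h3></div>'
    '</section>'
)

titles = extract_titles(sample)
out_path = os.path.join(tempfile.gettempdir(), '58_sample.txt')
save_lines(titles, out_path)
print(titles)
```

With the live site, `extract_titles` would be called on the `page_text` returned by `requests.get`. If it returns an empty list, the response may not contain the expected `section` element, so it is worth inspecting the raw HTML before adjusting the XPath.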