Parsing 58.com Second-Hand Housing Listings with XPath [★★]

By yesmore on 2021-07-23

Review: XPath syntax
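
As a quick refresher, the sketch below runs a few common XPath expressions against a small inline HTML snippet with lxml. The snippet and its class names are made up for illustration; they only imitate the general shape of a listing page and are not 58.com's actual markup.

```python
from lxml import etree

# Hypothetical HTML snippet, made up for illustration only
SAMPLE_HTML = """
<html><body>
  <section class="list">
    <div class="item"><h3 title="Listing A">Listing A</h3><a href="/a.html">detail</a></div>
    <div class="item"><h3 title="Listing B">Listing B</h3><a href="/b.html">detail</a></div>
  </section>
  <section class="other"><h3>Unrelated heading</h3></section>
</body></html>
"""

tree = etree.HTML(SAMPLE_HTML)

# // searches at any depth; [@class="..."] filters by attribute value;
# text() returns the text nodes as plain strings
titles = tree.xpath('//section[@class="list"]//h3/text()')

# /@attr returns attribute values instead of text
links = tree.xpath('//section[@class="list"]//a/@href')

# A relative path (starting with ./) is evaluated from the current element
first_item = tree.xpath('//div[@class="item"]')[0]
first_title = first_item.xpath('./h3/text()')[0]

print(titles)
print(links)
print(first_title)
```

Testing an expression on a local snippet like this makes it easier to tell a wrong XPath apart from a page that simply came back different from what was expected.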

Goal: scrape the listing information from the 58.com second-hand housing page.

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests
from lxml import etree

# Goal: scrape the listing titles from the 58.com second-hand housing page

if __name__ == "__main__":
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36'
    }
    # Fetch the page source
    url = 'https://bj.58.com/ershoufang/'
    page_text = requests.get(url=url, headers=headers).text

    # Parse the HTML
    tree = etree.HTML(page_text)
    # text() returns a list of strings (the listing titles), not element objects
    title_list = tree.xpath('//section[@class="list"]//h3/text()')
    print(title_list)
    # Write one title per line; the with block closes the file when done
    with open('58.txt', 'w', encoding='utf-8') as fp:
        for item in title_list:
            fp.write(item + '\n')
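
The strings returned by text() often carry leading or trailing whitespace, and some may be empty. Below is a minimal sketch of a cleanup step that could run before the file is written. The helper name clean_titles is hypothetical (it is not part of the original script), and the sample input is made up.

```python
def clean_titles(raw_titles):
    """Strip surrounding whitespace and drop entries that end up empty."""
    cleaned = []
    for title in raw_titles:
        title = title.strip()
        if title:
            cleaned.append(title)
    return cleaned


# Made-up sample input imitating what tree.xpath(...) might return
sample = ['  Listing A \n', '', '\n\t', 'Listing B']
result = clean_titles(sample)
print(result)
```

In the script, this would be applied to the list returned by tree.xpath(...) before the write loop, so that 58.txt contains one clean title per line.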


Tips: Please indicate the source and original author when reprinting or quoting this article.