scrapy 2.3 选择元素属性

有几种方法可以获得属性的值。首先，可以使用XPath语法：

>>> response.xpath("//a/@href").getall()
['image1.html', 'image2.html', 'image3.html', 'image4.html', 'image5.html']

xpath语法有几个优点：它是标准的xpath特性，并且 @attributes 可用于xpath表达式的其他部分-例如，可以按属性值筛选。

scrapy还提供了对css选择器的扩展 (::attr(...) )它允许获取属性值：

>>> response.css('a::attr(href)').getall()
['image1.html', 'image2.html', 'image3.html', 'image4.html', 'image5.html']

除此之外，还有 .attrib 选择器的属性。如果您喜欢在Python代码中查找属性，而不使用xpath或CSS扩展，则可以使用它：

>>> [a.attrib['href'] for a in response.css('a')]
['image1.html', 'image2.html', 'image3.html', 'image4.html', 'image5.html']

此属性在SelectorList上也可用；它返回一个字典，其中包含第一个匹配元素的属性。当选择器预期给出单个结果时（例如，当按元素ID选择时，或在页面上选择唯一元素时），使用它非常方便：

>>> response.css('base').attrib
{'href': 'http://example.com/'}
>>> response.css('base').attrib['href']
'http://example.com/'

.attrib 空SelectorList的属性为空：

>>> response.css('foo').attrib
{}

w3cschool 编程狮，随时随地学编程

scrapy 2.3 选择元素属性

scrapy 2.3 安装指南

scrapy 2.3 教程

scrapy 2.3 命令行工具

scrapy 2.3 蜘蛛

scrapy 2.3 选择器

scrapy 2.3 使用选择器

scrapy 2.3 使用xpaths

scrapy 2.3 使用exslt扩展

scrapy 2.3 内置选择器引

scrapy 2.3 选择器实例

scrapy 2.3 项目

scrapy 2.3 项目类型

scrapy 2.3 使用项目对象

scrapy 2.3 使用项目对象

scrapy 2.3 项目加载器

scrapy 2.3 shell

scrapy 2.3 shell使用外壳

scrapy 2.3 项目管道

scrapy 2.3 项目管道示例

scrapy 2.3 Feed导出

scrapy 2.3 请求和响应

无标题文章

scrapy 2.3 请求子类

scrapy 2.3 链接提取器

scrapy 2.3 设置

scrapy 2.3 登录

scrapy 2.3 日志记录配置

scrapy 2.3 统计数据集合

scrapy 2.3 发送电子邮件

scrapy 2.3 远程登录控制台

scrapy 2.3 常见问题

scrapy 2.3 调试spiders

scrapy 2.3 蜘蛛合约

scrapy 2.3 常用做法

scrapy 2.3 宽爬行

scrapy 2.3 使用浏览器的开发人员工具进行抓取

scrapy 2.3 选择动态加载的内容

scrapy 2.3 调试内存泄漏

scrapy 2.3 下载和处理文件和图像

scrapy 2.3 如何部署蜘蛛

scrapy 2.3 AutoThrottle扩展