scrapy 2.3 在条件中使用文本节点

当需要将文本内容用作 XPath string function 避免使用 .//text() and use just . 相反。

这是因为表达式 .//text() 生成一个文本元素集合--a node-set . 当一个节点集被转换成一个字符串时，当它作为参数传递给一个字符串函数（如 contains() 或 starts-with() ，它只为第一个元素生成文本。

例子：

>>> from scrapy import Selector
>>> sel = Selector(text='<a href="#">Click here to go to the <strong>Next Page</strong></a>')

转换A node-set 字符串：

>>> sel.xpath('//a//text()').getall() # take a peek at the node-set
['Click here to go to the ', 'Next Page']
>>> sel.xpath("string(//a[1]//text())").getall() # convert it to string
['Click here to go to the ']

A node 但是，转换为字符串后，会将其自身的文本加上其所有后代的文本组合在一起：

>>> sel.xpath("//a[1]").getall() # select the first node
['<a href="#">Click here to go to the <strong>Next Page</strong></a>']
>>> sel.xpath("string(//a[1])").getall() # convert it to string
['Click here to go to the Next Page']

所以，使用 .//text() 在这种情况下，节点集不会选择任何内容：

>>> sel.xpath("//a[contains(.//text(), 'Next Page')]").getall()
[]

但是使用 . 指的是节点：

>>> sel.xpath("//a[contains(., 'Next Page')]").getall()
['<a href="#">Click here to go to the <strong>Next Page</strong></a>']

w3cschool 编程狮，随时随地学编程

scrapy 2.3 在条件中使用文本节点

scrapy 2.3 安装指南

scrapy 2.3 教程

scrapy 2.3 命令行工具

scrapy 2.3 蜘蛛

scrapy 2.3 选择器

scrapy 2.3 使用选择器

scrapy 2.3 使用xpaths

scrapy 2.3 使用exslt扩展

scrapy 2.3 内置选择器引

scrapy 2.3 选择器实例

scrapy 2.3 项目

scrapy 2.3 项目类型

scrapy 2.3 使用项目对象

scrapy 2.3 使用项目对象

scrapy 2.3 项目加载器

scrapy 2.3 shell

scrapy 2.3 shell使用外壳

scrapy 2.3 项目管道

scrapy 2.3 项目管道示例

scrapy 2.3 Feed导出

scrapy 2.3 请求和响应

无标题文章

scrapy 2.3 请求子类

scrapy 2.3 链接提取器

scrapy 2.3 设置

scrapy 2.3 登录

scrapy 2.3 日志记录配置

scrapy 2.3 统计数据集合

scrapy 2.3 发送电子邮件

scrapy 2.3 远程登录控制台

scrapy 2.3 常见问题

scrapy 2.3 调试spiders

scrapy 2.3 蜘蛛合约

scrapy 2.3 常用做法

scrapy 2.3 宽爬行

scrapy 2.3 使用浏览器的开发人员工具进行抓取

scrapy 2.3 选择动态加载的内容

scrapy 2.3 调试内存泄漏

scrapy 2.3 下载和处理文件和图像

scrapy 2.3 如何部署蜘蛛

scrapy 2.3 AutoThrottle扩展