I have the following XML:
<document>
<internal-code code="201">
<internal-desc>Biscuits Wrapped</internal-desc>
<top-grouping>Finished</top-grouping>
<web-category>Biscuits</web-category>
<web-sub-category>Biscuits (Wrapped)</web-sub-category>
</internal-code>
<internal-code code="202">
<internal-desc>Biscuits Sweet</internal-desc>
<top-grouping>Finished</top-grouping>
<web-category>Biscuits</web-category>
<web-sub-category>Biscuits (Sweets)</web-sub-category>
</internal-code>
<internal-code code="221">
<internal-desc>Biscuits Savoury</internal-desc>
<top-grouping>Finished</top-grouping>
<web-category>Biscuits</web-category>
<web-sub-category>Biscuits For Cheese</web-sub-category>
</internal-code>
....
</document>
I have loaded it into a tree using the following code:
try:
    groups = etree.parse(PRODUCT_GROUPS_XML_FILEPATH)
    root = groups.getroot()
    internalGroup = root.findall("./internal-code")
    LOG.append("[INFO] product groupings file loaded and parsed ok")
except Exception as e:
    LOG.append("[ERROR] PRODUCT GROUPINGS XML FILE ACCESS PROBLEM")
    LOG.append("[***TERMINATED***]")
    writelog()
    exit()
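I want to use XPath to find the right internal-code element and then access that group's child nodes. So if I am searching for internal code 221 and want the web category, I would do something like this: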
internalGroup.find("internal-code", 221).get("web-category").text
I am not experienced with XML or Python, and I have been staring at this for a long time. Any help is much appreciated. Thanks.
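According to the xml.etree.ElementTree documentation:
XPath support
This module provides limited support for XPath expressions for locating elements in a tree. The goal is to support a small subset of the abbreviated syntax; a full XPath engine is outside the scope of the module.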
Use lxml:
>>> import lxml.etree as ET
>>>
>>> s = '''
... <document>
... <internal-code code="201">
... <internal-desc>Biscuits Wrapped</internal-desc>
... <top-grouping>Finished</top-grouping>
... <web-category>Biscuits</web-category>
... <web-sub-category>Biscuits (Wrapped)</web-sub-category>
... </internal-code>
... <internal-code code="202">
... <internal-desc>Biscuits Sweet</internal-desc>
... <top-grouping>Finished</top-grouping>
... <web-category>Biscuits</web-category>
... <web-sub-category>Biscuits (Sweets)</web-sub-category>
... </internal-code>
... <internal-code code="221">
... <internal-desc>Biscuits Savoury</internal-desc>
... <top-grouping>Finished</top-grouping>
... <web-category>Biscuits</web-category>
... <web-sub-category>Biscuits For Cheese</web-sub-category>
... </internal-code>
... </document>
... '''
>>>
>>> root = ET.fromstring(s)
>>> for text in root.xpath('.//internal-code[@code="221"]/web-category/text()'):
...     print(text)
...
Biscuits
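For this particular lookup, the limited XPath subset in xml.etree.ElementTree may already be enough, because it supports attribute predicates of the form [@attrib='value']. The sketch below is a standard-library variant under that assumption; child_text is a hypothetical helper name, and the XML is a trimmed copy of the sample from the question.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document from the question.
XML = """
<document>
  <internal-code code="202">
    <web-category>Biscuits</web-category>
    <web-sub-category>Biscuits (Sweets)</web-sub-category>
  </internal-code>
  <internal-code code="221">
    <web-category>Biscuits</web-category>
    <web-sub-category>Biscuits For Cheese</web-sub-category>
  </internal-code>
</document>
"""


def child_text(root, code, tag):
    """Hypothetical helper: text of child `tag` under the internal-code
    element whose code attribute equals `code`, or None if nothing matches."""
    # [@code='...'] is an attribute predicate, part of ElementTree's XPath subset.
    return root.findtext(f"./internal-code[@code='{code}']/{tag}")


root = ET.fromstring(XML)
print(child_text(root, 221, "web-sub-category"))  # Biscuits For Cheese
print(child_text(root, 999, "web-category"))      # None (no such code)
```

findtext returns None when no element matches, so a missing code can be handled without catching an exception. For anything beyond simple predicates (for example text() in the path, as in the lxml example above), lxml remains the safer choice.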