Given some HTML code, say:
<div class="class1">
<span class="class2">some text</span>
<span class="class3">some text</span>
<span class="class4">some text</span>
</div>
How can I retrieve all of the class names? That is: ['class1', 'class2', 'class3', 'class4']
I tried:
soup.find_all(class_=True)
But it retrieves the whole tags, and I would then need to run some regex over the strings.
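You can treat each Tag instance that is found as a dictionary (https://www.crummy.com/software/BeautifulSoup/bs4/doc/#attributes) when it comes to retrieving attributes. Note that the value of the class attribute will be a list, since class is a special "multi-valued" attribute (https://www.crummy.com/software/BeautifulSoup/bs4/doc/#multi-valued-attributes):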
classes = []
for element in soup.find_all(class_=True):
    classes.extend(element["class"])
Or:
classes = [value
           for element in soup.find_all(class_=True)
           for value in element["class"]]
Demo:
from bs4 import BeautifulSoup
data = """
<div class="class1">
<span class="class2">some text</span>
<span class="class3">some text</span>
<span class="class4">some text</span>
</div>
"""
soup = BeautifulSoup(data, "html.parser")
classes = [value
           for element in soup.find_all(class_=True)
           for value in element["class"]]
print(classes)
# Prints
# ['class1', 'class2', 'class3', 'class4']
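Because class is multi-valued, an element such as class="label highlighted" contributes two entries, and the same name can show up more than once across the document. If only the distinct names are wanted, one possible approach is to deduplicate afterwards. The following is a minimal sketch, assuming that first-seen order should be kept; the sample markup and the variable names all_classes and unique_classes are made up for illustration and are not part of the original question:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with repeated and multi-valued classes.
html = """
<div class="box highlighted">
  <span class="label">some text</span>
  <span class="label highlighted">some text</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Flatten the per-element class lists, in document order.
all_classes = [
    value
    for element in soup.find_all(class_=True)
    for value in element["class"]
]

# dict.fromkeys drops duplicates while keeping first-seen order.
unique_classes = list(dict.fromkeys(all_classes))

print(all_classes)
print(unique_classes)
```

If order does not matter, set(all_classes) would work as well.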