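I think the key point here is that .find_next_sibling() only considers tags: it skips text nodes and returns the next sibling that is a Tag. .next_element and .next_sibling, on the other hand, return the very next node in the parse tree, and that node can be a NavigableString such as the whitespace between two tags.
So if you print the names of the elements, you will see that the next element and the next sibling are not tags (their name is None), because a whitespace text node comes before the next tag: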
for div in soup.find_all('div', class_="one-ad-title"):
    print('-> ', div.next_element.name)
    print('-> ', div.next_sibling.name)
    print('-> ', div.find_next_sibling().name)
Output:
-> None
-> None
-> div
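To make the whitespace effect visible, here is a minimal sketch. The markup is a hypothetical stand-in for the original input (it assumes line breaks between the tags, and the "wrapper" class and "#" href are placeholders), and it uses the built-in "html.parser" instead of lxml so that nothing beyond bs4 is required:

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Hypothetical markup with line breaks between the tags (an assumption;
# the class name "wrapper" and the href "#" are placeholders).
html = """<div class="wrapper">
<div class="one-ad-title">
<a class="one-ad-link" href="#"><h5>Text needed</h5></a>
</div>
<div class="one-ad-desc">...and some more needed text here!</div>
</div>"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="one-ad-title")

next_elem = div.next_element        # first node parsed after the opening tag
next_sib = div.next_sibling         # next node on the same level, text included
next_tag = div.find_next_sibling()  # next node on the same level that is a Tag

# repr() makes the otherwise invisible whitespace strings visible.
print(type(next_elem).__name__, repr(next_elem))
print(type(next_sib).__name__, repr(next_sib))
print(type(next_tag).__name__, next_tag.get("class"))
```

The first two results should be whitespace-only strings rather than tags, which is why asking them for a tag name gives None, while find_next_sibling() steps over the whitespace and lands on the description div.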
So, if you change your input to a single line with no whitespace between the tags, you get the following results:
from bs4 import BeautifulSoup

# Same markup on a single line, with no whitespace between the tags
html = """
<div class="......"><div class="one-ad-desc"><div class="one-ad-title"><a class="one-ad-link" href="www this is the URL!"><h5>Text needed</h5></a></div><div class="one-ad-desc">...and some more needed text here!</div></div></div>"""
soup = BeautifulSoup(html, 'lxml')

for div in soup.find_all('div', class_="one-ad-title"):
    print('-> ', div.next_element)         # first child of the div
    print('-> ', div.next_sibling)         # next node on the same level
    print('-> ', div.find_next_sibling())  # next tag on the same level
Output:
-> <a class="one-ad-link" href="www this is the URL!"><h5>Text needed</h5></a>
-> <div class="one-ad-desc">...and some more needed text here!</div>
-> <div class="one-ad-desc">...and some more needed text here!</div>
Note that "Text needed" is not in a sibling of the tag you selected but in one of its child tags. To select "Text needed", use: print('-> ', div.find_next().text)
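Putting both lookups together, a minimal sketch of pulling out the two pieces of text might look like this. It assumes a trimmed-down version of the markup above (the outer wrappers are dropped and the href is a placeholder) and uses "html.parser" rather than lxml; the variable names title and desc are only illustrative:

```python
from bs4 import BeautifulSoup

# Trimmed-down, hypothetical version of the markup (placeholder href).
html = (
    '<div class="one-ad-title"><a class="one-ad-link" href="#">'
    '<h5>Text needed</h5></a></div>'
    '<div class="one-ad-desc">...and some more needed text here!</div>'
)

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="one-ad-title")

# find_next() returns the next Tag in document order, here the <a> child.
title = div.find_next().get_text(strip=True)

# find_next_sibling() skips text nodes, so this also works when the
# markup has whitespace between the tags.
desc = div.find_next_sibling("div", class_="one-ad-desc").get_text(strip=True)

print(title)
print(desc)
```

Passing the tag name and class to find_next_sibling() is optional here, but it keeps the lookup from picking up an unrelated sibling if the page has more elements on the same level.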