I am trying to extract the price "2 890 000Kč" and the "Address" value from this HTML (the page contains 12 blocks like this one):
<div class="list-items__content list-items__content__1">
  <div class="list-items__content__in">
    <a href="#" class="in-heart js-heart " data-tooltip="Přidat do oblíbených" onclick="toggleFavorite(1234, this)">
      <i class="icon icon__heart-grey"></i>
    </a>
  </div>
  <div class="list-items__content__in">
    <h2 class="list-items__item__title list-items__item__title__1" itemprop="name">
      <a href="/url/..." itemprop="url" class="js-simulate-link-target" onclick="return loadPropertyToModal(1234);" title="house value"> Some text about house </a>
    </h2>
    <ul>
      <li> 2 890 000Kč </li>
      <li> Address </li>
    </ul>
  </div>
</div>
This is what I tried:
cena = soup.select(".list-items__content__in > li:nth-child(0)")
print(cena)
Output: []
Expected output: 2 890 000 Kč, Address
How can I get only the text of the first li tag (2 890 000Kč) and of the second li tag (Address)?
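Try this. Your selector returns an empty list for two reasons: the li elements are children of the ul, not direct children of .list-items__content__in, and :nth-child() counts from 1, so :nth-child(0) never matches. Select through the ul instead: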
from bs4 import BeautifulSoup

# `sample` is the HTML string from the question
cena = BeautifulSoup(sample, "html.parser").select(".list-items__content__in > ul > li")
print([c.getText(strip=True) for c in cena])
Output:
['2 890 000Kč', 'Address']
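If you want the two values as separate variables rather than a list, a minimal sketch along these lines should work. It assumes a trimmed copy of the markup from the question stored in `sample`; `price_text` and `address_text` are illustrative names only.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (assumption: the real page
# has the same ul/li structure inside .list-items__content__in).
sample = """
<div class="list-items__content__in">
  <ul>
    <li> 2 890 000Kč </li>
    <li> Address </li>
  </ul>
</div>
"""

soup = BeautifulSoup(sample, "html.parser")

# :nth-child() counts from 1, and the <li> elements sit inside the <ul>,
# so the <ul> step has to be part of the selector.
price = soup.select_one(".list-items__content__in > ul > li:nth-child(1)")
address = soup.select_one(".list-items__content__in > ul > li:nth-child(2)")

price_text = price.get_text(strip=True)
address_text = address.get_text(strip=True)
print(price_text, address_text, sep=", ")
```

Note that select_one() returns None when nothing matches, so on a real page it is worth checking the result before calling get_text().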
EDIT:
You can join() the output into a single string:
print(" ".join(c.getText(strip=True) for c in cena))
2 890 000Kč Address
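Since the page has 12 such blocks, selecting every li at once gives one flat list with prices and addresses mixed together. One possible way to keep each price with its address is to loop over the outer blocks first. This is only a sketch: the two shortened blocks below stand in for the real page, and the second price and both address strings are made-up placeholders.

```python
from bs4 import BeautifulSoup

# Two shortened listing blocks standing in for the 12 on the real page.
# The second price and both addresses are placeholder values.
html = """
<div class="list-items__content list-items__content__1">
  <div class="list-items__content__in">
    <ul><li> 2 890 000Kč </li><li> Address A </li></ul>
  </div>
</div>
<div class="list-items__content list-items__content__1">
  <div class="list-items__content__in">
    <ul><li> 3 150 000Kč </li><li> Address B </li></ul>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

listings = []
# Visit one outer block at a time so each price stays with its address.
for block in soup.select(".list-items__content"):
    items = [li.get_text(strip=True) for li in block.select("ul > li")]
    if len(items) >= 2:  # skip blocks that lack a price or an address
        listings.append((items[0], items[1]))

for price, address in listings:
    print(f"{price}, {address}")
```

Each entry in listings is a (price, address) tuple, which is easier to write to a CSV file or a database than one flat list.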