<html>
<body>
<div class="main">
<div class="submain"><h2></h2><p></p><ul></ul>
</div>
<div class="submain"><h2></h2><p></p><ul></ul>
</div>
</div>
</body>
</html>
I load the HTML into an HtmlDocument, then select the submain divs with XPath. After that I don't know how to access each tag inside them, i.e. h2 and p, separately.
HtmlAgilityPack.HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@class=\"submain\"]");
foreach (HtmlAgilityPack.HtmlNode node in nodes) {}
If I use node.InnerText I get all of the text run together, and InnerHtml isn't useful either. How do I select the individual tags?
The following should help:
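HtmlAgilityPack.HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@class=\"submain\"]");
// Note: SelectNodes returns null, not an empty collection, when nothing matches.
foreach (HtmlAgilityPack.HtmlNode node in nodes) {
// To reach the <h2> and <p> inside each div, run a relative XPath from the current node:
HtmlAgilityPack.HtmlNode h2Node = node.SelectSingleNode("./h2"); // first <h2> that is a direct child
HtmlAgilityPack.HtmlNode pNode = node.SelectSingleNode("./p"); // first <p> that is a direct child
HtmlAgilityPack.HtmlNodeCollection allH2Nodes = node.SelectNodes(".//h2"); // every <h2> at any depth below this div
// You can also look at the children directly, without XPath (as in a tree):
HtmlAgilityPack.HtmlNode h2Child = node.ChildNodes["h2"]; // first child named h2, or null
}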
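For completeness, here is a minimal end-to-end sketch of the same idea as a small console program. It is only an illustration, not the one right way to do it: the class name SubmainReader, the Read helper, and the sample text inside the tags are all made up for this example (the HTML in the question has empty elements), and it assumes the HtmlAgilityPack NuGet package is referenced and the compiler supports C# 7 tuples.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class SubmainReader
{
    // Hypothetical helper: returns the heading, paragraph, and list items
    // of each <div class="submain"> found in the given HTML.
    public static List<(string Heading, string Paragraph, List<string> Items)> Read(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<(string Heading, string Paragraph, List<string> Items)>();

        // SelectNodes returns null when nothing matches, so guard before iterating.
        HtmlNodeCollection divs = doc.DocumentNode.SelectNodes("//div[@class='submain']");
        if (divs == null) return results;

        foreach (HtmlNode div in divs)
        {
            // Relative XPath ("./...") is evaluated from the current div only.
            HtmlNode h2 = div.SelectSingleNode("./h2");
            HtmlNode p = div.SelectSingleNode("./p");
            HtmlNodeCollection lis = div.SelectNodes("./ul/li");

            string heading = h2 == null ? "" : h2.InnerText.Trim();
            string paragraph = p == null ? "" : p.InnerText.Trim();
            List<string> items = lis == null
                ? new List<string>()
                : lis.Select(li => li.InnerText.Trim()).ToList();

            results.Add((heading, paragraph, items));
        }
        return results;
    }

    public static void Main()
    {
        // Sample markup with the same shape as the question's HTML; the text content is invented.
        const string html = @"<html><body><div class=""main"">
<div class=""submain""><h2>First</h2><p>Intro</p><ul><li>a</li><li>b</li></ul></div>
<div class=""submain""><h2>Second</h2><p>More</p><ul><li>c</li></ul></div>
</div></body></html>";

        foreach (var section in Read(html))
        {
            Console.WriteLine(section.Heading + " | " + section.Paragraph + " | " + string.Join(", ", section.Items));
        }
    }
}
```

Keeping each tag in its own field is what avoids the problem from the question, where InnerText on the whole div returns all the text mashed together.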