Lxml.html Extract A String By Searching For A Keyword
I have a portion of HTML in which a label holds a keyword ("The Keyword:") and the text I want to get ("The text") sits in a link next to it.
Solution 1:
from lxml import html
s = '<li><label>The Keyword:</label><span><a href="../../..">The text</a></span></li>'
tree = html.fromstring(s)
# text_content() joins the text of every descendant, label included
text = tree.text_content()
print(text)  # The Keyword:The text
Solution 2:
You can modify the XPath slightly to work with your current structure: select the element whose text contains the keyword, step up to its parent, then search that parent's descendants for the a element and take its text:
>>> tree.xpath('//*[contains(text(), "The Keyword:")]/..//a/text()')
['The text']
But that may not be flexible enough...
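One way to make the lookup a little more reusable is to wrap it in a small helper. The sketch below is only an illustration: the function name text_after_keyword is hypothetical, and it assumes the wanted text always sits in an a element under the same parent as the element holding the keyword. The keyword is passed as an XPath variable ($kw), which lxml accepts as a keyword argument to xpath(), so quotes inside the keyword cannot break the expression.

```python
from lxml import html


def text_after_keyword(fragment, keyword):
    """Return the text of the first link that shares a parent with the
    element containing `keyword`, or None if nothing matches.

    Hypothetical helper; assumes the target text lives in an <a> element.
    """
    tree = html.fromstring(fragment)
    # $kw is bound from the kw= keyword argument below.
    hits = tree.xpath('//*[contains(text(), $kw)]/..//a/text()', kw=keyword)
    return hits[0].strip() if hits else None


s = '<li><label>The Keyword:</label><span><a href="../../..">The text</a></span></li>'
print(text_after_keyword(s, "The Keyword:"))
```

Returning None for a missing keyword lets the caller tell "not found" apart from an empty string without catching an IndexError.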