Lxml.html Extract A String By Searching For A Keyword

November 16, 2024

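I have a fragment of HTML like the one below:

<li><label>The Keyword:</label><span><a href="../../..">The text</a></span></li>

I want to get "The text" by searching for "The Keyword:".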

Solution 1:

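from lxml import html

# Sample fragment from the question
s = '<li><label>The Keyword:</label><span><a href="../../..">The text</a></span></li>'

tree = html.fromstring(s)
# text_content() concatenates all the text inside the element
text = tree.text_content()
print(text)  # The Keyword:The text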
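Note that text_content() returns the label and the value joined together, so the keyword still has to be separated from the text you are after. A minimal sketch of one way to do that, assuming the keyword appears exactly once and the wanted text follows it directly (the variable names keyword, full_text and value are illustrative, not part of the original answer):

```python
from lxml import html

s = '<li><label>The Keyword:</label><span><a href="../../..">The text</a></span></li>'
keyword = "The Keyword:"  # assumed search term, taken from the question

tree = html.fromstring(s)
full_text = tree.text_content()

# partition() splits at the first occurrence of the keyword;
# `found` is an empty string when the keyword is absent.
_, found, value = full_text.partition(keyword)
value = value.strip() if found else None
print(value)
```

This only works when the element you call text_content() on holds nothing but the label and its value; on a larger tree, everything after the keyword would be included.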

Solution 2:

You can modify the XPath slightly to work with your current structure: find the element whose text contains the keyword, step up to its parent, then look beneath that parent for the a element and take its text:

>>> tree.xpath('//*[contains(text(), "The Keyword:")]/..//a/text()')
['The text']

But that may not be flexible enough, since it returns the text of every a element under the same parent.
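If the value always sits in the element immediately following the label, a somewhat more targeted lookup is possible. The sketch below is one possible approach, not the original answerer's method: text_after_keyword is a hypothetical helper, and it assumes the keyword lives in a label element whose next sibling holds the wanted text. It passes the keyword as an XPath variable, which lxml's xpath() accepts as a keyword argument.

```python
from lxml import html


def text_after_keyword(tree, keyword):
    """Return the text of the element right after the matching label.

    Hypothetical helper: assumes a <label> containing `keyword` is
    followed by a sibling element that holds the wanted text.
    """
    # $kw is an XPath variable; lxml fills it from the kw= argument,
    # which avoids quoting problems when the keyword contains quotes.
    nodes = tree.xpath(
        './/label[contains(normalize-space(.), $kw)]/following-sibling::*[1]',
        kw=keyword,
    )
    if not nodes:
        return None
    return nodes[0].text_content().strip()


s = '<li><label>The Keyword:</label><span><a href="../../..">The text</a></span></li>'
tree = html.fromstring(s)
result = text_after_keyword(tree, "The Keyword:")
print(result)
```

Restricting the match to the first following sibling keeps unrelated links elsewhere under the same parent out of the result, and returning None makes a missing keyword easy to detect.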


theprettymind1987
