I'm trying to grab data from: http://www.boerse-frankfurt.de/de/etfs/ishares+msci+world+momentum+factor+ucits+etf+de000a12bhf2
The data I'm looking for is located in elements with the classes singlebox list_component. Let's say I want to extract the total expense ratio (0.30%). It is located in a td whose class is: right column-datavalue lastcolofrow.
But if I do:
dues = driver.find_element_by_class_name("right column-datavalue lastcolofrow")
expense_ratio = re.search(r"(.{4})(?=%)", dues.text).group(0).encode("utf-8")
I get:
InvalidSelectorError: Compound class names not permitted
An additional problem: there seem to be multiple instances of right column-datavalue lastcolofrow, so it doesn't serve as a unique identifier.
Note: if this problem is better solved with BeautifulSoup instead of Selenium, please let me know.
You can use find_element_by_css_selector() instead to match an element that has multiple CSS classes:
dues = driver.find_element_by_css_selector(".right.column-datavalue.lastcolofrow")
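The selector is just the class names joined with dots, which is why the class-name lookup rejects the space-separated form. A minimal sketch of that conversion, using only the standard library; the helper name css_selector_from_classes is hypothetical and not part of Selenium:

```python
def css_selector_from_classes(class_attr: str) -> str:
    """Turn a space-separated class attribute into a compound CSS selector.

    Hypothetical helper for illustration; Selenium does not provide it.
    """
    names = class_attr.split()  # also drops stray leading/trailing spaces
    if not names:
        raise ValueError("expected at least one class name")
    # ".a.b.c" matches elements carrying all three classes, in any order
    return "." + ".".join(names)


selector = css_selector_from_classes("right column-datavalue lastcolofrow ")
print(selector)
```

The resulting string can be passed to find_element_by_css_selector(). In recent Selenium 4 releases, where the find_element_by_* helpers were removed, the equivalent call is driver.find_element(By.CSS_SELECTOR, selector) with By imported from selenium.webdriver.common.by.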
But since, as you said, the selector above isn't unique, you can use XPath to match the class attribute exactly, including the order of the classes (I found this XPath to be unique on the web page):
xpath = "//td[@class='right column-datavalue lastcolofrow']"
dues = driver.find_element_by_xpath(xpath)
Another way to approach this is to use XPath to select the <td> element that follows the <td> containing the text Gesamtkostenquote:
xpath = "//td[@class='column-datacaption' and normalize-space(text())='Gesamtkostenquote']/following-sibling::td"
dues = driver.find_element_by_xpath(xpath)
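On the question of doing this without Selenium: the same caption-then-value idea can be tried on the raw HTML. Below is a rough sketch using only the standard library's html.parser rather than BeautifulSoup. The SAMPLE markup, the second table row, the "0,30%" formatting, and the names CellCollector and value_after_caption are all assumptions for illustration, not taken from the real page. Note that it returns the next <td> in document order, which is looser than the XPath following-sibling axis.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> in document order."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        if self._in_td:
            self.cells[-1] += data


def value_after_caption(html, caption):
    """Return the text of the <td> right after the one matching caption."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    # Collapse whitespace, similar in spirit to XPath normalize-space()
    cells = [" ".join(cell.split()) for cell in parser.cells]
    for i, text in enumerate(cells[:-1]):
        if text == caption:
            return cells[i + 1]
    return None


# Assumed markup; with Selenium the real HTML would be driver.page_source
SAMPLE = """
<table>
  <tr><td class="column-datacaption">Gesamtkostenquote</td>
      <td class="right column-datavalue lastcolofrow">0,30%</td></tr>
  <tr><td class="column-datacaption">Ertragsverwendung</td>
      <td class="right column-datavalue lastcolofrow">thesaurierend</td></tr>
</table>
"""

raw = value_after_caption(SAMPLE, "Gesamtkostenquote")
# Accept either a decimal comma or a decimal point before the percent sign
expense_ratio = float(raw.rstrip("%").strip().replace(",", "."))
print(raw, expense_ratio)
```

Looking the value up by its caption avoids depending on the non-unique class names, and stripping the percent sign is less fragile than a fixed-width regex if the number of digits ever changes.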