python - In BeautifulSoup, Ignore Children Elements While Getting Parent Element Data


I have HTML as follows:

<html>
    <div class="maindiv">
        text data here
        <br>
        continued text data
        <br>
        <div class="somename">
            text & data I want to omit
        </div>
    </div>
</html>

I am trying to get only the text found in the maindiv element, without getting the text data found in the somename element. In most cases, in my experience anyway, text data is contained within some child element. However, I have run into this particular case where the data seems to be placed willy-nilly and is a bit harder to filter.

My approach is as follows:

textdata = soup.find('div', class_='maindiv').get_text()

This gets all the text data found within the maindiv element, including the text data found in the somename div element.

The logic I'd like to use is more along the lines of: textdata = soup.find('div', class_='maindiv').get_text(recursive=False), which would omit any text data found within the somename element.

I know the recursive=False argument works for locating only parent-level elements when searching the DOM structure using BeautifulSoup, but it can't be used with the .get_text() method.
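For reference, here is a minimal sketch of how recursive=False behaves on a search method. It assumes the built-in html.parser backend and sample markup modelled on the HTML above; the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the question's HTML (an assumption for this sketch).
html = """<div class="maindiv">
    text data here
    <br>
    continued text data
    <br>
    <div class="somename">
        text and data to omit
    </div>
</div>"""

soup = BeautifulSoup(html, "html.parser")

# Default search: descends through the whole tree, so both divs match.
all_divs = soup.find_all("div")

# recursive=False: only direct children of the object being searched are considered.
top_level_divs = soup.find_all("div", recursive=False)

print(len(all_divs), len(top_level_divs))
```

The restriction applies to which tags a search visits, which is why it does not carry over to get_text().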

I've considered the approach of finding all the text and then subtracting the string data found in the somename element from the string data found in the maindiv element, but I'm looking for something a little more efficient.

This is not that far from your subtracting method, but one way to do it (at least in Python 3) is to discard all the child divs.

s = soup.find('div', class_='maindiv')

# Remove every nested div (and the text inside it) from the tree
for child in s.find_all("div"):
    child.decompose()

print(s.get_text())

It would print something like:

text data here          continued text data 

That might be a bit more efficient and flexible than subtracting strings, though it still needs to go through the children first.
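If modifying the tree is undesirable, another option is to collect only the strings that are immediate children of the parent element. The following is a minimal sketch rather than part of the answer above; the helper name direct_text and the sample markup are assumptions made for illustration.

```python
from bs4 import BeautifulSoup, NavigableString

# Sample markup modelled on the question's HTML (an assumption for this sketch).
html = """<div class="maindiv">
    text data here
    <br>
    continued text data
    <br>
    <div class="somename">
        text and data to omit
    </div>
</div>"""


def direct_text(tag, separator=" "):
    """Join the text of the tag's immediate string children, skipping nested tags."""
    pieces = []
    for child in tag.children:
        # Keep only text nodes; child tags such as <br> and <div> are ignored.
        if isinstance(child, NavigableString):
            stripped = child.strip()
            if stripped:
                pieces.append(stripped)
    return separator.join(pieces)


soup = BeautifulSoup(html, "html.parser")
maindiv = soup.find("div", class_="maindiv")
print(direct_text(maindiv))
```

Unlike decompose(), this leaves the somename div in place, so the same soup object can still be queried afterwards.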

