regex - Python: How do I continue searching a list for different elements efficiently?
I have many lists of strings that I've scraped and parsed, and I want to find certain strings in these lists using regular expressions. Each string is different, but they appear sequentially in the lists, i.e. the first string I'd find appears before the second string, the second appears before the third, and so on. However, I can't use a fixed index because the number of elements in between varies between lists.
Ex. I've scraped these strings and stored them in the following lists:
personal_info = ["name: john doe", "wife: jane doe", "children: jenny doe", "children: johnny doe", "location: us", "accounts: boa", "accounts: chase", "house: own", "car: own", "other: none"]
personal_info2 = ["name: james lee", "location: can", "accounts: citibank", "house: rent", "car: own", "other: none"]
I'd like to grab the elements starting with name, location, and house, which may or may not have multiple elements in between. Location always comes after name, and house after location.
Because I'll be repeating this on many lists, I'd like to search using the first regex, then continue searching with the next regex from where I left off, because I know they appear sequentially. Is there a concise way to do this in Python? Right now I have a set of for loops that break when there's a match and record the index to pass to the next loop.
If code must be shown:
    import re

    idx = 0
    for string in string_list:
        idx += 1  # position at which the next loop should resume
        match = re.search('pattern', string)
        if match is not None:
            string_one = match.group(0)
            break
A short snippet that prints the requested fields:

    x = ["name", "location", "house"]
    y = iter(x)
    z = next(y)
    for a in personal_info:
        if a.startswith(z):
            print(a)
            try:
                z = next(y)  # move on to the next prefix
            except StopIteration:
                break  # all prefixes found

You can replace "startswith" with a regex, and "print" with any other action.
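As a rough sketch of the regex variant suggested here, the same single pass can be wrapped in a function and reused across lists. The function name find_in_order and the three patterns are made up for illustration; they are not from the original post, and the patterns assume each field starts with its label followed by a colon.

```python
import re

def find_in_order(strings, patterns):
    """Return the first string matching each pattern, scanning the list once.

    Assumes the matches occur in the same order as `patterns`.
    """
    compiled = iter([re.compile(p) for p in patterns])
    current = next(compiled, None)
    found = []
    for s in strings:
        if current is None:
            break  # every pattern has been matched
        if current.search(s):
            found.append(s)
            current = next(compiled, None)  # continue with the next pattern
    return found

personal_info2 = ["name: james lee", "location: can", "accounts: citibank",
                  "house: rent", "car: own", "other: none"]

# Hypothetical patterns for the three fields in the question.
print(find_in_order(personal_info2, [r"^name:", r"^location:", r"^house:"]))
```

Each string is tested against only the pattern currently being searched for, so the list is traversed once no matter how many patterns there are. If a pattern never matches, the returned list is simply shorter than the pattern list.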