regex - Regular Expression vs. String Parsing


At the risk of opening a can of worms and getting negative votes, I find myself needing to ask:

When should I use regular expressions, and when is it more appropriate to use string parsing?

And I'm going to need examples and reasoning for your stance. I'd like you to address things like readability, maintainability, scaling, and, most of all, performance in your answer.

I found another question here that had only 1 answer that bothered giving an example. I need more to understand this.

I'm currently playing around in C++, but regular expressions are in almost every higher-level language, and I'd like to know how different languages use and handle regular expressions, though that's more of an afterthought.

Thanks for the help in understanding it!

Edit: I'm still looking for more examples and talk on this, but the response so far has been great. :)

It depends on how complex the language you're dealing with is.

Splitting

This is great when it works, but it only works when there are no escaping conventions. It does not work for CSV, for example, because commas inside quoted strings are not proper split points.

foo,bar,baz

can be split, but

foo,"bar,baz"

cannot.
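A minimal Python sketch of the difference, assuming the input follows the usual CSV quoting convention that Python's standard csv module understands. The variable names are only for illustration.

```python
import csv

plain = 'foo,bar,baz'
quoted = 'foo,"bar,baz"'

# Naive splitting works when there is no quoting or escaping.
print(plain.split(','))    # ['foo', 'bar', 'baz']

# It breaks on the comma inside the quoted field.
print(quoted.split(','))   # ['foo', '"bar', 'baz"']

# A CSV-aware parser treats the quoted comma as data.
print(next(csv.reader([quoted])))  # ['foo', 'bar,baz']
```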

Regular

Regular expressions are great for simple languages that have a "regular grammar". Perl 5 regular expressions are a little more powerful due to back-references, but the general rule of thumb is this:

If you need to match brackets ((...), [...]) or other nesting such as HTML tags, then regular expressions by themselves are not sufficient.
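To illustrate why nesting is the dividing line: checking that brackets balance requires remembering an unbounded nesting depth, which regular expressions in the formal sense cannot do, while a stack handles it in a few lines. This is only a sketch, and the name `brackets_balanced` is made up for the example.

```python
PAIRS = {')': '(', ']': '['}

def brackets_balanced(text):
    """Return True if every ( and [ in text is closed in the right order."""
    stack = []
    for ch in text:
        if ch in '([':
            stack.append(ch)
        elif ch in PAIRS:
            # A closer must match the most recent unclosed opener.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack  # leftovers mean something was never closed
```

With this, `brackets_balanced('a[(b)c]')` is True, while `'([)]'` and `'(('` are rejected.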

You can use regular expressions to break a string into a known number of chunks, for example, pulling out the month/day/year from a date. They are the wrong tool for parsing complicated arithmetic expressions, though.
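As a sketch of the "known number of chunks" case, here is the date example in Python. The m/d/yyyy layout and the helper name `split_date` are assumptions for illustration.

```python
import re

# Exactly three numeric chunks separated by slashes: month/day/year.
DATE_RE = re.compile(r'(\d{1,2})/(\d{1,2})/(\d{4})')

def split_date(text):
    """Return (month, day, year) as ints, or None if text is not m/d/yyyy."""
    match = DATE_RE.fullmatch(text)
    if match is None:
        return None
    month, day, year = (int(group) for group in match.groups())
    return month, day, year
```

Note that this only checks the shape of the string. Something like 13/45/2009 still matches, so range checks belong in ordinary code after the match rather than in the pattern.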

Obviously, if you write a regular expression, walk away for a cup of coffee, come back, and can't easily understand what you just wrote, then you should look for a clearer way to express what you're doing. Email addresses are probably at the limit of what one can correctly and readably handle using regular expressions.

Context free

Parser generators and hand-coded pushdown/PEG parsers are great for dealing with more complicated input where you need to handle nesting so that you can build a tree, or deal with operator precedence or associativity.

Context free parsers often use regular expressions to first break the input into chunks (spaces, identifiers, punctuation, quoted strings) and then use a grammar to turn that stream of chunks into a tree form.
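The two stages described above can be sketched in Python as a regular-expression lexer feeding a small hand-coded recursive-descent parser. The tiny grammar (integers, + - * /, parentheses) and the names `tokenize` and `parse` are assumptions for this example, not a general-purpose design.

```python
import re

# Stage 1: a regular expression describes every kind of chunk.
TOKEN_RE = re.compile(r'\d+|[-+*/()]')

def tokenize(text):
    """Break text into number and punctuation tokens, skipping spaces."""
    tokens = TOKEN_RE.findall(text)
    if ''.join(tokens) != text.replace(' ', ''):
        raise ValueError('unexpected character in input')
    return tokens

# Stage 2: a grammar turns the token stream into a tree.
def parse(tokens):
    """Parse tokens into nested tuples, e.g. ('+', 1, ('*', 2, 3))."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def atom():
        token = advance()
        if token == '(':
            node = expr()  # nesting is handled by recursion
            if advance() != ')':
                raise ValueError('expected )')
            return node
        if token is None or not token.isdigit():
            raise ValueError('expected a number')
        return int(token)

    def term():
        # '*' and '/' bind tighter than '+' and '-'.
        node = atom()
        while peek() in ('*', '/'):
            op = advance()
            node = (op, node, atom())
        return node

    def expr():
        # Folding in a loop makes the operators left-associative.
        node = term()
        while peek() in ('+', '-'):
            op = advance()
            node = (op, node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise ValueError('unexpected trailing input')
    return tree
```

For instance, `parse(tokenize('1 + 2 * 3'))` gives `('+', 1, ('*', 2, 3))`, and `parse(tokenize('8 - 3 - 2'))` gives `('-', ('-', 8, 3), 2)`. The regular expression never has to know about precedence or nesting; that lives entirely in the grammar functions.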

The rule of thumb for CF grammars is

If regular expressions are insufficient, but all words in the language have the same meaning regardless of prior declarations, then CF works.

Non context free

If words in your language change meaning depending on context, then you need a more complicated solution. These are almost always hand-coded solutions.

For example, in C,

    #ifdef X
      typedef int foo;
    #endif

    foo * bar;

If foo is a type, then foo * bar is the declaration of a foo pointer named bar. Otherwise it is a multiplication of a variable named foo by a variable named bar.
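A rough Python sketch of what "depending on context" means for a parser: the same three tokens are classified differently depending on a table of type names collected from earlier declarations. `classify_statement` and the `type_names` set are hypothetical names for this illustration and do not come from any real C parser.

```python
def classify_statement(tokens, type_names):
    """Classify a three-token statement of the form `name * name`."""
    left, star, right = tokens
    if star != '*':
        raise ValueError('expected a statement of the form a * b')
    # The grammar alone cannot decide; earlier declarations do.
    if left in type_names:
        return f'declaration: {right} is a pointer to {left}'
    return f'expression: {left} multiplied by {right}'

tokens = ['foo', '*', 'bar']
print(classify_statement(tokens, type_names={'foo'}))  # typedef was seen
print(classify_statement(tokens, type_names=set()))    # no typedef
```

The parser has to carry that table along as it reads the input, which is why such languages usually end up with hand-coded parsers.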

