regex - Regular Expression vs. String Parsing
At the risk of opening a can of worms and getting negative votes, I find myself needing to ask:
when should I use regular expressions, and when is it more appropriate to use string parsing?
And I'm going to need examples and reasoning for your stance. I'd like you to address things like readability, maintainability, scaling, and of course performance in your answer.
I found another question here, but it had only one answer that bothered giving an example. I need more than that to understand this.
I'm currently playing around in C++, but regular expressions are in almost every higher level language, and I'd like to know how different languages use/handle regular expressions too, though that's more of an afterthought.
Thanks for the help in understanding it!
Edit: I'm still looking for more examples and discussion on this, but the response so far has been great. :)
It depends on how complex the language you're dealing with is.
Splitting
This is great when it works, but it only works when there are no escaping conventions. It does not work for CSV, for example, because commas inside quoted strings are not proper split points.
foo,bar,baz
can be split, but
foo,"bar,baz"
cannot.
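To make the difference concrete, here is a minimal Python sketch of the two inputs above. The helper names `naive_split` and `csv_split` are made up for this illustration; `csv.reader` is the standard-library CSV parser, which understands the quoting convention that a plain split ignores.

```python
import csv
import io


def naive_split(line):
    # Plain splitting: treats every comma as a field separator.
    return line.split(",")


def csv_split(line):
    # The csv module honours double-quoted fields, so a comma inside
    # quotes is kept as part of the field instead of splitting it.
    return next(csv.reader(io.StringIO(line)))


print(naive_split('foo,bar,baz'))    # ['foo', 'bar', 'baz']
print(naive_split('foo,"bar,baz"'))  # ['foo', '"bar', 'baz"']
print(csv_split('foo,"bar,baz"'))    # ['foo', 'bar,baz']
```

The naive version cuts the quoted field in half, which is exactly the failure described above.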
Regular
Regular expressions are great for simple languages that have a "regular grammar". Perl 5 regular expressions are a little more powerful due to back-references, but the general rule of thumb is this:
If you need to match brackets ( (...), [...] ) or other nesting such as HTML tags, then regular expressions by themselves are not sufficient.
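As a rough illustration of why nesting needs more than a regular expression: matching brackets requires remembering an unbounded number of open brackets, which in practice means a stack. The sketch below is one possible way to check that round and square brackets are balanced; the function name `is_balanced` is hypothetical, and other characters are simply ignored.

```python
# Maps each closing bracket to the opening bracket it must match.
PAIRS = {")": "(", "]": "["}


def is_balanced(text):
    stack = []
    for ch in text:
        if ch in "([":
            # Remember every opener until its closer shows up.
            stack.append(ch)
        elif ch in PAIRS:
            # A closer must match the most recent unclosed opener.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    # Anything left on the stack was never closed.
    return not stack


print(is_balanced("(a[b](c))"))  # True
print(is_balanced("(]"))         # False
```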
You can use regular expressions to break a string into a known number of chunks, for example pulling the month/day/year out of a date. They are the wrong tool for the job of parsing complicated arithmetic expressions, though.
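For the date case, a sketch along these lines would do, assuming a month/day/year layout with a four-digit year. The pattern and the name `parse_date` are illustrative choices, and the code only checks the shape of the input, not whether the month or day is in range.

```python
import re

# Three capture groups: a fixed, known number of chunks.
DATE_RE = re.compile(r"(\d{1,2})/(\d{1,2})/(\d{4})")


def parse_date(text):
    # fullmatch rejects input with anything before or after the date.
    m = DATE_RE.fullmatch(text)
    if m is None:
        return None
    month, day, year = (int(g) for g in m.groups())
    return month, day, year


print(parse_date("12/25/2023"))  # (12, 25, 2023)
print(parse_date("not a date"))  # None
```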
Obviously, if you write a regular expression, walk away for a cup of coffee, come back, and can't easily understand what you just wrote, then you should look for a clearer way to express what you're doing. Email addresses are probably at the limit of what one can correctly and readably handle using regular expressions.
Context free
Parser generators and hand-coded pushdown/PEG parsers are great for dealing with more complicated input, where you need to handle nesting so you can build a tree, or deal with operator precedence or associativity.
Context free parsers often use regular expressions to first break the input into chunks (spaces, identifiers, punctuation, quoted strings) and then use a grammar to turn that stream of chunks into a tree form.
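The two-stage approach described above can be sketched for simple integer arithmetic. This is a hand-written recursive descent parser, not the output of a parser generator, and the names `tokenize` and `parse` as well as the tuple-based tree shape are assumptions made for the example. A regular expression does the chunking; the grammar functions handle nesting, precedence and left associativity.

```python
import re

# Stage 1: a regular expression breaks the input into chunks.
TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


# Stage 2: a grammar turns the token stream into a tree.
#   expr   := term (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'
def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = peek()
            advance()
            node = (op, node, term())  # left-associative
        return node

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = peek()
            advance()
            node = (op, node, factor())  # binds tighter than + and -
        return node

    def factor():
        tok = peek()
        if tok is None:
            raise ValueError("unexpected end of input")
        advance()
        if tok == "(":
            node = expr()  # recursion handles arbitrary nesting
            if peek() != ")":
                raise ValueError("expected ')'")
            advance()
            return node
        if tok.isdigit():
            return int(tok)
        raise ValueError(f"unexpected token {tok!r}")

    tree = expr()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return tree


print(parse(tokenize("1 + 2 * (3 - 4)")))
# ('+', 1, ('*', 2, ('-', 3, 4)))
```

Note that the regular expression never has to know about nesting; it only recognises individual tokens.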
The rule of thumb for CF grammars is:
If regular expressions are insufficient, but all words in the language have the same meaning regardless of prior declarations, then CF works.
Non context free
If words in your language change meaning depending on context, then you need a more complicated solution. These are almost always hand-coded solutions.
For example, in C,
#ifdef x
typedef int foo;
#endif
foo * bar;
If foo is a type, then foo * bar is the declaration of a foo pointer named bar. Otherwise it is a multiplication of a variable named foo by a variable named bar.
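A toy sketch of that dependence on context: the same text is classified differently depending on which names have been declared as types so far. The function `classify`, the `typedef_names` set and the returned strings are all invented for this illustration; a real C front end is far more involved.

```python
import re

# Matches statements of the shape "<name> * <name>" with an optional semicolon.
STATEMENT_RE = re.compile(r"\s*(\w+)\s*\*\s*(\w+)\s*;?\s*")


def classify(statement, typedef_names):
    m = STATEMENT_RE.fullmatch(statement)
    if m is None:
        raise ValueError("expected something like 'foo * bar'")
    left, right = m.groups()
    # The grammar alone cannot decide; the set of prior typedefs does.
    if left in typedef_names:
        return f"declaration: {right} is a pointer to {left}"
    return f"multiplication: {left} times {right}"


print(classify("foo * bar", {"foo"}))  # declaration: bar is a pointer to foo
print(classify("foo * bar", set()))    # multiplication: foo times bar
```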