C# 4.0 - Regex pattern to match groups starting with a pattern
I need to extract data from a text stream in which the data is structured like this:
/1-<id>/<rectype>-<data>..repeat n times../1-<id>/#-<data>..repeat n times..
In the above, the "/1" field precedes the record data, which can then have any number of following fields, each with a rectype from 2 to 9 (also, each field starts with "/").
For example:
/1-xxxx/2-yyyy/9-zzzz/1-aaaa/3-bbbb/5-cccc/8=nnnn/9=dddd/1-qqqq/2-wwww/3=pppp/7-eeee
So, there are 3 groups of data above:
1=xxxx 2=yyyy 9=zzzz
1=aaaa 3=bbbb 5=cccc 8=nnnn 9=dddd
1=qqqq 2=wwww 3=pppp 7=eeee
For simplicity, I know the data contains only [a-z0-9. ], but it can be of variable length (not just 4 chars as in the example).
Now, the following expression sort of works, but it captures only the first 2 fields of each group and none of the remaining fields:
/1-(?'fld1'[a-z]+)/((?'fldno'[2-9])-(?'flddata'[a-z0-9\. ]+))
I know I need some sort of quantifier in there somewhere, but I do not know which one to use or where to place it.
You can use a regex to match these blocks using 2 .NET regex features: 1) the capture collection and 2) multiple capturing groups with the same name in the pattern. Then, we'll need some LINQ magic to combine the captured data into a list of lists:
(?<fldno>1)-(?'flddata'[^/]+)(?:/(?<fldno>[2-9])[-=](?'flddata'[^/]+))*
Details:
(?<fldno>1) - group "fldno" matching 1
- - a hyphen
(?'flddata'[^/]+) - group "flddata" capturing 1+ chars other than /
(?:/(?<fldno>[2-9])[-=](?'flddata'[^/]+))* - 0 or more sequences of:
  / - a literal /
  (?<fldno>[2-9]) - a digit from 2 to 9 (group "fldno")
  [-=] - a - or =
  (?'flddata'[^/]+) - 1+ chars other than / (group "flddata")
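To make the capture collection feature concrete, here is a minimal sketch that runs the same pattern against a single record. The class name CaptureDemo and the one-record input are illustrative assumptions, not part of the original answer. It shows that a repeated group's Value holds only its last capture, while its Captures collection keeps every capture in match order:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public class CaptureDemo // hypothetical name, for illustration only
{
    public static void Main()
    {
        // A single record, to keep the output small (assumed sample input).
        var m = Regex.Match("/1-aaaa/3-bbbb/5-cccc",
            @"(?<fldno>1)-(?'flddata'[^/]+)(?:/(?<fldno>[2-9])[-=](?'flddata'[^/]+))*");

        // Group.Value exposes only the LAST capture of a repeated group.
        Console.WriteLine(m.Groups["flddata"].Value); // prints "cccc"

        // Group.Captures keeps every capture, in the order they were matched.
        Console.WriteLine(string.Join(",",
            m.Groups["flddata"].Captures.Cast<Capture>().Select(c => c.Value)));
        // prints "aaaa,bbbb,cccc"
    }
}
```

Because both "fldno" and "flddata" receive one capture per field, the two collections line up index by index, which is what allows them to be zipped together.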
See the regex demo.
See the C# demo:
using System;
using System.Linq;
using System.Text.RegularExpressions;

public class Test
{
    public static void Main()
    {
        var str = "/1-xxxx/2-yyyy/9-zzzz/1-aaaa/3-bbbb/5-cccc/8=nnnn/9=dddd/1-qqqq/2-wwww/3=pppp/7-eeee";
        // Each match is one record; pair the captures of the two named groups by position.
        var res = Regex.Matches(str, @"(?<fldno>1)-(?'flddata'[^/]+)(?:/(?<fldno>[2-9])[-=](?'flddata'[^/]+))*")
            .Cast<Match>()
            .Select(p => p.Groups["fldno"].Captures.Cast<Capture>().Select(m => m.Value)
                .Zip(p.Groups["flddata"].Captures.Cast<Capture>().Select(m => m.Value),
                    (first, second) => first + "=" + second))
            .ToList();
        // Print one record per line, e.g. "1=xxxx 2=yyyy 9=zzzz".
        foreach (var t in res)
            Console.WriteLine(string.Join(" ", t));
    }
}