drlava

Complex Regex text extraction

3 posts in this topic

I'm trying to write a JSON to XML converter, and need some regex help.

First, assume the input is

{"shows":[{"Title":"Inhabited","ProgramID":29109227,"Schedule":[{"ChanID":28460783,"Affiliate":"MNT","CallLetters":"WUAB"}],"Actors":"Yourdaddy"}]}

Note that there is a subnode, Schedule. I want to extract the shows node and leave the Schedule node text out, giving:

{"shows":[{"Title":"Inhabited","ProgramID":29109227,"Actors":"Yourdaddy"}]}

So far I have tried many things, such as

StringRegExp($data, '("shows":\[.*?)(?:"[\w]+":\[.*?\])?(.*\])',3)

trying to use a non-capturing group, (?:...), to leave the inner node out. But nothing is working. Any help?

Or maybe there's a better way to do this without regex?


#2 ·  Posted (edited)

Edited by MisterBates

[quote name='drlava' post='372864' date='Jul 14 2007, 09:00 AM']I'm trying to write a JSON to XML converter, and need some regex help.

First, assume the input is
[code]{"shows":[{"Title":"Inhabited","ProgramID":29109227,"Schedule":[{"ChanID":28460783,"Affiliate":"MNT","CallLetters":"WUAB"}],"Actors":"Yourdaddy"}]}[/code]

note that there is a subnode Schedule. I want to extract the shows node and leave the Schedule node text out.

[code]{"shows":[{"Title":"Inhabited","ProgramID":29109227,"Actors":"Yourdaddy"}]}[/code][/quote]

$sIn = '{"shows":[{"Title":"Inhabited","ProgramID":29109227,"Schedule":[{"ChanID":28460783,"Affiliate":"MNT","CallLetters":"WUAB"}],"Actors":"Yourdaddy"}]}'
$Out = StringRegExpReplace($sIn, '"Schedule"\:\[.*?\],', '')
ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : $Out = ' & $Out & @CRLF & '>Error code: ' & @error & @CRLF) ;### Debug Console

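This leaves $Out as

{"shows":[{"Title":"Inhabited","ProgramID":29109227,"Actors":"Yourdaddy"}]}

Note that this pattern stops at the first "]," it finds, so it does not handle arrays nested inside the Schedule node.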

EDIT: This pattern handles nodes nested inside the schedule node (matched pairs of []):

$Out = StringRegExpReplace($sIn, '"Schedule"\:\[.*?(\[.*\].*?)*\],','')
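On the question of doing this without regex: a minimal sketch of a bracket-counting approach follows. The function name _StripArrayNode is hypothetical (not a built-in or a standard UDF), and the sketch assumes that no [ or ] characters appear inside string values and that the key is written exactly as "name":[ with no whitespace.

```autoit
; Hypothetical helper: removes the first "<name>":[ ... ] member from $sJson.
; It counts [ and ] so that nested arrays are skipped as one unit.
; Assumption: no [ or ] characters occur inside string values.
Func _StripArrayNode($sJson, $sName)
    Local $sKey = '"' & $sName & '":['
    Local $iStart = StringInStr($sJson, $sKey, 1) ; 1 = case-sensitive search
    If $iStart = 0 Then Return $sJson ; node not present, nothing to do

    Local $iPos = $iStart + StringLen($sKey) ; first character after the opening [
    Local $iDepth = 1
    While $iDepth > 0 And $iPos <= StringLen($sJson)
        Switch StringMid($sJson, $iPos, 1)
            Case '['
                $iDepth += 1
            Case ']'
                $iDepth -= 1
        EndSwitch
        $iPos += 1
    WEnd
    If $iDepth > 0 Then Return $sJson ; unbalanced brackets, leave the input alone

    ; $iPos now points just past the closing ]
    If StringMid($sJson, $iPos, 1) = ',' Then
        $iPos += 1 ; drop the trailing comma
    ElseIf $iStart > 1 And StringMid($sJson, $iStart - 1, 1) = ',' Then
        $iStart -= 1 ; node was the last member: drop the preceding comma instead
    EndIf
    Return StringLeft($sJson, $iStart - 1) & StringMid($sJson, $iPos)
EndFunc

$sIn = '{"shows":[{"Title":"Inhabited","ProgramID":29109227,"Schedule":[{"ChanID":28460783,"Affiliate":"MNT","CallLetters":"WUAB"}],"Actors":"Yourdaddy"}]}'
ConsoleWrite(_StripArrayNode($sIn, "Schedule") & @CRLF)
; prints {"shows":[{"Title":"Inhabited","ProgramID":29109227,"Actors":"Yourdaddy"}]}
```

Counting depth avoids having to guess how far a greedy or lazy .* should run, at the cost of needing the node name up front.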
Thank you! I was going to try to use a recursive function to find nodes inside Schedule, so your second pattern is very helpful. Also, the inner node may or may not exist, may not be called Schedule, and may be terminated by any of [,\]}], so

$Out = StringRegExpReplace($sIn,'"\w+?"\:\[.*?(\[.*\].*?)*\][,\]}]','$lengthofshows')

would probably work. I'll try using the same expression with StringRegExp to extract the inner node for further processing.
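One thing to watch with a name-agnostic pattern: "\w+" followed by :[ also matches the outer "shows" key, and the regex engine takes the leftmost match first. A hedged sketch of one way around that is below. It relies on PCRE's subroutine call (?1) to match balanced brackets, and on a lookbehind (?<=,) that is purely an assumption about this sample input (the inner node follows a comma, while "shows" follows a brace). The variable names $sPattern, $aInner and $sOut are illustrative only.

```autoit
$sIn = '{"shows":[{"Title":"Inhabited","ProgramID":29109227,"Schedule":[{"ChanID":28460783,"Affiliate":"MNT","CallLetters":"WUAB"}],"Actors":"Yourdaddy"}]}'

; Group 1 matches one balanced [ ... ] array: runs of non-bracket characters,
; or a nested array via the subroutine call (?1).
; (?<=,) is an assumption made for this sample: it skips the outer "shows" key,
; which follows { rather than a comma.
$sPattern = '(?<=,)"\w+":(\[(?:[^\[\]]++|(?1))*\])'

; Flag 1 returns the capture groups of the first match as an array
$aInner = StringRegExp($sIn, $sPattern, 1)
If @error Then
    ConsoleWrite('No inner array node found' & @CRLF)
Else
    ConsoleWrite('Inner node: ' & $aInner[0] & @CRLF)
EndIf

; The same pattern plus an optional trailing comma removes the node
$sOut = StringRegExpReplace($sIn, $sPattern & ',?', '')
ConsoleWrite('Remaining: ' & $sOut & @CRLF)
```

If the inner node can be the first member of its object, or the last one (leaving a dangling comma behind), the lookbehind and the trailing ,? would both need adjusting, so treat this as a starting point rather than a general solution.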
