leuce, posted May 18, 2009

G'day everyone

On a line like this:

    $foo = '<f1>This is the </f1><f2>BIGGEST</f2> one<s0/>!'

I can either do this:

    $tags = StringRegExp ($foo, '(<[/]{0,1}[a-z]{0,3}[0-9]{0,5}[/]{0,1}>)', 3)

to get <f1>, </f1>, <f2>, </f2> and <s0/>

or I can do this:

    $tags = StringRegExp ($foo, '([A-Z]+[a-z]+)', 3)

to get "This" and "BIGGEST"

But how can I get both? I mean, how can I get the array to be: <f1>, This, </f1>, <f2>, BIGGEST, </f2> and <s0/>

If I do this:

    $tags = StringRegExp ($foo, '(<[/]{0,1}[a-z]{0,3}[0-9]{0,5}[/]{0,1}>)([A-Z]+[a-z]*)', 3)

then the array is only this: <f1>, This, <f2>, BIGGEST

My question is: how can I get both patterns evaluated at the same time?

Thanks
MrMitchell, posted May 18, 2009

Maybe use a pipe (|); I think that's the OR operator.
SmOke_N (Moderator), posted May 18, 2009 (edited)

Others will post code, but to be honest, I'm not going to until I can see exactly what you're asking about.

Are you saying you don't want two elements, [0] = This and [1] = Biggest, and that you only want one, [0] = This Biggest?

Try to be much more specific about what you want the output to be when posting regex questions, so others don't waste their time coming up with solutions that don't meet your needs.

Edited May 18, 2009 by SmOke_N

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.
leuce (Author), posted May 18, 2009

On a line like this:

    $foo = '<f1>This is the </f1><f2>BIGGEST</f2> one<s0/>!'

I've tinkered on and found that this line works (almost):

    $tags = StringRegExp ($foo, '(<[/]{0,1}[a-z]{0,3}[0-9]{0,5}[/]{0,1}>)+?|([A-Z]+[a-z]*)+?', 3)

...except that for some reason two blank array items are added at positions 2 and 6 (just before "This" and "BIGGEST").
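A possible explanation for the blank items, offered as a sketch rather than a confirmed diagnosis: with flag 3, StringRegExp returns the captured groups of each match, and whenever the second alternative matches, the unused first group still seems to contribute an empty element. Dropping the capturing groups, so that each whole match is returned instead, appears to avoid this. The variable names below are illustrative and the behaviour may depend on the AutoIt version.

```autoit
#include <Array.au3>

Local $sLine = '<f1>This is the </f1><f2>BIGGEST</f2> one<s0/>!'

; No capturing groups: with flag 3 each array element should be one whole
; match, either a tag (left of the pipe) or a capitalised word (right of it).
Local $sPattern = '</?[a-z]{0,3}[0-9]{0,5}/?>|[A-Z]+[a-z]*'
Local $aItems = StringRegExp($sLine, $sPattern, 3)

If @error Then
    ; Flag 3 sets @error when nothing matches at all.
    ConsoleWrite('No tags or capitalised words found' & @CRLF)
Else
    ; Print the elements on one line, separated by commas.
    ConsoleWrite(_ArrayToString($aItems, ', ') & @CRLF)
EndIf
```

On the sample line this should list <f1>, This, </f1>, <f2>, BIGGEST, </f2> and <s0/> in document order, with no blank entries in between.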
SmOke_N (Moderator), posted May 18, 2009

Quote (leuce): I've tinkered on and found that this line works (almost):

    $tags = StringRegExp ($foo, '(<[/]{0,1}[a-z]{0,3}[0-9]{0,5}[/]{0,1}>)+?|([A-Z]+[a-z]*)+?', 3)

...except that for some reason two blank array items are added at positions 2 and 6 (just before "This" and "BIGGEST").

If you don't mind there being more than one element in the array, this should fit your needs:

    "<f\d+>(\w+).*?</f\d+>"
leuce (Author), posted May 18, 2009 (edited)

Quote (SmOke_N): Others will post code, but to be honest, I'm not going to until I can see exactly what you're asking about. Are you saying you don't want two elements, [0] = This and [1] = Biggest, and that you only want one, [0] = This Biggest?

Thanks for asking... I did not realise that my question was unclear. I did not want to post lots and lots of irrelevant code, see.

I'm working on a script that will read a line of XML-like text from a file, reduce it to the XML-like tags, and then enable the user to paste the XML-like tags into some text editor one after the other. The script doing that works fine (as long as the line of text actually contains tags). Here it is:

    #cs
    On my keyboard, the M, < and > are next to each other bottom right.
    So I used these three keys for selecting the tags and accepting them.
    If you use the home row, and you drop your hand one row, the < and >
    should be under your middle and ring finger, and the M should be under
    your forefinger. On the home row, the K is under your middle finger.

    Ctrl+< = previous tag
    Ctrl+> = next tag
    Ctrl+M = accept tag
    Ctrl+K = accept tag

    You can also accept the tag by pressing the left or right arrow,
    obviously, but Ctrl+M (or Ctrl+K) will ensure that the cursor is
    directly after the tag and not one position away from it.
    If you use Ctrl <, >, M and K anywhere else, nothing will happen.
    #ce

    Global $foo
    Global $tags
    Global $len
    Global $ben
    Global $numtags
    Global $j
    Global $a

    HotKeySet("^m", "Cancel")
    HotKeySet("^k", "Cancel")
    HotKeySet("^.", "NextTag")
    HotKeySet("^,", "PrevTag")

    $j = 0
    $a = 1
    $z = 1

    ; Initial read of the source file; the handle is used for reading and closing
    $time1 = FileGetTime ("source.txt", 0, 1)
    $fileopen = FileOpen ("source.txt", 128)
    $foo = FileRead ($fileopen)
    FileClose ($fileopen)
    $tags = StringRegExp ($foo, '(<[/]{0,1}[a-z]{0,3}[0-9]{0,5}[/]{0,1}>)', 3)
    $numtags = UBound($tags)

    ; Poll once a second and re-read the file when its timestamp changes
    While 1
        Sleep(1000)
        $time2 = FileGetTime ("source.txt", 0, 1)
        If $time2 <> $time1 Then
            $fileopen = FileOpen ("source.txt", 128)
            $foo = FileRead ($fileopen)
            FileClose ($fileopen)
            $tags = StringRegExp ($foo, '(<[/]{0,1}[a-z]{0,3}[0-9]{0,5}[/]{0,1}>)', 3)
            $numtags = UBound($tags)
            $time1 = FileGetTime ("source.txt", 0, 1)
        EndIf
    WEnd

    ; ================================

    ; Paste the next tag (wrapping around) and leave it selected
    Func NextTag ()
        If $a = 1 Then
            If WinActive ("OmegaT", "") Then
                If $j = $numtags - 1 Then
                    $j = 0
                Else
                    $j = $j + 1
                EndIf
                ClipPut ($tags[$j])
                $len = StringLen ($tags[$j])
                Send ("^v")
                Send ("{LEFT " & $len & "}")
                Send ("+{RIGHT " & $len & "}")
                Send ("{ESC}")
                Send ("{ESC}")
            EndIf
        EndIf
    EndFunc

    ; Paste the previous tag (wrapping around) and leave it selected
    Func PrevTag ()
        If $a = 1 Then
            If WinActive ("OmegaT", "") Then
                If $j = 0 Then
                    $j = $numtags - 1
                Else
                    $j = $j - 1
                EndIf
                ClipPut ($tags[$j])
                $len = StringLen ($tags[$j])
                Send ("^v")
                Send ("{LEFT " & $len & "}")
                Send ("+{RIGHT " & $len & "}")
                Send ("{ESC}")
                Send ("{ESC}")
            EndIf
        EndIf
    EndFunc

    ; Accept the tag: drop the selection and put the cursor right after it
    Func Cancel ()
        If WinActive ("OmegaT", "") Then
            Send ("{LEFT}")
            Send ("{RIGHT}")
        EndIf
    EndFunc

The line of text appears in source.txt, and the program in which the user wants to paste the XML-like tags is called OmegaT. This is actually a translators' tool, and source.txt contains an exported version of the source text that the translator is translating.

But I want the script to be more fancy. At present, if source.txt contains this:

    <f1>This is the </f1><f2>BIGGEST</f2> one<s0/>!

then the script will enable the user to insert the following tags:

    <f1>
    </f1>
    <f2>
    </f2>
    <s0/>

...which is useful enough. But wouldn't it be great if the script could also assist the user to insert all words that start with capital letters, and words that consist entirely of capital letters? Such words are usually proper names that don't need translating and that can be carried over from the source text directly to the translation.

So, I want the script to enable the user to insert any of the following:

    <f1>
    This
    </f1>
    <f2>
    BIGGEST
    </f2>
    <s0/>

Is this clear enough?

PS I think my script above still has a few bugs, e.g. if source.txt contains no tags then the script dies, so I still need to fix that.

Edited May 18, 2009 by leuce
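On the "no tags" crash mentioned in the PS, one way it might be guarded against is sketched below. The likely cause, as an assumption, is that StringRegExp with flag 3 does not return an array when nothing matches, so indexing $tags then fails. The helper name _GetTags and the sample text are hypothetical and not part of the script above.

```autoit
; Hypothetical helper: returns an array of tags found in $sText,
; or an empty string when the text contains no tags at all.
Func _GetTags($sText)
    Local $aFound = StringRegExp($sText, '(<[/]{0,1}[a-z]{0,3}[0-9]{0,5}[/]{0,1}>)', 3)
    ; On no match the result is not an array, so report "nothing found".
    If Not IsArray($aFound) Then Return ''
    Return $aFound
EndFunc

; Made-up input without any tags, to exercise the failure path.
Local $aTags = _GetTags('plain text without tags')
If IsArray($aTags) Then
    ConsoleWrite(UBound($aTags) & ' tag(s) found' & @CRLF)
Else
    ConsoleWrite('No tags found, so the hotkeys should do nothing' & @CRLF)
EndIf
```

In the script itself, a similar check at the top of NextTag and PrevTag (for example, returning early when $numtags is 0) would presumably be enough to keep the hotkeys from indexing an empty result.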
leuce (Author), posted May 18, 2009

Quote (leuce): But wouldn't it be great if the script could also assist the user to insert all words that start with capital letters, and words that consist entirely of capital letters? Such words are usually proper names that don't need translating and that can be carried over from the source text directly to the translation.

If there is no solution to my problem, don't worry about it. I decided to map capital letters to a different set of hotkeys.

The final script is here: http://leuce.com/tempfile/omtautoit/taggrabber.zip
MrMitchell, posted May 18, 2009

Maybe this regex?

    "(</?\w*/?>)|([A-Z]+(\w)*)"

It might need a little tweaking. Basically, it should grab any tag that contains only word characters between < and >, or anything that begins with a capital letter. I know I'm missing something, if not a few things, so look over it carefully. But is this what you are aiming at?
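A quick way to try this suggestion is sketched below. The pattern is an adaptation of the regex above with the capturing groups removed, not MrMitchell's exact expression, and the second test line is made up for illustration. Note that \w also matches digits, underscores and capital letters, so it is broader than [a-z].

```autoit
; Adapted pattern: a tag made of word characters, or a word that starts
; with a capital letter. No capturing groups, so flag 3 returns whole matches.
Local $sPattern = '</?\w*/?>|[A-Z]\w*'

; The first line is the sample from the thread; the second is invented.
Local $aLines[2] = ['<f1>This is the </f1><f2>BIGGEST</f2> one<s0/>!', _
        '<b0>OmegaT</b0> reads TMX_14 files']

For $sLine In $aLines
    Local $aItems = StringRegExp($sLine, $sPattern, 3)
    If @error Then ContinueLoop ; nothing matched on this line
    For $sItem In $aItems
        ConsoleWrite($sItem & @CRLF)
    Next
    ConsoleWrite('---' & @CRLF)
Next
```

The difference from [A-Z]+[a-z]* shows up on mixed-case names: that pattern would split "OmegaT" into "Omega" and "T", whereas [A-Z]\w* keeps it whole. Which behaviour is wanted depends on the source texts.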
Authenticity, posted May 18, 2009

If you want, you could also allow HTML-style tags with attributes, such as <img src="\Smiles\lol.gif">lol</img>, or build the tree recursively, so that <html><something></something></html> is matched as a whole <html>...</html> element with its contents as a separate unit, much as a browser does when it parses a document into a DOM.