Champak Posted January 26, 2014 (edited)

I have the following example address strings that I need to search through and edit:

482 Albany Shaker Rd Osborne Rd
Albany, NY 12211
875 New Scotland Ave opp Whitehall Rd
Albany, NY 12208
64 Colvin Ave Central Ave
Albany, NY 12206
62 Exchange St
Albany, NY 12205
477 Delaware Ave Near Whitehall Rd
Albany, NY 12209
351 Southern Blvd
Albany, NY 12209
477 Delaware Ave Whitehall Rd
Albany, NY 12209
553 Washington Ave Ontario St
Albany, NY 12206
591 Broadway (NY-32) opp Fed Ex Plaza, near Village One Apts
Albany, NY 12204
442 Madison Ave
Albany, NY 12208
484 Loudon Rd (US-9) near Turner Ln, E of Siena
Albany, NY 12211
821 New Scotland Ave near Crescent Dr
Albany, NY 12208

What I'm trying to do is get rid of everything on the first line that follows the FIRST road designation. The problem, as you can see, is that the first line, which contains the street address, sometimes also contains a cross street. I can't simply loop over the designations and look for the first one, because the second one in the line might be matched first, depending on the order in which I load the designations into the array. Take the last address, for example. If I loop looking for the designation and "Dr" is the first one in the array, it won't fix my problem, but if "Ave" is first, I can program it to delete everything after "Ave". I can't program it to look for the fourth word either, because the address can contain anywhere from 2 to 4 words. See my dilemma?

Can I get some help with this? I'm stuck. I have no example code, because I don't know where to begin. Thanks.

Edited January 26, 2014 by Champak

JohnOne Posted January 26, 2014

Post an example of what you want to be left with, based on your above target text.

Malkey Posted January 26, 2014 (edited) Solution

Try this.

Local $sTestString = StringRegExpReplace(FileRead(@ScriptFullPath), "(?is)^.+#cs\v*(.+)#ce.*$", "\1") ; Extract test data from this script.
;ConsoleWrite($sTestString & @LF)
ConsoleWrite(StringRegExpReplace($sTestString, "(?i)(\d+.+?\b(Rd|St|Ave|\(?[A-Z]+-\d+\)?|Dr|Blvd|Ln|Aly|Cres|Ct|Ter))[\s,.]\V*", "\1") & @LF)

#cs
482 Albany Shaker Rd Osborne Rd
Albany, NY 12211
875 New Scotland Ave opp Whitehall Rd
Albany, NY 12208
64 Colvin Ave Central Ave
Albany, NY 12206
62 Exchange St
Albany, NY 12205
477 Saly Aly Near Whitehall Rd
Albany, NY 12209
351 Southern Blvd
Albany, NY 12209
477 Delaware Ave Whitehall Rd
Albany, NY 12209
553 Terrance Ter, Ontario St
Albany, NY 12206
591 Broadway (NY-32) opp Fed Ex Plaza, near Village One Apts
Albany, NY 12204
442 Madison Ave
Albany, NY 12208
484 Loudon Rd (US-9) near Turner Ln, E of Siena
Albany, NY 12211
821 New Scotland Ave near Crescent Dr
Albany, NY 12208
#ce

Edit: Added "|Blvd".

Edit2: The RE pattern was "(?i)(\d+.+?(Rd|St|Ave|\(?[A-Z]+-\d+\)?|Dr|Blvd))\V*". Just in case a street type exists as a sub-string within the street name, I added "\b" before the street type group and "[\s,.]" after it. Now, if a street type exists within a street name, that street type will not be matched. However, if there were a terrace called Ter, "553 Ter Ter" would be reduced to "553 Ter" by mistake. "477 Saly Aly" and "553 Terrance Ter" are fine.

Edited January 27, 2014 by Malkey

Champak Posted January 28, 2014 (edited) Author

THANKS! Almost perfect. A few issues.

1221 Western Ave @Homestead St
Albany, NY 12203
produces:
1221 West
Albany, NY 12203
instead of:
1221 Western Ave
Albany, NY 12203

And

116 Broadway (RT-32) @Wards Ln
Albany, NY 12204
produces:
116 Broadway (RT-32)
Albany, NY 12204
instead of:
116 Broadway
Albany, NY 12204

And

195 21st Ave Madison St
Paterson, NJ 07501
produces:
195 21st
Paterson, NJ 07501
instead of:
195 21st Ave
Paterson, NJ 07501

Basically, it seems that if a street designation is contained within the street name, as in "1ST", "WeSTchester", "EdwaRD", "TAVErn" or "DRum", the function cuts everything off after that point instead of after the actual street designation.

As for the second example, could you show me separately how I would remove (RT-32) or a variant of it? In places like NJ the "RT-??" is the actual street name of the address, so I'm not 100% sure I want to remove it yet.

Thanks.

Edited January 28, 2014 by Champak
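
The substring problem described above can be reproduced in isolation. The following is a minimal sketch, not code from the thread: the test addresses (including the made-up "12 Stanley Ave Main St") and the shortened type lists are assumptions chosen for illustration. It compares two ways of keeping a street type from matching inside a name: requiring a leading space, and requiring a word boundary (`\b`) on both sides.

```autoit
; Hypothetical test lines; the third one is invented to show a name that starts with "St".
Local $aTests[3] = ["195 21st Ave Madison St", "1221 Western Ave @Homestead St", "12 Stanley Ave Main St"]

; Illustrative patterns with a shortened type list, not the full patterns from the thread.
Local $sLeadingSpace = "(?i)(\d+.+?( Rd| St| Ave| Dr))\V*"
Local $sWordBoundary = "(?i)(\d+.+?\b(Rd|St|Ave|Dr)\b)\V*"

; Print both results side by side for each test line.
For $i = 0 To UBound($aTests) - 1
    ConsoleWrite(StringRegExpReplace($aTests[$i], $sLeadingSpace, "\1") & " | " & _
            StringRegExpReplace($aTests[$i], $sWordBoundary, "\1") & @LF)
Next
```

Both patterns should handle "21st" and "Western", but the leading-space version can still stop at " St" inside " Stanley", because nothing checks what follows the type. The `\b` on both sides only accepts a street type that stands as a whole word.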

Champak Posted January 28, 2014 Author

I fixed the first issue by adding a space before the designations, like this:

StringRegExpReplace($UNIVERSALVAR2, "(?i)(\d+.+?( Rd| St| Ave|\(?[A-Z]+-\d+\)?| Dr| Blvd))\V*", "\1")

Let me know if that's not the best way to do this.

Also, I'm trying to retrieve a specific, variable string from a large paragraph. The only constant is that the string, a number of 1 to 3 digits, always follows "Summary @crlf @crlf There are ". How can I retrieve this numeric string? I have a feeling it will involve StringRegExp, but for as long as I've been doing this, that and DllCall are two things I just can't seem to get.

Thanks.
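
The number-extraction question above is not answered later in the thread. One possible approach, as a sketch: `StringRegExp` with flag 1 returns an array of captured groups and sets @error when nothing matches. The sample text below is invented, since the real paragraph is not shown.

```autoit
; Hypothetical sample text; only the "Summary", two line breaks, "There are " prefix comes from the post.
Local $sText = "Some report text." & @CRLF & "Summary" & @CRLF & @CRLF & "There are 42 items listed."

; \R matches one newline sequence (@CRLF, @CR or @LF).
; (\d{1,3}) captures 1 to 3 digits; \b rejects a longer number such as "1234".
Local $aMatch = StringRegExp($sText, "Summary\R\RThere are (\d{1,3})\b", 1) ; flag 1 = return array of matches

If @error Then
    ConsoleWrite("Number not found" & @LF)
Else
    ConsoleWrite("Number: " & $aMatch[0] & @LF)
EndIf
```

If the text can contain extra spaces or blank lines between "Summary" and "There are", the `\R\R` part would need loosening, for example to `\s+`.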

Malkey Posted January 28, 2014

When I run the Edit2 version of the example in my post #3, examples 1 and 3 from your post #4 already output the required "instead of" results.

To match "(NY-32)", I have used "\(?[A-Z]+-\d+\)?". In the Edit2 version of the example in my post #3, a match of "(NY-32)" is included in the output. In this post's example, a match of "(NY-32)" is not included in the output.

Local $sTestString = StringRegExpReplace(FileRead(@ScriptFullPath), "(?is)^.+#cs\v*(.+)#ce.*$", "\1") ; Extract test data from this script.
;ConsoleWrite($sTestString & @LF)
ConsoleWrite(StringRegExpReplace($sTestString, "(?i)(\d+.+?)(\b(Rd|St|Ave|Dr|Blvd|Ln|Aly|Cres|Ct|Ter)|(?:\(?[A-Z]+-\d+\)?))[\s,.]\V*", "\1\3") & @LF); |\(?[A-Z]+-\d+\)?

#cs
591 Broadway (NY-32) opp Fed Ex Plaza, near Village One Apts
Albany, NY 12204
195 21st Ave Madison St
Paterson, NJ 07501
1221 Western Ave @Homestead St
Albany, NY 12203
482 Albany Shaker Rd Osborne Rd
Albany, NY 12211
875 New Scotland Ave opp Whitehall Rd
Albany, NY 12208
64 Colvin Ave Central Ave
Albany, NY 12206
62 Exchange St
Albany, NY 12205
477 Saly Aly Near Whitehall Rd
Albany, NY 12209
351 Southern Blvd
Albany, NY 12209
477 Delaware Ave Whitehall Rd
Albany, NY 12209
553 Terrance Ter, Ontario St
Albany, NY 12206
442 Madison Ave
Albany, NY 12208
484 Loudon Rd (US-9) near Turner Ln, E of Siena
Albany, NY 12211
821 New Scotland Ave near Crescent Dr
Albany, NY 12208
#ce
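
For the request to remove "(RT-32)" as a separate step, a stand-alone replacement may be enough. This is a sketch under the assumption that the route tag is always parenthesised; `_StripRouteTag` is a hypothetical helper name, not something from the thread or from AutoIt's UDFs.

```autoit
; Hypothetical helper: removes a parenthesised route tag such as "(NY-32)", "(US-9)" or "(RT-32)".
Func _StripRouteTag($sAddress)
    ; \h* also removes the spaces or tabs directly in front of the tag.
    ; The parentheses are required here, so a bare "RT-32" used as a street name is left alone.
    Return StringRegExpReplace($sAddress, "(?i)\h*\([A-Z]+-\d+\)", "")
EndFunc

ConsoleWrite(_StripRouteTag("116 Broadway (RT-32) @Wards Ln") & @LF)
ConsoleWrite(_StripRouteTag("484 Loudon Rd (US-9) near Turner Ln, E of Siena") & @LF)
```

Keeping this as its own call makes it easy to switch off for regions where the route number is the actual street name.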

Champak Posted January 29, 2014 Author

Thanks

Champak Posted February 17, 2014 Author

An issue has popped up with this. If there is nothing to remove after the street designation and there is no period at the end of the street designation, the line feed isn't added. The line feed is only added when the period is there or when the string is edited. Does a separate StringRegExpReplace need to be run afterwards to take care of this, or can the fix be built into the existing one?

StringRegExpReplace($UNIVERSALVAR2, "(?i)(\d+.+?)(\b(Rd|St|Ave|Dr|Blvd|Boulevard|Ln|Lane|Pkwy|Way|Ally|Cres|Ct|Ter|Concourse|Hwy|Plaza)|(?:\(?[A-Z]+-\d+\)?))[\s,.]\V*", "\1\3" & @LF)
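
The thread ends without a reply to this. One plausible cause is that `[\s,.]` consumes the first character of the line break when the street designation is the last thing on the line, so the break is damaged before `\V*` runs. A sketch of one way to handle it inside the same call: replace `[\s,.]` with a lookahead, which checks the next character without consuming it, and stop appending @LF to the replacement. The shortened type list, the sample data, and the switch from `.+?` to `\V+?` (to make sure the match never crosses a line break) are assumptions of this sketch.

```autoit
; Hypothetical two-address sample using @CRLF line breaks.
Local $sAddresses = "62 Exchange St" & @CRLF & "Albany, NY 12205" & @CRLF & _
        "821 New Scotland Ave near Crescent Dr" & @CRLF & "Albany, NY 12208"

; (?=[\s,.]|$) only looks at what follows the street type; it consumes nothing,
; so a line break right after "St" stays intact. \V* then removes the rest of that line only.
Local $sPattern = "(?i)(\d+\V+?)(\b(Rd|St|Ave|Dr|Blvd|Ln)|(?:\(?[A-Z]+-\d+\)?))(?=[\s,.]|$)\V*"

; No @LF is appended to the replacement, because the original line breaks are never removed.
ConsoleWrite(StringRegExpReplace($sAddresses, $sPattern, "\1\3") & @LF)
```

With this shape, lines that need no trimming and lines that are trimmed both keep their original line ending, so no second pass should be needed.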