GPinzone

Strange RegEx fix that seems unnecessary

I wrote a program to convert straight single and double quotes to curly single and double quotes. The program uses regular-expression search-and-replace passes to swap the straight quotes for their curly counterparts while ignoring anything inside HTML tags.

#include <ButtonConstants.au3>
#include <EditConstants.au3>
#include <GUIConstantsEx.au3>
#include <WindowsConstants.au3>
#include <FontConstants.au3>


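Func Curlify($sInput)
    ; Pass 1: turn every straight single quote that is not inside an HTML tag into a closing curly quote.
    Local $sOutput = StringRegExpReplace($sInput, "'(?!([^<]+)?>)", "’")
    ; Pass 2: a closing quote at the start of the string or after whitespace (tags may sit in between) becomes an opening quote.
    $sOutput = StringRegExpReplace($sOutput, "((^|\s+)(<[^>]*>)*)’", "${1}‘")
    ; Passes 3 and 4: the same two steps for double quotes.
    $sOutput = StringRegExpReplace($sOutput, '"(?!([^<]+)?>)', "”")
    $sOutput = StringRegExpReplace($sOutput, "((^|\s+)(<[^>]*>)*)”", "${1}“")
    ; Fix nested quotes: a closing quote directly after an opening quote (tags may sit in between) becomes an opening quote.
    While StringRegExp($sOutput, "(‘|“)(<[^>]*>)*(’|”)")
        $sOutput = StringRegExpReplace($sOutput, "((‘|“)(<[^>]*>)*)’", "${1}‘")
        $sOutput = StringRegExpReplace($sOutput, "((‘|“)(<[^>]*>)*)”", "${1}“")
    WEnd

    Return $sOutput
EndFunc   ;==>Curlify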

$FormCurly = GUICreate("Form_Curly", 708, 439, 192, 124)
$Original = GUICtrlCreateEdit("", 0, 0, 305, 433)
GUICtrlSetFont(-1, 9, $FW_NORMAL, 0, "Courier New") ; Set the font of the previous control.
GUICtrlSetLimit(-1, 1500000)
$Modified = GUICtrlCreateEdit("", 408, 0, 297, 433)
GUICtrlSetFont(-1, 9, $FW_NORMAL, 0, "Courier New") ; Set the font of the previous control.
GUICtrlSetLimit(-1, 1500000)
$Curly = GUICtrlCreateButton("Curly", 320, 64, 75, 25)
$Reset = GUICtrlCreateButton("Reset", 320, 164, 75, 25)
GUISetState(@SW_SHOW)

While 1
    $nMsg = GUIGetMsg()
    Switch $nMsg
        Case $Curly
            GUICtrlSetData($Modified, Curlify(GUICtrlRead($Original)))

        Case $Reset
            GUICtrlSetData($Original, "")
            GUICtrlSetData($Modified, "")

        Case $GUI_EVENT_CLOSE
            Exit

    EndSwitch
WEnd
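
For anyone following along, here is a minimal sketch of how the tag-skipping lookahead from the first pass can be checked on its own. The sample string and the variable names are made up for illustration and are not part of the program above. ChrW(0x2019) stands in for the literal ’ so the snippet does not depend on how the script file is encoded. If the lookahead behaves as intended, only the apostrophe in "it's" should be replaced and the two quotes inside the tag should be left alone.

```autoit
; Hypothetical sample: two straight quotes inside a tag, one in the text.
Local $sIn = "<a href='x.html'>it's</a>"

; Same pattern as the first pass in Curlify(): match a straight single quote
; unless it is followed by optional non-"<" characters and then ">".
Local $sOut = StringRegExpReplace($sIn, "'(?!([^<]+)?>)", ChrW(0x2019))

; StringRegExpReplace() reports the number of replacements in @extended.
Local $iCount = @extended

ConsoleWrite("Replacements made: " & $iCount & @CRLF)
ConsoleWrite("Tag left untouched: " & (StringInStr($sOut, "href='x.html'") > 0) & @CRLF)
```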

 

So why am I posting a question about a program that works?  This regex:

    $sOutput = StringRegExpReplace($sOutput, "((^|\s+)(<[^>]*>)*)’", "${1}‘")

should be able to be rewritten without the "+":

    $sOutput = StringRegExpReplace($sOutput, "((^|\s)(<[^>]*>)*)’", "${1}‘")

Same goes for the one to do double quotes, obviously.

However, if I remove the "+" from the "\s" in the regex, the program fails in some cases. Something like <b>’<i>Don’t</i>’</b> at the beginning of a line won't get fixed. I suspect it has something to do with the fact that the previous character (aside from the HTML tags) is a newline, but "\s" should match newlines just fine. I'm stumped because the regex without the "+" quantifier works fine on http://www.regextester.com/ when I test it.
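
One way to narrow this down is a small side-by-side test of the two patterns. The sketch below is a hypothetical harness, not part of the original program: the inputs, array names, and the "converted" check are assumptions chosen for illustration. It feeds the sample from above to both patterns three ways (alone, after a CRLF line break, and after a bare LF) and reports whether the first quote after <b> was turned into an opening quote. It makes no claim about which cases fail; it only makes the comparison easy to run.

```autoit
; Build the curly quotes with ChrW() so the script's file encoding does not matter.
Local $sOpen = ChrW(0x2018)   ; opening single quote
Local $sClose = ChrW(0x2019)  ; closing single quote

; The sample from the post, as it looks after the first pass has run.
Local $sSample = "<b>" & $sClose & "<i>Don" & $sClose & "t</i>" & $sClose & "</b>"

; Hypothetical inputs: sample alone, after CRLF, after LF only.
Local $aInputs[3] = [$sSample, "first line" & @CRLF & $sSample, "first line" & @LF & $sSample]
Local $aLabels[3] = ["start of string", "after CRLF", "after LF"]

; The two patterns being compared: with and without the "+".
Local $aPatterns[2] = ["((^|\s+)(<[^>]*>)*)" & $sClose, "((^|\s)(<[^>]*>)*)" & $sClose]

For $i = 0 To UBound($aInputs) - 1
    For $j = 0 To UBound($aPatterns) - 1
        Local $sResult = StringRegExpReplace($aInputs[$i], $aPatterns[$j], "${1}" & $sOpen)
        ; Case-sensitive search for "<b>" followed by an opening quote.
        If StringInStr($sResult, "<b>" & $sOpen, 1) > 0 Then
            ConsoleWrite($aLabels[$i] & ", pattern " & $j & ": converted" & @CRLF)
        Else
            ConsoleWrite($aLabels[$i] & ", pattern " & $j & ": NOT converted" & @CRLF)
        EndIf
    Next
Next
```

Note that each pattern is run here as a single pass on its own; the nested-quote loop in Curlify() is deliberately left out so that any difference can be traced to the "+" alone.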

 

 

Edited by GPinzone

Gerard J. Pinzone, gpinzone AT yahoo.com


62 posts in the forum should mean you have an understanding of how to post code.




It may be my OCD, but I would expect examples of where this succeeds and fails. And if recent history is any indicator, I would then expect the thread to grow and carry on with edge cases not mentioned in those examples while a battle for regex supremacy runs wild.

*And put your stuff in code tags; it's for legibility and out of consideration for the people you want to help you. It's not OCD, it's just your ugly-ass word wall.

Edited by boththose



And if recent history is any indicator,  I would then expect the thread to grow and carry on with edge cases not mentioned in those examples while a battle for regex supremacy runs wild.

Well, who knows...  :D


OCD is a terrible mental issue.

With comments like that, it's pretty clear you won't be around the forum for very long.

Edited by guinness

