GPinzone

Strange RegEx fix that seems unnecessary


I wrote a program that converts straight single and double quotes to their curly counterparts. It uses regular-expression search and replace to swap the quotes while ignoring anything inside HTML tags.

#include <ButtonConstants.au3>
#include <EditConstants.au3>
#include <GUIConstantsEx.au3>
#include <WindowsConstants.au3>
#include <FontConstants.au3>


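Func Curlify($sInput)
    ; Turn every straight single quote into a closing quote, skipping quotes
    ; inside an HTML tag (those followed by ">" before the next "<").
    Local $sOutput = StringRegExpReplace($sInput, "'(?!([^<]+)?>)", "’")
    ; A quote at the start of the text or after whitespace (tags aside) is an opening quote.
    $sOutput = StringRegExpReplace($sOutput, "((^|\s+)(<[^>]*>)*)’", "${1}‘")
    ; Same two steps for double quotes.
    $sOutput = StringRegExpReplace($sOutput, '"(?!([^<]+)?>)', "”")
    $sOutput = StringRegExpReplace($sOutput, "((^|\s+)(<[^>]*>)*)”", "${1}“")
    ; Fix nested quotes: a quote directly after an opening quote (tags aside) is also an opening quote.
    While StringRegExp($sOutput, "(‘|“)(<[^>]*>)*(’|”)")
        $sOutput = StringRegExpReplace($sOutput, "((‘|“)(<[^>]*>)*)’", "${1}‘")
        $sOutput = StringRegExpReplace($sOutput, "((‘|“)(<[^>]*>)*)”", "${1}“")
    WEnd

    Return $sOutput
EndFunc   ;==>Curlify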

$FormCurly = GUICreate("Form_Curly", 708, 439, 192, 124)
$Original = GUICtrlCreateEdit("", 0, 0, 305, 433)
GUICtrlSetFont(-1, 9, $FW_NORMAL, 0, "Courier New") ; Set the font of the previous control.
GUICtrlSetLimit(-1, 1500000)
$Modified = GUICtrlCreateEdit("", 408, 0, 297, 433)
GUICtrlSetFont(-1, 9, $FW_NORMAL, 0, "Courier New") ; Set the font of the previous control.
GUICtrlSetLimit(-1, 1500000)
$Curly = GUICtrlCreateButton("Curly", 320, 64, 75, 25)
$Reset = GUICtrlCreateButton("Reset", 320, 164, 75, 25)
GUISetState(@SW_SHOW)

While 1
    $nMsg = GUIGetMsg()
    Switch $nMsg
        Case $Curly
            GUICtrlSetData($Modified, Curlify(GUICtrlRead($Original)))

        Case $Reset
            GUICtrlSetData($Original, "")
            GUICtrlSetData($Modified, "")

        Case $GUI_EVENT_CLOSE
            Exit

    EndSwitch
WEnd

 

So why am I posting a question about a program that works?  This regex:

    $sOutput = StringRegExpReplace($sOutput, "((^|\s+)(<[^>]*>)*)’", "${1}‘")

should be able to be rewritten without the "+":

    $sOutput = StringRegExpReplace($sOutput, "((^|\s)(<[^>]*>)*)’", "${1}‘")

Same goes for the one to do double quotes, obviously.

However, if I remove the "+" from the "\s" in the regex, the program fails in some cases. Something like <b>’<i>Don’t</i>’</b> at the beginning of a line won't get fixed. I suspect it has something to do with the fact that the previous character (aside from the HTML tags) is a newline, but "\s" should match newlines just fine. I'm stumped because the regex without the "+" quantifier works fine on http://www.regextester.com/ when I test it.
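One way to narrow this down is to run just the opening-single-quote step outside the GUI, with the sample placed at the start of the string, after a bare LF, and after a CRLF (an edit control on Windows normally hands back CRLF line breaks). The sketch below is only a diagnostic, not a fix. _OpenSingle() and all variable names are made up for this example, and the curly quotes are built with ChrW() on the assumption that this removes any dependence on how the script file is encoded.

```autoit
; Diagnostic sketch only. _OpenSingle() and the variable names are
; hypothetical and not part of the original program.
Global $g_sRsq = ChrW(0x2019) ; right single quotation mark
Global $g_sLsq = ChrW(0x2018) ; left single quotation mark

; Runs only the "opening single quote" step. The whitespace part of the
; pattern ("\s" or "\s+") is supplied by the caller.
Func _OpenSingle($sText, $sWs)
    Local $sPattern = "((^|" & $sWs & ")(<[^>]*>)*)" & $g_sRsq
    Return StringRegExpReplace($sText, $sPattern, "${1}" & $g_sLsq)
EndFunc   ;==>_OpenSingle

; The sample from the post, as it looks after the first pass has turned
; every straight quote into a closing quote.
Local $sSample = "<b>" & $g_sRsq & "<i>Don" & $g_sRsq & "t</i>" & $g_sRsq & "</b>"

; Three contexts: nothing before the sample, a line ending in LF, a line ending in CRLF.
Local $aPrefix[3]
$aPrefix[0] = ""
$aPrefix[1] = "First line" & @LF
$aPrefix[2] = "First line" & @CRLF
Local $aLabel[3] = ["start of string", "after LF", "after CRLF"]
Local $sPlain, $sPlus

For $i = 0 To UBound($aPrefix) - 1
    $sPlain = _OpenSingle($aPrefix[$i] & $sSample, "\s")
    $sPlus = _OpenSingle($aPrefix[$i] & $sSample, "\s+")
    ; The step worked if the quote right after <b> is now an opening quote.
    ConsoleWrite($aLabel[$i] & ": \s -> " & (StringInStr($sPlain, "<b>" & $g_sLsq) > 0) & _
            ", \s+ -> " & (StringInStr($sPlus, "<b>" & $g_sLsq) > 0) & @CRLF)
Next
```

Run it from SciTE and read the console pane. Any line that reports False for \s but True for \s+ is the case worth posting. If every line reports True, the difference probably lies elsewhere, for example in the text that actually comes out of the edit control.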

 

 

Edited by GPinzone

Gerard J. Pinzone
gpinzone AT yahoo.com


62 posts in the forum should mean you have an understanding of how to post code.


UDF List:

 
_AdapterConnections()_AlwaysRun()_AppMon()_AppMonEx()_ArrayFilter/_ArrayReduce_BinaryBin()_CheckMsgBox()_CmdLineRaw()_ContextMenu()_ConvertLHWebColor()/_ConvertSHWebColor()_DesktopDimensions()_DisplayPassword()_DotNet_Load()/_DotNet_Unload()_Fibonacci()_FileCompare()_FileCompareContents()_FileNameByHandle()_FilePrefix/SRE()_FindInFile()_GetBackgroundColor()/_SetBackgroundColor()_GetConrolID()_GetCtrlClass()_GetDirectoryFormat()_GetDriveMediaType()_GetFilename()/_GetFilenameExt()_GetHardwareID()_GetIP()_GetIP_Country()_GetOSLanguage()_GetSavedSource()_GetStringSize()_GetSystemPaths()_GetURLImage()_GIFImage()_GoogleWeather()_GUICtrlCreateGroup()_GUICtrlListBox_CreateArray()_GUICtrlListView_CreateArray()_GUICtrlListView_SaveCSV()_GUICtrlListView_SaveHTML()_GUICtrlListView_SaveTxt()_GUICtrlListView_SaveXML()_GUICtrlMenu_Recent()_GUICtrlMenu_SetItemImage()_GUICtrlTreeView_CreateArray()_GUIDisable()_GUIImageList_SetIconFromHandle()_GUIRegisterMsg()_GUISetIcon()_Icon_Clear()/_Icon_Set()_IdleTime()_InetGet()_InetGetGUI()_InetGetProgress()_IPDetails()_IsFileOlder()_IsGUID()_IsHex()_IsPalindrome()_IsRegKey()_IsStringRegExp()_IsSystemDrive()_IsUPX()_IsValidType()_IsWebColor()_Language()_Log()_MicrosoftInternetConnectivity()_MSDNDataType()_PathFull/GetRelative/Split()_PathSplitEx()_PrintFromArray()_ProgressSetMarquee()_ReDim()_RockPaperScissors()/_RockPaperScissorsLizardSpock()_ScrollingCredits_SelfDelete()_SelfRename()_SelfUpdate()_SendTo()_ShellAll()_ShellFile()_ShellFolder()_SingletonHWID()_SingletonPID()_Startup()_StringCompact()_StringIsValid()_StringRegExpMetaCharacters()_StringReplaceWholeWord()_StringStripChars()_Temperature()_TrialPeriod()_UKToUSDate()/_USToUKDate()_WinAPI_Create_CTL_CODE()_WinAPI_CreateGUID()_WMIDateStringToDate()/_DateToWMIDateString()Au3 script parsingAutoIt SearchAutoIt3 PortableAutoIt3WrapperToPragmaAutoItWinGetTitle()/AutoItWinSetTitle()CodingDirToHTML5FileInstallrFileReadLastChars()GeoIP databaseGUI - Only Close ButtonGUI 
ExamplesGUICtrlDeleteImage()GUICtrlGetBkColor()GUICtrlGetStyle()GUIEventsGUIGetBkColor()Int_Parse() & Int_TryParse()IsISBN()LockFile()Mapping CtrlIDsOOP in AutoItParseHeadersToSciTE()PasswordValidPasteBinPosts Per DayPreExpandProtect GlobalsQueue()Resource UpdateResourcesExSciTE JumpSettings INISHELLHOOKShunting-YardSignature CreatorStack()Stopwatch()StringAddLF()/StringStripLF()StringEOLToCRLF()VSCROLLWM_COPYDATAMore Examples...

Updated: 22/04/2018


it may be my ocd, but i would expect examples of where this succeeds and fails.  And if recent history is any indicator,  I would then expect the thread to grow and carry on with edge cases not mentioned in those examples while a battle for regex supremacy runs wild.

 

*and put your stuff in code tags, it's for legibility purposes and consideration of the people you want to help you.  it's not ocd, it's just your ugly ass word wall.

Edited by boththose



And if recent history is any indicator,  I would then expect the thread to grow and carry on with edge cases not mentioned in those examples while a battle for regex supremacy runs wild.

Well, who knows...  :D


OCD is a terrible mental issue.

With comments like that, it's pretty clear you won't be around the forum for very long.

Edited by guinness


it may be my ocd, but i would expect examples of where this succeeds and fails.

I did:

Something like <b>’<i>Don’t</i>’</b> at the beginning of a line won't get fixed.


Gerard J. Pinzone
gpinzone AT yahoo.com
