AZJIO Posted May 28, 2013

To get the full range:

    ; Scan U+0080..U+FFFF and collect every character that has distinct
    ; upper- and lower-case forms into a character-class range string.
    $timer = TimerInit()
    $sRange = _GetRangeSPE()
    MsgBox(0, "Timer", Round(TimerDiff($timer) / 1000, 2) & ' sec')
    $hFile = FileOpen(@ScriptDir & '\Range.txt', 2) ; 2 = overwrite
    FileWrite($hFile, $sRange)
    FileClose($hFile)

    Func _GetRangeSPE()
        Local $Lower, $Upper, $s, $sRange, $tmp, $trg1 = 0, $trg2 = 0
        For $i = 0x80 To 0xFFFF
            $s = ChrW($i)
            $Upper = StringUpper($s)
            $Lower = StringLower($s)
            If Not ($Upper == $Lower) Then
                $trg1 += 1 ; length of the current run of cased characters
                $tmp = $i ; last cased codepoint seen
            Else
                $trg1 = 0
            EndIf
            Switch $trg1
                Case 1 ; run starts: write its first codepoint
                    $sRange &= '\x{' & Hex($i, 4) & '}'
                Case 2 ; run has two members: remember to write the last one
                    $trg2 = 1
                Case 3 ; run has three or more members: write it as first-last
                    $sRange &= '-'
                Case 0 ; run ended: write its last codepoint if needed
                    If $trg2 Then
                        $trg2 = 0
                        $sRange &= '\x{' & Hex($tmp, 4) & '}'
                    EndIf
            EndSwitch
        Next
        Return $sRange
    EndFunc   ;==>_GetRangeSPE

    ; Variant 1: pass the precomputed range of cased characters.
    $sCaseRange = '\x{00C0}-\x{00D6}\x{00D8}-\x{00DE}\x{00E0}-\x{00F6}\x{00F8}-\x{012F}\x{0132}-\x{0137}\x{0139}-\x{0148}\x{014A}-\x{017E}\x{0181}-\x{018C}\x{018E}-\x{0194}\x{0196}-\x{0199}\x{019C}\x{019D}\x{019F}-\x{01A5}\x{01A7}-\x{01A9}\x{01AC}-\x{01B9}\x{01BC}\x{01BD}\x{01C4}\x{01C6}\x{01C7}\x{01C9}\x{01CA}\x{01CC}-\x{01EF}\x{01F1}\x{01F3}-\x{01F5}\x{01FA}-\x{0217}' & _
            '\x{0253}\x{0254}\x{0256}\x{0257}\x{0259}\x{025B}\x{0260}\x{0263}\x{0268}\x{0269}\x{026F}\x{0272}\x{0275}\x{0283}\x{0288}\x{028A}\x{028B}\x{0292}' & _
            '\x{0386}\x{0388}-\x{038A}\x{038C}\x{038E}\x{038F}\x{0391}-\x{03A1}\x{03A3}-\x{03AF}\x{03B1}-\x{03CE}\x{03E2}-\x{03EF}' & _
            '\x{0401}-\x{040C}\x{040E}-\x{044F}\x{0451}-\x{045C}\x{045E}-\x{0481}\x{0490}-\x{04BF}\x{04C1}-\x{04C4}\x{04C7}\x{04C8}\x{04CB}\x{04CC}\x{04D0}-\x{04EB}\x{04EE}-\x{04F5}\x{04F8}\x{04F9}' & _
            '\x{0531}-\x{0556}\x{0561}-\x{0586}\x{10A0}-\x{10C5}\x{1E00}-\x{1E95}\x{1EA0}-\x{1EF9}' & _
            '\x{1F00}-\x{1F15}\x{1F18}-\x{1F1D}\x{1F20}-\x{1F45}\x{1F48}-\x{1F4D}\x{1F51}\x{1F53}\x{1F55}\x{1F57}\x{1F59}\x{1F5B}\x{1F5D}\x{1F5F}-\x{1F7D}\x{1FB0}\x{1FB1}\x{1FB8}-\x{1FBB}\x{1FC8}-\x{1FCB}\x{1FD0}\x{1FD1}\x{1FD8}-\x{1FDB}\x{1FE0}\x{1FE1}\x{1FE5}\x{1FE8}-\x{1FEC}\x{1FF8}-\x{1FFB}' & _
            '\x{2160}-\x{217F}\x{24B6}-\x{24E9}\x{FF21}-\x{FF3A}\x{FF41}-\x{FF5A}'
    $timer = TimerInit()
    $sRes = __FO_UserLocale2('Она может задать диапазон не обязательно для русского языка. Check if a string fits a given regular expression pattern.', $sCaseRange)
    MsgBox(0, "Timer", Round(TimerDiff($timer), 2) & ' msec' & @LF & $sRes)

    Func __FO_UserLocale2($sMask, $sLocale)
        Local $s, $tmp
        ; Keep only the mask characters that fall inside the given range
        $sLocale = StringRegExpReplace($sMask, '[^' & $sLocale & ']', '')
        ; Remove duplicates (case-insensitively)
        $tmp = StringLen($sLocale)
        For $i = 1 To $tmp
            $s = StringMid($sLocale, $i, 1)
            If $s Then
                If StringInStr($sLocale, $s, 0, 2, $i) Then
                    $sLocale = $s & StringReplace($sLocale, $s, '')
                EndIf
            Else
                ExitLoop
            EndIf
        Next
        ; Replace every remaining character with an [Xx] class
        If $sLocale Then
            $tmp = StringSplit($sLocale, '')
            For $i = 1 To $tmp[0]
                $sMask = StringReplace($sMask, $tmp[$i], '[' & StringUpper($tmp[$i]) & StringLower($tmp[$i]) & ']')
            Next
        EndIf
        Return $sMask
    EndFunc   ;==>__FO_UserLocale2

    ; Variant 2: pass the full range and test each character for a case difference.
    $timer = TimerInit()
    $sRes = __FO_UserLocale('Она может задать диапазон не обязательно для русского языка. Check if a string fits a given regular expression pattern.', '\x{80}-\x{ffff}')
    MsgBox(0, "Timer", Round(TimerDiff($timer), 2) & ' msec' & @LF & $sRes)

    Func __FO_UserLocale($sMask, $sLocale)
        Local $s, $tmp
        $sLocale = StringRegExpReplace($sMask, '[^' & $sLocale & ']', '')
        $tmp = StringLen($sLocale)
        For $i = 1 To $tmp
            $s = StringMid($sLocale, $i, 1)
            If $s Then
                If StringInStr($sLocale, $s, 0, 2, $i) Then
                    $sLocale = $s & StringReplace($sLocale, $s, '')
                EndIf
            Else
                ExitLoop
            EndIf
        Next
        If $sLocale Then
            Local $Upper, $Lower
            $tmp = StringSplit($sLocale, '')
            For $i = 1 To $tmp[0]
                $Upper = StringUpper($tmp[$i])
                $Lower = StringLower($tmp[$i])
                ; Only characters with distinct upper and lower forms get a class
                If Not ($Upper == $Lower) Then $sMask = StringReplace($sMask, $tmp[$i], '[' & $Upper & $Lower & ']')
            Next
        EndIf
        Return $sMask
    EndFunc   ;==>__FO_UserLocale

I want the file search to match names regardless of letter case on any system. This is how I build the regular expression. Am I on the right track?
jchd Posted May 28, 2013

I'm sorry but I don't quite understand your need. The part I don't get is "files which names are sensitive to the register on any system".

Anyway, changing the case of Unicode codepoints is non-trivial. There are a number of problematic codepoints, like the German eszett, title-case ligatures, Turkish dotted vs. dotless i, and more.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.
SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)
AZJIO Posted May 28, 2013

For Cyrillic I build a pattern similar to [Ww][Oo][Rr][Dd]. Currently the range is set manually, but I wanted to do it automatically for all alphabets.

Quote: "There are a number of problematic codepoints, like German eszett, title case ligatures, Turkish dotted vs. dotless i and more."

Are you saying that the AutoIt3 functions (StringLower, StringUpper) will be faulty too?

Edited May 28, 2013 by AZJIO
jchd Posted May 28, 2013

That's not straightforward, even if our PCRE library were compiled with UCP support (which is sorely lacking).

Basic and extended Cyrillic are handled fine by ToUpper/ToLower but, as I said, some codepoints are difficult to handle. For instance a "westerner" would say [Ii] (that is [\x49\x69]) is OK, but a Turkish user would need both [Iı] (that is [\x49\x{0131}]) and [İi] (that is [\x{0130}\x69]). The same issue arises with dotted vs. dotless J, with the German eszett ß ⇄ SS (the newly introduced uppercase eszett makes that even worse), and with several codepoints that have distinct uppercase, titlecase and lowercase forms (e.g. DŽ, Dž, dž)...

Edited May 28, 2013 by jchd
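The Turkish pairing described above could be sketched as follows. This is only an illustration, not code from the thread: the helper name _CaseClass and its $bTurkish flag are hypothetical, and the default branch simply relies on whatever StringUpper/StringLower return on the current system. Like the functions in the first post, it does not escape regular-expression metacharacters.

```autoit
; Hypothetical helper (not from the thread): returns a bracket class for one character.
; $bTurkish = True applies the pairing described above: I <-> dotless i, dotted I <-> i.
Func _CaseClass($sChar, $bTurkish = False)
    Local $sDotlessI = ChrW(0x0131), $sDottedI = ChrW(0x0130)
    If $bTurkish Then
        ; == is the case-sensitive string comparison in AutoIt
        If $sChar == 'I' Or $sChar == $sDotlessI Then Return '[I' & $sDotlessI & ']'
        If $sChar == 'i' Or $sChar == $sDottedI Then Return '[' & $sDottedI & 'i]'
    EndIf
    Local $sUpper = StringUpper($sChar), $sLower = StringLower($sChar)
    ; No case distinction (digits, punctuation, ...): return the character unchanged
    If $sUpper == $sLower Then Return $sChar
    Return '[' & $sUpper & $sLower & ']'
EndFunc   ;==>_CaseClass

; Default pairing taken from StringUpper/StringLower
MsgBox(0, 'Default', _CaseClass('w') & @LF & _CaseClass('i'))
; Turkish pairing for the same letter
MsgBox(0, 'Turkish', _CaseClass('i', True) & @LF & _CaseClass('I', True))
```

A table of such exceptions would have to be maintained by hand for every language that needs one, which is the difficulty the post points out.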
jchd Posted May 28, 2013

As a last (from me) note in this thread, observe that a number of codepoints don't round-trip correctly, and there are also exceptional cases like the Greek sigma (one capital letter but two distinct lowercase letters, depending on whether the letter is in final position in a word).

    Local $s = 'ß'
    MsgBox(0, '', $s & @LF & StringUpper($s) & @LF & StringLower($s))
    $s = 'DŽ'
    MsgBox(0, '', $s & @LF & StringUpper($s) & @LF & StringLower($s))
    $s = 'SS'
    MsgBox(0, '', $s & @LF & StringUpper($s) & @LF & StringLower($s))
    $s = 'Dž'
    MsgBox(0, '', $s & @LF & StringUpper($s) & @LF & StringLower($s))
    $s = 'dž'
    MsgBox(0, '', $s & @LF & StringUpper($s) & @LF & StringLower($s))
    $s = 'Σ'
    MsgBox(0, '', $s & @LF & StringUpper($s) & @LF & StringLower($s))
    $s = 'σ'
    MsgBox(0, '', $s & @LF & StringUpper($s) & @LF & StringLower($s))
    $s = 'ς'
    MsgBox(0, '', $s & @LF & StringUpper($s) & @LF & StringLower($s))

More subtleties are detailed in this document.

Also, PCRE included limited support for codepoints having more than one "other case", like the Greek sigma, in version 8.32 (2012/12/30) and 8.33 (released today), but we're far behind that (and the goodness of many other new features like the JIT option).

Edited May 28, 2013 by jchd
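To see which codepoints fail to round-trip on a given system, a scan along these lines might help. It is a minimal sketch: _FindNonRoundTrip is a hypothetical name, the range in the call is only an example, and the result depends entirely on what StringUpper and StringLower return on the machine that runs it.

```autoit
; Hypothetical helper: lists codepoints whose case mapping does not round-trip.
Func _FindNonRoundTrip($iFirst, $iLast)
    Local $sOut = '', $sChar, $sUpper, $sLower
    For $i = $iFirst To $iLast
        $sChar = ChrW($i)
        $sUpper = StringUpper($sChar)
        $sLower = StringLower($sChar)
        If $sUpper == $sLower Then ContinueLoop ; caseless character, nothing to check
        ; A well-behaved letter equals one of its two forms, and converting
        ; either form to the other case yields the other form again
        If Not ($sChar == $sUpper Or $sChar == $sLower) _
                Or Not (StringLower($sUpper) == $sLower) _
                Or Not (StringUpper($sLower) == $sUpper) Then
            $sOut &= 'U+' & Hex($i, 4) & ' ' & $sChar & @LF
        EndIf
    Next
    Return $sOut
EndFunc   ;==>_FindNonRoundTrip

; Example range only: Latin-1 Supplement through Latin Extended-B
MsgBox(0, 'Non-round-tripping', _FindNonRoundTrip(0x80, 0x24F))
```

Codepoints reported by such a scan are candidates for a hand-written exception list; multi-character mappings such as ß ⇄ SS would still need separate treatment, because the scan works on single characters only.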
AZJIO Posted May 29, 2013

There are two options: either not to use it at all, or to use it with some exceptions that are difficult to handle. I think it will still be more convenient for many people, even if the implementation only covers 99% of cases. Let those who run into this problem report it.

I am interested in either the full range \x{80}-\x{ffff} or a partial one such as \x{014A}-\x{017E}\x{0181}-\x{018C}, etc. The second option is faster. I hope the Unicode ranges will not change.
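One way to get the speed of the partial list without hard-coding it would be to compute it once and cache it in a file. The following is a rough sketch under that assumption: the function names _LoadOrBuildRange, _BuildCaseRange and _FormatRun are hypothetical, the file name Range.txt is taken from the first script in this thread, and the run-building logic is a simplified stand-in for _GetRangeSPE (it writes two-member runs as first-last, which matches the same characters).

```autoit
; Load the cached range if it exists, otherwise build and store it.
Local $sRange = _LoadOrBuildRange(@ScriptDir & '\Range.txt')
MsgBox(0, 'Range', StringLeft($sRange, 200))

Func _LoadOrBuildRange($sFile)
    If FileExists($sFile) Then Return FileRead($sFile)
    Local $sBuilt = _BuildCaseRange(0x80, 0xFFFF)
    Local $hFile = FileOpen($sFile, 2) ; 2 = open for writing, erase previous content
    If $hFile <> -1 Then
        FileWrite($hFile, $sBuilt)
        FileClose($hFile)
    EndIf
    Return $sBuilt
EndFunc   ;==>_LoadOrBuildRange

; Collect runs of characters whose upper and lower forms differ
Func _BuildCaseRange($iFirst, $iLast)
    Local $sOut = '', $iStart = -1, $sChar
    For $i = $iFirst To $iLast
        $sChar = ChrW($i)
        If Not (StringUpper($sChar) == StringLower($sChar)) Then
            If $iStart = -1 Then $iStart = $i ; a new run begins
        ElseIf $iStart <> -1 Then
            $sOut &= _FormatRun($iStart, $i - 1) ; the run ended at the previous codepoint
            $iStart = -1
        EndIf
    Next
    If $iStart <> -1 Then $sOut &= _FormatRun($iStart, $iLast) ; close a run that reaches the end
    Return $sOut
EndFunc   ;==>_BuildCaseRange

; Format one run as \x{XXXX} or \x{XXXX}-\x{YYYY}
Func _FormatRun($iStart, $iEnd)
    Local $sRun = '\x{' & Hex($iStart, 4) & '}'
    If $iEnd > $iStart Then $sRun &= '-\x{' & Hex($iEnd, 4) & '}'
    Return $sRun
EndFunc   ;==>_FormatRun
```

Deleting the cache file forces a rebuild, so the list would follow whatever case tables the system provides instead of depending on the ranges never changing.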
guinness Posted May 29, 2013

Quote: "Also PCRE included limited support for codepoints having more than one "other cases" like the Greek sigma in version 8.32 (2012/12/30) and 8.33 (released today), but we're far behind that (and the goodness of many other new features like the JIT option)."

I see AutoIt v3.3.8.1 is using PCRE v8.12.
jchd Posted May 29, 2013

AZJIO,
I've developed a small SQLite extension to handle such issues "mostly gracefully". You could download the source and adapt it to your needs. Search for unifuzz in the forum.

guinness,
Yeah, we're still using a prehistoric version. That's a pity, since a large number of very useful features have been introduced since 8.12. The first bonus is the native support of UTF-16 (UTF-32 as well), which would avoid going back and forth with UTF-8 and would greatly speed up and simplify the code. Then many dark corners have been cleared, and finally the JIT engine is MUCH faster in most uses. Of course we also badly need callbacks, UCP, ...