Using function inside StringRegExpReplace with backreference


I'm working towards a regex solution, but I'm sticking this here as a placeholder because it adds a parachute for when the nested function goes out of bounds. There is probably a better way, though.

Local $table[2]

$table[0] = "cat"
$table[1] = "dog"

Func _decode($n)
    ; Parachute: return 0 instead of indexing past the end of $table
    If $n > UBound($table) - 1 Then Return 0
    Return $table[$n]
EndFunc

$string = "Hello this is a &0; and this is a &1;"

$i = -1

Do
    $i += 1
    $string = StringReplace($string, "&" & $i & ";", _decode($i))
    ; @extended holds the number of replacements made; stop at the first index with none
    If @extended = 0 Then ExitLoop
Until 0

MsgBox(0, '', $string)

 

,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)


Try this:

Func decode($n)
    Static $table = ["cat", "dog"]
    Return $table[$n]
EndFunc

Local $string = "Hello this is a &0; and this is a &1;"
Local $decoded = Execute("'" & StringRegExpReplace($string, "&(\d+);", "' & decode(\1) & '") & "'")

MsgBox(0, "Result", $decoded)

If your context needs a condom in case of untrusted input, prefer this:

Func decode($n)
    Static $table = ["cat", "dog"]
    Return ($n < 0 Or $n >= UBound($table)) ? '(null)' : $table[$n]
EndFunc
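
For anyone following along, it may help to look at the text that Execute() actually receives. This is only a minimal sketch, not from the original posts: _DemoDecode and the sample string are made-up names, and the quote-doubling step is an added assumption for input that may contain a single quote (which would otherwise end the '...' literal early and make Execute fail).

```autoit
; Hypothetical demo names; only the Execute/StringRegExpReplace pattern comes from the thread.
Func _DemoDecode($n)
    Static $table = ["cat", "dog"]
    Return ($n < 0 Or $n >= UBound($table)) ? "(null)" : $table[$n]
EndFunc

Local $string = "It's a &0; and a &1;"

; Assumption: double each single quote so it stays a literal quote inside the '...' pieces.
Local $safe = StringReplace($string, "'", "''")

; Turn every &N; into:  ' & _DemoDecode(N) & '
Local $expr = "'" & StringRegExpReplace($safe, "&(\d+);", "' & _DemoDecode(\1) & '") & "'"

ConsoleWrite($expr & @CRLF)          ; the expression text handed to Execute
ConsoleWrite(Execute($expr) & @CRLF) ; the decoded string
```

The first ConsoleWrite shows that the regex does not call anything by itself: it only rewrites the input into an AutoIt expression made of quoted pieces joined with &, and Execute does the calling.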

 

Edited by jchd

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


@jchd

What magic returns 0 for invalid values rather than exceeding the bounds of the static array?

Local $string = "Hello this is a &0; and this is a &1;  and a &2; and this"

This just returns a 0 where the &2; is, but calling decode(2) directly blows it up...


I just added some latex. But I suspect it isn't useful, as I believe the poster will use ChrW() as decode(), and if my guess is correct, the likelihood of an incorrectly encoded Unicode character in a presumably valid HTML stream is pretty low. In fact it all depends on the context: possibly malicious or not.

Edited by jchd


1 hour ago, oasis375 said:

clairvoyant

No crystal ball here, I'm just seasoned.

If you need to process HTML entities (e.g. &amp; for &), you can avoid maintaining a large list of entity names by using this:

OTOH if your input simply uses Unicode hex or decimal entities a regexp will be faster.
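
As a rough illustration of the regexp route for numeric entities, here is a sketch under stated assumptions rather than anything posted in the thread: the sample string and variable names are invented, the hex pattern and the quote-doubling step are additions, and ChrW() is assumed to be limited to code points up to 65535, so entities beyond that range are not handled.

```autoit
; Hypothetical sample input with one decimal and two hex entities.
Local $sIn = "Uncle Tom&#039;s Cabin &#x2660; &#X2665;"

; Assumption: protect literal single quotes before building the expression.
Local $sExpr = StringReplace($sIn, "'", "''")

; Hex first:    &#x2660; becomes  ' & ChrW(Dec('2660')) & '
$sExpr = StringRegExpReplace($sExpr, "(?i)&#x([0-9a-f]+);", "' & ChrW(Dec('\1')) & '")

; Then decimal: &#039; becomes  ' & ChrW('039') & '
$sExpr = StringRegExpReplace($sExpr, "&#(\d+);", "' & ChrW('\1') & '")

MsgBox(0, "Result", Execute("'" & $sExpr & "'"))
```

Both patterns only capture digits (or hex digits), so nothing from the input can end up outside the quoted pieces of the generated expression.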


@jchd

Yeah! you did it again.

Yes, my original problem was with HTML entities. I was using the UDF by Dhilip89 (autoitscript.com/forum/topic/51084-html-entity-udf), but I found its _ArraySearch method a little bit slow when processing thousands of HTML files.

mLipok's method is ingenious, but creating a new IE instance for every request makes the process slower. (I admit it's OK for just a few requests.)

Here is my benchmark. (Note the timings.)
 

;HTML Decode benchmark
#include <IE.au3>
#include <HTML_Decode.au3>

$string = "Uncle Tom&#039;s Cabin is a novel written by Harriet Beecher Stowe"
For $n = 1 To 50
    ;$decode = _HTMLDecode($string) ;Time: 6.557 <-weird, should be faster.
    ;$decode = Execute("'" & StringRegExpReplace($string, "&#(.*?);", "' & ChrW('\1') & '") & "'") ; Time: 0.922
    ;$decode = _HTML_DecodeEntities($string) ;Time: 19.64
Next

$string = "Uncle Tom&apos;s Cabin is a novel written by Harriet Beecher Stowe"
For $n = 1 To 50
    ;$decode = _HTMLDecode($string) ;Time: 0.9077
    ;$decode = _HTML_DecodeEntities($string) ;Time: 9.87
Next

;If $n>60, _HTML_DecodeEntities gives error: $oObject.document.Write

Func _HTML_DecodeEntities(ByRef $sHTML)
    $sHTML = StringReplace($sHTML, @LF, '<hr>') ; assumed: the original search string was rendered as a raw line break on the page
    Local $oIE = _IECreate("about:blank", 0, 0, 1, 0)
    _IEDocWriteHTML($oIE, $sHTML)
    Local $sResult = _IEBodyReadText($oIE)
    _IEQuit($oIE)
    Return $sResult
EndFunc   ;==>_HTML_DecodeEntities

 


Can you time this version (requires the beta for Map datatype)?

Local $map[]
    $map["quot"] = 34
    $map["amp"] = 38
    $map["apos"] = 39
    $map["lt"] = 60
    $map["gt"] = 62
    $map["nbsp"] = 160
    $map["iexcl"] = 161
    $map["cent"] = 162
    $map["pound"] = 163
    $map["curren"] = 164
    $map["yen"] = 165
    $map["brvbar"] = 166
    $map["sect"] = 167
    $map["uml"] = 168
    $map["copy"] = 169
    $map["ordf"] = 170
    $map["laquo"] = 171
    $map["not"] = 172
    $map["shy"] = 173
    $map["reg"] = 174
    $map["macr"] = 175
    $map["deg"] = 176
    $map["plusmn"] = 177
    $map["sup2"] = 178
    $map["sup3"] = 179
    $map["acute"] = 180
    $map["micro"] = 181
    $map["para"] = 182
    $map["middot"] = 183
    $map["cedil"] = 184
    $map["sup1"] = 185
    $map["ordm"] = 186
    $map["raquo"] = 187
    $map["frac14"] = 188
    $map["frac12"] = 189
    $map["frac34"] = 190
    $map["iquest"] = 191
    $map["Agrave"] = 192
    $map["Aacute"] = 193
    $map["Acirc"] = 194
    $map["Atilde"] = 195
    $map["Auml"] = 196
    $map["Aring"] = 197
    $map["AElig"] = 198
    $map["Ccedil"] = 199
    $map["Egrave"] = 200
    $map["Eacute"] = 201
    $map["Ecirc"] = 202
    $map["Euml"] = 203
    $map["Igrave"] = 204
    $map["Iacute"] = 205
    $map["Icirc"] = 206
    $map["Iuml"] = 207
    $map["ETH"] = 208
    $map["Ntilde"] = 209
    $map["Ograve"] = 210
    $map["Oacute"] = 211
    $map["Ocirc"] = 212
    $map["Otilde"] = 213
    $map["Ouml"] = 214
    $map["times"] = 215
    $map["Oslash"] = 216
    $map["Ugrave"] = 217
    $map["Uacute"] = 218
    $map["Ucirc"] = 219
    $map["Uuml"] = 220
    $map["Yacute"] = 221
    $map["THORN"] = 222
    $map["szlig"] = 223
    $map["agrave"] = 224
    $map["aacute"] = 225
    $map["acirc"] = 226
    $map["atilde"] = 227
    $map["auml"] = 228
    $map["aring"] = 229
    $map["aelig"] = 230
    $map["ccedil"] = 231
    $map["egrave"] = 232
    $map["eacute"] = 233
    $map["ecirc"] = 234
    $map["euml"] = 235
    $map["igrave"] = 236
    $map["iacute"] = 237
    $map["icirc"] = 238
    $map["iuml"] = 239
    $map["eth"] = 240
    $map["ntilde"] = 241
    $map["ograve"] = 242
    $map["oacute"] = 243
    $map["ocirc"] = 244
    $map["otilde"] = 245
    $map["ouml"] = 246
    $map["divide"] = 247
    $map["oslash"] = 248
    $map["ugrave"] = 249
    $map["uacute"] = 250
    $map["ucirc"] = 251
    $map["uuml"] = 252
    $map["yacute"] = 253
    $map["thorn"] = 254
    $map["yuml"] = 255
    $map["OElig"] = 338
    $map["oelig"] = 339
    $map["Scaron"] = 352
    $map["scaron"] = 353
    $map["Yuml"] = 376
    $map["fnof"] = 402
    $map["circ"] = 710
    $map["tilde"] = 732
    $map["Alpha"] = 913
    $map["Beta"] = 914
    $map["Gamma"] = 915
    $map["Delta"] = 916
    $map["Epsilon"] = 917
    $map["Zeta"] = 918
    $map["Eta"] = 919
    $map["Theta"] = 920
    $map["Iota"] = 921
    $map["Kappa"] = 922
    $map["Lambda"] = 923
    $map["Mu"] = 924
    $map["Nu"] = 925
    $map["Xi"] = 926
    $map["Omicron"] = 927
    $map["Pi"] = 928
    $map["Rho"] = 929
    $map["Sigma"] = 931
    $map["Tau"] = 932
    $map["Upsilon"] = 933
    $map["Phi"] = 934
    $map["Chi"] = 935
    $map["Psi"] = 936
    $map["Omega"] = 937
    $map["alpha"] = 945
    $map["beta"] = 946
    $map["gamma"] = 947
    $map["delta"] = 948
    $map["epsilon"] = 949
    $map["zeta"] = 950
    $map["eta"] = 951
    $map["theta"] = 952
    $map["iota"] = 953
    $map["kappa"] = 954
    $map["lambda"] = 955
    $map["mu"] = 956
    $map["nu"] = 957
    $map["xi"] = 958
    $map["omicron"] = 959
    $map["pi"] = 960
    $map["rho"] = 961
    $map["sigmaf"] = 962
    $map["sigma"] = 963
    $map["tau"] = 964
    $map["upsilon"] = 965
    $map["phi"] = 966
    $map["chi"] = 967
    $map["psi"] = 968
    $map["omega"] = 969
    $map["thetasym"] = 977
    $map["upsih"] = 978
    $map["piv"] = 982
    $map["ensp"] = 8194
    $map["emsp"] = 8195
    $map["thinsp"] = 8201
    $map["zwnj"] = 8204
    $map["zwj"] = 8205
    $map["lrm"] = 8206
    $map["rlm"] = 8207
    $map["ndash"] = 8211
    $map["mdash"] = 8212
    $map["lsquo"] = 8216
    $map["rsquo"] = 8217
    $map["sbquo"] = 8218
    $map["ldquo"] = 8220
    $map["rdquo"] = 8221
    $map["bdquo"] = 8222
    $map["dagger"] = 8224
    $map["Dagger"] = 8225
    $map["bull"] = 8226
    $map["hellip"] = 8230
    $map["permil"] = 8240
    $map["prime"] = 8242
    $map["Prime"] = 8243
    $map["lsaquo"] = 8249
    $map["rsaquo"] = 8250
    $map["oline"] = 8254
    $map["frasl"] = 8260
    $map["euro"] = 8364
    $map["image"] = 8465
    $map["weierp"] = 8472
    $map["real"] = 8476
    $map["trade"] = 8482
    $map["alefsym"] = 8501
    $map["larr"] = 8592
    $map["uarr"] = 8593
    $map["rarr"] = 8594
    $map["darr"] = 8595
    $map["harr"] = 8596
    $map["crarr"] = 8629
    $map["lArr"] = 8656
    $map["uArr"] = 8657
    $map["rArr"] = 8658
    $map["dArr"] = 8659
    $map["hArr"] = 8660
    $map["forall"] = 8704
    $map["part"] = 8706
    $map["exist"] = 8707
    $map["empty"] = 8709
    $map["nabla"] = 8711
    $map["isin"] = 8712
    $map["notin"] = 8713
    $map["ni"] = 8715
    $map["prod"] = 8719
    $map["sum"] = 8721
    $map["minus"] = 8722
    $map["lowast"] = 8727
    $map["radic"] = 8730
    $map["prop"] = 8733
    $map["infin"] = 8734
    $map["ang"] = 8736
    $map["and"] = 8743
    $map["or"] = 8744
    $map["cap"] = 8745
    $map["cup"] = 8746
    $map["int"] = 8747
    $map["there4"] = 8756
    $map["sim"] = 8764
    $map["cong"] = 8773
    $map["asymp"] = 8776
    $map["ne"] = 8800
    $map["equiv"] = 8801
    $map["le"] = 8804
    $map["ge"] = 8805
    $map["sub"] = 8834
    $map["sup"] = 8835
    $map["nsub"] = 8836
    $map["sube"] = 8838
    $map["supe"] = 8839
    $map["oplus"] = 8853
    $map["otimes"] = 8855
    $map["perp"] = 8869
    $map["sdot"] = 8901
    $map["lceil"] = 8968
    $map["rceil"] = 8969
    $map["lfloor"] = 8970
    $map["rfloor"] = 8971
    $map["lang"] = 9001
    $map["rang"] = 9002
    $map["loz"] = 9674
    $map["spades"] = 9824
    $map["clubs"] = 9827
    $map["hearts"] = 9829
    $map["diams"] = 9830

Local $string = "Hello this is a &spades; and this is a &equiv; or a &gamma; and a &Atilde; or &iacute;"
Local $decoded = Execute("'" & StringRegExpReplace($string, "&(\w+);", "' & ChrW($map['\1']) & '") & "'")

MsgBox(0, "Result", $decoded)
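
One thing the one-liner above does not cover is an entity name that is missing from the map. A possible guard, sketched here with hypothetical names ($g_mEntities, _EntityOrSelf) and only two sample entries, is to route the lookup through a small function that leaves unknown entities untouched. It assumes a build with the Map datatype and MapExists(), like the version above.

```autoit
; Hypothetical two-entry map; the full list from the post above would go here.
Global $g_mEntities[]
$g_mEntities["amp"] = 38
$g_mEntities["spades"] = 9824

; Return the decoded character, or the entity text itself when the name is unknown.
Func _EntityOrSelf($sName)
    If MapExists($g_mEntities, $sName) Then Return ChrW($g_mEntities[$sName])
    Return "&" & $sName & ";"
EndFunc

Local $sIn = "A &spades; and an &unknown; entity"
Local $sOut = Execute("'" & StringRegExpReplace($sIn, "&(\w+);", "' & _EntityOrSelf('\1') & '") & "'")

MsgBox(0, "Result", $sOut)
```

Because \w+ matches only word characters, the captured name cannot contain a quote, so passing it as '\1' inside the generated expression is safe. Any single quote elsewhere in the input would still need doubling first.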

 


What about this? (Works without the beta.)

$sd = ObjCreate("Scripting.Dictionary")
    $sd.add("quot", 34)
    $sd.add("amp", 38)
    $sd.add("apos", 39)
    $sd.add("lt", 60)
    $sd.add("gt", 62)
    $sd.add("nbsp", 160)
    $sd.add("iexcl", 161)
    $sd.add("cent", 162)
    $sd.add("pound", 163)
    $sd.add("curren", 164)
    $sd.add("yen", 165)
    $sd.add("brvbar", 166)
    $sd.add("sect", 167)
    $sd.add("uml", 168)
    $sd.add("copy", 169)
    $sd.add("ordf", 170)
    $sd.add("laquo", 171)
    $sd.add("not", 172)
    $sd.add("shy", 173)
    $sd.add("reg", 174)
    $sd.add("macr", 175)
    $sd.add("deg", 176)
    $sd.add("plusmn", 177)
    $sd.add("sup2", 178)
    $sd.add("sup3", 179)
    $sd.add("acute", 180)
    $sd.add("micro", 181)
    $sd.add("para", 182)
    $sd.add("middot", 183)
    $sd.add("cedil", 184)
    $sd.add("sup1", 185)
    $sd.add("ordm", 186)
    $sd.add("raquo", 187)
    $sd.add("frac14", 188)
    $sd.add("frac12", 189)
    $sd.add("frac34", 190)
    $sd.add("iquest", 191)
    $sd.add("Agrave", 192)
    $sd.add("Aacute", 193)
    $sd.add("Acirc", 194)
    $sd.add("Atilde", 195)
    $sd.add("Auml", 196)
    $sd.add("Aring", 197)
    $sd.add("AElig", 198)
    $sd.add("Ccedil", 199)
    $sd.add("Egrave", 200)
    $sd.add("Eacute", 201)
    $sd.add("Ecirc", 202)
    $sd.add("Euml", 203)
    $sd.add("Igrave", 204)
    $sd.add("Iacute", 205)
    $sd.add("Icirc", 206)
    $sd.add("Iuml", 207)
    $sd.add("ETH", 208)
    $sd.add("Ntilde", 209)
    $sd.add("Ograve", 210)
    $sd.add("Oacute", 211)
    $sd.add("Ocirc", 212)
    $sd.add("Otilde", 213)
    $sd.add("Ouml", 214)
    $sd.add("times", 215)
    $sd.add("Oslash", 216)
    $sd.add("Ugrave", 217)
    $sd.add("Uacute", 218)
    $sd.add("Ucirc", 219)
    $sd.add("Uuml", 220)
    $sd.add("Yacute", 221)
    $sd.add("THORN", 222)
    $sd.add("szlig", 223)
    $sd.add("agrave", 224)
    $sd.add("aacute", 225)
    $sd.add("acirc", 226)
    $sd.add("atilde", 227)
    $sd.add("auml", 228)
    $sd.add("aring", 229)
    $sd.add("aelig", 230)
    $sd.add("ccedil", 231)
    $sd.add("egrave", 232)
    $sd.add("eacute", 233)
    $sd.add("ecirc", 234)
    $sd.add("euml", 235)
    $sd.add("igrave", 236)
    $sd.add("iacute", 237)
    $sd.add("icirc", 238)
    $sd.add("iuml", 239)
    $sd.add("eth", 240)
    $sd.add("ntilde", 241)
    $sd.add("ograve", 242)
    $sd.add("oacute", 243)
    $sd.add("ocirc", 244)
    $sd.add("otilde", 245)
    $sd.add("ouml", 246)
    $sd.add("divide", 247)
    $sd.add("oslash", 248)
    $sd.add("ugrave", 249)
    $sd.add("uacute", 250)
    $sd.add("ucirc", 251)
    $sd.add("uuml", 252)
    $sd.add("yacute", 253)
    $sd.add("thorn", 254)
    $sd.add("yuml", 255)
    $sd.add("OElig", 338)
    $sd.add("oelig", 339)
    $sd.add("Scaron", 352)
    $sd.add("scaron", 353)
    $sd.add("Yuml", 376)
    $sd.add("fnof", 402)
    $sd.add("circ", 710)
    $sd.add("tilde", 732)
    $sd.add("Alpha", 913)
    $sd.add("Beta", 914)
    $sd.add("Gamma", 915)
    $sd.add("Delta", 916)
    $sd.add("Epsilon", 917)
    $sd.add("Zeta", 918)
    $sd.add("Eta", 919)
    $sd.add("Theta", 920)
    $sd.add("Iota", 921)
    $sd.add("Kappa", 922)
    $sd.add("Lambda", 923)
    $sd.add("Mu", 924)
    $sd.add("Nu", 925)
    $sd.add("Xi", 926)
    $sd.add("Omicron", 927)
    $sd.add("Pi", 928)
    $sd.add("Rho", 929)
    $sd.add("Sigma", 931)
    $sd.add("Tau", 932)
    $sd.add("Upsilon", 933)
    $sd.add("Phi", 934)
    $sd.add("Chi", 935)
    $sd.add("Psi", 936)
    $sd.add("Omega", 937)
    $sd.add("alpha", 945)
    $sd.add("beta", 946)
    $sd.add("gamma", 947)
    $sd.add("delta", 948)
    $sd.add("epsilon", 949)
    $sd.add("zeta", 950)
    $sd.add("eta", 951)
    $sd.add("theta", 952)
    $sd.add("iota", 953)
    $sd.add("kappa", 954)
    $sd.add("lambda", 955)
    $sd.add("mu", 956)
    $sd.add("nu", 957)
    $sd.add("xi", 958)
    $sd.add("omicron", 959)
    $sd.add("pi", 960)
    $sd.add("rho", 961)
    $sd.add("sigmaf", 962)
    $sd.add("sigma", 963)
    $sd.add("tau", 964)
    $sd.add("upsilon", 965)
    $sd.add("phi", 966)
    $sd.add("chi", 967)
    $sd.add("psi", 968)
    $sd.add("omega", 969)
    $sd.add("thetasym", 977)
    $sd.add("upsih", 978)
    $sd.add("piv", 982)
    $sd.add("ensp", 8194)
    $sd.add("emsp", 8195)
    $sd.add("thinsp", 8201)
    $sd.add("zwnj", 8204)
    $sd.add("zwj", 8205)
    $sd.add("lrm", 8206)
    $sd.add("rlm", 8207)
    $sd.add("ndash", 8211)
    $sd.add("mdash", 8212)
    $sd.add("lsquo", 8216)
    $sd.add("rsquo", 8217)
    $sd.add("sbquo", 8218)
    $sd.add("ldquo", 8220)
    $sd.add("rdquo", 8221)
    $sd.add("bdquo", 8222)
    $sd.add("dagger", 8224)
    $sd.add("Dagger", 8225)
    $sd.add("bull", 8226)
    $sd.add("hellip", 8230)
    $sd.add("permil", 8240)
    $sd.add("prime", 8242)
    $sd.add("Prime", 8243)
    $sd.add("lsaquo", 8249)
    $sd.add("rsaquo", 8250)
    $sd.add("oline", 8254)
    $sd.add("frasl", 8260)
    $sd.add("euro", 8364)
    $sd.add("image", 8465)
    $sd.add("weierp", 8472)
    $sd.add("real", 8476)
    $sd.add("trade", 8482)
    $sd.add("alefsym", 8501)
    $sd.add("larr", 8592)
    $sd.add("uarr", 8593)
    $sd.add("rarr", 8594)
    $sd.add("darr", 8595)
    $sd.add("harr", 8596)
    $sd.add("crarr", 8629)
    $sd.add("lArr", 8656)
    $sd.add("uArr", 8657)
    $sd.add("rArr", 8658)
    $sd.add("dArr", 8659)
    $sd.add("hArr", 8660)
    $sd.add("forall", 8704)
    $sd.add("part", 8706)
    $sd.add("exist", 8707)
    $sd.add("empty", 8709)
    $sd.add("nabla", 8711)
    $sd.add("isin", 8712)
    $sd.add("notin", 8713)
    $sd.add("ni", 8715)
    $sd.add("prod", 8719)
    $sd.add("sum", 8721)
    $sd.add("minus", 8722)
    $sd.add("lowast", 8727)
    $sd.add("radic", 8730)
    $sd.add("prop", 8733)
    $sd.add("infin", 8734)
    $sd.add("ang", 8736)
    $sd.add("and", 8743)
    $sd.add("or", 8744)
    $sd.add("cap", 8745)
    $sd.add("cup", 8746)
    $sd.add("int", 8747)
    $sd.add("there4", 8756)
    $sd.add("sim", 8764)
    $sd.add("cong", 8773)
    $sd.add("asymp", 8776)
    $sd.add("ne", 8800)
    $sd.add("equiv", 8801)
    $sd.add("le", 8804)
    $sd.add("ge", 8805)
    $sd.add("sub", 8834)
    $sd.add("sup", 8835)
    $sd.add("nsub", 8836)
    $sd.add("sube", 8838)
    $sd.add("supe", 8839)
    $sd.add("oplus", 8853)
    $sd.add("otimes", 8855)
    $sd.add("perp", 8869)
    $sd.add("sdot", 8901)
    $sd.add("lceil", 8968)
    $sd.add("rceil", 8969)
    $sd.add("lfloor", 8970)
    $sd.add("rfloor", 8971)
    $sd.add("lang", 9001)
    $sd.add("rang", 9002)
    $sd.add("loz", 9674)
    $sd.add("spades", 9824)
    $sd.add("clubs", 9827)
    $sd.add("hearts", 9829)
    $sd.add("diams", 9830)

Local $string = "Hello this is a &spades; and this is a &equiv; or a &gamma; and a &Atilde; or &iacute;"
Local $decoded = Execute("'" & StringRegExpReplace($string, "&(\w+);", "' & ChrW($sd.item('\1')) & '") & "'")

MsgBox(0, "Result", $decoded)

 


@jchd YES! That was what I was looking for. This thread has been very instructive.

I have tested it, and it's unbelievable: while the other methods take longer as the repetitions grow (time proportional to the loop count), the Map method always takes the same time, 0.8 seconds, whether with 10, 100 or 1000 loops.


Pretty similar. But since the beta is very stable, why not remain within the AutoIt realm?


...another one (just for fun).
It's not very much within the AutoIt realm, since it relies on Javascript.
(Perhaps it can be optimized using the ScriptControl and Javascript, but I'm not skilled with it.)

Global $oIE, $ohJS
_JS_Environment() ; setup the Javascript environment to be used within AutoIt

Global $oEntity = $ohJS.he ; create a reference to the Javascript Entity encode/decode engine.
;                            See here: https://github.com/mathiasbynens/he/blob/master/README.md

; Example of use
Local $string = "Uncle Tom&#039;s Cabin is a novel written by Harriet Beecher Stowe" & @CRLF & _
        "Uncle Tom&apos;s Cabin is a novel written by Harriet Beecher Stowe" & @CRLF & _
        "Hello this is a &spades; and this is a &equiv; or a &gamma; and a &Atilde; or &iacute;"
MsgBox(0, "Debug he.js", $oEntity.decode($string))


; https://dev.w3.org/html5/html-author/charref
$string = "&#x1D538;&#x1D566;&#x1D565;&#x1D560;&#x1D540;&#x1D565; &#x0260E; &spades; &clubs; &hearts; &diams; &female; &male;" & @CRLF & _
        "&#x02554;&boxH;&boxH;&boxH;&boxH;&#x02557;" & @CRLF & _
        "&boxVR;&boxH;&boxH;&boxH;&boxH;&boxVL;" & @CRLF & _
        "&#x0255A;&boxH;&boxH;&boxH;&boxH;&boxUL;"
MsgBox(0, "Debug he.js", $oEntity.decode($string))


Func _JS_Environment() ; set up the Javascript engine with the "entity" library embedded

    ; This is a robust HTML entity encoder/decoder written in JavaScript. (see here: https://github.com/mathiasbynens/he)
    Local $sJScript = BinaryToString(InetRead("https://raw.githubusercontent.com/mathiasbynens/he/master/he.js"))

    ; *** create a minimal 'html' page listing for the browser
    Local $sHTML = "<HTML><HEAD>" & @CRLF
    $sHTML &= "<script>" & @CRLF ; Javascripts goes here
    ; $sHTML &= '"use strict";' & @CRLF ;
    $sHTML &= 'var JSglobal = (1,eval)("this");' & @CRLF ; the 'global' variable get a handle to the javascript global object
    $sHTML &= $sJScript & @CRLF ; #include <he.js> ; include the entity library
    $sHTML &= "</script>" & @CRLF
    $sHTML &= "</HEAD></HTML>" & @CRLF ; html closing tags
    ; *** end of html page listing

    $oIE = ObjCreate("Shell.Explorer.2") ; a BrowserControl engine
    GUICreate("", 10, 10, @DesktopWidth + 10, @DesktopHeight + 10) ; place the gui out of screen
    GUICtrlCreateObj($oIE, 0, 0, 10, 10) ; this render $oIE usable
    GUISetState(@SW_HIDE) ; hide GUI

    $oIE.navigate('about:blank')
    While Not (String($oIE.readyState) = "complete" Or $oIE.readyState = 4) ; wait for about:blank (parentheses needed: Not binds tighter than =)
        Sleep(100)
    WEnd

    $oIE.document.Write($sHTML) ; inject the listing directly into the HTML document
    $oIE.document.close() ; close the write stream
    $oIE.document.execCommand("Refresh")

    ; this waits till the document is ready to be used (portion of code from IE.au3)
    While Not (String($oIE.readyState) = "complete" Or $oIE.readyState = 4)
        Sleep(100)
    WEnd
    While Not (String($oIE.document.readyState) = "complete" Or $oIE.document.readyState = 4)
        Sleep(100)
    WEnd

    ; https://msdn.microsoft.com/en-us/library/52f50e9t(v=vs.94).aspx
    $ohJS = $oIE.document.parentwindow.JSglobal ; $ohJS is a reference to the javascript Global Obj
    ; ---- now the javascript engine can be used in our AutoIt script using the $ohJS reference ----
EndFunc   ;==>_JS_Environment

 

 

Chimp

small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....


An alternative that performs well on big strings.

#include <String.au3>
Global $__oObjectHTMLFile = Null
Global $__oErrorHandler = ObjEvent("AutoIt.Error", "_ErrFunc")

_Example()

Func _Example()
    Local $sHTML = _StringRepeat("Hello this is a &spades; and this is a &equiv; or a &gamma; and a &Atilde; or &iacute;", 1000)

    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        $OutString = _HTML_DecodeEntities($sHTML)
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff: " & $fDiff & @CRLF)
    ConsoleWrite($OutString & @CRLF)

EndFunc   ;==>_Example



Func _HTML_DecodeEntities($sString)
    Local $sReturn = ""
    If Not IsObj($__oObjectHTMLFile) Then $__oObjectHTMLFile = ObjCreate("htmlfile")
    $__oObjectHTMLFile.Close()
    $__oObjectHTMLFile.Write($sString)
    $sReturn = $__oObjectHTMLFile.body.innerText
    Return $sReturn
EndFunc   ;==>_HTML_DecodeEntities

; User's COM error function. Will be called if COM error occurs
Func _ErrFunc($oError)
    ; Do anything here.
    ConsoleWrite(@ScriptName & " (" & $oError.scriptline & ") : ==> COM Error intercepted !" & @CRLF & _
            @TAB & "err.number is: " & @TAB & @TAB & "0x" & Hex($oError.number) & @CRLF & _
            @TAB & "err.windescription:" & @TAB & $oError.windescription & @CRLF & _
            @TAB & "err.description is: " & @TAB & $oError.description & @CRLF & _
            @TAB & "err.source is: " & @TAB & @TAB & $oError.source & @CRLF & _
            @TAB & "err.helpfile is: " & @TAB & $oError.helpfile & @CRLF & _
            @TAB & "err.helpcontext is: " & @TAB & $oError.helpcontext & @CRLF & _
            @TAB & "err.lastdllerror is: " & @TAB & $oError.lastdllerror & @CRLF & _
            @TAB & "err.scriptline is: " & @TAB & $oError.scriptline & @CRLF & _
            @TAB & "err.retcode is: " & @TAB & "0x" & Hex($oError.retcode) & @CRLF & @CRLF)
EndFunc   ;==>_ErrFunc

Regards

 


@Danyfirex

Excellent for very large input!

Indicative comparative timings for a single input built with a given number of _StringRepeat repetitions (string content as in previous posts):

10 stringrepeat
_HTMLFILE took 34.7350182982226 ms
_MAP took 0.403514160750338 ms
_DICTIONARY took 0.56785820081885 ms

1700 stringrepeat
_HTMLFILE took 54.9865774311557 ms
_MAP took 55.4249420952054 ms
_DICTIONARY took 93.3170059864281 ms

100,000 stringrepeat
_HTMLFILE took 1706.24579103297 ms
_MAP took 3446.63038342045 ms
_DICTIONARY took 5622.36746302345 ms


... a little race :)

#include <String.au3>
#include <ie.au3>
;
Global $oIE = _IECreate("about:blank", 0, 0, 1, 0)
;
Global $__oObjectHTMLFile = ObjCreate("htmlfile")
; Global $__oErrorHandler = ObjEvent("AutoIt.Error", "_ErrFunc")
;
Global $oIE2, $ohJS
_JS_Environment() ; setup the Javascript environment to be used within AutoIt
Global $oEntity = $ohJS.he ; create a reference to the Javascript Entity encode/decode engine.
;                            See here: https://github.com/mathiasbynens/he/blob/master/README.md
;
Global $sd
_Dictionary() ; create the dictionary
;

Global $sInput = _StringRepeat("Hello this is a &spades; and this is a &equiv; or a &gamma; and a &Atilde; or &iacute;", 1000)
;

; ----- Race -----
ConsoleWrite("running..." & @CRLF)
mikell()
Chimp()
Danyfirex()
mLipok()
ConsoleWrite("----------" & @CRLF)

Func mikell()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        $OutString = Execute("'" & StringRegExpReplace($sInput, "&(\w+);", "' & ChrW($sd.item('\1')) & '") & "'")
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff mikell: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>mikell

Func Chimp()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        $OutString = $oEntity.decode($sInput)
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff Chimp: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>Chimp

Func Danyfirex()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        $__oObjectHTMLFile.Close()
        $__oObjectHTMLFile.Write($sInput)
        $OutString = $__oObjectHTMLFile.body.innerText
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff Danyfirex: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>Danyfirex

Func mLipok()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        _IEDocWriteHTML($oIE, $sInput)
        $OutString = _IEBodyReadText($oIE)
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff mLipok: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>mLipok

Func _JS_Environment() ; set up the JavaScript engine with the "he" entity library embedded

    ; This is a robust HTML entity encoder/decoder written in JavaScript. (see here: https://github.com/mathiasbynens/he)
    Local $sJScript = BinaryToString(InetRead("https://raw.githubusercontent.com/mathiasbynens/he/master/he.js"))

    ; *** create a minimal 'html' page listing for the browser
    Local $sHTML = "<HTML><HEAD>" & @CRLF
    $sHTML &= "<script>" & @CRLF ; JavaScript goes here
    ; $sHTML &= '"use strict";' & @CRLF ;
    $sHTML &= 'var JSglobal = (1,eval)("this");' & @CRLF ; the 'JSglobal' variable gets a handle to the JavaScript global object
    $sHTML &= $sJScript & @CRLF ; #include <he.js> ; include the entity library
    $sHTML &= "</script>" & @CRLF
    $sHTML &= "</HEAD></HTML>" & @CRLF ; html closing tags
    ; *** end of html page listing

    $oIE2 = ObjCreate("Shell.Explorer.2") ; a BrowserControl engine
    GUICreate("", 10, 10, @DesktopWidth + 10, @DesktopHeight + 10) ; place the GUI off screen
    GUICtrlCreateObj($oIE2, 0, 0, 10, 10) ; this renders $oIE2 usable
    GUISetState(@SW_HIDE) ; hide GUI

    $oIE2.navigate('about:blank')
    ; wait for about:blank; the parentheses are needed because Not binds tighter than =
    While Not (String($oIE2.readyState) = "complete" Or $oIE2.readyState = 4)
        Sleep(100)
    WEnd

    $oIE2.document.Write($sHTML) ; inject the listing directly into the HTML document
    $oIE2.document.close() ; close the write stream
    $oIE2.document.execCommand("Refresh")

    ; this waits till the document is ready to be used (portion of code from IE.au3)
    While Not (String($oIE2.readyState) = "complete" Or $oIE2.readyState = 4)
        Sleep(100)
    WEnd
    While Not (String($oIE2.document.readyState) = "complete" Or $oIE2.document.readyState = 4)
        Sleep(100)
    WEnd

    ; https://msdn.microsoft.com/en-us/library/52f50e9t(v=vs.94).aspx
    $ohJS = $oIE2.document.parentwindow.JSglobal ; $ohJS is a reference to the javascript Global Obj
    ; ---- now the javascript engine can be used in our AutoIt script using the $ohJS reference ----
EndFunc   ;==>_JS_Environment

Func _Dictionary()
    $sd = ObjCreate("Scripting.Dictionary")
    $sd.add("quot", 34)
    $sd.add("amp", 38)
    $sd.add("apos", 39)
    $sd.add("lt", 60)
    $sd.add("gt", 62)
    $sd.add("nbsp", 160)
    $sd.add("iexcl", 161)
    $sd.add("cent", 162)
    $sd.add("pound", 163)
    $sd.add("curren", 164)
    $sd.add("yen", 165)
    $sd.add("brvbar", 166)
    $sd.add("sect", 167)
    $sd.add("uml", 168)
    $sd.add("copy", 169)
    $sd.add("ordf", 170)
    $sd.add("laquo", 171)
    $sd.add("not", 172)
    $sd.add("shy", 173)
    $sd.add("reg", 174)
    $sd.add("macr", 175)
    $sd.add("deg", 176)
    $sd.add("plusmn", 177)
    $sd.add("sup2", 178)
    $sd.add("sup3", 179)
    $sd.add("acute", 180)
    $sd.add("micro", 181)
    $sd.add("para", 182)
    $sd.add("middot", 183)
    $sd.add("cedil", 184)
    $sd.add("sup1", 185)
    $sd.add("ordm", 186)
    $sd.add("raquo", 187)
    $sd.add("frac14", 188)
    $sd.add("frac12", 189)
    $sd.add("frac34", 190)
    $sd.add("iquest", 191)
    $sd.add("Agrave", 192)
    $sd.add("Aacute", 193)
    $sd.add("Acirc", 194)
    $sd.add("Atilde", 195)
    $sd.add("Auml", 196)
    $sd.add("Aring", 197)
    $sd.add("AElig", 198)
    $sd.add("Ccedil", 199)
    $sd.add("Egrave", 200)
    $sd.add("Eacute", 201)
    $sd.add("Ecirc", 202)
    $sd.add("Euml", 203)
    $sd.add("Igrave", 204)
    $sd.add("Iacute", 205)
    $sd.add("Icirc", 206)
    $sd.add("Iuml", 207)
    $sd.add("ETH", 208)
    $sd.add("Ntilde", 209)
    $sd.add("Ograve", 210)
    $sd.add("Oacute", 211)
    $sd.add("Ocirc", 212)
    $sd.add("Otilde", 213)
    $sd.add("Ouml", 214)
    $sd.add("times", 215)
    $sd.add("Oslash", 216)
    $sd.add("Ugrave", 217)
    $sd.add("Uacute", 218)
    $sd.add("Ucirc", 219)
    $sd.add("Uuml", 220)
    $sd.add("Yacute", 221)
    $sd.add("THORN", 222)
    $sd.add("szlig", 223)
    $sd.add("agrave", 224)
    $sd.add("aacute", 225)
    $sd.add("acirc", 226)
    $sd.add("atilde", 227)
    $sd.add("auml", 228)
    $sd.add("aring", 229)
    $sd.add("aelig", 230)
    $sd.add("ccedil", 231)
    $sd.add("egrave", 232)
    $sd.add("eacute", 233)
    $sd.add("ecirc", 234)
    $sd.add("euml", 235)
    $sd.add("igrave", 236)
    $sd.add("iacute", 237)
    $sd.add("icirc", 238)
    $sd.add("iuml", 239)
    $sd.add("eth", 240)
    $sd.add("ntilde", 241)
    $sd.add("ograve", 242)
    $sd.add("oacute", 243)
    $sd.add("ocirc", 244)
    $sd.add("otilde", 245)
    $sd.add("ouml", 246)
    $sd.add("divide", 247)
    $sd.add("oslash", 248)
    $sd.add("ugrave", 249)
    $sd.add("uacute", 250)
    $sd.add("ucirc", 251)
    $sd.add("uuml", 252)
    $sd.add("yacute", 253)
    $sd.add("thorn", 254)
    $sd.add("yuml", 255)
    $sd.add("OElig", 338)
    $sd.add("oelig", 339)
    $sd.add("Scaron", 352)
    $sd.add("scaron", 353)
    $sd.add("Yuml", 376)
    $sd.add("fnof", 402)
    $sd.add("circ", 710)
    $sd.add("tilde", 732)
    $sd.add("Alpha", 913)
    $sd.add("Beta", 914)
    $sd.add("Gamma", 915)
    $sd.add("Delta", 916)
    $sd.add("Epsilon", 917)
    $sd.add("Zeta", 918)
    $sd.add("Eta", 919)
    $sd.add("Theta", 920)
    $sd.add("Iota", 921)
    $sd.add("Kappa", 922)
    $sd.add("Lambda", 923)
    $sd.add("Mu", 924)
    $sd.add("Nu", 925)
    $sd.add("Xi", 926)
    $sd.add("Omicron", 927)
    $sd.add("Pi", 928)
    $sd.add("Rho", 929)
    $sd.add("Sigma", 931)
    $sd.add("Tau", 932)
    $sd.add("Upsilon", 933)
    $sd.add("Phi", 934)
    $sd.add("Chi", 935)
    $sd.add("Psi", 936)
    $sd.add("Omega", 937)
    $sd.add("alpha", 945)
    $sd.add("beta", 946)
    $sd.add("gamma", 947)
    $sd.add("delta", 948)
    $sd.add("epsilon", 949)
    $sd.add("zeta", 950)
    $sd.add("eta", 951)
    $sd.add("theta", 952)
    $sd.add("iota", 953)
    $sd.add("kappa", 954)
    $sd.add("lambda", 955)
    $sd.add("mu", 956)
    $sd.add("nu", 957)
    $sd.add("xi", 958)
    $sd.add("omicron", 959)
    $sd.add("pi", 960)
    $sd.add("rho", 961)
    $sd.add("sigmaf", 962)
    $sd.add("sigma", 963)
    $sd.add("tau", 964)
    $sd.add("upsilon", 965)
    $sd.add("phi", 966)
    $sd.add("chi", 967)
    $sd.add("psi", 968)
    $sd.add("omega", 969)
    $sd.add("thetasym", 977)
    $sd.add("upsih", 978)
    $sd.add("piv", 982)
    $sd.add("ensp", 8194)
    $sd.add("emsp", 8195)
    $sd.add("thinsp", 8201)
    $sd.add("zwnj", 8204)
    $sd.add("zwj", 8205)
    $sd.add("lrm", 8206)
    $sd.add("rlm", 8207)
    $sd.add("ndash", 8211)
    $sd.add("mdash", 8212)
    $sd.add("lsquo", 8216)
    $sd.add("rsquo", 8217)
    $sd.add("sbquo", 8218)
    $sd.add("ldquo", 8220)
    $sd.add("rdquo", 8221)
    $sd.add("bdquo", 8222)
    $sd.add("dagger", 8224)
    $sd.add("Dagger", 8225)
    $sd.add("bull", 8226)
    $sd.add("hellip", 8230)
    $sd.add("permil", 8240)
    $sd.add("prime", 8242)
    $sd.add("Prime", 8243)
    $sd.add("lsaquo", 8249)
    $sd.add("rsaquo", 8250)
    $sd.add("oline", 8254)
    $sd.add("frasl", 8260)
    $sd.add("euro", 8364)
    $sd.add("image", 8465)
    $sd.add("weierp", 8472)
    $sd.add("real", 8476)
    $sd.add("trade", 8482)
    $sd.add("alefsym", 8501)
    $sd.add("larr", 8592)
    $sd.add("uarr", 8593)
    $sd.add("rarr", 8594)
    $sd.add("darr", 8595)
    $sd.add("harr", 8596)
    $sd.add("crarr", 8629)
    $sd.add("lArr", 8656)
    $sd.add("uArr", 8657)
    $sd.add("rArr", 8658)
    $sd.add("dArr", 8659)
    $sd.add("hArr", 8660)
    $sd.add("forall", 8704)
    $sd.add("part", 8706)
    $sd.add("exist", 8707)
    $sd.add("empty", 8709)
    $sd.add("nabla", 8711)
    $sd.add("isin", 8712)
    $sd.add("notin", 8713)
    $sd.add("ni", 8715)
    $sd.add("prod", 8719)
    $sd.add("sum", 8721)
    $sd.add("minus", 8722)
    $sd.add("lowast", 8727)
    $sd.add("radic", 8730)
    $sd.add("prop", 8733)
    $sd.add("infin", 8734)
    $sd.add("ang", 8736)
    $sd.add("and", 8743)
    $sd.add("or", 8744)
    $sd.add("cap", 8745)
    $sd.add("cup", 8746)
    $sd.add("int", 8747)
    $sd.add("there4", 8756)
    $sd.add("sim", 8764)
    $sd.add("cong", 8773)
    $sd.add("asymp", 8776)
    $sd.add("ne", 8800)
    $sd.add("equiv", 8801)
    $sd.add("le", 8804)
    $sd.add("ge", 8805)
    $sd.add("sub", 8834)
    $sd.add("sup", 8835)
    $sd.add("nsub", 8836)
    $sd.add("sube", 8838)
    $sd.add("supe", 8839)
    $sd.add("oplus", 8853)
    $sd.add("otimes", 8855)
    $sd.add("perp", 8869)
    $sd.add("sdot", 8901)
    $sd.add("lceil", 8968)
    $sd.add("rceil", 8969)
    $sd.add("lfloor", 8970)
    $sd.add("rfloor", 8971)
    $sd.add("lang", 9001)
    $sd.add("rang", 9002)
    $sd.add("loz", 9674)
    $sd.add("spades", 9824)
    $sd.add("clubs", 9827)
    $sd.add("hearts", 9829)
    $sd.add("diams", 9830)
EndFunc
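
A side note on the Execute-based variants in this race: they build an AutoIt expression out of the input text, so a literal single quote anywhere in the input should make the generated expression invalid, in which case Execute returns an empty string and sets @error. Below is a minimal sketch of a decoder that avoids Execute altogether. The function name _DecodeEntitiesNoExecute and the $oSeen helper are hypothetical (they are not part of any posted solution), and $oDict is assumed to be a Scripting.Dictionary of entity names to code points like the $sd object built by _Dictionary(). It has not been timed against the other entries.

```autoit
; Hypothetical helper (not from the posts above): decode named entities without Execute().
; $oDict is assumed to be a Scripting.Dictionary mapping entity names to code points.
Func _DecodeEntitiesNoExecute($sText, $oDict)
    Local $aNames = StringRegExp($sText, "&(\w+);", 3) ; flag 3: array of all captured names
    If @error Then Return $sText ; no entity references found
    Local $oSeen = ObjCreate("Scripting.Dictionary") ; names already handled
    For $sName In $aNames
        If $oSeen.Exists($sName) Then ContinueLoop
        $oSeen.Add($sName, 1)
        If Not $oDict.Exists($sName) Then ContinueLoop ; unknown name: leave the text as is
        ; last argument 1 = case-sensitive, so &Gamma; and &gamma; stay distinct
        $sText = StringReplace($sText, "&" & $sName & ";", ChrW($oDict.Item($sName)), 0, 1)
    Next
    Return $sText
EndFunc   ;==>_DecodeEntitiesNoExecute

; Small usage example with a two-entry dictionary
Local $oDemo = ObjCreate("Scripting.Dictionary")
$oDemo.Add("spades", 9824)
$oDemo.Add("gamma", 947)
MsgBox(0, "Result", _DecodeEntitiesNoExecute("It's a &spades; and a &gamma; but not &bogus;", $oDemo))
```

Unknown names such as &bogus; are left untouched, and the apostrophe in the input is harmless because nothing is evaluated. One caveat of replacing name by name: text produced by decoding &amp; can itself be picked up by a later replacement, so this is only a sketch, not a full HTML decoder.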

 

Edited by Chimp
replaced previous listing

 



Added jchd's map example to Chimp's race.

 

#include <String.au3>
#include <ie.au3>
;
Global $oIE = _IECreate("about:blank", 0, 0, 1, 0)
;
Global $__oObjectHTMLFile = ObjCreate("htmlfile")
; Global $__oErrorHandler = ObjEvent("AutoIt.Error", "_ErrFunc")
;
Global $oIE2, $ohJS
_JS_Environment() ; setup the Javascript environment to be used within AutoIt
Global $oEntity = $ohJS.he ; create a reference to the Javascript Entity encode/decode engine.
;                            See here: https://github.com/mathiasbynens/he/blob/master/README.md
;
Global $sd
_Dictionary() ; create the dictionary
;


If @AutoItVersion = "3.3.15.0" Then
    Global $map[]
    _Map()
EndIf



Global $sInput = _StringRepeat("Hello this is a &spades; and this is a &equiv; or a &gamma; and a &Atilde; or &iacute;", 1000)
;

; ----- Race -----
ConsoleWrite("running..." & @CRLF)
mikell()
Chimp()
Danyfirex()
mLipok()
If @AutoItVersion = "3.3.15.0" Then jchd()
ConsoleWrite("----------" & @CRLF)



Func jchd()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        $OutString = Execute("'" & StringRegExpReplace($sInput, "&(\w+);", "' & ChrW($map['\1']) & '") & "'")
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff jchd: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>jchd

Func mikell()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        $OutString = Execute("'" & StringRegExpReplace($sInput, "&(\w+);", "' & ChrW($sd.item('\1')) & '") & "'")
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff mikell: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>mikell

Func Chimp()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        $OutString = $oEntity.decode($sInput)
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff Chimp: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>Chimp

Func Danyfirex()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        $__oObjectHTMLFile.Close()
        $__oObjectHTMLFile.Write($sInput)
        $OutString = $__oObjectHTMLFile.body.innerText
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff Danyfirex: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>Danyfirex

Func mLipok()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        _IEDocWriteHTML($oIE, $sInput)
        $OutString = _IEBodyReadText($oIE)
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff mLipok: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>mLipok

Func _JS_Environment() ; set up the JavaScript engine with the "he" entity library embedded

    ; This is a robust HTML entity encoder/decoder written in JavaScript. (see here: https://github.com/mathiasbynens/he)
    Local $sJScript = BinaryToString(InetRead("https://raw.githubusercontent.com/mathiasbynens/he/master/he.js"))

    ; *** create a minimal 'html' page listing for the browser
    Local $sHTML = "<HTML><HEAD>" & @CRLF
    $sHTML &= "<script>" & @CRLF ; JavaScript goes here
    ; $sHTML &= '"use strict";' & @CRLF ;
    $sHTML &= 'var JSglobal = (1,eval)("this");' & @CRLF ; the 'JSglobal' variable gets a handle to the JavaScript global object
    $sHTML &= $sJScript & @CRLF ; #include <he.js> ; include the entity library
    $sHTML &= "</script>" & @CRLF
    $sHTML &= "</HEAD></HTML>" & @CRLF ; html closing tags
    ; *** end of html page listing

    $oIE2 = ObjCreate("Shell.Explorer.2") ; a BrowserControl engine
    GUICreate("", 10, 10, @DesktopWidth + 10, @DesktopHeight + 10) ; place the GUI off screen
    GUICtrlCreateObj($oIE2, 0, 0, 10, 10) ; this renders $oIE2 usable
    GUISetState(@SW_HIDE) ; hide GUI

    $oIE2.navigate('about:blank')
    ; wait for about:blank; the parentheses are needed because Not binds tighter than =
    While Not (String($oIE2.readyState) = "complete" Or $oIE2.readyState = 4)
        Sleep(100)
    WEnd

    $oIE2.document.Write($sHTML) ; inject the listing directly into the HTML document
    $oIE2.document.close() ; close the write stream
    $oIE2.document.execCommand("Refresh")

    ; this waits till the document is ready to be used (portion of code from IE.au3)
    While Not (String($oIE2.readyState) = "complete" Or $oIE2.readyState = 4)
        Sleep(100)
    WEnd
    While Not (String($oIE2.document.readyState) = "complete" Or $oIE2.document.readyState = 4)
        Sleep(100)
    WEnd

    ; https://msdn.microsoft.com/en-us/library/52f50e9t(v=vs.94).aspx
    $ohJS = $oIE2.document.parentwindow.JSglobal ; $ohJS is a reference to the javascript Global Obj
    ; ---- now the javascript engine can be used in our AutoIt script using the $ohJS reference ----
EndFunc   ;==>_JS_Environment




Func _Dictionary()
    $sd = ObjCreate("Scripting.Dictionary")
    $sd.add("quot", 34)
    $sd.add("amp", 38)
    $sd.add("apos", 39)
    $sd.add("lt", 60)
    $sd.add("gt", 62)
    $sd.add("nbsp", 160)
    $sd.add("iexcl", 161)
    $sd.add("cent", 162)
    $sd.add("pound", 163)
    $sd.add("curren", 164)
    $sd.add("yen", 165)
    $sd.add("brvbar", 166)
    $sd.add("sect", 167)
    $sd.add("uml", 168)
    $sd.add("copy", 169)
    $sd.add("ordf", 170)
    $sd.add("laquo", 171)
    $sd.add("not", 172)
    $sd.add("shy", 173)
    $sd.add("reg", 174)
    $sd.add("macr", 175)
    $sd.add("deg", 176)
    $sd.add("plusmn", 177)
    $sd.add("sup2", 178)
    $sd.add("sup3", 179)
    $sd.add("acute", 180)
    $sd.add("micro", 181)
    $sd.add("para", 182)
    $sd.add("middot", 183)
    $sd.add("cedil", 184)
    $sd.add("sup1", 185)
    $sd.add("ordm", 186)
    $sd.add("raquo", 187)
    $sd.add("frac14", 188)
    $sd.add("frac12", 189)
    $sd.add("frac34", 190)
    $sd.add("iquest", 191)
    $sd.add("Agrave", 192)
    $sd.add("Aacute", 193)
    $sd.add("Acirc", 194)
    $sd.add("Atilde", 195)
    $sd.add("Auml", 196)
    $sd.add("Aring", 197)
    $sd.add("AElig", 198)
    $sd.add("Ccedil", 199)
    $sd.add("Egrave", 200)
    $sd.add("Eacute", 201)
    $sd.add("Ecirc", 202)
    $sd.add("Euml", 203)
    $sd.add("Igrave", 204)
    $sd.add("Iacute", 205)
    $sd.add("Icirc", 206)
    $sd.add("Iuml", 207)
    $sd.add("ETH", 208)
    $sd.add("Ntilde", 209)
    $sd.add("Ograve", 210)
    $sd.add("Oacute", 211)
    $sd.add("Ocirc", 212)
    $sd.add("Otilde", 213)
    $sd.add("Ouml", 214)
    $sd.add("times", 215)
    $sd.add("Oslash", 216)
    $sd.add("Ugrave", 217)
    $sd.add("Uacute", 218)
    $sd.add("Ucirc", 219)
    $sd.add("Uuml", 220)
    $sd.add("Yacute", 221)
    $sd.add("THORN", 222)
    $sd.add("szlig", 223)
    $sd.add("agrave", 224)
    $sd.add("aacute", 225)
    $sd.add("acirc", 226)
    $sd.add("atilde", 227)
    $sd.add("auml", 228)
    $sd.add("aring", 229)
    $sd.add("aelig", 230)
    $sd.add("ccedil", 231)
    $sd.add("egrave", 232)
    $sd.add("eacute", 233)
    $sd.add("ecirc", 234)
    $sd.add("euml", 235)
    $sd.add("igrave", 236)
    $sd.add("iacute", 237)
    $sd.add("icirc", 238)
    $sd.add("iuml", 239)
    $sd.add("eth", 240)
    $sd.add("ntilde", 241)
    $sd.add("ograve", 242)
    $sd.add("oacute", 243)
    $sd.add("ocirc", 244)
    $sd.add("otilde", 245)
    $sd.add("ouml", 246)
    $sd.add("divide", 247)
    $sd.add("oslash", 248)
    $sd.add("ugrave", 249)
    $sd.add("uacute", 250)
    $sd.add("ucirc", 251)
    $sd.add("uuml", 252)
    $sd.add("yacute", 253)
    $sd.add("thorn", 254)
    $sd.add("yuml", 255)
    $sd.add("OElig", 338)
    $sd.add("oelig", 339)
    $sd.add("Scaron", 352)
    $sd.add("scaron", 353)
    $sd.add("Yuml", 376)
    $sd.add("fnof", 402)
    $sd.add("circ", 710)
    $sd.add("tilde", 732)
    $sd.add("Alpha", 913)
    $sd.add("Beta", 914)
    $sd.add("Gamma", 915)
    $sd.add("Delta", 916)
    $sd.add("Epsilon", 917)
    $sd.add("Zeta", 918)
    $sd.add("Eta", 919)
    $sd.add("Theta", 920)
    $sd.add("Iota", 921)
    $sd.add("Kappa", 922)
    $sd.add("Lambda", 923)
    $sd.add("Mu", 924)
    $sd.add("Nu", 925)
    $sd.add("Xi", 926)
    $sd.add("Omicron", 927)
    $sd.add("Pi", 928)
    $sd.add("Rho", 929)
    $sd.add("Sigma", 931)
    $sd.add("Tau", 932)
    $sd.add("Upsilon", 933)
    $sd.add("Phi", 934)
    $sd.add("Chi", 935)
    $sd.add("Psi", 936)
    $sd.add("Omega", 937)
    $sd.add("alpha", 945)
    $sd.add("beta", 946)
    $sd.add("gamma", 947)
    $sd.add("delta", 948)
    $sd.add("epsilon", 949)
    $sd.add("zeta", 950)
    $sd.add("eta", 951)
    $sd.add("theta", 952)
    $sd.add("iota", 953)
    $sd.add("kappa", 954)
    $sd.add("lambda", 955)
    $sd.add("mu", 956)
    $sd.add("nu", 957)
    $sd.add("xi", 958)
    $sd.add("omicron", 959)
    $sd.add("pi", 960)
    $sd.add("rho", 961)
    $sd.add("sigmaf", 962)
    $sd.add("sigma", 963)
    $sd.add("tau", 964)
    $sd.add("upsilon", 965)
    $sd.add("phi", 966)
    $sd.add("chi", 967)
    $sd.add("psi", 968)
    $sd.add("omega", 969)
    $sd.add("thetasym", 977)
    $sd.add("upsih", 978)
    $sd.add("piv", 982)
    $sd.add("ensp", 8194)
    $sd.add("emsp", 8195)
    $sd.add("thinsp", 8201)
    $sd.add("zwnj", 8204)
    $sd.add("zwj", 8205)
    $sd.add("lrm", 8206)
    $sd.add("rlm", 8207)
    $sd.add("ndash", 8211)
    $sd.add("mdash", 8212)
    $sd.add("lsquo", 8216)
    $sd.add("rsquo", 8217)
    $sd.add("sbquo", 8218)
    $sd.add("ldquo", 8220)
    $sd.add("rdquo", 8221)
    $sd.add("bdquo", 8222)
    $sd.add("dagger", 8224)
    $sd.add("Dagger", 8225)
    $sd.add("bull", 8226)
    $sd.add("hellip", 8230)
    $sd.add("permil", 8240)
    $sd.add("prime", 8242)
    $sd.add("Prime", 8243)
    $sd.add("lsaquo", 8249)
    $sd.add("rsaquo", 8250)
    $sd.add("oline", 8254)
    $sd.add("frasl", 8260)
    $sd.add("euro", 8364)
    $sd.add("image", 8465)
    $sd.add("weierp", 8472)
    $sd.add("real", 8476)
    $sd.add("trade", 8482)
    $sd.add("alefsym", 8501)
    $sd.add("larr", 8592)
    $sd.add("uarr", 8593)
    $sd.add("rarr", 8594)
    $sd.add("darr", 8595)
    $sd.add("harr", 8596)
    $sd.add("crarr", 8629)
    $sd.add("lArr", 8656)
    $sd.add("uArr", 8657)
    $sd.add("rArr", 8658)
    $sd.add("dArr", 8659)
    $sd.add("hArr", 8660)
    $sd.add("forall", 8704)
    $sd.add("part", 8706)
    $sd.add("exist", 8707)
    $sd.add("empty", 8709)
    $sd.add("nabla", 8711)
    $sd.add("isin", 8712)
    $sd.add("notin", 8713)
    $sd.add("ni", 8715)
    $sd.add("prod", 8719)
    $sd.add("sum", 8721)
    $sd.add("minus", 8722)
    $sd.add("lowast", 8727)
    $sd.add("radic", 8730)
    $sd.add("prop", 8733)
    $sd.add("infin", 8734)
    $sd.add("ang", 8736)
    $sd.add("and", 8743)
    $sd.add("or", 8744)
    $sd.add("cap", 8745)
    $sd.add("cup", 8746)
    $sd.add("int", 8747)
    $sd.add("there4", 8756)
    $sd.add("sim", 8764)
    $sd.add("cong", 8773)
    $sd.add("asymp", 8776)
    $sd.add("ne", 8800)
    $sd.add("equiv", 8801)
    $sd.add("le", 8804)
    $sd.add("ge", 8805)
    $sd.add("sub", 8834)
    $sd.add("sup", 8835)
    $sd.add("nsub", 8836)
    $sd.add("sube", 8838)
    $sd.add("supe", 8839)
    $sd.add("oplus", 8853)
    $sd.add("otimes", 8855)
    $sd.add("perp", 8869)
    $sd.add("sdot", 8901)
    $sd.add("lceil", 8968)
    $sd.add("rceil", 8969)
    $sd.add("lfloor", 8970)
    $sd.add("rfloor", 8971)
    $sd.add("lang", 9001)
    $sd.add("rang", 9002)
    $sd.add("loz", 9674)
    $sd.add("spades", 9824)
    $sd.add("clubs", 9827)
    $sd.add("hearts", 9829)
    $sd.add("diams", 9830)
EndFunc   ;==>_Dictionary


Func _Map()

    $map["quot"] = 34
    $map["amp"] = 38
    $map["apos"] = 39
    $map["lt"] = 60
    $map["gt"] = 62
    $map["nbsp"] = 160
    $map["iexcl"] = 161
    $map["cent"] = 162
    $map["pound"] = 163
    $map["curren"] = 164
    $map["yen"] = 165
    $map["brvbar"] = 166
    $map["sect"] = 167
    $map["uml"] = 168
    $map["copy"] = 169
    $map["ordf"] = 170
    $map["laquo"] = 171
    $map["not"] = 172
    $map["shy"] = 173
    $map["reg"] = 174
    $map["macr"] = 175
    $map["deg"] = 176
    $map["plusmn"] = 177
    $map["sup2"] = 178
    $map["sup3"] = 179
    $map["acute"] = 180
    $map["micro"] = 181
    $map["para"] = 182
    $map["middot"] = 183
    $map["cedil"] = 184
    $map["sup1"] = 185
    $map["ordm"] = 186
    $map["raquo"] = 187
    $map["frac14"] = 188
    $map["frac12"] = 189
    $map["frac34"] = 190
    $map["iquest"] = 191
    $map["Agrave"] = 192
    $map["Aacute"] = 193
    $map["Acirc"] = 194
    $map["Atilde"] = 195
    $map["Auml"] = 196
    $map["Aring"] = 197
    $map["AElig"] = 198
    $map["Ccedil"] = 199
    $map["Egrave"] = 200
    $map["Eacute"] = 201
    $map["Ecirc"] = 202
    $map["Euml"] = 203
    $map["Igrave"] = 204
    $map["Iacute"] = 205
    $map["Icirc"] = 206
    $map["Iuml"] = 207
    $map["ETH"] = 208
    $map["Ntilde"] = 209
    $map["Ograve"] = 210
    $map["Oacute"] = 211
    $map["Ocirc"] = 212
    $map["Otilde"] = 213
    $map["Ouml"] = 214
    $map["times"] = 215
    $map["Oslash"] = 216
    $map["Ugrave"] = 217
    $map["Uacute"] = 218
    $map["Ucirc"] = 219
    $map["Uuml"] = 220
    $map["Yacute"] = 221
    $map["THORN"] = 222
    $map["szlig"] = 223
    $map["agrave"] = 224
    $map["aacute"] = 225
    $map["acirc"] = 226
    $map["atilde"] = 227
    $map["auml"] = 228
    $map["aring"] = 229
    $map["aelig"] = 230
    $map["ccedil"] = 231
    $map["egrave"] = 232
    $map["eacute"] = 233
    $map["ecirc"] = 234
    $map["euml"] = 235
    $map["igrave"] = 236
    $map["iacute"] = 237
    $map["icirc"] = 238
    $map["iuml"] = 239
    $map["eth"] = 240
    $map["ntilde"] = 241
    $map["ograve"] = 242
    $map["oacute"] = 243
    $map["ocirc"] = 244
    $map["otilde"] = 245
    $map["ouml"] = 246
    $map["divide"] = 247
    $map["oslash"] = 248
    $map["ugrave"] = 249
    $map["uacute"] = 250
    $map["ucirc"] = 251
    $map["uuml"] = 252
    $map["yacute"] = 253
    $map["thorn"] = 254
    $map["yuml"] = 255
    $map["OElig"] = 338
    $map["oelig"] = 339
    $map["Scaron"] = 352
    $map["scaron"] = 353
    $map["Yuml"] = 376
    $map["fnof"] = 402
    $map["circ"] = 710
    $map["tilde"] = 732
    $map["Alpha"] = 913
    $map["Beta"] = 914
    $map["Gamma"] = 915
    $map["Delta"] = 916
    $map["Epsilon"] = 917
    $map["Zeta"] = 918
    $map["Eta"] = 919
    $map["Theta"] = 920
    $map["Iota"] = 921
    $map["Kappa"] = 922
    $map["Lambda"] = 923
    $map["Mu"] = 924
    $map["Nu"] = 925
    $map["Xi"] = 926
    $map["Omicron"] = 927
    $map["Pi"] = 928
    $map["Rho"] = 929
    $map["Sigma"] = 931
    $map["Tau"] = 932
    $map["Upsilon"] = 933
    $map["Phi"] = 934
    $map["Chi"] = 935
    $map["Psi"] = 936
    $map["Omega"] = 937
    $map["alpha"] = 945
    $map["beta"] = 946
    $map["gamma"] = 947
    $map["delta"] = 948
    $map["epsilon"] = 949
    $map["zeta"] = 950
    $map["eta"] = 951
    $map["theta"] = 952
    $map["iota"] = 953
    $map["kappa"] = 954
    $map["lambda"] = 955
    $map["mu"] = 956
    $map["nu"] = 957
    $map["xi"] = 958
    $map["omicron"] = 959
    $map["pi"] = 960
    $map["rho"] = 961
    $map["sigmaf"] = 962
    $map["sigma"] = 963
    $map["tau"] = 964
    $map["upsilon"] = 965
    $map["phi"] = 966
    $map["chi"] = 967
    $map["psi"] = 968
    $map["omega"] = 969
    $map["thetasym"] = 977
    $map["upsih"] = 978
    $map["piv"] = 982
    $map["ensp"] = 8194
    $map["emsp"] = 8195
    $map["thinsp"] = 8201
    $map["zwnj"] = 8204
    $map["zwj"] = 8205
    $map["lrm"] = 8206
    $map["rlm"] = 8207
    $map["ndash"] = 8211
    $map["mdash"] = 8212
    $map["lsquo"] = 8216
    $map["rsquo"] = 8217
    $map["sbquo"] = 8218
    $map["ldquo"] = 8220
    $map["rdquo"] = 8221
    $map["bdquo"] = 8222
    $map["dagger"] = 8224
    $map["Dagger"] = 8225
    $map["bull"] = 8226
    $map["hellip"] = 8230
    $map["permil"] = 8240
    $map["prime"] = 8242
    $map["Prime"] = 8243
    $map["lsaquo"] = 8249
    $map["rsaquo"] = 8250
    $map["oline"] = 8254
    $map["frasl"] = 8260
    $map["euro"] = 8364
    $map["image"] = 8465
    $map["weierp"] = 8472
    $map["real"] = 8476
    $map["trade"] = 8482
    $map["alefsym"] = 8501
    $map["larr"] = 8592
    $map["uarr"] = 8593
    $map["rarr"] = 8594
    $map["darr"] = 8595
    $map["harr"] = 8596
    $map["crarr"] = 8629
    $map["lArr"] = 8656
    $map["uArr"] = 8657
    $map["rArr"] = 8658
    $map["dArr"] = 8659
    $map["hArr"] = 8660
    $map["forall"] = 8704
    $map["part"] = 8706
    $map["exist"] = 8707
    $map["empty"] = 8709
    $map["nabla"] = 8711
    $map["isin"] = 8712
    $map["notin"] = 8713
    $map["ni"] = 8715
    $map["prod"] = 8719
    $map["sum"] = 8721
    $map["minus"] = 8722
    $map["lowast"] = 8727
    $map["radic"] = 8730
    $map["prop"] = 8733
    $map["infin"] = 8734
    $map["ang"] = 8736
    $map["and"] = 8743
    $map["or"] = 8744
    $map["cap"] = 8745
    $map["cup"] = 8746
    $map["int"] = 8747
    $map["there4"] = 8756
    $map["sim"] = 8764
    $map["cong"] = 8773
    $map["asymp"] = 8776
    $map["ne"] = 8800
    $map["equiv"] = 8801
    $map["le"] = 8804
    $map["ge"] = 8805
    $map["sub"] = 8834
    $map["sup"] = 8835
    $map["nsub"] = 8836
    $map["sube"] = 8838
    $map["supe"] = 8839
    $map["oplus"] = 8853
    $map["otimes"] = 8855
    $map["perp"] = 8869
    $map["sdot"] = 8901
    $map["lceil"] = 8968
    $map["rceil"] = 8969
    $map["lfloor"] = 8970
    $map["rfloor"] = 8971
    $map["lang"] = 9001
    $map["rang"] = 9002
    $map["loz"] = 9674
    $map["spades"] = 9824
    $map["clubs"] = 9827
    $map["hearts"] = 9829
    $map["diams"] = 9830
EndFunc   ;==>_Map

Regards


Added an example using the CLR UDF (really fast).

 

#include <String.au3>
#include <ie.au3>
#include ".\Includes\CLR.Au3"
#include ".\Includes\SafeArray.au3"
$__g_bIEErrorNotify = False ; turn off IE.au3 error notifications on the console
Global $oIE = _IECreate("about:blank", 0, 0, 1, 0)
;
Global $__oObjectHTMLFile = ObjCreate("htmlfile")
; Global $__oErrorHandler = ObjEvent("AutoIt.Error", "_ErrFunc")
;
Global $oIE2, $ohJS
_JS_Environment() ; setup the Javascript environment to be used within AutoIt
Global $oEntity = $ohJS.he ; create a reference to the Javascript Entity encode/decode engine.
;                            See here: https://github.com/mathiasbynens/he/blob/master/README.md
;
Global $sd
_Dictionary() ; create the dictionary
;


If @AutoItVersion = "3.3.15.0" Then
    Global $map[]
    _Map()
EndIf



Global $sInput = _StringRepeat("Hello this is a &spades; and this is a &equiv; or a &gamma; and a &Atilde; or &iacute;", 1000)
;

; ----- Race -----
ConsoleWrite("running..." & @CRLF)
mikell()
Chimp()
Danyfirex()
mLipok()
If @AutoItVersion = "3.3.15.0" Then jchd()
CLRTest()
ConsoleWrite("----------" & @CRLF)


Func CLRTest()
    Local $hTimer = TimerInit()
    Local $aText[] = [$sInput]
    Local $oAssembly = _CLR_LoadLibrary("System")
;~  ConsoleWrite("$oAssembly: " & IsObj($oAssembly) & @CRLF)
    Local $pAssemblyType = 0
    $oAssembly.GetType_2("System.Net.WebUtility", $pAssemblyType)
;~  ConsoleWrite("$pAssemblyType = " & Ptr($pAssemblyType) & @CRLF)
    Local $oActivatorType = ObjCreateInterface($pAssemblyType, $sIID_IType, $sTag_IType)
;~  ConsoleWrite("IsObj( $oAssemblyType ) = " & IsObj($oActivatorType) & @TAB & @CRLF)
    Local $sOutString = ""
    For $i = 1 To 100
        $oActivatorType.InvokeMember_3("HtmlDecode", 256, 0, 0, CreateSafeArray($aText), $sOutString)
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff CLRTest: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>CLRTest

Func jchd()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        $OutString = Execute("'" & StringRegExpReplace($sInput, "&(\w+);", "' & ChrW($map['\1']) & '") & "'")
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff jchd: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>jchd

Func mikell()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        $OutString = Execute("'" & StringRegExpReplace($sInput, "&(\w+);", "' & ChrW($sd.item('\1')) & '") & "'")
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff mikell: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>mikell

Func Chimp()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        $OutString = $oEntity.decode($sInput)
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff Chimp: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>Chimp

Func Danyfirex()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        $__oObjectHTMLFile.Close()
        $__oObjectHTMLFile.Write($sInput)
        $OutString = $__oObjectHTMLFile.body.innerText
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff Danyfirex: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>Danyfirex

Func mLipok()
    Local $hTimer = TimerInit()
    Local $OutString = ""
    For $i = 1 To 100
        _IEDocWriteHTML($oIE, $sInput)
        $OutString = _IEBodyReadText($oIE)
    Next
    Local $fDiff = TimerDiff($hTimer)
    ConsoleWrite("TimerDiff mLipok: " & @TAB & $fDiff & @CRLF)
EndFunc   ;==>mLipok

Func _JS_Environment() ; set up the JavaScript engine with the "he" entity library embedded

    ; This is a robust HTML entity encoder/decoder written in JavaScript. (see here: https://github.com/mathiasbynens/he)
    Local $sJScript = BinaryToString(InetRead("https://raw.githubusercontent.com/mathiasbynens/he/master/he.js"))

    ; *** create a minimal 'html' page listing for the browser
    Local $sHTML = "<HTML><HEAD>" & @CRLF
    $sHTML &= "<script>" & @CRLF ; JavaScript goes here
    ; $sHTML &= '"use strict";' & @CRLF ;
    $sHTML &= 'var JSglobal = (1,eval)("this");' & @CRLF ; the 'JSglobal' variable gets a handle to the JavaScript global object
    $sHTML &= $sJScript & @CRLF ; #include <he.js> ; include the entity library
    $sHTML &= "</script>" & @CRLF
    $sHTML &= "</HEAD></HTML>" & @CRLF ; html closing tags
    ; *** end of html page listing

    $oIE2 = ObjCreate("Shell.Explorer.2") ; a BrowserControl engine
    GUICreate("", 10, 10, @DesktopWidth + 10, @DesktopHeight + 10) ; place the GUI off screen
    GUICtrlCreateObj($oIE2, 0, 0, 10, 10) ; this renders $oIE2 usable
    GUISetState(@SW_HIDE) ; hide GUI

    $oIE2.navigate('about:blank')
    ; wait for about:blank; the parentheses are needed because Not binds tighter than =
    While Not (String($oIE2.readyState) = "complete" Or $oIE2.readyState = 4)
        Sleep(100)
    WEnd

    $oIE2.document.Write($sHTML) ; inject the listing directly into the HTML document
    $oIE2.document.close() ; close the write stream
    $oIE2.document.execCommand("Refresh")

    ; this waits till the document is ready to be used (portion of code from IE.au3)
    While Not (String($oIE2.readyState) = "complete" Or $oIE2.readyState = 4)
        Sleep(100)
    WEnd
    While Not (String($oIE2.document.readyState) = "complete" Or $oIE2.document.readyState = 4)
        Sleep(100)
    WEnd

    ; https://msdn.microsoft.com/en-us/library/52f50e9t(v=vs.94).aspx
    $ohJS = $oIE2.document.parentwindow.JSglobal ; $ohJS is a reference to the javascript Global Obj
    ; ---- now the javascript engine can be used in our AutoIt script using the $ohJS reference ----
EndFunc   ;==>_JS_Environment




Func _Dictionary()
    $sd = ObjCreate("Scripting.Dictionary")
    $sd.add("quot", 34)
    $sd.add("amp", 38)
    $sd.add("apos", 39)
    $sd.add("lt", 60)
    $sd.add("gt", 62)
    $sd.add("nbsp", 160)
    $sd.add("iexcl", 161)
    $sd.add("cent", 162)
    $sd.add("pound", 163)
    $sd.add("curren", 164)
    $sd.add("yen", 165)
    $sd.add("brvbar", 166)
    $sd.add("sect", 167)
    $sd.add("uml", 168)
    $sd.add("copy", 169)
    $sd.add("ordf", 170)
    $sd.add("laquo", 171)
    $sd.add("not", 172)
    $sd.add("shy", 173)
    $sd.add("reg", 174)
    $sd.add("macr", 175)
    $sd.add("deg", 176)
    $sd.add("plusmn", 177)
    $sd.add("sup2", 178)
    $sd.add("sup3", 179)
    $sd.add("acute", 180)
    $sd.add("micro", 181)
    $sd.add("para", 182)
    $sd.add("middot", 183)
    $sd.add("cedil", 184)
    $sd.add("sup1", 185)
    $sd.add("ordm", 186)
    $sd.add("raquo", 187)
    $sd.add("frac14", 188)
    $sd.add("frac12", 189)
    $sd.add("frac34", 190)
    $sd.add("iquest", 191)
    $sd.add("Agrave", 192)
    $sd.add("Aacute", 193)
    $sd.add("Acirc", 194)
    $sd.add("Atilde", 195)
    $sd.add("Auml", 196)
    $sd.add("Aring", 197)
    $sd.add("AElig", 198)
    $sd.add("Ccedil", 199)
    $sd.add("Egrave", 200)
    $sd.add("Eacute", 201)
    $sd.add("Ecirc", 202)
    $sd.add("Euml", 203)
    $sd.add("Igrave", 204)
    $sd.add("Iacute", 205)
    $sd.add("Icirc", 206)
    $sd.add("Iuml", 207)
    $sd.add("ETH", 208)
    $sd.add("Ntilde", 209)
    $sd.add("Ograve", 210)
    $sd.add("Oacute", 211)
    $sd.add("Ocirc", 212)
    $sd.add("Otilde", 213)
    $sd.add("Ouml", 214)
    $sd.add("times", 215)
    $sd.add("Oslash", 216)
    $sd.add("Ugrave", 217)
    $sd.add("Uacute", 218)
    $sd.add("Ucirc", 219)
    $sd.add("Uuml", 220)
    $sd.add("Yacute", 221)
    $sd.add("THORN", 222)
    $sd.add("szlig", 223)
    $sd.add("agrave", 224)
    $sd.add("aacute", 225)
    $sd.add("acirc", 226)
    $sd.add("atilde", 227)
    $sd.add("auml", 228)
    $sd.add("aring", 229)
    $sd.add("aelig", 230)
    $sd.add("ccedil", 231)
    $sd.add("egrave", 232)
    $sd.add("eacute", 233)
    $sd.add("ecirc", 234)
    $sd.add("euml", 235)
    $sd.add("igrave", 236)
    $sd.add("iacute", 237)
    $sd.add("icirc", 238)
    $sd.add("iuml", 239)
    $sd.add("eth", 240)
    $sd.add("ntilde", 241)
    $sd.add("ograve", 242)
    $sd.add("oacute", 243)
    $sd.add("ocirc", 244)
    $sd.add("otilde", 245)
    $sd.add("ouml", 246)
    $sd.add("divide", 247)
    $sd.add("oslash", 248)
    $sd.add("ugrave", 249)
    $sd.add("uacute", 250)
    $sd.add("ucirc", 251)
    $sd.add("uuml", 252)
    $sd.add("yacute", 253)
    $sd.add("thorn", 254)
    $sd.add("yuml", 255)
    $sd.add("OElig", 338)
    $sd.add("oelig", 339)
    $sd.add("Scaron", 352)
    $sd.add("scaron", 353)
    $sd.add("Yuml", 376)
    $sd.add("fnof", 402)
    $sd.add("circ", 710)
    $sd.add("tilde", 732)
    $sd.add("Alpha", 913)
    $sd.add("Beta", 914)
    $sd.add("Gamma", 915)
    $sd.add("Delta", 916)
    $sd.add("Epsilon", 917)
    $sd.add("Zeta", 918)
    $sd.add("Eta", 919)
    $sd.add("Theta", 920)
    $sd.add("Iota", 921)
    $sd.add("Kappa", 922)
    $sd.add("Lambda", 923)
    $sd.add("Mu", 924)
    $sd.add("Nu", 925)
    $sd.add("Xi", 926)
    $sd.add("Omicron", 927)
    $sd.add("Pi", 928)
    $sd.add("Rho", 929)
    $sd.add("Sigma", 931)
    $sd.add("Tau", 932)
    $sd.add("Upsilon", 933)
    $sd.add("Phi", 934)
    $sd.add("Chi", 935)
    $sd.add("Psi", 936)
    $sd.add("Omega", 937)
    $sd.add("alpha", 945)
    $sd.add("beta", 946)
    $sd.add("gamma", 947)
    $sd.add("delta", 948)
    $sd.add("epsilon", 949)
    $sd.add("zeta", 950)
    $sd.add("eta", 951)
    $sd.add("theta", 952)
    $sd.add("iota", 953)
    $sd.add("kappa", 954)
    $sd.add("lambda", 955)
    $sd.add("mu", 956)
    $sd.add("nu", 957)
    $sd.add("xi", 958)
    $sd.add("omicron", 959)
    $sd.add("pi", 960)
    $sd.add("rho", 961)
    $sd.add("sigmaf", 962)
    $sd.add("sigma", 963)
    $sd.add("tau", 964)
    $sd.add("upsilon", 965)
    $sd.add("phi", 966)
    $sd.add("chi", 967)
    $sd.add("psi", 968)
    $sd.add("omega", 969)
    $sd.add("thetasym", 977)
    $sd.add("upsih", 978)
    $sd.add("piv", 982)
    $sd.add("ensp", 8194)
    $sd.add("emsp", 8195)
    $sd.add("thinsp", 8201)
    $sd.add("zwnj", 8204)
    $sd.add("zwj", 8205)
    $sd.add("lrm", 8206)
    $sd.add("rlm", 8207)
    $sd.add("ndash", 8211)
    $sd.add("mdash", 8212)
    $sd.add("lsquo", 8216)
    $sd.add("rsquo", 8217)
    $sd.add("sbquo", 8218)
    $sd.add("ldquo", 8220)
    $sd.add("rdquo", 8221)
    $sd.add("bdquo", 8222)
    $sd.add("dagger", 8224)
    $sd.add("Dagger", 8225)
    $sd.add("bull", 8226)
    $sd.add("hellip", 8230)
    $sd.add("permil", 8240)
    $sd.add("prime", 8242)
    $sd.add("Prime", 8243)
    $sd.add("lsaquo", 8249)
    $sd.add("rsaquo", 8250)
    $sd.add("oline", 8254)
    $sd.add("frasl", 8260)
    $sd.add("euro", 8364)
    $sd.add("image", 8465)
    $sd.add("weierp", 8472)
    $sd.add("real", 8476)
    $sd.add("trade", 8482)
    $sd.add("alefsym", 8501)
    $sd.add("larr", 8592)
    $sd.add("uarr", 8593)
    $sd.add("rarr", 8594)
    $sd.add("darr", 8595)
    $sd.add("harr", 8596)
    $sd.add("crarr", 8629)
    $sd.add("lArr", 8656)
    $sd.add("uArr", 8657)
    $sd.add("rArr", 8658)
    $sd.add("dArr", 8659)
    $sd.add("hArr", 8660)
    $sd.add("forall", 8704)
    $sd.add("part", 8706)
    $sd.add("exist", 8707)
    $sd.add("empty", 8709)
    $sd.add("nabla", 8711)
    $sd.add("isin", 8712)
    $sd.add("notin", 8713)
    $sd.add("ni", 8715)
    $sd.add("prod", 8719)
    $sd.add("sum", 8721)
    $sd.add("minus", 8722)
    $sd.add("lowast", 8727)
    $sd.add("radic", 8730)
    $sd.add("prop", 8733)
    $sd.add("infin", 8734)
    $sd.add("ang", 8736)
    $sd.add("and", 8743)
    $sd.add("or", 8744)
    $sd.add("cap", 8745)
    $sd.add("cup", 8746)
    $sd.add("int", 8747)
    $sd.add("there4", 8756)
    $sd.add("sim", 8764)
    $sd.add("cong", 8773)
    $sd.add("asymp", 8776)
    $sd.add("ne", 8800)
    $sd.add("equiv", 8801)
    $sd.add("le", 8804)
    $sd.add("ge", 8805)
    $sd.add("sub", 8834)
    $sd.add("sup", 8835)
    $sd.add("nsub", 8836)
    $sd.add("sube", 8838)
    $sd.add("supe", 8839)
    $sd.add("oplus", 8853)
    $sd.add("otimes", 8855)
    $sd.add("perp", 8869)
    $sd.add("sdot", 8901)
    $sd.add("lceil", 8968)
    $sd.add("rceil", 8969)
    $sd.add("lfloor", 8970)
    $sd.add("rfloor", 8971)
    $sd.add("lang", 9001)
    $sd.add("rang", 9002)
    $sd.add("loz", 9674)
    $sd.add("spades", 9824)
    $sd.add("clubs", 9827)
    $sd.add("hearts", 9829)
    $sd.add("diams", 9830)
EndFunc   ;==>_Dictionary


Func _Map()

    $map["quot"] = 34
    $map["amp"] = 38
    $map["apos"] = 39
    $map["lt"] = 60
    $map["gt"] = 62
    $map["nbsp"] = 160
    $map["iexcl"] = 161
    $map["cent"] = 162
    $map["pound"] = 163
    $map["curren"] = 164
    $map["yen"] = 165
    $map["brvbar"] = 166
    $map["sect"] = 167
    $map["uml"] = 168
    $map["copy"] = 169
    $map["ordf"] = 170
    $map["laquo"] = 171
    $map["not"] = 172
    $map["shy"] = 173
    $map["reg"] = 174
    $map["macr"] = 175
    $map["deg"] = 176
    $map["plusmn"] = 177
    $map["sup2"] = 178
    $map["sup3"] = 179
    $map["acute"] = 180
    $map["micro"] = 181
    $map["para"] = 182
    $map["middot"] = 183
    $map["cedil"] = 184
    $map["sup1"] = 185
    $map["ordm"] = 186
    $map["raquo"] = 187
    $map["frac14"] = 188
    $map["frac12"] = 189
    $map["frac34"] = 190
    $map["iquest"] = 191
    $map["Agrave"] = 192
    $map["Aacute"] = 193
    $map["Acirc"] = 194
    $map["Atilde"] = 195
    $map["Auml"] = 196
    $map["Aring"] = 197
    $map["AElig"] = 198
    $map["Ccedil"] = 199
    $map["Egrave"] = 200
    $map["Eacute"] = 201
    $map["Ecirc"] = 202
    $map["Euml"] = 203
    $map["Igrave"] = 204
    $map["Iacute"] = 205
    $map["Icirc"] = 206
    $map["Iuml"] = 207
    $map["ETH"] = 208
    $map["Ntilde"] = 209
    $map["Ograve"] = 210
    $map["Oacute"] = 211
    $map["Ocirc"] = 212
    $map["Otilde"] = 213
    $map["Ouml"] = 214
    $map["times"] = 215
    $map["Oslash"] = 216
    $map["Ugrave"] = 217
    $map["Uacute"] = 218
    $map["Ucirc"] = 219
    $map["Uuml"] = 220
    $map["Yacute"] = 221
    $map["THORN"] = 222
    $map["szlig"] = 223
    $map["agrave"] = 224
    $map["aacute"] = 225
    $map["acirc"] = 226
    $map["atilde"] = 227
    $map["auml"] = 228
    $map["aring"] = 229
    $map["aelig"] = 230
    $map["ccedil"] = 231
    $map["egrave"] = 232
    $map["eacute"] = 233
    $map["ecirc"] = 234
    $map["euml"] = 235
    $map["igrave"] = 236
    $map["iacute"] = 237
    $map["icirc"] = 238
    $map["iuml"] = 239
    $map["eth"] = 240
    $map["ntilde"] = 241
    $map["ograve"] = 242
    $map["oacute"] = 243
    $map["ocirc"] = 244
    $map["otilde"] = 245
    $map["ouml"] = 246
    $map["divide"] = 247
    $map["oslash"] = 248
    $map["ugrave"] = 249
    $map["uacute"] = 250
    $map["ucirc"] = 251
    $map["uuml"] = 252
    $map["yacute"] = 253
    $map["thorn"] = 254
    $map["yuml"] = 255
    $map["OElig"] = 338
    $map["oelig"] = 339
    $map["Scaron"] = 352
    $map["scaron"] = 353
    $map["Yuml"] = 376
    $map["fnof"] = 402
    $map["circ"] = 710
    $map["tilde"] = 732
    $map["Alpha"] = 913
    $map["Beta"] = 914
    $map["Gamma"] = 915
    $map["Delta"] = 916
    $map["Epsilon"] = 917
    $map["Zeta"] = 918
    $map["Eta"] = 919
    $map["Theta"] = 920
    $map["Iota"] = 921
    $map["Kappa"] = 922
    $map["Lambda"] = 923
    $map["Mu"] = 924
    $map["Nu"] = 925
    $map["Xi"] = 926
    $map["Omicron"] = 927
    $map["Pi"] = 928
    $map["Rho"] = 929
    $map["Sigma"] = 931
    $map["Tau"] = 932
    $map["Upsilon"] = 933
    $map["Phi"] = 934
    $map["Chi"] = 935
    $map["Psi"] = 936
    $map["Omega"] = 937
    $map["alpha"] = 945
    $map["beta"] = 946
    $map["gamma"] = 947
    $map["delta"] = 948
    $map["epsilon"] = 949
    $map["zeta"] = 950
    $map["eta"] = 951
    $map["theta"] = 952
    $map["iota"] = 953
    $map["kappa"] = 954
    $map["lambda"] = 955
    $map["mu"] = 956
    $map["nu"] = 957
    $map["xi"] = 958
    $map["omicron"] = 959
    $map["pi"] = 960
    $map["rho"] = 961
    $map["sigmaf"] = 962
    $map["sigma"] = 963
    $map["tau"] = 964
    $map["upsilon"] = 965
    $map["phi"] = 966
    $map["chi"] = 967
    $map["psi"] = 968
    $map["omega"] = 969
    $map["thetasym"] = 977
    $map["upsih"] = 978
    $map["piv"] = 982
    $map["ensp"] = 8194
    $map["emsp"] = 8195
    $map["thinsp"] = 8201
    $map["zwnj"] = 8204
    $map["zwj"] = 8205
    $map["lrm"] = 8206
    $map["rlm"] = 8207
    $map["ndash"] = 8211
    $map["mdash"] = 8212
    $map["lsquo"] = 8216
    $map["rsquo"] = 8217
    $map["sbquo"] = 8218
    $map["ldquo"] = 8220
    $map["rdquo"] = 8221
    $map["bdquo"] = 8222
    $map["dagger"] = 8224
    $map["Dagger"] = 8225
    $map["bull"] = 8226
    $map["hellip"] = 8230
    $map["permil"] = 8240
    $map["prime"] = 8242
    $map["Prime"] = 8243
    $map["lsaquo"] = 8249
    $map["rsaquo"] = 8250
    $map["oline"] = 8254
    $map["frasl"] = 8260
    $map["euro"] = 8364
    $map["image"] = 8465
    $map["weierp"] = 8472
    $map["real"] = 8476
    $map["trade"] = 8482
    $map["alefsym"] = 8501
    $map["larr"] = 8592
    $map["uarr"] = 8593
    $map["rarr"] = 8594
    $map["darr"] = 8595
    $map["harr"] = 8596
    $map["crarr"] = 8629
    $map["lArr"] = 8656
    $map["uArr"] = 8657
    $map["rArr"] = 8658
    $map["dArr"] = 8659
    $map["hArr"] = 8660
    $map["forall"] = 8704
    $map["part"] = 8706
    $map["exist"] = 8707
    $map["empty"] = 8709
    $map["nabla"] = 8711
    $map["isin"] = 8712
    $map["notin"] = 8713
    $map["ni"] = 8715
    $map["prod"] = 8719
    $map["sum"] = 8721
    $map["minus"] = 8722
    $map["lowast"] = 8727
    $map["radic"] = 8730
    $map["prop"] = 8733
    $map["infin"] = 8734
    $map["ang"] = 8736
    $map["and"] = 8743
    $map["or"] = 8744
    $map["cap"] = 8745
    $map["cup"] = 8746
    $map["int"] = 8747
    $map["there4"] = 8756
    $map["sim"] = 8764
    $map["cong"] = 8773
    $map["asymp"] = 8776
    $map["ne"] = 8800
    $map["equiv"] = 8801
    $map["le"] = 8804
    $map["ge"] = 8805
    $map["sub"] = 8834
    $map["sup"] = 8835
    $map["nsub"] = 8836
    $map["sube"] = 8838
    $map["supe"] = 8839
    $map["oplus"] = 8853
    $map["otimes"] = 8855
    $map["perp"] = 8869
    $map["sdot"] = 8901
    $map["lceil"] = 8968
    $map["rceil"] = 8969
    $map["lfloor"] = 8970
    $map["rfloor"] = 8971
    $map["lang"] = 9001
    $map["rang"] = 9002
    $map["loz"] = 9674
    $map["spades"] = 9824
    $map["clubs"] = 9827
    $map["hearts"] = 9829
    $map["diams"] = 9830
EndFunc   ;==>_Map

 

Need 

Regards
