Convert HTML Encoding Strings


Hi there!

My first post here, so please bear with me ;)

 

Basically, I'm writing a tool that downloads a JSON file with the Steam Web API, and I want to display parts of it in my GUI (I use the WinHTTP UDF to download).

Sometimes the JSON file contains HTML encoding strings.  As long as a string consists solely of \uXXXX escapes, this code works fine:

 

$sText = "\u4e3a\u4e86\u9632\u6b62\u5723\u7269\u91d1\u5777\u5783\u843d\u5165\u4fb5\u7565\u8005\u7684\u624b\u4e2d\uff0c\u5723\u5730\u4e9a\u6208\u5927\u9646\u4e0a\u7684\u6218\u58eb\u4eec\u7eb7\u7eb7\u633a\u8eab\u800c\u51fa\u3002"
$aText = StringSplit($sText, "\u", 3)

$sNew = ""
For $i = 1 To UBound($aText) - 1
    $sNew &= ChrW(Dec($aText[$i]))
Next
MsgBox(0, "", $sNew)

Unfortunately, quite often there are special characters like "&reg;", "&copy;" or "&amp;" mixed with "normal" ASCII letters, like this:

Quote

F1\u00ae 2020 allows you to create your F1\u00ae team &amp; whatever

Here's my solution for this:

$sText = "F1\u00ae 2020 allows you to create your F1\u00ae team &amp; whatever"
; HTML entities and \uXXXX escapes seen so far, mapped to the characters they stand for
Global $aCodes[][2] = [['&reg;', ChrW(0x00AE)],['&nbsp;', ChrW(0x0020)],['\u2019', ChrW(0x2019)],['&apos;', ChrW(0x0027)], _
                    ['\u0026', ChrW(0x0026)],['&amp;', ChrW(0x0026)],['\u00ae', ChrW(0x00ae)],['\u2122', ChrW(0x2122)], _
                    ['&Uuml;', ChrW(0x00DC)],['&uuml;',ChrW(0x00FC)],['\u00fc', ChrW(0x00fc)],['\u00e4', ChrW(0x00e4)], _
                    ['\u201c', ChrW(0x201c)],['\u201d',ChrW(0x201d)],['&quot;', ChrW(0x0022)],['&rsquo;',ChrW(0x2019)], _
                    ['\u00e8', ChrW(0x00e8)],['&trade;',ChrW(0x2122)]]
For $i = 0 To UBound($aCodes) - 1
    $sText = StringReplace($sText, $aCodes[$i][0], $aCodes[$i][1], 0 , 1)
Next
MsgBox(0, "", $sText)

I'm not very pleased with this solution, and I think the best way to handle all of these cases is a regex. Here's what works, but ONLY for "\u4e3a":

$sText = "\u4e3aand more text\u4e3a"
$sText = StringRegExpReplace($sText, '\\u(\w{4})', ChrW(Dec("4e3a")))
MsgBox(0, "", $sText)

But how can I do this for any "\uXXXX"? I have no idea how to fit in the backreference "$1".

I was thinking something like this might work, but it doesn't:

$sText = "\u4e3a\u4e86\u9632\u6b62\u5723\u7269\u91d1\u5777\u5783\u843d\u5165\u4fb5\u7565\u8005\u7684\u624b\u4e2d\uff0c\u5723\u5730\u4e9a\u6208\u5927\u9646\u4e0a\u7684\u6218\u58eb\u4eec\u7eb7\u7eb7\u633a\u8eab\u800c\u51fa\u3002"

$sReplace = StringRegExpReplace($sText, '\\u(\w{4})', Execute("ChrW(" & Dec("'$1'") & ")"))
MsgBox(0, "", $sReplace)

;~ or this

$sReplace = StringRegExpReplace($sText, '\\u(\w{4})', Execute("ChrW(Dec(" & "'$1'" & "))"))
MsgBox(0, "", $sReplace)

I may have messed up the Execute command. Is there a correct way to do this?

If that's not the way at all, I'm grateful for any suggestions...

 

Thanks in advance,

WinWiesel!


;$sText = "\u4e3a\u4e86\u9632\u6b62\u5723\u7269\u91d1\u5777\u5783\u843d\u5165\u4fb5\u7565\u8005\u7684\u624b\u4e2d\uff0c\u5723\u5730\u4e9a\u6208\u5927\u9646\u4e0a\u7684\u6218\u58eb\u4eec\u7eb7\u7eb7\u633a\u8eab\u800c\u51fa\u3002"
$sText = "F1\u00ae 2020 allows you to create your F1\u00ae team &amp; whatever"

$sDecoded = _Encoding_JavaUnicodeDecode($sText)
$sDecoded = _HTMLEntities_Decode($sDecoded)

MsgBox(64, @ScriptName, $sDecoded)

Func _Encoding_JavaUnicodeDecode($sString)
    Local $sRet = ''
    Local $aString = StringRegExp($sString, "(\\\\|\\'|\\u[[:xdigit:]]{4}|[[:ascii:]])", 3)
    
    If @error Then
        Return SetError(1, 0, '')
    EndIf
    
    For $i = 0 To UBound($aString) - 1
        Switch StringLen($aString[$i])
            Case 1
                $sRet &= $aString[$i]
            Case 2
                $sRet &= StringRight($aString[$i], 1)
            Case 6
                $sRet &= ChrW(Dec(StringRight($aString[$i], 4)))
        EndSwitch
    Next
    
    Return $sRet
EndFunc

Func _HTMLEntities_Decode($sTxt)
    Local $aTxt, $iAsc, $sChr
    
    Local Const $aisEntities[246][2] = [[34, 'quot'], [38, 'amp'], [39, 'apos'], [60, 'lt'], [62, 'gt'], [160, 'nbsp'], [161, 'iexcl'], [162, 'cent'], [163, 'pound'], [164, 'curren'], [165, 'yen'], [166, 'brvbar'], [167, 'sect'], [168, 'uml'], [169, 'copy'], [170, 'ordf'], [171, 'laquo'], [172, 'not'], [173, 'shy'], [174, 'reg'], [175, 'macr'], [176, 'deg'], [177, 'plusmn'], [180, 'acute'], [181, 'micro'], [182, 'para'], [183, 'middot'], [184, 'cedil'], [186, 'ordm'], [187, 'raquo'], [191, 'iquest'], [192, 'Agrave'], [193, 'Aacute'], [194, 'Acirc'], [195, 'Atilde'], [196, 'Auml'], [197, 'Aring'], [198, 'AElig'], [199, 'Ccedil'], [200, 'Egrave'], [201, 'Eacute'], [202, 'Ecirc'], [203, 'Euml'], [204, 'Igrave'], [205, 'Iacute'], [206, 'Icirc'], [207, 'Iuml'], [208, 'ETH'], [209, 'Ntilde'], [210, 'Ograve'], [211, 'Oacute'], [212, 'Ocirc'], [213, 'Otilde'], [214, 'Ouml'], [215, 'times'], [216, 'Oslash'], [217, 'Ugrave'], [218, 'Uacute'], [219, 'Ucirc'], [220, 'Uuml'], [221, 'Yacute'], [222, 'THORN'], [223, 'szlig'], [224, 'agrave'], [225, 'aacute'], [226, 'acirc'], [227, 'atilde'], [228, 'auml'], [229, 'aring'], [230, 'aelig'], [231, 'ccedil'], [232, 'egrave'], [233, 'eacute'], [234, 'ecirc'], [235, 'euml'], [236, 'igrave'], [237, 'iacute'], [238, 'icirc'], [239, 'iuml'], [240, 'eth'], [241, 'ntilde'], [242, 'ograve'], [243, 'oacute'], [244, 'ocirc'], [245, 'otilde'], [246, 'ouml'], [247, 'divide'], [248, 'oslash'], [249, 'ugrave'], [250, 'uacute'], [251, 'ucirc'], [252, 'uuml'], [253, 'yacute'], [254, 'thorn'], [255, 'yuml'], [338, 'OElig'], [339, 'oelig'], [352, 'Scaron'], [353, 'scaron'], [376, 'Yuml'], [402, 'fnof'], [710, 'circ'], [732, 'tilde'], [913, 'Alpha'], [914, 'Beta'], [915, 'Gamma'], [916, 'Delta'], [917, 'Epsilon'], [918, 'Zeta'], [919, 'Eta'], [920, 'Theta'], [921, 'Iota'], [922, 'Kappa'], [923, 'Lambda'], [924, 'Mu'], [925, 'Nu'], [926, 'Xi'], [927, 'Omicron'], [928, 'Pi'], [929, 'Rho'], [931, 'Sigma'], [932, 'Tau'], [933, 'Upsilon'], [934, 'Phi'], _
[935, 'Chi'], [936, 'Psi'], [937, 'Omega'], [945, 'alpha'], [946, 'beta'], [947, 'gamma'], [948, 'delta'], [949, 'epsilon'], [950, 'zeta'], [951, 'eta'], [952, 'theta'], [953, 'iota'], [954, 'kappa'], [955, 'lambda'], [956, 'mu'], [957, 'nu'], [958, 'xi'], [959, 'omicron'], [960, 'pi'], [961, 'rho'], [962, 'sigmaf'], [963, 'sigma'], [964, 'tau'], [965, 'upsilon'], [966, 'phi'], [967, 'chi'], [968, 'psi'], [969, 'omega'], [977, 'thetasym'], [978, 'upsih'], [982, 'piv'], [8194, 'ensp'], [8195, 'emsp'], [8201, 'thinsp'], [8204, 'zwnj'], [8205, 'zwj'], [8206, 'lrm'], [8207, 'rlm'], [8211, 'ndash'], [8212, 'mdash'], [8216, 'lsquo'], [8217, 'rsquo'], [8218, 'sbquo'], [8220, 'ldquo'], [8221, 'rdquo'], [8222, 'bdquo'], [8224, 'dagger'], [8225, 'Dagger'], [8226, 'bull'], [8230, 'hellip'], [8240, 'permil'], [8242, 'prime'], [8243, 'Prime'], [8249, 'lsaquo'], [8250, 'rsaquo'], [8254, 'oline'], [8260, 'frasl'], [8364, 'euro'], [8465, 'image'], [8472, 'weierp'], [8476, 'real'], [8482, 'trade'], [8501, 'alefsym'], [8592, 'larr'], [8593, 'uarr'], [8594, 'rarr'], [8595, 'darr'], [8596, 'harr'], [8629, 'crarr'], [8656, 'lArr'], [8657, 'uArr'], [8658, 'rArr'], [8659, 'dArr'], [8660, 'hArr'], [8704, 'forall'], [8706, 'part'], [8707, 'exist'], [8709, 'empty'], [8711, 'nabla'], [8712, 'isin'], [8713, 'notin'], [8715, 'ni'], [8719, 'prod'], [8721, 'sum'], [8722, 'minus'], [8727, 'lowast'], [8730, 'radic'], [8733, 'prop'], [8734, 'infin'], [8736, 'ang'], [8743, 'and'], [8744, 'or'], [8745, 'cap'], [8746, 'cup'], [8747, 'int'], [8764, 'sim'], [8773, 'cong'], [8776, 'asymp'], [8800, 'ne'], [8801, 'equiv'], [8804, 'le'], [8805, 'ge'], [8834, 'sub'], [8835, 'sup'], [8836, 'nsub'], [8838, 'sube'], [8839, 'supe'], [8853, 'oplus'], [8855, 'otimes'], [8869, 'perp'], [8901, 'sdot'], [8968, 'lceil'], [8969, 'rceil'], [8970, 'lfloor'], [8971, 'rfloor'], [9001, 'lang'], [9002, 'rang'], [9674, 'loz'], [9824, 'spades'], [9827, 'clubs'], [9829, 'hearts'], [9830, 'diams']]
    
    For $i = 0 To 245
        $sTxt = StringReplace($sTxt, '&' & $aisEntities[$i][1] & ';', ChrW($aisEntities[$i][0]), 0, 1)
    Next
    
    $aTxt = StringRegExp($sTxt, '(&#\d+;)', 3)
    
    For $i = 0 To UBound($aTxt)-1
        $iAsc = StringRegExpReplace($aTxt[$i], '\D+', '')
        
        If $iAsc > 255 Then
            $sChr = ChrW($iAsc)
        Else
            $sChr = Chr($iAsc)
        EndIf
        
        $sTxt = StringReplace($sTxt, $aTxt[$i], $sChr)
    Next
    
    Return $sTxt
EndFunc
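
The numeric branch above picks up decimal references such as &#174; only. If the data can also contain hexadecimal references like &#xAE;, a loop along the following lines might cover both forms without Execute. This is only a sketch: _DecodeNumericEntities is a made-up name, not part of any UDF, and code points above 65535 are skipped because ChrW only accepts values up to 65535.

```autoit
; Hypothetical helper (name invented for this sketch): decodes &#NNN; and &#xHHHH; references
Func _DecodeNumericEntities($sText)
    ; Collect the part between "&#" and ";" of every decimal or hexadecimal reference
    Local $aRefs = StringRegExp($sText, '&#([0-9]+|[xX][[:xdigit:]]+);', 3)
    If @error Then Return $sText ; no references found, nothing to do

    Local $iCode
    For $i = 0 To UBound($aRefs) - 1
        If StringLeft($aRefs[$i], 1) = 'x' Then ; "=" compares case-insensitively, so "X" matches too
            $iCode = Dec(StringTrimLeft($aRefs[$i], 1))
        Else
            $iCode = Number($aRefs[$i])
        EndIf

        ; ChrW only accepts 0..65535; leave anything else untouched
        If $iCode < 1 Or $iCode > 65535 Then ContinueLoop

        $sText = StringReplace($sText, '&#' & $aRefs[$i] & ';', ChrW($iCode), 0, 1)
    Next

    Return $sText
EndFunc

ConsoleWrite(_DecodeNumericEntities('F1&#174; and F1&#xAE; &#064; &#084;&#072;&#088;') & @CRLF)
```

A reference that occurs several times is simply replaced on its first pass; later passes for the same text find nothing left to replace.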

 

Edited by MrCreatoR

 

Spoiler

Using OS: Win 7 Professional, Using AutoIt Ver(s): 3.3.6.1 / 3.3.8.1

AutoIt_Rus_Community.png AutoIt Russian Community

My Work...

Spoiler

AutoIt_Icon_small.pngProjects: ATT - Application Translate Tool {new}| BlockIt - Block files & folders {new}| SIP - Selected Image Preview {new}| SISCABMAN - SciTE Abbreviations Manager {new}| AutoIt Path Switcher | AutoIt Menu for Opera! | YouTube Download Center! | Desktop Icons Restorator | Math Tasks | KeyBoard & Mouse Cleaner | CaptureIt - Capture Images Utility | CheckFileSize Program

AutoIt_Icon_small.pngUDFs: OnAutoItErrorRegister - Handle AutoIt critical errors {new}| AutoIt Syntax Highlight {new}| Opera Library! | Winamp Library | GetFolderToMenu | Custom_InputBox()! | _FileRun UDF | _CheckInput() UDF | _GUIInputSetOnlyNumbers() UDF | _FileGetValidName() UDF | _GUICtrlCreateRadioCBox UDF | _GuiCreateGrid() | _PathSplitByRegExp() | _GUICtrlListView_MoveItems - UDF | GUICtrlSetOnHover_UDF! | _ControlTab UDF! | _MouseSetOnEvent() UDF! | _ProcessListEx - UDF | GUICtrl_SetResizing - UDF! | Mod. for _IniString UDFs | _StringStripChars UDF | _ColorIsDarkShade UDF | _ColorConvertValue UDF | _GUICtrlTab_CoverBackground | CUI_App_UDF | _IncludeScripts UDF | _AutoIt3ExecuteCode | _DragList UDF | Mod. for _ListView_Progress | _ListView_SysLink | _GenerateRandomNumbers | _BlockInputEx | _IsPressedEx | OnAutoItExit Handler | _GUICtrlCreateTFLabel UDF | WinControlSetEvent UDF | Mod. for _DirGetSizeEx UDF
 
AutoIt_Icon_small.pngExamples: 
ScreenSaver Demo - Matrix included | Gui Drag Without pause the script | _WinAttach()! | Turn Off/On Monitor | ComboBox Handler Example | Mod. for "Thinking Box" | Cool "About" Box | TasksBar Imitation Demo

Like the Projects/UDFs/Examples? Please rate the topic (up-right corner of the post header: Rating AutoIt_Rating.gif)

* === My topics === *

==================================================
My_Userbar.gif
==================================================

 

 

 

AutoIt is simple, subtle, elegant. © AutoIt Team


Is it Chinese?  :blink:

$gui = GuiCreate("test", 500, 100)
$label = GUICtrlCreateLabel("", 10, 10, 490, 20)
GUICtrlSetFont(-1, 9, 0, 0, "Arial Unicode MS")
GuiSetState()

Local $string = '\u4e3a\u4e86\u9632\u6b62\u5723\u7269\u91d1\u5777\u5783\u843d\u5165\u4fb5\u7565\u8005\u7684\u624b\u4e2d\uff0c\u5723\u5730\u4e9a\u6208\u5927\u9646\u4e0a\u7684\u6218\u58eb\u4eec\u7eb7\u7eb7\u633a\u8eab\u800c\u51fa\u3002'

$string = Execute('"' & StringRegExpReplace($string, '\\u([[:xdigit:]]{4})', '" & ChrW(0x$1) & "') & '"')
GuiCtrlSetData($label, $string)

While GuiGetMsg()<>-3
Wend
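
A note on why this works and why the attempts in the first post did not: AutoIt evaluates the arguments of a call before the call itself, so Execute("ChrW(Dec('$1'))") runs once, on the literal text $1, before StringRegExpReplace has matched anything. The version above reverses the order. StringRegExpReplace first rewrites the text into the source of an AutoIt string expression, and Execute then evaluates that expression. One caveat: a double quote in the input ends the generated string literal early, so Execute fails and returns an empty string, and because the text comes from a download it could in principle smuggle a function call into the expression. The sketch below doubles the quotes first, which should keep them literal. _UnescapeViaExecute is a name made up for illustration.

```autoit
; Hypothetical wrapper (name invented for this sketch) around the Execute trick shown above
Func _UnescapeViaExecute($sText)
    ; Double every literal quote so it stays a literal inside the generated string expression
    $sText = StringReplace($sText, '"', '""', 0, 1)

    ; Turn each \uXXXX into:  " & ChrW(0xXXXX) & "   and wrap the whole text in quotes
    Local $sExpr = '"' & StringRegExpReplace($sText, '\\u([[:xdigit:]]{4})', '" & ChrW(0x$1) & "') & '"'

    ; Show the expression that Execute is about to evaluate
    ConsoleWrite($sExpr & @CRLF)

    Return Execute($sExpr)
EndFunc

ConsoleWrite(_UnescapeViaExecute('F1\u00ae says "hello"') & @CRLF)
```

For the sample input, the generated expression is "F1" & ChrW(0x00ae) & " says ""hello""", which is a valid AutoIt expression again.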

 


Very nice, mikell!

That's exactly what I was looking for. I knew it was doable with Execute, I just got confused with the quotation marks.

...and yes, this is Simplified Chinese :)

 

@MrCreatoR: thanks for your effort, but this is pretty much the way I wanted to avoid. It's more or less the same as the second solution in my post.


Here's my final (?) solution, which should cover all the encodings I ran into: \uXXXX escapes, named HTML entities and decimal numeric entities such as &#064;. Maybe it's useful for someone:

$sText = '\u4e3a\u4e86\u9632\u6b62\u5723\u7269\u91d1\u5777\u5783\u843d\u5165\u4fb5\u7565\u8005\u7684\u624b\u4e2d\uff0c\u5723\u5730\u4e9a\u6208\u5927\u9646\u4e0a\u7684\u6218\u58eb\u4eec\u7eb7\u7eb7\u633a\u8eab\u800c\u51fa\u3002' & _
                'F1\u00ae 2020 allows you to create your F1\u00ae team &amp; whatever &#064; &#084;&#072;&#088; &#071;&#085;&#089;&#083;!!!'
MsgBox(0, "", _ConvertHTML($sText))

Func _ConvertHTML($sText)
    Local Const $aisEntities[][2] = [[34, 'quot'], [38, 'amp'], [39, 'apos'], [60, 'lt'], [62, 'gt'], [160, 'nbsp'], [161, 'iexcl'], [162, 'cent'], [163, 'pound'], [164, 'curren'], [165, 'yen'], [166, 'brvbar'], [167, 'sect'], [168, 'uml'], [169, 'copy'], [170, 'ordf'], [171, 'laquo'], [172, 'not'], [173, 'shy'], [174, 'reg'], [175, 'macr'], [176, 'deg'], [177, 'plusmn'], [180, 'acute'], [181, 'micro'], [182, 'para'], [183, 'middot'], [184, 'cedil'], [186, 'ordm'], [187, 'raquo'], [191, 'iquest'], [192, 'Agrave'], [193, 'Aacute'], [194, 'Acirc'], [195, 'Atilde'], [196, 'Auml'], [197, 'Aring'], [198, 'AElig'], [199, 'Ccedil'], [200, 'Egrave'], [201, 'Eacute'], [202, 'Ecirc'], [203, 'Euml'], [204, 'Igrave'], [205, 'Iacute'], [206, 'Icirc'], [207, 'Iuml'], [208, 'ETH'], [209, 'Ntilde'], [210, 'Ograve'], [211, 'Oacute'], [212, 'Ocirc'], [213, 'Otilde'], [214, 'Ouml'], [215, 'times'], [216, 'Oslash'], [217, 'Ugrave'], [218, 'Uacute'], [219, 'Ucirc'], [220, 'Uuml'], [221, 'Yacute'], [222, 'THORN'], [223, 'szlig'], [224, 'agrave'], [225, 'aacute'], [226, 'acirc'], [227, 'atilde'], [228, 'auml'], [229, 'aring'], [230, 'aelig'], [231, 'ccedil'], [232, 'egrave'], [233, 'eacute'], [234, 'ecirc'], [235, 'euml'], [236, 'igrave'], [237, 'iacute'], [238, 'icirc'], [239, 'iuml'], [240, 'eth'], [241, 'ntilde'], [242, 'ograve'], [243, 'oacute'], [244, 'ocirc'], [245, 'otilde'], [246, 'ouml'], [247, 'divide'], [248, 'oslash'], [249, 'ugrave'], [250, 'uacute'], [251, 'ucirc'], [252, 'uuml'], [253, 'yacute'], [254, 'thorn'], [255, 'yuml'], [338, 'OElig'], [339, 'oelig'], [352, 'Scaron'], [353, 'scaron'], [376, 'Yuml'], [402, 'fnof'], [710, 'circ'], [732, 'tilde'], [913, 'Alpha'], [914, 'Beta'], [915, 'Gamma'], [916, 'Delta'], [917, 'Epsilon'], [918, 'Zeta'], [919, 'Eta'], [920, 'Theta'], [921, 'Iota'], [922, 'Kappa'], [923, 'Lambda'], [924, 'Mu'], [925, 'Nu'], [926, 'Xi'], [927, 'Omicron'], [928, 'Pi'], [929, 'Rho'], [931, 'Sigma'], [932, 'Tau'], [933, 'Upsilon'], [934, 'Phi'], [935, _
'Chi'], [936, 'Psi'], [937, 'Omega'], [945, 'alpha'], [946, 'beta'], [947, 'gamma'], [948, 'delta'], [949, 'epsilon'], [950, 'zeta'], [951, 'eta'], [952, 'theta'], [953, 'iota'], [954, 'kappa'], [955, 'lambda'], [956, 'mu'], [957, 'nu'], [958, 'xi'], [959, 'omicron'], [960, 'pi'], [961, 'rho'], [962, 'sigmaf'], [963, 'sigma'], [964, 'tau'], [965, 'upsilon'], [966, 'phi'], [967, 'chi'], [968, 'psi'], [969, 'omega'], [977, 'thetasym'], [978, 'upsih'], [982, 'piv'], [8194, 'ensp'], [8195, 'emsp'], [8201, 'thinsp'], [8204, 'zwnj'], [8205, 'zwj'], [8206, 'lrm'], [8207, 'rlm'], [8211, 'ndash'], [8212, 'mdash'], [8216, 'lsquo'], [8217, 'rsquo'], [8218, 'sbquo'], [8220, 'ldquo'], [8221, 'rdquo'], [8222, 'bdquo'], [8224, 'dagger'], [8225, 'Dagger'], [8226, 'bull'], [8230, 'hellip'], [8240, 'permil'], [8242, 'prime'], [8243, 'Prime'], [8249, 'lsaquo'], [8250, 'rsaquo'], [8254, 'oline'], [8260, 'frasl'], [8364, 'euro'], [8465, 'image'], [8472, 'weierp'], [8476, 'real'], [8482, 'trade'], [8501, 'alefsym'], [8592, 'larr'], [8593, 'uarr'], [8594, 'rarr'], [8595, 'darr'], [8596, 'harr'], [8629, 'crarr'], [8656, 'lArr'], [8657, 'uArr'], [8658, 'rArr'], [8659, 'dArr'], [8660, 'hArr'], [8704, 'forall'], [8706, 'part'], [8707, 'exist'], [8709, 'empty'], [8711, 'nabla'], [8712, 'isin'], [8713, 'notin'], [8715, 'ni'], [8719, 'prod'], [8721, 'sum'], [8722, 'minus'], [8727, 'lowast'], [8730, 'radic'], [8733, 'prop'], [8734, 'infin'], [8736, 'ang'], [8743, 'and'], [8744, 'or'], [8745, 'cap'], [8746, 'cup'], [8747, 'int'], [8764, 'sim'], [8773, 'cong'], [8776, 'asymp'], [8800, 'ne'], [8801, 'equiv'], [8804, 'le'], [8805, 'ge'], [8834, 'sub'], [8835, 'sup'], [8836, 'nsub'], [8838, 'sube'], [8839, 'supe'], [8853, 'oplus'], [8855, 'otimes'], [8869, 'perp'], [8901, 'sdot'], [8968, 'lceil'], [8969, 'rceil'], [8970, 'lfloor'], [8971, 'rfloor'], [9001, 'lang'], [9002, 'rang'], [9674, 'loz'], [9824, 'spades'], [9827, 'clubs'], [9829, 'hearts'], [9830, 'diams']]
    For $i = 0 To UBound($aisEntities) - 1
        $sText = StringReplace($sText, "&" & $aisEntities[$i][1] & ";", ChrW($aisEntities[$i][0]), 0 , 1)
    Next
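    ; Double literal quotes (e.g. from &quot;) so the expression built below stays valid for Execute
    $sText = StringReplace($sText, '"', '""', 0, 1)
    ; Rewrite \uXXXX escapes and decimal numeric entities of any length as ChrW() calls, then evaluate once
    $sText = StringRegExpReplace($sText, '\\u([[:xdigit:]]{4})', '" & ChrW(0x$1) & "')
    $sText = StringRegExpReplace($sText, '&#(\d+);', '" & ChrW($1) & "')
    Return Execute('"' & $sText & '"')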
EndFunc

Since I didn't find a short algorithm to decode HTML entities (tried something with Execute and an Object dictionary), I "borrowed" the array containing the codes from MrCreatoR (thx m8).
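
For what it's worth, the dictionary idea can be made to work without Execute. The following is a rough sketch under a few assumptions: it uses the Windows Scripting.Dictionary COM object (whose keys compare case-sensitively by default, which matters for pairs like Uuml/uuml), it expects a [code point, name] table laid out like the borrowed array, and the function names are invented for this example. Only the names that actually occur in the text are looked up, instead of running one StringReplace per table row.

```autoit
; Hypothetical helper: build the name -> code point lookup once from a [code point, name] table
Func _BuildEntityDict($aEntities)
    Local $oDict = ObjCreate("Scripting.Dictionary") ; keys are case-sensitive by default
    For $i = 0 To UBound($aEntities) - 1
        $oDict.Add($aEntities[$i][1], $aEntities[$i][0])
    Next
    Return $oDict
EndFunc

; Hypothetical helper: replace every &name; whose name is a key in the dictionary
Func _DecodeNamedEntities($sText, $oEntities)
    Local $aNames = StringRegExp($sText, '&([A-Za-z][A-Za-z0-9]*);', 3)
    If @error Then Return $sText ; no named entities in the text

    For $i = 0 To UBound($aNames) - 1
        ; Unknown names are left exactly as they were
        If $oEntities.Exists($aNames[$i]) Then
            $sText = StringReplace($sText, '&' & $aNames[$i] & ';', ChrW($oEntities.Item($aNames[$i])), 0, 1)
        EndIf
    Next

    Return $sText
EndFunc

; Small sample table; the full array from the function above has the same layout
Global $aSample[4][2] = [[38, 'amp'], [174, 'reg'], [220, 'Uuml'], [252, 'uuml']]
Global $oSample = _BuildEntityDict($aSample)
ConsoleWrite(_DecodeNamedEntities('F1&reg; &amp; &Uuml;ber &unknown;', $oSample) & @CRLF)
```

Whether this is faster than the plain loop probably depends on the input; for short strings with a handful of entities it avoids a couple of hundred StringReplace calls.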

Thanks a lot for your help, guys!

WinWiesel


10 hours ago, WinWiesel said:

this is pretty much the way I wanted to avoid

Why?

 


11 hours ago, mikell said:

Is it Chinese?  :blink:

$gui = GuiCreate("test", 500, 100)
$label = GUICtrlCreateLabel("", 10, 10, 490, 20)
GUICtrlSetFont(-1, 9, 0, 0, "Arial Unicode MS")
GuiSetState()

Local $string = '\u4e3a\u4e86\u9632\u6b62\u5723\u7269\u91d1\u5777\u5783\u843d\u5165\u4fb5\u7565\u8005\u7684\u624b\u4e2d\uff0c\u5723\u5730\u4e9a\u6208\u5927\u9646\u4e0a\u7684\u6218\u58eb\u4eec\u7eb7\u7eb7\u633a\u8eab\u800c\u51fa\u3002'

$string = Execute('"' & StringRegExpReplace($string, '\\u([[:xdigit:]]{4})', '" & ChrW(0x$1) & "') & '"')
GuiCtrlSetData($label, $string)

While GuiGetMsg()<>-3
Wend

 

Nice solution!

 


19 hours ago, WinWiesel said:

Basically, I'm writing a tool that downloads a JSON file with the Steam Web API, and I want to display parts of it in my GUI (I use the WinHTTP UDF to download).

Sometimes the JSON file contains HTML encoding strings.

@WinWiesel

If you are working with JSON data, are you aware that JSON parsers and processors automatically encode and decode JSON as needed?  It is a part of the specification.  Below is a brief example that takes encoded JSON,  decodes it, displays the decoded JSON fields, adds another JSON field and redisplays the encoded JSON.  Notice that string3 is entered in its original form and encoded by the parser.  Json.au3 is a JSON parser based on JSMN.  You can read more about it in the associated link that I provided in the example.

#include <Constants.au3>
#include <json.au3> ;https://www.autoitscript.com/forum/topic/148114-a-non-strict-json-udf-jsmn/

Const $JSON = '{' & _
                  '"string1" : "\u4e3a\u4e86\u9632\u6b62\u5723\u7269\u91d1\u5777\u5783\u843d\u5165\u4fb5\u7565\u8005\u7684\u624b\u4e2d\uff0c\u5723\u5730\u4e9a\u6208\u5927\u9646\u4e0a\u7684\u6218\u58eb\u4eec\u7eb7\u7eb7\u633a\u8eab\u800c\u51fa\u3002",' & _
                  '"string2" : "F1\u00ae 2020 allows you to create your F1\u00ae team whatever"' & _
              '}'

Global $oJson = Json_Decode($JSON)

;Display encoded JSON
WriteNotepadLogLine("Original Encoded JSON")
WriteNotepadLogLine(Json_Encode($oJson, $JSON_PRETTY_PRINT))

;Parse and display JSON fields
WriteNotepadLogLine()
WriteNotepadLogLine("Parse decoded JSON")
WriteNotepadLogLine("String 1 = " & Json_Get($oJson, ".string1"))
WriteNotepadLogLine("String 2 = " & Json_Get($oJson, ".string2"))

;Add a 3rd JSON field and redisplay encoded JSON
Json_Put($oJson, ".string3", "为了防止圣物金坷垃落入侵略者的手中")
WriteNotepadLogLine()
WriteNotepadLogLine("Encoded JSON with New Field")
WriteNotepadLogLine(Json_Encode($oJson, $JSON_PRETTY_PRINT))

Func WriteNotepadLogLine($sMsg = "")
    Const $TITLE_NOTEPAD = "[RegExpTitle:Untitled - Notepad]"
    Static $hWndNotepad = -1

    ;If we don't have a handle to notepad yet
    If $hWndNotepad = -1 Then
        ;If there isn't an existing instance of notepad running, launch one
        If Not WinExists($TITLE_NOTEPAD) Then Run("Notepad.exe")

        ;Get handle to notepad window
        $hWndNotepad = WinWait($TITLE_NOTEPAD, "", 3)
        If Not $hWndNotepad Then Exit MsgBox($MB_ICONERROR, "ERROR", "Unable to find Notepad window.")
    EndIf

    ;Write to Notepad
    If WinExists($hWndNotepad) Then ControlCommand($hWndNotepad, "", "Edit1", "EditPaste", $sMsg & @CRLF)
EndFunc

Output:

Original Encoded JSON
{
    "string1": "\u4e3a\u4e86\u9632\u6b62\u5723\u7269\u91d1\u5777\u5783\u843d\u5165\u4fb5\u7565\u8005\u7684\u624b\u4e2d\uff0c\u5723\u5730\u4e9a\u6208\u5927\u9646\u4e0a\u7684\u6218\u58eb\u4eec\u7eb7\u7eb7\u633a\u8eab\u800c\u51fa\u3002",
    "string2": "F1® 2020 allows you to create your F1® team whatever"
}

Parse decoded JSON
String 1 = 为了防止圣物金坷垃落入侵略者的手中，圣地亚戈大陆上的战士们纷纷挺身而出。
String 2 = F1® 2020 allows you to create your F1® team whatever

Encoded JSON with New Field
{
    "string1": "\u4e3a\u4e86\u9632\u6b62\u5723\u7269\u91d1\u5777\u5783\u843d\u5165\u4fb5\u7565\u8005\u7684\u624b\u4e2d\uff0c\u5723\u5730\u4e9a\u6208\u5927\u9646\u4e0a\u7684\u6218\u58eb\u4eec\u7eb7\u7eb7\u633a\u8eab\u800c\u51fa\u3002",
    "string2": "F1® 2020 allows you to create your F1® team whatever",
    "string3": "\u4e3a\u4e86\u9632\u6b62\u5723\u7269\u91d1\u5777\u5783\u843d\u5165\u4fb5\u7565\u8005\u7684\u624b\u4e2d"
}

 

Edited by TheXman

6 hours ago, TheXman said:

@WinWiesel

If you are working with JSON data, are you aware that JSON parsers and processors automatically encode and decode JSON as needed?  It is a part of the specification.  Below is a brief example that takes encoded JSON,  decodes it, displays the decoded JSON fields, adds another JSON field and redisplays the encoded JSON.  Notice that string3 is entered in its original form and encoded by the parser.  Json.au3 is a JSON parser based on JSMN.  You can read more about it in the associated link that I provided in the example.

 

Thx a lot Xman!

I'll have a look at this parser, though I'm pretty satisfied with my program so far. I get the needed information from the JSON file with a couple of regexes, and the decoding is done by my code posted above.

But it's good to know that these parsers exist. When I need something more complex, I'll try them out...

 

Best regards, WinWiesel!
