AutID

BinaryToString weird characters after convert

16 posts in this topic

#1 ·  Posted (edited)

Local $URL = "http://translate.google.com/translate_a/t?client=t&sl=auto&text=hometown+&tl=el"
Local $dData = InetRead($URL)
;ConsoleWrite($dData & @CRLF)
Local $sData = BinaryToString($dData)
ConsoleWrite($sData & @LF)

After converting from binary, the string looks like this: ðáôñßäá.

This apparently happens because the string contains French or Greek characters (or Bulgarian, Chinese and other languages with non-Latin characters) and they are not displayed correctly.

What can I do about this?

 

Edited by AutID


#2 ·  Posted (edited)

Hi,

Take a look at the flag parameter of the BinaryToString function.

Br, FireFox.

Edited by FireFox

 



#4 ·  Posted (edited)

$sText = "http://translate.google.com/translate_a/t?client=t&sl=auto&text=hometown+&tl=el"

hometown+ is the URL-encoded text to translate (the + is an encoded trailing space).

el at the end is the target language, Greek.

auto in the middle turns on source language auto-detection. It could also be en for English, since I am translating from English into other languages.
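As a rough sketch of how those three parts fit together, the URL could be assembled from its pieces like this. The function names _BuildTranslateUrl and _UrlEncodeUtf8 are made up for illustration and are not part of AutoIt or of any UDF mentioned in this thread. The encoder writes a space as %20 rather than +.

```autoit
; Hypothetical helper: percent-encode a string as UTF-8 for use in a URL query.
Func _UrlEncodeUtf8($sText)
    ; Hex digits of the UTF-8 bytes, without the leading "0x".
    Local $sHex = StringTrimLeft(String(StringToBinary($sText, 4)), 2)
    Local $sOut = "", $sPair, $iByte
    For $i = 1 To StringLen($sHex) Step 2
        $sPair = StringMid($sHex, $i, 2)
        $iByte = Dec($sPair)
        ; Unreserved ASCII characters are copied as-is, everything else becomes %XX.
        If $iByte < 128 And StringRegExp(Chr($iByte), "^[A-Za-z0-9._~-]$") Then
            $sOut &= Chr($iByte)
        Else
            $sOut &= "%" & $sPair
        EndIf
    Next
    Return $sOut
EndFunc

; Hypothetical helper: build the request URL used in this thread.
; $sTo is the target language (tl), $sFrom the source language (sl).
Func _BuildTranslateUrl($sText, $sTo, $sFrom = "auto")
    Return "http://translate.google.com/translate_a/t?client=t&sl=" & $sFrom & _
            "&text=" & _UrlEncodeUtf8($sText) & "&tl=" & $sTo
EndFunc

ConsoleWrite(_BuildTranslateUrl("hometown ", "el") & @CRLF)
```

With the trailing space included, the text parameter comes out as hometown%20, which should be equivalent to the hometown+ in the URL above.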

 

Edit: if I use WinHttp instead of InetRead and read the response text, which converts the binary to a string automatically, I get the right characters.

Edited by AutID


#5 ·  Posted (edited)

Anyone up for this? It must be the BinaryToString function; I don't think InetRead would mess it up.
Again, if I use WinHttp instead of InetRead it works fine, but I don't like doing it that way because I am using the WinHttp object, not WinHttp.au3, and I cannot check for errors or anything since it is not pure AutoIt.
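For reference, errors from the WinHttp COM object can usually be caught from script code by registering a COM error handler with ObjEvent. The following is only a sketch under assumptions: the helper names _HttpGetText and _OnComError and the global flag are invented for illustration, the request is synchronous, and any status other than 200 is treated as a failure.

```autoit
; Flag set by the COM error handler so the caller can tell that a call failed.
Global $g_bComError = False
; Route COM errors to a handler function. Keep the returned object alive.
Global $g_oComErrorHandler = ObjEvent("AutoIt.Error", "_OnComError")

Local $sUrl = "http://translate.google.com/translate_a/t?client=t&sl=auto&text=hometown+&tl=el"
Local $sText = _HttpGetText($sUrl)
If @error Then
    ConsoleWrite("Request failed, @error = " & @error & ", @extended = " & @extended & @CRLF)
Else
    MsgBox(0, "Response", $sText)
EndIf

; Hypothetical helper: GET a URL through the WinHttp COM object.
; @error: 1 = object not created, 2 = COM error during the request, 3 = HTTP status not 200.
Func _HttpGetText($sRequestUrl)
    Local $oHttp = ObjCreate("WinHttp.WinHttpRequest.5.1")
    If Not IsObj($oHttp) Then Return SetError(1, 0, "")

    $g_bComError = False
    $oHttp.Open("GET", $sRequestUrl, False) ; False = synchronous request
    $oHttp.SetRequestHeader("User-Agent", "Mozilla/5.0")
    $oHttp.Send()
    If $g_bComError Then Return SetError(2, 0, "")

    Local $iStatus = $oHttp.Status
    If $g_bComError Then Return SetError(2, 0, "")
    If $iStatus <> 200 Then Return SetError(3, $iStatus, "")
    Return $oHttp.ResponseText
EndFunc

; Called by AutoIt whenever a COM call fails.
Func _OnComError($oError)
    $g_bComError = True
    ConsoleWrite("COM error 0x" & Hex($oError.number) & ": " & $oError.description & @CRLF)
EndFunc
```

The global flag is used instead of testing @error after each COM call so that the sketch does not depend on how a particular AutoIt version sets @error once the handler returns.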

Edited by AutID


Since the encoding is almost always UTF-8, you should decode it like this:

$sHTMLsource = BinaryToString($sHTMLsource, 4) ; 4 = decode as UTF-8
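To illustrate what the flag changes, here is a small offline sketch. It assumes the bytes below, which are the UTF-8 encoding of the Greek word from this thread written out by hand, and decodes them once as UTF-8 and once with the default ANSI flag. Only the string lengths are printed, because the console itself may not display Greek text.

```autoit
; UTF-8 bytes of the seven-letter Greek word (two bytes per letter).
Local $dUtf8 = Binary("0xCF80CEB1CF84CF81CEAFCEB4CEB1")

; Flag 4 (UTF-8): each two-byte sequence becomes one Greek letter.
Local $sUtf8 = BinaryToString($dUtf8, 4)

; Flag 1 (ANSI, the default): the bytes are interpreted in the system ANSI
; code page, so on a single-byte code page every byte becomes its own character.
Local $sAnsi = BinaryToString($dUtf8, 1)

ConsoleWrite("Decoded as UTF-8: " & StringLen($sUtf8) & " characters" & @CRLF) ; 7
ConsoleWrite("Decoded as ANSI:  " & StringLen($sAnsi) & " characters" & @CRLF)
```

This only helps when the downloaded bytes really are UTF-8. If the server sends another encoding, flag 4 will not recover the text.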

#include <MsgBoxConstants.au3>

Local $URL = "http://translate.google.com/translate_a/t?client=t&sl=auto&text=hometown+&tl=el"
Local $dData = InetRead($URL)
;ConsoleWrite($dData & @CRLF)
Local $sData = BinaryToString($dData, 4)

MsgBox($MB_SYSTEMMODAL, "Title", $sData)


#11 ·  Posted (edited)

The correct output string should look like this:

[[["πατρίδα","hometown","patrída",""]],,"en",,[["πατρίδα",[1],true,false,572,0,1,0]],[["hometown",1,[["πατρίδα",572,true,false],["γενέτειρά",276,true,false],["hometown",128,true,false],["πόλη",13,true,false],["την πατρίδα",8,true,false]],[[0,8]],"hometown"]],,,[["en"]],5]

It seems that none of the attempts posted above produces this string.
Please check before posting.
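One possible explanation, offered as an assumption rather than something confirmed in this thread: the garbled string in the first post is exactly what the Greek word looks like when it arrives in a single-byte Greek code page (ISO-8859-7 or Windows-1253) and is displayed on a Western (Windows-1252) system. In those Greek code pages πατρίδα is the seven bytes F0 E1 F4 F1 DF E4 E1, and Windows-1252 shows those same bytes as ðáôñßäá. If that is what happened, the data was never UTF-8 and flag 4 alone cannot repair it. A minimal sketch for checking what the server actually sent is to dump the first bytes as hex:

```autoit
Local $sUrl = "http://translate.google.com/translate_a/t?client=t&sl=auto&text=hometown+&tl=el"
Local $dData = InetRead($sUrl, 1) ; 1 = force a reload from the remote site
If @error Then
    ConsoleWrite("InetRead failed, @error = " & @error & @CRLF)
    Exit 1
EndIf

; Print the first 32 bytes as a hex string ("0x...").
; For comparison, the translated word would appear as
;   F0 E1 F4 F1 DF E4 E1                       in ISO-8859-7 / Windows-1253
;   CF 80 CE B1 CF 84 CF 81 CE AF CE B4 CE B1  in UTF-8
ConsoleWrite(BinaryMid($dData, 1, 32) & @CRLF)
```

Running the same dump with and without a call to HttpSetUserAgent beforehand would show whether the response encoding depends on the user agent.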

Edited by Chimp



As an alternative, you can always do something like this:

#include "WinHttp.au3"

MsgBox(4096, "", GoogleTranslate("hometown", "el"))


Func GoogleTranslate($sString, $sTo, $sFrom = "auto")
    Local $hOpen = _WinHttpOpen("Mozilla/5.0")
    Local $hConnect = _WinHttpConnect($hOpen, "https://translate.google.com")
    Local $sRead = _WinHttpSimpleSSLRequest($hConnect, Default, "translate_a/t?client=t&sl=" & $sFrom & "&text=" & __WinHttpURLEncode($sString) & "&tl=" & $sTo)
    _WinHttpCloseHandle($hConnect)
    _WinHttpCloseHandle($hOpen)
    Return $sRead
EndFunc

#13 ·  Posted (edited)

It appears to be down to how InetRead pulls down the data; the same happens with InetGet. The solution that trancexx posted works.

Edited by step887


Quoting AutID: "Again, if I use WinHttp instead of InetRead it works fine, but I don't like doing it that way because I am using the WinHttp object, not WinHttp.au3, and I cannot check for errors or anything since it is not pure AutoIt."

How do you do it?

Quoting step887: "The solution that trancexx posted works."

As always... :)



 

Quoting trancexx: "As an alternative, you can always do something like this:" (followed by the WinHttp.au3 GoogleTranslate example from the earlier post)

I love when you fucking do that girl ;)

 

Cheers everyone

 


I do like the alternative that trancexx provided. It highlights the importance of an HTTP user agent string that the server can recognize.

InetRead seems to be adequate with this example:

; Set http user agent.
$sUserAgent = _HttpSetUserAgent()
; Show http user agent being used
MsgBox(0, 'Http User Agent', $sUserAgent)
; Get binary data from this URL. 9 = 1 (force a reload from the remote site) + 8 (binary transfer mode, which applies to FTP).
$URL = "http://translate.google.com/translate_a/t?client=t&sl=auto&text=hometown+&tl=el"
$dData = InetRead($URL, 9)
; Convert the binary to a string. 4 = UTF8.
$sData = BinaryToString($dData, 4)
; Show the UTF8 string.
MsgBox(0, $URL, $sData)

Func _HttpSetUserAgent($sPreferredUserAgent = 'Mozilla/5.0', $bForcePreference = False)
    ; Set http user agent.
    Local $sUserAgent
    If Not $bForcePreference Then
        ; Places of reading the http user agent can vary. This is just one place.
        $sUserAgent = RegRead('HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings', 'User Agent')
    EndIf
    If $sUserAgent == '' Then
        Switch $sPreferredUserAgent
            Case Default, 'Mozilla/5.0'
                $sUserAgent = 'Mozilla/5.0'
            Case ''
                $sUserAgent = ''
            Case Else
                $sUserAgent = $sPreferredUserAgent
        EndSwitch
    EndIf
    HttpSetUserAgent($sUserAgent)
    Return SetError(@error, @extended, $sUserAgent)
EndFunc

; http://msdn.microsoft.com/en-us/library/ms537503%28v=vs.85%29.aspx
; Quote:
; "Note  The user-agent string should not be used to indicate the presence of optional software or features.
; Custom version vectors, which can be detected using conditional comments, provide a more appropriate mechanism."
;
; http://msdn.microsoft.com/en-us/library/hh869301%28v=vs.85%29.aspx
; Quote:
; "As with previous versions of Internet Explorer, portions of the user-agent string can vary according
; to the device running Internet Explorer, the operating system, and the environment."
;

I doubt many HTTP servers will be looking for a user agent string of "AutoIt", so IMO it is worth changing. Trident usually uses "Mozilla/4.0" or "Mozilla/5.0", possibly with a features string appended.

Output I get from testing:

---------------------------
Http User Agent
---------------------------
Mozilla/4.0 (compatible; MSIE 8.0; Win32)
---------------------------
OK   
---------------------------


---------------------------
http://translate.google.com/translate_a/t?client=t&sl=auto&text=hometown+&tl=el
---------------------------
[[["πατρίδα","hometown","patrída",""]],,"en",,[["πατρίδα",[1],true,false,572,0,1,0]],[["hometown",1,[["πατρίδα",572,true,false],["γενέτειρά",276,true,false],["hometown",128,true,false],["πόλη",13,true,false],["την πατρίδα",8,true,false]],[[0,8]],"hometown"]],,,[["en"]],3]
---------------------------
OK   
---------------------------