Problem With UTF-8


Hi,

I want to write a simple RSS reader for my site. Its RSS link is: www.cina.ir/feed

I found some UDFs and examples in the forum, such as RSS.au3.

But they have a problem with the site's language.

The site's language is Persian, encoded as UTF-8.

#include-once
#region _RSS
; RSS Reader
; Created By: Frostfel
#include <INet.au3>
#include <Array.au3>

; ============================================================================
; Function: _RSSGetInfo($RSS, $RSS_InfoS, $RSS_InfoE[, $RSS_Info_Start = 0])
; Description: Gets RSS Info
; Parameter(s): $RSS =  RSS Feed Example: "http://feed.com/index.xml"
;               $RSS_InfoS = String to find for info start Example: <title>
;               $RSS_InfoE = String to find for info end Example: </title>
;               $RSS_Info_Start = [optional] <info>/</info> To start at
;                                   Some RSS feeds will have page titles
;                                   you don't want. Default = 0
; Requirement(s): None
; Return Value(s): On Success - Returns RSS Info in Array Starting at 1
;                  On Failure - Returns -1
;                       @Error = 1 - Failed to get RSS Feed 
; Author(s): Frostfel
; ============================================================================
Func _RSSGetInfo($RSS, $RSS_InfoS, $RSS_InfoE, $RSS_Info_Start = 0)
$RSSFile = _INetGetSource($RSS)

If @Error Then
    SetError(1)
    Return -1
EndIf

Dim $InfoSearchS = 1
Dim $Info[1000]
Dim $InfoNumA
$InfoNum = $RSS_Info_Start
    While $InfoSearchS <> 6
        $InfoNum += 1
        $InfoNumA += 1
        $InfoSearchS = StringInStr($RSSFile, $RSS_InfoS, 0, $InfoNum)
        $InfoSearchE = StringInStr($RSSFile, $RSS_InfoE, 0, $InfoNum)
        $InfoSearchS += 6
        $InfoSS = StringTrimLeft($RSSFile, $InfoSearchS)
        $InfoSearchE -= 1
        $InfoSE_Len = StringLen(StringTrimLeft($RSSFile, $InfoSearchE))
        $InfoSE = StringTrimRight($InfoSS, $InfoSE_Len)
        _ArrayInsert($Info, $InfoNumA, $InfoSE)
    WEnd
Return $Info
EndFunc
#endregion
ConsoleWrite(_INetGetSource('http://cina.ir/feed/'))
$Test1 = _RSSGetInfo("http://cina.ir/feed/", "<title>", "</title>", 1)
MsgBox(0, "Test", "Title 1: "&$Test1[1]&" Title 2: "&$Test1[2]&" Title 3: "&$Test1[3]&" Title 4: "&$Test1[4]&" Title 5: "&$Test1[5])

This could work, but my fonts don't support the characters, so I can't test it >_<

#include-once
#region _RSS
; RSS Reader
; Created By: Frostfel
#include <INet.au3>
#include <Array.au3>

; ============================================================================
; Function: _RSSGetInfo($RSS, $RSS_InfoS, $RSS_InfoE[, $RSS_Info_Start = 0])
; Description: Gets RSS Info
; Parameter(s): $RSS =  RSS Feed Example: "http://feed.com/index.xml"
;               $RSS_InfoS = String to find for info start Example: <title>
;               $RSS_InfoE = String to find for info end Example: </title>
;               $RSS_Info_Start = [optional] <info>/</info> To start at
;                                   Some RSS feeds will have page titles
;                                   you don't want. Default = 0
; Requirement(s): None
; Return Value(s): On Success - Returns RSS Info in Array Starting at 1
;                  On Failure - Returns -1
;                       @Error = 1 - Failed to get RSS Feed 
; Author(s): Frostfel
; ============================================================================
Func _RSSGetInfo($RSS, $RSS_InfoS, $RSS_InfoE, $RSS_Info_Start = 0)
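$RSSFile = _INetGetSource($RSS)

; Check @error right away: any later function call would reset it.
If @Error Then
    SetError(1)
    Return -1
EndIf

; The download came back as ANSI text: recover the raw bytes (flag 1), then decode them as UTF-8 (flag 4).
$RSSFile = BinaryToString(StringToBinary($RSSFile,1),4)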

Dim $InfoSearchS = 1
Dim $Info[1000]
Dim $InfoNumA
$InfoNum = $RSS_Info_Start
    While $InfoSearchS <> 6
        $InfoNum += 1
        $InfoNumA += 1
        $InfoSearchS = StringInStr($RSSFile, $RSS_InfoS, 0, $InfoNum)
        $InfoSearchE = StringInStr($RSSFile, $RSS_InfoE, 0, $InfoNum)
        $InfoSearchS += 6
        $InfoSS = StringTrimLeft($RSSFile, $InfoSearchS)
        $InfoSearchE -= 1
        $InfoSE_Len = StringLen(StringTrimLeft($RSSFile, $InfoSearchE))
        $InfoSE = StringTrimRight($InfoSS, $InfoSE_Len)
        _ArrayInsert($Info, $InfoNumA, $InfoSE)
    WEnd
Return $Info
EndFunc
#endregion
ConsoleWrite(BinaryToString(StringToBinary(_INetGetSource('http://cina.ir/feed/'),1),4))
$Test1 = _RSSGetInfo("http://cina.ir/feed/", "<title>", "</title>", 1)
MsgBox(0, "Test", "Title 1: "&$Test1[1]&" Title 2: "&$Test1[2]&" Title 3: "&$Test1[3]&" Title 4: "&$Test1[4]&" Title 5: "&$Test1[5])
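
An alternative that avoids the string round trip is to download the feed as binary data and decode it once. The following is only a minimal sketch, assuming an AutoIt version that provides InetRead; the helper names _FetchUtf8 and _TitlesFromXml are hypothetical and not part of RSS.au3, and the regular expression is a simplification that does not unwrap CDATA sections.

```autoit
; Hypothetical helper: download a URL as raw bytes and decode them as UTF-8.
Func _FetchUtf8($sUrl)
    Local $dData = InetRead($sUrl, 1) ; 1 = force a reload from the remote site
    If @error Then Return SetError(1, 0, "")
    Return BinaryToString($dData, 4) ; 4 = UTF-8
EndFunc

; Hypothetical helper: return every <title>...</title> body as a 0-based array.
Func _TitlesFromXml($sXml)
    ; (?s) lets "." match line breaks; flag 3 returns all matches in an array.
    Local $aTitles = StringRegExp($sXml, "(?s)<title>(.*?)</title>", 3)
    If @error Then Return SetError(1, 0, 0) ; no <title> element found
    Return $aTitles
EndFunc

Local $sXml = _FetchUtf8("http://cina.ir/feed/")
If Not @error Then
    Local $aTitles = _TitlesFromXml($sXml)
    If Not @error Then
        For $i = 0 To UBound($aTitles) - 1
            ConsoleWrite($aTitles[$i] & @CRLF)
        Next
    EndIf
EndIf
```

Because the bytes are decoded directly, the result should not depend on the system's ANSI code page. Whether the console shows the Persian text correctly still depends on the console's own encoding and font.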

*GERMAN* [note: you are not allowed to remove author / modified info from my UDFs]My UDFs:[_SetImageBinaryToCtrl] [_TaskDialog] [AutoItObject] [Animated GIF (GDI+)] [ClipPut for Image] [FreeImage] [GDI32 UDFs] [GDIPlus Progressbar] [Hotkey-Selector] [Multiline Inputbox] [MySQL without ODBC] [RichEdit UDFs] [SpeechAPI Example] [WinHTTP]UDFs included in AutoIt: FTP_Ex (as FTPEx), _WinAPI_SetLayeredWindowAttributes

You're welcome >_<

Do you know what I added?

$RSSFile = BinaryToString(StringToBinary($RSSFile,1),4)

$temp = StringToBinary($RSSFile,1)

-> First, treat the string as single-byte ANSI text (flag 1), since that is the form in which the function handed the bytes to us.

-> After this, we have the raw binary data the web server sent us.

$temp2 = BinaryToString($temp, 4)

-> We know the binary data is UTF-8, so we create a string from it using UTF-8 decoding (flag 4).

-> Now we have the string in a readable form.
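
The two steps can be tried offline with a small sketch. It is illustrative only: the variable names and the sample word (Persian "salam", built from code points so the script file's own encoding does not matter) are assumptions, and the round trip is only expected to be lossless when the system ANSI code page is a single-byte one.

```autoit
; Build the sample word from Unicode code points.
Local $sOriginal = ChrW(0x0633) & ChrW(0x0644) & ChrW(0x0627) & ChrW(0x0645)

; The bytes a UTF-8 web server would send for that word.
Local $dUtf8 = StringToBinary($sOriginal, 4)

; Simulate the problem: the same bytes misread as ANSI text, one character per byte.
Local $sMisread = BinaryToString($dUtf8, 1)

; The repair: recover the bytes (flag 1), then decode them as UTF-8 (flag 4).
Local $sFixed = BinaryToString(StringToBinary($sMisread, 1), 4)

; Report whether the repaired string equals the original.
ConsoleWrite("Round trip matches: " & ($sFixed == $sOriginal) & @CRLF)
```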

Edited by ProgAndy

Thanks again.

Yes, I saw the code. I tried to use the _WinAPI_WideCharToMultiByte function, but it couldn't help me.

But your solution was very good! :(
