
Doubts about Url encoding (percent encoding) for application/x-www-form-urlencoded requests



I've been reading some threads and the Wikipedia article on percent-encoding, and I understand that I have to encode the data before sending it with the POST method. I tested these two functions by Prog@ndy (thanks a lot!):

Func _URIEncode($sData)
    ; Prog@ndy
    Local $aData = StringSplit(BinaryToString(StringToBinary($sData,4),1),"")
    Local $nChar
    $sData=""
    For $i = 1 To $aData[0]
        ConsoleWrite($aData[$i] & @CRLF)
        $nChar = Asc($aData[$i])
        Switch $nChar
            Case 45, 46, 48 To 57, 65 To 90, 95, 97 To 122, 126
                $sData &= $aData[$i]
            Case 32
                $sData &= "+"
            Case Else
                $sData &= "%" & Hex($nChar,2)
        EndSwitch
    Next
    Return $sData
EndFunc

Func _URIDecode($sData)
    ; Prog@ndy
    Local $aData = StringSplit(StringReplace($sData,"+"," ",0,1),"%")
    $sData = ""
    For $i = 2 To $aData[0]
        $aData[1] &= Chr(Dec(StringLeft($aData[$i],2))) & StringTrimLeft($aData[$i],2)
    Next
    Return BinaryToString(StringToBinary($aData[1],1),4)
EndFunc

But I have some doubts:

1) From what I read, certain 'special characters' such as '/' and '?' have a reserved meaning in a URL. I googled and found only partial lists. Are all of them covered by this list on Wikipedia? http://en.wikipedia.org/wiki/Percent-encoding#Types_of_URI_characters

However, I don't think that encoding applies as-is to POST data, because there a space is replaced by a '+'. So the question is: for application/x-www-form-urlencoded data, what is the full list of characters that need to be encoded?

Here are some other 'commonly used' lists of characters:

http://community.contractwebdevelopment.com/url-escape-characters

http://www.eugenoprea.com/url-escape-codes/

2) I tested the functions by Prog@ndy and they encode the data correctly for POST. I used the Live HTTP Headers plugin for Firefox to trace the requests, then decoded them and encoded them again, and the result was the same. When encoding, though, I had to encode only certain parts to get the same string, and some characters, such as digits, were encoded when they shouldn't have been. Some examples should make this clearer:

First I decoded the data sent in a POST request. (I replaced the MD5 hash just in case, although I know MD5 is one-way, i.e. it cannot be decrypted.)

I used the table from Wikipedia to identify the special characters by hand and so build the string I expected the decode to return:

$stringtodecode = "do=login&url=%2Fforo%2Fnewthread.php%3Fdo%3Dnewthread%26f%3D104&vb_login_md5password="& _
"md5hash&vb_login_md5password_utf=md5hash&s=&securitytoken=guest&vb_login_username=username&vb_login_password="
$stringdecoded = _URIDecode($stringtodecode)
$stringexpected = "do=login&url=/foro/newthread.php?do=newthread&f=104&vb_login_md5password="& _
"md5hash&vb_login_md5password_utf=md5hash&s=&securitytoken=guest&vb_login_username=username&vb_login_password="
If StringCompare($stringexpected,$stringdecoded) = 0 Then MsgBox(0,"","successful decode")

So that was OK, but when encoding I could only encode the actual data I was sending, not the whole string, so I did this:

$stringexpected = $stringtodecode
$stringencoded = "do=login&url="&_URIEncode("/foro/newthread.php?do=newthread&f=104")&"&vb_login_md5password="& _
"md5hash&vb_login_md5password_utf=md5hash&s=&securitytoken=guest&vb_login_username=username&vb_login_password="
If StringCompare($stringencoded,$stringexpected) = 0 Then
   MsgBox(0,"string encoded ok",$stringencoded)
Else
    MsgBox(0,"string incorrectly encoded",$stringencoded)
    FileWrite(@ScriptDir&'/stringencoded.txt',$stringencoded)
EndIf

But the $stringencoded was this:

do=login&url=%2Fforo%2Fnewthread.php%3Fdo%3Dnewthread%26f%3D%31%30%34&vb_login_md5password=md5hash&vb_login_md5password_utf=md5hash&s=&securitytoken=guest&vb_login_username=username&vb_login_password=

Instead of:

do=login&url=%2Fforo%2Fnewthread.php%3Fdo%3Dnewthread%26f%3D104&vb_login_md5password=md5hash&vb_login_md5password_utf=md5hash&s=&securitytoken=guest&vb_login_username=username&vb_login_password=

Why did it encode 104, when digits are not special characters? The same happened with the '1' in 'prueba1,' in the following test:

$stringdecoded = "subject=prueba1&threadjaq=&description=&message="&_URIDecode("primera+prueba+de+hacer+un+post%3Cbr%3E")&"&wysiwyg=1&taglist="&_URIDecode("prueba1%2C")&"&sendtrackbacks=&iconid=0&s=&securitytoken="& _
"1293833908-59b120fba44d6bcf829291454854a3344217afc7&f=104&do=postthread&posthash=6c93c96461f82bbcf301e40d5522ccae&poststarttime=1293833908&loggedinuser=32163&parseurl="& _
"1&vbseo_retrtitle=1&vbseo_is_retrtitle=1&emailupdate=9999&polloptions=4&preview="&_URIDecode("Vista+Previa+de+Mensaje")

MsgBox(0,"string decoded",$stringdecoded)

$stringencoded = "subject=prueba1&threadjaq=&description=&message="&_URIEncode("primera prueba de hacer un post<br>")&"&wysiwyg=1&taglist="&_URIEncode("prueba1,")&"&sendtrackbacks=&iconid=0&s=&securitytoken="& _
"1293833908-59b120fba44d6bcf829291454854a3344217afc7&f=104&do=postthread&posthash=6c93c96461f82bbcf301e40d5522ccae&poststarttime=1293833908&loggedinuser=32163&parseurl="& _
"1&vbseo_retrtitle=1&vbseo_is_retrtitle=1&emailupdate=9999&polloptions=4&preview="&_URIEncode("Vista Previa de Mensaje")

MsgBox(0,"stringdencodedbien",$stringencoded)

Obviously I can work around this in both cases by using

$stringencoded = "do=login&url="&_URIEncode("/foro/newthread.php?do=newthread&f=")&"104"&"&vb_login_md5password="& _
"md5hash&vb_login_md5password_utf=md5hash&s=&securitytoken=guest&vb_login_username=username&vb_login_password="

and

$stringencoded = "subject=prueba1&threadjaq=&description=&message="&_URIEncode("primera prueba de hacer un post<br>")&"&wysiwyg=1&taglist="&_URIEncode("prueba")&"1"&_URIEncode(",")&"&sendtrackbacks=&iconid=0&s=&securitytoken="& _
"1293833908-59b120fba44d6bcf829291454854a3344217afc7&f=104&do=postthread&posthash=6c93c96461f82bbcf301e40d5522ccae&poststarttime=1293833908&loggedinuser=32163&parseurl="& _
"1&vbseo_retrtitle=1&vbseo_is_retrtitle=1&emailupdate=9999&polloptions=4&preview="&_URIEncode("Vista Previa de Mensaje")

but the second case is harder, because it's the tag list of a forum post and it can vary a lot. Is there a less laborious way to encode the request? Maybe it would help to know, as I asked above, the full list of special characters that have to be converted, and then write a function that parses the data character by character and applies _URIEncode only to the special ones.
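One way to cut the hassle, sketched here under some assumptions: keep the field names and values in an array and encode every name and every value, joining them with '=' and '&' afterwards, so the separators themselves are never encoded. The names _FormEncodeValue and _FormBodyBuild are hypothetical, not part of any UDF; the encoder is a compact variant of the _URIEncode posted above and assumes the digit range is written as 48 To 57.

```autoit
; Hypothetical per-value encoder, modelled on the _URIEncode() posted above.
Func _FormEncodeValue($sData)
    ; Convert to UTF-8 bytes and walk them one byte at a time.
    Local $sBytes = BinaryToString(StringToBinary($sData, 4), 1)
    Local $sOut = "", $nChar
    For $i = 1 To StringLen($sBytes)
        $nChar = Asc(StringMid($sBytes, $i, 1))
        Switch $nChar
            Case 45, 46, 48 To 57, 65 To 90, 95, 97 To 122, 126
                $sOut &= Chr($nChar)            ; - . 0-9 A-Z _ a-z ~ stay as they are
            Case 32
                $sOut &= "+"                    ; form encoding writes a space as +
            Case Else
                $sOut &= "%" & Hex($nChar, 2)   ; everything else becomes %XX
        EndSwitch
    Next
    Return $sOut
EndFunc

; Hypothetical helper: builds the request body from rows of [name, value].
Func _FormBodyBuild($aFields)
    Local $sBody = ""
    For $i = 0 To UBound($aFields) - 1
        If $i > 0 Then $sBody &= "&"
        $sBody &= _FormEncodeValue($aFields[$i][0]) & "=" & _FormEncodeValue($aFields[$i][1])
    Next
    Return $sBody
EndFunc

; Values taken from the test earlier in this post.
Local $aFields[3][2] = [["subject", "prueba1"], _
        ["message", "primera prueba de hacer un post<br>"], _
        ["taglist", "prueba1,"]]
ConsoleWrite(_FormBodyBuild($aFields) & @CRLF)
; subject=prueba1&message=primera+prueba+de+hacer+un+post%3Cbr%3E&taglist=prueba1%2C
```

With this layout there is no need to decide by hand which parts of a value are 'special': the whole value goes through the encoder, and only the '=' and '&' that separate fields are written literally.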

3) This is not completely related, but I noticed that in the previous test, when I decoded the message of the post, it returned "primera prueba de hacer un post<br>"

Isn't '<br>' an HTML tag? Why is it there if forums use BBCode? The forum I used, which has a testing area, is a vBulletin one...

Thanks for your help!

Edited by Mithrandir

The function has a small error: "48-57" should actually be "48 To 57". That's why digits do not get encoded properly.

This is no big deal since you are allowed to send all characters encoded, but it adds some extra bytes to the data to send.

Edit: And you should remove the ConsoleWrite.
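A small sketch of why that typo matters: inside a Switch, Case 48-57 is an ordinary arithmetic expression, so it is evaluated as 48 minus 57 and matches only the value -9. Every digit then falls through to Case Else and gets percent-encoded. The two function names are made up for this illustration.

```autoit
; Broken variant: 48-57 is evaluated as the single value -9.
Func _IsDigitBroken($nChar)
    Switch $nChar
        Case 48-57
            Return True
        Case Else
            Return False
    EndSwitch
EndFunc

; Fixed variant: "To" declares the range 48..57, i.e. "0" to "9".
Func _IsDigitFixed($nChar)
    Switch $nChar
        Case 48 To 57
            Return True
        Case Else
            Return False
    EndSwitch
EndFunc

ConsoleWrite(_IsDigitBroken(Asc("1")) & @CRLF) ; False
ConsoleWrite(_IsDigitFixed(Asc("1")) & @CRLF)  ; True
```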

Edited by ProgAndy



This is the function I use:

Func _URLEncode($urlText)
    Local $sText = "", $acode ; declare both locals so nothing leaks into Global scope

    For $i = 1 To StringLen($urlText)
        $acode = Asc(StringMid($urlText, $i, 1))
        Select
            Case ($acode >= 48 And $acode <= 57) Or _
                    ($acode >= 65 And $acode <= 90) Or _
                    ($acode >= 97 And $acode <= 122)
                $sText &= StringMid($urlText, $i, 1)
            Case $acode = 32
                $sText &= "%20"
            Case Else
                $sText &= "%" & Hex($acode, 2)
        EndSelect
    Next

    Return $sText
EndFunc   ;==>_URLEncode

  • 2 months later...

I just noticed that the most common error is in how the Content-Type header (application/x-www-form-urlencoded) is set. I'm trying to fill in and submit forms without IE.au3. Do you think the POST method is still practical when a large amount of data is involved?

Edited by shantelhaynes

  • Moderators

Well, since we're all sharing, here's one I wrote a while back, looks like it "should" be faster:

( although, StringToASCIIArray() has always been kind of slow for me, StringSplit(BinaryToString(StringToBinary($sData,4),1),"") may in fact be faster, based on same concept I see )

Func _URL_Encode($s_str)

    Local $a_split = StringSplit($s_str, "", 2)
    Local $i_ub = UBound($a_split), $s_ret
    Local $a_dec = StringToASCIIArray($s_str, 0, $i_ub, 1)

    For $ichar = 0 To $i_ub - 1
        Switch $a_dec[$ichar]
            Case 33, 36, 38 To 59, 61, 63 To 90, 95, 97 To 122
                $s_ret &= $a_split[$ichar]
            Case Else
                $s_ret &= "%" & Hex($a_dec[$ichar], 2)
        EndSwitch
    Next

    Return $s_ret
EndFunc

I don't see in my snippets I ever got around to the decode part though :)

@shantelhaynes

Have you taken a look at trancexx's winhttp.au3 udf?
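For reference, a minimal sketch of sending a form body without IE.au3. Note that it does not use the winhttp.au3 UDF; it uses the Windows WinHttp.WinHttpRequest.5.1 COM object instead, which is a different route to the same goal. The URL and the field values are placeholders, and there is no COM error handler, so treat it as a starting point only.

```autoit
; Placeholders only: this URL and body are not taken from the thread.
Local $sUrl = "http://example.com/login.php"
Local $sBody = "do=login&vb_login_username=username"

Local $oHttp = ObjCreate("WinHttp.WinHttpRequest.5.1")
If Not IsObj($oHttp) Then Exit 1

; Third argument False = synchronous request.
$oHttp.Open("POST", $sUrl, False)
; Without this header the server will not parse the body as form fields.
$oHttp.SetRequestHeader("Content-Type", "application/x-www-form-urlencoded")
$oHttp.Send($sBody)

ConsoleWrite("Status: " & $oHttp.Status & @CRLF)
ConsoleWrite($oHttp.ResponseText & @CRLF)
```

The body string is expected to be already encoded, value by value, with one of the encode functions from this thread.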

Edited by SmOke_N



I just did a quick test and got these results for the following text: This is a very simple sample of text!

SmOke_N: This%20is%20a%20very%20simple%20sample%20of%20text!

Trancexx's: This+is+a+very+simple+sample+of+text%21

Also...

JamesBrooks: This%20is%20a%20very%20simple%20sample%20of%20text%21

Which one would be better for general use?

Edited by guinness
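The only difference between those outputs is how the space and the '!' are written. As a hedged sketch: "+" means a space only in application/x-www-form-urlencoded data (form bodies and query strings), while "%20" is understood in both form data and other parts of a URL, so the choice depends on where the string will be used. The function below is hypothetical (not any poster's actual code) and makes the space style a parameter.

```autoit
; Hypothetical encoder: $bFormStyle = True writes spaces as "+",
; False writes them as "%20".
Func _EncodeComponent($sData, $bFormStyle = True)
    Local $sBytes = BinaryToString(StringToBinary($sData, 4), 1) ; UTF-8 bytes
    Local $sOut = "", $nChar
    For $i = 1 To StringLen($sBytes)
        $nChar = Asc(StringMid($sBytes, $i, 1))
        Switch $nChar
            Case 45, 46, 48 To 57, 65 To 90, 95, 97 To 122, 126
                $sOut &= Chr($nChar)
            Case 32
                If $bFormStyle Then
                    $sOut &= "+"
                Else
                    $sOut &= "%20"
                EndIf
            Case Else
                $sOut &= "%" & Hex($nChar, 2)
        EndSwitch
    Next
    Return $sOut
EndFunc

Local $sText = "This is a very simple sample of text!"
ConsoleWrite(_EncodeComponent($sText, True) & @CRLF)
; This+is+a+very+simple+sample+of+text%21
ConsoleWrite(_EncodeComponent($sText, False) & @CRLF)
; This%20is%20a%20very%20simple%20sample%20of%20text%21
```

Either way, a literal "+" in the data has to become %2B, otherwise a form decoder will read it back as a space.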


  • Moderators

Hmm, something is wrong with mine me thinks.

Edit:

Trancexx's code and mine are pretty much the same; if you copied his cases over to mine and renamed the variables, you'd probably get the same result.

Edited by SmOke_N


OK, cool, thanks for the clarification :) I only copied the functions from your post above and Trancexx's __WinHttpURLEncode() from the latest version of WinHttp.au3.

Edited by guinness

