StringReplace special characters in htm file

david1337

Hey guys

Can anyone help me understand this? :)

$szFile = "test.htm"

$szText = FileRead($szFile)


$szText = StringReplace($szText, "hello", "ö")

FileDelete($szFile)
FileWrite($szFile,$szText)

If the script changes the text in "test.htm" to something containing non-US characters, in this example "ö", the browser shows garbled characters instead of "ö".
If I manually change the text in "test.htm" to "ö", the output in the browser is "ö"!
In both cases, if the htm file is opened in Notepad, the content is just "ö", but the file changed by the script still shows garbled characters in a browser. How weird is this?

I am aware that I can replace the text with the HTML entity for "ö" instead, and then the output is correct in the browser, but this is just dumb when there are a lot of characters to be changed :)


Does anyone know why this happens, and how to solve it in a simpler way?

 

jchd

You're seeing UTF8 encoding of characters. Change the html header to indicate UTF8.
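
A minimal sketch of the effect described here, assuming the browser falls back to a single-byte Western code page (such as Windows-1252) when the page does not declare UTF-8. It uses ChrW(246) for "ö" so the result does not depend on how the script file itself is saved; the variable names are illustrative only.

```autoit
#include <StringConstants.au3> ; $SB_UTF8, $SB_ANSI

; "ö" is U+00F6; ChrW avoids depending on the script file's own encoding
Local $sChar = ChrW(246)

; Encode the single character as UTF-8: it becomes two bytes (C3 B6)
Local $dUtf8 = StringToBinary($sChar, $SB_UTF8)
ConsoleWrite("UTF-8 bytes: " & $dUtf8 & @CRLF)
ConsoleWrite("Byte count:  " & BinaryLen($dUtf8) & @CRLF)

; Decode the same bytes as ANSI, which is roughly what a reader does when
; it is not told the data is UTF-8: on a single-byte code page each byte
; turns into its own character, so one "ö" shows up as two characters
Local $sMisread = BinaryToString($dUtf8, $SB_ANSI)
ConsoleWrite("Characters after misreading: " & StringLen($sMisread) & @CRLF)
```

Notepad detects the encoding differently from the browser, which would explain why the file looks fine there.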


AutoBert
27 minutes ago, jchd said:

You're seeing UTF8 encoding of characters. Change the html header to indicate UTF8.

Are you sure it's the header? I think it's FileWrite using the wrong mode.

@david1337: test this first:

$szFile = "test.htm"

$szText = FileRead($szFile)


$szText = StringReplace($szText, "hello", "ö")
$hFile=FileOpen($hFile,$FO_OVERWRITE + $FO_UTF8)
FileWrite($hFile,$szText)
FileClose($hFile)

 

jchd

Either you leave the header as ISO and use FileOpen with the ANSI mode to write the file,
or you switch to full Unicode and change the header to UTF-8.

The latter solution is universal; the former is not.
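
A rough sketch of the second option, assuming an HTML5-style charset declaration is acceptable for the page. The file name and page content are hypothetical; the point is only that the declared charset and the FileOpen mode agree.

```autoit
#include <FileConstants.au3> ; $FO_OVERWRITE, $FO_UTF8_NOBOM

; Hypothetical output file and page content, for illustration only
Local $sFile = @ScriptDir & "\test_utf8.htm"
Local $sHtml = '<html><head><meta charset="utf-8"></head>' & _
        '<body>' & ChrW(246) & '</body></html>'

; Write the bytes as UTF-8 so they agree with the declared charset
Local $hFile = FileOpen($sFile, $FO_OVERWRITE + $FO_UTF8_NOBOM)
If $hFile = -1 Then
    ConsoleWrite("Could not open " & $sFile & @CRLF)
    Exit 1
EndIf
FileWrite($hFile, $sHtml)
FileClose($hFile)
```

For the first option, both sides would change together: an ISO charset in the header and $FO_ANSI in FileOpen. That only works for characters the ANSI code page can represent.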

david1337

Hey guys, thanks for your answers.

2 hours ago, AutoBert said:

Are you sure it's the header? I think it's FileWrite using the wrong mode.

@david1337: test this first:

$szFile = "test.htm"

$szText = FileRead($szFile)


$szText = StringReplace($szText, "hello", "ö")
$hFile=FileOpen($hFile,$FO_OVERWRITE + $FO_UTF8)
FileWrite($hFile,$szText)
FileClose($hFile)

 

Your code gives an error. (Am I missing an include or something for the code to understand what "$FO_UTF8" is?)
==> Variable used without being declared.:
$hFile=FileOpen($hFile,$FO_OVERWRITE + $FO_UTF8)
$hFile=FileOpen(^ ERROR

 

2 hours ago, jchd said:

Either you leave the header as ISO and use FileOpen with the ANSI mode to write the file,
or you switch to full Unicode and change the header to UTF-8.

The latter solution is universal; the former is not.

You are correct. Changing the HTML header to UTF-8 fixed the problem :)
But what if I do the same thing with a txt file, and open that in a web browser? Then I have the same problem, and I can't add a header to a txt file.
 

AutoBert
4 minutes ago, david1337 said:

Your code gives an error. (Am I missing an include or something for the code to understand what "$FO_UTF8" is?)

Yes, an include is missing, but there's also a second error (a typo: FileOpen was given $hFile instead of $szFile):

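#include <FileConstants.au3> ; defines $FO_OVERWRITE, $FO_UTF8 and $FO_ANSI

$szFile = "test.htm"

$szText = FileRead($szFile)

$szText = StringReplace($szText, "hello", "ö")
; Open the file by name ($szFile, not $hFile) and overwrite it as UTF-8 with a BOM
$hFile = FileOpen($szFile, $FO_OVERWRITE + $FO_UTF8)
FileWrite($hFile, $szText)
FileClose($hFile)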

and maybe you need the other mode:

$hFile = FileOpen($szFile, $FO_OVERWRITE + $FO_ANSI)
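
To see which of the two modes a file actually ended up in, something like the following sketch may help. It assumes the thread's "test.htm" sits in the working directory and relies on FileGetEncoding, which reports the encoding using the same constants that FileOpen accepts.

```autoit
#include <FileConstants.au3> ; $FO_UTF8, $FO_UTF8_NOBOM, $FO_ANSI, ...

; Assumed to be the file written by the script in this thread
Local $sFile = "test.htm"

; Ask AutoIt which encoding it detects for the file
Local $iEncoding = FileGetEncoding($sFile)

Switch $iEncoding
    Case $FO_UTF8
        ConsoleWrite("UTF-8 with BOM" & @CRLF)
    Case $FO_UTF8_NOBOM
        ConsoleWrite("UTF-8 without BOM" & @CRLF)
    Case $FO_ANSI
        ConsoleWrite("ANSI" & @CRLF)
    Case -1
        ConsoleWrite("Could not determine the encoding of " & $sFile & @CRLF)
    Case Else
        ConsoleWrite("Other encoding, FileOpen-style value: " & $iEncoding & @CRLF)
EndSwitch
```

With $FO_UTF8 the file starts with a byte order mark, which is one hint a browser can use even when no charset is declared.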

 

david1337
56 minutes ago, AutoBert said:

 

#include <FileConstants.au3>

$szFile = "test.htm"

$szText = FileRead($szFile)


$szText = StringReplace($szText, "hello", "ö")
$hFile=FileOpen($szFile,$FO_OVERWRITE + $FO_UTF8)
FileWrite($hFile,$szText)
FileClose($hFile)

 

AutoBert, this was exactly what I was looking for, and it worked perfectly! Thanks a lot :)
