autoitxp

read html file and replace string

autoitxp

Guys, I have a large HTML file (about 2 MB). I want to remove every \<br> from it, and I also need to trim on the right, because some <br> tags don't have the \ in front.

The file is large, so what is the best and fastest way to do this?

These files keep arriving from the server all the time.

$r_File = "xxx.html"
$sRead = FileRead($r_File)
; StringTrimRight("", 1) just returns "", so this only removes "<br>" and leaves any "\" behind
$text = StringReplace($sRead, "<br>", StringTrimRight("", 1))
MsgBox(0, "", $text)

BrettF

So let me get this straight...

Some lines might be

text blah blah \<BR>

and some are just

text bfsdfhsljdfs<BR>

Let me do some tests for the fastest way :)

You don't have an example of the HTML contents, do you?

BrettF

Okay.

I created a HTML file using the following:

$out = ""
$file = @ScriptDir & "\helping.html"

For $i = 1 To 100000
    $addslash = Random(0, 1, 1) ; 1 or 0
    If $addslash = 1 Then
        $out &= "RANDOMTEXTWILLGOHERE\<BR>" & @CRLF
    Else
        $out &= "RANDOMTEXTWILLGOHERE<BR>" & @CRLF
    EndIf
    ; Divide by the loop limit so the progress reaches 100%
    ToolTip("Done " & Int(($i / 100000) * 100) & "%")
Next
FileWrite($file, $out)

It created a 2.52MB file. Lines either had \<BR> or <BR> tacked onto the end.

Next I created this short script to test:

$file = @ScriptDir & "\helping.html"

$text = FileRead ($file)

$timer1 = TimerInit ()
$tout1 = StringReplace ($text, "\<BR>", "")
$tout1 = StringReplace ($tout1, "<BR>", "")
FileWrite (@ScriptDir & "\helpingOut1.html", $tout1)
$timer1 = TimerDiff ($timer1)

$timer2 = TimerInit ()
$tout2 = StringRegExpReplace ($text, "(\\?)<BR>", "")
FileWrite (@ScriptDir & "\helpingOut2.html", $tout2)
$timer2 = TimerDiff ($timer2)

MsgBox (0, "Results", "1 = " & $timer1 & @CRLF & "2 = " & $timer2)

Results were as follows (TimerDiff returns milliseconds):

1 = 503.857333823153

2 = 693.47389123478

I do believe my RegExp could be optimized though, as I managed to do it with a few different patterns...
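
One possible variant of the pattern, offered as a sketch rather than a benchmarked improvement: it drops the capture group, which is not needed when the match is simply deleted, and adds the `(?i)` option so that both `<br>` and `<BR>` are matched. The sample string and the variable names `$sInput` and `$sClean` are made up for illustration, and no timing is claimed for it.

```autoit
; Sketch: strip <br> tags with an optional leading backslash, ignoring case.
; $sInput is illustrative sample data, not taken from the real file.
Local $sInput = "one\<BR>" & @CRLF & "two<br>" & @CRLF

; (?i) makes the match case-insensitive; \\? matches an optional literal backslash
Local $sClean = StringRegExpReplace($sInput, "(?i)\\?<br>", "")

ConsoleWrite($sClean)
```

To compare it against the two approaches above, it could be wrapped in the same TimerInit / TimerDiff pair.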

autoitxp

Thanks, the code works well. I have another question: how do I trim on the right (StringTrimRight) after replacing <BR>?

$text = StringReplace("RED;<BR> GREEN {<BR><BR> YELLOW }<br> \<br> BLACK +<BR><BR> PURPLE -<BR> ", "<BR>", "" )

BrettF
autoitxp

$text = StringReplace("RED;<BR> GREEN {<BR><BR> YELLOW }<br> \<br> BLACK +<BR><BR> PURPLE -<BR> ", "<BR>", "" )

This is pretty much the final input. It could be anything next to the <br> on the right side, so that's why I'm asking about StringTrimRight. Thanks.
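
A minimal sketch of one way to read this request, assuming the aim is to remove every <br> (with or without a leading \) and then strip whatever whitespace is left at the right end of the string. StringTrimRight always cuts a fixed number of characters, so the sketch uses StringStripWS with flag 2 (trailing whitespace) instead. The variable names are illustrative only.

```autoit
; Sketch, assuming the aim is: drop all <br> tags, then trim trailing whitespace.
Local $sInput = "RED;<BR> GREEN {<BR><BR> YELLOW }<br> \<br> BLACK +<BR><BR> PURPLE -<BR> "

; Remove <br> in any letter case, together with an optional leading backslash
Local $sNoTags = StringRegExpReplace($sInput, "(?i)\\?<br>", "")

; Flag 2 strips trailing whitespace only; leading and inner spaces are kept
Local $sTrimmed = StringStripWS($sNoTags, 2)

ConsoleWrite($sTrimmed & @CRLF)
```

If a known number of characters always has to go from the right instead, StringTrimRight($sNoTags, 1) would remove exactly one character.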

BrettF
autoitxp

Local $out = ""
Local $Letter = ""
$file = "logs.html"

For $i = 1 To 100000
    $addslash = Random(0, 1, 1) ; 1 or 0
    If $addslash = 1 Then
        ; About half of the time, pick a new random uppercase letter
        If Random() < 0.5 Then
            $Letter = Chr(Random(Asc("A"), Asc("Z"), 1))
        EndIf
        $out &= "RANDOMTEXTWILLGOHERE.." & $Letter & "<br>" & @CRLF
    EndIf
    ; Divide by the loop limit so the progress reaches 100%
    ToolTip("Done " & Int(($i / 100000) * 100) & "%")
Next
FileWrite($file, $out)

BrettF
autoitxp

Hi, I want to replace these special characters with their CGI (percent-encoded) form. How do I do it properly? Help!

StringRegExpReplace("<>="':?#[]!$&(),;%",  "<>="':?#[]!$&(),;%" , "%3C %3E %3D %22 %27 %3A %3F %23 %5B %5D %21 %24 %26 %28 %29 %2C %3B %25 " )
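
A minimal sketch of one way to do this, not a confirmed answer from the thread: the call above cannot work as written, because the unescaped double quote ends the string early and a single replacement string cannot map each character to a different code. The sketch walks the text one character at a time and replaces each listed character with % plus its two-digit hex code. The function name `_EncodeSpecialChars` and its variables are hypothetical; only the character list comes from the post.

```autoit
; Hypothetical helper (not from the thread): percent-encode a fixed set of characters.
Func _EncodeSpecialChars($sText)
    ; The characters from the post; a double quote inside a string literal is written as ""
    Local $sSpecial = "<>=""':?#[]!$&(),;%"
    Local $sOut = "", $sChar
    For $i = 1 To StringLen($sText)
        $sChar = StringMid($sText, $i, 1)
        If StringInStr($sSpecial, $sChar, 1) Then
            ; Replace the character with % followed by its two-digit hex code
            $sOut &= "%" & Hex(Asc($sChar), 2)
        Else
            $sOut &= $sChar
        EndIf
    Next
    Return $sOut
EndFunc

ConsoleWrite(_EncodeSpecialChars("a=b&c") & @CRLF)
```

Working in a single pass means a % that is already in the input is encoded once as %25, and the % signs produced by the function itself are never encoded again.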