AuToItItAlIaNlOv3R

Help string...

11 posts in this topic

Hi, I have this problem. I would like to capture a line from a text. For example, I've copied the source of a web page into a .txt file, so this file contains the page's source code. I would like to copy a part of this source that is enclosed in a tag. For example, I have this source:

<html>

<title>Test</title>

<font color="ff0000"><div id="link">

THIS IS A TEST

</div></font></center>

</body>

</html>

I would like to copy the sentence "THIS IS A TEST" into another .txt file.

How can I do this?

Hello :D


I'm not 100% sure, but could you use _StringBetween with start = <div id="link"> and end = </div>?

Check the help file under String Management (UDF) for the full details.

Good luck

MDiesel
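
A minimal sketch of the _StringBetween suggestion, assuming the markup from the first post. The sample is inlined as a string so the snippet stands alone, and the output name "new.txt" is only a placeholder:

```autoit
#include <String.au3>

; Sample markup from the first post, inlined so the sketch is self-contained.
Local $sHtml = '<font color="ff0000"><div id="link">' & @CRLF & _
        'THIS IS A TEST' & @CRLF & _
        '</div></font></center>'

; _StringBetween returns an array holding every substring found
; between the start and end delimiters, and sets @error if none is found.
Local $aBetween = _StringBetween($sHtml, '<div id="link">', '</div>')
If @error Then
    MsgBox(0, "Result", "No match found")
Else
    ; Flag 3 strips leading and trailing whitespace, including the line breaks.
    Local $sText = StringStripWS($aBetween[0], 3)
    FileWriteLine(@ScriptDir & "\new.txt", $sText) ; placeholder output file
EndIf
```

To work on a real file, replace the inlined string with FileRead() on your saved source. Note that the start delimiter must match the markup exactly, including the quotes around the id.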


#3 ·  Posted (edited)

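#include <File.au3>

Local $aHtml
Local $sFile = @ScriptDir & "\html.txt" ; the file containing the HTML
If Not _FileReadToArray($sFile, $aHtml) Then Exit ; stop if the file cannot be read

For $i = 1 To $aHtml[0] ; element 0 holds the number of lines
    Local $sLine = StringStripWS($aHtml[$i], 3) ; trim leading/trailing whitespace
    ; skip blank lines and lines that start with a tag
    If $sLine <> "" And StringLeft($sLine, 1) <> "<" Then
        FileWriteLine("new.txt", $sLine) ; output goes to new.txt
        ExitLoop ; remove this to copy every line that does not start with "<"
    EndIf
Next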

Edited by PhilRip


I would like to copy to another file the string that starts with <div id="link"> and ends with </div>.

How do I do it?


Up


#6 ·  Posted (edited)

Please wait 24 hours before bumping a post.

Using this as the source:

<html>

<title>Test</title>

<font color="ff0000"><div id="link">

THIS IS A TEST

</div></font></center>

<font color="ff0000"><div id="link2">

THIS IS ANOTHER TEST

</div></font></center>

</body>

</html>

$sStr = FileRead("Somefile.htm")
; Flag 3 returns an array of all captured matches
$aStr = StringRegExp($sStr, "(?i)(?s)<div id=\x22?link.*?\x22?>\v?(.*?)\v?</?", 3)
For $i = 0 To UBound($aStr) - 1
    MsgBox(0, "RESULTS", $aStr[$i])
Next

NOTE: As you may notice, this only works if the id starts with the word "link". If that's not going to work, then we will need to see some actual HTML code.

Edited by GEOSoft

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


#7 ·  Posted (edited)

I have this web page source code:

<HTML><HEAD><TITLE>Insert image</TITLE><LINK href="www.imageshack.us" rel="SHORTCUT ICON">

<script language=javascript src="Footer.js" type=text/javascript></SCRIPT>

<LINK title="Default Theme" media=all href="style.css" type=text/css rel=stylesheet></HEAD>

<BODY onload=addFooter();>

<CENTER>

<DIV id=container><IMG style="WIDTH: 342px; HEIGHT: 38px" alt="" src="http://img136.imageshack.us/img/img.png"> </DIV>

<DIV id=container>

<META content=reeky name=copyright>

<STYLE type=text/css>

/*<![CDATA[*/

body,input{font: small Verdana, Geneva, Arial, Helvetica, sans-serif;}

a{color:#00C;}

h2{margin-bottom:0;}

/*]]>*/

</STYLE>

<CENTER>

<H2><FONT color=#ffffff>Official Web Site</FONT> </H2><BR><BR><FONT color=#ff0000>

<DIV id=premium_link>http://www.EXAMPLELINK.EXAMPLE

</DIV></FONT></CENTER><BR>

<DIV id=container>

<DIV style="TEXT-ALIGN: center">

<FORM id=a name=a method=post><B><FONT color=#ffffff>Link: (<FONT color=#ffffff>Insert Image Path</FONT>)</B><BR></FONT><BR><INPUT size=50 value=" " name=link> <INPUT type=submit value=Go!> </FORM></DIV></DIV></CENTER></DIV></BODY></HTML>

I would like to copy the string inside the premium_link DIV (http://www.EXAMPLELINK.EXAMPLE) into another .txt file.

How can I do this?

Edited by AuToItItAlIaNlOv3R


"(?i)<div[^>]*+>([^\r\n<]*+)"

This is not a task for StringRegExp alone. You first need to enumerate the document's <DIV> tags and check each one with _IEPropertyGet($oDiv, 'innerText') or 'outerText', if I'm not wrong, then use StringRegExp to match the link pattern.


#9 ·  Posted (edited)

Change the regexp I gave you to

$aStr = StringRegExp($sStr, "(?i)(?s)<div\sid=.*?link.*?>\v?(.*?)\v?</.+>", 3)

I should have added that you change the MsgBox() to FileWriteLine("myfile.txt", $aStr[$i])

Edited by GEOSoft

George
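
For the page posted above, a narrower pattern may be easier to reason about. This is only a sketch, not the pattern from this post: it assumes the target is the DIV whose id is premium_link (as in the posted source), allows the id to be quoted or unquoted, and stops at the first closing </div>. The fragment is inlined so the snippet stands alone, and "myfile.txt" is a placeholder:

```autoit
; Fragment of the posted page source, inlined for the sketch.
Local $sStr = '<H2><FONT color=#ffffff>Official Web Site</FONT> </H2><BR><BR><FONT color=#ff0000>' & @CRLF & _
        '<DIV id=premium_link>http://www.EXAMPLELINK.EXAMPLE' & @CRLF & _
        '</DIV></FONT></CENTER><BR>'

; (?is)            case-insensitive, and "." also matches line breaks
; [\x22\x27]?      an optional double or single quote around the id value
; \s*(.*?)\s*      capture the DIV content without surrounding whitespace
Local $sPattern = "(?is)<div\s+id=[\x22\x27]?premium_link[\x22\x27]?[^>]*>\s*(.*?)\s*</div>"

; Flag 3 returns an array of every captured group; @error is set if nothing matches.
Local $aStr = StringRegExp($sStr, $sPattern, 3)
If @error Then
    MsgBox(0, "RESULTS", "No match")
Else
    For $i = 0 To UBound($aStr) - 1
        FileWriteLine(@ScriptDir & "\myfile.txt", $aStr[$i]) ; placeholder output file
    Next
EndIf
```

Anchoring on the exact id avoids matching the earlier container DIVs, and ending the pattern at </div> keeps the match from running on to the end of the document.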


@GEOSoft this code doesn't work:

$sStr = FileRead(@DesktopDir & "\sourcecode.txt")
$aStr = StringRegExp($sStr, "(?i)(?s)<div\sid=.*?link.*?>\v?(.*?)\v?</.+>", 3)
For $i = 0 To UBound($aStr) - 1
    FileWriteLine("myfile.txt", $aStr[$i])
Next

The file myfile.txt is empty!

How can I make this script work?


I tested it using the HTML that you posted and it worked fine.

Replace the FileWriteLine line with this, just to see what it returns.

MsgBox(0, "TEST RESULTS", "Error = " & @Error & @CRLF & $aStr[$i])

George
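
A slightly fuller diagnostic sketch, under the assumption that the file path and pattern are the ones from the script above. It checks @error right after each call, because when StringRegExp finds nothing $aStr is not an array, UBound() returns 0, and a MsgBox placed inside the loop never runs:

```autoit
Local $sPath = @DesktopDir & "\sourcecode.txt"

Local $sStr = FileRead($sPath)
Local $iErr = @error ; save it before another call resets it
If $iErr Then
    MsgBox(0, "TEST RESULTS", "FileRead failed for " & $sPath & @CRLF & "Error = " & $iErr)
    Exit
EndIf

Local $aStr = StringRegExp($sStr, "(?i)(?s)<div\sid=.*?link.*?>\v?(.*?)\v?</.+>", 3)
$iErr = @error
If $iErr Then
    ; @error = 1 means no match was found, 2 means the pattern itself is invalid.
    MsgBox(0, "TEST RESULTS", "StringRegExp Error = " & $iErr & @CRLF & _
            "Characters read from file: " & StringLen($sStr))
    Exit
EndIf

For $i = 0 To UBound($aStr) - 1
    MsgBox(0, "TEST RESULTS", "Match " & $i & ":" & @CRLF & $aStr[$i])
Next
```

If the reported character count is 0, the problem is in reading the file rather than in the pattern.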
