Help string...

AuToItItAlIaNlOv3R

Hi, I have this problem. I would like to capture a line from a text. For example, I've copied the source of a web page into a .txt file, and I would like to copy part of that source. The part is enclosed in a tag. For example, I have this source:

<html>

<title>Test</title>

<font color="ff0000"><div id="link">

THIS IS A TEST

</div></font></center>

</body>

</html>

I would like to copy the sentence "THIS IS A TEST" into another .txt file.

How can I do that?

Hello :D

Mat

I'm not 100% sure, but could you use _StringBetween where start = <div id="link"> and end = </div>?

Check the helpfile under String Management (UDF) for the full details.

Good luck

MDiesel
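
A minimal sketch of that idea, assuming the HTML is saved as html.txt next to the script and the result should go to new.txt (both file names are placeholders, not from the original post):

```autoit
#include <String.au3>

; Assumed input file: html.txt in the script directory (hypothetical name)
Local $sHtml = FileRead(@ScriptDir & "\html.txt")

; _StringBetween returns a 0-based array of every substring found
; between the start and end markers, and sets @error if none is found
Local $aBetween = _StringBetween($sHtml, '<div id="link">', '</div>')
If @error Then
    MsgBox(0, "Result", "No matching <div> found.")
Else
    ; Trim leading/trailing whitespace and line breaks before writing
    FileWriteLine(@ScriptDir & "\new.txt", StringStripWS($aBetween[0], 3))
EndIf
```

Note that the start marker has to match the tag exactly as it appears in the file, including the quotes around the id.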

PhilRip

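#include <File.au3>
Dim $html
$file = @ScriptDir & "\html.txt" ; the file containing the HTML
_FileReadToArray($file, $html) ; $html[0] holds the number of lines

For $i = 1 To $html[0]
    $line = StringStripWS($html[$i], 3) ; trim leading and trailing whitespace
    ; skip blank lines and lines that start with a tag
    If $line <> "" And StringLeft($line, 1) <> "<" Then
        FileWriteLine("new.txt", $line) ; output in new.txt
        ExitLoop ; delete this for more/all lines not starting with "<"
    EndIf
Next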

Edited by PhilRip

AuToItItAlIaNlOv3R

I would like to copy into another file the string that starts at <div id="link"> and ends at </div>.

How do I do it?

GEOSoft

Please wait 24 hours before bumping a post.

Using this as the source

<html>

<title>Test</title>

<font color="ff0000"><div id="link">

THIS IS A TEST

</div></font></center>

<font color="ff0000"><div id="link2">

THIS IS ANOTHER TEST

</div></font></center>

</body>

</html>

$sStr = FileRead("Somefile.htm")
; flag 3 returns an array of all captured groups
$aStr = StringRegExp($sStr, "(?i)(?s)<div id=\x22?link.*?\x22?>\v?(.*?)\v?</?", 3)
For $i = 0 To UBound($aStr) - 1
    MsgBox(0, "RESULTS", $aStr[$i])
Next

NOTE: As you may notice, it will only work if the id contains the word "link". If that's not going to work, then we will need to see some actual HTML code.

Edited by GEOSoft

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"

AuToItItAlIaNlOv3R

I have this source code of a web page:

<HTML><HEAD><TITLE>Insert image</TITLE><LINK href="www.imageshack.us" rel="SHORTCUT ICON">

<script language=javascript src="Footer.js" type=text/javascript></SCRIPT>

<LINK title="Default Theme" media=all href="style.css" type=text/css rel=stylesheet></HEAD>

<BODY onload=addFooter();>

<CENTER>

<DIV id=container><IMG style="WIDTH: 342px; HEIGHT: 38px" alt="" src="http://img136.imageshack.us/img/img.png"> </DIV>

<DIV id=container>

<META content=reeky name=copyright>

<STYLE type=text/css>

/*<![CDATA[*/

body,input{font: small Verdana, Geneva, Arial, Helvetica, sans-serif;}

a{color:#00C;}

h2{margin-bottom:0;}

/*]]>*/

</STYLE>

<CENTER>

<H2><FONT color=#ffffff>Official Web Site</FONT> </H2><BR><BR><FONT color=#ff0000>

<DIV id=premium_link>http://www.EXAMPLELINK.EXAMPLE

</DIV></FONT></CENTER><BR>

<DIV id=container>

<DIV style="TEXT-ALIGN: center">

<FORM id=a name=a method=post><B><FONT color=#ffffff>Link: (<FONT color=#ffffff>Insert Image Path</FONT>)</B><BR></FONT><BR><INPUT size=50 value=" " name=link> <INPUT type=submit value=Go!> </FORM></DIV></DIV></CENTER></DIV></BODY></HTML>

I would like to copy the string in bold (http://www.EXAMPLELINK.EXAMPLE) into another .txt file.

How can I do that?

Edited by AuToItItAlIaNlOv3R

Authenticity

"(?i)<div[^>]*+>([^\r\n<]*+)"

This is not a task for StringRegExp alone. You first need to enumerate the document's <DIV> tags and check each one with _IEPropertyGet($oDiv, 'innerText') or 'outerText', if I'm not wrong, and then use StringRegExp to match the link pattern.
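
A rough sketch of that DOM-based approach, assuming Internet Explorer is available and the page has been saved with an .htm extension so that IE renders it as HTML (the path and the output file name are hypothetical):

```autoit
#include <IE.au3>

; Assumption: the saved page lives at this (hypothetical) path
Local $oIE = _IECreate(@DesktopDir & "\sourcecode.htm", 0, 0) ; do not attach, hidden window

; Enumerate every <DIV> element in the loaded document
Local $oDivs = _IETagNameGetCollection($oIE, "div")
For $oDiv In $oDivs
    ; Keep only the DIVs whose id contains "link", e.g. premium_link
    If StringInStr($oDiv.id, "link") Then
        FileWriteLine("myfile.txt", StringStripWS(_IEPropertyGet($oDiv, "innertext"), 3))
    EndIf
Next

_IEQuit($oIE)
```

Filtering on the id here is a simplification; a StringRegExp check on the returned text could replace it if the ids are not reliable.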

GEOSoft

Change the regexp I gave you to

$aStr = StringRegExp($sStr, "(?i)(?s)<div\sid=.*?link.*?>\v?(.*?)\v?</.+>", 3)

I should have added that you change the MsgBox() to FileWriteLine("myfile.txt", $aStr[$i])
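
As a possible alternative, here is a sketch with a stricter pattern that uses negated character classes instead of .*? so that a match cannot run past the end of the tag. The sample string is a shortened, hypothetical stand-in for the posted page source, and the pattern is an assumption on my part, not the one tested above:

```autoit
; Hypothetical sample input, shortened from the page source posted earlier
Local $sStr = '<DIV id=container><IMG alt="" src="img.png"> </DIV>' & @CRLF & _
        '<DIV id=premium_link>http://www.EXAMPLELINK.EXAMPLE' & @CRLF & _
        '</DIV></FONT></CENTER><BR>'

; [^>]* cannot cross a ">", so "link" must appear inside the opening DIV tag itself.
; [^<]*? captures the text up to the closing </div>, without surrounding whitespace.
Local $aStr = StringRegExp($sStr, '(?i)<div\s+id=[^>]*link[^>]*>\s*([^<]*?)\s*</div>', 3)
If Not @error Then
    For $i = 0 To UBound($aStr) - 1
        FileWriteLine("myfile.txt", $aStr[$i])
    Next
EndIf
```

With this input, the first DIV (id=container) should be skipped and only the text inside premium_link written out.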

Edited by GEOSoft

AuToItItAlIaNlOv3R

@GEOSoft, this code doesn't work:

$sStr = FileRead(@DesktopDir & "\sourcecode.txt")
$aStr = StringRegExp($sStr, "(?i)(?s)<div\sid=.*?link.*?>\v?(.*?)\v?</.+>", 3)
For $i = 0 To UBound($aStr) - 1
    FileWriteLine("myfile.txt", $aStr[$i])
Next

myfile.txt is empty!

How can I make this script work?

GEOSoft

I tested it using the HTML that you posted and it worked fine.

Replace the FileWriteLine line with this, just to see what it returns.

MsgBox(0, "TEST RESULTS", "Error = " & @Error & @CRLF & $aStr[$i])
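
If the output is still empty, it may help to check @error right after each call rather than inside the loop. A minimal sketch, reusing the path and pattern from the posts above:

```autoit
Local $sStr = FileRead(@DesktopDir & "\sourcecode.txt")
If @error Then
    ; FileRead sets @error when the file cannot be opened or read
    MsgBox(0, "TEST RESULTS", "FileRead failed, @error = " & @error)
    Exit
EndIf

Local $aStr = StringRegExp($sStr, "(?i)(?s)<div\sid=.*?link.*?>\v?(.*?)\v?</.+>", 3)
If @error Then
    ; With flag 3, @error = 1 means no match and @error = 2 means a bad pattern
    MsgBox(0, "TEST RESULTS", "StringRegExp failed, @error = " & @error)
    Exit
EndIf

For $i = 0 To UBound($aStr) - 1
    MsgBox(0, "TEST RESULTS", "Match " & $i & ":" & @CRLF & $aStr[$i])
Next
```

This separates a wrong file path from a pattern that simply does not match the file's contents.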
