orichec

Copy text between two HTML tags using website source code

5 posts in this topic

#1 ·  Posted (edited)

Hello,

I've been using AutoIt for a very long time, but I am unable to write a script that copies everything between two HTML tags (from '<div class="grid_28 mgtop_30">' to '<div id="specifications"'). I'm trying to copy the description of the free or trial version of a piece of software from the Softpedia website.

My starting Point=<div class="grid_28 mgtop_30">

My ending Point=<div id="specifications"

My script should first find the starting point in the opened source code of the web page, then select the data between my starting and ending points. After selecting this data, the script should copy that text only.

I tried to write the following code, but it does not work for me:

<snip>

Please help me make this possible. I have also attached a picture of the page source, with the text I want to copy highlighted:

<snip>

Edited by JLogan3o13




Hello orichec,

AutoIt can get the source code of a web page by automating Internet Explorer, so there is no need to copy and paste it from Notepad :)

Also, there were 2 mistakes in your script:

1°) 

; Set UNIQUE start and end points for your extract
$sStart = "<div class="grid_28 mgtop_30">"
$sEnd = "<div id="specifications"

Since your strings contain double quote characters ("), you need to surround them with single quotes ('). Otherwise AutoIt will think the string is only "<div class=" and stop there!

2°) If you look up the _StringBetween function in the AutoIt help file, you will see that it returns an array and not a string. So to display the result, you need to do this:
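To make the quoting rule concrete, here is a minimal sketch of the two ways AutoIt lets you write a string that contains double quotes. The variable name $sStartAlt is only illustrative and is not from the original script.

```autoit
; Option 1: wrap the string in single quotes so the inner double quotes are literal.
Local $sStart = '<div class="grid_28 mgtop_30">'

; Option 2: keep double quotes outside and double each inner double quote.
Local $sStartAlt = "<div class=""grid_28 mgtop_30"">"

; Both spellings produce the same string; == is a case-sensitive comparison.
ConsoleWrite(($sStart == $sStartAlt) & @CRLF)
```

Single quotes are usually easier to read when the text is HTML, since HTML attributes are normally written with double quotes.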

MsgBox(0, "Extract", $sExtract[0])

or this

_arraydisplay($sExtract)
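As a small illustration of the array return value, the sketch below runs _StringBetween on a made-up HTML string (the sample text is an assumption, not taken from the Softpedia page) and checks @error before touching the array, since the function sets @error when nothing is found.

```autoit
#include <String.au3>

; Made-up sample input with two matches.
Local $sHtml = '<p>one</p><p>two</p>'

; _StringBetween returns an array holding every match, not a single string.
Local $aMatches = _StringBetween($sHtml, '<p>', '</p>')

If @error Then
    ; No match: the return value is not an array, so do not index it.
    ConsoleWrite("No match found" & @CRLF)
Else
    For $i = 0 To UBound($aMatches) - 1
        ConsoleWrite($i & ": " & $aMatches[$i] & @CRLF)
    Next
EndIf
```

Checking @error first avoids an array subscript error when the start or end marker is missing from the page.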

Try this out, it's working for me :) Note that I changed '<div id="specifications' to '<div class="_tabpage tabpage hidden legible specifications" id="specifications">' because that is what Internet Explorer returns, unlike Firefox!

This is because _IEDocReadHTML returns the page as Internet Explorer has parsed and rebuilt it, not the raw text sent by the server (PHP runs on the server, not in the browser). So the attribute order and quoting you read in Firefox's or Chrome's source view can differ from what Internet Explorer gives you :)
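A side note before the script: since the exact text of the closing marker depends on the browser, one way to make it less brittle is to search only for the id attribute and then back up to the tag that carries it. The following is a minimal sketch and not part of the original reply. _ExtractUntilId is a hypothetical helper name, and it assumes the id is written with double quotes and sits on a <div> tag.

```autoit
; Hypothetical helper: return the text after $sStart and before the <div> whose id is $sId.
Func _ExtractUntilId($sHtml, $sStart, $sId)
    Local $iStart = StringInStr($sHtml, $sStart)
    If $iStart = 0 Then Return SetError(1, 0, "") ; start marker not found
    $iStart += StringLen($sStart) ; first character after the start marker

    ; Look for the id attribute only, whatever other attributes surround it.
    Local $iId = StringInStr($sHtml, 'id="' & $sId & '"', 0, 1, $iStart)
    If $iId = 0 Then Return SetError(2, 0, "") ; id not found

    ; Back up to the last "<div" that appears before the id attribute.
    Local $iTag = StringInStr(StringLeft($sHtml, $iId - 1), "<div", 0, -1)
    If $iTag < $iStart Then Return SetError(3, 0, "") ; no opening tag in range

    Return StringMid($sHtml, $iStart, $iTag - $iStart)
EndFunc

; Made-up sample input for illustration.
Local $sSample = 'A<div class="grid_28 mgtop_30">TEXT<div class="x" id="specifications">'
ConsoleWrite(_ExtractUntilId($sSample, '<div class="grid_28 mgtop_30">', "specifications") & @CRLF)
```

The script announced above, which uses the full Internet Explorer tag text with _StringBetween, follows.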

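#include <IE.au3>
#include <String.au3>
#include <Array.au3>

; Get the page source and store it in a text file for easy reading and debugging.
; Mode 10 = 2 (overwrite) + 8 (create the directory if it does not exist).
$file = FileOpen(@ScriptDir & "\source.txt", 10)

; Second parameter 0 = do not attach to an existing window, third parameter 0 = keep the browser hidden.
$IE = _IECreate("http://www.softpedia.com/get/Tweak/System-Tweak/Edge-Blocker.shtml", 0, 0)

$source = _IEDocReadHTML($IE)
FileWrite($file, $source)

; Extract the source code between <div class="grid_28 mgtop_30"> and <div class="_tabpage tabpage hidden legible specifications" id="specifications">
$target_source = _StringBetween($source, '<div class="grid_28 mgtop_30">', '<div class="_tabpage tabpage hidden legible specifications" id="specifications">')

; _StringBetween sets @error when nothing is found, so check before reading the array.
If @error Then
    MsgBox(0, "Extract", "Start or end marker not found in the page source.")
Else
    MsgBox(0, "Extract", $target_source[0])
EndIf

_IEQuit($IE)
FileClose($file)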
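As a variation on the script above, the extracted text could be written to a file instead of being shown in a message box. This is a minimal sketch, not part of the original reply: _SaveExtract and the file name extract.txt are hypothetical, and the sample string stands in for $target_source[0].

```autoit
#include <FileConstants.au3>

; Hypothetical helper: write $sText to $sPath, replacing any existing file.
Func _SaveExtract($sPath, $sText)
    ; $FO_OVERWRITE + $FO_CREATEPATH is the same mode 10 used for source.txt.
    Local $hFile = FileOpen($sPath, $FO_OVERWRITE + $FO_CREATEPATH)
    If $hFile = -1 Then Return SetError(1, 0, False) ; file could not be opened
    FileWrite($hFile, $sText)
    FileClose($hFile)
    Return True
EndFunc

; In the real script, pass $target_source[0] instead of the sample text.
_SaveExtract(@ScriptDir & "\extract.txt", "sample extracted text")
```

Note that the saved text would still contain the HTML tags found between the two markers, since _StringBetween returns the raw markup.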

 


Hey, I'm totally confused by this code because I have never worked with AutoIt's Internet Explorer functions. Could you please modify my code, explain what to do with it and how, and also share a working .au3 file? Please...


Hello Neutro,

Thank you so much, it's working for me now, but it only shows the information in a message box. I want to save that information as a text file, so please show me how to code this... waiting for your kind response...


@orichec You obviously missed the part in the forum rules about not reposting the same question with a change in wording. I highly suggest you read the rules and adhere to them, our patience has its limits.


√-1 2^3 ∑ π, and it was delicious!

This topic is now closed to further replies.