
Copy text between two HTML tags using website source code


orichec

Hello,

I have been using AutoIt for a long time, but I have been unable to write a script that copies everything between two HTML tags (from "<div class="grid_28 mgtop_30">" to "<div id="specifications"). I'm trying to copy the description of the free or trial version of a piece of software from the Softpedia website.

My starting point: <div class="grid_28 mgtop_30">

My ending point: <div id="specifications"

My script should first find the starting point in the opened source code of the web page, then select the data between my starting and ending points. After selecting this data, the script should copy only that text.

I tried to write the following code, but it does not work for me:

<snip>

Please help me make this possible. I have also attached a picture of the website's source code, with the text I want to copy highlighted:

<snip>

Edited by JLogan3o13

Hello orichec,

AutoIt can get the source code of a web page through an embedded Internet Explorer instance, so there is no need to copy and paste it from Notepad :)

There were also two mistakes in your script:

1°) 

; Set UNIQUE start and end points for your extract
$sStart = "<div class="grid_28 mgtop_30">"
$sEnd = "<div id="specifications"

Since your strings contain double quotes ("), you need to surround them with single quotes ('); otherwise AutoIt will think the string is only "<div class="!

2°) If you look up the _StringBetween function in the AutoIt help file, you will see that it returns an array, not a string. So to display the result, you need to do this:

MsgBox(0, "Extract", $sExtract[0])

or this

_arraydisplay($sExtract)
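To see both corrections together, here is a minimal sketch. The sample HTML string and the variable names ($sHtml, $aItems) are made up for illustration and are not from your script. It also shows a second way to write a double quote inside a double-quoted string, by doubling it.

```autoit
#include <String.au3>

; Fix 1: single quotes outside, double quotes inside
Local $sStart = '<div class="grid_28 mgtop_30">'
; Alternative: keep double quotes outside and double each inner quote
Local $sStartAlt = "<div class=""grid_28 mgtop_30"">"
; == is a case-sensitive string comparison, so this checks both spellings give the same string
ConsoleWrite("Same string: " & ($sStart == $sStartAlt) & @CRLF)

; Fix 2: _StringBetween returns an array, so check @error and loop over the results
; $sHtml is a made-up sample, not the real page
Local $sHtml = '<ul><li>one</li><li>two</li></ul>'
Local $aItems = _StringBetween($sHtml, '<li>', '</li>')
If @error Then
    ConsoleWrite("No text found between the two markers." & @CRLF)
Else
    For $i = 0 To UBound($aItems) - 1
        ConsoleWrite($i & ": " & $aItems[$i] & @CRLF)
    Next
EndIf
```

Checking @error first matters because indexing the result with [0] fails when nothing was found between the markers.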

Try this out; it works for me :) Note that I changed '<div id="specifications' to '<div class="_tabpage tabpage hidden legible specifications" id="specifications">' because that is what Internet Explorer returns, unlike Firefox!

This is because _IEDocReadHTML gives you the page as Internet Explorer has parsed and rewritten it, not the raw text sent by the server (the PHP itself runs on the server and never reaches the browser). So what you read in Firefox's source view won't necessarily match what Internet Explorer or Chrome give you; the order of attributes, for example, can differ :)
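If you don't want the script to depend on the exact attribute order of the end tag, one possible workaround (my suggestion, not something your original script used) is a regular expression that accepts any attributes before id="specifications". The $sSource string below is a made-up sample standing in for the real page source.

```autoit
; Made-up sample standing in for the real page source
Local $sSource = '<div class="grid_28 mgtop_30">Description text</div>' & _
        '<div class="tabpage" id="specifications">specs</div>'

; (?s) lets . match line breaks; (.*?) captures as little as possible;
; <div[^>]* accepts any attributes that come before id="specifications"
Local $sPattern = '(?s)<div class="grid_28 mgtop_30">(.*?)<div[^>]*id="specifications"'

; Flag 1 returns an array of the captured groups and sets @error when there is no match
Local $aMatch = StringRegExp($sSource, $sPattern, 1)
If @error Then
    ConsoleWrite("Markers not found" & @CRLF)
Else
    ConsoleWrite($aMatch[0] & @CRLF)
EndIf
```

This still assumes the start marker is written exactly as shown, so it is only a partial safeguard.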

#include <IE.au3>
#include <String.au3>
#include <Array.au3>

; Open the page in a hidden Internet Explorer window (no attach attempt, not visible)
$IE = _IECreate("http://www.softpedia.com/get/Tweak/System-Tweak/Edge-Blocker.shtml", 0, 0)

; Read the page HTML as Internet Explorer returns it
$source = _IEDocReadHTML($IE)

; Store the source in a text file for easy reading and debugging
; (mode 10 = 2 overwrite + 8 create the directory if needed)
$file = FileOpen(@ScriptDir & "\source.txt", 10)
FileWrite($file, $source)
FileClose($file)

; Extract the source between <div class="grid_28 mgtop_30"> and <div class="_tabpage tabpage hidden legible specifications" id="specifications">
$target_source = _StringBetween($source, '<div class="grid_28 mgtop_30">', '<div class="_tabpage tabpage hidden legible specifications" id="specifications">')

; _StringBetween returns an array and sets @error when nothing matches
If @error Then
    MsgBox(0, "Extract", "Start or end marker not found.")
Else
    MsgBox(0, "Extract", $target_source[0])
EndIf

_IEQuit($IE)
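Since you asked to copy only the text, here is a rough sketch of one way to do that once you have the extracted fragment. The $sFragment value is a made-up placeholder standing in for $target_source[0], and stripping tags with a regular expression is a simplification that I am assuming is good enough for a plain description block.

```autoit
; Made-up placeholder standing in for $target_source[0] from the script above
Local $sFragment = '<p>Some <b>description</b> text.</p>'

; Remove anything that looks like an HTML tag, then trim whitespace at both ends
; (StringStripWS flag 3 = 1 leading + 2 trailing)
Local $sText = StringStripWS(StringRegExpReplace($sFragment, '<[^>]*>', ''), 3)

; Put the plain text on the clipboard; ClipPut returns 1 on success
If ClipPut($sText) Then
    ConsoleWrite("Copied: " & $sText & @CRLF)
Else
    ConsoleWrite("Could not write to the clipboard" & @CRLF)
EndIf
```

HTML entities such as &amp; are left as they are by this sketch, so the result may still need some cleanup.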

 


Hey, I'm totally confused by this code because I have never worked with AutoIt's Internet Explorer functions. Could you please modify my code, explain what to do with it and how, and also share a working .au3 file? Please...


  • Moderators

@orichec You obviously missed the part in the forum rules about not reposting the same question with a change in wording. I highly suggest you read the rules and adhere to them; our patience has its limits.



This topic is now closed to further replies.