arunachandu Posted May 17, 2011

Hi,

I am trying to extract the text that is placed between HTML tags, for example:

<title> eGrabber - Prospects lists lead generation software | business lead generation program | address | list | email processing </title>

The code I wrote is:

$title=_StringBetween($html,'<title>','</title>')
MsgBox(0,"title",$title)

I also tried a regular expression:

$title= StringRegExp($html, '<title>(.*?)</title>',3)

But neither approach worked. Can someone help me extract the text between the tags?

Thanks
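A note on the regular-expression attempt above: StringRegExp with flag 3 returns an array rather than a string, and "." does not match line breaks unless (?s) is set, so a title that spans several lines would not match. The following is only a minimal sketch of one way around both points; the sample $html string is made up for illustration, since the thread does not show where the real page source comes from.

```autoit
; Hypothetical sample input (an assumption); the real $html comes from elsewhere.
Local $html = "<html><head><title>" & @CRLF & _
        "eGrabber - Prospects lists lead generation software" & @CRLF & _
        "</title></head></html>"

; (?s) lets "." match line breaks, (?i) ignores the case of the tag names.
; Flag 1 returns an array holding the capture groups of the first match.
Local $aMatch = StringRegExp($html, '(?is)<title>(.*?)</title>', 1)

If @error Then
    MsgBox(0, "title", "No <title> element found")
Else
    ; Flag 3 for StringStripWS = strip leading (1) + trailing (2) whitespace.
    MsgBox(0, "title", StringStripWS($aMatch[0], 3))
EndIf
```

Checking @error before indexing the result avoids an array-subscript crash when the page has no title.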
arunachandu Posted May 17, 2011

Hey, I was able to extract the string between the tags after modifying the code a little:

$title=_StringBetween($html,'<title>','</title>')
$page_title=$title[0]
MsgBox(0,"title",$page_title)

The issue now is that it extracts the line numbers as well. For example, given:

5 <title>
5 eGrabber - Prospects lists lead generation software | business lead generation program | address | list
5 | email processing
5 </title>

the result displayed is:

5 eGrabber - Prospects lists lead generation software | business lead generation program | address | list
5 | email processing

How do I remove the line numbers from the text?

Thanks
sleepydvdr Posted May 17, 2011

Good to see you figured it out. Look up StringTrimLeft in the help file.

#include <ByteMe.au3>
arunachandu Posted May 17, 2011

Hi,

I tried trimming the line numbers using that function. This is the code:

$title=_StringBetween($html,'<title>','</title>')
$page_title=$title[0]
$Final_Title=StringTrimLeft($page_title, 6)
$Final_Title_1 =StringTrimRight($Final_Title,10)
MsgBox(0,"title",$Final_Title_1)

But here I have guessed and hardcoded the trim lengths. Is there a function that will work out the range automatically and trim it?

Thanks
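One way to avoid hardcoded trim lengths is to describe the unwanted prefix with a pattern instead of a character count. The sketch below assumes that every line of the extracted text starts with a number followed by whitespace, as in the example earlier in the thread; the sample string and the variable names $sRaw and $sClean are made up for illustration. Note that it would also strip a genuine leading number from a title such as "5 ways to ...", so it is a workaround rather than a general fix.

```autoit
; Hypothetical sample of what $title[0] might contain (an assumption).
Local $sRaw = @CRLF & "5 eGrabber - Prospects lists | address | list" & @CRLF & _
        "5 | email processing" & @CRLF & "5 "

; (?m) makes ^ match at the start of every line; \h is horizontal whitespace.
; Remove "optional spaces, digits, spaces" from the start of each line.
Local $sClean = StringRegExpReplace($sRaw, '(?m)^\h*\d+\h+', '')

; Collapse every run of whitespace (including line breaks) into one space,
; then strip leading (1) and trailing (2) whitespace with flag 3.
$sClean = StringRegExpReplace($sClean, '\s+', ' ')
$sClean = StringStripWS($sClean, 3)

MsgBox(0, "title", $sClean)
```

If the numbers are being added by whatever produces $html, fetching the source without them is likely the cleaner fix.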
somdcomputerguy Posted May 17, 2011

I would think you would only need StringTrimLeft($page_title, 2), and no StringTrimRight. But you shouldn't need StringTrimLeft either. What is $html, and where and how do you get it? I've added an example. No line numbers show up.

#Include <String.au3>
#Include <INET.au3>
$html = _StringBetween(_INetGetSource('http://somdcomputerguy.com'), '<title>', '</title>')
MsgBox(0, "title", $html[0])

- Bruce /*somdcomputerguy */ If you change the way you look at things, the things you look at change.
somdcomputerguy Posted May 17, 2011

This works for me.

#Include <String.au3>
$html = "<title>eGrabber - Prospects lists lead generation software | business lead generation program | address | list| email processing</title>"
$title=_StringBetween($html, '<title>', '</title>')
MsgBox(0, "title", $title[0])

- Bruce
Nina Posted May 20, 2020

Hi,

The code below works, but when I replace <title> with <h3 class = ...> it doesn't work:

#Include <String.au3>
#Include <INET.au3>
$html = _StringBetween(_INetGetSource('http://somdcomputerguy.com'), '<title>', '</title>')
MsgBox(0, "title", $html[0])
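One possible cause, offered as a guess since the actual page and tag are not shown: _StringBetween looks for the start string literally, so the spacing, quoting, and attribute order in '<h3 class = ...>' must match the page source exactly. A regular expression that accepts any attributes on the opening tag is more forgiving. The sketch below uses a made-up sample string and made-up variable names ($sHtml, $aHeadings); for a real page, $sHtml would come from _INetGetSource as in the code above.

```autoit
; Hypothetical sample input (an assumption, not taken from a real page).
Local $sHtml = '<h3 class="post-title">First heading</h3>' & @CRLF & _
        '<h3 class = "post-title" id="x">Second heading</h3>'

; <h3\b[^>]*> matches an opening h3 tag with any attributes.
; (?i) ignores case, (?s) lets "." match line breaks.
; Flag 3 returns an array with the captured text of every match.
Local $aHeadings = StringRegExp($sHtml, '(?is)<h3\b[^>]*>(.*?)</h3>', 3)

If @error Then
    ConsoleWrite("No <h3> elements found" & @CRLF)
Else
    For $i = 0 To UBound($aHeadings) - 1
        ConsoleWrite($aHeadings[$i] & @CRLF)
    Next
EndIf
```

Regular expressions handle simple cases like this one, but nested or malformed HTML can defeat them, so treat the pattern as a starting point.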