arunachandu (posted May 17, 2011)

Hi,

I am trying to extract the text placed between HTML tags, for example:

    <title> eGrabber - Prospects lists lead generation software | business lead generation program | address | list | email processing </title>

The code I wrote is:

    $title=_StringBetween($html,'<title>','</title>')
    MsgBox(0,"title",$title)

I also tried a regular expression:

    $title= StringRegExp($html, '<title>(.*?)</title>',3)

Neither worked. Can someone help me extract the text between the tags?

Thanks
arunachandu (posted May 17, 2011)

Hey, I was able to extract the string between the tags after a small modification:

    $title=_StringBetween($html,'<title>','</title>')
    $page_title=$title[0]
    MsgBox(0,"title",$page_title)

The issue now is that it extracts the line numbers as well. For example, given:

    5 <title>
    5 eGrabber - Prospects lists lead generation software | business lead generation program | address | list
    5 | email processing
    5 </title>

the result displayed is:

    5 eGrabber - Prospects lists lead generation software | business lead generation program | address | list
    5 | email processing

How do I remove the line numbers from the text?

Thanks
sleepydvdr (posted May 17, 2011)

Good to see you figured it out. Look up StringTrimLeft in the help file.
arunachandu (posted May 17, 2011)

Hi,

I tried trimming the line numbers with that function. Here is the code:

    $title=_StringBetween($html,'<title>','</title>')
    $page_title=$title[0]
    $Final_Title=StringTrimLeft($page_title, 6)
    $Final_Title_1 =StringTrimRight($Final_Title,10)
    MsgBox(0,"title",$Final_Title_1)

But here I have guessed and hardcoded the number of characters to trim. Is there a function that works out the range automatically and trims it?

Thanks
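One way to avoid hardcoded trim lengths is to strip the leading numbers with a regular expression. The following is only a sketch: it assumes the line numbers really are part of `$html` (the sample string below is made up to mirror the example in this thread), and it would also remove a genuine number that happens to start a line of the title.

```autoit
#include <String.au3>

; Hypothetical sample: every line starts with a line number, as in the example above.
Local $sHtml = "5 <title>" & @CRLF & _
        "5 eGrabber - Prospects lists lead generation software | list" & @CRLF & _
        "5 | email processing" & @CRLF & _
        "5 </title>"

Local $aTitle = _StringBetween($sHtml, '<title>', '</title>')
If @error Then
    MsgBox(0, "title", "No <title> element found")
    Exit
EndIf

; Remove digits (and the spaces around them) at the start of the string or right after a line break.
; $1 puts back the line break that was matched, if any.
Local $sTitle = StringRegExpReplace($aTitle[0], '(^|[\r\n])[ \t]*\d+[ \t]+', '$1')

; Collapse line breaks and repeated spaces into single spaces, then trim both ends (flag 3 = 1 + 2).
$sTitle = StringStripWS(StringRegExpReplace($sTitle, '\s+', ' '), 3)

MsgBox(0, "title", $sTitle)
```

Because the pattern is anchored to the start of each line, it does not depend on how many digits the line numbers have, which is what the fixed counts passed to StringTrimLeft and StringTrimRight could not handle.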
somdcomputerguy (posted May 17, 2011)

I would think you would only need StringTrimLeft($page_title, 2) and no StringTrimRight. But you shouldn't need StringTrimLeft either. What is $html, and where/how do you get it? I've added an example. No line numbers show up.

    #Include <String.au3>
    #Include <INET.au3>
    $html = _StringBetween(_INetGetSource('http://somdcomputerguy.com'), '<title>', '</title>')
    MsgBox(0, "title", $html[0])

- Bruce /*somdcomputerguy */ If you change the way you look at things, the things you look at change.
somdcomputerguy (posted May 17, 2011)

This works for me.

    #Include <String.au3>
    $html = "<title>eGrabber - Prospects lists lead generation software | business lead generation program | address | list| email processing</title>"
    $title=_StringBetween($html, '<title>', '</title>')
    MsgBox(0, "title", $title[0])

- Bruce
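For comparison, here is a hedged sketch of the StringRegExp route from the first post. Two things may explain why that attempt showed nothing: by default `.` does not match line breaks, so a title spread over several lines is not matched without `(?s)`, and StringRegExp with flag 1 or 3 returns an array, which cannot be passed to MsgBox directly. The input string below is a made-up multi-line example, not the actual page.

```autoit
; Hypothetical multi-line input, loosely modelled on the title in the first post.
Local $sHtml = "<title>" & @CRLF & _
        "eGrabber - Prospects lists lead generation software" & @CRLF & _
        "| email processing" & @CRLF & _
        "</title>"

; (?i) ignores case and (?s) lets . match line breaks.
; Flag 1 returns an array holding the captured group of the first match.
Local $aMatch = StringRegExp($sHtml, '(?is)<title>(.*?)</title>', 1)
If @error Then
    MsgBox(0, "title", "No match")
Else
    ; Index the array, then trim leading and trailing whitespace (flag 3 = 1 + 2).
    MsgBox(0, "title", StringStripWS($aMatch[0], 3))
EndIf
```

Checking `@error` before indexing avoids a subscript error when the page has no title element.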
Nina (posted May 20, 2020)

Hi, the code below works, but when I replace <title> with <h3 class = ...> it doesn't work.

    #Include <String.au3>
    #Include <INET.au3>
    $html = _StringBetween(_INetGetSource('http://somdcomputerguy.com'), '<title>', '</title>')
    MsgBox(0, "title", $html[0])
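A possible explanation, offered as a guess since the actual page is not shown: _StringBetween compares the start string literally, so '<h3 class = ...>' only matches if the page contains exactly that text, including spacing, quotes and any further attributes. A regular expression that accepts any attributes in the opening tag is one alternative. In the sketch below the markup and the class name `post-title` are invented for illustration.

```autoit
; Hypothetical markup; the class name and id are made up.
Local $sHtml = '<h3 class="post-title">First heading</h3>' & @CRLF & _
        '<h3 class = "post-title" id="x">Second heading</h3>'

; <h3\b[^>]*> matches the opening tag whatever attributes it carries.
; Flag 3 returns an array with the captured text of every match.
Local $aHeadings = StringRegExp($sHtml, '(?is)<h3\b[^>]*>(.*?)</h3>', 3)
If @error Then
    MsgBox(0, "h3", "No <h3> elements found")
Else
    For $i = 0 To UBound($aHeadings) - 1
        MsgBox(0, "h3 #" & ($i + 1), StringStripWS($aHeadings[$i], 3))
    Next
EndIf
```

Regular expressions are fragile against real-world HTML (nested tags, comments, unusual quoting), so this is best treated as a quick extraction aid rather than a general parser.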