StringRegExp capture multiple strings between 2 constants


Kyan

Hi everyone :)

I've been trying for more than an hour to get this regular expression right. What I want to do is grab multiple strings inside a table/div on a webpage. The problem is getting the pattern to match repeatedly, capturing every occurrence between two constants (in the example, the opening table tag quoted below and '</table>').

In this example I'm trying to get all the topic names listed on the first page of "General Help and Support" on the AutoIt forums, but limiting the search/capture to the text between '<table class="ipb_table topic_list hover_rows " summary="Topics In This Forum &quot;General Help and Support&quot;" id="forum_table">' and '</table>'

(I don't know how to paste this tidily; if someone experienced in the matter can share the secret, I'll be grateful :) )

#include <Array.au3>
$pg = InetRead("http://www.autoitscript.com/forum/forum/2-general-help-and-support/",1)
If $pg <> '' Then
$exp = '(?i)<table class="ipb_table topic_list hover_rows " summary="Topics In This Forum &quot;General Help and Support&quot;" id="forum_table">.*?'& _
'(?:<a itemprop="url" id=".*?" href=".*?" title="(.*?) - started .*?" class="topic_title">)*?.*?</table>'
$aTopics = StringRegExp(BinaryToString($pg),String($exp),3)
ConsoleWrite(@error&@LF)
_ArrayDisplay($aTopics)
Else
ConsoleWrite("Cannot DL the page"&@LF)
EndIf
Exit

EDIT: Code updated; I forgot to apply 'BinaryToString' to the $pg variable.

Edited by DiOgO

Heroes, there is no such thing

One day I'll discover what IE.au3 has of special for so many users using it.
C'mon there's InetRead and WinHTTP, way better

Bump

Isn't it possible to do it with StringRegExp?

Try this:

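#include <Array.au3>

; Download the page; InetRead returns binary data and sets @error on failure.
$bRead = InetRead("http://www.autoitscript.com/forum/forum/2-general-help-and-support/", 1)
If @error Then
    MsgBox(0, "Error", "Can't download the page")
    Exit
EndIf

$sSource = BinaryToString($bRead)

; Capture the topic name from each link title, up to " - started".
$exp = "<a itemprop=.*title='(.*?) - started"

; Flag 3 returns an array of all matches.
$aTopics = StringRegExp($sSource, $exp, 3)
_ArrayDisplay($aTopics)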

By the way, you have to wait at least 24 hours before bumping a topic ;)

To paste tidy code, just untoggle the editing mode ;)

Hi!

Edited by Nessie

My UDF: NetInfo UDF Play with your network, check your download/upload speed and much more! YTAPI Easy to use YouTube API, now you can easily retrieve all needed info from a video. NavInfo Check if a specific browser is installed and retrieve other useful information. YWeather Easy to use Yahoo Weather API, now you can easily retrieve details about the weather in a specific region. No-IP UDF Easily update your no-ip hostname(s).

My Script: Wallpaper Changer Change your wallpaper dynamically, you can also download your wallpaper from your website and share it with all!   My Snippet: _ImageSaveToBMP Convert an image to bmp format. _SciteGOTO Open a file in SciTE at specific fileline. _FileToHex Show the hex code of a specified file

Doing it that way works, but I really want to limit my StringRegExp matches to the text between

<table class="ipb_table topic_list hover_rows " summary="Topics In This Forum &quot;General Help and Support&quot;" id="forum_table">

and

</table>

in order to get the correct text, since there are 2 tables with different names but the items' class/id are the same.

I didn't understand 'untoggle the editing mode'. Can you explain? Are you talking about Full Editor mode?

Sorry, I forgot about that part :s. Yesterday I got to bed late; tonight it must be different :ermm:
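
On limiting the matches to one table with a single pattern: a likely reason the original attempt returns at most one string is that a capturing group placed inside a repeated group keeps only the text from its last repetition, so one pass over "start constant, repeated item, end constant" cannot hand back every item. Also, "." does not match line breaks unless (?s) is set, so ".*?" will not span the lines of a real page. Below is a minimal sketch of the first point; the sample string and variable names are made up for illustration and are not taken from the forum page.

```autoit
#include <Array.au3>

; Hypothetical miniature of the problem: three items between two constants.
Local $sText = '<t>a=1;a=2;a=3;</t>'

; One pattern that repeats the capturing group between the constants.
; The match succeeds, but the group is overwritten on every repetition,
; so the result array has a single element holding only the last item.
Local $aOnePass = StringRegExp($sText, '<t>(?:a=(\d);)*</t>', 3)
_ArrayDisplay($aOnePass, "Repeated group")

; The bare item pattern, matched globally, returns every item instead.
Local $aGlobal = StringRegExp($sText, 'a=(\d);', 3)
_ArrayDisplay($aGlobal, "Global match")
```

This is why the short pattern with flag 3 returns all the titles while the long one does not: flag 3 repeats the whole match, not a group inside it.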

Posted Image

Just disable this and your code will be fine.

For the regex, give me a minute while I look at the page source ;)

Hi!

test:

#include <Array.au3>
$pg = InetRead("http://www.autoitscript.com/forum/forum/2-general-help-and-support/", 1)
If $pg <> '' Then
    $exp = '(?i)<table class="ipb_table topic_list hover_rows " summary="Topics In This Forum &quot;General Help and Support&quot;" id="forum_table">.*?' & _
            '(?:<a itemprop="url" id=".*?" href=".*?" title="(.*?) - started .*?" class="topic_title">)*?.*?</table>'
    $aTopics = StringRegExp(BinaryToString($pg), String($exp), 3)
    ConsoleWrite(@error & @LF)
    _ArrayDisplay($aTopics)
Else
    ConsoleWrite("Cannot DL the page" & @LF)
EndIf
Exit

okay :D

EDIT: It works, now the AutoIt code is tidy. Thank you, Nessie :)

Glad to help. By the way, I don't see a duplicate table in the source of General Help and Support, so why overcomplicate the regex?

Hi!

Not in the source of General Help and Support, but there is one here: https://itunes.apple.com/us/album/same-trailer-different-park/id604129427 :)

So you need a regex to grab the album tracks from iTunes? I really don't understand how we ended up talking about iTunes :D

Hi!

Yup, but there are 2 tables with the same item ID/class name.

Do you only need the name?

Hi!

Could be, but for the rest the SRE is similar; I don't want to give you much work :)

So you didn't manage to do it with regular expressions :(

Seems I need to go back to the old StringMid + StringRegExp :s
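
For what it's worth, the "cut out the block first, then match inside it" idea can also be done with two StringRegExp calls instead of StringMid. The following is only a sketch under assumptions: the sample HTML, the table ids and the variable names are invented for illustration, and a real page would need the exact opening tag quoted earlier in the thread.

```autoit
#include <Array.au3>

; Hypothetical sample standing in for the downloaded page: two tables whose
; links share the same class, only the first of which is wanted.
Local $sSource = '<table id="forum_table">' & @CRLF & _
        '<a title="Topic A - started 1 May" class="topic_title">A</a>' & @CRLF & _
        '<a title="Topic B - started 2 May" class="topic_title">B</a>' & @CRLF & _
        '</table>' & @CRLF & _
        '<table id="other_table">' & @CRLF & _
        '<a title="Topic C - started 3 May" class="topic_title">C</a>' & @CRLF & _
        '</table>'

; Step 1: isolate the wanted table. (?s) lets "." match line breaks and
; the lazy .*? stops at the first </table>. Flag 1 returns the captures
; of the first match as an array.
Local $aTable = StringRegExp($sSource, '(?is)<table id="forum_table">(.*?)</table>', 1)
If @error Then
    ConsoleWrite("Table not found" & @LF)
    Exit
EndIf

; Step 2: collect every title inside that block only.
; Flag 3 returns an array of all matches.
Local $aTopics = StringRegExp($aTable[0], '(?i)title="(.*?) - started', 3)
If @error Then
    ConsoleWrite("No topics found" & @LF)
    Exit
EndIf

_ArrayDisplay($aTopics)
```

Splitting the work this way keeps each pattern short, and the second call never sees the other table, so identical class/id names elsewhere on the page should not leak into the result.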
