Global StringRegExp returns Array but its elements are empty


#include <Inet.au3>
#include <Array.au3>

$sUrl       = "https://deadline.com/"

$sRegEx     = '(?<=(?:post-title">))((\n)|.)*?(?=(?:<p class="post-author-time))'

$sHTML      = _INetGetSource($sUrl)

;~ MsgBox(0,"",$sHTML)

$aArticles = StringRegExp($sHTML,$sRegEx,3) ; get articles

_ArrayDisplay($aArticles)

;~ ConsoleWrite($aArticles[0] & @CRLF)

I want to grab the HTML text of each article on this news site. I know the front page has 12 articles, and after I apply the regex to split the page into one array element per article, the array does have 12 elements, but they are all empty. I assume it has something to do with the line breaks, because when I do the same thing against single lines the elements are no longer empty. How do I fix this so the elements contain the article info?

Edited by yyywww

@FrancescoDiMuro

Edit: No, it's actually everything in between post-title"> and <p class="post-author-time

But exactly what I capture is not very important; it could be anything from this site, as long as it spans multiple lines at once (matching single lines already works). I'm more interested in why the array contains empty elements with the code above, or what I need to change so that it contains the HTML between those tags instead.

Edited by yyywww

@yyywww
Something like this?

#include <Array.au3>
#include <Inet.au3>
#include <StringConstants.au3>

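; Declare the URL, the downloaded page source, and the result array.
Global $strUrl = "https://deadline.com/", _
       $strHTML = "", _
       $arrResult

; Download the page source (second parameter True = return it as a string).
$strHTML = _INetGetSource($strUrl, True)

; (?s) lets "." match line breaks; the single group (.*?) captures everything between the two tags.
$arrResult = StringRegExp($strHTML, '(?s)<h2 class="post-title">(.*?)<p class="post-author-time">', $STR_REGEXPARRAYGLOBALMATCH)

_ArrayDisplay($arrResult)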
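
As for why the original elements looked empty, a likely explanation: with flag 3, StringRegExp returns the capturing groups rather than the whole match, and a capturing group that sits under a quantifier, like ((\n)|.)*?, keeps only its last iteration. That last character is probably a line break or a space right before <p class=, so each element holds one invisible character. The sketch below compares both patterns offline; $sSample is a made-up two-article snippet (an assumption, not the real page source), and it prints the length of each element instead of relying on _ArrayDisplay.

```autoit
#include <StringConstants.au3>

; Hypothetical sample standing in for the real page source (assumption).
Local $sSample = '<h2 class="post-title">' & @LF & '  First headline' & @LF & '<p class="post-author-time">' & @LF & _
                 '<h2 class="post-title">' & @LF & '  Second headline' & @LF & '<p class="post-author-time">'

; Pattern from the question: the capturing groups are repeated one character at a time.
Local $sOld = '(?<=(?:post-title">))((\n)|.)*?(?=(?:<p class="post-author-time))'

; Revised pattern: (?s) makes "." match line breaks, and one group captures the whole span.
Local $sNew = '(?s)<h2 class="post-title">(.*?)<p class="post-author-time">'

Local $aOld = StringRegExp($sSample, $sOld, $STR_REGEXPARRAYGLOBALMATCH)
Local $aNew = StringRegExp($sSample, $sNew, $STR_REGEXPARRAYGLOBALMATCH)

; UBound returns 0 for a non-array, so a failed match simply skips the loop.
For $i = 0 To UBound($aOld) - 1
    ConsoleWrite("old[" & $i & "] length = " & StringLen($aOld[$i]) & @CRLF)
Next

For $i = 0 To UBound($aNew) - 1
    ; Trim the surrounding whitespace and line breaks before printing.
    ConsoleWrite("new[" & $i & "] = " & StringStripWS($aNew[$i], $STR_STRIPLEADING + $STR_STRIPTRAILING) & @CRLF)
Next
```

If the old elements come back with a length of 1 while the new ones hold the headline text, that would confirm the elements were never truly empty, just a single whitespace character each.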

:)

Edited by FrancescoDiMuro
