Posted

Greetings fellow Autoiteers!

I've been working with AutoIt for about 4 months now, and I've created 3 programs thus far. The vast majority of what I've learned along the way has come from this forum. So I'd like to start by thanking everybody who has contributed to this forum over the years and helped compile the vast wealth of knowledge here. This forum alone has helped me tremendously through my journey of learning to code with AutoIt.

 

However, the most recent project that I've been working on has me at a complete loss, and brings me here asking my 1st question in search of a point in the right direction.

 

I have a bunch of mini websites that I've created over the past 8 years or so, and I'm trying to create a program that goes to all of my sites, and grabs Only the sentences that contain specified Keyword Phrases.

For example, using Autoit's Basic Example page, we get these 3 lines of text:

"This is a simple HTML page with text, links and images.
AutoIt is a wonderful automation scripting language.
It is supported by a very active and supporting user forum."

 

...and let's say for example that the specified keyword phrase is "automation scripting". You'll notice that That specific keyword phrase can be found in the 2nd sentence.

How can I grab That entire sentence, based on the keyword given, ..while ignoring the rest of the content pulled from the web page using _IEBodyReadText() ?

#include <IE.au3>
#include <MsgBoxConstants.au3>

Local $oIE = _IE_Example("basic")
Local $sText = _IEBodyReadText($oIE)
ConsoleWrite($sText & @CRLF)

 

I know that you can crawl through the page and find the given Keyword Phrase with a basic StringRegExp() function:

#include <IE.au3>
#include <MsgBoxConstants.au3>

Local $oIE = _IE_Example("basic")
Local $sText = _IEBodyReadText($oIE)
ConsoleWrite($sText & @CRLF)

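Local $testReg = StringRegExp($sText, "automation scripting", 1) ; flag 1: return an array of matches

; StringRegExp() does not return an array when nothing matches, so check before indexing
If IsArray($testReg) Then ConsoleWrite('Keyword found: "' & $testReg[0] & '"' & @CRLF)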

 

But after that, as far as grabbing the Entire sentence that contains the given Keyword, I'm at a complete loss.

Is what I'm trying to accomplish here even possible with Autoit?

Any help, or a point in the right direction, would be greatly appreciated! :)

 

 

 

Posted (edited)

@2Toes

Happy to hear those kind words from you about this amazing community (I can confirm!) and the help it gives to everyone who kindly asks here :)

By the way, I think that your approach is correct, but the search pattern in StringRegExp() needs to be fleshed out to fit your request :)

I am sure it is possible, but I'm not a big fan of StringRegExp(), so, I really don't know where to start.

Maybe @mikell could give you some tips ;)

Edited by FrancescoDiMuro


Posted

@FrancescoDiMuro

Thanks for stopping by to leave a response!

I'm with you.. I'm not a big fan of StringRegExp() either! :D

I've been reading through the HelpFile and online tutorials over the past 2 days, and StringRegExp() is a Massive can-of-worms with a lot of information to take in and understand.

Even after looking into it as much as I have, I'm still unsure where to start ..mainly because I'm just not sure what steps would need to be taken to accomplish such a task. It's all very confusing lol.

I appreciate you recommending and tagging mikell to the post for me. Hopefully he/she will be able to send me off in the right direction.

Anywho, it's good to hear from ya.. And thanks again for stopping by to leave a response! :)

Posted

Hum, StringRegExp is magic, everybody knows that. So what you want to do is possible, BUT the requirements need to be precisely defined
Example : in the code below, a sentence is strictly defined as "a sequence of characters which are not a dot, beginning at start of text OR after a dot followed by 0 or more white spaces" 

#Include <Array.au3>

$s = "This is a simple HTML page with text, links and images. " & @crlf & _
"AutoIt is a wonderful automation scripting language." & @crlf & _
"It is supported by a very active and supporting user forum."

;$keyword = "automation scripting"
;$keyword = "this"
;$keyword = "is a"
$keyword = "and"

$r = StringRegExp($s, '(?is)(?:^|\.\s*)([^.]*' & $keyword & '[^.]*)', 3)
_ArrayDisplay($r)
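For anyone reading along who wants to see which piece of that pattern does what, here is a rough sketch that builds the same pattern from named parts. The variable names ($sFlags, $sStart, $sNoDot, $sPattern) are only illustrative and not from the post above; the resulting pattern string should be identical.

```autoit
#include <Array.au3>

Local $s = "This is a simple HTML page with text, links and images. " & @CRLF & _
        "AutoIt is a wonderful automation scripting language." & @CRLF & _
        "It is supported by a very active and supporting user forum."
Local $keyword = "automation scripting"

; Same pattern as above, assembled from named parts (names are illustrative only)
Local $sFlags = '(?is)'        ; i = case-insensitive, s = "." may also match newlines
Local $sStart = '(?:^|\.\s*)'  ; non-capturing: start of text, OR a literal dot plus optional whitespace
Local $sNoDot = '[^.]*'        ; any run of characters that are not a dot (may span line breaks)
Local $sPattern = $sFlags & $sStart & '(' & $sNoDot & $keyword & $sNoDot & ')'

; Flag 3 returns an array of all captured groups, one element per matching sentence
Local $r = StringRegExp($s, $sPattern, 3)
If IsArray($r) Then _ArrayDisplay($r)
```

The two places that mention the dot are `\.` in $sStart and `[^.]` in $sNoDot, so those are the parts that define where a sentence starts and stops.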

 

Posted

@mikell

Wow, that is fantastic!! B)

I don't know what any of that means, or how it works... but it definitely shows me what to look into to better understand the logic behind it.

Which is exactly what I was hoping to get here.

I cannot thank you enough for that.. Very much appreciated my friend!!

Posted

@mikell

Quick follow up question..

What part of that code detects when there is a Period in the sentence?

Assuming that not all sentences will end with a period, or follow a sentence ending with a period, I changed the endings of the 1st and 2nd sentences to a '?' and a '!'.

#Include <Array.au3>

$s = "This is a simple HTML page with text, links and images? " & @crlf & _
"AutoIt is a wonderful automation scripting language!" & @crlf & _
"It is supported by a very active and supporting user forum."


;$keyword = "automation scripting"
;$keyword = "this"
$keyword = "is a"
;$keyword = "and"

$r = StringRegExp($s, '(?is)(?:^|\.\s*)([^.]*' & $keyword & '[^.]*)', 3)
_ArrayDisplay($r)

 

After doing so, the code no longer works properly, and places the entire block of content into a single array Instance.

[Image: k0ekVaF.png]

 

What part of the code would I focus on to handle sentences ending with a '?' and '!' etc?

Thank you again for your help!

 

Posted (edited)

Strip all your double spaces, then split by all the punctuation you want it split by. Then you can play with rows as lines however you need; if it will only ever be singular matches you don't even have to go regex at that point. But if there is even a 2nd criteria definitely use whatever mikell posts.

#Include <Array.au3>

$s = "This is a simple HTML page with text. Links and images, maybe? " & @crlf & _
"AutoIt is a wonderful automation scripting language! " & @crlf & _
"It is supported by a very active and supporting user forum."

;~ $keyword = "Links"
;~ $keyword = "HTML"
$keyword = "by a very"
;~ $keyword = "wonderful"

msgbox(0, '' , _ArrayExtract(stringsplit(stringstripws($s , 4) , ".?!" , 2) , _ArraySearch(stringsplit(stringstripws($s , 4) , ".?!" , 2) , $keyword , 0 , 0 ,0 , 1))[0])
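If it helps readability, the same idea can be sketched in a few separate steps. This is only an unrolled variant of the one-liner above, under the assumption that a single match is enough. The names $aSentences and $iIndex are made up for illustration. _ArraySearch() returns -1 when nothing is found, which the one-liner does not check, so the sketch adds that check.

```autoit
#include <Array.au3>

Local $s = "This is a simple HTML page with text. Links and images, maybe? " & @CRLF & _
        "AutoIt is a wonderful automation scripting language! " & @CRLF & _
        "It is supported by a very active and supporting user forum."
Local $keyword = "by a very"

; Split once and keep the result instead of splitting twice inline
Local $aSentences = StringSplit(StringStripWS($s, 4), ".?!", 2) ; flag 2: no count in element 0

; The last argument (1) requests a partial (substring) search
Local $iIndex = _ArraySearch($aSentences, $keyword, 0, 0, 0, 1)

If $iIndex = -1 Then
    MsgBox(0, '', 'Keyword not found')
Else
    MsgBox(0, '', StringStripWS($aSentences[$iIndex], 3)) ; flag 3: trim leading and trailing whitespace
EndIf
```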

 

Edited by iamtheky


Posted (edited)
6 hours ago, iamtheky said:

But if there is even a 2nd criteria

Sorry but I don't agree  ;)

Local $r, $a = StringSplit(StringStripWS($s , 4) , ".?!" , 2)
For $i = 0 to UBound($a)-1
  If StringInStr($a[$i], $keyword) Then $r &= StringStripWS($a[$i], 3) & @crlf
Next
;Msgbox(0, "", StringTrimRight($r, 2))
$final = StringSplit(StringTrimRight($r, 2), @crlf, 3)
_ArrayDisplay($final)

 

9 hours ago, 2Toes said:

the code no longer works properly

Of course, if the definition of what a sentence is changes, then the regex must be amended to fit that.
That is why I said "the requirements need to be precisely defined"  :)

$r = StringRegExp($s, '(?is)(?:^|[.\?!]\s*)([^.\?!]*' & $keyword & '[^.\?!]*)', 3)
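One more thing that may matter once the keywords come from real pages: if a keyword contains regex metacharacters such as "+" or "(", concatenating it straight into the pattern changes what the pattern means. A possible way around that, sketched here as an assumption rather than something tested against your sites, is to wrap the keyword in \Q...\E so PCRE treats it literally. The function name _SentencesWithKeyword is hypothetical.

```autoit
#include <Array.au3>

; Hypothetical helper, not from the thread: returns an array of the sentences that
; contain $sKeyword, or sets @error and returns 0 if nothing matches.
Func _SentencesWithKeyword($sText, $sKeyword)
    ; \Q...\E makes the keyword literal, so "+" or "(" inside it are not read as regex syntax.
    ; Caveat: a keyword that itself contains "\E" would end the quoting early.
    Local $sPattern = '(?is)(?:^|[.?!]\s*)([^.?!]*\Q' & $sKeyword & '\E[^.?!]*)'
    Local $aMatches = StringRegExp($sText, $sPattern, 3)
    If Not IsArray($aMatches) Then Return SetError(1, 0, 0)
    Return $aMatches
EndFunc   ;==>_SentencesWithKeyword

Local $sSample = "This is a simple HTML page with text, links and images? " & @CRLF & _
        "AutoIt is a wonderful automation scripting language!" & @CRLF & _
        "It is supported by a very active and supporting user forum."

Local $aResult = _SentencesWithKeyword($sSample, "automation scripting")
If IsArray($aResult) Then _ArrayDisplay($aResult)
```

To use it on a live page, $sSample could be replaced by the text returned from _IEBodyReadText($oIE) as in the first post.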

 

Edited by mikell
Posted

Well played. I suppose I will now spend too many minutes trying to pull off that dance without the loop.

