Newbie Help

I'm a newbie. I don't really know that much. I've exhausted all the help files available, but I still can't get what I want.

I'm trying to write a script that extracts some data and assigns it to an array. However, I don't know how to extract the data. Here's an example: <a name='4beatz'></a>

Is there any way I can extract "4beatz" and assign it to an array?

It's not just a single item; there's a bunch of data like this, but I don't know how to get at it.

I don't know which _IE function can do that for me.

Any help will be much appreciated.


If the format is consistent, _StringBetween using <a name=' and '></a> should extract the bit you want. Do it in a loop to write to an array.

You could also read the HTML file into an array [ _FileReadToArray ], check each element for the presence of <a name=' and use _StringBetween only on those elements that match, adding the result to a new array as you go.

Cleverer people than me will suggest a regex for this.

I presume you've been able to download the html data that contains this?
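
A minimal sketch of the _StringBetween idea, assuming the HTML is already held in a string. The sample text and variable names are made up for illustration. _StringBetween (from String.au3) returns every match in one call as a 0-based array, so an explicit loop is only needed if each name has to be processed separately.

```autoit
#include <String.au3>
#include <Array.au3>

; Hypothetical sample text standing in for the real page source
Local $sHTML = "<a name='4beatz'></a>" & @CRLF & _
        "some other markup" & @CRLF & _
        "<a name='tom'></a>"

; Collect everything found between the opening and closing delimiters
Local $aNames = _StringBetween($sHTML, "<a name='", "'></a>")
If @error Then
    ; @error is set when no delimited string is found
    ConsoleWrite("No names found" & @CRLF)
Else
    _ArrayDisplay($aNames)
EndIf
```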



Hi William,

I looked up the _StringBetween function, but I don't think it'll help in this scenario, because the string I'm looking for is random. There's no pattern to it, although the format is somewhat consistent.

_FileReadToArray isn't applicable either, since the HTML is always updating, so I can't just download it once and extract from that. :/

I'm getting really frustrated. But yeah, your help is really, really appreciated.



  • Moderators


Cause the string I'm looking for is random. Like there's no pattern

Do you mean there is no pattern to the string you are looking to extract or that there is no pattern in the surrounding code? :)

If the former, then a StringRegExp should be able to extract the data as long as there is a pattern in the surrounding code. If there is no pattern in the surrounding code then you are unlikely to be able to do this automatically. :P

Here is an example of an SRE extracting random elements from within surrounding code that follows a pattern: :D

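#include <Array.au3>

; Sample text: the names vary, but the surrounding <a name='...'></a> markup is constant
$sString = "<a name='4beatz'></a>" & @CRLF & _
          "blahblahblah" & @CRLF & _
          "<a name='tom'></a>" & @CRLF & _
          "blahblahblah" & @CRLF & _
          "<a name='dick'></a>" & @CRLF & _
          "blahblahblah" & @CRLF & _
          "<a name='harry'></a>" & @CRLF & _
          "blahblahblah" & @CRLF & _
          "<a name='And a long one'></a>"

; (?U) makes .* lazy, so each match stops at the first closing quote
; Flag 3 returns an array holding every captured name
$aReturn = StringRegExp($sString, "(?U)<a name='(.*)'><\/a", 3)

; Show the extracted names
_ArrayDisplay($aReturn)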


Does that help? :)
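
To see why the (?U) matters, here is a small sketch with two anchors on the same line. The sample string and variable names are assumptions for illustration only. Without (?U) the greedy .* would run from the first name all the way to the last closing quote; with it, each name is captured separately. Flag 3 returns a 0-based array of all captures, and @error is non-zero if nothing matched.

```autoit
; Two anchors on one line, made up for this example
Local $sString = "<a name='tom'></a><a name='dick'></a>"

; Lazy match of whatever sits between the quotes; flag 3 = return all matches
Local $aReturn = StringRegExp($sString, "(?U)<a name='(.*)'><\/a", 3)
Local $iError = @error ; save it before any other call can change it

If $iError Then
    ConsoleWrite("No match, @error = " & $iError & @CRLF)
Else
    ; Walk the result array and print each captured name
    For $i = 0 To UBound($aReturn) - 1
        ConsoleWrite($i & ": " & $aReturn[$i] & @CRLF)
    Next
EndIf
```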


Hi Melba,

Thank you for making me smile.

I'll try that ASAP and let you know.


Thanks a bunch.


I think I'm doomed.

Here's a screenshot of the data I want to extract and assign to an array.

Posted Image

I only want to extract the NAMES! 4beatz, BlueDiverMasang, etc.

As you can see, the format is not so consistent after all. And here's the html code:

<div class='list-inner-tweet'>
<span class="status">@<a href="/koocci">koocci</a> 불륜의 말로는 이렇게 순서대로 끝이 난다. : 희희락락-&gt;설왕설래-&gt;유두입문-&gt;질문공세-&gt;상하운동-&gt;고성방가-&gt;용암분출-&gt;목욕재개-&gt;의관정재-&gt;연락두절 <a href="/searches?q=%23koocci">#koocci</a></span>
<div class='list-tweet-status'>
<a href="/kongmyeong/status/50715457736548352" class="status_link">about 14 hours ago</a>
<form action="http://mobile.twitter.com/kongmyeong/follow" class="user_button" method="post"><div style="margin:0;padding:0;display:inline"><input name="authenticity_token" type="hidden" value="30a96ba0bc15d74aa19c" /></div>
<input id="last_url" name="last_url" type="hidden" value="/ylxxx/followers" />
<input type="submit" value="Follow" class="friend-actions-btn"/>

<a name='harang1009'></a>
<div class='list-tweet' id='user_harang1009'>

<strong><a href="http://mobile.twitter.com/harang1009">harang1009</a></strong>
<br />
<div class='list-inner-tweet'>
<span class="status">일등에게는 다음회에 경쟁에서 벗어나 자신이 부르고 싶은 노래를 부르게하는 방법을 쓰는 겁니다 아니면 작은 콘서트를 마련해 준다던가 그럼 일등을 한 가수의 무대를 더 볼 수 있는거죠 포인트는 실력있는 가수의 순위매김과 탈락이 아닌 좋은 음악과 방송입니다</span>
<div class='list-tweet-status'>
<a href="/harang1009/status/50576330751868928" class="status_link">about 23 hours ago</a>
<form action="http://mobile.twitter.com/harang1009/follow" class="user_button" method="post"><div style="margin:0;padding:0;display:inline"><input name="authenticity_token" type="hidden" value="30a96ba0bc15d74aa19c" /></div>
<input id="last_url" name="last_url" type="hidden" value="/ylxxx/followers" />
<input type="submit" value="Follow" class="friend-actions-btn"/>

<a name='hakjunoh1227'></a>
<div class='list-tweet' id='user_hakjunoh1227'>

<strong><a href="http://mobile.twitter.com/hakjunoh1227">hakjunoh1227</a></strong>
<br />
<div class='list-inner-tweet'>
<span class="status">다음주 한겨레21 크로스주제가 "나는가수다"인데, 김어준씨가 제가할말을 다해버리네요 ㅠㅠ<br /><br />RT @<a href="/lomo_on">lomo_on</a> + 김어준 '나는가수다' 발언 <br /><br /><a href="http://t.co/pQnGQGP" target="twitter_external_link">http://t.co/pQnGQGP</a>  <a href="http://bit.ly/ePNQ10" target="twitter_external_link">http://bit.ly/ePNQ10</a></span>
<div class='list-tweet-status'>
<a href="/hakjunoh1227/status/50699969002352640" class="status_link">about 15 hours ago</a>
<form action="http://mobile.twitter.com/hakjunoh1227/follow" class="user_button" method="post"><div style="margin:0;padding:0;display:inline"><input name="authenticity_token" type="hidden" value="30a96ba0bc15d74aa19c" /></div>
<input id="last_url" name="last_url" type="hidden" value="/ylxxx/followers" />
<input type="submit" value="Follow" class="friend-actions-btn"/>


Is it possible?


  • Moderators


As none of the strings you say you want (4beatz, BlueDiverMasang) actually appear in the HTML, I am at a loss as to how you expect to extract them. :)

You can extract these strings which do appear:

<a name='harang1009'></a>
<a name='hakjunoh1227'></a>

as you can see: :P

#include <Array.au3>

; A cut-down copy of the posted HTML
$sText = "<a name='harang1009'></a>" & @CRLF & _
"<div class='list-tweet' id='user_harang1009'>" & @CRLF & _
"</div>" & @CRLF & _
"</div>" & @CRLF & _
"<a name='hakjunoh1227'></a>" & @CRLF & _
"<div class='list-tweet' id='user_hakjunoh1227'>"

; Capture whatever sits between <a name=' and '></a
$aReturn = StringRegExp($sText, "(?U)<a name='(.*)'><\/a", 3)

; Show the extracted names
_ArrayDisplay($aReturn)


but those do not appear in your screen shot. Are you sure that the HTML refers to that screen? :)


  • Moderators


How much more "real time" do you want than this? :)

I have shown you how to get the names from the HTML - how are you getting the HTML in the first place? :)


Sorry Sorry! :)

I'm just really frustrated and desperate.

To give you a bigger picture of what the script is about, here it goes.

First thing, it will open IE. I know how to do that :) haha. At least.

Then it'll go to a user-inputted website, in my case "http://mobile.twitter.com/ylxxx/followers"

After that, it will extract all the names it can see and assign them to an array. The thing is, I don't know which _IE function to use.

I hope you understand what I'm trying to say.


  • Moderators


_IEDocReadHTML gets you the HTML of a page, but you have to have the IE object identified first. :)

Please post the code you are using to get to the page in question and we will see what we can do. When you post your code please use Code tags - put [autoit] before and [/autoit] after your posted code. :)
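
For reference, the steps described (ask for a URL, open IE there, read the page source) could be sketched roughly like this. It is only an outline under assumptions: the prompt text and variable names are invented, and the default URL is the one mentioned in the post.

```autoit
#include <IE.au3>

; Ask the user for the page to open; the default is the URL from the post
Local $sURL = InputBox("Followers", "Enter the page URL:", "http://mobile.twitter.com/ylxxx/followers")
If @error Then Exit ; Cancel was pressed or the box failed

; _IECreate opens a new IE window and, by default, waits for the page to load
Local $oIE = _IECreate($sURL)
If @error Then Exit MsgBox(16, "Error", "Could not open IE")

; Read the full HTML of the loaded document into a string
Local $sHTML = _IEDocReadHTML($oIE)
ConsoleWrite(StringLen($sHTML) & " characters read" & @CRLF)
```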


  • Moderators


That is fine, so add the _IEDocReadHTML line and you should get the HTML content.

What does this get you? :)

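#include <IE.au3>
#include <Array.au3>

; Open the page in IE and read its HTML
$oIE = _IECreate ("http://mobile.twitter.com/ylxxx/followers")
$sHTML = _IEDocReadHTML($oIE)

; Pull out every name found between <a name=' and '></a
$aReturn = StringRegExp($sHTML, "(?U)<a name='(.*)'><\/a", 3)

; Show the extracted names
_ArrayDisplay($aReturn)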



P.S. [code] works fine, but with [autoit] you get the pretty colours. :)

Darnit. Can't believe it's that simple. Was thinking of a loop, if-else or something... Haha. Omg. :/

Melba, YOU'RE A GOD! Haha. Thanks so much.

I'm not done though. I mean *sigh*. I'll try to figure this out on my own first, and if I get stuck again, I'll ask for your assistance again!

MELBA the GODDESS! haha.

Thanks again! Damn, I'm so noob. :)

  • Moderators



Even in these "enlightened" times, I prefer to remain a single gender - and my wife would prefer it to be "male"! :P

And I am by no means exceptional - I have just been around a bit longer. :)

Glad you got it working. :)


Oh Crap. My Bad!

I actually thought you were a girl, that's why I was extra amazed how good you were.

But, going back, it's not actually showing me the array. So I can't check whether the values in the array are the ones I need. :)

I'm still very thankful for you Melba. :)

  • Moderators


I actually thought you were a girl, that's why I was extra amazed how good you were

Time for some more "gender diversity" training for you, young man - I hope you do not say things like that in the real world. :)

One of the best coders on this forum has 2 "X" chromosomes and she leaves me for dead.

it's not actually showing me the array

That is because you did not give the correct pattern surrounding the names. When I get the HTML from that page, the names look like this:

<A name=trashjp></A>

So we need to adjust the SRE to match that pattern - try this:

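#include <IE.au3>
#include <Array.au3>

$oIE = _IECreate ("http://mobile.twitter.com/ylxxx/followers")
$sHTML = _IEDocReadHTML($oIE)

; (?i) ignores case (IE returns <A ...>), (?U) makes .* lazy; the names are unquoted here
$aReturn = StringRegExp($sHTML, "(?i)(?U)<a name=(.*)><\/a", 3)

; 0 = matches found, 1 = no matches, 2 = bad pattern
ConsoleWrite(@error & @CRLF)

; Show the extracted names
_ArrayDisplay($aReturn)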


Does it work now - it does for me? :)
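
If the script has to cope with both forms of the tag (quoted, as in the pasted source, and unquoted upper case, as IE returns it), one possible variation is a pattern with optional quotes. This is a sketch on made-up sample text, not something tested against the live page; the doubled "" is just how a double quote is written inside an AutoIt string.

```autoit
#include <Array.au3>

; Hypothetical sample mixing both tag styles seen in this thread
Local $sHTML = "<a name='harang1009'></a>" & @CRLF & _
        "<A name=trashjp></A>"

; (?i) ignores case; ['"]? allows an optional quote on either side;
; ([^'">]+) captures the name itself, which may not contain quotes or >
Local $aReturn = StringRegExp($sHTML, "(?i)<a name=['""]?([^'"">]+)['""]?><\/a", 3)

If @error Then
    ConsoleWrite("No names matched" & @CRLF)
Else
    _ArrayDisplay($aReturn)
EndIf
```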


:P:D;):);):D :D

If you can only see me now Melba. Hahaha.

That's all for today mate. I'm really so thankful and I highly appreciate your help...

I'm guessing your major is Computer Science or something?!

Thanks mate!!!

I'll try to do this on my own if I can... :)

It's really nice to know you, Melba :)

  • Moderators


your major is Computer Science or something?!

No, I am a retired fighter pilot. I now fly small aircraft for the Air Training Corps (an organisation in the UK for air minded youngsters) getting a new generation interested in flying - just as I help out here to get a new generation interested in coding. :)

Do I take it that you get your array now? :)




Yes yes. Sorry, I kinda got excited and forgot to answer your question. ;)

God, I never would've guessed. I'm lost for words. I'm just amazed how talented you are. Thanks again so much! :)

I deleted a whole bunch of text. :P Haha. I'm just too embarrassed to ask. I'll really try to finish the script on my own.

Have a good day, Melba

  • Moderators


I'm just too embarrassed to ask

Do not be - how else are you going to learn? :)


