Reading a file on the Internet to an array

XanzyX

I am writing a script to make sure that the emails on a list are all structured correctly. One of the components of an email address is the "Top Level Domain" (TLD), i.e. .com, .net, .org, etc.

A current list of all TLDs is published on the Internet Assigned Numbers Authority's website (http://www.iana.org/):

http://data.iana.org/TLD/tlds-alpha-by-domain.txt

The question is:

How can I read this file and put each line as an element of an array?

 

Cravin

XanzyX,

Try something like this...  You'll need to add in some code to remove the first few lines and the last empty string that's returned, but for the sake of learning I'll let you do that :)

#include <Array.au3>
Local $sData = InetRead("http://data.iana.org/TLD/tlds-alpha-by-domain.txt")
Local $sStringSplit = BinaryToString($sData)
Local $aArray = StringSplit($sStringSplit, @LF)
_ArrayDisplay($aArray)
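
For reference, the cleanup mentioned above could look something like the sketch below. It is only one possible approach and rests on a few assumptions: that comment lines in the file start with "#", that blank entries can simply be dropped, and that the download succeeds in one go. The names $aLines, $aTLD and $iCount are made up for this example.

```autoit
#include <Array.au3>

; Download the list; InetRead returns binary data and sets @error on failure.
Local $dData = InetRead("http://data.iana.org/TLD/tlds-alpha-by-domain.txt")
If @error Then Exit MsgBox(16, "Error", "Download failed")

; Drop any CR characters, then split on LF. Flag 2 = no count in element [0].
Local $aLines = StringSplit(StringStripCR(BinaryToString($dData)), @LF, 2)

; Copy only the real entries into a second array of the same size.
Local $aTLD[UBound($aLines)], $iCount = 0, $sLine
For $i = 0 To UBound($aLines) - 1
    $sLine = StringStripWS($aLines[$i], 3) ; 3 = strip leading and trailing white space
    ; Assumption: blank lines and lines starting with "#" are not TLD entries.
    If $sLine = "" Or StringLeft($sLine, 1) = "#" Then ContinueLoop
    $aTLD[$iCount] = $sLine
    $iCount += 1
Next
If $iCount = 0 Then Exit MsgBox(16, "Error", "No TLD entries found")

; Shrink the array to the number of entries actually kept.
ReDim $aTLD[$iCount]
_ArrayDisplay($aTLD)
```

Filtering on the line content rather than deleting a fixed number of lines means the script should keep working if the header of the file changes length.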

XanzyX

Wow!  That was fast!

I thought for sure I was going to figure it out before I received an answer.

I was wrong. Your answer is so much simpler than where I was going.

Thanx

XanzyX

That was such an efficient way of populating an array that I have to ask:

How would you populate an array of email addresses from a list named EmailList.txt (one email per line)?

Forgive me, I wrote a whole function and it works, but it takes a long time.

mikell

It depends heavily on the content of your file.

Assuming there really is one address per line (and nothing else), and that an email address doesn't include white space, an expression like this should work (it simply reads runs of characters that contain no white space, CR or LF):

#Include <Array.au3>
$text = FileRead("yourlist.txt")
$res = StringRegExp($text, '[\S]+', 3)
_ArrayDisplay($res)

kylomas

@mikell,

This might work better

#include <array.au3>
Local $myArray = stringregexp(FileRead('EmailList.txt'),'([^\R].*)', 3)
_ArrayDisplay($myArray)

kylomas

edit: posted wrong code, better start drinking!

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

mikell

@kylomas

Possibly using \R works with the latest beta (I didn't try), but with my 3.3.8.1 it returns the CR and LF and matches empty lines:

#include <array.au3>
$str = 'me@isp1.com' &@crlf& 'me@isp2.com' &@crlf&@crlf& 'me@isp3.com' &@crlf& 'me@isp4.com'
Local $myArray = stringregexp($str, '([^\R].*)', 3)
_ArrayDisplay($myArray)

I'm not sure that it's a good idea to build and play with expressions that only run properly on betas, as the recommended version for common users on the main download page is (at this time) the 3.3.8.1 release.

So, to split on newlines only and leave horizontal white space inside lines untouched, I would rather use:

Local $myArray = stringregexp($str, '\V+', 3)
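
Applied to the file from the question, that pattern could be used roughly as in the sketch below. It assumes EmailList.txt (the name given earlier in the thread) sits in the script's working directory; the error checks and the variable names $sText and $aEmails are additions for this example and not part of the posts above.

```autoit
#include <Array.au3>

; Read the whole file at once; FileRead sets @error if the file cannot be read.
Local $sText = FileRead("EmailList.txt")
If @error Then Exit MsgBox(16, "Error", "Could not read EmailList.txt")

; \V+ matches a run of characters that are not vertical white space,
; so every non-empty line becomes one array element.
; Flag 3 returns a 0-based array of all matches and sets @error if there are none.
Local $aEmails = StringRegExp($sText, "\V+", 3)
If @error Then Exit MsgBox(16, "Error", "No non-empty lines found")

_ArrayDisplay($aEmails)
```

Note that leading or trailing spaces on a line stay in the element, so a StringStripWS pass over each element may still be wanted before validating the addresses.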

kylomas

mikell,

I ran the code I posted under 3.3.8.1 and it returns the dataset exactly as it appears, blank lines included.  That was my intent, however, I like yours for removing blank lines, thanks.

kylomas

