Sercankd

Check line ends in txt file

13 posts in this topic

I have a text file that contains a mix of image links and normal page links, like:

http://www.google.com/

http://www.google.com/logo.jpg

http://www.yahoo.com/yahooo.png

http://www.yahoo.com/

http://www.yahoo.com/mail

I want to remove every line that doesn't end with a common image extension such as .jpg, .png or .gif.

I'm bad at regexp and can't work it out. Please help me.


This will pull those lines. I too lack skills with regexp, so it uses _ArrayFindAll instead.

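#include <Array.au3>
#include <File.au3>

; Read the file into an array; element [0] holds the line count
Dim $Asites
_FileReadToArray("c:\sites.txt", $Asites)

; Collect the indexes of the lines that contain each extension.
; The last argument (1) asks for a partial, case-insensitive match,
; so the extension is found anywhere in the line, not only at the end.
Dim $Apics[1]
$Apng = _ArrayFindAll ($Asites, ".png", 0 ,0 , 0 , 1)
$Ajpg = _ArrayFindAll ($Asites, ".jpg", 0 ,0 , 0 , 1)
$Agif = _ArrayFindAll ($Asites, ".gif", 0 ,0 , 0 , 1)
_ArrayConcatenate ($Apics , $Ajpg)
_ArrayConcatenate ($Apics , $Apng)
_ArrayConcatenate ($Apics , $Agif)
_ArrayDelete ($Apics , 0) ; drop the placeholder element

; Write the matching lines to a new file, grouped by extension
for $i = 0 to ubound ($Apics) - 1
    filewrite ("c:\sites_pics_only.txt" , $Asites[$Apics[$i]] & @CRLF)
next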


Here is one way of doing this sort of thing

#include <file.au3>

; Create a test file
Local $sFileName = "C:\Documents and Settings\" & @UserName & "\My Documents\Mylinklist.txt"
Local $sMyLinksData = "http://www.google.com/" & @CRLF & "http://www.google.com/logo.jpg" & @CRLF & "http://www.yahoo.com/yahooo.png" & @CRLF & "http://www.yahoo.com/" & @CRLF & "http://www.yahoo.com/mail" & @CRLF
FileWrite($sFileName,$sMyLinksData)

Local $aListData = 0
Local $hFile = 0

_FileReadToArray($sFileName,$aListData)
FileMove($sFileName,$sFileName & ".bak") ; make a backup copy
$hFile = FileOpen($sFileName,2) ; Create a new empty version of the file
For $i = 1 To UBound($aListData) - 1
    If StringRegExp($aListData[$i],"(?i)\.(?:jpg|png|gif)$") Then ; add extra file types by inserting |ext
        ; Keep this line, write it to the new file
        FileWriteLine($hFile,$aListData[$i])
    EndIf
Next
FileClose($hFile)

"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the universe trying to build bigger and better idiots. So far, the universe is winning."- Rick Cook


$text = "http://www.google.com/" & @crlf & _
"http://www.google.com/logo.jpg" & @crlf & _
"http://www.google.com/logo3.jpg" & @crlf & _
"http://www.yahoo.com/" & @crlf & _
"http://www.yahoo.com/yahooo.png" & @crlf & _
"http://www.yahoo.com/" & @crlf & _
"http://www.yahoo.com/mail" & @crlf
$array = StringRegExp($text, '(?i)(.*?\.jpg|.*?\.gif|.*?\.png)', 3)
for $i = 0 to UBound($array) - 1
    msgbox(0, "RegExp Test with Option 1 - " & $i, $array[$i])
Next


Visit the SciTE4AutoIt3 Download page for the latest versions        Beta files                                                          Forum Rules
 
Live for the present,
Dream of the future,
Learn from the past.
  :)


Is there any way to invert the result? I mean matching *.jpg, *.gif and *.png and then inverting the result, so that only the other lines are returned.

Br,

UEZ


Please don't send me any personal message and ask for support! I will not reply!

Selection of finest graphical examples at Codepen.io

The own fart smells best!
Her 'sikim hıyar' diyene bir avuç tuz alıp koşma!
¯\_(ツ)_/¯  ٩(●̮̮̃•̃)۶ ٩(-̮̮̃-̃)۶ૐ


#6 ·  Posted (edited)

Couldn't you use bowmore's solution with an If Not ... FileWriteLine statement instead?

Unless you mean reverse the order, or turn them upside down?

Edited by iamtheky
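For readers following along, the "If Not" idea can be sketched roughly as below. This is only an illustration, not bowmore's actual script: the sample array and the variable names ($aLines, $sKept) are assumptions made up for the example, and a real script would read the lines with _FileReadToArray and write them back with FileWriteLine.

```autoit
; Hypothetical sample data standing in for the lines read from the file
Local $aLines[5] = ["http://www.google.com/", _
        "http://www.google.com/logo.jpg", _
        "http://www.yahoo.com/yahooo.png", _
        "http://www.yahoo.com/", _
        "http://www.yahoo.com/mail"]

Local $sKept = ""
For $i = 0 To UBound($aLines) - 1
    ; With the default flag, StringRegExp returns 1 on a match and 0 otherwise,
    ; so "If Not" keeps only the lines that do NOT end in an image extension
    If Not StringRegExp($aLines[$i], "(?i)\.(?:jpg|png|gif)$") Then
        $sKept &= $aLines[$i] & @CRLF
    EndIf
Next
MsgBox(0, "Lines that are not images", $sKept)
```

Dropping the Not gives the original behaviour again (keep only the image links), so the same loop covers both cases.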


Thank you, the _ArrayFindAll script above solved my problem.


Couldn't you use bowmore's solution with an If Not ... FileWriteLine statement instead?

Unless you mean reverse the order, or turn them upside down?

I meant doing it with StringRegExp. With a regex it is simple to get all the lines with those extensions, but it would save a lot of work if the regex could invert the result.

Then you could, for example, easily delete every line without an image extension and overwrite the text file.

Br,

UEZ



#9 ·  Posted (edited)

@UEZ

Do you mean reverse the array returned from the SRE?

@Sercankd

Just a minor rework of Jos' code to get the result you seem to want.

$text = "http://www.google.com/" & @crlf & _
"http://www.google.com/logo.jpg" & @crlf & _
"http://www.google.com/logo3.jpg" & @crlf & _
"http://www.yahoo.com/" & @crlf & _
"http://www.yahoo.com/yahooo.png" & @crlf & _
"http://www.yahoo.com/" & @crlf & _
"http://www.yahoo.com/mail" & @crlf
$array = StringRegExp($text, '(?i)(.*?\.jpg|.*?\.gif|.*?\.png)', 3)
If NOT @Error Then
    $text = ""
    For $i = 0 to UBound($array) - 1
        ;msgbox(0, "RegExp Test with Option 1 - " & $i, $array[$i]) ;; Attn: JOS -- You said "RegExp Test with Option 1 and yet you used Option 3
        $text &= $array[$i] & @CRLF
    Next
EndIf
MsgBox(0, "Result", $text)

If you put #include <Array.au3> at the top of your script, there are many things you can do with the returned array.

Change the last part of the code to:

$array = StringRegExp($text, '(?i)(.*?\.jpg|.*?\.gif|.*?\.png)', 3)
If NOT @Error Then
    $array = _ArrayUnique($array) ; element [0] of the result holds the count
    ;_ArrayReverse($array, 1) ;; For UEZ
    $text = _ArrayToString($array, @CRLF, 1) ; start at index 1 to skip the count
EndIf
MsgBox(0, "Result", $text)

EDIT:

Missed a set of code tags

Edited by GEOSoft

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


; Attn: JOS -- You said "RegExp Test with Option 1 and yet you used Option 3

That was one of those cut-and-paste starters that didn't get changed.


I meant doing it with StringRegExp. With a regex it is simple to get all the lines with those extensions, but it would save a lot of work if the regex could invert the result.

Then you could, for example, easily delete every line without an image extension and overwrite the text file.

Br,

UEZ

I think it is better to manipulate the array into what you want and then overwrite the file. Of course you could also do at least part of it by changing the order of the conditionals, but that isn't guaranteed to be accurate either.

George


#12 ·  Posted (edited)

@UEZ

Do you mean reverse the array returned from the SRE?

No, let me try to explain it again:

1:http://www.google.com/
2:http://www.google.com/logo.jpg
3:http://www.yahoo.com/yahooo.png
4:http://www.yahoo.com/
5:http://www.yahoo.com/mail

The regexp will find lines 2 and 3, but can I invert it so that I get lines 1, 4 and 5 instead?

Something like ^.*\.(jpg|png|gif), which of course does not work because it matches the image lines rather than excluding them!

Br,

UEZ

Edited by UEZ


#13 ·  Posted (edited)

In that case it's faster to use StringRegExpReplace()

$sTxt = "1:http://www.google.com/" & @CRLF
$sTxt &= "2:http://www.google.com/logo.jpg" & @CRLF
$sTxt &= "3:http://www.yahoo.com/yahooo.png" & @CRLF
$sTxt &= "4:http://www.yahoo.com/" & @CRLF
$sTxt &= "5:http://www.yahoo.com/mail"

$sTxt = StringRegExpReplace($sTxt, "(?m:^|\n)(.*\.jpg|.*\.png|.*\.gif).*", "")
MsgBox(0, "Result", $sTxt)

EDIT: had one too many ".*" in there

Edited by GEOSoft

George
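Another way to get the "inverted" result directly from StringRegExp is a negative lookahead, which PCRE supports. The snippet below is a rough sketch rather than anything posted in this thread: the pattern and the names $sPattern, $aOther and $sResult are assumptions chosen for illustration, and the character class [^\r\n] is used instead of "." so the behaviour does not depend on the newline convention of the AutoIt version in use.

```autoit
; Sample text taken from UEZ's numbered example above
Local $sTxt = "1:http://www.google.com/" & @CRLF & _
        "2:http://www.google.com/logo.jpg" & @CRLF & _
        "3:http://www.yahoo.com/yahooo.png" & @CRLF & _
        "4:http://www.yahoo.com/" & @CRLF & _
        "5:http://www.yahoo.com/mail"

; (?im)      case-insensitive, and ^ / $ match at every line boundary
; (?! ... )  negative lookahead: reject a line that ends in .jpg, .png or .gif
; [^\r\n]+   otherwise match the whole line, without its line break
Local $sPattern = "(?im)^(?![^\r\n]*\.(?:jpg|png|gif)\r?$)[^\r\n]+"

; Flag 3 returns an array of all matches; @error is set when nothing matches
Local $aOther = StringRegExp($sTxt, $sPattern, 3)
If Not @error Then
    Local $sResult = ""
    For $i = 0 To UBound($aOther) - 1
        $sResult &= $aOther[$i] & @CRLF
    Next
    MsgBox(0, "Non-image lines", $sResult)
EndIf
```

Compared with StringRegExpReplace, this returns the surviving lines as an array, which may be handier if they need further processing before the file is overwritten.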
