Count links in a txt file

lordsocke

Hi guys, is there a function to count the number of links in a txt file? Or maybe one to count the occurrences of "https://", which is in every link?

Thanks in advance :D

water

How large is the file?


My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2018-12-03 - Version 1.4.11.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (2018-10-31 - Version 1.3.4.1) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
PowerPoint (2017-06-06 - Version 0.0.5.0) - Download - General Help & Support
Excel - Example Scripts - Wiki
Word - Wiki
 
Tutorials:

ADO - Wiki

 

RaiNote

Links don't always start with https; they can also use http and some other schemes. There may be a function for this, but it depends on how the links are formatted within the txt file.


  • C++/AutoIt/OpenGL Easy Coder
  • I will be Kind to you and try to help you
  • till what you want isn't against the Forum
  • Rules~

 

water

The OP posted that all of his links start with "https://":

"https://" which is in every link.


water

If the file isn't too long, I would do it this way:

Global $sFile = FileRead("Your filename goes here") ; Read the whole file into a variable
StringReplace($sFile, "https://", "https://") ; Replace each "https://" with itself
ConsoleWrite("Number of links in the file: " & @extended & @CRLF) ; @extended holds the number of replacements

 


water

Did you insert the space intentionally?

"https: //"

 


lordsocke

Thanks a lot. The file is about 5 KB, so this should work for me :D

RaiNote

@water A quick question: what exactly does @extended do? Does it return the number of operations a function performed, or something else?

water

@extended is set by StringReplace and returns the number of replacements that have been done.

RaiNote

Ah, OK. Thank you very much.


water

It is described in the help file for StringReplace:

Return Value

Returns the new string with the number of replacements performed stored in the @extended macro.


lordsocke


 

Sorry if my question is really stupid, but how can I save the number of counted links in a variable?

mikell

$count = @extended   :)
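
Put together with water's snippet, the whole flow might look like the sketch below. The file name links.txt and the variable names are placeholders chosen for illustration, not something from the thread. @extended is copied right after the StringReplace call, on the assumption that later function calls can overwrite it.

```autoit
; Sketch only: "links.txt" is a placeholder file name.
Local $sPath = "links.txt"
If Not FileExists($sPath) Then
    ConsoleWrite("File not found: " & $sPath & @CRLF)
    Exit 1
EndIf

Local $sFile = FileRead($sPath) ; read the whole file into a string

StringReplace($sFile, "https://", "https://") ; the returned string is not needed
Local $iCount = @extended ; copy immediately, before calling anything else

ConsoleWrite("Number of links in the file: " & $iCount & @CRLF)
```

From here on, $iCount can be used like any other variable, for example in a comparison or a loop bound.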

lordsocke

Thanks :sweating:

 

Surya

Try this (code tested and verified). It looks for links and can also check whether they really exist. The check requires an internet connection; I added it to make your function more useful.

#include <string.au3>
#include <Array.au3>

$occur = _FindlinkOcuurance("https://www.autoitscript.com text in between https://www.autoitscript.com just some text " & @CRLF & "https://www.google.com https://thereisnoserverlikethis.com")
_ArrayDisplay($occur)

; #FUNCTION# ====================================================================================================================
; Name ..........: _FindlinkOcuurance
; Description ...: Finds "https://" links ending in ".com" in a string and optionally pings each host
; Syntax ........: _FindlinkOcuurance($string[, $check = True[, $timout = 4000]])
; Parameters ....: $string              - the main string to be checked
;                  $check               - [optional] True or False. Default is True. If True, each link is
;                                         checked to see whether it exists on the internet (requires a data connection)
;                  $timout              - [optional] the timeout in milliseconds used when checking a link;
;                                         set a larger value for a poor network connection and a smaller one otherwise
; Return values .: $find                - A two-dimensional array. The first element of the first column
;                                         is the number of links found, and the first element of the second column
;                                         holds the links that were found, separated by "|". The second element of the
;                                         first column is the number of links that exist on the internet, and the second
;                                         element of the second column holds those links, separated by "|"
; Author ........: Surya Saradhi.B
; Modified ......: 05/09/15
; Remarks .......: Requires an internet connection if the links are to be checked. The second element of the first column
;                  and the second element of the second column are only set if the links are verified on the internet
; ===============================================================================================================================
Func _FindlinkOcuurance($string, $check = True, $timout = 4000)
    $strs = StringSplit(StringReplace($string, @CRLF, " "), " ")
    Local $find[2][2] = [[0, ""], [0, ""]]
    For $i = 1 To $strs[0]
        If StringInStr($strs[$i], "https://") Then
            $find[0][0] += 1
            $subs = _StringBetween($strs[$i], "https://", ".com")
            If Not @error Then $find[0][1] = $find[0][1] & "|" & "https://" & $subs[0] & ".com"
            If $check Then
                $linked = _StringBetween($strs[$i], "https://", ".com")
                If Not @error Then
                    $link = $linked[0] & ".com"
                    $pin = Ping($link, $timout)
                    If Not @error Then
                        $find[1][0] += 1
                        $find[1][1] = $find[1][1] & "|" & "https://" & $link
                    EndIf
                EndIf
            EndIf
        EndIf
    Next
    $find[1][1] = StringTrimLeft($find[1][1], 1)
    $find[0][1] = StringTrimLeft($find[0][1], 1)
    Return $find
EndFunc   ;==>_FindlinkOcuurance

 


No matter whatever the challenge maybe control on the outcome its on you its always have been.

MY UDF: Transpond UDF (Send variables to programs), Utter UDF (Speech Recognition)

iamtheky

Just because a link starts with https://, I would not assume it ends with .com; moreover, I would not assume that it can be pinged. I don't really know what a solid method would be. Maybe test the @extended from _INetGetSource for a value greater than 0?


,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)
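
Following iamtheky's point, one way to pick out links without assuming a ".com" ending is a regular expression that takes everything up to the next whitespace, quote, or angle bracket. This is a rough sketch, not a full URL parser: the sample text is made up, the character class is an assumption about what ends a link in the file, and nothing here checks whether a link is reachable.

```autoit
; Sample text is made up for illustration.
Local $sText = "see https://www.autoitscript.com/forum and http://example.org/page?x=1 " & @CRLF & _
        "plain text, then https://www.google.co.uk"

; Flag 3 returns an array containing every match of the pattern.
; [^\s"<>]+ consumes characters up to the next whitespace, quote, or angle bracket.
Local $aLinks = StringRegExp($sText, 'https?://[^\s"<>]+', 3)
If @error Then
    ConsoleWrite("No links found." & @CRLF)
Else
    ConsoleWrite("Links found: " & UBound($aLinks) & @CRLF)
    For $i = 0 To UBound($aLinks) - 1
        ConsoleWrite($aLinks[$i] & @CRLF)
    Next
EndIf
```

The array gives both the count (via UBound) and the link text itself, so no second _StringBetween pass is needed.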

jguinch
$sContent = FileRead("source.html")

$timer = TimerInit()
StringReplace($sContent, "https://", "")
$count = @extended
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
StringRegExpReplace($sContent, "https://", "")
$count = @extended
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
$count = UBound( StringRegExp($sContent, "https://", 3) )
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

 

iamtheky

If you bring it in with the whitespace stripped, it's even quicker, naturally.

 

#include <Inet.au3>
$sContent = _INetGetSource("https://autoitscript.com")


$timer = TimerInit()
StringReplace($sContent, "https://", "https://")
$count = @extended
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
StringRegExpReplace($sContent, "https://", "")
$count = @extended
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
$count = UBound( StringRegExp($sContent, "https://", 3) )
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
$count = UBound(StringSplit($sContent, "https://", 3)) - 1 ; flag 3 = split on the whole string, no count element
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

; stripping whitespace (flag 8 strips all whitespace)

$sContent = StringStripWS(_INetGetSource("https://autoitscript.com"), 8)

$timer = TimerInit()
StringReplace($sContent, "https://", "https://")
$count = @extended
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
StringRegExpReplace($sContent, "https://", "")
$count = @extended
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
$count = UBound( StringRegExp($sContent, "https://", 3) )
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

$timer = TimerInit()
$count = UBound(StringSplit($sContent, "https://", 3)) - 1 ; flag 3 = split on the whole string, no count element
ConsoleWrite($count & @TAB & TimerDiff($timer) & @CRLF)

 


Malkey

These StringRegExpReplace examples count all the links in the HTML document, not just the links within the body tag.
The "s?" in the RE pattern means the "s" can appear once or not at all, so the pattern matches both "http://" and "https://".

#include <Inet.au3>

$sContent = _INetGetSource("https://autoitscript.com")

StringRegExpReplace($sContent, 'https://', "")
$count = @extended
ConsoleWrite($count & @TAB & 'https://' & @CRLF)

StringRegExpReplace($sContent, 'https?://', "")
$count = @extended
ConsoleWrite($count & @TAB & 'https?://' & @CRLF)

StringRegExpReplace($sContent, '"https?://', "")
$count = @extended
ConsoleWrite($count & @TAB & '"https?://' & @CRLF)

#cs ; Returns:
    80  https://
    89  https?://
    71  "https?://
#ce
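
The effect of "s?" is easier to see on a small fixed string than on a live page, whose counts change over time. The sample string below is made up for illustration: it contains one http link and two https links.

```autoit
; Made-up sample: one http link and two https links.
Local $sSample = "http://a.example https://b.example https://c.example"
Local $iCount

StringRegExpReplace($sSample, 'https://', "")
$iCount = @extended ; only the two https links match
ConsoleWrite($iCount & @TAB & 'https://' & @CRLF) ; prints 2

StringRegExpReplace($sSample, 'https?://', "")
$iCount = @extended ; the optional "s" lets the http link match as well
ConsoleWrite($iCount & @TAB & 'https?://' & @CRLF) ; prints 3
```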

 
