  1. Hi everybody,

     I'm looking for a clean way to convert HTML to TEXT. I found a few examples here () and tried both scripts, but:

     Script 1 uses the StringRegExpReplace function, which gives me a fatal error when I use it on big web sites.
     Script 2 uses the _IECreate function, which works too slowly, and I don't want to create any new IE processes.

     Here is my script, which sometimes gives me a FATAL error:

         #include <INet.au3>
         #include <Constants.au3>
         #include <String.au3>
         #include <Array.au3>
         #include <Misc.au3>
         #include <File.au3>
         #include <IE.au3>

         $DATA = _INetGetSource("any web site")
         checkcode()

         Func checkcode()
             Local $x, $y, $lnx, $Content
             ;if StringLen($DATA)<90000 Then
             $Content = $DATA
             ;MsgBox(0,"XXX",$LINE&" "&StringLen($DATA))
             $Content = StringStripCR($Content)
             ; Remove the head, scripts, comments and all remaining tags
             $Content = StringRegExpReplace($Content, '<head>(.|\n)+?</head>', '')
             $Content = StringRegExpReplace($Content, '<script(.|\n)+?/script>', '')
             $Content = StringRegExpReplace($Content, '<!--(.|\n)+?-->', '')
             $Content = StringRegExpReplace($Content, '<(.|\n)+?>', '')
             ; Remove URLs up to the next space
             $Content = StringRegExpReplace($Content, 'http://(.|\n)+? ', '')
             $Content = StringRegExpReplace($Content, 'ftp://(.|\n)+? ', '')
             $Content = StringRegExpReplace($Content, 'https://(.|\n)+? ', '')
             $Content = StringRegExpReplace($Content, 'www.(.|\n)+? ', '')
             ; Drop stray angle brackets, then decode common entities
             $Content = StringReplace($Content, '<', '')
             $Content = StringReplace($Content, '>', '')
             $Content = StringReplace($Content, '&lt;', '<')
             $Content = StringReplace($Content, '&gt;', '>')
             $Content = StringReplace($Content, '&nbsp;', ' ')
             $Content = StringReplace($Content, '&copy;', '©')
             $Content = StringReplace($Content, '&ldquo;', '"')
             $Content = StringReplace($Content, '&raquo;', '»')
             $Content = StringReplace($Content, '&laquo;', '«')
             $Content = StringReplace($Content, '&rdquo;', '"')
             $Content = StringReplace($Content, '&quot;', '"')
             $Content = StringReplace($Content, '&amp;', '&')
             $Content = StringReplace($Content, '&#149;', '•')
             $Content = StringReplace($Content, '&bull;', '•')
             $Content = StringReplace($Content, '&#8249;', '')
             $Content = StringReplace($Content, '&#8250;', '')
             $Content = StringReplace($Content, "&#8217;", "'")
             $Content = StringReplace($Content, "&#39;", "'")
             ; Tidy spacing around brackets and punctuation
             $Content = StringReplace($Content, '^[', ' [')
             $Content = StringReplace($Content, ']^', ' ]')
             $Content = StringReplace($Content, ' , ', ', ')
             $Content = StringReplace($Content, ' : ', ': ')
             $Content = StringReplace($Content, ' . ', '. ')
             $Content = StringReplace($Content, ' ? ', '? ')
             $Content = StringReplace($Content, ' ! ', '! ')
             $Content = StringReplace($Content, ' ; ', '; ')
             $Content = StringStripWS($Content, 4)
             FileWriteLine("DUMP.txt", $Content)
         EndFunc

     Any ideas how to do the HTML to TEXT conversion?
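
     One direction that might be worth trying, offered only as a sketch: the crash on large pages may come from the `(.|\n)+?` groups, since a repeated group can make the regex engine recurse once per character. Using the `(?s)` option so that `.` also matches newlines, or a negated class such as `[^>]`, avoids the repeated group. The helper name `_HtmlToTextSketch` is hypothetical, the extra `<style>` rule is an addition not in the script above, and whether this removes the fatal error on a given site is an assumption that has not been verified.

```autoit
#include <INet.au3>

; Hypothetical helper, not part of the original script.
; Strips head/script/style blocks, comments and tags without using (.|\n)+? groups.
Func _HtmlToTextSketch($sHtml)
    Local $sText = StringStripCR($sHtml)
    ; (?is): case-insensitive, and "." also matches newlines
    $sText = StringRegExpReplace($sText, '(?is)<head\b.*?</head>', '')
    $sText = StringRegExpReplace($sText, '(?is)<script\b.*?</script>', '')
    $sText = StringRegExpReplace($sText, '(?is)<style\b.*?</style>', '')
    $sText = StringRegExpReplace($sText, '(?s)<!--.*?-->', '')
    ; Remaining tags: a negated character class needs no alternation group
    $sText = StringRegExpReplace($sText, '<[^>]*>', ' ')
    ; Flag 4 collapses runs of spaces between words
    Return StringStripWS($sText, 4)
EndFunc

; "any web site" is the placeholder from the post, not a real URL
Local $sSource = _INetGetSource("any web site")
FileWriteLine("DUMP.txt", _HtmlToTextSketch($sSource))
```

     The entity replacements (`&nbsp;`, `&amp;` and so on) from the original script could be applied to the result unchanged, since StringReplace does not use the regex engine.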