Existance

Changing unknown text between two known points

9 posts in this topic

Hi,

I am trying to write a script which updates content between the Title tags on lots of different html pages.

A typical html page will have a single line which contains content similar to the following:

<title>Page title info - unique to every page</title>

So far I have written a script that reads the location of each file to modify from a text file (one path per line), reads the new content from another text file (also one entry per line), and replaces a specified value. I need to update it to replace unknown text between two specified values: whatever sits between "<title>" and "</title>".

I am pretty sure I would have to use StringRegExp and StringRegExpReplace, but I am not quite sure how to properly find and replace unknown content between two known points (HTML tags). Please help.

I expect my code is somewhat average, but I am trying.

;-------------
;USER OPTIONS:
;-------------
;
;LOCATION AND NAME OF FILE CONTAINING THE LIST OF HTML FILES TO MODIFY:
$file_locations = 'C:\HTML UPDATER\FileList.txt'
;*** MUST BE BETWEEN SINGLE QUOTES
;
;LOCATION AND NAME OF FILE CONTAINING THE LIST OF TITLES TO ADD TO FILES:
$new_title_list = 'C:\HTML UPDATER\TitleList.txt'
;*** MUST BE BETWEEN SINGLE QUOTES
;
;TEXT TO REPLACE:
$old_text_to_replace = 'somerandomtext'
;*** MUST BE BETWEEN SINGLE QUOTES
;
;
;-----------------------------------------
;*** END OF USER CONFIGURABLE SETTINGS ***
;-----------------------------------------
;
;
;
;
;
;---------------------
;AUTOIT SPECIFIC CODE:
;---------------------
;
;EXIT IF THIS SCRIPT IS ALREADY RUNNING
$g_szVersion = "UPDATE HTML FILES"
If WinExists($g_szVersion) Then Exit ; It's already running
AutoItWinSetTitle($g_szVersion)
;
;DON'T SHOW THE AUTOIT TRAY ICON
;COMMENT OUT TO SHOW THE TRAY ICON
;#NoTrayIcon
;
;DON'T SHOW ANY UNHELPFUL AUTOIT BASED ERRORS IF PROBLEMS OCCUR
;WINDOWS WILL STILL SHOW ITS OWN ERROR MESSAGE IF PROBLEMS OCCUR
Opt("RunErrorsFatal", 0)
;
;INCLUDES:
#include <File.au3>
;


;-------------
;PROGRAM CODE:
;-------------


;SET VALUE OF $line TO 1 TO START OFF WITH
$line = 1


;BEGIN LOOP:
While 1 = 1


;CHECK IF FILE SPECIFIED AS THE $file_locations VARIABLE EXISTS AND READ SINGLE LINE OF $file_locations VARIABLE
If NOT FileExists($file_locations) Then
    MsgBox(0, 'ERROR', 'UNABLE TO FIND:' & @CRLF & '"' & $file_locations & '"')
    Exit
EndIf
;KEEP THE HANDLE RETURNED BY FileOpen - IT IS -1 IF THE FILE COULD NOT BE OPENED
$file_locations_handle = FileOpen($file_locations, 0)
If $file_locations_handle = -1 Then
    MsgBox(0, 'ERROR', 'UNABLE TO OPEN FILE: ' & @CRLF & '"' & $file_locations & '"')
    Exit
EndIf
$file_locations_data = FileReadLine($file_locations_handle, $line)
FileClose($file_locations_handle)


;CHECK IF FILE SPECIFIED AS THE $new_title_list VARIABLE EXISTS AND READ SINGLE LINE OF $new_title_list VARIABLE
If NOT FileExists($new_title_list) Then
    MsgBox(0, 'ERROR', 'UNABLE TO FIND:' & @CRLF & '"' & $new_title_list & '"')
    Exit
EndIf
$new_title_list_handle = FileOpen($new_title_list, 0)
If $new_title_list_handle = -1 Then
    MsgBox(0, 'ERROR', 'UNABLE TO OPEN FILE: ' & @CRLF & '"' & $new_title_list & '"')
    Exit
EndIf
$new_title_list_data = FileReadLine($new_title_list_handle, $line)
FileClose($new_title_list_handle)


;CHECK IF FILE SPECIFIED IN THE $file_locations_data VARIABLE EXISTS, CAN BE OPENED AND WRITE TO FILE
If NOT FileExists ($file_locations_data) Then MsgBox(0, 'TASK COMPLETE!', 'IT APPEARS ALL THE SPECIFIED FILES HAVE BEEN UPDATED')
If NOT FileExists ($file_locations_data) Then Exit
;
;FILE CHECK NOT WORKING - COMMENTED OUT:
;FileOpen($file_locations_data, 0)
;If $file_locations_data = -1 Then
    ;MsgBox(0, 'ERROR', 'UNABLE TO OPEN FILE: ' & @CRLF & '"' & $new_title_list_data & '"')
    ;Exit
;EndIf
;
$retval = _ReplaceStringInFile($file_locations_data, $old_text_to_replace, $new_title_list_data)
;
;NO FileClose NEEDED HERE: _ReplaceStringInFile TAKES A FILE PATH AND OPENS AND CLOSES THE FILE ITSELF


;ADD +1 TO VARIABLE $line
$line = $line+1

WEnd

Exit

Thanks guys,

Liam

#2 ·  Posted (edited)

If you are not familiar with regular expressions, you can use

_StringBetween() + StringReplace()

instead of StringRegExpReplace().

Search the forum for StringRegExpReplace for examples of its use.

EDIT: definitely look at this post:

http://www.autoitscript.com/forum/index.php?showtopic=96512
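
For readers who want to see the _StringBetween() + StringReplace() combination spelled out, here is a minimal sketch that works on a string in memory. The sample HTML and the variable names are made up for illustration; a real script would read the string from a file first.

```autoit
#include <String.au3> ; provides _StringBetween

; Sample input standing in for the contents of one HTML file (made-up text).
Local $sHtml = '<html><head><title>Old page title</title></head></html>'
Local $sNewTitle = 'New page title'

; _StringBetween returns an array of matches and sets @error if nothing is found.
Local $aBetween = _StringBetween($sHtml, '<title>', '</title>')
If @error Then
    ConsoleWrite('No <title> tag found' & @CRLF)
Else
    ; Include the tags in the search text so the same words elsewhere on the page are left alone.
    Local $sOld = '<title>' & $aBetween[0] & '</title>'
    Local $sNew = '<title>' & $sNewTitle & '</title>'
    ; The fourth parameter (1) limits StringReplace to the first occurrence.
    Local $sResult = StringReplace($sHtml, $sOld, $sNew, 1)
    ConsoleWrite($sResult & @CRLF)
EndIf
```

The design choice here is to search for the old title together with its tags, which keeps the replacement from touching identical text in the page body.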

Edited by Zedna


Not absolutely sure what you're after. Would something like this work for you?

$sOldLine = 'some surrounding blah bla before<title>somerandomtext</title>some surrounding blah bla after'
$sNewTitle = '<title>a new random blah blah</title>'
; (?i) ignores case, and .*? is non-greedy, so the match ends at the first </title> rather than the last one on the line
$sNewLine = StringRegExpReplace($sOldLine,'(?i)<title>.*?</title>',$sNewTitle)
ConsoleWrite('$sNewLine = ' & $sNewLine & @crlf)
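
The same idea can be applied to a whole file rather than a single line. The following is only a sketch under a few assumptions: the function name SetHtmlTitle and the path in the example call are hypothetical, the file is small enough to read in one go, and the new title contains no backslashes or "$" followed by digits (StringRegExpReplace may treat those as back-references in the replacement text).

```autoit
; Hypothetical helper: replaces the first <title>...</title> block in one file.
; Returns True if a title was replaced and written back, False otherwise.
Func SetHtmlTitle($sFilePath, $sNewTitle)
    Local $hIn = FileOpen($sFilePath, 0) ; 0 = read mode
    If $hIn = -1 Then Return False
    Local $sHtml = FileRead($hIn)
    FileClose($hIn)

    ; (?i) ignores case, (?s) lets "." span line breaks, ".*?" stops at the first </title>.
    Local $sResult = StringRegExpReplace($sHtml, '(?is)<title>.*?</title>', '<title>' & $sNewTitle & '</title>', 1)
    ; @extended holds the number of replacements; 0 means no <title> block was found.
    If @extended = 0 Then Return False

    Local $hOut = FileOpen($sFilePath, 2) ; 2 = write mode, erases previous contents
    If $hOut = -1 Then Return False
    FileWrite($hOut, $sResult)
    FileClose($hOut)
    Return True
EndFunc

; Example call with a hypothetical path.
ConsoleWrite(SetHtmlTitle('C:\test\test.htm', 'NEW INFO') & @CRLF)
```

In a batch script, a function like this could be called once per line pair read from the file list and the title list. Trying it on copies of the pages first is a sensible precaution, since the file is overwritten in place.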



#4 ·  Posted (edited)

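MsgBox(0, "", _
        StrReplaceBetween("firefox.htm", "New.htm", "New_Title", "<title>", "</title>"))

Func StrReplaceBetween($FileIn, $FileOut, $NewString, $substring_A, $substring_B, $count = 1)
    ; Read the whole input file
    Local $HFIN = FileOpen($FileIn, 0)
    If $HFIN = -1 Then Return False
    Local $TXTIN1 = FileRead($HFIN)
    FileClose($HFIN)
    ; StringInStr returns 0 (it does not set @error) when the substring is not found
    Local $position_substring_A = StringInStr($TXTIN1, $substring_A)
    If $position_substring_A = 0 Then Return False
    ; Look for the closing marker only after the end of the opening marker
    Local $position_substring_B = StringInStr($TXTIN1, $substring_B, 0, 1, $position_substring_A + _
            StringLen($substring_A))
    If $position_substring_B = 0 Then Return False
    ; $TXTOUT1 is the old block including both markers, e.g. "<title>old text</title>"
    Local $TXTOUT1 = StringMid($TXTIN1, $position_substring_A, ($position_substring_B _
            + StringLen($substring_B)) - $position_substring_A)
    Local $TXTOUT2 = StringReplace($TXTIN1, $TXTOUT1, $substring_A & $NewString & $substring_B, $count)
    ; StringReplace reports the number of replacements in @extended
    If @extended = 0 Then Return False
    ; Open the output file only now, so a failed search does not leave an empty file behind
    Local $HFOUT = FileOpen($FileOut, 2)
    If $HFOUT = -1 Then Return False
    FileWrite($HFOUT, $TXTOUT2)
    FileClose($HFOUT)
    Return True
EndFunc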
Edited by wolf9228


Thanks for everyone's replies. Very much appreciated.

I am only at the very basic user level, and unfortunately some of the replies were quite honestly a little over my head. I am getting there though, slowly but surely.

As I am not experienced with arrays (yet), I went down the "_StringBetween" and "_ReplaceStringInFile" path, but I think I stuffed something up (probably something simple), as I haven't been able to get it working properly yet.

I also watered down my original code to focus on the core issue, so what I am really trying to accomplish should be pretty straightforward.

I would really appreciate some feedback or suggestions on the following:

;INCLUDES:
#include <File.au3>
#include <String.au3>
#include <array.au3>
;
$file_location = 'C:\test\test.txt' ; FILE CONTAINS THE FOLLOWING TEXT: <title>SOME RANDOM TEST INFO</title>
$new_title = 'NEW INFO'

$sFile = FileRead($file_location)
$aBetween = _StringBetween($sFile, '<title>', '</title>')

_ArrayDisplay($aBetween) ; WORKS FINE

MsgBox(0, '', $aBetween) ; WHY DOESN'T THIS WORK???

_ReplaceStringInFile($file_location, $aBetween, $new_title) ; ALSO NOT WORKING, BUT DON'T KNOW WHY EITHER.

Exit

Thanks,

Liam


On success, _StringBetween returns a 0-based array; $array[0] contains the first string found.

Change MsgBox(0, '', $aBetween) to MsgBox(0, '', $aBetween[0])


#7 ·  Posted (edited)

MsgBox(0, '', $aBetween) ; WHY DOESN'T THIS WORK???

Because you can't display an array, only the elements.

_ReplaceStringInFile($file_location, $aBetween, $new_title) ; ALSO NOT WORKING, BUT DON'T KNOW WHY EITHER.

Same problem. $aBetween is an array. If there is only one Title in the text then the array will only have one element but it's still an array. Assuming you only want the first title found then you need $aBetween[0]. (See the help for this function)

You don't want to replace just $aBetween[0] with the new title, because that text might also appear elsewhere in the file and not only between the title tags. So, as wolf9228 has already shown, include the tags in both the search and the replacement strings:

_ReplaceStringInFile($file_location,'<title>' & $aBetween[0] & '</title>', '<title>' & $new_title & '</title>')
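
Putting the two corrections together, the watered-down test script might look like the sketch below. It keeps the file path and title from the post above; the variable names are renamed for illustration, and the IsArray check is an addition to cover the case where no title tag is found.

```autoit
#include <File.au3>   ; _ReplaceStringInFile
#include <String.au3> ; _StringBetween

Local $sFileLocation = 'C:\test\test.txt' ; file contains: <title>SOME RANDOM TEST INFO</title>
Local $sNewTitle = 'NEW INFO'

Local $sFile = FileRead($sFileLocation)
Local $aBetween = _StringBetween($sFile, '<title>', '</title>')

If IsArray($aBetween) Then
    ; Index the array to get a string that MsgBox and _ReplaceStringInFile can use.
    MsgBox(0, 'Current title', $aBetween[0])

    Local $sOld = '<title>' & $aBetween[0] & '</title>'
    Local $sNew = '<title>' & $sNewTitle & '</title>'
    ; _ReplaceStringInFile returns -1 on failure, otherwise the number of replacements.
    Local $iCount = _ReplaceStringInFile($sFileLocation, $sOld, $sNew)
    If $iCount = -1 Then MsgBox(0, 'ERROR', 'Could not update ' & $sFileLocation)
Else
    MsgBox(0, 'ERROR', 'No <title> tag found in ' & $sFileLocation)
EndIf
```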

Edited by martin


Use:

_ArrayDisplay($array)

to display an array. You must #include <Array.au3> for it to work.
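
_ArrayDisplay opens a window, which is handy while debugging. When a script should simply log or process each element, a plain loop over the array is an alternative. A minimal sketch, with made-up sample text containing two title blocks:

```autoit
#include <String.au3> ; provides _StringBetween

; Made-up sample text with two <title> blocks, so the array has two elements.
Local $sText = '<title>First</title> some text <title>Second</title>'
Local $aBetween = _StringBetween($sText, '<title>', '</title>')

If IsArray($aBetween) Then
    ; UBound returns the number of elements; valid indexes run from 0 to UBound - 1.
    For $i = 0 To UBound($aBetween) - 1
        ConsoleWrite('Element ' & $i & ': ' & $aBetween[$i] & @CRLF)
    Next
EndIf
```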


Hi ;)


Thanks everyone. That helped heaps.

Liam
