RegEx help required


saywell

Hi,

I have an HTML file that shows the differences between two Excel files. Changes and deletions appear in differently coloured blocks.

I need to add these to my SQLite database [but not fully automatically, as there are abbreviations and format errors that need manual checking].

I should be able to copy the changed section [as viewed in the browser], fetch it in my script with ClipGet, split it into an array with StringSplit, and load it into my GUI for manual inspection/editing before making the SQLite change.
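That clipboard step could look roughly like this; a minimal sketch, assuming the copied block is plain text with one record per line (the variable names are only for illustration):

```autoit
#include <Array.au3>

; Grab whatever text was copied from the browser
Local $sClip = ClipGet()
If @error Then
    MsgBox(0, "Error", "Clipboard could not be read as text.")
    Exit
EndIf

; Normalise line endings, then split into one array element per line
; ($aLines[0] holds the number of lines)
Local $aLines = StringSplit(StringStripCR($sClip), @LF)

; Inspect the result before loading it into the GUI
_ArrayDisplay($aLines, "Clipboard lines")
```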

The most tedious bit is finding the changed section, scrolling down the html page.

So, my question is, how do I strip out the bits with no background color set?

I've written a script that splits the rows into an array, then cycles through the array.

If an array element contains 'background-color:', the script does nothing. If it doesn't, the script should replace that row in the file with a single space, so the end file has just the coloured items in it.

However, I can't make it work! The code I'm using is:

#include <String.au3>
#include <Array.au3>
#Include <File.au3>

Local $backup = FileCopy (@ScriptDir & "\test_diff report.htm", @ScriptDir & "\test_diff report.bak",1)
If $backup = 0 Then
    MsgBox(0,"error", "File Not backed up",10)
    Exit
ElseIf $backup = 1 Then
    MsgBox (0, "Backup", "File backed up")  
EndIf

Local $file = FileOpen (@ScriptDir & "\test_diff report.htm")
If $file = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

Local $sHTML = FileRead ($file)

FileClose ($file)

 Local $aArray1 = _StringBetween($sHTML,'<tr valign="top">', '</tr>')
 
 _ArrayDisplay ($aArray1, "Array1")

For $i = 0 To UBound ($aArray1) - 1 ; Last valid index is UBound - 1
    
    Local $sRow = $aArray1[$i]
    $sRow = StringRegExpReplace($sRow, "\v", "<BR>")
    $sRow = '<tr valign="top">' & $sRow & '</tr>'
    
        
    Local $colorTest = StringInStr ($sRow, "background-color:")
    ToolTip ($sRow,50,50,"Processing "& $i & " of " & UBound ($aArray1),1,1)
    If $colorTest = 0 Then 
        If $i < 5 Then MsgBox (0, "Replace?", $sRow)
        $retval = _ReplaceStringInFile (@ScriptDir & "\test_diff report.htm", $sRow, " ",1,0)
        ConsoleWrite ("Ret = " &$retval & " @Error = " & @error & @CR)
        if $retval = -1 then
        msgbox(0, "ERROR", "The pattern could not be replaced: " & $sRow & " Error: " & @error)
        ContinueLoop
        EndIf
    EndIf
    
Next
ToolTip ("")
MsgBox (0, "Done", "Process has ended", 2)
Exit

The ConsoleWrite output shows a return value of 0 from _ReplaceStringInFile, indicating that the string was not found [though the rows look fine in the MsgBox/ToolTip text], yet @error = 0.

One possibility is that the array elements shown in _ArrayDisplay contain some non-text characters [shown as little rectangles], and I wonder if that's the problem. If so, how do I remove them?

I tried adding $sRow = StringRegExpReplace($sRow, "\v", "<BR>") in case they were vertical tab characters, to no avail.

I'd be grateful if anyone can help me here!

The test file is attached [a shortened version for testing].


  • Moderators

saywell,

I do not think you need an SRE here - just delete all the "normal" lines that do not have a "color: ; background-color:" tag like this:

#include <Array.au3>
#Include <File.au3>

Local $backup = FileCopy (@ScriptDir & "\test_diff report.htm", @ScriptDir & "\test_diff report.bak",1)
If $backup = 0 Then
    MsgBox(0,"error", "File Not backed up",10)
    Exit
ElseIf $backup = 1 Then
    MsgBox (0, "Backup", "File backed up")
EndIf

; Read file into array (element [0] holds the line count)
Global $aArray1
If Not _FileReadToArray(@ScriptDir & "\test_diff report.htm", $aArray1) Then
    MsgBox(0, "Error", "Unable to read file into array.")
    Exit
EndIf

For $i = UBound ($aArray1) - 1 To 1 Step -1 ; Must work from the bottom up or we screw the count by removing lines

    ; Remove all non-"background colour" data lines
    If StringInStr ($aArray1[$i], '<td class="ln"></td><td class="') Then ; This is what we find in "normal" lines
        _ArrayDelete($aArray1, $i)
    EndIf

Next

; Rewrite the file
_FileWriteFromArray(@ScriptDir & "\test_diff report Colour.htm", $aArray1, 1)

; And this is what you get!
ShellExecute(@ScriptDir & "\test_diff report Colour.htm")

Works for me. :)
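
As for why the original script got 0 back from _ReplaceStringInFile, a likely cause (not verified against the full file) is the "\v" replacement itself. In AutoIt's regular expressions "\v" matches any vertical whitespace, which includes CR and LF, so each $sRow has its line breaks turned into <BR> and no longer matches the text still in the file. The little rectangles in _ArrayDisplay are probably those same CR/LF characters. A minimal sketch to check this, using a made-up two-line row in place of a real array element:

```autoit
; Hypothetical two-line row standing in for one element of $aArray1
Local $sOriginal = '<td class="ln">1</td>' & @CRLF & '<td>text</td>'

; List every control character in the row (13 = CR, 10 = LF, 9 = TAB)
For $j = 1 To StringLen($sOriginal)
    Local $iCode = AscW(StringMid($sOriginal, $j, 1))
    If $iCode < 32 Then ConsoleWrite("Position " & $j & ": code " & $iCode & @CRLF)
Next

; \v matches CR and LF as well as vertical tab, so this rewrites the line breaks
Local $sChanged = StringRegExpReplace($sOriginal, "\v", "<BR>")

; The changed string can no longer be found in text that still has its line breaks
ConsoleWrite("Changed row found in source: " & (StringInStr($sOriginal, $sChanged) > 0) & @CRLF)
```

If the real rows report codes 13 and 10, searching for the unmodified row (without the "\v" replacement) would be the first thing to try.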

M23

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

