Ideas needed on Compare Tool


ScottW

So, I'm working on a compare tool that will compare the output of a route in the legacy HL7 engine with the output of the same route in the new HL7 engine.  The idea is to run a month's worth of messages into the test route and capture the output in a file, then compare that file to the current live output.  This will tell me whether all my coding and filtering has been done correctly.

We have UltraEdit, but an HL7 message consists of several lines.  The files consist of 500-500,000 HL7 messages each.  We don't want lines to be compared, we want messages.

An HL7 message looks like this.  Imagine a file with 100,000 of them stacked on top of each other, to get an idea of what I'm working with.

MSH|^~\&|EPIC|EPICADT|SMS|SMSADT|199912271408|CHARRIS|ADT^A04|1817457|D|2.5|
PID||0493575^^^2^ID 1|454721||DOE^JOHN^^^^|DOE^JOHN^^^^|19480203|M||B|254 MYSTREET AVE^^MYTOWN^OH^44123^USA||(216)123-4567|||M|NON|400003403~1129086|
NK1||ROE^MARIE^^^^|SPO||(216)123-4567||EC|||||||||||||||||||||||||||
PV1||O|168 ~219~C~PMA^^^^^^^^^||||277^ALLEN MYLASTNAME^BONNIE^^^^|||||||||| ||2688684|||||||||||||||||||||||||199912271408||||||002376853

 

Each segment is prefixed with a segment type, each field in the segment is separated by a pipe, each sub-field by a caret, and each sub-sub-field by an ampersand.
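
To make the delimiters concrete, here is a minimal sketch that splits the sample segments above with StringSplit. The variable names are only for illustration. Note that in the MSH segment the first pipe itself counts as MSH.1, so MSH.10 ends up at array index 9, while in other segments the array index matches the field number.

```autoit
; Illustration only: split segments into fields, then one field into sub-fields.
Local $sMSH = "MSH|^~\&|EPIC|EPICADT|SMS|SMSADT|199912271408|CHARRIS|ADT^A04|1817457|D|2.5|"
Local $sPID = "PID||0493575^^^2^ID 1|454721||DOE^JOHN^^^^|DOE^JOHN^^^^|19480203|M|"

; Flag 2 ($STR_NOCOUNT) returns a 0-based array without the count in element 0.
Local $aMSH = StringSplit($sMSH, "|", 2)
ConsoleWrite("MSH.10:     " & $aMSH[9] & @CRLF) ; 1817457

Local $aPID = StringSplit($sPID, "|", 2)
ConsoleWrite("Segment:    " & $aPID[0] & @CRLF) ; PID
ConsoleWrite("PID.5:      " & $aPID[5] & @CRLF) ; DOE^JOHN^^^^

; Sub-fields are separated by a caret.
Local $aName = StringSplit($aPID[5], "^", 2)
ConsoleWrite("Last name:  " & $aName[0] & @CRLF) ; DOE
ConsoleWrite("First name: " & $aName[1] & @CRLF) ; JOHN
```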

 

I started this project by writing each file to an array, splitting on MSH|^~\&.  This gave me 2 arrays to compare.  I ran the compare using this snippet, modeled off a post I found here:

_ArraySort($AnchorArray, 0, 1, 0, 0, 1)

; comparing the 2 arrays
For $i = 0 to UBound($TestingArray) - 1
    $Index = _ArrayBinarySearch($AnchorArray, $TestingArray[$i], 1)
    GUICtrlSetData($LabelStatus, "Status: comparing record " & $i)

    ; add equal rows to a string
    If $Index <> -1 Then
        $delString1 &= ";" & $Index
        $delString2 &= ";" & $i
    EndIf
Next
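; removing the equal rows from the arrays
; (strip the leading ";" first, otherwise the range string is not valid for _ArrayDelete)
_ArrayDelete($AnchorArray, StringTrimLeft($delString1, 1))
_ArrayDelete($TestingArray, StringTrimLeft($delString2, 1))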


; writing the result to files
_FileWriteFromArray("migrationcompare\"&$FnMIA, $TestingArray)
_FileWriteFromArray("migrationcompare\"&$FnMIT, $AnchorArray)

This removed the matching messages, leaving me with 2 files, (Missing in Anchor) and (Missing in Test).  The messages in these files can either be genuinely missing or be mapped wrong: they may exist on both sides, but some part of the string may be different.  So I took it a step further.

I then run each of these through another step and create 2 files with just the MSH.10 field (1817457 in the sample above).  An MSH.10 value SHOULD be a unique ID for each message.  I then compare these 2 MSH.10 files to see which MSH.10 values match between (Missing in Anchor) and (Missing in Test).  That tells me which messages exist in both the anchor (legacy) outbound and the test (new) outbound but don't match; these I have to correct with my mapper logic.  I also create 2 other files, one for messages that got past my filters and shouldn't have, and one for messages that didn't get across and should have.  I use these to work on my filtering logic.

The only thing is these 3 final files are just MSH.10 values.  I then have to search my files for the actual record, pull it out of the legacy and the test files, and do a compare with Notepad++ or UltraEdit to see what exactly doesn't match.  What I'd like is for my final 3 files to contain not just the MSH.10 value but the actual messages, so I can skip the step of finding them.

So my thoughts are:  look at the 2 'Missing' files that have the full HL7 records in them, search for the matching MSH.10 values from above, and create 2 new missing files that contain only the messages with matching MSH.10 values.  That should work, but a string-in-string function will take too long, so my thought is to prefix each message with the MSH.10 value and then do a compare on only the first X characters...

But am I wasting my time and CPU?  Can I 'quickly' compare the MSH.10 value of each file and sort them out without first extracting the MSH.10 values into their own files?  The following seems quicker, but how do I use it with just the MSH.10 value?  I'm fumbling with this right now.

Local $a=$AnchorArray
Local $b=$TestingArray

Local $sda = ObjCreate("Scripting.Dictionary")
Local $sdb = ObjCreate("Scripting.Dictionary")
Local $sdc = ObjCreate("Scripting.Dictionary")

For $i In $a
    $sda.Item($i)
Next
For $i In $b
    $sdb.Item($i)
Next

For $i In $a
    If $sdb.Exists($i) Then $sdc.Item($i)
Next
$asd3 = $sdc.Keys()

For $i In $asd3
    If $sda.Exists($i) Then $sda.Remove($i)
    If $sdb.Exists($i) Then $sdb.Remove($i)
Next
$asd1 = $sda.Keys()
$asd2 = $sdb.Keys()

_ArrayDisplay($asd1, "$asd1")
_ArrayDisplay($asd2, "$asd2")
_ArrayDisplay($asd3, "$asd3")

; writing the result to files
_FileWriteFromArray("migrationcompare\new_"&$FnMIA, $asd2)
_FileWriteFromArray("migrationcompare\new_"&$FnMIT, $asd1)

(also found here somewhere)
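
One possible way to use the dictionary approach with just the MSH.10 value is to make MSH.10 the dictionary key and keep the whole message as the value, so no separate MSH.10 files are needed. The following is only a rough sketch under assumptions: each array element still starts with MSH|^~\&, MSH.10 is unique per side, and the helper names _GetMSH10 and _IndexByMSH10 as well as the two sample messages are made up for illustration. The compare uses == because = and <> ignore case for strings in AutoIt.

```autoit
; Sketch: index each side by MSH.10, keeping the whole message as the value.
; _GetMSH10() and _IndexByMSH10() are illustrative helpers, not library functions.

Func _GetMSH10($sMessage)
    ; In a message that starts with "MSH|^~\&", MSH.10 lies between the 9th and 10th pipe.
    Local $iStart = StringInStr($sMessage, "|", 0, 9)
    Local $iEnd = StringInStr($sMessage, "|", 0, 10)
    If $iStart = 0 Or $iEnd = 0 Then Return ""
    Return StringMid($sMessage, $iStart + 1, $iEnd - $iStart - 1)
EndFunc

Func _IndexByMSH10($aMessages)
    Local $oIndex = ObjCreate("Scripting.Dictionary")
    For $sMessage In $aMessages
        Local $sId = _GetMSH10($sMessage)
        ; A later duplicate overwrites an earlier one; real data may need another policy.
        If $sId <> "" Then $oIndex.Item($sId) = $sMessage
    Next
    Return $oIndex
EndFunc

; Made-up sample data: same MSH.10 on both sides, but the PID segment differs.
Local $sMSH = "MSH|^~\&|EPIC|EPICADT|SMS|SMSADT|199912271408|CHARRIS|ADT^A04|1817457|D|2.5|"
Local $aAnchor[1], $aTest[1]
$aAnchor[0] = $sMSH & @CR & "PID||0493575|454721||DOE^JOHN|"
$aTest[0] = $sMSH & @CR & "PID||0493575|454721||DOE^JON|"

Local $oAnchor = _IndexByMSH10($aAnchor)
Local $oTest = _IndexByMSH10($aTest)

; Iterating a dictionary with For...In walks its keys.
For $sId In $oAnchor
    If Not $oTest.Exists($sId) Then
        ConsoleWrite($sId & ": missing in test" & @CRLF)
    ElseIf Not ($oAnchor.Item($sId) == $oTest.Item($sId)) Then
        ConsoleWrite($sId & ": on both sides but different" & @CRLF)
    EndIf
Next
For $sId In $oTest
    If Not $oAnchor.Exists($sId) Then ConsoleWrite($sId & ": missing in anchor" & @CRLF)
Next
```

Because the full message is the dictionary value, the "different" branch could write both versions of the message straight to a file instead of only reporting the ID.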

Ideas?  Suggestions?  I can post the entire source code if needed.

 

Thanks

 

Scott


Thanks for the suggestion, I'll file that under the to-learn tab in my to-do list.  :)  I don't think it'll work very well for what I am working on.

 

Here's what I finally came up with.  I like it.  It works!

It takes 2 large collections of messages and splits them into arrays at the beginning of each message.  It compares the 2 arrays and deletes the exact matches, which leaves me 2 sets without any exact matches.  Then it looks at these 2 sets and matches any messages that have the same MSH.10 value, and writes these matches out to 2 files (sorted) so they can be compared line by line later to see where the mapper errors are.  Finally, it writes out two more files with the extra messages from each side of the compare.  Now to just rework my front end and plug this in.

Here's my code... I'm sure there's some room for improvement, so fire away.

#Include <Array.au3>
#Include <File.au3>
$finalMSH10 = ""
$finalMSH10_test = ""
$Test_matches = ""
$Test_nonmatch = ""
$Anchor_matches = ""
$Anchor_nonmatches = ""
;--------------------------creates anchor array and anchor MSH.10 array---------------------------------------
;Local $Array_anchor = StringSplit(FileRead("C:\AutoIT\RewriteHL7\VarianADT2019-02-08-10-30-55-355 (2).txt"), "MSH|^~\&", 1)
Local $Array_anchor = StringSplit(FileRead("C:\AutoIT\RewriteHL7\Range_Anchor_20190129.txt"), "MSH|^~\&", 1)
; drop the count in [0] and the empty chunk in [1] that precedes the first MSH
; (assumes the file starts with MSH|^~\&, same as the test array below)
_arraydelete($Array_anchor, '0-1')
for $ACount = 0 to UBound($Array_anchor) - 1
    ; StringSplit stripped the delimiter, so MSH.10 sits between the 8th and 9th pipe here
    Local $start = stringinstr($Array_anchor[$ACount], "|", 0, 8, 1)
    Local $end = stringinstr($Array_anchor[$ACount], "|", 0, 9, 1)
    local $MSH10_Lenght = $end - $start
    local $MSH10 = StringMid($Array_anchor[$ACount], $start + 1, $MSH10_Lenght - 1)
    ;GUICtrlSetData($LabelStatus, "Status: Pulling MSH from Anchor-" & $a)
    $finalMSH10 = $finalMSH10 & $MSH10 & @CRLF
    $Array_anchor[$ACount] = "MSH|^~\&" & $Array_anchor[$ACount] ;put MSH and control characters back on
next
;build the MSH.10 only array for compare later
$Array_MSH10_Anchor = StringSplit($finalMSH10, @CRLF, 1)
_arraydelete($Array_MSH10_Anchor, '0') ;drop the count in [0]
;_ArrayDisplay($Array_MSH10_Anchor)

;------------------------Creates Test Array and test MSH.10 array----------------------------------------
;Local $Array_test = StringSplit(FileRead("C:\AutoIT\RewriteHL7\Range_Anchor_20190129.txt"), "MSH|^~\&", 1)
Local $Array_test = StringSplit(FileRead("C:\AutoIT\RewriteHL7\VarianADT2019-02-08-10-30-55-355 (2).txt"), "MSH|^~\&", 1)
for $TCount = 0 to UBound($Array_test) - 1
    Local $start = stringinstr($Array_test[$TCount], "|", 0, 8, 1)
    Local $end = stringinstr($Array_test[$TCount], "|", 0, 9, 1)
    local $MSH10_Length = $end - $start
    local $MSH10_test = StringMid($Array_test[$TCount], $start + 1, $MSH10_Length - 1)
    ;GUICtrlSetData($LabelStatus, "Status: Pulling MSH from Anchor-" & $a)
    $finalMSH10_test = $finalMSH10_test & $MSH10_test & @CRLF
    $Array_test[$TCount] = "MSH|^~\&" & $Array_test[$TCount] ;put MSH and control characters back on
Next
$Array_MSH10_Test = StringSplit($finalMSH10_test, @CRLF, 1)
_arraydelete($Array_test, '0-1')


;-----------create dictionary objects----------------------
Local $Object_String_Anchor = ObjCreate("Scripting.Dictionary") ;Holds Anchor Array
Local $Object_String_A_MSH10 = ObjCreate("Scripting.Dictionary") ;Holds Anchor MSH.10 values
Local $Object_String_T_MSH10 = ObjCreate("Scripting.Dictionary") ;Holds Test MSH.10 values
Local $Object_String_Test = ObjCreate("Scripting.Dictionary") ;Holds Test Array
Local $Object_String_Matches = ObjCreate("Scripting.Dictionary") ; holds array of matched messages, used to delete good matches

;-----------  populate $Object_String_A_MSH10 object with anchor array msh.10 values-------------
For $i In $Array_MSH10_Anchor
    $Object_String_A_MSH10.Item($i)
Next

;-----------  populate $Object_String_T_MSH10 object with test array msh.10 values-------------
_ArrayDelete($Array_MSH10_Test, '0-2')
For $i In $Array_MSH10_Test
    $Object_String_T_MSH10.Item($i)
Next

;-----------  populate $Object_String_Anchor object with anchor array-------------
For $i In $Array_anchor
    $Object_String_Anchor.Item($i)
Next
;-------------- populate $Object_String_Test object with test array--------------
For $i In $Array_test
    $Object_String_Test.Item($i)
Next
;-------------- check anchor array for matching items in test array and add them to the $Object_String_Matches object ----------------
For $i In $Array_anchor
    If $Object_String_Test.Exists($i) Then $Object_String_Matches.Item($i)
Next
;----- populate matching array Array_returned_Matches with values from $Object_String_Matches object -----------
$Array_returned_Matches = $Object_String_Matches.Keys()

;------------------ take the matches of Array_returned_Matches and remove them from the $Object_String_Anchor and $Object_String_Test objects ----------------------
For $i In $Array_returned_Matches
    If $Object_String_Anchor.Exists($i) Then $Object_String_Anchor.Remove($i)
    If $Object_String_Test.Exists($i) Then $Object_String_Test.Remove($i)
Next
;------ set arrays $Array_returned_Anchor and Array_returned_Test to the $Object_String_Anchor and $Object_String_Test objects -------------
$Array_returned_Anchor = $Object_String_Anchor.Keys()
$Array_returned_Test = $Object_String_Test.Keys()
;----------Part 2---------------------
;comments-start
;-----------  Pull matching records out for Test-------------
for $TC = 0 to UBound($Array_returned_Test) - 1
    Local $start = stringinstr($Array_returned_Test[$TC], "|", 0, 9, 1)
    Local $end = stringinstr($Array_returned_Test[$TC], "|", 0, 10, 1)
    local $MSH10_Lenght = $end - $start
    ;local $MSH10=StringMid($Array_returned_Test[$TC],$start+1, $MSH10_Lenght-1)
    if $Object_String_A_MSH10.Exists(StringMid($Array_returned_Test[$TC], $start + 1, $MSH10_Lenght - 1)) Then
        $Test_matches = $Test_matches & "AWAWRERET" & $Array_returned_Test[$TC] & @CRLF
    Else
        $Test_nonmatch = $Test_nonmatch & $Array_returned_Test[$TC] & @CRLF
    EndIf

next
$Sorted_Test_Matches = StringSplit($Test_matches, "AWAWRERET", 1)
_arraysort($Sorted_Test_Matches)
;_ArrayDisplay($Sorted_Test_Matches)
_FileWriteFromArray("C:\AutoIT\RewriteHL7\_Test_matches_sorted.txt", $Sorted_Test_Matches)
;$hFilehandle = FileOpen("C:\AutoIT\RewriteHL7\_Test_matches.txt", $FO_OVERWRITE)
;FileWrite($hFilehandle,$Test_matches)
;FileClose($hFilehandle)

$hFilehandle = FileOpen("C:\AutoIT\RewriteHL7\_Test_Non_matches.txt", $FO_OVERWRITE)
FileWrite($hFilehandle, $Test_nonmatch)
FileClose($hFilehandle)
;#comments-end
;-----------  Pull matching records out for Anchor-------------
for $AC = 0 to UBound($Array_returned_Anchor) - 1
    Local $start = stringinstr($Array_returned_Anchor[$AC], "|", 0, 9, 1)
    Local $end = stringinstr($Array_returned_Anchor[$AC], "|", 0, 10, 1)
    local $MSH10_Lenght = $end - $start
    if $Object_String_T_MSH10.Exists(StringMid($Array_returned_Anchor[$AC], $start + 1, $MSH10_Lenght - 1)) Then
        ;_arrayadd($Anchor_matches,$Array_returned_Anchor[$AC])
        $Anchor_matches = $Anchor_matches & "AWAWRERET" & $Array_returned_Anchor[$AC] & @CRLF
    Else
        ;_arrayadd($Anchor_nonmatches,$Array_returned_Anchor[$AC])
        $Anchor_nonmatches = $Anchor_nonmatches & $Array_returned_Anchor[$AC] & @CRLF
    EndIf

next
$Sorted_Anchor_Matches = StringSplit($Anchor_matches, "AWAWRERET", 1)
_arraysort($Sorted_Anchor_Matches)
_ArrayDisplay($Sorted_Anchor_Matches)
_FileWriteFromArray("C:\AutoIT\RewriteHL7\_Anchor_matches_sorted.txt", $Sorted_Anchor_Matches)
;$hFilehandle = FileOpen("C:\AutoIT\RewriteHL7\_Anchor_matches.txt", $FO_OVERWRITE)
;FileWrite($hFilehandle,$Anchor_matches)
;FileClose($hFilehandle)

$hFilehandle = FileOpen("C:\AutoIT\RewriteHL7\_Anchor_Non_matches.txt", $FO_OVERWRITE)
FileWrite($hFilehandle, $Anchor_nonmatches)
FileClose($hFilehandle)



;_ArrayDisplay($Array_returned_Anchor, "$Array_returned_Anchor")
;_ArrayDisplay($Array_returned_Test, "$Array_returned_Test")
;_ArrayDisplay($Array_returned_Matches, "$Array_returned_Matches")

; ------------write out arrays to files ------------------------------------
_Filewritefromarray("C:\AutoIT\RewriteHL7\HL71(MissInTest).txt", $Array_returned_Anchor)
_Filewritefromarray("C:\AutoIT\RewriteHL7\HL72(MissinAnch).txt", $Array_returned_Test)
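
For the messages that share an MSH.10 value but differ, the manual line-by-line compare could in principle be narrowed down to field level. The following is a rough sketch, not part of the script above: it assumes both versions of a message carry their segments in the same order, and the function name _HL7FieldDiff and the sample messages are made up for illustration.

```autoit
; Sketch: report which fields differ between two versions of the same message.
; _HL7FieldDiff() is an illustrative name, not a library function.
Func _HL7FieldDiff($sAnchorMsg, $sTestMsg)
    ; Treat CRLF, LF and CR alike, then split into segments (flag 2 = no count element).
    Local $aSegA = StringSplit(StringRegExpReplace($sAnchorMsg, "\r\n|\n", @CR), @CR, 2)
    Local $aSegT = StringSplit(StringRegExpReplace($sTestMsg, "\r\n|\n", @CR), @CR, 2)
    Local $iSegs = UBound($aSegA)
    If UBound($aSegT) > $iSegs Then $iSegs = UBound($aSegT)
    Local $sReport = ""
    For $s = 0 To $iSegs - 1
        Local $sSegA = "", $sSegT = ""
        If $s < UBound($aSegA) Then $sSegA = $aSegA[$s]
        If $s < UBound($aSegT) Then $sSegT = $aSegT[$s]
        ; == is a case-sensitive string compare; identical segments are skipped.
        If $sSegA == $sSegT Then ContinueLoop
        Local $aFldA = StringSplit($sSegA, "|", 2)
        Local $aFldT = StringSplit($sSegT, "|", 2)
        Local $iFlds = UBound($aFldA)
        If UBound($aFldT) > $iFlds Then $iFlds = UBound($aFldT)
        For $f = 0 To $iFlds - 1
            Local $sFldA = "", $sFldT = ""
            If $f < UBound($aFldA) Then $sFldA = $aFldA[$f]
            If $f < UBound($aFldT) Then $sFldT = $aFldT[$f]
            If Not ($sFldA == $sFldT) Then
                $sReport &= "segment " & ($s + 1) & " (" & $aFldA[0] & "), field index " & $f & _
                        ": [" & $sFldA & "] vs [" & $sFldT & "]" & @CRLF
            EndIf
        Next
    Next
    Return $sReport
EndFunc

; Made-up sample: identical MSH segment, one differing PID field.
Local $sSampleMSH = "MSH|^~\&|EPIC|EPICADT|SMS|SMSADT|199912271408|CHARRIS|ADT^A04|1817457|D|2.5|"
Local $sSampleAnchor = $sSampleMSH & @CR & "PID||0493575|454721||DOE^JOHN|"
Local $sSampleTest = $sSampleMSH & @CR & "PID||0493575|454721||DOE^JON|"
ConsoleWrite(_HL7FieldDiff($sSampleAnchor, $sSampleTest))
; prints: segment 2 (PID), field index 5: [DOE^JOHN] vs [DOE^JON]
```

The field index is the position after splitting on the pipe, so for an MSH segment it is one less than the HL7 field number.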

 
