_datadifference() Get the real difference between two sets of data. (Supports colors, binary + more!)


nullschritt

Hey guys, I saw a couple of posts in the help section recently asking for a way to compare images and find out how different the two actually are. So I wrote a function that directly compares two sets of data, character by character, and measures how far each character is from its counterpart. In other words, this function won't tell you the percentage of data that changed; what it does tell you is how different the two sets of data are from each other.

You may ask, what's the difference? Put simply, if you have the strings 001 and 021, the percentage of change is 33% (one digit in three changed), but the difference between the two, as computed by the function below in base 10 ($decimal = 9), is only about 7.4%. Similarly, with 001 and 051 the percentage of change is still 33%, but the difference is now about 18.5%. We're not checking what has changed, but rather how much it has changed from its original value.
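To make the distinction concrete, here is a minimal sketch that prints both measures for the two pairs above. `_DemoCompare` is a hypothetical helper written only for this illustration; it assumes decimal digit strings of equal length and uses the same scaling as `_datacompare` with `$decimal = 9`. Worked by hand, the two pairs come out at about 7.4% and 18.5%.

```autoit
; Hypothetical helper for illustration only: compares two equal-length strings
; of decimal digits and reports both measures.
Func _DemoCompare($s1, $s2)
    Local $a1 = StringSplit($s1, ""), $a2 = StringSplit($s2, "")
    Local $changed = 0, $distance = 0
    For $i = 1 To $a1[0]
        ; "Percentage of change" only counts whether a digit differs at all.
        If $a1[$i] <> $a2[$i] Then $changed += 1
        ; "Difference" adds up how far apart the two digits are.
        $distance += Abs(Number($a1[$i]) - Number($a2[$i]))
    Next
    ConsoleWrite($s1 & " vs " & $s2 & ": " & _
            Round($changed / $a1[0] * 100, 1) & "% of digits changed, " & _
            Round($distance / $a1[0] / 9 * 100, 1) & "% difference" & @CRLF)
EndFunc

_DemoCompare("001", "021") ; distance 2 over 3 digits, max 9 per digit
_DemoCompare("001", "051") ; distance 5 over 3 digits, max 9 per digit
```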

While this was primarily designed to compare the difference between two colors, you could get the relative difference between any two sets of data.

This can also be used to see how random an output is. I'm sure others will think of interesting uses too!

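The example below creates two random images as hex strings (six hex digits per pixel) and compares them. As posted, the loop runs 2000 times, which is 2000 pixels, or roughly a 45x45 image; a true 1000x1000 image needs 1,000,000 iterations. You'll notice the output is always around 34%. That is not a measure of how random Random() is, though: for two independent, uniformly random hex strings this metric is expected to land near 35%, since the average gap between two random hex digits is about 5.3 out of a possible 15.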

Example:

Global $pixelseed = StringSplit('0123456789abcdef', ""), $image1 = "", $image2 = ""

; 2000 iterations x 6 hex digits = 2000 pixels (roughly a 45x45 image).
; Use 1000000 iterations to simulate a full 1000x1000 image.
For $i = 1 To 2000
    For $j = 1 To 6 ; six hex digits (RRGGBB) per pixel
        $image1 &= $pixelseed[Random(1, $pixelseed[0], 1)]
        $image2 &= $pixelseed[Random(1, $pixelseed[0], 1)]
    Next
Next

$timer = TimerInit()
$compare = _datacompare($image1, $image2)
ConsoleWrite($compare&'% Different'&@CRLF&'Took '&(TimerDiff($timer)/1000)&' seconds.'&@CRLF)
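As a sanity check on the number the example prints, the expected score for two independent, uniformly random hex digits can be computed exhaustively over all 16 x 16 pairs. This is a small sketch of that arithmetic, not part of the original function; it assumes every hex digit is equally likely. The result is roughly 35.4%, so a score in the mid-30s is what uniformly random input produces by construction.

```autoit
; Average |a - b| over every pair of hex digit values, then scale by the
; maximum possible gap (15), exactly as _datacompare does for base 16.
Local $sum = 0
For $a = 0 To 15
    For $b = 0 To 15
        $sum += Abs($a - $b)
    Next
Next
Local $mean = $sum / (16 * 16) ; mean gap between two random hex digits
ConsoleWrite("Mean gap: " & $mean & @CRLF)
ConsoleWrite("Expected score: " & Round($mean / 15 * 100, 1) & "%" & @CRLF)
```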

Functions:

#cs
Function _datacompare($data1, $data2, [$decimal])
    -$data1: A string of data, any length.
    -$data2: A string of data, must be the same length as $data1
    -$decimal: Highest digit value of the base: 1=Binary, 9=Base-10, 15=Base-16, 31=Base-32

    Note: If you just want to compare two sets of binary data
    you probably want to use base-16. Unless you are sure your
    binary is in 1's and 0's.

    Returns: A float containing the percentage of difference.
#ce
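Func _datacompare($data1, $data2, $decimal = 15)
    Local $difference = 0
    $data1 = StringSplit($data1, "")
    $data2 = StringSplit($data2, "")
    ; Both inputs must be the same length, otherwise $data2[$i] would go out of bounds.
    If $data1[0] <> $data2[0] Then Return SetError(1, 0, -1)
    For $i = 1 To $data1[0]
        ; Only convert characters that actually differ; equal ones add nothing.
        If $data1[$i] <> $data2[$i] Then
            $difference += Abs(_tonum($data1[$i]) - _tonum($data2[$i]))
        EndIf
    Next
    ; Average per-character distance, scaled by the largest possible distance.
    Return (($difference / $data1[0]) / $decimal) * 100
EndFunc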

#cs
Function _tonum()
    -$info: A single digit or character.

    Returns: A 0-based value.
#ce
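Func _tonum($info)
    ; Digits 1-9 convert directly.
    If $info + 0 > 0 Then Return Number($info)
    ; '0' and letters are mapped by character code, so 'a' -> 10, 'b' -> 11, ...
    Local $return = Asc(StringLower($info)) - 87
    If $return = -39 Then Return 0 ; Asc('0') - 87 = -39
    Return $return
EndFunc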

Comments, questions, criticisms, improvements? Post them!

This is pretty slow right now, about 60 seconds on a 1000x1000 image.

I would really like to get it working faster for more practical uses. Any help is much appreciated!
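One possible direction for speeding this up, offered as an untimed sketch rather than a measured fix: most of the per-character cost is likely the `_tonum()` call and the `StringSplit()` arrays, so the variant below converts each string to character codes once with `StringToASCIIArray()` and replaces the function call with a lookup table. `_datacompare_fast` is a hypothetical name, and the sketch assumes the input is plain ASCII (digits and letters only); any other character is treated as 0, which differs from `_tonum()`.

```autoit
; Hypothetical faster variant of _datacompare (assumes ASCII input).
Func _datacompare_fast($data1, $data2, $decimal = 15)
    ; Convert each string to a 0-based array of character codes in one call.
    Local $a1 = StringToASCIIArray(StringLower($data1))
    Local $a2 = StringToASCIIArray(StringLower($data2))
    Local $n = UBound($a1)
    ; Empty input or mismatched lengths: flag an error instead of crashing.
    If $n = 0 Or $n <> UBound($a2) Then Return SetError(1, 0, -1)

    ; Lookup table: character code -> 0-based digit value.
    Local $map[128]
    For $c = 0 To 127
        $map[$c] = 0
    Next
    For $c = 48 To 57 ; '0'-'9' -> 0-9
        $map[$c] = $c - 48
    Next
    For $c = 97 To 122 ; 'a'-'z' -> 10-35
        $map[$c] = $c - 87
    Next

    ; Sum the per-character distances without any function calls in the loop.
    Local $difference = 0
    For $i = 0 To $n - 1
        $difference += Abs($map[$a1[$i]] - $map[$a2[$i]])
    Next
    Return ($difference / $n) / $decimal * 100
EndFunc
```

It takes the same arguments as the original, so `_datacompare_fast($image1, $image2)` can be dropped into the example above to compare timings.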

Edited by nullschritt

nullschritt said: "This is pretty fast, it only takes about 0.2 seconds to compare two 1000x1000 pixel images, which is around 15.6 MB of data."

This example simulates around a 45 x 45 pixel image and took 0.1 seconds.

A 1000 x 1000 (1,000,000) pixel simulation took 89 seconds.

AutoIt Absolute Beginners    Require a serial    Pause Script    Video Tutorials by Morthawt   ipify 

Monkey's are, like, natures humans.


Not my code, if memory serves this was from monoceres from MANY years back.

#include <GDIPlus.au3>

$Pic1 = @DesktopDir&"\untitled.jpg"
$Pic2 = @DesktopDir&"\untitled2.jpg"

$StartTimer = TimerInit()
$Check = _Check($Pic1, $Pic2)
$EndTimer = TimerDiff($StartTimer)
ConsoleWrite("images are the same -> " & $Check & ", time taken (ms) " & $EndTimer & @CR)

Func _Check($fname1, $fname2)
    _GDIPlus_Startup()
    $bm1 = _GDIPlus_ImageLoadFromFile($fname1)
    $bm2 = _GDIPlus_ImageLoadFromFile($fname2)

    If _CompareBitmaps($bm1, $bm2) = 1 Then
        $Same = True ; the images being compared are the same
    Else
        $Same = False ; the images being compared are different
    EndIf
    _GDIPlus_ImageDispose($bm1)
    _GDIPlus_ImageDispose($bm2)
    _GDIPlus_Shutdown()
    Return $Same
EndFunc   ;==>_Check

Func _CompareBitmaps($bm1, $bm2)
    $Bm1W = _GDIPlus_ImageGetWidth($bm1)
    $Bm1H = _GDIPlus_ImageGetHeight($bm1)
    $BitmapData1 = _GDIPlus_BitmapLockBits($bm1, 0, 0, $Bm1W, $Bm1H, $GDIP_ILMREAD, $GDIP_PXF32RGB)
    $Stride = DllStructGetData($BitmapData1, "Stride")
    $Scan0 = DllStructGetData($BitmapData1, "Scan0")

    $ptr1 = $Scan0
    ; The last pixel starts at (H - 1) * Stride + (W - 1) * 4 and is 4 bytes long,
    ; so the locked region spans (H - 1) * Stride + W * 4 bytes.
    $size1 = ($Bm1H - 1) * $Stride + $Bm1W * 4

    $Bm2W = _GDIPlus_ImageGetWidth($bm2)
    $Bm2H = _GDIPlus_ImageGetHeight($bm2)
    $BitmapData2 = _GDIPlus_BitmapLockBits($bm2, 0, 0, $Bm2W, $Bm2H, $GDIP_ILMREAD, $GDIP_PXF32RGB)
    $Stride = DllStructGetData($BitmapData2, "Stride")
    $Scan0 = DllStructGetData($BitmapData2, "Scan0")

    $ptr2 = $Scan0
    $size2 = ($Bm2H - 1) * $Stride + $Bm2W * 4

    $smallest = $size1
    If $size2 < $smallest Then $smallest = $size2
    $call = DllCall("msvcrt.dll", "int:cdecl", "memcmp", "ptr", $ptr1, "ptr", $ptr2, "int", $smallest)

    _GDIPlus_BitmapUnlockBits($bm1, $BitmapData1)
    _GDIPlus_BitmapUnlockBits($bm2, $BitmapData2)

    Return ($call[0] = 0)
EndFunc   ;==>_CompareBitmaps

For comparison, I took 2 screenshots of my desktop at 3360x1080 resolution. I ran the test with both images the same: the result was True, in 98.1 milliseconds. I then made a slight change to the original image and ran it again: the result was False, in 88.8 milliseconds.

Ian

Edited by llewxam

My projects:

  • IP Scanner - Multi-threaded ping tool to scan your available networks for used and available IP addresses, shows ping times, resolves IPs in to host names, and allows individual IPs to be pinged.
  • INFSniff - Great technicians tool - a tool which scans DriverPacks archives for INF files and parses out the HWIDs to a database file, and rapidly scans the local machine's HWIDs, searches the database for matches, and installs them.
  • PPK3 (Persistent Process Killer V3) - Another for the techs - suppress running processes that you need to keep away, helpful when fighting spyware/viruses.
  • Sync Tool - Folder sync tool with lots of real time information and several checking methods.
  • USMT Front End - Front End for Microsoft's User State Migration Tool, including all files needed for USMT 3.01 and 4.01, 32 bit and 64 bit versions.
  • Audit Tool - Computer audit tool to gather vital hardware, Windows, and Office information for IT managers and field techs. Capabilities include creating a customized site agent.
  • CSV Viewer - Displays CSV files with automatic column sizing and font selection. Lines can also be copied to the clipboard for data extraction.
  • MyDirStat - Lists number and size of files on a drive or specified path, allows for deletion within the app.
  • 2048 Game - My version of 2048, fun tile game.
  • Juice Lab - Ecigarette liquid making calculator.
  • Data Protector - Secure notes to save sensitive information.
  • VHD Footer - Add a footer to a forensic hard drive image to allow it to be mounted or used as a virtual machine hard drive.
  • Find in File - Searches files containing a specified phrase.

@llewxam, your post is a bit off topic here. While it has to do with comparing two images, it has nothing to do with getting their relative difference as a percentage (for example, to check lighting effects and such).

I'm looking for anyone who could point out a way to make my code faster. It doesn't seem bloated to me, so I'm not sure what would really speed it up.

PS: A thought: if the image was sized down to 100x100 pixels, couldn't it still be accurately compared? My thinking is that everything is preserved in proportion, so the same visual differences should be equally measurable in the smaller images. Correct me if I am wrong. (That still doesn't mean I wouldn't love any advice on making the algorithm faster anyway.)
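The downscaling idea could be tried along these lines. This is only a sketch under several assumptions: `_ImageToHex` is a hypothetical helper, it relies on `_GDIPlus_ImageResize` and `_GDIPlus_BitmapGetPixel` from the GDIPlus UDF shipped with recent AutoIt releases, the file names are borrowed from llewxam's post, and `_datacompare` and `_tonum` from the first post must be in the same script. Resampling averages neighbouring pixels, so the score will be close to, but not identical to, the full-size result.

```autoit
#include <GDIPlus.au3>

; Hypothetical helper: load an image, shrink it to $iSize x $iSize, and
; return its pixels as one RRGGBB hex string suitable for _datacompare.
Func _ImageToHex($sFile, $iSize = 100)
    Local $hImage = _GDIPlus_ImageLoadFromFile($sFile)
    If @error Then Return SetError(1, 0, "")
    Local $hSmall = _GDIPlus_ImageResize($hImage, $iSize, $iSize)
    Local $sHex = ""
    For $y = 0 To $iSize - 1
        For $x = 0 To $iSize - 1
            ; GetPixel returns ARGB; BitAND drops alpha, Hex(..., 6) gives RRGGBB.
            $sHex &= Hex(BitAND(_GDIPlus_BitmapGetPixel($hSmall, $x, $y), 0xFFFFFF), 6)
        Next
    Next
    _GDIPlus_ImageDispose($hSmall)
    _GDIPlus_ImageDispose($hImage)
    Return $sHex
EndFunc

_GDIPlus_Startup()
Local $hex1 = _ImageToHex(@DesktopDir & "\untitled.jpg")
Local $hex2 = _ImageToHex(@DesktopDir & "\untitled2.jpg")
If $hex1 <> "" And $hex2 <> "" Then
    ConsoleWrite(_datacompare($hex1, $hex2) & "% different" & @CRLF)
EndIf
_GDIPlus_Shutdown()
```

At 100x100 the comparison only has to walk 60,000 hex digits instead of 6,000,000, which is where the time saving would come from.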

Edited by nullschritt

LOL, sorry, saw the part about comparing images and my brain leapt in that direction!  :)

Ian
