John117 Posted May 18, 2009

Hey, I have about 5k pictures in one folder. About 2/3 of them are duplicates with different names. Picasa is now exhausted. I need a way to identify duplicates and then move them.

Here is what I am thinking: match based on dimensions, DPI, and size. They have different names, but may be the same picture. I was also considering a pixel match based on 4 points (like a square) in from the corners. So, maybe 10% in, then 20% in, and so forth. Store these, then compare to the next...

Any better ideas? Will post examples when complete, with credit for anyone that helps! Could make for a nice piece of software!

Edit: eye can spell!

Edited May 18, 2009 by John117
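The corner-sampling idea above could be sketched roughly as follows. This is only an illustration in Python, not anything posted in the thread: the names `sample_points` and `signature` are hypothetical, and the image is assumed to be already decoded into a grid of RGB tuples (decoding a JPG would need an imaging library, which is not shown here).

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def sample_points(width: int, height: int, inset: float) -> List[Tuple[int, int]]:
    """Return four (x, y) points, one near each corner, moved inward by `inset`.

    `inset` is a fraction of the image size, e.g. 0.1 for "10% in".
    """
    if not 0.0 <= inset < 0.5:
        raise ValueError("inset must be in [0, 0.5)")
    left = int((width - 1) * inset)
    top = int((height - 1) * inset)
    right = (width - 1) - left
    bottom = (height - 1) - top
    return [(left, top), (right, top), (left, bottom), (right, bottom)]


def signature(pixels: Sequence[Sequence[Pixel]],
              insets: Sequence[float] = (0.1, 0.2, 0.3, 0.4)) -> tuple:
    """Build a comparable signature: dimensions plus the sampled pixel values.

    `pixels` is assumed to be a list of rows, each row a list of RGB tuples.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    if width == 0 or height == 0:
        return (width, height, ())
    samples = []
    for inset in insets:
        for x, y in sample_points(width, height, inset):
            samples.append(pixels[y][x])
    return (width, height, tuple(samples))
```

Two pictures would then be treated as candidate duplicates when their signatures are equal. One caveat: a JPG that has been re-saved can differ slightly in pixel values, so an exact comparison like this may miss some copies; comparing the samples with a small tolerance would be one way around that.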
AdmiralAlkex Posted May 18, 2009

What format are the pictures in? Jpg, Bmp or...?
John117 Posted May 18, 2009

Quoting AdmiralAlkex: "What format are the pictures in? Jpg, Bmp or...?"

I would say 95% are JPG.
John117 Posted May 18, 2009

Quote: "How about using an MD5 hash or CRC? Identical files will have the same MD5/CRC value. http://www.autoitscript.com/forum/index.php?showtopic=76976"

Yeah, I was thinking about that, using CRC32. I downloaded Duplicate Finder, but it found only 18 dupes. I can spot that many in the first 30-40 files alone. Maybe a limitation of the trial? In either case, looking up the post now...
LurchMan Posted May 18, 2009

If I remember right, if you change the filename then the MD5 / SHA1 hash will be different... been a while since I did this in college.
John117 Posted May 18, 2009

Quoting LurchMan: "If I remember right, if you change the filename then the MD5 / SHA1 hash will be different... been a while since I did this in college."

Oddly enough, I found that some of the files have different names and are 1-5 KB different in size. Not sure how; much copy-paste, I guess...
nitekram Posted May 18, 2009

Quoting John117: "Oddly enough, I found that some of the files have different names and are 1-5 KB different in size. Not sure how; much copy-paste, I guess..."

Does the time stamp do anything for you?
KaFu Posted May 18, 2009

For truly identical files, give my program "SMF" a try... If the files are not identical but similar, the only program I know to be capable of identifying those is d'peg (http://www.gotdupes.com/index.cfm?page=3495&pagename=d%60peg!). I guess you could perform something similar by calculating color checksums with the ImageMagick suite.

Edited May 18, 2009 by KaFu
LurchMan Posted May 18, 2009

Quote: "No, the file name has no effect on the hashes. As long as the file type and file content are the same, they will match regardless of the name."

Couldn't remember for sure... I went to college for computer forensics but ended up doing programming once I got a job, lol.
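The point in the quoted reply, that a hash depends only on a file's bytes and not on its name, can be checked with a small sketch. This is an illustration in Python using the standard `hashlib` module, not code from the thread; the helper names and the file names are made up for the demo.

```python
import hashlib
import os
import tempfile


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the MD5 hex digest of a file's contents, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo_same_content_different_names() -> bool:
    """Write the same bytes under two different names and compare the hashes."""
    with tempfile.TemporaryDirectory() as folder:
        first = os.path.join(folder, "holiday.jpg")            # hypothetical name
        second = os.path.join(folder, "copy_of_holiday.jpg")   # hypothetical name
        for path in (first, second):
            with open(path, "wb") as handle:
                handle.write(b"same bytes in both files")
        return md5_of_file(first) == md5_of_file(second)
```

The file name never enters the digest because only the bytes read from the file are fed to `update`. The flip side is that any change in content, even a few bytes of metadata, produces a different hash.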
John117 Posted May 18, 2009

Quoting nitekram: "Does the time stamp do anything for you?"

Not sure, which timestamp? The created date is different, but the modified date is the same. Sometimes it was modified before it was created :-)
John117 Posted May 18, 2009

Am currently running Dupdetector - it may do the trick! :-) Cleaned about a thousand so far... we will see.
junkew Posted May 19, 2009

With the logic in http://www.autoitscript.com/forum/index.php?showtopic=66545 you could write a picture comparison in AutoIt.
AutoBert Posted March 23, 2016

Hash (_Crypt_HashFile) all files and compare the hashes.

Edited March 23, 2016 by AutoBert
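A rough sketch of the "hash everything and compare" approach, including the move step the original question asked for. It is written in Python with `hashlib` standing in for AutoIt's `_Crypt_HashFile`, so treat it as an illustration under that assumption; the function names `find_duplicates` and `move_duplicates` are hypothetical. Grouping by file size first is an added shortcut, since files of different sizes cannot be identical.

```python
import hashlib
import os
import shutil
from collections import defaultdict
from typing import List


def file_digest(path: str) -> str:
    """Return the SHA-1 hex digest of a file's contents, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(folder: str) -> List[List[str]]:
    """Return groups of byte-identical files found directly inside `folder`."""
    # Cheap first pass: only files of equal size can be identical.
    by_size = defaultdict(list)
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        # Second pass: hash only the files that share a size.
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(group for group in by_hash.values() if len(group) > 1)
    return groups


def move_duplicates(folder: str, target: str) -> List[str]:
    """Keep the first file of each group and move the rest into `target`."""
    os.makedirs(target, exist_ok=True)
    moved = []
    for group in find_duplicates(folder):
        for path in group[1:]:
            name = os.path.basename(path)
            shutil.move(path, os.path.join(target, name))
            moved.append(name)
    return moved
```

This only catches byte-identical files. The copies mentioned earlier in the thread that differ by 1-5 KB would not match, which is where a similarity tool or a pixel-based comparison comes in.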
InunoTaishou Posted March 23, 2016

Hopefully OP comes back to look at his 7 year old topic to see the solution.