JohnOne Posted December 16, 2011 (edited)

I'm trying to think of a way to solve a coding issue I have been presented with. I have not started to code yet, as I cannot even dream up the logic. Here's the scenario.

There is a folder filled with many thousands of files. There are only that many because something (I don't know what) went wrong with a friend's software and the files became corrupted.

Good files look like this:

123_321267_87612_982223_7881

They are always numbers separated by an underscore and are .lmi files (I don't think that matters). But many files have been added and muddled. Here's an example of how the file 123_321 has been damaged. There will be files like so:

123_946
887_321
456_321
123_998

As you can see, each file has either a 123 before the underscore or a 321 after it, and the true file that is needed is 123_321.lmi.

I need to add that the numbers are of varying length, 6-11 digits. There are hundreds of files like this, all in the one folder. For instance, another good file might be

33333_44444

and its children

33333_65788
46588_44444
88888_44444
33333_11112

I cannot get my head around even some ideas. What I am asking for is some (as the title says) pseudo logic to identify the true files.

Hopefully
J1

Edited December 16, 2011 by JohnOne
Melba23 (Moderators) Posted December 16, 2011

JohnOne,

All the files have very similar filenames. How do you identify the "good" files? How do you tell that "123_321" is an original and "123_946" a corrupt copy? Do you have to do that manually, or is there some other marker? If you can somehow mark the "good" names then it should not be too difficult to devise a logic to separate the "spawn" names (he said with unwarranted confidence).

M23
jaberwacky Posted December 16, 2011 (edited)

Sort the files by date created? Corrupt files will be newer than non-corrupt. That's my first stab.

Edited December 16, 2011 by LaCastiglione
rcmaehl Posted December 16, 2011

Was this caused by the disk being ejected while chkdsk was running, or by the PC being shut down during chkdsk?
JohnOne (Author) Posted December 16, 2011

Melba23 said: "All the files have very similar filenames - how do you identify the "good" files? How do you tell that "123_321" is an original and "123_946" a corrupt copy? Do you have to do that manually or is there some other marker? If you can somehow mark the "good" names then it should not be too difficult to devise a logic to separate the "spawn" names (he said with unwarranted confidence)."

That's the crux of it: the good filenames are not known. I only know that there will be more than one file with the same number before the _ and more than one with the same number after it. Then there will be only one file with both of those numbers in it.

I've been thinking about this for too long and gone blank.

PS. Look at you, all MODified. Well in, mucker.
trancexx Posted December 16, 2011

JohnOne said: "PS. look at you all MODified. Well in mucker."

Well, I be... Imagine that.
Melba23 (Moderators) Posted December 16, 2011 (edited)

JohnOne,

JohnOne said: "Only that there will be more than one with the same number before and more than one with the same number after the _ Then there will only be one with both of those numbers in it"

I was afraid you were going to say that. I will go and have a think about it for a while. And thanks for the P.S.

M23

Edit: trancexx, stop acting surprised....

Edited December 16, 2011 by Melba23
rcmaehl Posted December 16, 2011 (edited)

Have a script detect the file type using the file header, attempt to open the file with the correct program, and then check whether the file opened successfully. I have a script for occasions like this, but it's coded for Linux.

EDIT: Are you sure the HomePortal program didn't just do something to them? Isn't there an option in HomePortal to fix it?

Source: http://filext.com/file-extension/LMI

Edited December 16, 2011 by rcmaehl
JohnOne (Author) Posted December 16, 2011 (edited)

@LaCastiglione & rcmaehl

I don't know for certain that they are corrupt, or how they became this way, only that they are. Thanks.

EDIT: The only thing I know for certain is that if a number occurs more than once on the left side of the underscore, then it is the left-side number of one of the true files.

EDIT2: Going by the ones I found manually, I'm thinking that if a number occurs more than once on the left side, then the right-side number of one of those files will also occur more than once on the right side. That file would be a true filename.

Edited December 16, 2011 by JohnOne
Mellon Posted December 16, 2011

Did the corruption happen on the same day/run? Do the known corrupt files have a time/date that can be associated with the good files (i.e. were the files created a minute before or after the good one)? Can you compare modified or accessed dates? Do the corrupt files have a common size?
JohnOne (Author) Posted December 16, 2011

All dates and sizes are the same.
kaotkbliss Posted December 16, 2011 (edited)

Perhaps you can put all the first parts of the names into one array and all the second parts into another array.

Search the first array for strings of numbers that occur more than once
Search the second array for strings of numbers that occur more than once
Take the first result from array 1 and combine it with the first result from array 2
Search your folder for the new filename
Move the file to a new location
Loop

I'll see about writing up some code to express my thought more clearly.

**edit** I guess something like this (just a quick and dirty to clarify, I hope)

    #include <Array.au3>
    #include <File.au3>

    ; Index 0 of each array is an unused placeholder; real data starts at 1
    Dim $array1[1] ; first part of every filename
    Dim $array2[1] ; second part of every filename
    Dim $array3[1] ; first parts that occur more than once
    Dim $array4[1] ; second parts that occur more than once

    $list = _FileListToArray("somepath", "*.lmi", 1) ; 1 = files only
    If @error Then Exit

    ; Split each name (minus the ".lmi" extension) at the underscore
    For $i = 1 To UBound($list) - 1
        $split = StringSplit(StringTrimRight($list[$i], 4), "_")
        If $split[0] <> 2 Then ContinueLoop ; skip names that are not number_number
        _ArrayAdd($array1, $split[1])
        _ArrayAdd($array2, $split[2])
    Next

    ; Collect first parts that occur more than once
    ; (start at $i + 1 so an entry is never compared with itself)
    For $i = 1 To UBound($array1) - 1
        For $i2 = $i + 1 To UBound($array1) - 1
            If $array1[$i] == $array1[$i2] Then
                _ArrayAdd($array3, $array1[$i2])
            EndIf
        Next
    Next

    ; Collect second parts that occur more than once
    For $i = 1 To UBound($array2) - 1
        For $i2 = $i + 1 To UBound($array2) - 1
            If $array2[$i] == $array2[$i2] Then
                _ArrayAdd($array4, $array2[$i2])
            EndIf
        Next
    Next

    ; Try every repeated first part with every repeated second part;
    ; a combination that exists as a file is moved to the new location
    For $i = 1 To UBound($array3) - 1
        For $i2 = 1 To UBound($array4) - 1
            $name = $array3[$i] & "_" & $array4[$i2] & ".lmi"
            If FileExists("somepath\" & $name) Then
                FileMove("somepath\" & $name, "somenewpath\" & $name, 8) ; 8 = create destination folder if needed
            EndIf
        Next
    Next

Edited December 16, 2011 by kaotkbliss
kylomas Posted December 16, 2011

J1,

Is each node in a good file unique to other good files? Like,

999_444 = good file
143_444 = bad file
999_547 = bad file
847_444: is it possible for this to be a good file?

kylomas
JohnOne (Author) Posted December 16, 2011

Cheers for the code, kaotkbliss. I will take a look post haste.

kylomas, no. If 999_444 was a good file then no other file that ends in 444 is good. I suppose it is true that all filenames are unique, including the good files.

Thank you kindly for the input.
iamtheky Posted December 16, 2011 (edited)

I suppose mine is deleting where kaotkbliss is adding, but it's the same thought.

    #include <Array.au3>

    Global $Array[5]
    Global $column1[5]
    Global $column2[5]

    ; Sample names: one good file (33333_44444), its children and an unrelated file
    $Array[0] = "332333_14631"
    $Array[1] = "88888_44444"
    $Array[2] = "46588_44444"
    $Array[3] = "33333_65788"
    $Array[4] = "33333_11112"

    ; Split every name into its left and right number
    For $i = 0 To UBound($Array) - 1
        $Temp = StringSplit($Array[$i], "_")
        $column1[$i] = $Temp[1]
        $column2[$i] = $Temp[2]
    Next

    Local $FOUND = 0

    ; Delete every left number that occurs only once (walk backwards so deletions are safe)
    For $k = UBound($column1) - 1 To 0 Step -1
        For $i = UBound($column1) - 1 To 0 Step -1
            If $column1[$k] = $column1[$i] And $i <> $k Then $FOUND = 1
        Next
        If $FOUND = 1 Then
            $FOUND = 0
        Else
            _ArrayDelete($column1, $k)
        EndIf
    Next

    ; Same again for the right numbers
    For $k = UBound($column2) - 1 To 0 Step -1
        For $i = UBound($column2) - 1 To 0 Step -1
            If $column2[$k] = $column2[$i] And $i <> $k Then $FOUND = 1
        Next
        If $FOUND = 1 Then
            $FOUND = 0
        Else
            _ArrayDelete($column2, $k)
        EndIf
    Next

    ; What survives is the repeated left number and the repeated right number.
    ; Taking index 0 only works while the sample holds a single family of files.
    $Answer = $column1[0] & "_" & $column2[0]
    MsgBox(0, '', $Answer)

Edited December 16, 2011 by boththose
kylomas Posted December 16, 2011 (edited)

J1,

Have you listed all the suspect file names, sorted them and looked through the names?

kylomas

Edit: additional info. J1, if the filenames are truly unique then what boththose proposes is where I was going. However,

"I suppose it is True that all filenames are unique, including the good files."

does not sound very certain...

Good luck!

Edited December 16, 2011 by kylomas
JohnOne (Author) Posted December 16, 2011 (edited)

I can detect them easily by hand.

If I see a file 123_456 and another file 123_789, I know the first part of a good file is 123_
Next, if I see a file 567_456, I know the good file is 123_456

That is how it has panned out for the manual/visual search.

Edited December 16, 2011 by JohnOne
kylomas Posted December 16, 2011

J1,

Then this is true: if there is more than one file with the first node and more than one file with the second node, then the filename containing both nodes is the good file...

I believe that boththose has the solution for that...

kylomas
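The rule stated above can be sketched as two passes over the names: count how often each number appears on each side, then keep the names whose left and right numbers both appear more than once. This is only a sketch under assumptions. The sample names are the ones given earlier in the thread, the helper `_Bump` and all variable names are made up for illustration, and using the Windows `Scripting.Dictionary` COM object for the counting is my choice, not something proposed by the posters.

```autoit
; Sample names from the examples in this thread (extension omitted)
Local $aNames[9] = ["123_321", "123_946", "887_321", "456_321", "123_998", _
        "33333_44444", "33333_65788", "46588_44444", "88888_44444"]

; One counter per side of the underscore (assumption: Windows COM is available)
Local $oLeft = ObjCreate("Scripting.Dictionary")
Local $oRight = ObjCreate("Scripting.Dictionary")

; Pass 1: count how often each number appears on the left and on the right
For $i = 0 To UBound($aNames) - 1
    Local $aSplit = StringSplit($aNames[$i], "_")
    If $aSplit[0] <> 2 Then ContinueLoop ; skip names that are not number_number
    _Bump($oLeft, $aSplit[1])
    _Bump($oRight, $aSplit[2])
Next

; Pass 2: a name is a candidate good file when both of its numbers are repeated
Local $sGood = ""
For $i = 0 To UBound($aNames) - 1
    $aSplit = StringSplit($aNames[$i], "_")
    If $aSplit[0] <> 2 Then ContinueLoop
    If $oLeft.Item($aSplit[1]) > 1 And $oRight.Item($aSplit[2]) > 1 Then
        $sGood &= $aNames[$i] & @CRLF
    EndIf
Next
ConsoleWrite($sGood)

; Hypothetical helper: add 1 to the count stored under $sKey
Func _Bump($oDict, $sKey)
    If $oDict.Exists($sKey) Then
        $oDict.Item($sKey) = $oDict.Item($sKey) + 1
    Else
        $oDict.Add($sKey, 1)
    EndIf
EndFunc
```

For these sample names the rule picks out 123_321 and 33333_44444. It would also flag a damaged file whose two numbers both happen to be repeated elsewhere, so the result is best treated as a list of candidates to check rather than a final answer.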
JohnOne (Author) Posted December 16, 2011

I'll have a try. I'm still waiting for kaotkbliss's code to finish; I guess the _Array* functions slow things up. I have a selection of 500 files in the test folder.

Thanks for the code, boththose.
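On the speed remark: one likely factor (an assumption, nothing was measured in this thread) is that `_ArrayAdd` resizes the array on every call. A minimal sketch that avoids this by sizing a single 2D array once, from the count that `_FileListToArray` returns in element [0]. The folder path and the variable names are hypothetical.

```autoit
#include <File.au3>

; Hypothetical test folder; point this at the real one
Local $sFolder = @ScriptDir & "\test"

; Flag 1 = files only; element [0] holds the number of names returned
Local $aFiles = _FileListToArray($sFolder, "*.lmi", 1)
If @error Then Exit MsgBox(16, "Error", "No .lmi files found in " & $sFolder)

; Allocate once: column 0 = number before the underscore, column 1 = number after it
Local $iCount = $aFiles[0]
Local $aParts[$iCount][2]
Local $iUsed = 0 ; rows actually filled

For $i = 1 To $iCount
    ; Drop the 4-character ".lmi" extension, then split at the underscore
    Local $aSplit = StringSplit(StringTrimRight($aFiles[$i], 4), "_")
    If $aSplit[0] <> 2 Then ContinueLoop ; skip names that are not number_number
    $aParts[$iUsed][0] = $aSplit[1]
    $aParts[$iUsed][1] = $aSplit[2]
    $iUsed += 1
Next

ConsoleWrite("Parsed " & $iUsed & " of " & $iCount & " names" & @CRLF)
```

Only the first $iUsed rows of $aParts hold data, so any later loop over the array should stop at $iUsed - 1 rather than at UBound($aParts) - 1.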
JohnOne (Author) Posted December 16, 2011

See, now I just could not get my head into this. I tried tons of similar code, but nothing was turning out as expected.

Mucho thank you all, I definitely have enough to implement this now.