JohnOne

pseudo logic suggestions (Solved)

44 posts in this topic

#1 ·  Posted (edited)

I'm trying to think of a way to solve a coding issue I have been presented with.

I have not started to code yet, as I cannot even dream up the logic.

Here's the scenario...

There is a folder filled with many thousands of files, but there are only that many because something (I don't know what) went wrong with a friend's software and the files became corrupted.

Good files look like this.

123_321

267_876

12_98

2223_7881

They are always numbers separated by an underscore and are .lmi files (I don't think that matters).

But many files have been added and muddled, here's an example of how the file 123_321 has been damaged.

There will be files like so...

123_946

887_321

456_321

123_998

As you can see, each file has either 123 before the underscore or 321 after it, and the true file that is needed is 123_321.lmi.

I should add that the numbers vary in length from 6 to 11 digits.

There are hundreds of files like this, all in the one folder. For instance, another good file might be

33333_44444

and its children

33333_65788

46588_44444

88888_44444

33333_11112

I cannot get my head around even some ideas.

What I am asking for is some (as the title says) pseudo logic to identify the true files.

Hopefully

J1

Edited by JohnOne

AutoIt Absolute Beginners    Require a serial    Pause Script    Video Tutorials by Morthawt   ipify 

Monkey's are, like, natures humans.




JohnOne,

All the files have very similar filenames - how do you identify the "good" files? How do you tell that "123_321" is an original and "123_946" a corrupt copy? Do you have to do that manually or is there some other marker? :)

If you can somehow mark the "good" names then it should not be too difficult to devise a logic to separate the "spawn" names (he said with unwarranted confidence ;)).

M23


Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind._______My UDFs:

Spoiler

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 


Was this caused by the disk being ejected while chkdsk was running, or the PC being shut down during chkdsk?


My UDFs are generally for me. If they aren't updated for a while, it means I'm not using them myself. As soon as I start using them again, they'll get updated.

MY PROJECTS


Active: IRC UDF, WindowEx UDF
Discontinued: GithubBubbleSort UDF


Melba23 wrote:

"All the files have very similar filenames - how do you identify the "good" files? How do you tell that "123_321" is an original and "123_946" a corrupt copy? Do you have to do that manually or is there some other marker?"

That's the crux of it, the good filenames are not known.

Only that there will be more than one with the same number before and more than one with the same number after the _

Then there will only be one with both of those numbers in it.

I've been thinking about this for too long and gone blank.

PS. look at you all MODified

Well in mucker.



JohnOne wrote: "PS. look at you all MODified. Well in mucker."

Well, I be...

Imagine that.


♡♡♡

.

eMyvnE


#7 ·  Posted (edited)

JohnOne,

"Only that there will be more than one with the same number before and more than one with the same number after the _

Then there will only be one with both of those numbers in it"

I was afraid you were going to say that. :D

I will go and have a think about it for a while. :)

And thanks for the P.S. :)

M23

Edit:

trancexx,

Stop acting surprised.... ;)

Edited by Melba23


#8 ·  Posted (edited)

Have a script detect the file type using the file header, attempt to open the file with the correct program, and then check whether the file opened successfully. I have a script for occasions like this, but it's coded for Linux.

EDIT: Are you sure the HomePortal program didn't just do something to them? Isn't there an option in HomePortal to fix it? Source: http://filext.com/file-extension/LMI
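
As a rough illustration of the header idea (this is not the Linux script mentioned above), the first few bytes of every file could be dumped side by side so that the odd ones stand out. A minimal AutoIt sketch, assuming a hypothetical folder C:\lmi_files\ and an arbitrary header length of 8 bytes. The real .lmi header layout is not known from this thread, and _ReadHeader is a made-up helper name:

```autoit
#include <File.au3>
#include <FileConstants.au3>

; Hypothetical folder, not from the thread; adjust to the real location.
Local $sFolder = "C:\lmi_files\"

Local $aFiles = _FileListToArray($sFolder, "*.lmi", 1) ; 1 = return files only
If Not @error Then
    For $i = 1 To $aFiles[0]
        ; Print "name <TAB> first 8 bytes in hex" for a visual comparison.
        ConsoleWrite($aFiles[$i] & @TAB & _ReadHeader($sFolder & $aFiles[$i], 8) & @CRLF)
    Next
EndIf

; Hypothetical helper: return the first $iBytes bytes of a file as a hex string ("0x...").
Func _ReadHeader($sPath, $iBytes)
    Local $hFile = FileOpen($sPath, $FO_READ + $FO_BINARY)
    If $hFile = -1 Then Return "unreadable"
    Local $dHeader = FileRead($hFile, $iBytes)
    FileClose($hFile)
    Return String($dHeader)
EndFunc
```

If the damaged copies really differ in content, their header strings may differ from the majority. If every file shows the same header, this check alone will not separate good from bad.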

Edited by rcmaehl


#9 ·  Posted (edited)

@ LaCastglione & rcmaehl

I don't know for certain that they are corrupt, or how they became this way, only that they are.

Thanks.

EDIT:

The only thing I know for certain is that if a left-side name (number) occurs more than once on the left, then it will be the left side of one of the true files.

EDIT2:

I'm thinking (from those I found manually) that if a number occurs more than once on the left side, then one of the numbers to the right of one of those files will also occur more than once.

That would be a true filename.

Edited by JohnOne


Did the corruption happen on the same day/run?

Do the known corrupt files have a time/date that can be associated with the good files (i.e. were the files created a minute before or after the good one)? Can you compare modified or accessed dates?

Do the corrupt files have a common size?
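
To answer those questions without clicking through thousands of files, the size and timestamps could be listed in one go. A minimal sketch, assuming the same hypothetical folder C:\lmi_files\; _DescribeFile is a made-up helper name, and whether the values actually differ between good and bad files is unknown:

```autoit
#include <File.au3>

; Hypothetical folder, not from the thread; adjust to the real location.
Local $sFolder = "C:\lmi_files\"

Local $aFiles = _FileListToArray($sFolder, "*.lmi", 1) ; 1 = return files only
If Not @error Then
    For $i = 1 To $aFiles[0]
        ConsoleWrite($aFiles[$i] & @TAB & _DescribeFile($sFolder & $aFiles[$i]) & @CRLF)
    Next
EndIf

; Hypothetical helper: "size in bytes <TAB> modified <TAB> created",
; with both times as YYYYMMDDHHMMSS strings.
Func _DescribeFile($sPath)
    Local $sModified = FileGetTime($sPath, 0, 1) ; 0 = last modified, 1 = return a string
    Local $sCreated = FileGetTime($sPath, 1, 1)  ; 1 = created
    Return FileGetSize($sPath) & @TAB & $sModified & @TAB & $sCreated
EndFunc
```

The tab-separated output can be pasted into a spreadsheet and sorted by size or time to see whether the corrupt copies cluster together.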


#12 ·  Posted (edited)

Perhaps you can put all the first parts of the names into an array

and all the second parts into another array.

Search the first array for strings of numbers that occur more than once

Search the second array for strings of numbers that occur more than once

Take the first result from array 1 and combine it with the first result from array 2

Search your folder for the new filename

Move the file to a new location

Loop

I'll see about writing up some code to express my thought more clearly

**edit**

I guess something like this (just a quick and dirty to clarify, I hope)

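#include <Array.au3>
#include <File.au3>

; Placeholder paths; both need a trailing backslash so the names join correctly.
Local $sSource = "somepath\"
Local $sTarget = "somenewpath\"

; Element 0 of each array is an unused placeholder, so real data starts at index 1.
Local $array1[1] ; left halves of every filename
Local $array2[1] ; right halves of every filename
Local $array3[1] ; left halves that occur more than once
Local $array4[1] ; right halves that occur more than once
Local $split, $name

Local $list = _FileListToArray($sSource, "*.lmi", 1)
If @error Then Exit ; folder missing or no matching files

For $i = 1 To UBound($list) - 1
    $split = StringSplit(StringTrimRight($list[$i], 4), "_")
    If $split[0] <> 2 Then ContinueLoop ; skip names that are not left_right
    _ArrayAdd($array1, $split[1])
    _ArrayAdd($array2, $split[2])
Next

; Compare each entry only with the entries after it, never with itself.
For $i = 1 To UBound($array1) - 1
    For $i2 = $i + 1 To UBound($array1) - 1
        If $array1[$i] == $array1[$i2] Then _ArrayAdd($array3, $array1[$i])
    Next
Next
For $i = 1 To UBound($array2) - 1
    For $i2 = $i + 1 To UBound($array2) - 1
        If $array2[$i] == $array2[$i2] Then _ArrayAdd($array4, $array2[$i])
    Next
Next

; Any existing file built from a repeated left half and a repeated right half gets moved.
For $i = 1 To UBound($array3) - 1
    For $i2 = 1 To UBound($array4) - 1
        $name = $array3[$i] & "_" & $array4[$i2] & ".lmi"
        If FileExists($sSource & $name) Then FileMove($sSource & $name, $sTarget & $name, 8) ; 8 = create target folder
    Next
Next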
Edited by kaotkbliss

010101000110100001101001011100110010000001101001011100110010000

001101101011110010010000001110011011010010110011100100001

My Android cat and mouse game
https://play.google.com/store/apps/details?id=com.KaosVisions.WhiskersNSqueek

We're gonna need another Timmy!


J1,

Is each node in a good file unique to other good files?

Like,

999_444 = good file

143_444 = bad file

999_547 = bad file

847_444 is it possible for this to be a good file?

kylomas


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill


Cheers for the code, kaotkbliss. I will take a look posthaste.

kylomas,

No, if 999_444 was a good file then no other file that ends in 444 is good.

I suppose it is True that all filenames are unique, including the good files.

Thank you kindly for the input.



#15 ·  Posted (edited)

I suppose mine is deleting where kaotkbliss's is adding, but it's the same thought.

#include <Array.au3>

; Sample names for one damaged family. The result is rebuilt from the repeated
; halves, so 33333_44444 itself does not need to be in the list.
Global $Array[5] = ["332333_14631", "88888_44444", "46588_44444", "33333_65788", "33333_11112"]
Global $column1[5] ; numbers before the underscore
Global $column2[5] ; numbers after the underscore
Global $Temp, $Answer

For $i = 0 To UBound($Array) - 1
    $Temp = StringSplit($Array[$i], "_")
    $column1[$i] = $Temp[1]
    $column2[$i] = $Temp[2]
Next

_KeepRepeated($column1)
_KeepRepeated($column2)

; With a single family, whatever survives in each column forms the true name.
$Answer = $column1[0] & "_" & $column2[0]
MsgBox(0, '', $Answer)

; Delete every entry that has no twin elsewhere in the same column.
Func _KeepRepeated(ByRef $aColumn)
    Local $FOUND
    For $k = UBound($aColumn) - 1 To 0 Step -1
        $FOUND = 0
        For $i = UBound($aColumn) - 1 To 0 Step -1
            If $aColumn[$k] = $aColumn[$i] And $i <> $k Then $FOUND = 1
        Next
        If $FOUND = 0 Then _ArrayDelete($aColumn, $k)
    Next
EndFunc
Edited by boththose

,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)


#16 ·  Posted (edited)

J1,

Have you listed all the suspect file names, sorted them and looked through the names?

kylomas

Edit: additional info - J1 if the filenames are truly unique then what boththose proposes is where I was going. However,

"I suppose it is True that all filenames are unique, including the good files."

Does not sound real certain...

Good Luck !

Edited by kylomas


#17 ·  Posted (edited)

I can easily detect them manually.

If I see a file

123_456

and another file

123_789

I know the first part of a good file is 123_

Next if I see a file

567_456

I know the good file is

123_456

That is how it has panned out for manual/visual search.

Edited by JohnOne


J1,

Then this is true:

If there is more than one file with the first node and more than one file with the second node, then the filename containing both nodes is the good file...

I believe that boththose has the solution for that...

kylomas
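
For a folder with many thousands of names and several families mixed together, the rule stated above could also be checked by counting each half once instead of comparing every name with every other name. A minimal sketch, assuming a Scripting.Dictionary COM object is acceptable for the counting; _FindGoodNames and _Count are made-up helper names, and the sample list simply reuses the example names from the first post:

```autoit
; Sample names from the first post: two damaged families mixed together.
Local $aNames[10] = ["123_946", "887_321", "456_321", "123_998", "123_321", _
        "33333_65788", "46588_44444", "88888_44444", "33333_11112", "33333_44444"]

ConsoleWrite(_FindGoodNames($aNames) & @CRLF) ; prints 123_321|33333_44444

; Hypothetical helper: return the candidate good names joined with "|".
Func _FindGoodNames($aList)
    Local $oLeft = ObjCreate("Scripting.Dictionary")  ; left half  -> how many names use it
    Local $oRight = ObjCreate("Scripting.Dictionary") ; right half -> how many names use it
    Local $aParts, $sGood = ""

    ; Pass 1: count every half.
    For $sName In $aList
        $aParts = StringSplit($sName, "_")
        If $aParts[0] <> 2 Then ContinueLoop ; ignore names that are not left_right
        _Count($oLeft, $aParts[1])
        _Count($oRight, $aParts[2])
    Next

    ; Pass 2: keep names whose left AND right halves both occur more than once.
    For $sName In $aList
        $aParts = StringSplit($sName, "_")
        If $aParts[0] <> 2 Then ContinueLoop
        If $oLeft.Item($aParts[1]) > 1 And $oRight.Item($aParts[2]) > 1 Then $sGood &= $sName & "|"
    Next
    Return StringTrimRight($sGood, 1) ; drop the trailing "|"
EndFunc

; Add 1 to the counter stored under $sKey, creating it on first use.
Func _Count($oDict, $sKey)
    If $oDict.Exists($sKey) Then
        $oDict.Item($sKey) = $oDict.Item($sKey) + 1
    Else
        $oDict.Add($sKey, 1)
    EndIf
EndFunc
```

On a real folder the list would come from _FileListToArray with the .lmi extension trimmed off. If a corrupt copy happens to combine the repeated left half of one family with the repeated right half of another, it would also be reported, so the output is a list of candidates rather than a guarantee.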

