leuce

Wrap text file in an EXE, searchable

13 posts in this topic

#1 ·  Posted (edited)

G'day everyone

I would like to write a script that allows me to distribute a large text file (a dictionary, to be exact) inside an EXE file so that the user can search the text file (and have results exported to a separate text or HTML file that opens in his browser), but so that the user can't copy (read: steal) the text file easily.

(I'm trying to encourage a few small local publishers and authors of useful dictionaries who are concerned about intellectual property theft to release their reference material in electronic format. This is not meant as a profitable venture for me, but paper dictionaries are useless to me and some colleagues. The EXE file will be loaded with some kind of copy-protection, and I see there are some threads on that topic in the forums.)

I can't use FileInstall because some clever user will figure out that the dictionary is extracted to his hard drive, and then he'll share it with all those who didn't want to pay the author any money. So the dictionary has to be inside the EXE and it must not come out of the EXE at any time. The dictionaries probably won't exceed about 5 MB of plain text.

The script doesn't have to have a GUI -- I'm happy to have users double-click the EXE file and type their query in an InputBox every time. So my only concern at this stage is how to wrap a text file in the EXE file and have it searchable by the script.

Do you have any idea how this can be done?

One option is to put the entire dictionary in a single string and use string functions to search it. I know I can use StringInStr to test whether a value is in a string, but how can I tell the script to add the result plus 100 characters before and after it to another string (so that I can write it to a file)? I guess I can use a stack of StringSplit functions to do that. Alternatively I could use StringRegExpReplace to find everything that does not match the search term (minus about 100 characters on either side) and remove it from the dictionary string, before writing the dictionary string to a text file.
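For the single-string idea, StringInStr takes a start position and StringMid can cut out a window around each hit, so no StringSplit stack is needed. The following is only a sketch: _GetMatchesWithContext is a made-up helper name, the sample text is invented, and the "----" separator is an arbitrary choice.

```autoit
; Sketch: collect every hit for $sTerm plus up to $iContext characters on each side.
; _GetMatchesWithContext is a hypothetical helper name, not a built-in function.
Func _GetMatchesWithContext($sText, $sTerm, $iContext = 100)
    Local $sOut = "", $iPos = 1, $iHit, $iStart, $iLen
    If $sTerm = "" Then Return $sOut
    While 1
        ; casesense 0 = case-insensitive, occurrence 1 = first hit at or after $iPos
        $iHit = StringInStr($sText, $sTerm, 0, 1, $iPos)
        If $iHit = 0 Then ExitLoop
        $iStart = $iHit - $iContext
        If $iStart < 1 Then $iStart = 1 ; clamp at the start of the text
        $iLen = ($iHit - $iStart) + StringLen($sTerm) + $iContext
        ; StringMid simply stops at the end of the string if the count runs past it
        $sOut &= StringMid($sText, $iStart, $iLen) & @CRLF & "----" & @CRLF
        $iPos = $iHit + StringLen($sTerm) ; continue after this hit
    WEnd
    Return $sOut
EndFunc   ;==>_GetMatchesWithContext

; Invented sample data, with a small context width so the effect is visible
Local $sDict = "house : a building people live in" & @CRLF & "houseboat : a boat people live on"
MsgBox(0, "Results", _GetMatchesWithContext($sDict, "boat", 10))
```

The returned string can then be written out with FileWrite. Overlapping windows are not merged here, so two hits close together will repeat some text.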

I can also add the dictionary to an array but I don't know how to do regex searches within the items of an array. Is that even possible? I can then add the array items to another string, which I can write to a file.
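Regex searches over array items are possible by looping over the array and calling StringRegExp on each element. A minimal sketch, with invented sample entries and an invented pattern:

```autoit
; Sketch: test each array element against a regular expression with StringRegExp.
Local $aEntries[3] = ["house : a building people live in", _
        "mouse : a small rodent", _
        "moon : the natural satellite of the Earth"]
Local $sPattern = "(?i)\b[hm]ouse\b" ; (?i) makes the match case-insensitive
Local $sResults = ""

For $i = 0 To UBound($aEntries) - 1
    ; With the default flag (0), StringRegExp returns 1 on a match and 0 otherwise
    If StringRegExp($aEntries[$i], $sPattern) Then
        $sResults &= $aEntries[$i] & @CRLF
    EndIf
Next

MsgBox(0, "Matching entries", $sResults)
```

One entry per array element keeps each result a whole dictionary entry, which avoids the "100 characters either side" bookkeeping altogether.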

Alternatively, I'm hoping that there is some way to compile the text file into the EXE and then have the script interact with it as if it were a separate file (because then I can use all kinds of tricks to get data from it).

What do you suggest?

Thanks

Samuel

Edited by leuce




Sounds interesting. I have no idea how to go about doing that, but I do know of a Resource UDF which lets you embed data into the script. I have never used it and am not sure whether it can be used like that, but it's worth a try.


You could encrypt the text file. When the EXE launches, it will decrypt the parts needed into memory. If the user tries to capture the encrypted file, they won't be able to do much with it. Example:

#include <String.au3>

$string = "This is a definition"

$e = _StringEncrypt(1, $string, "abc123", 1)

MsgBox(0, "", $e)


#4 ·  Posted (edited)

You can use my Resource UDF for storing large texts in a compiled EXE file and reading them back,

but there is a problem with copy protection.

You can encrypt your text file and add it to the resources in encrypted form.

At runtime you can read the whole text into a variable and decrypt its content in memory.

Then you can, for example, use StringSplit or search that variable with a RegExp directly.

But, as has been written on this forum about decompilation,

it is hard to make compiled scripts resistant to decompilation/deobfuscation etc.

Another problem is that your whole text will at some stage be stored in memory in decrypted form,

so an experienced hacker can copy it from memory.

Also, storing passwords/keys (for decryption) inside the script/compiled EXE is not recommended if the data is very important.

So with this approach (encrypted text file stored in resources) a really inexperienced user has no chance of getting at the text,

but experienced people can get it.
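The decrypt-in-memory step could look roughly like the sketch below. It uses the standard Crypt.au3 UDF instead of _StringEncrypt (which, as far as I know, is no longer part of String.au3 in newer AutoIt releases). The password, the AES-256 choice and the sample text are assumptions for illustration only; in the scheme described above the encrypted bytes would come from the EXE's resources rather than being created in the same script.

```autoit
#include <Crypt.au3>

; Assumptions: placeholder password, AES-256, invented sample text.
Local Const $sPassword = "abc123"
Local $sPlain = "house : a building people live in" & @CRLF & "moon : the natural satellite of the Earth"

_Crypt_Startup()

; Would be done once, at build time: encrypt the UTF-8 bytes of the dictionary text
Local $dEncrypted = _Crypt_EncryptData(StringToBinary($sPlain, 4), $sPassword, $CALG_AES_256)

; Done at run time: decrypt into a variable only, never to disk
Local $sText = BinaryToString(_Crypt_DecryptData($dEncrypted, $sPassword, $CALG_AES_256), 4)

_Crypt_Shutdown()

; Search the in-memory text; flag 3 returns an array of all matches
Local $aHits = StringRegExp($sText, "(?i)\bmoon\b[^\r\n]*", 3)
If Not @error Then MsgBox(0, "Found", $aHits[0])
```

The caveats above still apply: the password sits in the script and the decrypted text sits in memory, so this only raises the bar for casual copying.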

EDIT:

A simple example of using my UDF for that (without encryption):

#AutoIt3Wrapper_useupx=n
#AutoIt3Wrapper_run_after=ResHacker.exe -add %out%, %out%, test_1.txt, rcdata, TEST_TXT_1, 0
#AutoIt3Wrapper_run_after=upx.exe --best --compress-resources=0 "%out%"

#include "resources.au3"
 
$string = _ResourceGetAsString("TEST_TXT_1")
MsgBox(0, 'Text from resource', $string)
Edited by Zedna


Hi,

At runtime you can get this whole text into variable and decrypt its content in memory.

There is no need to decrypt the data in memory.

The user types the word into the InputBox; the script encrypts the word, searches for the best translation (which is stored encrypted), decrypts it and displays it, e.g. in a message box.

Andy


Hi, There is no need to decrypt the data in memory.

The user types the word into the inputbox, the Script encrypts the word, searches for the best translation (which is encrypted) and decrypts it and displays it i.e. in a Messagebox.

Andy

It would have to be very simple, (ie piss poor) encryption for that to work IMO :)




You could encrypt the text file. When the EXE launches, it will decrypt the parts needed into memory. If the user tries to capture the encrypted file, they won't be able to do much with it.

Thanks, although I don't see how this might be put into practice unless the user does exact searches (searches that match encrypted strings exactly). Do I understand correctly? I suppose this would be useful if users wanted to search only the headwords (lemmas, terms) of dictionary entries, and not the body (definition) section. In my case, however, full-text search would be preferable. But thanks for the tip.


#8 ·  Posted (edited)

It would have to be very simple, (ie piss poor) encryption for that to work IMO :)

If I understand correctly, the only way this might work is if each word of the dictionary is encrypted individually, and even then the user has to type in the exact word. Then, when the result is displayed, each word from the dictionary entry has to be decrypted individually before displaying it.

This might not be a good idea if one wants to keep the information secret (because a brute force attack will decrypt enough words to allow the cryptologist to guess the sentences), but it may be sufficient if one only wants to prevent someone from copying and redistributing the file without the EXE (which is what I want). What do you think?

ADDED: Using Volley's snippet...

#include <String.au3>
$string1 = "House"
$string2 = "a"
$string3 = "building"
$string4 = "people"
$string5 = "live"
$string6 = "in"
$e1 = _StringEncrypt(1, $string1, "abc123", 1)
$e2 = _StringEncrypt(1, $string2, "abc123", 1)
$e3 = _StringEncrypt(1, $string3, "abc123", 1)
$e4 = _StringEncrypt(1, $string4, "abc123", 1)
$e5 = _StringEncrypt(1, $string5, "abc123", 1)
$e6 = _StringEncrypt(1, $string6, "abc123", 1)
InputBox ("", "", $e1 & " : " & $e2 & " " & $e3 & " " & $e4 & " " & $e5 & " " & $e6)

#cs

Original = House : a building people live in .

Encrypted = D0028EFCDB57C0B30B11 : D274 D2778F8DDC5BC7C70B167E057AC26DF7 D1758E8DDC5DC0CB0B617E70 D2798E81DB2FC7B1 D2058EFF

So "building" = "D2778F8DDC5BC7C70B167E057AC26DF7"

#ce

$string7 = "BUILDING"
$string8 = "Building"
$string9 = "building"
$e7 = _StringEncrypt(1, $string7, "abc123", 1)
$e8 = _StringEncrypt(1, $string8, "abc123", 1)
$e9 = _StringEncrypt(1, $string9, "abc123", 1)
InputBox ("", "", $e7 & " | " & $e8 & " | " & $e9)

#cs

D0778D8DDE5BC5C709167C057CC216F7 | D0778F8DDC5BC7C70B167E057AC26DF7 | D2778F8DDC5BC7C70B167E057AC26DF7

#ce

I could create a list of short words that I will not allow the user to search for.

I could create a plaintext file with all the dictionary's words in it (listed alphabetically) which can be checked against the unencrypted search word to see if the word occurs in the larger, encrypted file before searching for it in the encrypted file.

I would have to convert the user's input to a generic format (eg lowercase, uppercase, title case) to cover more searches. But this method won't allow regex searches.
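The three checks above (stop list, plaintext word index, case normalisation) could be combined in one small function that runs before the encrypted data is touched. This is only a sketch: _CheckQuery is a made-up name, and the stop list and word index are invented sample data stored as pipe-delimited strings.

```autoit
; Sketch: normalise the query and reject it early, before searching the encrypted data.
; The stop list and the word index below are made-up sample data.
Local $sStopWords = "|a|an|in|of|the|"
Local $sWordIndex = "|building|house|live|people|"

; _CheckQuery is a hypothetical helper: returns the normalised word, or sets @error
Func _CheckQuery($sInput, $sStopWords, $sWordIndex)
    Local $sWord = StringLower(StringStripWS($sInput, 3)) ; 3 = strip leading and trailing whitespace
    If $sWord = "" Then Return SetError(1, 0, "") ; nothing typed
    If StringInStr($sStopWords, "|" & $sWord & "|") Then Return SetError(2, 0, "") ; on the stop list
    If Not StringInStr($sWordIndex, "|" & $sWord & "|") Then Return SetError(3, 0, "") ; not in the dictionary
    Return $sWord
EndFunc   ;==>_CheckQuery

Local $sQuery = _CheckQuery("  Building ", $sStopWords, $sWordIndex)
If @error Then
    MsgBox(0, "Search", "Query rejected, reason code " & @error)
Else
    MsgBox(0, "Search", "OK to search for: " & $sQuery)
EndIf
```

The encrypted dictionary would then have to be built from lowercase words as well, so that the normalised query encrypts to the same value.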

Edited by leuce


You can use my Resource UDF for storing/getting large texts into compiled EXE file but there is problem with copy protection. You can encrypt your text file and add it into resources in encrypted form. At runtime you can get this whole text into variable and decrypt its content in memory. Then you may use for example StringSplit or search that variable with RegExp directly.

Thanks, this looks like the most promising method. In fact, if the text file is inside the EXE file, it is sufficiently out of reach for my liking and I don't think it needs to be encrypted further. It all depends on how fast the EXE loads on slow computers. Thanks again for this.


Encryption means that decryption is also possible. The only reliable method is to hash the search term and compare it with a hashed dictionary. The catch is that every hash must be unique, i.e. no two words may produce the same hash.
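A hashed lookup along these lines might be sketched with _Crypt_HashData from Crypt.au3. SHA-1 and the _HashWord helper are assumptions chosen for illustration, and the sample entries are invented. Note that only the headwords are hashed here; the definitions are still plain text and would need separate protection, and a hash allows exact-match lookups only.

```autoit
#include <Crypt.au3>

; _HashWord is a hypothetical helper: lower-case the word, then return its SHA-1 hash as a hex string
Func _HashWord($sWord)
    Return String(_Crypt_HashData(StringToBinary(StringLower($sWord), 4), $CALG_SHA1))
EndFunc   ;==>_HashWord

_Crypt_Startup()

; In a real build the keys would be precomputed so the plain headwords never appear in the script;
; they are hashed here only to keep the sketch self-contained.
Local $oIndex = ObjCreate("Scripting.Dictionary")
$oIndex.Add(_HashWord("house"), "a building people live in")
$oIndex.Add(_HashWord("moon"), "the natural satellite of the Earth")

Local $sQuery = "House"
If $oIndex.Exists(_HashWord($sQuery)) Then
    MsgBox(0, $sQuery, $oIndex.Item(_HashWord($sQuery)))
Else
    MsgBox(0, $sQuery, "is not found in the dictionary")
EndIf

_Crypt_Shutdown()
```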


Thanks, this looks like the most promising method.  In fact, if the text file is inside the EXE file, it is sufficiently out of reach for my liking and I don't think it needs to be encrypted further.  It all depends on how fast the EXE loads on slow computers.  Thanks again for this.

If you add the text file to the resources without encryption, people will be able to read all the text directly by opening the compiled EXE file in any viewer.

You can try it yourself: just add a text file and look at it.


#12 ·  Posted (edited)

Hi,

This is an example of a dictionary inside the EXE. To "read" the dictionary words and sentences, the user has to decompile the EXE (to find the algorithm) ... if you are afraid of that, I think AutoIt is the wrong tool for publishing EXE files.

#include <String.au3>

#include <Array.au3>

;very simple example of using encrypted text in a dictionary

;example of encrypted text inserted into the EXE file
$dict = "CC43D2AEF2BD1397=CC34D6DAF3CE1394ED5B141CA72ABCCD46A5A9E2CCFCE558E329693839DA457167FE837DBB0084EA93E6D34D97681538A76BF4D578D464D2D59524F2937BB6B60B367A9F9C9F5CFF0A64529A2C447989B4A4EFCAC6E9FB35DBF657EA2FDF9D665DC2705C1178FF13" & Chr(0) & _
        "CC30D2DBF3CF12E4ED2B=CC34D6DAF3CD1395ED5B141EA758C0C93AD6ADEBCD85E52AE35E6D3939DB380E638E827DBF74879E92E1D44F97691433A01AF4D578D760DBD39C5FF8927CB7B70F407A9DE1EE58F20A6453EC2C48"


$oDictionary = ObjCreate('Scripting.Dictionary')  ;dictionary
$array = StringSplit($dict, Chr(0), 3) ;splits $dict at chr(0) or any other character which is not used in a text
;_arraydisplay($array)
For $i = 0 To UBound($array) - 1 ;fill the dictionary
    $part = StringSplit($array[$i], "=", 3) ;or any other character which is not used in a text
    ;_arraydisplay($part)
    $oDictionary.Add($part[0], $part[1])
Next

While 1
    $searchfor = InputBox("Search", "Type the word you are looking for" & @CRLF & " ...try Earth...try Mars", "Moon")
    If @error Then ExitLoop
    ;ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : $test = ' & $test & @crlf & '>Error code: ' & @error & @crlf) ;### Debug Console
    If $oDictionary.Exists(_encrypt($searchfor)) Then
        MsgBox(0, $searchfor, _decrypt($oDictionary.Item(_encrypt($searchfor))))
    Else
        MsgBox(0, $searchfor, "is not found in the dictionary")
    EndIf
WEnd

Func _encrypt($x) ;simple encryption
    Return _StringEncrypt(1, $x, "Password", 1) ;or any other useful encrypting method
EndFunc   ;==>_encrypt

Func _decrypt($x)
    Return _StringEncrypt(0, $x, "Password", 1)
EndFunc   ;==>_decrypt

Andy

Edited by AndyG


I would like to know how you eventually do it so please do keep us updated.

