Jump to content

Database type, ini, my sql, Sqlite


Recommended Posts

Hi

A friend of mine who works in some telephone sales firm, asked if I could create him a program that could handle data from excel, and put em in a database of some kind.
Course as it is now he have to check the spreadsheet by hand, removing empty fields, correct wrong formated addr. etc.

Then he uses a program to compare the spreadsheet to some blacklist spreadsheets to remove entries that exist in either of the blacklist sheets, from the first spreadsheet ending up with the data he wants.

What hi wanned was a program that load the xlsl file into a database, compare it to the blacklist's (witch also should be loaded as databases)
and then give him a gui in where he can sort the data by eg. zipcode, and then choose some data to be exported to an excel file, and the option to mark the data as "sold"

All in all, that should not be that complicated to create a program for, but but the spreadsheets contains from 50K to maybe 200k lines and one of the blacklists contains over a mil. lines the other 2.5 mil lines.

My problem is that I have never worked with MySql nor Sqlite, and therefor my understanding of those is very limited,  I have only used ini as db for my projects in the past.

Could i use ini files as database for all the data, without Autoit crashing and performance issues, like long loading/search/compare time

My thought was to add identical data only once in sections, and then point to that section in the data set. (suggst on how to do the effectively is appreciated :) )

As of now I haven't created any code, first of be course I don't know what type of database I should use, second be course the excel udf crashes with "rc:-1073741819" exit code "3221225477" when I try to load excel files over 30mb, so I need to find a way to prevent that :-/


Hope some of you nice Guy's and Girl's can point me in the right direction, and come with suggst. on how to do the project :)

 

Cheers
/Rex

Link to comment
Share on other sites

As far as there are one or more programs using a local SQLite database there is no real issue with concurrency or size. OTOH I don't recommend using SQLite over a network without serious safeguards, if at all.

A good way to go is to first use a SQLite manager which will allow you to plan and implement a good schema for your database, create and fine tune what needs to be examined for performance. Then only you can make an AutoIt GUI integrating everything. I strongly advise trying SQLite Expert. The free Personal Edition will get you started easily.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

@jchd

So you suggest that I use sqlite, and learn how to build a database from the bottom, is a "Real" database that much better than my first idea with use of an ini file in witch I'm familiar with and if so why? :).

As I said I have only a very limited understanding of sql databases, of course I have done some goggling but still need to do some more to understand why sql databases is so much better, then doing it the "flat file" way :)

This is not attended to be negative, and I will surly have a look at the SQLite Expert.

Cheers

/Rex

Link to comment
Share on other sites

I agree with jchd on using SQLite. If concurrency or network access is an issue, take a look at the semaphore UDF in my signature.

Link to comment
Share on other sites

@jchd

So you suggest that I use sqlite, and learn how to build a database from the bottom, is a "Real" database that much better than my first idea with use of an ini file in witch I'm familiar with and if so why? :).

​Come on, you're a developer are you not? You are skilled enough to find the answer to your own question, by Googling "flat file database vs relational database". Also INI files file are intended for configuration usage as it uses key value pairs, not a database with multiple attributes and entities (tables).

INI-file is completely rewritten every time you change at least one parameter.
SQL writes data to the file pointer, ie rewrites the only sector in the file.

​Source: http://www.autoitscript.com/forum/topic/157034-text-ini-file-or-database-which-one-should-i-use/

UDF List:

 
_AdapterConnections()_AlwaysRun()_AppMon()_AppMonEx()_ArrayFilter/_ArrayReduce_BinaryBin()_CheckMsgBox()_CmdLineRaw()_ContextMenu()_ConvertLHWebColor()/_ConvertSHWebColor()_DesktopDimensions()_DisplayPassword()_DotNet_Load()/_DotNet_Unload()_Fibonacci()_FileCompare()_FileCompareContents()_FileNameByHandle()_FilePrefix/SRE()_FindInFile()_GetBackgroundColor()/_SetBackgroundColor()_GetConrolID()_GetCtrlClass()_GetDirectoryFormat()_GetDriveMediaType()_GetFilename()/_GetFilenameExt()_GetHardwareID()_GetIP()_GetIP_Country()_GetOSLanguage()_GetSavedSource()_GetStringSize()_GetSystemPaths()_GetURLImage()_GIFImage()_GoogleWeather()_GUICtrlCreateGroup()_GUICtrlListBox_CreateArray()_GUICtrlListView_CreateArray()_GUICtrlListView_SaveCSV()_GUICtrlListView_SaveHTML()_GUICtrlListView_SaveTxt()_GUICtrlListView_SaveXML()_GUICtrlMenu_Recent()_GUICtrlMenu_SetItemImage()_GUICtrlTreeView_CreateArray()_GUIDisable()_GUIImageList_SetIconFromHandle()_GUIRegisterMsg()_GUISetIcon()_Icon_Clear()/_Icon_Set()_IdleTime()_InetGet()_InetGetGUI()_InetGetProgress()_IPDetails()_IsFileOlder()_IsGUID()_IsHex()_IsPalindrome()_IsRegKey()_IsStringRegExp()_IsSystemDrive()_IsUPX()_IsValidType()_IsWebColor()_Language()_Log()_MicrosoftInternetConnectivity()_MSDNDataType()_PathFull/GetRelative/Split()_PathSplitEx()_PrintFromArray()_ProgressSetMarquee()_ReDim()_RockPaperScissors()/_RockPaperScissorsLizardSpock()_ScrollingCredits_SelfDelete()_SelfRename()_SelfUpdate()_SendTo()_ShellAll()_ShellFile()_ShellFolder()_SingletonHWID()_SingletonPID()_Startup()_StringCompact()_StringIsValid()_StringRegExpMetaCharacters()_StringReplaceWholeWord()_StringStripChars()_Temperature()_TrialPeriod()_UKToUSDate()/_USToUKDate()_WinAPI_Create_CTL_CODE()_WinAPI_CreateGUID()_WMIDateStringToDate()/_DateToWMIDateString()Au3 script parsingAutoIt SearchAutoIt3 PortableAutoIt3WrapperToPragmaAutoItWinGetTitle()/AutoItWinSetTitle()CodingDirToHTML5FileInstallrFileReadLastChars()GeoIP databaseGUI - Only Close ButtonGUI ExamplesGUICtrlDeleteImage()GUICtrlGetBkColor()GUICtrlGetStyle()GUIEventsGUIGetBkColor()Int_Parse() & Int_TryParse()IsISBN()LockFile()Mapping CtrlIDsOOP in AutoItParseHeadersToSciTE()PasswordValidPasteBinPosts Per DayPreExpandProtect GlobalsQueue()Resource UpdateResourcesExSciTE JumpSettings INISHELLHOOKShunting-YardSignature CreatorStack()Stopwatch()StringAddLF()/StringStripLF()StringEOLToCRLF()VSCROLLWM_COPYDATAMore Examples...

Updated: 22/04/2018

Link to comment
Share on other sites

As of now I haven't created any code, first of be course I don't know what type of database I should use, second be course the excel udf crashes with "rc:-1073741819" exit code "3221225477" when I try to load excel files over 30mb, so I need to find a way to prevent that :-/

​Never seen Excel crash because of big workbooks.
Which version of AutoIt do you run?
Can you post the code you use to load the Excel file?
 

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2022-02-19 - Version 1.6.1.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download
Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki
PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki
Task Scheduler (NEW 2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki

Standard UDFs:
Excel - Example Scripts - Wiki
Word - Wiki

Tutorials:
ADO - Wiki
WebDriver - Wiki

 

Link to comment
Share on other sites

@willichan
Yeah my goggling returned the same answer, use Sqlite

As I understand it' only him that have to use the program, so no network or multi user should be necessary.

 

@guinness

I'm not sure that I could call myself a developer, with those small programs I have made :/

I did Goggle Database vs flat file, but there was so many pros and cons for either one - that was why I ended up asking my Q. here.

Hmm that thread I missed some how, I didn't knew that an ini file was completely rewritten when an entry was made to it :o, but now I do :) thx.

 

@water

Shure, my AutoIT v is 3.3.12.0

The code I used to test with is

#include <Array.au3>
    #include <_XLSXReadToArray.au3>

    local $a, $t, $Error, $Extended
    $t = TimerInit()
        $a = _XLSXReadToArray("test.xlsx", 0, 0, 0)
    $t = TimerDiff($t)
    MsgBox(0, "Ubound", UBound($a))


    ;;or
    $t = TimerInit()
    $a = _XLSXReadToArray("test.xlsx", 0, 2, 11)
    $Error = @Error
    $Extended = @Extended
    $t = TimerDiff($t)
    ConsoleWrite("Timer = " & $t & @LF)
    ConsoleWrite("Return = " & $a & " - $Error = " & $Error & " - $Extended = " & $Extended & @LF)

Cheers

/Rex

Link to comment
Share on other sites

_XLSXReadToArray is needed when no Excel is installed on the computer you run your script. Is this the case?
If Excel is installed you could try the Excel UDF that comes with AutoIt, function _Excel_RangeRead.

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2022-02-19 - Version 1.6.1.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download
Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki
PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki
Task Scheduler (NEW 2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki

Standard UDFs:
Excel - Example Scripts - Wiki
Word - Wiki

Tutorials:
ADO - Wiki
WebDriver - Wiki

 

Link to comment
Share on other sites

One of the benefits of building a solid database is that you don't have to always read the data in full to perform the processing you need. Once the blacklist(s) is (or are) loaded in a table (or tables, depending on your precise needs) you can rely on the power of SQL to return valid results without having to scan the full dataset (provided use of apropriate index).

Enough contributors here have good experience to guide you.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

@water

I have excel installed, and so does my friend.

I will try the excel udf instead then :) thx

@jchd

Ahh cool, then I have to find some tuts about database setups and structuring.

That what I like with this community your all so helpful, don't know what I should had done without you :huggles:

 

Cheers

/Rex

:ILA2:

Link to comment
Share on other sites

DB design is directed by requirements. So what are yours? Input format of data, relationships between informations, queries and result expected.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

Well the data i need to insert is something like this:

added|ip|Gender|SureName|LastName|email|adress|zip code|city|phone1|phone2
01-05-2014 18:21|192.192.162.123|female|Cindy|Loper|Cindy@loper.com|Cindy Road 12|1234|CindyTown|12345678|
13-02-2014 16:21|123.321.456.98|Male|Jolly|Roger|Jolly@Roger.org|RogerRoad 666|9876|RogerCity||98765432

But I still don't get the the relationship part (I'm still reading about sql databases on google)

But as I understand one could create an database with all the zipcodes and city names, and the point to them in the main database?

this would then make the database smaller course the data is only entered once, and then pointed to, yes??

 

Sry for all my noob Q's

 

Cheers

/Rex

Link to comment
Share on other sites

Absolutely: this is called normalization. That (mostly) consists of designing things so that no information is duplicated in a given DB. In this precise case, [country], city, zipcode indeed deserve their own tables not only to save space --that's a good point albeit minor-- but mainly to insure consistancy. Once you've loaded a correct table with countries (should you need it), cities and zipcodes, you just can't create an address with a wrong location.

The machinery used in SQL to achieve that is to use foreign keys (FK). Instead of storing the city name in the address table, you store the identifier (typically an integer) of the city in the cities table. This way the table address references the cities table. You can even make consistancy complete by forcing update and/or delete of city identifier to follow up (cascade in SQL parlance) in tables referencing it. So if a city someday gets ruined by earthquake or alien nuke, there in no more point to keep addresses located there and an "on delete cascade" clause would automagically delete those zipcodes and addresses for you.

This is an example of relationship between DB entities.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

Oh wonderful normalization with 1NF, 2NF and finally 3NF. I've never even used 4-6NF, as I was under the impression third normal form was enough. Even 1NF is better than nothing.

@Rex, think of it like this, every time you want to insert a new address, you have to insert a new user, this a bad design choice and then you have the issue of when you remove a user you loose information about that address. Also you might have people residing at the same address and thus this increases duplication and a likelihood of inconsistencies i.e. if all the people in House A change the address to House AB (postal service or something), then it's case of updating the records in multiple places, rather than one.

UDF List:

 
_AdapterConnections()_AlwaysRun()_AppMon()_AppMonEx()_ArrayFilter/_ArrayReduce_BinaryBin()_CheckMsgBox()_CmdLineRaw()_ContextMenu()_ConvertLHWebColor()/_ConvertSHWebColor()_DesktopDimensions()_DisplayPassword()_DotNet_Load()/_DotNet_Unload()_Fibonacci()_FileCompare()_FileCompareContents()_FileNameByHandle()_FilePrefix/SRE()_FindInFile()_GetBackgroundColor()/_SetBackgroundColor()_GetConrolID()_GetCtrlClass()_GetDirectoryFormat()_GetDriveMediaType()_GetFilename()/_GetFilenameExt()_GetHardwareID()_GetIP()_GetIP_Country()_GetOSLanguage()_GetSavedSource()_GetStringSize()_GetSystemPaths()_GetURLImage()_GIFImage()_GoogleWeather()_GUICtrlCreateGroup()_GUICtrlListBox_CreateArray()_GUICtrlListView_CreateArray()_GUICtrlListView_SaveCSV()_GUICtrlListView_SaveHTML()_GUICtrlListView_SaveTxt()_GUICtrlListView_SaveXML()_GUICtrlMenu_Recent()_GUICtrlMenu_SetItemImage()_GUICtrlTreeView_CreateArray()_GUIDisable()_GUIImageList_SetIconFromHandle()_GUIRegisterMsg()_GUISetIcon()_Icon_Clear()/_Icon_Set()_IdleTime()_InetGet()_InetGetGUI()_InetGetProgress()_IPDetails()_IsFileOlder()_IsGUID()_IsHex()_IsPalindrome()_IsRegKey()_IsStringRegExp()_IsSystemDrive()_IsUPX()_IsValidType()_IsWebColor()_Language()_Log()_MicrosoftInternetConnectivity()_MSDNDataType()_PathFull/GetRelative/Split()_PathSplitEx()_PrintFromArray()_ProgressSetMarquee()_ReDim()_RockPaperScissors()/_RockPaperScissorsLizardSpock()_ScrollingCredits_SelfDelete()_SelfRename()_SelfUpdate()_SendTo()_ShellAll()_ShellFile()_ShellFolder()_SingletonHWID()_SingletonPID()_Startup()_StringCompact()_StringIsValid()_StringRegExpMetaCharacters()_StringReplaceWholeWord()_StringStripChars()_Temperature()_TrialPeriod()_UKToUSDate()/_USToUKDate()_WinAPI_Create_CTL_CODE()_WinAPI_CreateGUID()_WMIDateStringToDate()/_DateToWMIDateString()Au3 script parsingAutoIt SearchAutoIt3 PortableAutoIt3WrapperToPragmaAutoItWinGetTitle()/AutoItWinSetTitle()CodingDirToHTML5FileInstallrFileReadLastChars()GeoIP databaseGUI - Only Close ButtonGUI ExamplesGUICtrlDeleteImage()GUICtrlGetBkColor()GUICtrlGetStyle()GUIEventsGUIGetBkColor()Int_Parse() & Int_TryParse()IsISBN()LockFile()Mapping CtrlIDsOOP in AutoItParseHeadersToSciTE()PasswordValidPasteBinPosts Per DayPreExpandProtect GlobalsQueue()Resource UpdateResourcesExSciTE JumpSettings INISHELLHOOKShunting-YardSignature CreatorStack()Stopwatch()StringAddLF()/StringStripLF()StringEOLToCRLF()VSCROLLWM_COPYDATAMore Examples...

Updated: 22/04/2018

Link to comment
Share on other sites

Ohh, but I don't have to enter anything, all the info is imported directly from excel spreadsheets.

But i guess that I still can save a lot of space by do some check on the spreadsheet when importing it, and mark zip code and city as a "FK" (Foreign Key)?

Also I was thinking of doing some thing like a common names database, just to save space and maybe make everything run faster (not that I have any idea of the speed of such program/database), In Denmark we have a lot of Hansen, Jensen, Olsen etc.

And if I understand what you guys trying to teach me correct, then all those duplicates I can combine into one record makes the db smaller and faster, yes?

But from reading on google, I can see that creating and setting up a database is quit easy - the hard part is to get the structure right. :/

 

Cheers

/Rex

beer.gif

If I could I would give you all a case of beers, as an appreciation for all your help <3

Link to comment
Share on other sites

But from reading on google, I can see that creating and setting up a database is quit easy - the hard part is to get the structure right. :/

​Yep, that's why we create Entity Relationship (ER) diagrams and from there move onto Database diagrams in which the PKs and FKs are determined, so it's easy to visualise how every entity aka table is related to one another and what relationships they have e.g. zero to many (0..*) or 1 to 1 (1..1). SQLite is easy in that sense that you have INTEGER, REAL, TEXT, NULL and BLOB and not VCHAR or NVARCHAR.

UDF List:

 
_AdapterConnections()_AlwaysRun()_AppMon()_AppMonEx()_ArrayFilter/_ArrayReduce_BinaryBin()_CheckMsgBox()_CmdLineRaw()_ContextMenu()_ConvertLHWebColor()/_ConvertSHWebColor()_DesktopDimensions()_DisplayPassword()_DotNet_Load()/_DotNet_Unload()_Fibonacci()_FileCompare()_FileCompareContents()_FileNameByHandle()_FilePrefix/SRE()_FindInFile()_GetBackgroundColor()/_SetBackgroundColor()_GetConrolID()_GetCtrlClass()_GetDirectoryFormat()_GetDriveMediaType()_GetFilename()/_GetFilenameExt()_GetHardwareID()_GetIP()_GetIP_Country()_GetOSLanguage()_GetSavedSource()_GetStringSize()_GetSystemPaths()_GetURLImage()_GIFImage()_GoogleWeather()_GUICtrlCreateGroup()_GUICtrlListBox_CreateArray()_GUICtrlListView_CreateArray()_GUICtrlListView_SaveCSV()_GUICtrlListView_SaveHTML()_GUICtrlListView_SaveTxt()_GUICtrlListView_SaveXML()_GUICtrlMenu_Recent()_GUICtrlMenu_SetItemImage()_GUICtrlTreeView_CreateArray()_GUIDisable()_GUIImageList_SetIconFromHandle()_GUIRegisterMsg()_GUISetIcon()_Icon_Clear()/_Icon_Set()_IdleTime()_InetGet()_InetGetGUI()_InetGetProgress()_IPDetails()_IsFileOlder()_IsGUID()_IsHex()_IsPalindrome()_IsRegKey()_IsStringRegExp()_IsSystemDrive()_IsUPX()_IsValidType()_IsWebColor()_Language()_Log()_MicrosoftInternetConnectivity()_MSDNDataType()_PathFull/GetRelative/Split()_PathSplitEx()_PrintFromArray()_ProgressSetMarquee()_ReDim()_RockPaperScissors()/_RockPaperScissorsLizardSpock()_ScrollingCredits_SelfDelete()_SelfRename()_SelfUpdate()_SendTo()_ShellAll()_ShellFile()_ShellFolder()_SingletonHWID()_SingletonPID()_Startup()_StringCompact()_StringIsValid()_StringRegExpMetaCharacters()_StringReplaceWholeWord()_StringStripChars()_Temperature()_TrialPeriod()_UKToUSDate()/_USToUKDate()_WinAPI_Create_CTL_CODE()_WinAPI_CreateGUID()_WMIDateStringToDate()/_DateToWMIDateString()Au3 script parsingAutoIt SearchAutoIt3 PortableAutoIt3WrapperToPragmaAutoItWinGetTitle()/AutoItWinSetTitle()CodingDirToHTML5FileInstallrFileReadLastChars()GeoIP databaseGUI - Only Close ButtonGUI ExamplesGUICtrlDeleteImage()GUICtrlGetBkColor()GUICtrlGetStyle()GUIEventsGUIGetBkColor()Int_Parse() & Int_TryParse()IsISBN()LockFile()Mapping CtrlIDsOOP in AutoItParseHeadersToSciTE()PasswordValidPasteBinPosts Per DayPreExpandProtect GlobalsQueue()Resource UpdateResourcesExSciTE JumpSettings INISHELLHOOKShunting-YardSignature CreatorStack()Stopwatch()StringAddLF()/StringStripLF()StringEOLToCRLF()VSCROLLWM_COPYDATAMore Examples...

Updated: 22/04/2018

Link to comment
Share on other sites

Sorry but it's a bad idea of merging duplicate lastnames. Several Hansen may share the name and "Hansen" is a data in string form but in no way an information by itself. Normalization looks at duplicate information, not duplicate strings, well, unless there are very strong benefits to expect, which is not the case here.

Here's a link to a sample DB with your 2 lines. Play with it using SQLite Expert. You'll notice that you don't have to supply a value for addrid by yourself, since it is an "integer primary key", an identifier that SQLite will fill for you if you don't supply one.

I've made gender a one char column which I force to be either M or F: look at the constraints and triggers for that one. So you can insert a row with "female" or "masculine" or "fff" or "minion" and the field will be F or M accordingly.

DateAdded defaults to current timestamp. Note that the timestamp is in ISO format: sorts correctly and is easily manipulated by SQL date/time functions. Feel free to reformat it according to your locale for human interface.

Email is checked to contain exactly one @ (complete validation of email addresses is too complicated at this stage).

These checks and constraints enforce consistancy.

Some columns are optional (I've introduced a second address line which may not be necessary) like phone numbers. Others are specified "not null" which means you have to supply something, hopefully something sensible!

See the FK relationship on zipid.

Obviously this is only a sketch and nothing forcibly close to what you actually need. Yet it's a start toy example with which you can experiment with a third-party manager like Expert, without ever having to write a single line of AutoIt code. That will only come much later once the dust has settled on the DB schema and queries that you need to fine-tune for the job at hand.

Sorry for the late reply, I've been vampirized by phone calls.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

Now it's my turn to excuse for a late reply.

Wauu Jchd don't know what to say, this is wauu fantastic service.

The word Thanks isn't enough, but I cant find a word in English that fits my gratitude any better.

Cheers, with a head to the floor bow.

/Rex

 

Edited by Rex
Link to comment
Share on other sites

Wait, this is only a start. Now you have to decide what processing will take place and (if I understood you correctly) against which data then for which result.

Is that 2 lines sample an example of blacklist or data to be examined?

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

The sample data is the "true" data witch must be compared up against 2 blacklists.

I have created some samples from the list I have resived form him.

"Importdata"

 

The 1. black list (official)

 

The 2. blacklist (internal)

 

I have also att. them in csv format.

The blacklists, should (i guess) be loaded into one or 2 databases, blacklist 1 (official) is close 2 1.5 mili lines, at the 2. (Internal) is about 4K lines, to make the compare run faster?

Blacklist 1 is formated by Last name, Surname and then address - but my friend said that is only the address the compare against and not the names.

Blacklist 2 is only phone numbers, and an added to blacklist date.

 

I asked him to specify him what hi wanned the end result to be, and he send me  this.

"Import:

I want to import a CSV file where i can choose what's in every row - this way i can prevent import errors

Database informations:

Info there needs to be.

IP, Date, Surname, Last Name, Addr, Zip, City, Email, Phone1, Phone2, Prospect ID*, Sold/Not Sold

The prospect ID we first get when we have send the data to a costumer, who then checks the data against there database - and then returns the result.

So if i send 10K to the costumer, he checks against the database and then returns 8k, those 8k would then have the ID, witch should then be added to the db

The db should also be able to import data from our calling system, info like Phone provider, Last contact (on each project) (the DB should be able to grow with new informations, not already in the db)

 

Data Withdraw:

I want to be able to sort the data (zip, City eg.), and also sort them at then choose specific data (eg. x in this zip code)

export as excel (or CSV)

 

Match:

Should be able to match against other db that contains the official blacklist and also a db that contains our internal blacklist

It should match against add and phone."

 

As it seems it will be a quit large project, and I'm not sure that my AutoIT skill is up to that level :'( - but if not i try I would't know for sure :D, and learn by doing has always been the best way :)

But maybe I could start simple and do a db with the import data, that ckecks against the blacklists - and then removes what is equal to the blacklist, ending up with only data not equal to the blacklists is written to the DB.

That DB should of course could be checked against the blacklist, when the blacklist is updated.

 

Cheers

/Rex

 

 

 

Edited by Melba23
Removed personal data
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...