trescon Posted October 20, 2012 (edited)
Good evening, I would like to ask for a little help. I have four files to manage. They all share a common field that identifies each record, so the files can be merged into a single file. The files are in CSV format, with up to 120,000 records each. I need suggestions on the best way to search quickly and efficiently through all the records of the four files, so that records with the same unique code can be merged into a single file. Does anyone have suggestions? Thanks, Alberto
Edited October 20, 2012 by trescon
Signature: Thank you, Alberto. I am translating with Google.
water Posted October 20, 2012
I would suggest importing the CSV files into a database. Databases are made for searching and merging large amounts of data.
trescon (Author) Posted October 20, 2012
Yes, of course, but I want to do it with AutoIt. One reason is that I only need to look up some of the data in these large files; I do not always process everything. After the search, the records I actually have to manage number no more than 400 to 500. My problem is finding a reasonable algorithm for searching data in a file that has been loaded into an array for easier handling. Thanks, Alberto
water Posted October 21, 2012
Are the data files sorted on the common field?
trescon (Author) Posted October 21, 2012
I'm sorry, what do you mean by "common field"?
water Posted October 21, 2012
You wrote that "there is a common field for all that allows me to identify them". Are the data records sorted by this field?
trescon (Author) Posted October 21, 2012
Quoting water: "Are the data records sorted by this field?"
No. This field is a numeric "code" that tends to grow with every new product added to the range (the range goes from code "0000000" to "110000"). This code field is the only field common to the four files I have to manage, and I intend to search them by "code". From the four files I take descriptions, quantities, stocks and prices; each file can contain one or more of these items. The four files do not all have the same length, either in number of records or in number of fields. The largest file is the one that contains the article master data; the other files may contain important dates and stock levels that I need to filter on. The files can be in CSV or Excel format. I hope I have cleared things up a bit. Thanks, Alberto
water Posted October 21, 2012
If the files you want to process can be in Excel format you could use
trescon (Author) Posted October 21, 2012
The files are in CSV format, not Excel.
jchd Posted October 22, 2012
You can import your 4 CSV files into 4 SQLite tables (SQLite is an easy-to-use RDBMS engine that can be driven directly from AutoIt). Doing so is easy thanks to the command-line executable, SQLite3.exe. Once all the data is loaded into a database, you can create the final table consisting of all the columns you need and use SQL to merge the data there according to your needs and rules. This (presumably one-time) task will be easier if you use a good SQLite DB manager, like SQLite Expert Personal (freeware version). When you have the final table(s) ready, you can use AutoIt to routinely manipulate, query and update your SQLite database.
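The import-then-merge workflow described above can be sketched as follows. This is only an illustration under stated assumptions: it is written in Python with its bundled `sqlite3` module rather than AutoIt, it uses two tiny made-up CSV samples in place of the four real files, and the table and column names (`articles`, `prices`, `code`, `description`, `price`) are hypothetical. The SQL statements are the part that should carry over to any SQLite front end.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for two of the four CSV files.
ARTICLES_CSV = "code,description\n0000001,Widget\n0000002,Gadget\n0000003,Gizmo\n"
PRICES_CSV = "code,price\n0000001,9.50\n0000003,4.25\n"


def load_csv(conn, table, text):
    """Create `table` from CSV text whose first row holds the column names."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    # TEXT columns keep the leading zeros of the codes intact.
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    marks = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)
    # An index on the shared code column keeps the join fast on large tables.
    conn.execute(f'CREATE INDEX "idx_{table}_code" ON "{table}" (code)')


def merge_by_code(conn):
    """Return one merged row per article, with NULL where a file has no match."""
    query = """
        SELECT a.code, a.description, p.price
        FROM articles AS a
        LEFT JOIN prices AS p ON p.code = a.code
        ORDER BY a.code
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
load_csv(conn, "articles", ARTICLES_CSV)
load_csv(conn, "prices", PRICES_CSV)
merged = merge_by_code(conn)
for row in merged:
    print(row)
```

The `LEFT JOIN` keeps every article even when another file has no row for its code, which matches the situation where the files have different lengths. Extending the sketch to four files would mean loading two more tables and adding two more `LEFT JOIN` clauses on `code`.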
water Posted October 22, 2012
In post #8 you stated: "Files can be some csv or excel format." Anyway, as jchd and I suggested, use a SQL database for "data analysis".
trescon (Author) Posted October 22, 2012
In reply to the two posts above by jchd and water:
Good evening. First of all, thanks for your suggestions, but they are too complex for my level of knowledge, and perhaps more "powerful" than what I need. I just have to find, in all four files, the data corresponding to each code I have (the codes are listed in a fifth file) and combine it all into one record. In practice I run four searches, one per file, and whenever a record matches my code I copy all of its data to a common file. I hope that is clear, and I hope you have some suggestions, both on the best technique for the search and on creating INDEXES if necessary. Best regards
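For the lookup described above (a few hundred wanted codes searched in several large files), one common alternative to scanning every file for every code is to read each file once into a keyed index and then look codes up directly. The sketch below is a minimal illustration of that idea, not a method proposed by anyone in this thread. It is written in Python for brevity, and the sample data, file names and field names are all hypothetical.

```python
import csv
import io

# Hypothetical sample data; real input would be the four CSV files.
FILES = {
    "articles": "code,description\n0000001,Widget\n0000002,Gadget\n",
    "stock": "code,quantity\n0000002,14\n0000001,3\n",
    "prices": "code,price\n0000001,9.50\n",
}
# Hypothetical list of wanted codes (the last one exists in no file).
WANTED_CODES = ["0000001", "0000002", "0000009"]


def build_index(text):
    """Map each code to its row (a dict), reading the file only once."""
    return {row["code"]: row for row in csv.DictReader(io.StringIO(text))}


def merge_records(files, wanted):
    """Combine the fields found for each wanted code into one record."""
    indexes = {name: build_index(text) for name, text in files.items()}
    merged = []
    for code in wanted:
        record = {"code": code}
        for index in indexes.values():
            # A missing code simply contributes no fields.
            record.update(index.get(code, {}))
        if len(record) > 1:  # keep only codes found in at least one file
            merged.append(record)
    return merged


result = merge_records(FILES, WANTED_CODES)
for record in result:
    print(record)
```

Each file is read once and each lookup is a single dictionary access, so the files do not need to be sorted by code. The sketch assumes each code appears at most once per file; with duplicates, the last row read would win.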
water Posted October 23, 2012
If a database is too complex, then maybe the problem is too complex as well. To write good and manageable code you need an algorithm that is appropriate for your problem. To handle half a million records you need more than just a few loops reading through the files. I would suggest importing the CSV files into a database and then doing all the necessary processing there. Just my $0.02 worth.
jchd Posted October 23, 2012
Just like water, I also stand by my answer. As you can see, we're willing to help you learn how to perform this kind of task elegantly and powerfully. Not only will you learn a lot, but experience shows that over time needs grow and get more complex. Putting this on a solid footing from the ground up is therefore an investment in the future. But then, we can't (and won't) force your choice.