Data search across multiple files


trescon

Good evening, I would like to ask for a little help.

I have 4 files to manage. They all share a common field that identifies each record, so they can be merged into a single file.

The files contain data in CSV format, with up to 120,000 records per file.

I need suggestions on the best way to search quickly and efficiently through all the records of the four files, so that the data with the same unique code can be merged into a single file.

Does anyone have suggestions?

Thanks

Alberto

Edited by trescon

Thank You

Alberto

---------------------------------------------------

I am translating with Google.


I would suggest importing the CSV files into a database. Databases are made for searching and merging large amounts of data.

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2022-02-19 - Version 1.6.1.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download
Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki
PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki
Task Scheduler (NEW 2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki

Standard UDFs:
Excel - Example Scripts - Wiki
Word - Wiki

Tutorials:
ADO - Wiki
WebDriver - Wiki

 


Yes, of course, but I want to do it with AutoIt.

I also want to use AutoIt because I only need to look up some of the data in those large files. I do not process everything: after the search, the records I actually have to manage number no more than 400/500, far fewer than the large files contain.

My problem is knowing what a reasonable algorithm is for searching data within a file that has been loaded into an array for easier handling.

Thanks

Alberto
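
One common answer to the "reasonable algorithm" question is to build a lookup table keyed by the code once, so that each later search is a single hash lookup instead of a scan of the whole array. The sketch below is in Python for brevity and is only an illustration: the column name `code`, the sample rows, and the helper `build_index` are made up, not taken from the real files. In AutoIt the same idea can be had with a `Scripting.Dictionary` COM object.

```python
import csv
import io

# Hypothetical sample standing in for one of the large CSV files.
SAMPLE = """code,description
0000012,Widget
0000007,Gasket
0104500,Bracket
"""

def build_index(rows, key="code"):
    """Map each code to its row so a lookup needs no scan of the array."""
    index = {}
    for row in rows:
        # Keep codes as strings so leading zeros are preserved.
        # If a code appears twice, the later row replaces the earlier one.
        index[row[key]] = row
    return index

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
index = build_index(rows)

# One pass to build the index, then constant-time lookups on average.
hit = index.get("0000007")
miss = index.get("9999999")
print(hit["description"] if hit else "not found")
print(miss)
```

The data does not need to be sorted for this to work, which matters when the files arrive in arbitrary order.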


Are the data files sorted on the common field?


You are talking about that "there is a common field for all that allows me to identify them". Are the data records sorted by this field?


No, the records are not sorted by this field. It is a numeric "code" that tends to grow with every new product added to the range (the codes go from "0000000" to "110000").

This code is the only field in common among the 4 files I have to manage, and I intend to base my searches on it.

From the 4 files I take descriptions, quantities, stocks and prices; each file can contain one or more of these pieces of information.

The 4 files do not all have the same length, neither in number of records nor in number of fields.

The largest file is the one that contains the article master data; the other files may instead contain important dates and stock levels that I need in order to filter the records.

The files can be in CSV or Excel format.

I hope I have cleared things up a bit.

thanks

Alberto


If the files you want to process can be in Excel format you could use


You can import your 4 CSV files into 4 SQLite tables (an easy-to-use RDBMS engine directly manageable using AutoIt). Doing so is easy thanks to the command-line executable, SQLite3.exe.

Once all data is loaded in a database, you can create the final table consisting of all the columns you need and use SQL to merge data there according to your needs and rules. This (presumably a one-time task) will be easier if you use a good SQLite DB manager, like SQLite Expert Personal (freeware version).

When you have the final table(s) ready, you can use AutoIt to routinely manipulate/query/update your SQLite database.
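
As a rough illustration of the workflow described above, the sketch below loads two small CSV samples into SQLite tables and merges them on the common code with a LEFT JOIN. It is written in Python with the standard `sqlite3` module purely to keep it short and self-contained; the table names (`articles`, `stock`), the column names, the helper `load_csv`, and the sample data are all assumptions, and a real import would cover all four files. The SQL statements are plain SQLite SQL, so they should be usable from AutoIt or from SQLite3.exe as well.

```python
import csv
import io
import sqlite3

# Hypothetical samples standing in for two of the four CSV files.
ARTICLES_CSV = "code,description\n0000007,Gasket\n0000012,Widget\n"
STOCK_CSV = "code,quantity\n0000012,40\n"

def load_csv(conn, table, text):
    """Create `table` from the CSV header and insert every data row as text."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    cols = ", ".join(f'"{c}" TEXT' for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)

conn = sqlite3.connect(":memory:")
load_csv(conn, "articles", ARTICLES_CSV)
load_csv(conn, "stock", STOCK_CSV)

# An index on the join column keeps lookups fast on large tables.
conn.execute("CREATE INDEX idx_stock_code ON stock(code)")

# LEFT JOIN keeps every article, even those with no stock row.
merged = conn.execute(
    """
    SELECT a.code, a.description, s.quantity
    FROM articles AS a
    LEFT JOIN stock AS s ON s.code = a.code
    ORDER BY a.code
    """
).fetchall()

for row in merged:
    print(row)
```

Declaring the columns as TEXT is a deliberate choice here: it keeps codes such as "0000007" from losing their leading zeros.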

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


In post #8 you stated: "Files can be some csv or excel format."

Anyway, as jchd and I suggested, use a SQL database for "data analysis".


Good evening, and thanks for your suggestions, but they are too complex for my level of knowledge, and perhaps also too "powerful" for what I need.

I just have to find, within the four files, the data corresponding to each code I have (listed in the rows of a fifth file) and combine it all into one record. In practice I run four searches in the four files, and whenever a code matches mine I copy all the related data to a common file.

I hope I was clear, and I hope for some suggestions from you, both on the best technique for the search and on creating indexes, if they are needed.

Best Regards
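
For the procedure described above (take each wanted code, look it up in the four files, and write one combined record), a minimal sketch without a database could look like this. It is in Python only for illustration, and everything named here is hypothetical: the file contents, the column names, and the helpers `index_by_code` and `merge_for_codes`. Each file is indexed by code once, so the 400/500 wanted codes cost a handful of dictionary lookups each rather than four scans of up to 120,000 records.

```python
import csv
import io

# Hypothetical samples: four sources sharing only the "code" column.
SOURCES = {
    "articles": "code,description\n0000007,Gasket\n0000012,Widget\n",
    "prices": "code,price\n0000007,1.50\n0000012,9.90\n",
    "stock": "code,quantity\n0000012,40\n",
    "dates": "code,last_update\n0000007,2014-01-31\n",
}
WANTED = ["0000012", "0000007", "0099999"]  # codes to extract

def index_by_code(text):
    """Read CSV text and return {code: row} for direct lookups."""
    return {row["code"]: row for row in csv.DictReader(io.StringIO(text))}

def merge_for_codes(wanted, indexes):
    """Build one combined record per wanted code found in any source."""
    merged = []
    for code in wanted:
        record = {"code": code}
        for index in indexes:
            # Missing codes contribute nothing; found rows add their fields.
            record.update(index.get(code, {}))
        if len(record) > 1:  # skip codes absent from every source
            merged.append(record)
    return merged

indexes = [index_by_code(text) for text in SOURCES.values()]
records = merge_for_codes(WANTED, indexes)

# Write the combined records; absent fields are left empty.
fields = ["code", "description", "price", "quantity", "last_update"]
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fields, restval="", lineterminator="\n")
writer.writeheader()
writer.writerows(records)
print(out.getvalue())
```

The dictionaries play the role of the indexes asked about: they are built in one pass per file and need no sorting beforehand.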


If a database is too complex then maybe the problem is too complex as well :unsure:

To write good and manageable code you need an algorithm that is appropriate for your problem. To handle half a million records you need more than just a few loops reading through the files.

I would suggest importing the CSV files into a database and then doing all the necessary processing there.

Just my $0.02 worth.


Just like Water, I also stand by my answer. As you can see, we're willing to help you learn how to perform this kind of task elegantly and powerfully. Not only will you learn a lot, but experience shows that needs grow and get more complex over time. Putting this on a solid footing from the start is therefore an investment in the future.

But then, we can't (and won't) force your choice.

