StringRegExp beating StringInStr() and C++ iostream by miles


Calling StringInStr($text, "substr") on a big text (the one I use) takes about 150-200 milliseconds on average to scan through the whole text for the string.

The following C++ example averages about 15-30 milliseconds.

#include "stdafx.h"   // Visual Studio precompiled header
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int _tmain(int argc, _TCHAR* argv[])
{
    string line;
    size_t found;

    ifstream myfile("C:/Users/xxx/xxx/xxx/x/c++examples/io/iotexst/iotexst/file.xml");
    if (myfile.is_open())
    {
        // Loop on getline itself so a failed read at end of file is never processed
        while (getline(myfile, line))
        {
            // The search string is chosen so that it never matches (worst case: full scan)
            found = line.find("something that cant be found so we put a real test");

            if (found != string::npos)
                cout << "first 'name' found at: " << found << endl;
        }
        myfile.close();
    }
    else
        cout << "Unable to open file" << endl;
    return 0;
}

But most interesting of all: on the same file, read into a string, If StringRegExp(readed_text, "substr", 0) finishes in 3-8 milliseconds on average!

My first guess is that StringRegExp is that much faster because of the way it figures out that the text can't be found: by first trying to match the starting characters of the substring.

Edited by Aktonius

They should be using the KMP algorithm, so I am very surprised they are that slow, particularly the C++ code. At the end of the day the regex has to be compiled, so it should always be possible to write equivalent or faster code in a low-level language.
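
For reference, a minimal sketch of a KMP (Knuth-Morris-Pratt) substring search in C++. The function name kmp_find is made up for illustration, and nothing in this thread confirms that StringInStr or std::string::find actually use KMP; this only shows the technique being referred to.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the index of the first occurrence of
// `pattern` in `text`, or std::string::npos if there is none.
std::size_t kmp_find(const std::string& text, const std::string& pattern)
{
    if (pattern.empty())
        return 0;

    // failure[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    std::vector<std::size_t> failure(pattern.size(), 0);
    for (std::size_t i = 1, k = 0; i < pattern.size(); ++i)
    {
        while (k > 0 && pattern[i] != pattern[k])
            k = failure[k - 1];
        if (pattern[i] == pattern[k])
            ++k;
        failure[i] = k;
    }

    // Scan the text once; on a mismatch, fall back in the pattern
    // instead of moving backwards in the text.
    for (std::size_t i = 0, k = 0; i < text.size(); ++i)
    {
        while (k > 0 && text[i] != pattern[k])
            k = failure[k - 1];
        if (text[i] == pattern[k])
            ++k;
        if (k == pattern.size())
            return i + 1 - k;
    }
    return std::string::npos;
}
```

The text index never moves backwards, so the scan is linear in the text length plus the pattern length.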

Edited by Mat

on a big text

Have you considered the impact of caching? Try timing the same scan 1000 times for each method and see what the results are.
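
A rough sketch of such a repeated measurement on the C++ side, assuming the text is already in memory as a std::string. The names FindTiming and time_find are made up for illustration, and the same idea would need to be repeated in AutoIt for a fair comparison.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical result type: average time per scan and how many scans matched.
struct FindTiming
{
    double average_ms;
    std::size_t hits;
};

// Hypothetical helper: runs text.find(needle) `repeats` times and
// returns the average wall-clock time of one scan in milliseconds.
FindTiming time_find(const std::string& text, const std::string& needle, int repeats)
{
    using clock = std::chrono::steady_clock;

    std::size_t hits = 0; // returned to the caller so the loop has an observable result
    const auto start = clock::now();
    for (int i = 0; i < repeats; ++i)
    {
        if (text.find(needle) != std::string::npos)
            ++hits;
    }
    const auto stop = clock::now();

    const std::chrono::duration<double, std::milli> total = stop - start;
    return FindTiming{ repeats > 0 ? total.count() / repeats : 0.0, hits };
}
```

Calling time_find(text, "substr", 1000) after loading the file once keeps disk access and first-run cache effects out of the measured loop.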


  • 2 weeks later...

If you read the whole text and then scan it, instead of doing sequential reads, the C++ example will be much quicker.

I'm guessing, for comparison, that your AutoIt script doesn't do something like:

While Not @error
    StringRegExp(FileReadLine($file), ...)
WEnd
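
The whole-text approach suggested above could look roughly like this in C++: read the stream into one string, then search it once. The names read_all and find_in_stream are made up for illustration, and whether this actually closes the gap on the original file is untested here.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads everything left in `in` into a single string.
std::string read_all(std::istream& in)
{
    std::ostringstream buffer;
    buffer << in.rdbuf(); // copy the whole stream buffer in one go
    return buffer.str();
}

// Hypothetical helper: one search over the whole text instead of one per line.
// Returns the match position, or std::string::npos if there is no match.
std::size_t find_in_stream(std::istream& in, const std::string& needle)
{
    const std::string text = read_all(in);
    return text.find(needle);
}
```

With a file, this would be called as find_in_stream(myfile, "substr") on an opened std::ifstream. Note that the returned position is an offset into the whole text, not into a single line as in the original example.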
