searching using a reg expression


First, I know nothing about constructing a reg expression.

Second, I am guilty of not doing much research ahead of time. I am pressed for time in a project and I need a function for my script that will search a string and validate it has 2 of the exact matches, just in reverse order. Currently, the string being searched has to have an exact match to be true.

R146.1-V10.1

is the same as:

V10.1-R146.1

NOTE: That is a dash "-" not subtraction.

The "-" will always be the delimiter, if you will.

This is an example:

$iTrunk = _ArraySearch($npparray, $ppBranches[0], 0, 0, 0, 1, 1, 0)

The string is entered by the user into a database and loaded into both arrays. The values easily could have the order reversed.

What would the expression be, searching the value of $ppBranches[0], and how would I incorporate it into my search?

I'm thinking I only need to search for the alternate, using the reg expression, if the initial search returns -1 in $iTrunk.

If $iTrunk < 0 Then

search reg expression in array.

thanks in advance,

myids

PS: I promise to try and understand in the future how the reg expr works. :-)


  • Moderators

myids,

I am not altogether sure I have correctly understood your question, but if there is always a "-" delimiter in the string you are searching for, it is easy to reverse the order to produce the variant. Here is a short example which, if I did understand correctly, does what you want:

#include <Array.au3>

; This is what you are searching for
$sString = "R146.1-V10.1"

; Create array and fill (both normal and reversed versions are included)
Global $npparray[300]
For $i = 0 To 149
    $npparray[$i] = "R" & $i & ".1-V10.1"
    $npparray[$i + 150] = "V10.1-R" & $i & ".1"
Next

; Search for the "normal" version
_Search_Array($sString)

; Now reverse the order of the 2 elements
$aStringArray =  StringSplit($sString, "-")
$sString = $aStringArray[2] & "-" & $aStringArray[1]

; Now search for the "reversed" version
_Search_Array($sString)

Func _Search_Array($sString)

    $iTrunk = _ArraySearch($npparray, $sString, 0, 0, 0, 1, 1, 0)
    MsgBox(0, "Found", "Found " & $sString & " at index " & $iTrunk)

EndFunc

Does that fit the bill? Please ask again if not. :(

M23

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

My UDFs:

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 


Thanks.

No, it's not quite there. I thought about using the "-" as a delimiter to do something similar, but it seems like a lot of extra code for little work.

The values that I'm searching for could be all over the place: for example:

U36.A10-CR199.2, where the reverse, CR199.2-U36.A10, is exactly equivalent. The database to search could be extremely large.

I'd use StringSplit() function to break the string and put it back together.

$npparray = StringSplit( $npairs, "-", 2 )

will leave me with $npparray[0] and $npparray[1]

I don't think that is very efficient either, although I could resort to that.
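A minimal sketch of that split-and-swap fallback, assuming each array element holds exactly one pair and an exact (not partial) match is wanted. _FindPair is a hypothetical helper name, not part of Array.au3:

```autoit
#include <Array.au3>

; Hypothetical helper: look for $sPair as given, then with its halves swapped.
; Returns the index of the match, or -1 if neither order is present.
Func _FindPair(ByRef $aArray, $sPair)
    ; First try the string exactly as entered
    Local $iIndex = _ArraySearch($aArray, $sPair)
    If $iIndex >= 0 Then Return $iIndex

    ; Flag 2: no count in element 0, so the halves are [0] and [1]
    Local $aParts = StringSplit($sPair, "-", 2)
    If UBound($aParts) <> 2 Then Return -1 ; not a two-part pair

    ; Second try: the same pair in reverse order
    Return _ArraySearch($aArray, $aParts[1] & "-" & $aParts[0])
EndFunc

Local $aPairs[2] = ["R146.1-V10.1", "U36.A10-CR199.2"]
ConsoleWrite(_FindPair($aPairs, "CR199.2-U36.A10") & @CRLF)
```

The second search only runs when the first one fails, so the extra cost is paid only for pairs entered in reverse order (or not present at all).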

I was hoping for a regular expression. I've used them before, although I don't understand how to construct them.

I was hoping for use of this function somehow: StringRegExp()

Edited by myids

Granted regexps can be very powerful, but they require a very precise understanding of what has to be done. Regexps are a way to express complex rules in a compact way, but there must be rules to direct behavior.

Your input looks like battery types or something similar. You need to ask yourself which rules govern isolating one type from another. I don't see how your last example doesn't fit the bill of StringSplit using the '-' delimiter, but perhaps you can explain more about what you mean by "values all over the place".

If you're using a database for storing these values, why do you leave individual values glued together this way? You can use database access to do your search.

What size could be your database?

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


The database being searched is connected via an OLE object. It's not MySQL, SQLite, DBF, or any other database of that kind. The data is pulled using that application's API. When data is pulled, I'm temporarily storing it in arrays. There is a fixed version of the string that I am presented with. Next, the data strings to search for are entered into an Excel spreadsheet by a user. The order could possibly be reversed. The user knows the string as the same no matter which order is used. It all comes down to this: I need the exact string to find the specific information I ultimately need. Right now, if the user reverses the order, it's not the exact string.

In programming this, I don't want to have to care which is on the left side of the dash.

The examples I gave are part of a Printed Circuit Board database. The strings are called pin pairs. So the designations may be all over the place. Resistors and capacitors can be pretty common and many not knowing pcbs might know what those elements are. If R1.1 is connected to U14.13, U14.13 is connected to R1.1.

So, IF it's possible to use a regular expression, the left side of the "-" and the right side may be swapped in order. But if a user is looking for a pin pair, the node designation is exact (e.g. R146.2 is the only way the user can describe R146 pin 2) -- the format is always the same.

So, is it possible? In different terms I guess:

(some exact string) - (some other exact string)

or

(some other exact string) - (some exact string)

If they exist in the searched string together then the search is true.

thanks,

myids


  • Moderators

myids,

Part of the secret of regexes is knowing when not to use them. :)

I still believe this is a case in point, as jchd has already pointed out. Using StringSplit will very easily get you both versions of the pairing. Admittedly you then have to search for both strings - and this could take up to twice as long as a single search, but could well be a lot less if there is a match to be found. :(

I have not tested this, but you might like to investigate if it would be quicker to look for just one half of the pairing and then check those results to see if the other half is present. I am still unclear as to whether you are searching array elements which each contain one pairing or long strings with multiple pairings:

- if array elements, then the code would look a bit like this:

#include <Array.au3>

; This is what you are searching for
$sString = "R146.1-V10.1"

; Create array and fill (both normal and reversed versions are included)
Global $npparray[300]
For $i = 0 To 149
    $npparray[$i] = "R" & $i & ".1-V10.1"
    $npparray[$i + 150] = "V10.1-R" & $i & ".1"
Next

; Split the string
$aStringArray =  StringSplit($sString, "-")
$sReverseString = $aStringArray[2] & "-" & $aStringArray[1]
$sPartString = $aStringArray[1]

; Now search the array
$iStartIndex = 0
$fFound = False

; Loop until there is no match
While 1
    ; Find the part string
    $iTrunk = _ArraySearch($npparray, $sPartString, $iStartIndex, 0, 0, 1, 1, 0)
    If @error Then ExitLoop ; no further match (@error = 6) or start index past the end of the array
    ; We have a possible match
    If $npparray[$iTrunk] = $sString Or $npparray[$iTrunk] = $sReverseString Then
        MsgBox(0, "Found", "Found " & $sString & " at index " & $iTrunk)
        $fFound = True
    EndIf
    ; Continue search from the next element
    $iStartIndex = $iTrunk + 1

WEnd

If Not $fFound Then MsgBox(0, "Not Found", $sString & " was not found")

The $fFound lines are only there for the example because we know there are 2 matching elements.

Will that do the job? :)

M23


What I'd like to do using StringRegExp() function:

So, is this possible with a reg expression? If so, how do I construct it?

(some exact string) - (some other exact string)

or

(some other exact string) - (some exact string)

If they exist in the string being searched then the search is true.

thanks,

myids


Unfortunately for me, I do have experience with hardware design, hardware repair, hardware black and white magics and hardware documentation. I mean electronics-level hardware, as my "avatar" (made with Eagle for the purpose) shows. In front of me there is about 50K€ of Agilent and Tektronix lab equipment waiting for me to have time to use it :( So I do have a good idea of what a spare part listing for a product line can look like.

If you have a large base, why not use your own (?) local base to search in? My idea is that you'll have a better time using something along the lines of:

select * from mybase where refpart like '%U36.A10%'; to fetch whatever data field you need that comes along. This will happily match what a regexp or StringSplit would match on a single line, but comes with the bonus of indexing large volumes of data and easily organizing the surrounding data.

What I still don't understand is why you don't query your OLE base from AutoIt, and what the relationship is with Excel spreadsheets.

If you don't have a specific requirement to hide details and avoid exposing them in public, I would suggest you expose the larger picture of your practical problem, as we could then possibly come up with a much better overall solution than one single person would. Not that we are superior, but we come from various horizons and have a vast agglutinated experience in an incredible number of areas. I'll let you judge for yourself the range of domains where specialized questions asked on the forum have received a more global answer than the original poster would ever have thought possible (with AutoIt).


Thanks for the responses so far.

One reason I ask here is that I already know there are many programmers with vastly different areas and expertise. I simply write scripts that get critical data from a place the original application and API cannot supply or more quickly or the reason to use AutoIT, to try and automate time consuming procedures. I haven't thought I needed to know much about reg expressions but with what I do know, I thought the concept could solve my immediate problem.

I thought I've explained what I wanted to know. Simply, is there a way to use a reg expression. The examples were inserted simply to give explicit examples of the strings I am searching for. I would have been asked for them any way, no? IMO, technically, the specific examples weren't needed to find out if a reg expression can be used to compare a string with another one. I have a gut feel that it can, but, my ignorance leads me here in hopes someone can tell me.

If I can't use a reg expression, I already can program a few lines to solve it. But, if there is reg expression that can be formulated, it will be no more than 2 or 3 lines I expect and be faster than using some routine/function incorporating a stringsplit. Where the script takes it apart and reassembles to do another tedious compare.

Now, my specific programming problem, since you've asked. There is more details that I won't give because of time, but Excel is in place of an actual user interface. The users know how to enter data into a spreadsheet and I don't have time to create one anyway. AutoIT is a great tool for this because I can read what the user wants from the spreadsheet cells.

The OLE App, can't give me the data back the way it needs to be described. The API doesn't have the capability to do searches like your SQL example. I wish I could, believe me. True, I have not tried a complicated query type call that embeds the API functions. I'm not sure it would be possible anyway, since there is no SQL type db. Maybe in a future version I can pull data and create a db. That isn't happening right now though.

The combinations of the required information usually don't exist in the particular application. So the interface/spreadsheet is the method to describe exactly what is needed in one place. For example, I might need to know the physical length of the trace between 2 nodes + the length of another 2 nodes - pin pair1 + pin pair2. I don't want a report of 15000 nodes. I need to be able to let a user specify which pin pairs to get the length from, write it to the spreadsheet then Excel can do the addition or what ever the user wants to do with the data. What they want to see and what gets reported is all in one place. The way the string is entered by the user might be the reverse order than it is coming out of the API reporting.

If time weren't an issue, creating an interface would probably solve this whole issue because I could control what information and how it is formatted when comparing items with each other.

So, my question again, by ignoring examples and reasons:

I want to search for string A-B in a set of other strings. I want it to find B-A and report it as an equal. Is this possible with regular expressions?

Thanks in advance,

myids


untested but say you have variables

$first = "blah_1"

$second = "blah_2"

and your string to search is

$testString

and you want to find what you said

(some exact string) - (some other exact string)

or

(some other exact string) - (some exact string)

Then this will return 1 or 0 if match was found or not.

StringRegExp($testString,"(?:" & $first & "-" & $second & ")|(?:" & $second & "-" & $first & ")",0)

It will match blah_1-blah_2 or blah_2-blah_1

if you want to know which was matched you can have the last parameter 1 instead of 0, and add a () around the 2nd parameter like this

StringRegExp($testString,"((?:" & $first & "-" & $second & ")|(?:" & $second & "-" & $first & "))",1)

This will return an array. Because (?: ) is a non-capturing group, there is only 1 capturing group (), so there will be only 1 value in the array, at index 0. This will be whatever was matched and captured in that group: either blah_1-blah_2 or blah_2-blah_1.

Edited by ShawnW

Please don't take it badly when I or some other regular contributor asks for more information. This is certainly not to make posters appear like dumb puppets, but often because they/we suspect a better solution _could_ be found by placing the problem posted in a larger framework.

That said, looking up strings X-Y or Y-X type like you want with a regexp is easy if one doesn't look too deep:

Local $str = 'U36.A10-CR199.2'
Local $target1 = 'CR199.2'
Local $target2 = 'U36.A10'
If StringRegExp($str, "(" & $target1 & "-" & $target2 & ")|(" & $target2 & "-" & $target1 & ")") Then
 MsgBox(0, "Search", "Found")
Else
 MsgBox(0, "Search", "Not found")
EndIf

But special characters in the $target1/$target2 strings can lead to false positives. In regexp patterns, a dot '.' stands for "any character except a newline" (by default). So our simple-minded regexp will happily match $str = 'U36/A10-CR199*2', which is probably dangerous for a critical application.
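One way to guard against that, as a minimal sketch: wrap each literal in \Q...\E so the regexp engine treats it verbatim, and anchor the pattern with ^ and $ so only a whole-string match counts. This assumes the values never contain the sequence \E themselves; the variable names are only for illustration.

```autoit
Local $sFirst = "U36.A10"
Local $sSecond = "CR199.2"

; \Q...\E quotes the enclosed text, so "." means a literal dot here.
; ^ and $ force the whole string to be one of the two orderings.
Local $sPattern = "^(?:\Q" & $sFirst & "-" & $sSecond & "\E|\Q" & $sSecond & "-" & $sFirst & "\E)$"

; Reversed order of the same pair: StringRegExp returns 1
ConsoleWrite(StringRegExp("CR199.2-U36.A10", $sPattern) & @CRLF)

; Look-alike with other characters in place of the dots: StringRegExp returns 0
ConsoleWrite(StringRegExp("U36/A10-CR199*2", $sPattern) & @CRLF)
```

Without the anchors the pattern would also accept a longer string that merely contains the pair, which may or may not be wanted.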

Even the [bogus] simple-minded regexp above isn't very far, from an efficiency/complexity point of view, from the equivalent but much simpler and way more robust:

Local $str = 'U36.A10-CR199.2'
Local $target1 = 'CR199.2'
Local $target2 = 'U36.A10'
If $target1 & "-" & $target2 = $str Or $target2 & "-" & $target1 = $str Then
 MsgBox(0, "Search2", "Found")
Else
 MsgBox(0, "Search2", "Not found")
EndIf

My question was rather in the direction of: how are you going to organize a "very large database" of strings similar to $str in your program? You may say that's your problem, and you'll be close to reality :( saying so. OK, the $str will be in an array, possibly large. But the question shifts to loading this array into memory. Also scanning an array is linear in time and averages to half the array size while indexed lookup is logarithmic.
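To illustrate the indexed-lookup idea without a real database, here is a sketch that stores each pair under an order-independent key in a Scripting.Dictionary (a Windows COM object, hash-based rather than tree-based). _CanonicalPair and the sample data are hypothetical, and the sketch assumes every element holds exactly one "A-B" pair:

```autoit
; Hypothetical helper: put the two halves of "A-B" into a fixed order,
; so that "A-B" and "B-A" produce the same key.
Func _CanonicalPair($sPair)
    Local $aParts = StringSplit($sPair, "-", 2) ; flag 2: no count in element 0
    If UBound($aParts) <> 2 Then Return $sPair ; not exactly two halves: leave unchanged
    ; StringCompare returns a value > 0 when the first string sorts after the second
    If StringCompare($aParts[0], $aParts[1]) > 0 Then Return $aParts[1] & "-" & $aParts[0]
    Return $sPair
EndFunc

; Sample data standing in for the strings pulled from the API
Local $aPairs[3] = ["R146.1-V10.1", "U36.A10-CR199.2", "R1.1-U14.13"]

; Build the index once: canonical key -> array index
Local $oIndex = ObjCreate("Scripting.Dictionary")
For $i = 0 To UBound($aPairs) - 1
    Local $sKey = _CanonicalPair($aPairs[$i])
    If Not $oIndex.Exists($sKey) Then $oIndex.Add($sKey, $i)
Next

; Look up a pair that the user entered in reverse order
Local $sWanted = _CanonicalPair("CR199.2-U36.A10")
If $oIndex.Exists($sWanted) Then
    ConsoleWrite("Found at index " & $oIndex.Item($sWanted) & @CRLF)
Else
    ConsoleWrite("Not found" & @CRLF)
EndIf
```

Building the index costs one pass over the array; after that each lookup no longer scans the array. Note that dictionary keys are case-sensitive by default, so "r1.1" and "R1.1" would be different keys.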


Thanks ShawnW. That solves my generic question.

Next:

Could this still be in an _ArraySearch() function?

example:

$iTrunk = _ArraySearch($npparray, $ppBranches[0], 0, 0, 0, 1, 1, 0)

where $ppBranches[0] will be the object of the StringRegExp() function.

I'm not seeing it, if it can be since I'm not manually looping through the array.


Look at the code for _ArraySearch: you can derive your own _ArrayRegExp function based on it.
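A rough sketch of what such a function could look like. _ArrayRegExpSearch is a hypothetical name and a much simplified cousin of _ArraySearch (1D arrays only, forward search only), not the UDF's actual code:

```autoit
; Hypothetical helper: return the index of the first element matching $sPattern,
; starting at $iStart. Returns -1 and sets @error = 1 when nothing matches.
Func _ArrayRegExpSearch(ByRef $aArray, $sPattern, $iStart = 0)
    For $i = $iStart To UBound($aArray) - 1
        If StringRegExp($aArray[$i], $sPattern) Then Return $i
    Next
    Return SetError(1, 0, -1)
EndFunc

Local $aPairs[3] = ["R146.1-V10.1", "CR199.2-U36.A10", "R1.1-U14.13"]

; Either order of the pair, quoted with \Q...\E so the dots are literal
Local $sPattern = "^(?:\QU36.A10-CR199.2\E|\QCR199.2-U36.A10\E)$"

Local $iTrunk = _ArrayRegExpSearch($aPairs, $sPattern)
ConsoleWrite("Index: " & $iTrunk & @CRLF)
```

To collect every match rather than the first, call it again with $iStart set to the previous result plus one until it returns -1.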

Edit: OTOH, bulk compare simply with StringInStr on one part of the pattern (A or B) and test only those cases for equality with A-B or B-A.

But beware the false positives with special characters in patterns, like I said previously.

How is that better than the direct (exact) compare in my second example?

Edited by jchd
