Proper String Management


Hey fellow code enthusiasts!

I have a slight problem with my code at the moment. Unfortunately it is a private website, my apologies for any inconveniences this may cause.

Purpose of Code: Read the body document from the website, and capture a city (in this case "Los Angeles") in a variable (With NO spaces before or after it).

This must be done without using absolute position of a string, because there are different cities that are different in length.

My Code:

Global $sCity
Global $oIE = _IECREATE ("http://www.fakewebsite.com/")

Func getInfo()
    $sText = _IEBodyReadText ($oIE) ;Retrieve text from body of page
    $sCity = StringInStr ($sText, "City : ")

EndFunc

Website HTML Body Text Window:

State : California

City : Los Angeles

_____________________________________

I'm not certain if there are spaces behind the word "Angeles", or if it just skips to a new line. All I know is that Angeles is the last word on that line. Can anybody lend a helping hand? Thank you!! :)


Please provide a few short examples of strings like they appear in the text. A regexp will probably do nicely.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


Please provide a few short examples of strings like they appear in the text. A regexp will probably do nicely.

I'm not quite sure I follow. I'm relatively new to the coding world. Sorry :)


  • Moderators

Manic,

The easiest way is to use a Regular Expression. This should do what you want: :)

#include <Array.au3> ; Just for display

; This is the text you get from the website
$sText = "Blahblahblah" & @CRLF & _
         "City : Los Angeles" & @CRLF & _
         "Blahblahblah" & @CRLF & _
         "City : Chicago" & @CRLF & _
         "Blahblahblah"

; Extract the city names
$aCities = StringRegExp($sText, "City\x20:\x20(.*)(?:\v|\z)", 3)

; And display them
_ArrayDisplay($aCities)

The SRE works like this:

City\x20:\x20 - look for City[SPACE]:[SPACE]
(.*)          - extract any number of characters
(?:\v|\z)     - until we get to EOL or EOF - the ?: just tells it not to return the value even though it is within ()

All clear?

M23

Edit: jchd was asking whether the lines in the text read exactly as you showed in your first post and as I have used in my example. If not, then we might need to change the SRE. ;)

Edited by Melba23

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

My UDFs:

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 


To expand on what both Melba23 and I were referring to: to be fully functional in all _your_ situations, we have to craft a regular expression pattern which is certain to capture exactly what you want. The "City :" prefix should work fine, but the question of where to end the capture is still open.

Anyway, you can always remove leading, duplicate, and/or trailing whitespace using StringStripWS. See the Help file for how to use it.
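
For reference, the built-in that strips whitespace is StringStripWS. Here is a minimal sketch of trimming a captured value; the sample string is made up, and the numeric flag 3 (1 = leading plus 2 = trailing) is used so that no include file is assumed:

```autoit
; Hypothetical captured value with stray whitespace and a line break around it
Local $sRaw = "   Los Angeles  " & @CRLF

; Flag 3 = strip leading (1) + trailing (2) whitespace; inner spaces are kept
Local $sCity = StringStripWS($sRaw, 3)

; Brackets make any leftover whitespace visible
ConsoleWrite("[" & $sCity & "]" & @CRLF)
```

Adding 4 to the flag would also collapse runs of spaces inside the name, which may or may not be wanted for city names.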


Wow again quick responses, I love this forum already! M23, I just took a peek @ the helpfile and that looks very close if not bang on to what I want.

Although, I'm not sure I follow some of this:

(?:\v|\z) - until we get to EOL or EOF - the ?: just tell it not to return the value even though it is within ()

And yes, the text for City : Los Angeles

appears just like that when I read the body html.

@jchd: I want to end the capture after the last character of the city. For example:

If the document read... City : New York

the last character I want to capture is the "k" in "York". Does this help?

Thanks again you guys, I really appreciate it!

Edited by Manic

:)

There is no such thing as "the last character of the city name". That's only human understanding.

What is of interest here is what comes _after_ the name, or any condition denoting the end of the name. If whitespace, newlines, or anything else may appear there, we need to rely on that to avoid making those characters part of the name. In my Zipcode DB, I have city names as long as 175 characters, which might be "given" to you (or me) on more than one line, for instance.

In your case, try Melba23's code and see what gives.
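
One way to keep trailing blanks out of the capture is to let the pattern consume them itself. The following is only a sketch: it assumes an AutoIt version whose regex engine supports \h and \R, it assumes each city sits on a single line, and the sample text (including the deliberate trailing spaces) is made up:

```autoit
; Made-up sample text; note the trailing spaces after "Los Angeles"
Local $sText = "State : California" & @CRLF & _
        "City : Los Angeles   " & @CRLF & _
        "City : New York"

; \h*        - optional horizontal whitespace (spaces or tabs)
; (.*?)      - the name, captured lazily so trailing blanks are left out
; (?:\R|\z)  - any newline sequence, or the end of the text
Local $aCities = StringRegExp($sText, "City\h*:\h*(.*?)\h*(?:\R|\z)", 3)

If @error Then
    ConsoleWrite("No city found" & @CRLF)
Else
    ; Brackets make any leftover whitespace visible
    For $i = 0 To UBound($aCities) - 1
        ConsoleWrite("[" & $aCities[$i] & "]" & @CRLF)
    Next
EndIf
```

This does not cover names that wrap onto a second line; that case would need a different end condition.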

Edited by jchd


  • Moderators

Manic,

I'm not sure I follow some of this: "(?:\v|\z) - until we get to EOL or EOF - the ?: just tell it not to return the value even though it is within ()"

Normally, putting a group in () makes it a capturing group - that means it is returned as part of the thing you are searching for. We need to have this group in () because we want the match to end on either EOL or EOF - that is what the | character does. However, we do not want to capture the actual value, so we start the group with ?: which tells the engine to treat this as a non-capturing group even though it is in (). :)
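
To see the difference in practice, here is a small sketch that runs the same pattern once with a capturing group at the end and once with a non-capturing one. The sample text and variable names are made up for the illustration:

```autoit
Local $sText = "City : Los Angeles" & @CRLF & "City : Chicago"

; Capturing group at the end: the terminator is returned alongside each city
Local $aWith = StringRegExp($sText, "City\x20:\x20(.*)(\v|\z)", 3)

; Non-capturing group: only the city names are returned
Local $aWithout = StringRegExp($sText, "City\x20:\x20(.*)(?:\v|\z)", 3)

; The first array should come back larger because of the extra group
ConsoleWrite("Capturing:     " & UBound($aWith) & " elements" & @CRLF)
ConsoleWrite("Non-capturing: " & UBound($aWithout) & " elements" & @CRLF)
```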

Brain hurting yet? If not, you have not understood! ;)

M23


@ jchd: Ok cool. Thank you very much for your input!

@ Melba23: I think my brain just exploded :) Aside from that, it works as you told it to. This is progress indeed! The next step is trying to use the Html Body Text while utilizing this headache-inducing technique you two call "StringRegExp". I'm going to attempt to do this (and hopefully succeed!).

Thank you both so much. I'm less than a day old, and I've gotten answers that would've stumped me for weeks! If there's anything else I have trouble with on this problem, I'll be sure to post in here again. You'll be seeing a lot more of me around here. Thanks again! ;)

Edit: As a side note, do either of you know of a good tutorial for RegExp that is compliant with the syntax of AutoIt? The AutoIt Help file is good, but it is a little too technical for a simple guy like me.

Edited by Manic

  • Moderators

Manic,

If you want to learn about SREs, I recommend this site. It has helped me a lot while I got to my current pretty low level of understanding - I still use it regularly for more complex structures. :)

M23


OK I GIVE. :) This code is driving me crazy. I'm stuck on getting one of those cities out of the array and into its own variable... Although something tells me it has to do with _ArrayToString. Ideas?


Show your code and you'll be accompanied home.


The array that StringRegExp returns in Melba23's example already holds the values.

So $aCities[0] holds "Los Angeles" and $aCities[1] holds "Chicago".

If you are unsure how many cities will be returned, you are best off working with the array elements.
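
As a rough sketch of copying one element into its own variable, with a check for the no-match case; the sample text is made up and only the first match is used:

```autoit
; Made-up sample text standing in for the page body
Local $sText = "Region : California" & @CRLF & "City : Los Angeles"
Local $sCity = ""

Local $aCities = StringRegExp($sText, "City\x20:\x20(.*)(?:\v|\z)", 3)
If Not @error Then
    ; Element 0 is the first city found; trim it in case a stray CR or space came along
    $sCity = StringStripWS($aCities[0], 3)
EndIf

ConsoleWrite("City = " & $sCity & @CRLF)
```

Checking @error straight after the call matters here, because indexing the return value when nothing matched would stop the script with an error.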


Alrighty, this is my code.

#include <IE.au3> ;for _IE()
Global $sRegion ;State or Province
Global $sZip
Global $sCity

$oIE = _IECreate("HTTP://FAKESITE.com")

Func getInfo()
    $sText = _IEBodyReadText($oIE) ;Retrieve text from body of page
    $result = StringInStr($sText, "(approx.)", 0, 1, 1) ;position 552
    $result += 12 ;brings the position to 564, the beginning of the text for ZIP
    $next = StringMid($sText, $result, 5) ;extracts the zip code
    $sZip = $next ;set zip to global variable $sZip
    $sRegion = StringRegExp($sText, "Region\x20:\x20(.*)(?:\v|\z)", 3) ;find way to pull the region into a variable
    $sCity = StringRegExp($sText, "City\x20:\x20(.*)(?:\v|\z)", 3) ;find way to extract the city into a variable

    MsgBox(1, "Test", "Result :" & $sRegion & $sCity)

EndFunc

Now from my understanding, in getInfo(), the $sRegion and $sCity are being turned into arrays. The $sZip works fine, because it's always going to be (in this case) 5 characters long, so I didn't have to worry about Regexp. If I am correct about these values being turned into arrays, how do I transfer these arrays properly into strings?

edit: I didn't run the function for this example, hence why I left out getInfo()

edit: I had region/city keywords mixed up.

Edited by Manic

  • Moderators

Manic,

JohnOne has already given you the answer - use the array elements. Try this: ;)

$sRegion = StringRegExp($sText, "Region\x20:\x20(.*)(?:\v|\z)", 3) ; pull the region into an array
$sCity = StringRegExp($sText, "City\x20:\x20(.*)(?:\v|\z)", 3) ; extract the city into an array

For $i = 0 To Ubound($sRegion) - 1 ; Sets the beginning and end values to use
    MsgBox(1, "Test", "Result : " & $sRegion[$i] &  " - " & $sCity[$i]) ; Displays the elements of the arrays
Next

EndFunc

The above assumes that there are equal numbers of elements in each array - i.e. there are as many "Region"s found as "City"s. If this is not the case, you might end up with errors, but we can deal with that in due course. :)
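
If the counts might differ, one possible guard is to loop only over the pairs that exist in both arrays. This is a sketch under assumptions: the sample text is made up (one region, two cities), and the counter names are hypothetical:

```autoit
; Made-up sample text with one region but two cities
Local $sText = "Region : California" & @CRLF & _
        "City : Los Angeles" & @CRLF & _
        "City : San Diego"

Local $iRegions = 0, $iCities = 0

Local $aRegion = StringRegExp($sText, "Region\x20:\x20(.*)(?:\v|\z)", 3)
If Not @error Then $iRegions = UBound($aRegion)

Local $aCity = StringRegExp($sText, "City\x20:\x20(.*)(?:\v|\z)", 3)
If Not @error Then $iCities = UBound($aCity)

; Use the smaller count so neither array is indexed out of bounds
Local $iPairs = $iRegions
If $iCities < $iPairs Then $iPairs = $iCities

For $i = 0 To $iPairs - 1
    ConsoleWrite($aRegion[$i] & " - " & $aCity[$i] & @CRLF)
Next

If $iRegions <> $iCities Then
    ConsoleWrite("Warning: " & $iRegions & " region(s) but " & $iCities & " city/cities found" & @CRLF)
EndIf
```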

M23


Melba,

the code did indeed pull the correct City and State, and echo them into that MsgBox. Also, there is only one instance of each on the page. I don't fully understand the loop. Instead of just displaying the elements from the array, can I store them into their respective variables? I need to manipulate them elsewhere. :) lol.

I don't know why, but I have a nagging feeling that I'm going to need to use the _ArrayToString function. Thoughts?

Again, thanks so much!!


  • Moderators

Manic,

The _ArrayToString function just puts all of the elements from an array into one single variable - and I do not think that is what you want at all. ;)

The elements of the array are already strings, so you can use them directly as I did in the MsgBox line. How is your understanding of arrays? Perhaps the Arrays tutorial in the Wiki might be useful here so that you see how to use the elements individually. ;)
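
A short sketch of the difference, using a hand-built array in place of the StringRegExp result (the variable names are made up):

```autoit
#include <Array.au3> ; needed for _ArrayToString

; Stand-in for the array that StringRegExp would return
Local $aCities[2] = ["Los Angeles", "Chicago"]

; _ArrayToString joins every element into one string (default delimiter is "|")
ConsoleWrite(_ArrayToString($aCities) & @CRLF)

; A single element is already a plain string and can be assigned and used directly
Local $sFirstCity = $aCities[0]
ConsoleWrite(StringUpper($sFirstCity) & @CRLF)
```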

What are you trying to do exactly? :)

M23


I always prefer arrays for grouping data of the same type, in this case "Cities".

Manic,

If you want to learn about SREs, I recommend this site. It has helped me a lot while I got to my current pretty low level of understanding - I still use it regularly for more complex structures. :)

M23

I will look at the tutorial too; it seems very explanatory. Thanks

Edited by monoscout999

Finally, got it! Thank you all so much! :) Huggles for all!

Melba I would love to tell you about it, but I am sworn to secrecy ;) Sorry!

This has helped me advance tons in my project, even though I am far from complete. It gives me hope that I can actually accomplish something when this forum has such good community support. Thanks you guys, I really appreciate everything you've done for me. I'll be back when I run into my next problem (which will probably be in the near future because I am a noob ;) But I will do my best to not take up too much valuable time from you guys :D

Anyways that's all I've got for now! Cheers!
