Jump to content

How to find first and last word of sentence with Regular Expression


Go to solution Solved by jguinch,

Recommended Posts

Hi all,

How to find first and last word in a sentence with Regular expression ?. Sentence is the code from SciTE. For example if sentence is;

"For $i = 0 to 15" 

I need to extract "For from the sentence

And the same way i need last word from a sentence

"If Apple = 15 Then" 

I need "Then" from the sentence.

My Contributions

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to post
Share on other sites

The most important thing for a working solution is to define the meaning of "word".

You need to define what delimits a word. Space, comma, bracket etc.

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2022-02-19 - Version 1.6.1.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download
Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki
PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki
Task Scheduler (NEW 2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki

Standard UDFs:
Excel - Example Scripts - Wiki
Word - Wiki

Tutorials:
ADO - Wiki
WebDriver - Wiki

 

Link to post
Share on other sites

Hi all,

How to find first and last word in a sentence with Regular expression ?. Sentence is the code from SciTE. For example if sentence is;

"For $i = 0 to 15" 

I need to extract "For from the sentence

And the same way i need last word from a sentence

"If Apple = 15 Then" 

I need "Then" from the sentence.

 

Give this a whirl:

$sentence = "For $i = 0 to 15"
$sentence2 = "If Apple = 15 Then"

$firstword = StringRegExp($sentence, "(?:^|(?:\.\s))(\w+)", 1)
$lastword = StringRegExp($sentence, "\s(\w+)$", 1)

ConsoleWrite($firstword[0] & @CRLF)
ConsoleWrite($lastword[0] & @CRLF)

$firstword = StringRegExp($sentence2, "(?:^|(?:\.\s))(\w+)", 1)
$lastword = StringRegExp($sentence2, "\s(\w+)$", 1)

ConsoleWrite($firstword[0] & @CRLF)
ConsoleWrite($lastword[0] & @CRLF)

Sources: http://lmgtfy.com/?q=regex+first+word+in+sentence    and     http://lmgtfy.com/?q=regex+last+word+in+sentence

Edited by mpower
Link to post
Share on other sites

$sentence = "For $i = 0 to 15"
$sentence2 = "If Apple = 15 Then"

msgbox( 0 , '' , _firstlast($sentence))
msgbox( 0 , '' , _firstlast($sentence2))



Func _firstlast($string)

$aString = stringsplit($string , " ")
return $aString[1] & @CRLF & $aString[ubound($aString) - 1]

EndFunc

,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

Link to post
Share on other sites

I have googled a lot and tried something with the RegExp_GUI. Thanks for AutoIt for giving me a nice program for testing regular expresions.

@Water, In this case a word which separated with space only.

@boththose and @mpower, thanks. Let me try.

My Contributions

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to post
Share on other sites
  • Solution

#Include <Array.au3>

$sentence2 = "If Apple = 15 Then"

$words = StringRegExp($sentence2, "(?m)\W*(\w+).*?(\w+)\W*$", 1)
_ArrayDisplay($words)

Here is a small explanation :

(?m) : multiline mode. With this mode, ^ and $ operates on each line instead of the whole string

W* : any non-word character, 0 or more times

(w+) : capturing group. any word character, one or more times

.*? : anything, one or more times. ? takes the smallest occurence

$ : end of line or end of string

Edited by jguinch
Link to post
Share on other sites

@

jguinch, Thanks. There is more than one anser which needs to mark solved. What to do.

My Contributions

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to post
Share on other sites

Mark the one which is easiest for you to underrstand.

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2022-02-19 - Version 1.6.1.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download
Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki
PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki
Task Scheduler (NEW 2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki

Standard UDFs:
Excel - Example Scripts - Wiki
Word - Wiki

Tutorials:
ADO - Wiki
WebDriver - Wiki

 

Link to post
Share on other sites

@water,

Actually regular expression code is always horrible for me. So there is no easiest thing. I have googled a lot for what this "?" stands for . I didn't get any proper answer. So i have downloaded O'reilly's Regular expression Nutshell. And started reading. 

My Contributions

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to post
Share on other sites

If you feel bad with regex, why don't you use the code from boththose ?

It works nice - if you just add a StringStripWS

$sentence = "    If Apple = 15 Then  "

msgbox( 0 , "" , _firstlast($sentence))

Func _firstlast($string)
  $aString = stringsplit(StringStripWS($string, 3) , " ")
  return $aString[1] & @CRLF & $aString[$aString[0]]
EndFunc

Edit

BTW in the case below all the codes on this page will fail  :)

$sentence = "   If Apple = 15 Then   ; this is a condition "
Edited by mikell
Link to post
Share on other sites

@

mikell, I think RegExp code works faster. And i had a wish to learn regular expression. May be it is harder. But i need to learn it. 

My Contributions

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to post
Share on other sites

@

mikell, So either Regular expression is for humans or you are an alien.  :)

My Contributions

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to post
Share on other sites

@mikell Then please suggest me a good and free e-book or pdf book to learn regular expression. I am complete beginner.

My Contributions

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to post
Share on other sites
  • Moderators

kcvinu,

I always recommend this site. But the learning curve is steep and stays that way - when mikell says "I've been a regex student for a long time " he really means it! :D

M23

Public_Domain.png.2d871819fcb9957cf44f4514551a2935.png Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

Open spoiler to see my UDFs:

Spoiler

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 

Link to post
Share on other sites

@Melba23 , Thank you. I know that. 

My Contributions

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to post
Share on other sites

@Mikell, Thanks. :)

My Contributions

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By RAMzor
      Hi everyone,
      I have this string:
      "main_lot      0x111” & @CRLF & “main_version          0xABC” & @CRLF & “main_number 0xDEAD123” & @CRLF & “main_version          0x333"
      And I'm trying to extract one specific hexadecimal number, actually main_version from this string by using StringRegExp:
      How to get 'ABC' from it?
      I'm not sure if the original string uses @CRLF, @CR or @LF as a line breaks (received from linux over ssh plink.exe) I have tried this code but it doesn't work 
      #include <Array.au3> $sLog = "main_lot 0x111” & @CRLF & “main_version 0xABC” & @CRLF & “main_number 0xDEAD123” & @CRLF & “main_version 0x333" $aVer = StringRegExp($sLog, "main_version\h*(.+)(?:0[xX][[:xdigit:]])", 3) _ArrayDisplay($aVer)  
       
    • By Tosyk
      Hi,
      Please help me to change metasymbol line. Right now I have this condition code:
      If StringInStr($_sName, 'TEXT ') Then $_sName = StringRegExpReplace($_sName, '(^.*)\TEXT (.*)$', '$2') $_sName = StringRegExpReplace($_sName, '(^.*)\ (.*)$', '$1') If Not CheckIsSave_($_sName) Then It work fine with this text file and finds each line which start from 'TEXT':
      Material B7E671143D244B ==================================== TEXT 2F3139D816C34D 1 TEXT B6A968EF2505A2 1 TEXT 35206697A04F91 1 TEXT EB485AF490D83D 1 TEXT 0DAB42294BD9B3 1 TEXT 3D6525BEE360E1 0 Material D6906B886B06E3 ==================================== TEXT 0CCECCCCFB62AE 1 TEXT 1E14CB29AB43F0 1 TEXT FB7F0DCE9B5950 1 But I have a new text file now the lines of which now are start with 0:, 1: and so on:
      sm_0 --------------- 0: dummy_gray 1: c_com_socksa_mt 2: c_com_socksa_tn 3: dummy_white 4: default_z 5: dummy_nmap 6: --- 7: --- sm_1 --------------- 0: c_com_prisoner_shoes_di 1: c_com_prisoner_shoes_mt 2: c_com_prisoner_shoes_tn 3: dummy_white 4: default_z 5: c_com_leatherb_rt 6: --- 7: --- how to change (or add) the condition code above to work with new text file?
      I'm trying to change this script: http://autoit-script.ru/threads/poisk-fajlov-rekursivno-po-dannomu-spisku.26970/post-148646
       
    • By seadoggie01
      I'm trying to capture everything after a "#ToDo" in my scripts. I got that like this:
      (?i)[^\v]*#todo(.*) But then I thought it would be nice to use underscores to continue the ToDo... kind of like this:
      #ToDo: This is a really long explanation about something _ # that is very in-depth and needs to take up a lot of _ # space in a ToDo comment Global $variables = "Bad" I can't seem to capture everything... and maybe I'm trying to do too much with Regex... I keep trying variations of this:
      Condensed Version: (?im)[^\v]*#todo(?:([^\v]*)_\s*)*#([^\v]*) Expanded with comments (?ixm)(?# Ignore case, ignore newlines in Regex, use multiline option)# [^\v]*(?# Match leading space/s)# \#todo(?# Match the #todo)# (?:([^\v]*)_\s*)*(?# Match lines ending with _)# \#([^\v]*)(?# Last line only, no _'s)# I never seem to be able to build an array well with Regex... I saw something once about not being able to capture repeated patterns, and I think that's my issue
    • By genius257
      Inspired by PHP's preg_split.
      Split string by a regular expression.
      Also supports the same flags as the PHP equivalent.
      v1.0.1
       
      Example:
      #include "StringRegExpSplit.au3" StringRegExpSplit('splitCamelCaseWords', '(?<=\w)(?=[A-Z])') ; ['split', 'Camel', 'Case', 'Words']  
    • By RAMzor
      Hi guys I need your help.
      I have string like this : "TDM111A5,      RCT222Y5/ 7  ; FDT444E4 /8 , ABC222R5"
      I need find a coma or semicolon and delete white spaces before and after them
      The output should be a string and/or array 
      String : "TDM111A5,RCT222Y5/ 7;FDT444E4 /8,ABC222R5"
      Array:
      TDM111A5
      RCT222Y5/ 7
      FDT444E4 /8
      ABC222R5
×
×
  • Create New...