Jump to content

How to get nth line text from a given text with Regexp


Recommended Posts

Hi all,

How to get nth line text from a given text with Regexp. For example, i have a text from a window which contains 15 lines.

And i need to get the text from 5th line (it may vary). How to do it. I have tried some patters like "(^ &)5" and "[^ $]{5}". But didn't work.

Spoiler

My Contributions

Glance GUI Library - A gui library based on Windows api functions. Written in Nim programming language.

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to comment
Share on other sites

Does it have to be RegExp?

I'd personally use _FileReadToArray(), and then use RegExp to extract the needed data from the line I need.

My Contributions: Unix Timestamp: Calculate Unix time, or seconds since Epoch, accounting for your local timezone and daylight savings time. RegEdit Jumper: A Small & Simple interface based on Yashied's Reg Jumper Function, for searching Hives in your registry. 

Link to comment
Share on other sites

I agree with Realm, regular expressions are generally for pattern matching, so it would be far simpler not to use a regular expression for this.

UDF List:

 
_AdapterConnections()_AlwaysRun()_AppMon()_AppMonEx()_ArrayFilter/_ArrayReduce_BinaryBin()_CheckMsgBox()_CmdLineRaw()_ContextMenu()_ConvertLHWebColor()/_ConvertSHWebColor()_DesktopDimensions()_DisplayPassword()_DotNet_Load()/_DotNet_Unload()_Fibonacci()_FileCompare()_FileCompareContents()_FileNameByHandle()_FilePrefix/SRE()_FindInFile()_GetBackgroundColor()/_SetBackgroundColor()_GetConrolID()_GetCtrlClass()_GetDirectoryFormat()_GetDriveMediaType()_GetFilename()/_GetFilenameExt()_GetHardwareID()_GetIP()_GetIP_Country()_GetOSLanguage()_GetSavedSource()_GetStringSize()_GetSystemPaths()_GetURLImage()_GIFImage()_GoogleWeather()_GUICtrlCreateGroup()_GUICtrlListBox_CreateArray()_GUICtrlListView_CreateArray()_GUICtrlListView_SaveCSV()_GUICtrlListView_SaveHTML()_GUICtrlListView_SaveTxt()_GUICtrlListView_SaveXML()_GUICtrlMenu_Recent()_GUICtrlMenu_SetItemImage()_GUICtrlTreeView_CreateArray()_GUIDisable()_GUIImageList_SetIconFromHandle()_GUIRegisterMsg()_GUISetIcon()_Icon_Clear()/_Icon_Set()_IdleTime()_InetGet()_InetGetGUI()_InetGetProgress()_IPDetails()_IsFileOlder()_IsGUID()_IsHex()_IsPalindrome()_IsRegKey()_IsStringRegExp()_IsSystemDrive()_IsUPX()_IsValidType()_IsWebColor()_Language()_Log()_MicrosoftInternetConnectivity()_MSDNDataType()_PathFull/GetRelative/Split()_PathSplitEx()_PrintFromArray()_ProgressSetMarquee()_ReDim()_RockPaperScissors()/_RockPaperScissorsLizardSpock()_ScrollingCredits_SelfDelete()_SelfRename()_SelfUpdate()_SendTo()_ShellAll()_ShellFile()_ShellFolder()_SingletonHWID()_SingletonPID()_Startup()_StringCompact()_StringIsValid()_StringRegExpMetaCharacters()_StringReplaceWholeWord()_StringStripChars()_Temperature()_TrialPeriod()_UKToUSDate()/_USToUKDate()_WinAPI_Create_CTL_CODE()_WinAPI_CreateGUID()_WMIDateStringToDate()/_DateToWMIDateString()Au3 script parsingAutoIt SearchAutoIt3 PortableAutoIt3WrapperToPragmaAutoItWinGetTitle()/AutoItWinSetTitle()CodingDirToHTML5FileInstallrFileReadLastChars()GeoIP databaseGUI - Only Close ButtonGUI ExamplesGUICtrlDeleteImage()GUICtrlGetBkColor()GUICtrlGetStyle()GUIEventsGUIGetBkColor()Int_Parse() & Int_TryParse()IsISBN()LockFile()Mapping CtrlIDsOOP in AutoItParseHeadersToSciTE()PasswordValidPasteBinPosts Per DayPreExpandProtect GlobalsQueue()Resource UpdateResourcesExSciTE JumpSettings INISHELLHOOKShunting-YardSignature CreatorStack()Stopwatch()StringAddLF()/StringStripLF()StringEOLToCRLF()VSCROLLWM_COPYDATAMore Examples...

Updated: 22/04/2018

Link to comment
Share on other sites

Actually, this text is from an edit control. And i need to process as fast as possible. My first method was to split the string and put it in an array and get my desired line with line number. In that case, i need to verify that there is no error while putting it to array and gettng from array. My assumption is that fileread and write are slower than array functions. Correct me if i am wrong.

Spoiler

My Contributions

Glance GUI Library - A gui library based on Windows api functions. Written in Nim programming language.

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to comment
Share on other sites

Test it and don't make assumptions.

UDF List:

 
_AdapterConnections()_AlwaysRun()_AppMon()_AppMonEx()_ArrayFilter/_ArrayReduce_BinaryBin()_CheckMsgBox()_CmdLineRaw()_ContextMenu()_ConvertLHWebColor()/_ConvertSHWebColor()_DesktopDimensions()_DisplayPassword()_DotNet_Load()/_DotNet_Unload()_Fibonacci()_FileCompare()_FileCompareContents()_FileNameByHandle()_FilePrefix/SRE()_FindInFile()_GetBackgroundColor()/_SetBackgroundColor()_GetConrolID()_GetCtrlClass()_GetDirectoryFormat()_GetDriveMediaType()_GetFilename()/_GetFilenameExt()_GetHardwareID()_GetIP()_GetIP_Country()_GetOSLanguage()_GetSavedSource()_GetStringSize()_GetSystemPaths()_GetURLImage()_GIFImage()_GoogleWeather()_GUICtrlCreateGroup()_GUICtrlListBox_CreateArray()_GUICtrlListView_CreateArray()_GUICtrlListView_SaveCSV()_GUICtrlListView_SaveHTML()_GUICtrlListView_SaveTxt()_GUICtrlListView_SaveXML()_GUICtrlMenu_Recent()_GUICtrlMenu_SetItemImage()_GUICtrlTreeView_CreateArray()_GUIDisable()_GUIImageList_SetIconFromHandle()_GUIRegisterMsg()_GUISetIcon()_Icon_Clear()/_Icon_Set()_IdleTime()_InetGet()_InetGetGUI()_InetGetProgress()_IPDetails()_IsFileOlder()_IsGUID()_IsHex()_IsPalindrome()_IsRegKey()_IsStringRegExp()_IsSystemDrive()_IsUPX()_IsValidType()_IsWebColor()_Language()_Log()_MicrosoftInternetConnectivity()_MSDNDataType()_PathFull/GetRelative/Split()_PathSplitEx()_PrintFromArray()_ProgressSetMarquee()_ReDim()_RockPaperScissors()/_RockPaperScissorsLizardSpock()_ScrollingCredits_SelfDelete()_SelfRename()_SelfUpdate()_SendTo()_ShellAll()_ShellFile()_ShellFolder()_SingletonHWID()_SingletonPID()_Startup()_StringCompact()_StringIsValid()_StringRegExpMetaCharacters()_StringReplaceWholeWord()_StringStripChars()_Temperature()_TrialPeriod()_UKToUSDate()/_USToUKDate()_WinAPI_Create_CTL_CODE()_WinAPI_CreateGUID()_WMIDateStringToDate()/_DateToWMIDateString()Au3 script parsingAutoIt SearchAutoIt3 PortableAutoIt3WrapperToPragmaAutoItWinGetTitle()/AutoItWinSetTitle()CodingDirToHTML5FileInstallrFileReadLastChars()GeoIP databaseGUI - Only Close ButtonGUI ExamplesGUICtrlDeleteImage()GUICtrlGetBkColor()GUICtrlGetStyle()GUIEventsGUIGetBkColor()Int_Parse() & Int_TryParse()IsISBN()LockFile()Mapping CtrlIDsOOP in AutoItParseHeadersToSciTE()PasswordValidPasteBinPosts Per DayPreExpandProtect GlobalsQueue()Resource UpdateResourcesExSciTE JumpSettings INISHELLHOOKShunting-YardSignature CreatorStack()Stopwatch()StringAddLF()/StringStripLF()StringEOLToCRLF()VSCROLLWM_COPYDATAMore Examples...

Updated: 22/04/2018

Link to comment
Share on other sites

Are you sure the text consists of lines? Meaning, does it actually contain @CR and/or @LF characters? If so, then this should work:

$text = "hello1" & @CRLF & "hello2" & @CRLF & "hello3" & @CRLF & "hello4" & @CRLF & "hello5"
$array_of_lines = StringRegExp($text, "(?m)^.*$", 3)
MsgBox(0, 0, "Third line: " & $array_of_lines[2])

(Note that the third line is array element 2, as the array result of that StringRegExp call is 0-based.)

Of course, if speed is paramount you should benchmark all possible approaches.

Roses are FF0000, violets are 0000FF... All my base are belong to you.

Link to comment
Share on other sites

Hi 

SadBunny, Thanks . Let me try this .
Spoiler

My Contributions

Glance GUI Library - A gui library based on Windows api functions. Written in Nim programming language.

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to comment
Share on other sites

@

SadBunny, It's working. What i understood from that regular expression is;

1. " ^.*$  "

^ = start point of the line

.* = any text can come after start point

$ = end point point of the line. 

Thant means this regex will catch a complete line.

But i don't know what is meant by (?m) 

Spoiler

My Contributions

Glance GUI Library - A gui library based on Windows api functions. Written in Nim programming language.

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to comment
Share on other sites

@

SadBunny, Instead of making an array, does regexp can catch the exact line text only. ?
Spoiler

My Contributions

Glance GUI Library - A gui library based on Windows api functions. Written in Nim programming language.

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to comment
Share on other sites

It's possible without making an array of all the lines:

$text = "this is some text on line 1" & @CRLF & "text on line 2" & @CRLF & "another line of text (line 3)" & @CRLF & "line 4" & @CRLF & "this is the final line (5)"

For $linenum = 1 to 5
    $pattern = "^(?:[^\n]*\n){" & $linenum - 1 & "}([^\n]*)"
    $test = StringRegExp($text, $pattern, 1)
    ConsoleWrite('Text on line ' & $linenum & ' is: "' & StringStripCR($test[0]) & '"' & @CRLF)
Next
Link to comment
Share on other sites

@mpower, If we know the line number (say 4) then do we need a loop ?

Spoiler

My Contributions

Glance GUI Library - A gui library based on Windows api functions. Written in Nim programming language.

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to comment
Share on other sites

Of course no:

Local $sText = "this is some text on line 1" & @CRLF & "text on line 2" & @LF & "another line of text (line 3)" & @CRLF & "line 4" & @LF & "this is the final line (5)"
Local $sLine4 = StringRegExpReplace($sText, "(?m)((?:.*\R){3})(.*)(?s)(.*)", "$2")
ConsoleWrite($sLine4 & @LF)

Just make sure that the line number is < 65535 else you need something slightly more complex.

Also, using R copes with line breaks being CR or LF or CRLF as shwon here.

Edited by jchd

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

Obviously not....it was just an example. You don't need a loop I was just showing that I can get any number line.

$text = "this is some text on line 1" & @CRLF & "text on line 2" & @CRLF & "another line of text (line 3)" & @CRLF & "line 4" & @CRLF & "this is the final line (5)"
$linenum = 4
$pattern = "^(?:[^\n]*\n){" & $linenum - 1 & "}([^\n]*)"
$test = StringRegExp($text, $pattern, 1)
ConsoleWrite('Text on line ' & $linenum & ' is: "' & StringStripCR($test[0]) & '"' & @CRLF)

EDIT: jchd beat me to it by milliseconds :)

Edited by mpower
Link to comment
Share on other sites

@jchd Thanks a lot.

@mpower thanks a lot. 

Let me try these ideas.

Spoiler

My Contributions

Glance GUI Library - A gui library based on Windows api functions. Written in Nim programming language.

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to comment
Share on other sites

@

mikell, you mean, this regexp will delete all other lines from a text and give the  text from nth line ?
Spoiler

My Contributions

Glance GUI Library - A gui library based on Windows api functions. Written in Nim programming language.

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to comment
Share on other sites

@

mikell, But yoour code is giving an empty string. 
Spoiler

My Contributions

Glance GUI Library - A gui library based on Windows api functions. Written in Nim programming language.

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Link to comment
Share on other sites

I used jchd's example to test it

Local $sText = "this is some text on line 1" & @CRLF & "text on line 2" & @LF & "another line of text (line 3)" & @CRLF & "this is line 4" & @LF & "this is the line (5)" & @CRLF & "text on line 6"
msgbox(0,"", $sText)
Local $sLine4 = StringRegExpReplace($sText, "(?:.*\R){3}(.*)(?s).*", "$1")
msgbox(0,"", $sLine4)

But if the 4th line is empty it will return nothing

Local $sText = "this is some text on line 1" & @CRLF & "text on line 2" & @LF & "another line of text (line 3)" & @CRLF & @CRLF & "this is line 4" & @LF & "this is the line (5)" & @CRLF & "text on line 6"
msgbox(0,"", $sText)
Local $sLine4 = StringRegExpReplace($sText, "(?:.*\R){3}(.*)(?s).*", "$1")
msgbox(0,"", $sLine4)
Link to comment
Share on other sites

Local $sText = "this is some text on line 1" & @CRLF & "text on line 2" & @LF & "another line of text (line 3)" & @CRLF & "this is line 4" & @LF & "this is the line (5)" & @CRLF & "text on line 6"


$line = 4


msgbox(0, '' , stringsplit($sText , @LF)[$line])

Edited by boththose

,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

×
×
  • Create New...