glasglow

Break down a string

17 posts in this topic

Would you happen to know where I can start? I have a script that reads the text in documents, but I would like to take that text and break it down into smaller units. For example:

This is the example sentence I want to break down into chunks of say 3 words.

Would be broken down into this:

This is the

example sentence I

want to break

down into chunks

of say 3

words

The script would take the input text and break it down into smaller units, line by line. Is something like that possible? I will be using the program for a research project I call Anna. Anna stands for Associated Noun Negative Association: ideas existing or not existing between two people.

You can probably use StringSplit with a space as the delimiter, which returns each word in the string as an element of an array. Then you can group the array elements into threes. Start there.
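As a rough sketch of that idea: normalize the whitespace, split on single spaces, then put a line break after every third word. The function name ChunkWords and its parameter names are made up for this example; only StringSplit, StringStripWS, StringRegExpReplace and Mod are built-in AutoIt functions.

```autoit
; Sketch only: ChunkWords, $sText and $iPerLine are hypothetical names.
Func ChunkWords($sText, $iPerLine = 3)
    ; Turn every run of whitespace into one space, then strip
    ; leading (1) + trailing (2) whitespace, so StringSplit returns no empty words.
    Local $sClean = StringStripWS(StringRegExpReplace($sText, "\s+", " "), 3)
    Local $aWords = StringSplit($sClean, " ") ; element [0] holds the word count
    Local $sOut = ""
    For $i = 1 To $aWords[0]
        $sOut &= $aWords[$i]
        If $i < $aWords[0] Then
            ; Line break after every $iPerLine-th word, otherwise a single space.
            If Mod($i, $iPerLine) = 0 Then
                $sOut &= @CRLF
            Else
                $sOut &= " "
            EndIf
        EndIf
    Next
    Return $sOut
EndFunc

ConsoleWrite(ChunkWords("This is the example sentence I want to break down into chunks of say 3 words") & @CRLF)
```

Building one string keeps the example short. If you need the chunks separately, the same loop can fill an array instead.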


My Projects: [topic="89413"]GoogleHack Search[/topic], [topic="67095"]Swiss File Knife GUI[/topic], [topic="69072"]Mouse Location Pointer[/topic], [topic="86040"]Standard Deviation Calculator[/topic]

Thank you. Seems logical.

#4 ·  Posted (edited)

After a long headache of fiddling, I figured out a solution.

I think I'm looping the long way, but whatever. It is now a UDF that can split any string by words.

#include <Array.au3>

$String = "This is the example sentence I want to break down into chunks of say 3 words"
$Count = 3
$Answer = SplitByWord($String, $Count)
_ArrayDisplay($Answer)

; Returns an array of chunks with $Split words each; element [0] holds the chunk count.
Func SplitByWord($String, $Split)
    Local $Words = StringSplit($String, " ") ; $Words[0] holds the word count
    Local $Chunks = Ceiling($Words[0] / $Split)
    Local $Array[$Chunks + 1]
    Local $Index
    $Array[0] = $Chunks
    For $i = 1 To $Chunks
        $Array[$i] = ""
        For $add = 1 To $Split
            $Index = ($i - 1) * $Split + $add
            If $Index > $Words[0] Then ExitLoop ; the last chunk may be short
            $Array[$i] &= $Words[$Index] & " "
        Next
        $Array[$i] = StringTrimRight($Array[$i], 1) ; drop the trailing space
    Next
    Return $Array
EndFunc
Edited by Paulie

$string = "This is the example sentence I want to break down into chunks of say 3 words"
$count = 3
$answer = StringSplit($string, " ")
$max = $answer[0]
Dim $newstring = ""

For $i = 1 to $max
    If $i = $count + 1 Then
        $newstring = $newstring & @CRLF & $answer[$i] & " "
        $count = $count + 3
    Else
        $newstring = $newstring & $answer[$i] & " "
    EndIf
Next

MsgBox(0, "My new string is equal to:", $newstring)

Thanks, Paulie and Ealric. I kind of like Paulie's; it's the long way, but it seems more functional.

I know this can be done in one line. Regex is kicking my ass though. Should look close to:

MsgBox(0,"",StringRegExpReplace($String,"(\s){3}", @CRLF))

You could probably mold this to do it:

http://www.autoitscript.com/forum/index.ph...st&p=382143


Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.

Well I know you could use regex to split the sentence into groups of three words, but I thought it would be easier just to replace every third space with a carriage return.
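For what it's worth, here is a minimal sketch of that replace-every-third-space idea. It assumes the words are separated by whitespace, and it uses a capture group to keep each run of three words while swapping the whitespace after it for @CRLF. The variable names are just for illustration.

```autoit
; Sketch only: keep each run of three words (group 1) and replace the
; whitespace that follows it with a line break.
Local $sText = "This is the example sentence I want to break down into chunks of say 3 words"
Local $sLines = StringRegExpReplace($sText, "(\S+\s+\S+\s+\S+)\s+", "$1" & @CRLF)
ConsoleWrite($sLines & @CRLF)
```

A pattern like (\s){3} only matches three whitespace characters in a row, which is why the words themselves have to be part of the match. The leftover one or two words at the end are not followed by whitespace, so they stay on the last line unchanged.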

#10 ·  Posted (edited)

#include <Array.au3>
$string = "This is the example sentence I want to break down into chunks of say 3 words. Test"
$re = StringRegExp($string, '.*?\s+.*?\s+.*?\s|.*', 3)
_ArrayDisplay($re)

Edit: Hmm, I thought ((.*?\s+){3})|.* should also work, but it doesn't.

Edited by Xenobiologist

Scripts & functions Organize Includes Let Scite organize the include files

Yahtzee The game "Yahtzee" (Kniffel, DiceLion)

LoginWrapper Secure scripts by adding a query (authentication)

_RunOnlyOnThis UDF Make sure that a script can only be executed on ... (Windows / HD / ...)

Internet-Café Server/Client Application Open CD, Start Browser, Lock remote client, etc.

MultipleFuncsWithOneHotkey Start different funcs by hitting one hotkey different times

#include <Array.au3>
$string = "This is the example sentence I want to break down into chunks of say 3 words. Test"
$re = StringRegExp($string, '.*?\s+.*?\s+.*?\s|.*', 3)
_ArrayDisplay($re)

Not bad. Leaves an empty element at the end of the array but it works.

Okay

#include<Array.au3>
$string = "This is the example sentence I want to break down into chunks of say 3 words."
$re = StringRegExp($string, '(?U).*\s+.*\s+.*\s|.+?', 3)
_ArrayDisplay($re)

Works well.

I like this. Nice job.


Yes, that's a good one. I've never fully understood the capabilities of regex; seems it's about time I did.

Yeah! Have fun, and if you've got some examples, post them. I like to create the right patterns :-)


I like to create the right patterns

I'll quote it forever. There's something satisfying about the stress that comes.. almost have it.. 10 minutes.. 20 minutes.. 4 hours.. 10 hours.. 5am ... WAH.. it worked just like it worked in our mind when we started.
