DynamicRookie, posted May 24, 2018 (edited)

Hey there! I've been developing an artificial intelligence. My first hard task is letting the AI know when a sentence is already in memory but worded differently.

What I tried to do is simple. First, get all the words in the user's sentence that could be used as an identifier.
Example: Steve Jobs

Then identify the purpose of the sentence from the words found in the previous "for" loop.
Example: Do/Know/You/Who/Steve/Jobs

Compare the example against the following matching sentences in memory:
1-Steve jobs was a known person
2-Do you know who barack obama is?
3-Do you know Steve jobs?
4-Do you know who steve jobs is?
5-How much money steve jobs had

Then find the sentence that has far more matches than the others. Remember that if the identifier words (Steve Jobs) are not found, the sentence is invalid. Every sentence has a different answer, and it is important that the right one is chosen. If no more than half of the words match, then assign the result of the function to a variable, like a return value but stored in a global variable.

I couldn't figure out how to do that with StringRegExp. I honestly need help with detecting identifiers in the memory sentences.

I would also like the AI to recognize typos, meaning that "moeny" and "money" mean the same thing.

Any help is hugely appreciated.
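The matching described in this post can be sketched roughly as follows. This is only an illustration under assumptions: `_Normalize` and `_BestMatch` are hypothetical names, the identifier words are passed in by hand rather than detected, and "more than half of the input words must match" is one reading of the acceptance rule in the post. It uses plain string functions instead of StringRegExp for the word comparison.

```autoit
; Hypothetical helper: lower-case, strip punctuation, collapse spaces.
Func _Normalize($sText)
    Local $sClean = StringRegExpReplace(StringLower($sText), "[^a-z0-9 ]", "")
    Return StringStripWS($sClean, 7) ; 1 + 2 + 4: leading, trailing and doubled spaces
EndFunc

; Returns the zero-based index of the best matching sentence in $aMemory, or -1.
; $sIdentifier holds the words that must all be present, e.g. "Steve Jobs".
Func _BestMatch($sInput, $aMemory, $sIdentifier)
    Local $aWords = StringSplit(_Normalize($sInput), " ", 2) ; 2 = no count in element [0]
    Local $aIds = StringSplit(_Normalize($sIdentifier), " ", 2)
    Local $iBest = -1, $iBestScore = 0, $iScore, $bValid, $sMem
    For $i = 0 To UBound($aMemory) - 1
        ; Pad with spaces so that only whole words are matched.
        $sMem = " " & _Normalize($aMemory[$i]) & " "
        ; A sentence without every identifier word is invalid.
        $bValid = True
        For $j = 0 To UBound($aIds) - 1
            If Not StringInStr($sMem, " " & $aIds[$j] & " ") Then $bValid = False
        Next
        If Not $bValid Then ContinueLoop
        ; Count the input words that also occur in the memory sentence.
        $iScore = 0
        For $j = 0 To UBound($aWords) - 1
            If StringInStr($sMem, " " & $aWords[$j] & " ") Then $iScore += 1
        Next
        If $iScore > $iBestScore Then
            $iBestScore = $iScore
            $iBest = $i
        EndIf
    Next
    ; Assumed rule: accept only if more than half of the input words matched.
    If $iBestScore * 2 <= UBound($aWords) Then Return -1
    Return $iBest
EndFunc

Local $aMemory[5] = ["Steve jobs was a known person", "Do you know who barack obama is?", _
        "Do you know Steve jobs?", "Do you know who steve jobs is?", "How much money steve jobs had"]
ConsoleWrite(_BestMatch("Do you know who Steve Jobs is", $aMemory, "Steve Jobs") & @CRLF) ; 3
```

With the five example sentences, the fourth one (index 3) matches all seven input words, so it wins over "Do you know Steve jobs?", which matches five. The Obama sentence is skipped because it lacks the identifier words.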
KickStarter15, posted May 24, 2018

9 hours ago, DynamicRookie said: "My first hard task was letting the A.I know when a sentence is found in memory with different words"

Is your memory database an Excel file that you validate against?

9 hours ago, DynamicRookie said: "Compare the example in the following matching sentences in memory. 1-Steve jobs was a known person 2-Do you know who barack obama is? 3-Do you know Steve jobs? 4-Do you know who steve jobs is? 5-How much money steve jobs had"

Secondly, if this is database validation, then it would be easy with the Excel UDF by Water. But of course, you need GUICtrlRead() to read the input made by the user for the word validation.
IAMK, posted May 24, 2018 (edited)

For the typos, you have a few ways, depending on how lenient you want to make it.

1- The simplest would probably be to count the letters in each word (maybe store the counts in a data structure), so "apple" would be 1a, 2p, 1l, 1e. Then you do a search and find a match in the word "papel". Problem: anagrams. You could also make it accept a difference of 1 (leniency).

2- Check that the two words match position by position, but when a letter does not match, also check the +1/-1 character of the word for that letter, in case they swtich it like I did for the word "switch". E.g. s-w-t-i-c-h and s-w-i-t-c-h: 0=0, 1=1, 2!=2, check 2+1=2 (2 and 3 are now found, so you can skip 3), 4=4, 5=5.
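The first method (letter counting with leniency) could look roughly like this. It is a minimal sketch, not IAMK's code: `_LetterCountDiff` is a hypothetical name, and only the letters a to z are counted.

```autoit
; Returns the total difference in letter counts between two words (a-z only).
; 0 means both words use exactly the same letters, so anagrams also return 0.
Func _LetterCountDiff($sA, $sB)
    Local $aCount[26], $iCode, $iDiff = 0
    For $i = 0 To 25
        $aCount[$i] = 0
    Next
    $sA = StringLower($sA)
    $sB = StringLower($sB)
    ; Add one for every letter of the first word ...
    For $i = 1 To StringLen($sA)
        $iCode = Asc(StringMid($sA, $i, 1)) - 97 ; 97 = Asc("a")
        If $iCode >= 0 And $iCode <= 25 Then $aCount[$iCode] += 1
    Next
    ; ... and subtract one for every letter of the second word.
    For $i = 1 To StringLen($sB)
        $iCode = Asc(StringMid($sB, $i, 1)) - 97
        If $iCode >= 0 And $iCode <= 25 Then $aCount[$iCode] -= 1
    Next
    ; Whatever is left over is the mismatch.
    For $i = 0 To 25
        $iDiff += Abs($aCount[$i])
    Next
    Return $iDiff
EndFunc

ConsoleWrite(_LetterCountDiff("moeny", "money") & @CRLF) ; 0: same letters
ConsoleWrite(_LetterCountDiff("apple", "papel") & @CRLF) ; 0: the anagram problem
ConsoleWrite(_LetterCountDiff("apple", "aple") & @CRLF)  ; 1: one missing letter
```

Accepting a result of 0 or 1 gives the "difference of 1" leniency mentioned above, at the price of also accepting anagrams.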
FMS, posted May 24, 2018

Maybe off topic for your question, but I am also building an AI (code here). Maybe we can compare some code? I'm curious about your take on the math.
DynamicRookie (author), posted May 25, 2018

13 hours ago, FMS said: "maybe of topic on your question but I am also building AI (code here) Maybe we can compare some code ? I'm curious on your take on the math"

Honestly, I'm not especially skilled. I'm just trying to keep my code simple so debugging is easier. So far I have managed self-learning, memory checking, unsupervised learning, GUI, inputs, memory arrays, weights, neurons, temporal lobes (this lets the AI know that "My name is Rookie" is what is referred to if the user says "What's my name?"; I have taken it up to 1000 messages, and the sentences are lightly encrypted for privacy and saved in a file), and a check of whether the AI is original.

The AI is interesting to talk to, but my objective is to make it entertaining, not only to use but also to code, like a hobby.

I checked your AI thread and it looks really interesting, but hard to debug. I'd appreciate it if you explained to me how it works, because I'm really curious!
Andreik, posted May 25, 2018

AI is a complex concept and is usually a mixture of many algorithms combined. For what you need, a good start would be to play with Levenshtein distance. This is an algorithm for measuring how similar two strings are.
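For reference, a minimal sketch of the Levenshtein distance in AutoIt, using the standard dynamic-programming table. The function name `_Levenshtein` is hypothetical; this is not taken from any UDF mentioned in the thread.

```autoit
; Levenshtein distance: the minimum number of single-character insertions,
; deletions and substitutions needed to turn $sA into $sB.
; Note: AutoIt's "=" compares strings case-insensitively.
Func _Levenshtein($sA, $sB)
    Local $iLenA = StringLen($sA), $iLenB = StringLen($sB)
    Local $aDist[$iLenA + 1][$iLenB + 1]
    Local $iCost, $iMin
    ; Distance from or to the empty string is just the prefix length.
    For $i = 0 To $iLenA
        $aDist[$i][0] = $i
    Next
    For $j = 0 To $iLenB
        $aDist[0][$j] = $j
    Next
    For $i = 1 To $iLenA
        For $j = 1 To $iLenB
            If StringMid($sA, $i, 1) = StringMid($sB, $j, 1) Then
                $iCost = 0
            Else
                $iCost = 1
            EndIf
            $iMin = $aDist[$i - 1][$j] + 1 ; deletion
            If $aDist[$i][$j - 1] + 1 < $iMin Then $iMin = $aDist[$i][$j - 1] + 1 ; insertion
            If $aDist[$i - 1][$j - 1] + $iCost < $iMin Then $iMin = $aDist[$i - 1][$j - 1] + $iCost ; substitution
            $aDist[$i][$j] = $iMin
        Next
    Next
    Return $aDist[$iLenA][$iLenB]
EndFunc

ConsoleWrite(_Levenshtein("moeny", "money") & @CRLF)    ; 2
ConsoleWrite(_Levenshtein("kitten", "sitting") & @CRLF) ; 3
```

A swapped pair of letters such as "moeny" costs 2 in plain Levenshtein, so a typo threshold would have to allow for that, for example by accepting distances up to 2 for longer words.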
DynamicRookie (author), posted May 25, 2018

13 hours ago, Andreik said: "AI is a complex concept and usually is a mixture of many algorithms combined. For what you need a good start would be to play with levenshtein distance. This is an algorithm to check the similarities of strings."

Yeah, I have thought of that, but I made an extremely easy system that only took me about 10 minutes. It checks the letters of each string: every letter has a value, so we replace all the letters with their values and sum them. That means that, for example, "msas" and "mass" are both worth 16. It is a very easy system. I'll also add a check for duplicated letters, missing letters, and extra letters.

About the learning system: it is designed to change topic as soon as the sentence the user provided is not found in memory, then tell another person (or the same user in a different session) the sentence the other user said, and store the answer.
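A small sketch of the letter-sum idea, to show how it behaves. The post does not say which value each letter has, so this assumes a = 1 through z = 26 (which gives a different total than the 16 quoted above); `_LetterSum` is a hypothetical name.

```autoit
; Sum of letter values, assuming a = 1 ... z = 26 (an assumption, see above).
Func _LetterSum($sWord)
    Local $iSum = 0, $iCode
    $sWord = StringLower($sWord)
    For $i = 1 To StringLen($sWord)
        $iCode = Asc(StringMid($sWord, $i, 1)) - 96 ; Asc("a") = 97
        If $iCode >= 1 And $iCode <= 26 Then $iSum += $iCode
    Next
    Return $iSum
EndFunc

; Swapped letters give the same sum, as intended ...
ConsoleWrite(_LetterSum("mass") & " " & _LetterSum("msas") & @CRLF) ; 52 52
; ... but so do unrelated words whose values happen to add up equally.
ConsoleWrite(_LetterSum("ad") & " " & _LetterSum("bc") & @CRLF)     ; 5 5
```

Under these assumed values, equal sums are necessary but not sufficient for two words to be the same, so the sum is probably best used as a quick pre-filter before a stricter comparison.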
FMS, posted May 27, 2018 (edited)

Quote: "I'd appreciate if you explained me how it works, cause i'm really curious!"

The AI.au3 you see checks (in this case) handwritten digits and calculates which digit it thinks each one most likely is.

The files beneath AI.au3 are the MNIST database (handwritten digits and their "targets"). The targets are there for the backward propagation.

The handwritten digits are on a grid of 28 by 28, with a black/white value of 0 to 255. In the beginning you have one array (or vector) of 784 (28*28) values of 0-255, and you propagate forward (with matrix math, E4A) to an array[10] (one element for each digit). You will see what I mean if you click the label "normelized input".

Pseudo code:
(input[0]*weight[0]) + (input[1]*weight[1]) + (...) + bias = output of hiddenlayer1-node1

What you are talking about doesn't sound like a proper AI (code that thinks for itself instead of working through predefined code), or am I wrong? If not, could I see your feed function?
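The pseudo code above, written out for a single node as a plain loop. This is a sketch only: FMS's script uses matrix math for the whole layer, the name `_NodeOutput` is hypothetical, and the inputs and weights below are made-up values (the inputs are assumed to be pixel values already scaled from 0-255 down to 0-1).

```autoit
; Weighted sum for one hidden-layer node: sum(input[i] * weight[i]) + bias.
Func _NodeOutput($aInput, $aWeight, $fBias)
    ; Both arrays must have the same length.
    If UBound($aInput) <> UBound($aWeight) Then Return SetError(1, 0, 0)
    Local $fSum = $fBias
    For $i = 0 To UBound($aInput) - 1
        $fSum += $aInput[$i] * $aWeight[$i]
    Next
    Return $fSum
EndFunc

Local $aIn[3] = [0.0, 0.5, 1.0]  ; three made-up, normalized pixel values
Local $aW[3] = [0.2, -0.4, 0.6]  ; made-up weights
; 0.0*0.2 + 0.5*(-0.4) + 1.0*0.6 + 0.1, i.e. about 0.5
ConsoleWrite(_NodeOutput($aIn, $aW, 0.1) & @CRLF)
```

For the real network the same sum would run over all 784 inputs, once per node in the layer.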
DynamicRookie (author), posted May 28, 2018 (edited)

11 hours ago, FMS said: "code who thinks for it self instead of working out a code"

What do you define as thinking by itself?

About the feedback system: it checks a list with an array containing words that can be used to identify the meaning of the sentence the user typed. Then, after getting the class of the sentence, it calculates an answer from the users' responses to questions of that class and from the word matches between those responses and the actual user input, by calculating every word's value and choosing the word with the highest number of matches. Then the class, sentence, and answer get encrypted and saved in a file.

But the main point of my AI is not chatting. What I want to do with it is a really original concept, and I don't feel like sharing it yet.

Still, I would like to let you know that I learned AU3 programming not long ago and I work on it as a hobby, so if you could teach me anything it would be really appreciated. That's in fact why I chose the name Rookie.
FMS, posted May 28, 2018 (edited)

Quote: "Still, i would like to let you know i learned AU3 Programming not long ago, i work on it as a hobbie, so if you could teach me anything it would be really appreciated, that's in fact why i chose the name Rookie"

I'm not a big AU3 programmer either, and I can't help you if I don't know what you are doing wrong. For that I need to see the code you are having problems with.

Edit:

Quote: "What do you define by thinking by itself?"

In my case I have a set of handwritten digits to test with and to adjust the weights. After all the tests are done and the weights have found their local minimum, you can put in "new" handwritten digits and the AI predicts the correct output. I mean: the answer is not predefined, and it still gets the correct output.
mortog, posted September 9, 2018

I was working on something similar and tackled it as follows:

Trained sentences:
- Strip capital letters, dots, etc. from the input. Call a function whose name is the remainder; the function has a bunch of variables attached to it (reply, grammar analysis (see below), etc.).

New sentences:
- Split the input into 3 parts: the original, the grammatical analysis, and the word roots.
- For the grammatical parts, I used TreeTagger and had it create a file containing the input, the grammatical analysis, and the roots of the words:
  ShellExecuteWait("C:\TreeTagger\bin\tag-english.bat", $InputFile & " > " & $OutputFile, @SW_HIDE)
- Build a 2D array of the info in $OutputFile (_DelimFile_To_Array2D).
- Build a loop that uses _ArrayFindAll to check whether a specific value of the input matches the trained sentences, and build up an array of the results (ideally you should do this for the original input, the grammar, and the roots, so you can compare the data).
- Get the highest matching results.
For this I used a function I found on these forums, which I edited to suit my needs:

    #include <Array.au3> ; _ArrayDelete, _ArrayInsert, _ArrayMaxIndex

    ;===============================================================================
    ;
    ; FunctionName:     _ArrayMode()
    ; Description:      Returns the most frequently occurring elements in the array
    ; Syntax:           _ArrayMode( $aArray [, $iStart] )
    ; Parameter(s):     $aArray - The ByRef array to find the mode of
    ;                   $iStart - (optional) The first index to check for data, default is 0
    ; Return Value(s):  On success returns a 1D array:
    ;                       [0] = Mode (number of instances of most common data element)
    ;                       [1] = First mode element
    ;                       [2] = Second mode element (with same mode count as first)
    ;                       [n] = Last mode element (with same mode count as first)
    ;                   On failure returns 0 and sets @error
    ; Author(s):        jon8763; modified by PsaltyDS; further modified by Mortog
    ; Note:             Requires _ArrayElements() from the original forum post (not shown here).
    ;                   As modified below, the value returned on success is the index given by
    ;                   _ArrayMaxIndex(), not the 1D array described above.
    ;===============================================================================
    Func _ArrayMode(ByRef $aArray, $iStart = 0)
        ; Get list of unique elements
        Local $aData = _ArrayElements($aArray, $iStart)
        If @error Then Return SetError(@error, 0, 0)
        If $aData[0] = 0 Then Return $aData

        ; Setup to use SOH as delimiter
        Local $SOH = Chr(01), $sData = $SOH

        ; Setup for number of dimensions
        Local $iBound1 = UBound($aArray) - 1, $Dim2 = False, $iBound2 = 0
        If UBound($aArray, 0) = 2 Then
            $Dim2 = True
            $iBound2 = UBound($aArray, 2) - 1
        EndIf

        ; Assemble data string for searching
        For $m = $iStart To $iBound1
            If $Dim2 Then
                ; 2D
                For $n = 0 To $iBound2
                    $sData &= $aArray[$m][$n] & $SOH
                Next
            Else
                ; 1D
                $sData &= $aArray[$m] & $SOH
            EndIf
        Next

        ; Check count of each unique element listed in $aData, highest count kept in $aCounts[0]
        Local $aCounts[$aData[0] + 1] = [0], $aRegExp[1]
        For $n = 1 To $aData[0]
            $aRegExp = StringRegExp($sData, $SOH & $aData[$n] & $SOH, 3)
            $aCounts[$n] = UBound($aRegExp)
            If $aCounts[$n] > $aCounts[0] Then $aCounts[0] = $aCounts[$n]
        Next

        ; Create 2D array with unique numbers and their occurrences
        _ArrayDelete($aCounts, 0)
        Local $o = _ArrayMaxIndex($aCounts)
        _ArrayDelete($aCounts, $o)
        _ArrayInsert($aCounts, 0)
        _ArrayInsert($aCounts, $o + 1)
        Local $k = UBound($aCounts)
        Local $aCounts_2d[$k][2]
        For $i = 1 To $k Step +1
            _ArrayInsert($aCounts_2d, $i - 1, $aCounts[$i - 1] & "|" & $aData[$i - 1])
        Next
        _ArrayDelete($aCounts_2d, 0)
        Local $aRET = _ArrayMaxIndex($aCounts_2d, 0, 0, UBound($aData), 0)
        Return $aRET
    EndFunc   ;==>_ArrayMode

Attach scores to all the values based on their grammatical analysis and start comparing the input to your trained material.

Some additional thoughts you should consider:

Let's assume you have the following trained: how are you doing?
and you have the input: how are you doing today?

You may consider the input as having a total score of 6, being 6 strings. The trained material would have a score of 5, and you can decide whether you'd like your script to remove adjectives, conjunctions and adverbs one by one to see if it gets a match with your trained data. Even better would be to have different values for different types of words; for example, verbs could have a score of 1 and adverbials of time a score of 0.7. This would probably make finding matches more accurate, and easier.

The challenging part, though, is that the word "today" does imply something: you could be expected to have had a bad experience yesterday, for example. The system described here is also easy for single sentences, but you'll need to build on it considerably if you want it to be able to process multiple sentences (and, for example, recognize a sentence with multiple questions).
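The idea of giving different word types different values could be sketched as below. Only the two weights (1 for verbs, 0.7 for adverbials of time) come from the post; the tag names, the default weight of 1, and the functions `_TagWeight` and `_WeightedMatch` are assumptions for illustration, and the tags are typed in by hand rather than read from TreeTagger output.

```autoit
; Hypothetical weight per word class; only 1 and 0.7 are taken from the post.
Func _TagWeight($sTag)
    Switch $sTag
        Case "verb"
            Return 1
        Case "time"
            Return 0.7
        Case Else
            Return 1 ; assumed default for all other classes
    EndSwitch
EndFunc

; $aWords and $aTags are parallel arrays describing the input sentence.
; Returns the share (0 to 1) of the input's total weight that also occurs
; as a whole word in the trained sentence $sTrained.
Func _WeightedMatch($aWords, $aTags, $sTrained)
    Local $fTotal = 0, $fHit = 0, $fW
    Local $sPadded = " " & StringLower($sTrained) & " "
    For $i = 0 To UBound($aWords) - 1
        $fW = _TagWeight($aTags[$i])
        $fTotal += $fW
        If StringInStr($sPadded, " " & StringLower($aWords[$i]) & " ") Then $fHit += $fW
    Next
    If $fTotal = 0 Then Return 0
    Return $fHit / $fTotal
EndFunc

Local $aWords[5] = ["how", "are", "you", "doing", "today"]
Local $aTags[5] = ["adverb", "verb", "pronoun", "verb", "time"]
; Matched weight 4 out of a total of 4.7, roughly 0.85
; (an unweighted word count would give 4 / 5 = 0.8).
ConsoleWrite(_WeightedMatch($aWords, $aTags, "how are you doing") & @CRLF)
```

Because "today" carries less weight than the other words, leaving it unmatched costs less, which is the effect the post is after.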