
Are there English words in a text file?




I'm a QA Engineer. We localize our product into 13 languages. Before I invest a lot of time in an idea I have, I need an answer.

My idea is to invoke dialogs, inspect menus, inspect tooltips, look at the status bar text, basically get all the text of our UI. Either write all the text to a text file or inspect the text on the fly. Is there a way to see if the string is in English?

In Ruby or Python I can add a gem or library to a script and do the check against an English dictionary. I want to do the same with AutoIt. Can you #include an English dictionary in an AutoIt script to inspect the text and verify whether it is English or not?

Thank you.



If your purpose is to determine whether a given text is in English or not, you can get pretty close by checking for non-English characters.

As a preparatory step, get familiar with which characters are considered English, a.k.a. "Basic Latin" (disregard punctuation): the printable range is Unicode 0020-007E (007F is the DEL control character).

Now, if your text contains anything outside that range, it's probably not English. The reverse does not hold: a translated string that happens to use only unaccented Latin letters will pass this check.

When testing, you may at first discover some common characters outside that range that are used in English too; adapt your code to compensate.

After a few such tests and compensations, you ought to get pretty close to 100% certainty for languages that use accented letters or a non-Latin script.
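The range check described above could be sketched in AutoIt roughly as follows. This is only an illustration: the function name `_LooksBasicLatin` and the `$sAllowed` whitelist parameter are made up for this example, and line breaks and tabs are assumed to be acceptable in UI text.

```autoit
; Returns True when $sText contains only printable Basic Latin characters
; (U+0020-U+007E), CR/LF/TAB, and any extra characters listed in $sAllowed.
; Function and parameter names are hypothetical, not from an existing UDF.
Func _LooksBasicLatin($sText, $sAllowed = "")
    ; Remove the whitelisted extras first (the "compensation" step)
    For $i = 1 To StringLen($sAllowed)
        $sText = StringReplace($sText, StringMid($sAllowed, $i, 1), "", 0, 1) ; 1 = case-sensitive
    Next
    ; Any character still outside the range means "probably not English"
    Return Not StringRegExp($sText, "[^\x20-\x7E\r\n\t]")
EndFunc

; ChrW() is used so the example does not depend on the script file's encoding
ConsoleWrite(_LooksBasicLatin("Save As...") & @CRLF)                             ; True
ConsoleWrite(_LooksBasicLatin("Datei " & ChrW(0xF6) & "ffnen") & @CRLF)          ; False
ConsoleWrite(_LooksBasicLatin("Copyright " & ChrW(0xA9), ChrW(0xA9)) & @CRLF)    ; True
```

The whitelist is where characters such as the copyright sign or a typographic ellipsis would go once testing shows they appear in the English UI.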

But if you don't mind your script taking much longer to complete, you can download a full English dictionary to check your text against.
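A dictionary check might look something like the sketch below. It assumes a plain-text word list with one word per line; the file name `words.txt`, both function names, and the idea of scoring a string by the fraction of recognized words are assumptions made for illustration. Loading the list into a `Scripting.Dictionary` object means each lookup is a hash lookup, which should keep the per-word cost low.

```autoit
; Loads a word list (one word per line) into a Scripting.Dictionary.
; @error: 1 = file could not be read, 2 = COM object could not be created.
Func _LoadWordList($sFile)
    Local $aWords = FileReadToArray($sFile)
    If @error Then Return SetError(1, 0, 0)
    Local $oDict = ObjCreate("Scripting.Dictionary")
    If Not IsObj($oDict) Then Return SetError(2, 0, 0)
    $oDict.CompareMode = 1 ; 1 = text compare, so lookups ignore case
    For $sWord In $aWords
        If $sWord <> "" And Not $oDict.Exists($sWord) Then $oDict.Add($sWord, 1)
    Next
    Return $oDict
EndFunc

; Returns the fraction (0 to 1) of words in $sText found in the word list.
Func _EnglishRatio($sText, $oDict)
    Local $aTokens = StringRegExp($sText, "[A-Za-z']+", 3) ; 3 = return all matches
    If @error Then Return 0 ; no word-like tokens at all
    Local $iHits = 0
    For $sToken In $aTokens
        If $oDict.Exists($sToken) Then $iHits += 1
    Next
    Return $iHits / UBound($aTokens)
EndFunc

; Usage, with a hypothetical word list next to the script
Local $oWords = _LoadWordList(@ScriptDir & "\words.txt")
If Not @error Then ConsoleWrite(_EnglishRatio("Open the selected file", $oWords) & @CRLF)
```

Product names and abbreviations will not be in a general dictionary, so a threshold below 1 (to be tuned against real UI strings) is probably more practical than requiring every word to match.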





Thanks for the responses! 

I'm currently pursuing this avenue: there is a website that can detect the language, https://detectlanguage.com/.

It returns a JSON object like this


You pass in your string and API key


It's pretty sweet.

Thanks again.




One way, using detectlanguage.com:

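ConsoleWrite(_GetLanguage("buenos dias señor") & @CRLF)

; Asks detectlanguage.com for the language of $sText and returns its code.
; @error: 1 = could not create the HTTP object, 2 = no language in the response.
Func _GetLanguage($sText)
    Local $sRet, $aLang
    Local $sUrl = "https://ws.detectlanguage.com/0.2/detect"
    ; $sText is sent as-is; URL-encode it if it may contain "&", "=", "%" or "+"
    Local $sRequest = "key=demo&q=" & $sText

    Local $oHTTP = ObjCreate("Microsoft.XMLHTTP")
    If @error Then Return SetError(1, 0, "")
    $oHTTP.open("POST", $sUrl, False) ; False = synchronous request
    $oHTTP.setRequestHeader("Content-Type", "application/x-www-form-urlencoded")
    ; Content-Length is filled in by the HTTP object itself, so it is not set by hand
    $oHTTP.send($sRequest)
    $sRet = $oHTTP.responseText
    ; 1 = return an array of matches; @error is set when nothing matches
    $aLang = StringRegExp($sRet, '"language":"(\w+)"', 1)
    If @error Then Return SetError(2, 0, "")

    Return $aLang[0]
EndFunc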
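Since `_GetLanguage` concatenates `$sText` straight into a form-encoded body, UI strings containing `&`, `=`, `%` or `+` would likely be misread by the server. A small percent-encoding helper is one way to guard against that. The name `_UrlEncode` is hypothetical and this is a sketch, not part of the original post:

```autoit
; Percent-encodes $sText as UTF-8 for an application/x-www-form-urlencoded body.
Func _UrlEncode($sText)
    ; Hex dump of the UTF-8 bytes with the leading "0x" removed
    Local $sHex = StringTrimLeft(String(StringToBinary($sText, 4)), 2) ; 4 = UTF-8
    Local $sOut = "", $iByte
    For $i = 1 To StringLen($sHex) Step 2
        $iByte = Dec(StringMid($sHex, $i, 2))
        Switch $iByte
            Case 45, 46, 95, 126, 48 To 57, 65 To 90, 97 To 122
                $sOut &= Chr($iByte) ; unreserved: - . _ ~ 0-9 A-Z a-z
            Case 32
                $sOut &= "+" ; form encoding writes a space as "+"
            Case Else
                $sOut &= "%" & Hex($iByte, 2)
        EndSwitch
    Next
    Return $sOut
EndFunc

; ChrW(0xF1) is "ñ"; written this way so the script's file encoding does not matter
ConsoleWrite(_UrlEncode("buenos dias se" & ChrW(0xF1) & "or") & @CRLF) ; buenos+dias+se%C3%B1or
```

With this in place, the request body would be built as `"key=demo&q=" & _UrlEncode($sText)`.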



Hello. "días" requires an accent mark.



Edited by Danyfirex

