trancexx Posted July 14, 2011 Share Posted July 14, 2011 (edited) This is something that I've been thinking about the whole freaking day today. I was at a beach and thought about it. Kinda drives me nuts.The function should find the first occurrence of some wide character inside wide string.INT __stdcall ChrInStr(LPCWSTR sString, WCHAR sFind) { //... }So, the most obvious code (to me) would be:INT __stdcall ChrInStr(LPCWSTR sString, WCHAR sFind) { if (sString == NULL) return 0; for(INT i = 0; sString[i]; ++i) { if (sString[i] == sFind) return i; } return 0; }Considering that compiler settings could affect the size I've set it (Visual Studio 2010 Express') to produce the shortest possible code. The result is 39 bytes of x86 code.55 push ebp8BEC mov ebp, esp> BLOCK 20 8B5508 mov edx, dword[ebp+08] ; first parameter is copied to edx ; sString 33C0 xor eax, eax ; eax is 0 ; i = 085D2 test edx, edx ; (edx == 0)? ; is (sString == NULL)7417 je 23d ; if equal go forward 23 bytes (GOTO 50), eax == 0 0FB70A movzx ecx, word[edx] ; ecx is first character in string ; sString[i]EB0B jmp 11d ; go forward 11 bytes (GOTO 40).> BLOCK 30 663B4D0C cmp cx, word[ebp+0C] ; see if (e)cx -current character- is the one searched for ; if (sString[i] == sFind)740C je 12d ; if equal go forward 12 bytes (GOTO 50), eax == i40 inc eax ; increment eax ; ++i0FB70C42 movzx ecx, word[edx+2*eax] ; ecx is next character in string> BLOCK 406685C9 test cx, cx ; ((e)cx == 0)? -is null terminator found ; if (!sString[i]), or if you like if (sString[i] == '\0')75F0 jne -15d ; if not equal go back 15 bytes (GOTO 30) ; continue looping if not33C0 xor eax, eax ; eax = 0, eax == 0 > BLOCK 50 - STACK JOB5D pop ebp C20800 ret 0008> BLOCK 10 - STACK JOBBut, here's the catch. I saw like 36 bytes (less than 39) of code that does the same thing, compiled with the same compiler. I saw the guy compiling it, I saw everything except the body of the function and compiler settings. Let's assume settings are not the issue, do you have an idea what to do to have it compiled to less instructions as possible? What c++ code would result in shorter code with the same calling convention and other things? Maybe even lesser than his function produce.How to get rid of this line of code: movzx ecx, word[edx]... and jump directly to this:movzx ecx, word[edx+2*eax]And of course there is possibility that he was just playing me around knowing I can't resist this stuff. Smartass.edit: fact that returned 0 is vague is irrelevant. Edited July 14, 2011 by trancexx ♡♡♡ . eMyvnE Link to comment Share on other sites More sharing options...
wraithdu Posted July 14, 2011 Share Posted July 14, 2011 (edited) Just an idea, maybe a while loop?UINT __stdcall ChrInStr(LPCWSTR sString, WCHAR sFind) { if (sString == NULL) return 0; INT i = 0; while (sString[i]) { if (sString[i] == sFind) return i; ++i; } return 0; }Also, don't you want to return something other than 0 if the char is not found? 0 is valid if the first char matches.Edit: saw your edit Edited July 14, 2011 by wraithdu Link to comment Share on other sites More sharing options...
wraithdu Posted July 14, 2011 Share Posted July 14, 2011 Probably compiles the same, but removes one line: UINT __stdcall ChrInStr(LPCWSTR sString, WCHAR sFind) { if (sString == NULL) return 0; INT i = 0; while (sString[i]) { if (sString[i++] == sFind) return i-1; } return 0; } Link to comment Share on other sites More sharing options...
trancexx Posted July 14, 2011 Author Share Posted July 14, 2011 (edited) Probably compiles the same, but removes one line: UINT __stdcall ChrInStr(LPCWSTR sString, WCHAR sFind) { if (sString == NULL) return 0; INT i = 0; while (sString[i]) { if (sString[i++] == sFind) return i-1; } return 0; } No that's more. 46 bytes of "crash" code. Your first and mine compiles the same. Edited July 14, 2011 by trancexx ♡♡♡ . eMyvnE Link to comment Share on other sites More sharing options...
PsaltyDS Posted July 14, 2011 Share Posted July 14, 2011 Do you know for sure that he sanity checked the input parameter? How many bytes do you get if you remove that line "if (sString == NULL) return 0;"? Valuater's AutoIt 1-2-3, Class... Is now in Session!For those who want somebody to write the script for them: RentACoder"Any technology distinguishable from magic is insufficiently advanced." -- Geek's corollary to Clarke's law Link to comment Share on other sites More sharing options...
wraithdu Posted July 14, 2011 Share Posted July 14, 2011 Hmm, what if you remove the multiple exit points?INT __stdcall ChrInStr(LPCWSTR sString, WCHAR sFind) { INT r = 0; if (sString != NULL) { for(INT i = 0; sString[i]; ++i) { if (sString[i] == sFind) { r = i; break; } } } return r; }And I'm curious if you make this a "safe" function:INT __stdcall ChrInStr(LPCWSTR sString, INT nLen, WCHAR sFind) { INT r = 0; if (sString != NULL) { for(INT i = 0; i < nLen; ++i) { if (sString[i] == sFind) { r = i; break; } } } return r; } Link to comment Share on other sites More sharing options...
trancexx Posted July 14, 2011 Author Share Posted July 14, 2011 (edited) Hmm, what if you remove the multiple exit points? INT __stdcall ChrInStr(LPCWSTR sString, WCHAR sFind) { INT r = 0; if (sString != NULL) { for(INT i = 0; sString[i]; ++i) { if (sString[i] == sFind) { r = i; break; } } } return r; } And I'm curious if you make this a "safe" function: INT __stdcall ChrInStr(LPCWSTR sString, INT nLen, WCHAR sFind) { INT r = 0; if (sString != NULL) { for(INT i = 0; i < nLen; ++i) { if (sString[i] == sFind) { r = i; break; } } } return r; }Both of those are 45 bytes with me. @PsaltyDS, that check was done for sure. Btw, it's 4 bytes. The main thing, I think, is to make compiler not doing that one assignment that I showed in the first post. That's redundant since eax is 0 there. word[edx] is the same as word[edx+2*eax] if eax is 0, and that exactly is the case at that point. Edited July 14, 2011 by trancexx ♡♡♡ . eMyvnE Link to comment Share on other sites More sharing options...
Valik Posted July 14, 2011 Share Posted July 14, 2011 Are you sure the calling convention is the same? Are you sure that the function body was C/C++ code and not inline assembly? You could save quite a few bytes if you wrote your own assembly (obviously). Link to comment Share on other sites More sharing options...
trancexx Posted July 14, 2011 Author Share Posted July 14, 2011 Are you sure the calling convention is the same?Yes.Are you sure that the function body was C/C++ code and not inline assembly? You could save quite a few bytes if you wrote your own assembly (obviously).This is possible option. Maybe he cheated. Do you think some weird type casting could help? ♡♡♡ . eMyvnE Link to comment Share on other sites More sharing options...
jaberwacky Posted July 14, 2011 Share Posted July 14, 2011 (edited) Assembly is a great mystery to me. So it sets i to 0 before it checks if sString is null? 33C0 xor eax, eax ; eax is 0 ; i = 0 85D2 test edx, edx ; (edx == 0)? ; is (sString == NULL) Edit: I get it now that's for the loop. Edit: Wait, no it isn't? BRAINASPLODE! <--- Relegated to staring at blinking lights in awe for rest of life. What is the criteria for the end of the for loop? I see no test. Or is that why it sets i to 0 before it compares sString to null? If the for loop checks sString for null every iteration then is this necessary: if (sString == NULL) return 0; No no, scratch all of that. I see now. Edited July 15, 2011 by LaCastiglione Helpful Posts and Websites: AutoIt3 Variables and Function Parameters MHz | AutoIt Wiki | Using the GUIToolTip UDF BrewManNH | Can't find what you're looking for on the Forum? Link to comment Share on other sites More sharing options...
trancexx Posted July 14, 2011 Author Share Posted July 14, 2011 (edited) Assembly is a great mystery to me. So it sets i to 0 before it checks if sString is null? 33C0 xor eax, eax ; eax is 0 ; i = 0 85D2 test edx, edx ; (edx == 0)? ; is (sString == NULL) Edit: I get it now that's for the loop. Edit: Wait, no it isn't? BRAINASPLODE! <--- Relegated to staring at blinking lights in awe for rest of life. eax is the return value of the function. After the call if you want to check what's returned you (and the compiler) check eax. The code that you point to is completely optimized. It's saying "before I do anything, I'll set return value to 0. Maybe change later, maybe not". Edited July 15, 2011 by trancexx ♡♡♡ . eMyvnE Link to comment Share on other sites More sharing options...
jaberwacky Posted July 15, 2011 Share Posted July 15, 2011 (edited) Thanks for that clarification. =) I see now that in the condition of the for loop if sString is 0 then you know you've come to the end of the string because any condition that evaluates to 0 terminates the loop. Well, I'm out of my league here. Yall have fun! I do think you're comments have been misplaced though. Edited July 15, 2011 by LaCastiglione Helpful Posts and Websites: AutoIt3 Variables and Function Parameters MHz | AutoIt Wiki | Using the GUIToolTip UDF BrewManNH | Can't find what you're looking for on the Forum? Link to comment Share on other sites More sharing options...
trancexx Posted July 17, 2011 Author Share Posted July 17, 2011 This is shorter (1 byte) and faster (more than that):INT __stdcall ChrInStr(LPCWSTR sString, WCHAR sFind) { if (sString == NULL) return 0; for(INT i = 0; *sString; ++i) { if (*sString == sFind) return i; ++sString; } return 0; }It basically does what I thought it should. Only it also increments two variables per cycle therefore loosing bytes. I'm fairly confident he went this way. Next question is obvious.@LaCastiglione, nothing is misplaced. ♡♡♡ . eMyvnE Link to comment Share on other sites More sharing options...
jaberwacky Posted July 17, 2011 Share Posted July 17, 2011 What if you pass the parameters as const references? Helpful Posts and Websites: AutoIt3 Variables and Function Parameters MHz | AutoIt Wiki | Using the GUIToolTip UDF BrewManNH | Can't find what you're looking for on the Forum? Link to comment Share on other sites More sharing options...
ProgAndy Posted July 17, 2011 Share Posted July 17, 2011 (edited) What about using this?INT __stdcall ChrInStr(const LPCWSTR sString, WCHAR sFind) { if (sString == NULL) return -1; LPCWSTR s = sString; while (*s) { if (*s == sFind) return s-sString; ++s } return -1; }Or this? INT __stdcall ChrInStr(const LPCWSTR sString, WCHAR sFind) { if (sString == NULL) return -1; LPCWSTR s = sString; register WCHAR c = *s; while (c) { if (c == sFind) return s-sString; c = *(++s); } return -1; } Edited July 17, 2011 by ProgAndy *GERMAN* [note: you are not allowed to remove author / modified info from my UDFs]My UDFs:[_SetImageBinaryToCtrl] [_TaskDialog] [AutoItObject] [Animated GIF (GDI+)] [ClipPut for Image] [FreeImage] [GDI32 UDFs] [GDIPlus Progressbar] [Hotkey-Selector] [Multiline Inputbox] [MySQL without ODBC] [RichEdit UDFs] [SpeechAPI Example] [WinHTTP]UDFs included in AutoIt: FTP_Ex (as FTPEx), _WinAPI_SetLayeredWindowAttributes Link to comment Share on other sites More sharing options...
Valik Posted July 17, 2011 Share Posted July 17, 2011 I'm not sure if the syntax will change the compiled code but it's probably:WCHAR c; while (c = *sString++)Or whatever the syntax is. In other words, combine the assignment to c and the check that c != NULL into a single statement. I hate code that does the assignment and test but in this case it could affect the compiled output - or not. Link to comment Share on other sites More sharing options...
trancexx Posted July 17, 2011 Author Share Posted July 17, 2011 What about using this? INT __stdcall ChrInStr(const LPCWSTR sString, WCHAR sFind) { if (sString == NULL) return -1; LPCWSTR s = sString; while (*s) { if (*s == sFind) return s-sString; ++s } return -1; }Or this? INT __stdcall ChrInStr(const LPCWSTR sString, WCHAR sFind) { if (sString == NULL) return -1; LPCWSTR s = sString; register WCHAR c = *s; while (c) { if (c == sFind) return s-sString; c = *(++s); } return -1; }Both of those compiles to same code. The main issue with it is additional code added as some sort of security check likely due to pointer subtraction. Generated instructions and resulting mnemonics are unknown to me (yet). I don't know how to avoid them. @Valik, whatever I join compiler splits. ♡♡♡ . eMyvnE Link to comment Share on other sites More sharing options...
JohnOne Posted July 18, 2011 Share Posted July 18, 2011 I probably have no business here, but I might be educated nonetheless. Would it matter if you used a short int as the counter rather than int? AutoIt Absolute Beginners Require a serial Pause Script Video Tutorials by Morthawt ipify Monkey's are, like, natures humans. Link to comment Share on other sites More sharing options...
Mat Posted July 18, 2011 Share Posted July 18, 2011 I probably have no business here, but I might be educated nonetheless.Would it matter if you used a short int as the counter rather than int?Yes. It would almost certainly make it longer. AutoIt Project Listing Link to comment Share on other sites More sharing options...
JohnOne Posted July 18, 2011 Share Posted July 18, 2011 Ok, I just thought a short was 2 byts and an int was 4, and some other flawed logic. AutoIt Absolute Beginners Require a serial Pause Script Video Tutorials by Morthawt ipify Monkey's are, like, natures humans. Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now