WolfWorld

Is using a Pointer faster?

35 posts in this topic

I just started C++ a day ago >_<

Okay, here is the question.

For example, I have two snippets written (using C++ Builder):

String myString = "Hello World", * myPointer = &myString;

myPointer->length(); //This or

myString.length(); //This

Which one will be faster if I do it in a loop?

I don't know how to measure the time myself, so I can't test it.


#2 ·  Posted (edited)

Generally the less indirection, the faster the response. So I would try using the direct member access (.) in this situation. The reasons to use a pointer in a loop are generally related to: 1) copying/passing smaller amounts of data, or 2) faster array/member dereference.

It is generally faster to use a pointer or reference to pass an object or other large data structure than passing a copy to a function. The less data passed, the faster the operation. The use of a pointer or reference inside a function is a little slower than using a copy of the data because it has to be dereferenced each time, but this is generally faster than all the work needed to copy a big data structure, especially with a constructor (don't worry; you will learn about them soon).
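As a rough illustration of the copying cost described above, here is a minimal C++ sketch. The `Tracked` type and the three `length_by_*` functions are names made up for this example, and `std::string` stands in for C++ Builder's `String`. It only counts copies; it does not measure time.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical type that counts how many times it has been copied.
struct Tracked {
    std::string text;
    static int copies;

    explicit Tracked(std::string t) : text(std::move(t)) {}
    Tracked(const Tracked& other) : text(other.text) { ++copies; }
};
int Tracked::copies = 0;

// Pass by value: the copy constructor runs on every call.
std::size_t length_by_value(Tracked t) { return t.text.size(); }

// Pass by const reference: no copy is made.
std::size_t length_by_reference(const Tracked& t) { return t.text.size(); }

// Pass by pointer: no copy either, but the caller must pass a valid address.
std::size_t length_by_pointer(const Tracked* t) {
    assert(t != nullptr);
    return t->text.size();
}
```

After one call to each function on the same object, `Tracked::copies` is 1: only the by-value call copied the string.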

It is generally faster to follow a pointer than calculate an array dereference each time to walk through an array. This is also true with public members of objects and structures. For example, the C library function strcmp could be written using array dereferencing or pointers. Let's try array dereferences first.

/* Return a negative value if s1 is lexically less than s2, 0 if s1 equals s2,
   or a positive value if s1 is greater than s2. */
int strcmp(const char *s1, const char *s2) {
    int i = 0, diff;

    while (1) {
        /* Use subtraction to determine the difference, comparing as unsigned char
           the way the standard strcmp does. */
        diff = (unsigned char)s1[i] - (unsigned char)s2[i];
        /* If they are different or we have reached the end of the string, get out.
           BTW, we do not have to test for the end of s2: if s2[i] is '\0' and s1[i] is not, then diff != 0. */
        if (diff != 0 || s1[i] == '\0')
            return diff;
        ++i;
    }
}

Even though the calculation of the index is just the same variable reference each time, it must be determined (at least a look-up) each time it is used and then used to determine the location in the array, using at least an addition and a multiplication.

Okay, let's look at the same function written using pointers.

/* Return a negative value if s1 is lexically less than s2, 0 if s1 equals s2,
   or a positive value if s1 is greater than s2. */
int strcmp(const char *s1, const char *s2) {
    int diff;

    while (1) {
        /* dereference and subtract, comparing as unsigned char */
        diff = (unsigned char)*s1 - (unsigned char)*s2;
        if (diff != 0 || *s1 == '\0')
            return diff;
        /* increment each pointer to the next array element */
        ++s1;
        ++s2;
    }
}

So what are the differences? First we see that we do not have the variable i. Second we are dereferencing the pointers directly. This saves the addition and multiplication steps needed for array dereferencing. Yes, we are now incrementing two variables, but that (along with 3 pointer dereferences) is more efficient than 3 variable lookups and array dereferences.
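Whichever version turns out faster on a given compiler, the two loops should at least return the same results. Here is a small sketch to check that. The names `compare_by_index`, `compare_by_pointer` and `sign` are invented here so that nothing clashes with the library's own `strcmp`.

```cpp
#include <cassert>
#include <cstddef>

// Index version: one counter, two indexed reads per step.
int compare_by_index(const char* s1, const char* s2) {
    assert(s1 != nullptr && s2 != nullptr);
    for (std::size_t i = 0;; ++i) {
        int diff = static_cast<unsigned char>(s1[i]) -
                   static_cast<unsigned char>(s2[i]);
        if (diff != 0 || s1[i] == '\0') return diff;
    }
}

// Pointer version: no counter, both pointers advance each step.
int compare_by_pointer(const char* s1, const char* s2) {
    assert(s1 != nullptr && s2 != nullptr);
    for (;; ++s1, ++s2) {
        int diff = static_cast<unsigned char>(*s1) -
                   static_cast<unsigned char>(*s2);
        if (diff != 0 || *s1 == '\0') return diff;
    }
}

// Reduces any result to -1, 0 or 1 so results are easy to compare.
int sign(int v) { return (v > 0) - (v < 0); }
```

Both functions walk the same bytes in the same order, so any difference between them is a matter of generated machine code, not of behavior.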

By the way, the function is generally written using optimized assembler that is closer to the pointer version. Do not be dismayed if you do not follow the pointer version. At this stage in your education, and even when you get out into the world of work, it is more important to write clean code that others can follow than purely optimal code which nobody can follow, not even yourself three months later. Let the compiler's optimizer take care of that for you.

There are a lot of problems that can happen regarding pointers for beginning programmers, so unless you are familiar with pointers in C (or another language) already, I recommend being very careful. Learn the rest of the language first, then see what you can do with pointers. There is even a safer version of the pointer in C++ called a reference. It is often preferable to use a reference rather than a pointer because it is easier to use and it is much more difficult to get yourself in trouble with references than with pointers. That said, there are still many places where pointers are better than references and so should be used. Keep on learning.
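To make the pointer-versus-reference remark a little more concrete, here is a minimal sketch. The two function names are made up for illustration, and `std::string` is used in place of C++ Builder's `String`.

```cpp
#include <cassert>
#include <string>

// Pointer parameter: the caller could pass a null pointer, so the function
// has to state (and here, assert) that it expects a valid address.
void add_exclamation_ptr(std::string* s) {
    assert(s != nullptr);
    *s += '!';
}

// Reference parameter: it is bound to an existing object at the call site,
// so there is no null case to handle and no '*' or '->' inside the function.
void add_exclamation_ref(std::string& s) {
    s += '!';
}
```

The call sites differ too: `add_exclamation_ptr(&text)` needs the address-of operator, while `add_exclamation_ref(text)` looks like an ordinary call.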

Edited by Nutster
Change code blocks to C. Couple of grammer and speeling fixes.

David Nuttall
Nuttall Computer Consulting

An Aquarius born during the Age of Aquarius

AutoIt allows me to re-invent the wheel so much faster.

I'm off to write a wizard, a wonderful wizard of odd...


#3 ·  Posted (edited)

Thanks, that was very nice. I will take a closer look into it this morning.

Okay

So you mean by

String myTestString;

myTestString = "Hello World";

String * myTestPointer = &myTestString;

Will this pass the data if it's in the same function/scope, or will it use the same set of data?

myTestString.length() //Will this make a copy of myTestString in another memory location and then call length on the object to calculate the string length? Isn't that a waste of space and time, since it does not change myTestString in the first place?

AND

myTestPointer->length() //This will not copy the string to another location, but it will need to get the value at that address and then call the object?

And aren't arrays just pointers in the first place? Is this what C and C++ lack?

So in the end, always use a pointer for String/char and other arrays. Am I right?

So in your first example the program is looking up the address of the array and adding the offset i.

And in the second example, it is adding the offset directly to the pointer.

1. Look-up>Add>dereference

2. Add>dereference

Now I get this.

Soooo confused, but thanks. And sorry for my English and the question marks.
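On the "aren't arrays just pointers" question: in most expressions an array is converted to a pointer to its first element, and `a[i]` is defined as `*(a + i)`, but the array is still a distinct type with its own size. A minimal sketch, with all names invented for illustration:

```cpp
#include <cassert>
#include <cstddef>

const char greeting[] = "Hello";      // an array of 6 chars, including '\0'
const char* as_pointer = greeting;    // the array converts to a pointer to 'H'

// sizeof sees the whole array, but only the pointer itself for as_pointer.
constexpr std::size_t array_bytes = sizeof(greeting);
constexpr std::size_t pointer_bytes = sizeof(as_pointer);

// greeting[i] and *(as_pointer + i) name the same element.
bool same_element(std::size_t i) {
    assert(i < sizeof(greeting));
    return &greeting[i] == as_pointer + i &&
           greeting[i] == *(as_pointer + i);
}
```

Here `array_bytes` is 6, while `pointer_bytes` is whatever a pointer occupies on the target platform.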

Edited by athiwatc


#4 ·  Posted (edited)

@athiwatc, as far as Classes/Structures go, when you call a procedure, the same 'this' pointer gets passed, there shouldn't be any difference in speed inside the function. Before the call, I'm not sure - possibly one extra step to get the 'this' pointer from the String object that the myPointer pointer points to - which honestly wouldn't have any real kind of impact on speed.

@Nutster, I believe your first function would operate ever-so-slightly faster with any decent compiler. Basically 'i' would be put into something like the 'EBX' register (consider it an index register), and string access would occur with something like [EDI+EBX*2] and [ESI+EBX*2]. EBX would be incremented by one each loop.

The second function on the other hand would be incrementing two pointers (by 2), (EDI and ESI, or choose-your-own-memory-access-register). 'Dereferencing' the data is not very different than accessing the data using the 'array' style in your first function, though in this case they would be accessed without EBX, using [EDI] and [ESI].

Realistically, the two functions shouldn't be much different speedwise, but the first function is easier to understand.

(Sorry, I know EBX, EDI and ESI may be foreign concepts to those that don't understand assembly - but at the machine code level, they would represent data pointers, or in EBX's case, both a pointer [by itself] and index [used in combination])

Hope I made that clearish..

*edit: oops, made a mistake with indexing - EBX*2 would probably be used (which I believe results in a bit-shift rather than a typical multiply)

Edited by Ascend4nt


#5 ·  Posted (edited)

Will this pass the data if it's in the same function/scope or it will use the same set of data?

myTestString.length()//This will make a copy of myTestString into another memory location? and then call object length to calculate the string length? is that a waste of space and time because it does not change myTestString in the first place?

A pointer to another object is going to work on the pointed-to object. It's the same as using the object directly, though you are using an indirect way of accessing it. Calling length() on the object directly does not make a copy of the string either.

myTestPointer->length()//This will not copy the string into another location but it will need to get the value of that address then call the object?

It just 'dereferences' the pointer first (finding what the 'this' pointer is) and passes that to the function. All non-static member functions in C++ get a pointer generally called the 'this' pointer. The function operates the same regardless of how it is invoked, pointer or direct.
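A small sketch of that point, assuming a made-up `Text` class in place of C++ Builder's `String`. The `self()` member and the `same_object` helper exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a string class.
class Text {
public:
    explicit Text(std::string s) : value_(std::move(s)) {}

    std::size_t length() const { return value_.size(); }

    // Exposes the 'this' pointer the member function received.
    const Text* self() const { return this; }

private:
    std::string value_;
};

// True when the direct call and the call through a pointer
// hand the same 'this' to the member function.
bool same_object(const Text& direct, const Text* via_pointer) {
    assert(via_pointer != nullptr);
    return direct.self() == via_pointer->self();
}
```

For `Text t("Hello World"); Text* p = &t;`, both `t.length()` and `p->length()` run the same function on the same object; no copy of the string is made in either case.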

*edit* typo

Edited by Ascend4nt


@Ascend4nt

I think the second example will run faster because multiplying is a very expensive operation.

Anyway is there a way to measure time very accurately? So I can start testing.

Actually, I verified - it's a bitshift, not a multiply (the [ESI+EBX*2] part). And really, both of Nutster's functions would probably give you about the same results - the difference is possibly only microseconds.

And sorry, I don't recall any built-in timer functions (I haven't been actively programming in C++ for a while now, though I do have a pretty intimate relationship with the language design and its use with Assembly language).

There are Windows API timer functions you can utilize, however.
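For the timing question, one portable option is the standard `<chrono>` header. This is only a sketch under assumptions: it needs a C++11 (or later) compiler, which an older C++ Builder may not provide, `std::string` stands in for `String`, and `time_ns`, `time_direct`, `time_pointer` and `sink` are names invented here.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// Written to on every iteration so the optimizer cannot discard the calls.
volatile std::size_t sink = 0;

// Runs work() the given number of times and returns the elapsed
// wall-clock time in nanoseconds.
template <typename Work>
long long time_ns(Work work, int iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
        .count();
}

// Times length() called directly on the object.
long long time_direct(const std::string& s, int iterations) {
    return time_ns([&] { sink = s.length(); }, iterations);
}

// Times length() called through a pointer to the object.
long long time_pointer(const std::string* p, int iterations) {
    assert(p != nullptr);
    return time_ns([&] { sink = p->length(); }, iterations);
}
```

Expect the two numbers to be close and to vary from run to run; use a large iteration count and repeat the measurement several times before drawing any conclusion.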



So it's a bitshift, so it will be faster >_<

I'm using C++ Builder. Debugging the first function and the second function gives almost the same code (just two calls off), so I guess the compiler is optimizing.

I found the API, or whatever it's called, thanks.


If you can get your hands on MS Visual C/C++ 6, that has a profiling tool where you can measure the speed of functions. I forget the details as it was a while ago, but it can be done.




@Nutster, I believe your first function would operate ever-so-slightly faster with any decent compiler. Basically 'i' would be put into something like the 'EBX' register (consider it an index register), and string access would occur with something like [EDI+EBX*2] and [ESI+EBX*2]. EBX would be incremented by one each loop.

The second function on the other hand would be incrementing two pointers (by 2), (EDI and ESI, or choose-your-own-memory-access-register). 'Dereferencing' the data is not very different than accessing the data using the 'array' style in your first function, though in this case they would be accessed without EBX, using [EDI] and [ESI].

Realistically, the two functions shouldn't be much different speed-wise, but the first function is easier to understand.

(Sorry, I know EBX, EDI and ESI may be foreign concepts to those that don't understand assembly - but at the machine code level, they would represent data pointers, or in EBX's case, both a pointer [by itself] and index [used in combination])

Hope I made that clearish..

*edit: oops, made a mistake with indexing - EBX*2 would probably be used (which I believe results in a bit-shift rather than a typical multiply)

Well, that was part of my point. Writing code that is easy for others (or yourself a few months down the road) to understand is usually considered more important than writing slightly more optimal code that nobody understands.

Remember that Intel's Pentium is not the only processor for which there is a C compiler. >_< The details of how other processors do things will be slightly different. For example, check out Sun's UltraSPARC.

Even using the method you outline, there are several steps you skipped which will slow things down for my first (array dereferencing) method. The processor does not inherently know how big the array elements are; a relatively expensive multiplication must be done to determine the index location. Sure there can be less instructions, but how long does each instruction take? An indexed lookup is considerably slower than a direct lookup; this may be hard to tell using instruction pipelining processors, but with older processors, like a 6510, an indexed lookup took 7 clock cycles, while a direct look-up took only 3. Okay, give me a break; it's the only one I remember off the top of my head.

Using array index dereferencing, the general steps required are (described in a processor independent manner):

load i -> Reg(1) // index
load sizeof(element) -> Reg(2) // element size.  Determined at compile time.
mult Reg(1), Reg(2) -> Reg(1) // Determine byte-count within array for start of element.  On some processors this must be done in software (much slower!).
load array -> Reg(2) // get pointer to start of array
add  Reg(1), Reg(2) -> Reg(1) // Add the locations of array with location in array
load Reg(1) -> Reg(2) // Get the actual data in the array.

Using pointer dereferencing, the general steps required are:

load s1 -> Reg(1) // load the pointer into a register
load Reg(1) -> Reg(2) // Done.

Yes, we are adding when incrementing the pointers each time, but that is just a few fast instructions.

load s1 -> Reg(1)
add Reg(1), sizeof(element) -> Reg(1)  // sizeof() determined at compile time
store Reg(1) -> s1

Yes, optimizers can clean up much of this code, but the smaller code of the pointer dereference is more likely to be cleaned up than the much larger array dereference. In fact, some optimizers will generate code that is very similar for both functions, depending on the processor.
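The index steps listed above (scale the index by the element size, then add the base address) can be written out by hand in C++ as a sketch. `element_address` is a name invented here, and the comparison assumes a platform with a flat address space, where converting a pointer to `std::uintptr_t` gives a plain byte address. That is typical but not guaranteed by the language.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Address of element i computed the long way: base + i * sizeof(element).
// sizeof(T) is a compile-time constant, so for element sizes that are powers
// of two a compiler is free to use a shift or a scaled addressing mode
// instead of a multiply instruction.
template <typename T>
std::uintptr_t element_address(const T* base, std::size_t i) {
    assert(base != nullptr);
    return reinterpret_cast<std::uintptr_t>(base) + i * sizeof(T);
}
```

For an array `int v[4]`, `element_address(v, 2)` should match the address of `v[2]` on such a platform, which is also where a pointer ends up after being advanced from `v` by 2.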

I half-way remember looking at this when I was learning C, close to 20 years ago, and the array dereference method was about twice as slow as the pointer dereference method. I would be interested in what modern processors and optimizers can do with this. I would also be interested in what the generated assembly would be like too (not enough to actually do it, but...)



#11 ·  Posted (edited)

Remember that Intel's Pentium is not the only processor for which there is a C compiler. >_< The details of how other processors do things will be slightly different. For example, check out Sun's UltraSPARC.

Very true, I had made the assumption that the code was being used for Windows programming - the generated code/speed may very well differ for the two functions on other processors.

Even using the method you outline, there are several steps you skipped which will slow things down for my first (array dereferencing) method. The processor does not inherently know how big the array elements are; a relatively expensive multiplication must be done to determine the index location. Sure there can be less instructions, but how long does each instruction take? An indexed lookup is considerably slower than a direct lookup; this may be hard to tell using instruction pipelining processors, but with older processors, like a 6510, an indexed lookup took 7 clock cycles, while a direct look-up took only 3. Okay, give me a break; it's the only one I remember off the top of my head.

I was working with what you gave as an example, the char * - which is either ANSI or Unicode, and for that particular case, I didn't really miss anything. However, if you do consider arrays of larger (non-string) types, then yes - a multiplication or extra addition would be needed. And you might be right about the older processors clock-cycle-wise, but simple indexing like in the function you posted wouldn't be costly at all (at least on any x86 architecture since, I dunno - the 386? :( ).

Using array index dereferencing, the general steps required are (described in a processor independent manner):

load i -> Reg(1) // index
  load sizeof(element) -> Reg(2) // element size.  Determined at compile time.
  mult Reg(1), Reg(2) -> Reg(1) // Determine byte-count within array for start of element.  On some processors this must be done in software (much slower!).
  load array -> Reg(2) // get pointer to start of array
  add  Reg(1), Reg(2) -> Reg(1) // Add the locations of array with location in array
  load Reg(1) -> Reg(2) // Get the actual data in the array.

Using pointer dereferencing, the general steps required are:

load s1 -> Reg(1) // load the pointer into a register
  load Reg(1) -> Reg(2) // Done.
Okay, a few notes:

1. (array index dereferencing) If you are doing a loop (and you have a halfway decent compiler), the base pointer won't need to be loaded each time through the loop - just once at the start. The indexed reference (in general terms) would indeed get the multiply and add, but an indexed lookup like [esi+ebx] (on x86 CPUs) could cut off at least one instruction.

2. (pointer dereferencing) If you are doing one-by-one increments, or a specific hardcoded # of increments (in a loop), then you'll just need to add 'add s1, sizeof(element) * #-to-increment'. If you are using structures/classes, the size of the object would need to be loaded (due to inheritance), and then multiplied* - putting you back to the code in (array index dereferencing). However, if it's a fixed-size datatype, the add calculation should be decided at compile time.

3. (pointer dereferencing) Additionally, if you don't know ahead of time the amount to increment the pointer, or the amount is stored in a variable - you'll need everything in (array index dereferencing).

In summary: pointer dereferencing will probably only ever be faster (in the general sense) for fixed C++ datatypes (int, char, etc). Otherwise, with structures/classes, I don't see your code making a difference.

*edit: regarding #2 - actually, for structures/classes I don't know how C++ would get the next item, that's actually got me wondering now...
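On the open question in that edit: in C++ the step size of a pointer comes from the pointer's static type, so `p + 1` on a `Record*` always moves by `sizeof(Record)`, a value fixed at compile time, whether `Record` is a basic type or a struct. A minimal sketch, with `Record` and `stride_in_bytes` invented for illustration:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical structure; its size is known to the compiler.
struct Record {
    int id;
    double value;
    char tag[3];
};

// sizeof is evaluated at compile time, so it can appear in a static_assert.
static_assert(sizeof(Record) >= sizeof(int) + sizeof(double) + 3,
              "Record holds at least its members");

// Number of bytes a Record* moves when it is incremented once.
// 'first' must point into an array with at least one element.
std::ptrdiff_t stride_in_bytes(const Record* first) {
    assert(first != nullptr);
    const Record* next = first + 1;
    return reinterpret_cast<const char*>(next) -
           reinterpret_cast<const char*>(first);
}
```

No size is loaded at run time here; the compiler bakes `sizeof(Record)` into the generated add.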

Edited by Ascend4nt


Okay, from reading -> this <- , it's clear that creating arrays of structures/classes in C++ is going to cause a 'disaster' when calling a function that works on a 'parent' object array with a derived class. So it looks like arrays had better be used only for the basic data types... something I hadn't thought to look into until now!
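The linked page is not preserved here. Assuming it describes the usual problem of treating an array of derived objects as an array of base objects, the sketch below shows where the trouble comes from. `Shape`, `Circle`, `id_at` and `sum_ids` are made-up names for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Shape { int id = 0; };
struct Circle : Shape { double radius = 1.0; };

// A Circle is larger than a Shape, so a Shape* stepping through an array of
// Circle objects would advance by sizeof(Shape) and land in the middle of an
// element. Indexing that way is undefined behavior, so it is not done here.
static_assert(sizeof(Circle) > sizeof(Shape), "derived object is larger");

// Safe: the pointer type matches the real element type, so first[i]
// steps by sizeof(Circle).
int id_at(const Circle* first, std::size_t count, std::size_t i) {
    assert(first != nullptr && i < count);
    return first[i].id;
}

// Safer still: a container knows its element type and its own length.
int sum_ids(const std::vector<Circle>& circles) {
    int total = 0;
    for (const Circle& c : circles) total += c.id;
    return total;
}
```

Arrays of structures or classes work fine as long as the pointer or container type matches the actual element type; the hazard is specific to viewing them through a base-class pointer.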


That link is typical Stroustrup "you can blow your foot off if you aren't careful" stuff. It's just another example of why using a container is superior to using an array. In short, I think it's only tangentially related to your discussion.

Also, just because: "Premature optimization is the root of all evil".


@Valik: You know, that Premature optimization remark is not really relevant when we are analyzing what the optimization results would be. And for me, I just happen to know x86 Assembly and can give some insight into code results. I could have even remarked that inserting a 'rep cmpsb/cmpsw/cmpsd' could be used to optimize the function further, but the way the code was written wouldn't result in that optimization by any compiler.

@Nutster\athiwatc: When I think of it, most basic data types in C++ are 1, 2, 4 or 8 bytes - all of which can be indexed on an x86 through a bitshift (EBX*x), so that's yet another place where 'array index dereferencing' wouldn't be slower - but again, that's machine-specific. Other than that, if you are working on another processor architecture and know what types of results you would get, I guess it's up to you to decide. Array indexing looks cleaner and is easier to understand though.


To start I'd like to say that this thread is a good read -- it is exposing me to a few new things which is always good.

Having said that, my problem with the current discussion is that it seems like the original poster's post has not really been properly addressed. He started with

I just started c++ a day ago

and then went on to ask about how to squeeze an extra few percentage points of efficiency out of c++ code. Perhaps I am reading this wrong, but my response is to say 'WTF!' There is absolutely no reason for him to be worried about the number of clock cycles it takes to de-reference a pointer. IMO entertaining this question in this manner is quite counterproductive for the OP.

To finish...

@athiwatc

The fact that you asked this question shows that you are thinking which is fantastic. Unfortunately you are thinking too far ahead at the moment -- you are running before you can walk. Stop trying to optimize code and just try to learn the language. I'd suggest that you read this thread as no more than an abstract aside.



@Valik: You know, that Premature optimization remark is not really relevant when we are analyzing what the optimization results would be.

I wouldn't attempt to argue relevancy when the current conversation is irrelevant to the topic at hand. I hardly think discussing details of generated machine code is all that useful to a beginning programmer. The correct answer to the OP was not "use this one because it's faster" but rather "use whichever one you are more comfortable with". We're not talking about embedded systems here where every clock cycle and every bit of memory is critical. Unless you're talking about an OS kernel here, worrying about which syntax provides the most optimized code on x86 architecture is a premature optimization. It's only useful for theoretical applications because in practice the cleanest written code is the best code.

I will also argue that the entire debate is rather pointless. If you want optimized code, write it in assembly. Writing C(++) code a certain way to massage the optimizing compiler into generating something specific strikes me as a poor idea because it's going to be very compiler specific. In fact it's likely to be settings-specific. And then there's the 64-bit compiler that's inherently going to produce different code. What do you do if the 64-bit compiler optimizes better one way and the 32-bit compiler optimizes better another way? All you and David are discussing is which syntax produces the faster machine code with your particular compiler vendor and your particular settings.

This was a C-language question from a newbie who doesn't know that it's an irrelevant question to be asking. You two ceased talking about C several posts back.


@Valik, Of course, discussing the speed issue is 'irrelevant to the topic at hand', when the title of this thread is exactly about speed.

As far as compilation goes, I've tried to make it clear that a 'decent' compiler would produce code the way I mentioned on x86 architectures (and yeah, I admit that the x86 architecture was an assumption on my behalf - but is it that much of a stretch in forums devoted to a Windows-only interpreter?)

With compiler-specific settings, especially when talking about singular pieces of code that just perform dereferencing of arrays/pointers, it's quite doubtful you'll see any difference in output. On code fragments like Nutster posted, however, it's true that different settings could produce varying output - though the functions are so small that the differences would be very minimal. With larger pieces of code, optimizations/compiler settings can indeed make a world of difference. And in general, yeah, it's best to code in 'good form' at first before considering speed.

Regarding 64-bit compiles, dereferencing would work the same as well - though you are now using 64-bit ops and have a few more registers to play with.

With regards to us going off-topic, I'd say that you're missing part of what the topic is about. Okay, maybe we are talking beyond athiwatc's expertise (my apologies athiwatc), but that doesn't mean that we didn't give what you consider 'the correct answer'...I'm giving one view of the speed issue, Nutster another. You are free to give your own responses - but don't assume that you have the golden answer and try to force the rest of us to agree.


@Valik, Of course, discussing the speed issue is 'irrelevant to the topic at hand', when the title of this thread is exactly about speed.

Is it really? Or is the topic about a new programmer who doesn't know what is and is not important to be asking? Judging by David's answer, Wus' comments and my own personal thoughts, I think we are all in agreement that providing a literal answer to the original question is stupid and certainly the wrong thing a new programmer should care about. If you wish to wave the literal flag around, then you are certainly right. But start looking at this practically and you'll see that all you're doing is having a mildly interesting discussion with a C++ professor.

As far as compilation goes, I've tried to make it clear that a 'decent' compiler would produce code the way I mentioned on x86 architectures (and yeah, I admit that the x86 architecture was an assumption on my behalf - but is it that much of a stretch in forums devoted to a Windows-only interpreter?)

So is that what we should be doing? Indoctrinating somebody who's been trying to learn C++ for a couple days by going deep into x86 instruction details? Who is that helping?

With compiler-specific settings, especially when talking about singular pieces of code that just perform dereferencing of arrays/pointers, its quite doubtful you'll see any difference in output. On code fragments like Nutster posted however, its true that different settings could produce varying output - though the functions are so small that the differences would be very minimal. With larger pieces of code, optimizations/compiler settings can indeed make a world of difference. And in general yeah, its best to code in 'good form' at first before considering speed.

Regarding 64-bit compiles, dereferencing would work the same as well - though you are now using 64-bit ops and have a few more registers to play with.

Have you sat down and compared 64-bit versus 32-bit code? Or compared two different compilations, one with optimize for speed and the other with optimize for size? What about debug versus release builds? What about Microsoft's compiler versus GCC? The point is, there can be slight variance depending on the circumstances. Different optimizations may produce code that's one clock cycle faster but 2 bytes longer or vice versa. Debugging code may have extra code to help catch invalid memory access or leaks.

With regards to us going off-topic, I'd say that you're missing part of what the topic is about. Okay, maybe we are talking beyond athiwatc's expertise (my apologies athiwatc), but that doesn't mean that we didn't give what you consider 'the correct answer'...I'm giving one view of the speed issue, Nutster another. You are free to give your own responses - but don't assume that you have the golden answer and try to force the rest of us to agree.

Don't assume that just because you happen to be answering the question that you are in the right, either. I've been around this stuff long enough to know that not all questions need answers because sometimes (many times) people are asking the wrong questions. This is a case where the wrong question has been asked. There is an answer to the question but it's not important. What's important is to guide the OP in a direction so that they realize what is and is not important. Maybe if they ever find themselves working in embedded systems in a few years, then yeah, the information in this thread might be useful. But otherwise, not so much.

My point is this: Don't try to hide behind the "but I'm answering the question" screen. It's pretty obvious to the rest of us that the question does not need a definitive "this way is faster" answer. There are far more important things a new C++ programmer needs to worry about. At least three of us realize that. A theoretical discussion of which method is faster is fine; so long as you realize it's nothing more than a theoretical discussion that should not be taken as gospel.


Yes Valik, everyone who doesn't side with you must be 'stupid', those few that agree with you must be the 'general populous'. Nothing new there, that seems to be your usual take on things.

As far as compilers, I've had experience with various C/C++ compilers over the years, with different settings for compiling, viewed assembly outputs/disassembly, linked with Assembly code, etc. How much experience have you had working with Assembly/disassembly? Or with looking at it from different compilers/settings? Point is here - there's two sides to this coin. I'm guessing from your posts that you haven't done any of that.

And if you look again at my posts, I pointed out *numerous* times, that the difference in time would be quite small - even in microseconds. So athiwatc could have skipped the rest of the topic after reading that, it was up to him. Most of my discussions over speed were addressed to Nutster also, you might consider that too.

Also, I never claimed to have the "correct" response - those were your words. I was just discussing the topic. 'ooh he must be hiding behind the "but I'm just answering the question" screen again'... You know, I don't need excuses for what I posted. Just accept them or reject them, period. You're certainly entitled to your own opinion on things, but don't expect me and everyone else in the world to agree with you.


Yes Valik, everyone who doesn't side with you must be 'stupid', those few that agree with you must be the 'general populous'. Nothing new there, that seems to be your usual take on things.

No, only people who make idiotic comments like you are making are stupid.

As far as compilers, I've had experience with various C/C++ compilers over the years, with different settings for compiling, viewed assembly outputs/disassembly, linked with Assembly code, etc. How much experience have you had working with Assembly/disassembly? Or with looking at it from different compilers/settings? Point is here - there's two sides to this coin. I'm guessing from your posts that you haven't done any of that.

I have not because I know it's a waste of time and not something that I need to worry about. I also know it's not something that makes one bit of difference on any non-embedded hardware made since about 1990. My term for this sort of stuff is "theoretical bullshit". David may recognize that phrase as I've used it before to describe something.

And if you look again at my posts, I pointed out *numerous* times, that the difference in time would be quite small - even in microseconds. So athiwatc could have skipped the rest of the topic after reading that, it was up to him. Most of my discussions over speed were addressed to Nutster also, you might consider that too.

My issue isn't with your discussion with David so much as it is with your belief that it's beneficial to anyone or is part of the answer to the original question. This is a silly optimization to be worrying about. It may be fun for theoretical discussions, but in almost all practical applications it's insignificant. It's a premature optimization because if optimization is really important there are much more obvious places to look that will yield much bigger gains. You don't start out by optimizing away one clock cycle when you can optimize away 20. It's very likely that a program where performance is critical can be optimized to fall within the acceptable performance range without even needing to worry about the subject of your discussion. Chances are that if performance is so critical that the developer needs that extra little bit, they are probably going to write the code in assembly anyway.

Also, I never claimed to have the "correct" response - those were your words. I was just discussing the topic. 'ooh he must be hiding behind the "but I'm just answering the question" screen again'... You know, I don't need excuses for what I posted. Just accept them or reject them, period. You're certainly entitled to your own opinion on things, but don't expect me and everyone else in the world to agree with you.

Stop being a hypocrite. You have certainly had a number of opportunities to demonstrate you subscribe to your own remarks by just not responding to me. I'm not twisting your arm or trying to bait you into responding to my posts. The next example of hypocrisy is your comment on not needing excuses about what you posted. If you don't need excuses, why are you making them? You assert your relevancy with all the assembly talk by claiming it's part of the answer to the original question. That seems a strange comment to make for someone who goes on to claim they don't need excuses.
