Thursday, November 29, 2007

String Parameters in Delphi

Strings are a special managed type in Delphi, and how you pass them affects the code the compiler generates. Compare two versions of a simple procedure. Passing the string as const:

procedure Foo(const S: string);

results in no extra code, whereas passing it by value:

procedure Foo(S: string);

results in the following assembly being generated:


push ebp
mov ebp,esp
xor eax,eax
push ebp
push $004094c7
push dword ptr fs:[eax]
mov fs:[eax],esp
xor eax,eax
pop edx
pop ecx
pop ecx
mov fs:[eax],edx
push $004094ce
ret
jmp @HandleFinally
jmp $004094c6
pop ebp
ret


So nearly always pass your strings as const, or as var when the procedure needs to change the caller's string.
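To make the three options concrete, here is a minimal sketch. The routine names (CountSpaces, AppendSuffix, Shout) are hypothetical and exist only for illustration; the point is which modifier fits which intent.

```pascal
program StringParams;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Read-only access: const lets the callee borrow the caller's string
// without taking its own reference.
function CountSpaces(const S: string): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if S[I] = ' ' then
      Inc(Result);
end;

// The callee is meant to change the caller's variable, so var fits.
procedure AppendSuffix(var S: string);
begin
  S := S + '!';
end;

// By value: the callee holds its own reference and may change its
// local copy freely; the caller's string is left untouched.
function Shout(S: string): string;
begin
  S := S + '!';
  Result := UpperCase(S);
end;

var
  Greeting: string;
begin
  Greeting := 'hello there world';
  Writeln(CountSpaces(Greeting)); // 2
  AppendSuffix(Greeting);
  Writeln(Greeting);              // hello there world!
  Writeln(Shout('hi'));           // HI!
end.
```

By value is only worth its cost in a case like Shout, where the routine wants a private copy to modify.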

Update: Allen has some good points which I neglected to mention, so I'm bringing them up into the post from the comments where everyone can see them: "This is good advice for nearly all cases. However there are a few cases where you should pass a string 'by value.' If the function you're calling manipulates a global string variable and there is a chance that the same global string can be passed in as a parameter, there is a chance that the global string variable is deallocated, which would indirectly render the parameter invalid as well. The same thing can happen with any of the managed types such as interfaces, dynamic arrays or variants."
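A minimal sketch of the situation Allen describes, as I understand it. GlobalName, ResetUnsafe and ResetSafe are hypothetical names made up for this example. The unsafe call is left commented out on purpose, because what it would print is undefined.

```pascal
program GlobalAlias;

{$APPTYPE CONSOLE}

var
  GlobalName: string; // hypothetical global, for illustration only

// Hazard: a const parameter borrows the caller's string without taking
// a reference. If S is really GlobalName, clearing the global can free
// the buffer that S still points at.
procedure ResetUnsafe(const S: string);
begin
  GlobalName := '';
  Writeln('was: ', S); // undefined behaviour when S aliased GlobalName
end;

// By value, the callee holds its own reference, so the string data
// stays alive until the procedure returns.
procedure ResetSafe(S: string);
begin
  GlobalName := '';
  Writeln('was: ', S);
end;

begin
  GlobalName := StringOfChar('x', 8); // heap-allocated, reference count 1
  ResetSafe(GlobalName);              // prints: was: xxxxxxxx
  // ResetUnsafe(GlobalName);         // would read a string that was freed
end.
```

The extra reference taken by the by-value version is exactly the cost the const version avoids, which is why const is only safe when the callee cannot release the original.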

Update: Rob Kennedy has more good points which I also neglected to mention so I'm bringing them into the post from the comments. What's happening in that assembler code is a try-finally block generated by the compiler. It's there to ensure that the reference count of the string parameter gets reduced before the function returns.
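Roughly, and only as a conceptual sketch, the by-value version behaves as if the commented lines below had been written by hand. The RTL helper names in the comments (_LStrAddRef, _LStrClr) are the AnsiString ones from System.pas as I remember them; they differ between Delphi versions, so treat them as an assumption.

```pascal
program ImplicitFinally;

{$APPTYPE CONSOLE}

procedure Foo(S: string);
begin
  // Implicit prologue: take a reference to S (e.g. _LStrAddRef).
  // try
  Writeln(S);
  // finally
  //   Implicit epilogue: release that reference (e.g. _LStrClr),
  //   even if the body raised an exception.
  // end;
end;

procedure FooConst(const S: string);
begin
  // No implicit prologue or epilogue here: the caller's reference
  // is simply borrowed for the duration of the call.
  Writeln(S);
end;

begin
  Foo('by value');
  FooConst('by const');
end.
```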

7 comments:

Unknown said...

This is good advice for nearly all cases. However there are a few cases where you should pass a string "by value." If the function you're calling manipulates a global string variable and there is a chance that the same global string can be passed in as a parameter, there is a chance that the global string variable is deallocated, which would indirectly render the parameter invalid as well. The same thing can happen with any of the managed types such as interfaces, dynamic arrays or variants.

Anonymous said...

Nice tip Chris!
Could you explain a little more please?

Rob Kennedy said...

What's happening in that assembler code is a try-finally block generated by the compiler. It's there to ensure that the reference count of the string parameter gets reduced before the function returns. (Although none of the code you show actually modifies the reference count. Not what I expected to see.)

When strings are passed by value unnecessarily, it can be a real annoyance to debug. When you have "debug DCUs" enabled and you single-step through your code, you'll end up in the RTL code for managing string lifetimes. Strings passed by const reference don't yield that extra code, so you don't end up stepping into code you don't need. This came to my attention when debugging code that used IBX -- none of that library has const string parameters.

The advice is even more important for WideStrings, which aren't reference-counted. When you have a by-value WideString parameter, the receiver makes a completely new copy of the string, so you get not just the implicit try-finally block, but also an OS call to allocate another copy of the string.

Anonymous said...

Um, is it just me, or is adopting a particular parameter passing practice to avoid problems with passing a "global" variable as a parameter just a little bit missing the point?

Don't pass as a parameter what you have presumably deliberately made available as a "global" and you won't have a problem.

Firm believer in "Fix the problem, and the problems caused by the problem will fix themselves".

:)

Anonymous said...

This is such a good point that I am amazed the compiler does not recognize when a string is not modified within a method and automatically treat it as a CONST.

Anonymous said...

Naughty Kiwi: Wow, I knew of the overhead when passing by value but never thought of your solution - to me it seems perfect.

The only reason I can think of not to do that is that changing the way the string is treated inside the function can then break the calling interface.

Perhaps Allen or Chris could elaborate?

Anonymous said...

If you perform a large amount of string processing (XML creation for instance) you can find that you get significant memory leaks from not using const for strings. Generally speaking, the memory manager does a fairly good job of clearing these up, but not always.

I've taken to using const habitually when passing strings into functions. I find it's rare that I want to manipulate the string within the method or pass back more than one string to the caller.

It's worth also noting that I found that performing lots of string concatenations like this by "adding" a string to another string can be pretty memory inefficient too. Where I have large amounts of text I will tend to use a string builder (basically a wrapped TStringList) to build my string - allocating the correct amount of space before concatenating. This seems to reduce the memory fragmentation effect.
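As an illustration of the "allocate the correct amount of space before concatenating" idea, here is a minimal sketch. It is not the TStringList-based builder described above; JoinPreallocated is a hypothetical helper that measures the total length, allocates the result once, and then copies each piece into place.

```pascal
program Prealloc;

{$APPTYPE CONSOLE}

// Hypothetical helper: builds the result with a single allocation
// instead of growing the string on every concatenation.
function JoinPreallocated(const Parts: array of string): string;
var
  I, Total, Offset: Integer;
begin
  // First pass: work out how much space is needed.
  Total := 0;
  for I := Low(Parts) to High(Parts) do
    Inc(Total, Length(Parts[I]));

  // Single allocation for the whole result.
  SetLength(Result, Total);

  // Second pass: copy each piece into its slot.
  Offset := 1;
  for I := Low(Parts) to High(Parts) do
    if Length(Parts[I]) > 0 then
    begin
      Move(Parts[I][1], Result[Offset], Length(Parts[I]) * SizeOf(Char));
      Inc(Offset, Length(Parts[I]));
    end;
end;

begin
  Writeln(JoinPreallocated(['<a>', 'text', '</a>'])); // <a>text</a>
end.
```

Note that the parts are passed as a const open array, in keeping with the advice in the post.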
