Saturday, May 2, 2020

Demystifying pass by value & pass by reference

Here's a simple code block. Try to figure out the possible output.
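
The listing itself is missing from this copy of the post. A minimal reconstruction, pieced together from the names used in the rest of the post (Test1, its Name property, Test2, and its ModifyTest1 method), might look like the sketch below. The exact shape of the original classes is an assumption.

```csharp
using System;

// Reconstructed from the surrounding text; the original listing may have differed.
public class Test1
{
    public string Name { get; set; }

    public Test1(string name)
    {
        Name = name;
    }
}

public class Test2
{
    // Note the plain parameter: no ref, out, or in keyword.
    public void ModifyTest1(Test1 test1)
    {
        test1 = null;
    }
}

public static class Program
{
    public static void Main()
    {
        var test1 = new Test1("original");
        var test2 = new Test2();

        test2.ModifyTest1(test1);

        // What does this line do?
        Console.WriteLine(test1.Name);
    }
}
```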


When you run this, test1.Name still writes "original" to the console. If you thought that accessing test1.Name after test1 was set to null would throw a NullReferenceException, then you definitely need to keep on reading :)

If you got the correct answer, good for you, but you might also be wondering why I pulled out a silly question like this, whose outcome is fairly obvious. Well, it turns out that it's not so obvious to an untrained eye, no matter how many years that eye has spent looking at code.

I had a somewhat heated argument about this behavior with one of my colleagues, who happens to be a seasoned programmer. In the end, I had to type the above code into Visual Studio to prove my point. So I believe this is a tricky area where most programmers trip sooner or later. Hopefully, this post will set things straight.


Back to Basics
So what are value types and reference types? In .NET, every type ultimately derives from System.Object; value types derive from System.ValueType, which itself derives from System.Object. Local variables of a value type are generally stored on the stack, while instances of reference types are stored on the managed heap. Examples of value types are int, char, DateTime, and any enum or struct, whereas string, delegates, interfaces, and classes are examples of reference types.

You can think of a variable as a container. When the variable is of a value type, the variable's value is stored inside the container itself. But when the variable is of a reference type, what the container contains is not its actual value but a light-weight meta value pointing to a different place where you would find the actual value.
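
To make the container picture concrete, here is a small sketch. PointValue and PointRef are hypothetical types invented for this illustration; they do not come from the example above. Copying a value type variable copies the data, while copying a reference type variable copies only the pointer-like value inside the container.

```csharp
using System;

// Hypothetical types for illustration only.
public struct PointValue   // value type: the variable holds the data itself
{
    public int X;
}

public class PointRef      // reference type: the variable holds a reference to the data
{
    public int X;
}

public static class ContainerDemo
{
    public static void Main()
    {
        var a = new PointValue { X = 1 };
        var b = a;               // copies the data; a and b are independent
        b.X = 99;
        Console.WriteLine(a.X);  // 1

        var c = new PointRef { X = 1 };
        var d = c;               // copies the reference; c and d share one object
        d.X = 99;
        Console.WriteLine(c.X);  // 99
    }
}
```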

The best analogy I can think of is the meta redirect tag on an HTML page. When you access this page's URL from a browser, some HTML would be rendered. But due to the meta redirection, what's rendered would be fetched from a different URL. The original page is the container of your reference type variable. However, when you access it, it will bring you the value from a different place.

I hope I didn't make it sound more confusing. I'm trying my best to explain without talking about pointers.

OK, so if you've been following me thus far, you may still be scratching your head over why on earth a reference type variable still holds its value even after being set to null. To see why, you have to understand what happens when you pass variables around.


Pass by value & pass by reference
When you pass a variable to a method by value, you pass a copy of that variable, so whatever you do to the copy does not affect the original. But when you pass by reference, the method works with the very same variable as the caller, and therefore all changes are sort of "global".

In the above example, what you saw is an example of a variable passed by value.

Wait, what? Isn't that a reference type variable? Don't they get passed only by reference? You may ask.

Well, that's what you used to believe because whenever you manipulate an object's properties via a passed in variable, the original object retains those changes throughout. But if that was passed by value as I say, how could those changes persist? Yes, it's highly confusing when the phrase "objects are passed by reference" is already baked into your head.

Back to basics. Go back and read how I explained variables using the container analogy. When you pass a reference type variable to a method by value, as shown in the example above, a copy of the variable is passed in, just like for any value type. In this case, what do we have inside the variable's container? A meta value pointing to a different place.

Was a copy of the value it points to created? No. What was created, then? Only a copy of the variable, along with its content. A reference type variable does not contain its value inside it, so now we just have two variables pointing to the same actual value. What happens when we set this new copy to null? Does it change what it pointed to earlier? No. It's just like changing the meta redirect tag in an HTML page: just because the page redirects to a new URL now, the HTML page it redirected to earlier is not magically deleted. That page continues to exist.
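
The same thing can be seen without any method call at all. In this sketch, Page is a hypothetical class standing in for "the actual value", and object.ReferenceEquals is used to check that both variables point to one object before one of them is emptied.

```csharp
using System;

// Hypothetical class standing in for "the actual value".
public class Page
{
    public string Html = "<h1>Hello</h1>";
}

public static class AliasDemo
{
    public static void Main()
    {
        var original = new Page();
        var copy = original;   // a second container holding the same reference

        // One object, two variables pointing at it.
        Console.WriteLine(object.ReferenceEquals(original, copy)); // True

        copy = null;           // empties only the second container

        Console.WriteLine(original.Html); // <h1>Hello</h1>
        Console.WriteLine(copy == null);  // True
    }
}
```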

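If you want to change the original content of a variable from inside a method it was passed into, you need to pass it in by reference using the ref or out keyword. (The in keyword also passes by reference, but the method can only read the variable, not reassign it.)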
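
Here is a rough sketch of how the three keywords differ. The method names Reset, Create, and Length are made up for this illustration.

```csharp
using System;

public static class RefKeywordsDemo
{
    // ref: the method works on the caller's variable and may read or replace it.
    public static void Reset(ref string text)
    {
        text = null;
    }

    // out: the method must assign the caller's variable before returning.
    public static void Create(out string text)
    {
        text = "created";
    }

    // in: passed by reference but read-only inside the method.
    public static int Length(in string text)
    {
        // text = null; // would not compile: an in parameter cannot be reassigned
        return text.Length;
    }

    public static void Main()
    {
        string a = "original";
        Reset(ref a);
        Console.WriteLine(a == null);     // True

        Create(out string b);
        Console.WriteLine(b);             // created

        Console.WriteLine(Length("abc")); // 3
    }
}
```

Note that ref and out must be written at the call site too, which makes it visible to the reader that the method may change the caller's variable.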

Whoa, wait. Then how did my code work all this time? I happily mutate my objects all over the place without a problem, not worrying about how I was passing them in.

Err... well, yeah, that's probably because most of the time, if not all of the time, all you did was the dot (.) dance on your objects. Back in the day (think C and C++ pointers), we had to explicitly say "go fetch the data" using the arrow (->) notation when dealing with pointer variables, so this confusion was not as commonplace, I suppose. But now, dotting on a reference type variable or on a value type variable (think of a struct) looks more or less the same, thanks to the compiler.

Retrospect
If you take the above example and change the line inside the ModifyTest1 method to test1.Name = "fake"; the output would be "fake" instead of "original", because as soon as you do test1.(something), it applies to the destination object instance. But when you do test1 = null; or even test1 = new Test1("fake"); for that matter, you are changing your variable's (container's) content, not what its previous content was pointing at. Since the calling code still has a test1 variable pointing to the original content, having the ModifyTest1 method set its copy of the test1 variable to null or to a different instance does not affect the original content.
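
The three cases side by side, as a sketch. Test1 is declared here the way the post describes it (a Name property and a constructor taking the name), and the helper methods Mutate, Replace, and Clear are hypothetical names for the three variants of the method body.

```csharp
using System;

public class Test1
{
    public string Name { get; set; }

    public Test1(string name)
    {
        Name = name;
    }
}

public static class RetrospectDemo
{
    // Follows the reference and changes the object both variables point to.
    public static void Mutate(Test1 test1) { test1.Name = "fake"; }

    // Re-points only the method's own copy of the variable.
    public static void Replace(Test1 test1) { test1 = new Test1("fake"); }

    // Empties only the method's own copy of the variable.
    public static void Clear(Test1 test1) { test1 = null; }

    public static void Main()
    {
        var test1 = new Test1("original");

        Replace(test1);
        Console.WriteLine(test1.Name); // original

        Clear(test1);
        Console.WriteLine(test1.Name); // original

        Mutate(test1);
        Console.WriteLine(test1.Name); // fake
    }
}
```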

Running the above code would have resulted in a NullReferenceException if the method signature happened to be ModifyTest1(ref Test1 test1) and the call was made as test2.ModifyTest1(ref test1);. And that's the same way you occasionally pass value type variables by reference (remember int.TryParse() and its out parameter?).
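
A sketch of that ref variant follows, with the Test1 and Test2 classes declared as the post describes them. The try/catch is only there so the demo can report the exception instead of crashing.

```csharp
using System;

public class Test1
{
    public string Name { get; set; }

    public Test1(string name)
    {
        Name = name;
    }
}

public class Test2
{
    // With ref, test1 is the caller's variable itself, not a copy of it.
    public void ModifyTest1(ref Test1 test1)
    {
        test1 = null;
    }
}

public static class RefDemo
{
    public static void Main()
    {
        var test1 = new Test1("original");
        var test2 = new Test2();

        test2.ModifyTest1(ref test1);

        try
        {
            Console.WriteLine(test1.Name);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("test1 is null now");
        }

        // int is a value type; out lets TryParse write into the caller's variable.
        if (int.TryParse("42", out int number))
        {
            Console.WriteLine(number); // 42
        }
    }
}
```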

I sincerely hope that I have not driven you completely nuts with this explanation. This is just my understanding of how value types and reference types work when you pass them around. I wish I could cover how the string type behaves like a value type even though it's a reference type, but this post has become too long already, so maybe that's for another day. Cheers!
