Ε Γ И І И О: May 2020

"Tell us about a design pattern that you've used."

Probably one of the most asked questions in a programming interview. And the canned answer always happens to be "Singleton". Why? Because that's the easiest design pattern which you cannot go wrong with. Or is it?

If you don't remember what singleton code looks like, I don't blame you because I don't either. But after giving it a little bit of thought, you might be able to come up with this:
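The snippet itself isn't reproduced here, so what follows is only a minimal sketch of the kind of naive singleton most of us would scribble from memory. The class name `Singleton` and the `Instance` property are illustrative assumptions, not necessarily the original names.

```csharp
using System;

public sealed class Singleton
{
    // Holds the single instance; stays null until it is first requested.
    private static Singleton instance;

    // A private constructor stops outside code from calling new Singleton().
    private Singleton() { }

    public static Singleton Instance
    {
        get
        {
            // Not thread-safe: two threads can both see null here
            // and each end up creating its own instance.
            if (instance == null)
            {
                instance = new Singleton();
            }
            return instance;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // On a single thread, both reads return the same object.
        Console.WriteLine(object.ReferenceEquals(Singleton.Instance, Singleton.Instance)); // True
    }
}
```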

Well, true, it doesn't have all the bells and whistles like thread-safety and whatnot, but it's a good start. To be frank, I hate locks. They make the code look out of place. Maybe it's just me.

But I have to admit, thread syncing is crucial if you are serious about multi-threaded execution. And you might even want to use "double-checked locking" to favor performance. But what if we could get the same without using locks?
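Here is a rough sketch of a lock-free version, assuming the well-known static-initializer idiom. The names are illustrative, and the layout is deliberately arranged so that the field initializer sits on line 3 and the static constructor on line 6.

```csharp
public sealed class Singleton
{
    private static readonly Singleton instance = new Singleton();

    // Explicit (empty) static constructor; see the discussion of line 6.
    static Singleton() { }

    // A private constructor stops outside code from calling new Singleton().
    private Singleton() { }

    public static Singleton Instance => instance;
}

public static class Program
{
    public static void Main()
    {
        // Every read hands back the one instance created by the type initializer.
        System.Console.WriteLine(object.ReferenceEquals(Singleton.Instance, Singleton.Instance)); // True
    }
}
```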

Notice that the above code makes use of how static type initialization works. Static type initialization is guaranteed to happen only once per AppDomain, hence line 3 will be executed by the runtime only once, no matter how many threads ask for it.

So what's the caveat? Hmm, glad that you asked. Well, apparently you cannot guarantee when this initialization kicks off, so spawning of our singleton instance will not exactly be "lazy". Left to its own devices, the runtime is allowed to run the field initializer earlier than the first real use of the class. Yikes! Can we fix it?

That's why we've slapped a static constructor in line 6. Mind you, you could've written the same code without it and it would still work. But with that in place, the compiler no longer marks the type as beforefieldinit, so the generated IL fires the initializers in a more predictable fashion. Now the initialization happens only when you access a member of the class for the first time. But still, it's not ideal. At least in theory.

What if you have other static members in this class? They could get referenced elsewhere and your singleton instance would be spawned prematurely. It's a valid case, hypothetically. To circumvent this, you could add a nested static class only to hold your singleton instance and return it when needed. But that's overkill in my opinion. I'm pretty content with the above.

Ok, but can't we achieve this lazy behavior with something much simpler? Sure we can. Lazy<T> to the rescue!
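A minimal sketch of what that might look like, again with illustrative names. It relies on the `Lazy<T>(Func<T>)` constructor, whose default thread-safety mode is `LazyThreadSafetyMode.ExecutionAndPublication`.

```csharp
using System;

public sealed class Singleton
{
    // Lazy<T> defers construction until Value is first read. In its default
    // mode only one thread runs the factory; the rest wait for the result.
    private static readonly Lazy<Singleton> lazy =
        new Lazy<Singleton>(() => new Singleton());

    // A private constructor stops outside code from calling new Singleton().
    private Singleton() { }

    public static Singleton Instance => lazy.Value;
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(object.ReferenceEquals(Singleton.Instance, Singleton.Instance)); // True
    }
}
```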

This seems to be the most elegant solution of all. It has everything we tried to achieve: performance and laziness in one package. Have you coded your singletons like this? I have to confess that I have not. In fact, I've only used the style shown in the 2nd code snippet. But I'm looking forward to trying out the Lazy<T> implementation when I get my next chance. You should probably give it a shot too. Cheers!

PS: One more thing to note before wrapping up. By using locks or static initialization, you are only making your "singleton instance initialization" thread-safe. It doesn't magically make your other instance methods that do the real productive work thread-safe. You'll need to handle those case by case, if they are prone to be problematic in multi-threaded environments.
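To make that point concrete, here is a small sketch built around a hypothetical `HitCounter` singleton (the class and its members are made up for illustration). The instance is created safely, yet only the method that uses `Interlocked.Increment` is safe to call from many threads.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class HitCounter
{
    private static readonly Lazy<HitCounter> lazy =
        new Lazy<HitCounter>(() => new HitCounter());

    private int unsafeCount;
    private int safeCount;

    private HitCounter() { }

    public static HitCounter Instance => lazy.Value;

    // Plain read-modify-write: concurrent callers can lose updates.
    public void IncrementUnsafe() => unsafeCount++;

    // Atomic increment: safe to call from many threads at once.
    public void IncrementSafe() => Interlocked.Increment(ref safeCount);

    public int UnsafeCount => unsafeCount;
    public int SafeCount => safeCount;
}

public static class Program
{
    public static void Main()
    {
        Parallel.For(0, 100000, i =>
        {
            HitCounter.Instance.IncrementUnsafe();
            HitCounter.Instance.IncrementSafe();
        });

        Console.WriteLine(HitCounter.Instance.SafeCount);   // 100000
        Console.WriteLine(HitCounter.Instance.UnsafeCount); // may come out lower
    }
}
```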

Here's a simple code block. Try to figure out the possible output.
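The original block isn't reproduced here, so this is a reconstruction pieced together from the names used in the rest of the post (`Test1`, `Name`, `Test2`, `ModifyTest1`). The exact shape of the classes is an assumption.

```csharp
using System;

public class Test1
{
    public string Name { get; set; }

    public Test1(string name)
    {
        Name = name;
    }
}

public class Test2
{
    public void ModifyTest1(Test1 test1)
    {
        test1 = null;
    }
}

public static class Program
{
    public static void Main()
    {
        var test1 = new Test1("original");
        var test2 = new Test2();

        test2.ModifyTest1(test1);

        Console.WriteLine(test1.Name);
    }
}
```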

When you run this, test1.Name will still write "original" to the console. If you thought that accessing test1.Name after test1 was set to null would throw a NullReferenceException, then you definitely need to keep on reading :)

If you got the correct answer, then good for you, but you might also be wondering why I pulled out a silly question like this, whose outcome is fairly obvious. Well, it turns out that it's not so obvious to an untrained eye, no matter how many years that eye has been looking at code.

I had a somewhat heated argument about this with one of my colleagues, who happens to be a seasoned programmer. In the end, I had to type the above code into Visual Studio to prove my point. So I believe that this is a tricky area where most programmers tend to trip sooner or later. Hopefully, this post will set things straight.

Back to Basics

So what are value types and reference types? In .NET, value types derive from System.ValueType (which itself derives from System.Object), while reference types derive from System.Object directly or through other classes. Value types are generally stored on the stack when they are local variables, whereas reference type instances live on the managed heap. Examples of value types are int, char, DateTime, and any enum or struct, whereas string, delegates, interfaces and classes are examples of reference types.

You can think of a variable as a container. When the variable is of a value type, the variable's value is stored inside the container itself. But when the variable is of a reference type, what the container contains is not its actual value but a light-weight meta value pointing to a different place where you would find the actual value.
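A quick sketch of the container idea, using two made-up types (`PointValue` as a struct and `PointRef` as a class) that exist only for this illustration. Assigning one variable to another copies whatever is in the container: the value itself for the struct, only the pointer-like reference for the class.

```csharp
using System;

public struct PointValue { public int X; }   // value type
public class PointRef { public int X; }      // reference type

public static class Program
{
    public static void Main()
    {
        var a = new PointValue { X = 1 };
        var b = a;              // copies the value itself
        b.X = 2;
        Console.WriteLine(a.X); // 1 (a has its own copy)

        var c = new PointRef { X = 1 };
        var d = c;              // copies only the reference
        d.X = 2;
        Console.WriteLine(c.X); // 2 (c and d point to the same object)
    }
}
```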

The best analogy I can think of is the meta redirect tag on an HTML page. When you access this page's URL from a browser, some HTML would be rendered. But due to the meta redirection, what's rendered would be fetched from a different URL. The original page is the container of your reference type variable. However, when you access it, it will bring you the value from a different place.

I hope I didn't make it sound more confusing. I'm trying my best to explain without talking about pointers.

Ok, so if you've been following me thus far, you may still be scratching your head as to why on earth a reference type variable still holds its value even after being set to null. For that, you have to understand what happens when you pass variables around.

Pass by value & pass by reference

When you pass a variable to a method by value, you pass a copy of that variable. So whatever you do to that copy does not affect the original. But when you pass by reference, you work with the very same variable and therefore all changes are sort of "global".

In the above example, what you saw is an example of a variable passed by value.

Wait, what? Isn't that a reference type variable? Don't they get passed only by reference? You may ask.

Well, that's what you used to believe because whenever you manipulate an object's properties via a passed in variable, the original object retains those changes throughout. But if that was passed by value as I say, how could those changes persist? Yes, it's highly confusing when the phrase "objects are passed by reference" is already baked into your head.

Back to basics. Go back and read how I explained variables using the container analogy. When you pass a reference type variable to a method by value, as shown in the example above, a copy of the variable is passed in, just like for any value type. In this case, what do we have inside the variable's container? A meta value pointing to a different place.

Was a copy of the value it points to created? No. Then? Only a copy of the variable with its content was created. A reference type variable does not contain its value inside it, so now we just have two variables pointing to the same actual value. What happens when we set this new copy to null? Does it change what it pointed to earlier? No. Just like when you change the meta redirect tag in an HTML page. Just because it redirects to a new URL now, it doesn't magically delete the HTML page which it redirected to earlier. That page will still continue to exist.

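If you want to change the original content of a variable from inside a method which it was passed into, you need to pass it by reference using the ref or out keyword. (The in keyword also passes by reference, but read-only, so the method cannot reassign the variable.)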

Woah, wait. Then how did my code work all this time? I happen to mutate my objects all over the place without a problem, not worrying about how I was passing them in.

Err... well, yeah, that's probably because most of the time, if not all the time, what you did was the dot (.) dance on your objects. Back in the day, we had to explicitly say "go fetch the data" using the arrow (->) notation when dealing with pointers, so this confusion was not commonplace, I suppose. But now, dotting on a reference type variable or on a value type variable (think of a struct) works more or less the same way, thanks to the compiler.

Retrospect

If you take the above example and change the line inside ModifyTest1 method to test1.Name = "fake"; the output would print "fake" instead of "original". Because as soon as you do test1.(something) it applies to the destination object instance. But when you do test1 = null; or even test1 = new Test1("fake"); for that matter, you are basically changing your variable's (container's) content, not what its previous content was pointing at. Since the calling code still has a test1 variable pointing to the original content, setting the copy of the test1 variable to point to null or to a different instance by the ModifyTest1 method does not affect the original content.
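The difference between the two cases can be sketched side by side. The helper methods `Rename` and `Replace` are hypothetical names introduced only for this illustration; `Test1` is assumed to look like the class used earlier.

```csharp
using System;

public class Test1
{
    public string Name { get; set; }
    public Test1(string name) { Name = name; }
}

public static class Program
{
    // Follows the copied reference and changes the shared object.
    public static void Rename(Test1 test1) { test1.Name = "fake"; }

    // Only repoints the local copy; the caller's variable is untouched.
    public static void Replace(Test1 test1) { test1 = new Test1("fake"); }

    public static void Main()
    {
        var a = new Test1("original");

        Replace(a);
        Console.WriteLine(a.Name); // original

        Rename(a);
        Console.WriteLine(a.Name); // fake
    }
}
```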

Running the above code would have resulted in a NullReferenceException if the method signature happened to be ModifyTest1(ref Test1 test1) and the call was made as test2.ModifyTest1(ref test1);. And that's the same way you occasionally pass your value type variables by reference (remember int.TryParse() and its out parameter?).
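A sketch of the ref variant, assuming the same `Test1` and `Test2` shapes as before. It checks for null instead of dereferencing, so the program finishes cleanly rather than crashing, and it ends with the `int.TryParse` call mentioned above.

```csharp
using System;

public class Test1
{
    public string Name { get; set; }
    public Test1(string name) { Name = name; }
}

public class Test2
{
    // With ref, test1 is an alias for the caller's variable, not a copy of it.
    public void ModifyTest1(ref Test1 test1)
    {
        test1 = null;
    }
}

public static class Program
{
    public static void Main()
    {
        var test1 = new Test1("original");
        var test2 = new Test2();

        test2.ModifyTest1(ref test1);

        // The caller's variable itself is now null, so reading test1.Name
        // at this point would throw a NullReferenceException.
        Console.WriteLine(test1 == null); // True

        // out works the same way: TryParse writes into the caller's variable.
        if (int.TryParse("42", out int number))
        {
            Console.WriteLine(number); // 42
        }
    }
}
```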

I sincerely hope I haven't driven you completely nuts with this explanation. This is just my understanding of how value types and reference types work when you pass them around. I wish I could cover how the string type behaves like a value type even though it's a reference type, but this post has become too long already, so maybe that's for another day. Cheers!

Sunday, May 10, 2020

Singleton: Are you doing it right?

Saturday, May 2, 2020

Demystifying pass by value & pass by reference
