CTO does lazy load. Film at 11.

25 October 2007

Sometimes a CTO needs to get back to coding. Here's something that came up in an internal discussion.

Suppose we have this code:

    
    public class HeavyFoo {
      public HeavyFoo() {
        // big honking constructor
      }
      public void DoSomething() {
      }
    }
  
    public class UserOfHeavyFoo {
      private HeavyFoo heavyFoo;
      public UserOfHeavyFoo() {
        // don't want to create heavyFoo here
      }
      public void UseHeavyFoo() {
        if (heavyFoo == null) 
          heavyFoo = new HeavyFoo();
        heavyFoo.DoSomething();
      }
    }

We imagine we have this useful class called HeavyFoo. Unfortunately its constructor does some heavy processing, maybe by setting up some connection to somewhere, setting up a bunch of data, reading a file, or whatever. So we'd rather create a HeavyFoo object only when we really, really need it. This is exactly what the UserOfHeavyFoo.UseHeavyFoo() method does: if the internal field hasn't been initialized yet, go ahead and create the HeavyFoo instance. This pattern is known as the lazy load pattern.

All well and good, but it imposes a hidden contract on the developer of the UserOfHeavyFoo class: you can never "just" refer to the heavyFoo field, you always have to make sure that it's initialized beforehand.

So people traditionally refactor this code to something more like this, to use a read-only property:

  
    public class UserOfHeavyFoo {
      private HeavyFoo heavyFoo;
      private HeavyFoo HeavyFoo {
        get { 
          if (heavyFoo == null) 
            heavyFoo = new HeavyFoo();
          return heavyFoo;
        }
      }
      public UserOfHeavyFoo() {
        // don't want to create heavyFoo here
      }
      public void UseHeavyFoo() {
        HeavyFoo.DoSomething();
      }
    }

At a superficial level this seems to help, but we still have a hidden contract: when we write a new method, we should never use the heavyFoo field and always use the HeavyFoo property. If we don't, the code will still compile, but it may fail in weird ways at run-time depending on the order of execution of our new method vs. UseHeavyFoo(). Call the UseHeavyFoo() method first and everything works, call the new method first and you'll get a crash.

What we'd like to do is to make sure that we can't access the bare heavyFoo field. The only way to do that is to take it out of the UserOfHeavyFoo class and put it somewhere else. My current idea is along these lines:

  
    public class LazyLoad<T> where T : class, new() {
      private T item;
      public T Item {
        get {
          if (item == null)
            item = new T();
          return item;
        }
      }
    }
  
    public class UserOfHeavyFoo {
      private LazyLoad<HeavyFoo> heavyFoo;
      public UserOfHeavyFoo() {
        heavyFoo = new LazyLoad<HeavyFoo>();
      }
      public void UseHeavyFoo() {
        heavyFoo.Item.DoSomething();
      }
    }

This is much better in one way since we can no longer refer to an uninitialized HeavyFoo instance in our UserOfHeavyFoo class, but it looks a little awkward in another: we now have a double dereference when we want to do something. Also, the LazyLoad generic class can only deal with classes that have a constructor with no parameters; something I think is unlikely for heavy duty constructors. We can avoid both of these by having a specific lazy load class for HeavyFoo, but I can imagine that if we were to have a series of these lazy load classes, there would be a lot of code and behavior duplication.

So I'm still thinking about this one, although I'm beginning to think that HeavyFoo has something to do with the code smell.

Tags
12 comment(s)
Thorsten Engler

Give LazyLoad<T> a constructor which takes a delegate returning T, store that delegate in a private field. In the getter call that delegate to create the new instance (and set the delegate to null, you are no longer needing it afterwards and it might be keeping an object needlessly alive..)

When calling "new LazyLoad<HeavyFoo>" you can then use an anonymous method or a lambda expression  to call "new HeavyFoo(...)" with whatever parameters you want.

25 October, 2007
Robert Kozak

The LazyLoad<T> class does look interesting but it does suffer from the limitations you outlined.

I prefer the solution of using a private field with a read only property. The only implied contract, like you said, is that you don't access the private field directly. That's ok with me. Given a private field and a property that relate to the same thing it is always preferable to use the property. Properties are not supposed to introduce side effects so it is always best to ignore the private field.

Also in C# 3.0 we have Automatic properties so we  can specify a propery now without a explicit private field.

This to me points more to your second solution as the best one. The consumers of the class dont have access to the private fields (ignoring relection) so the only rule to follow is don't access your own private fields.

25 October, 2007
Robert Kozak

Spoke to soon about Automatic properties: Looks as though you need to have a getter and setter declared.

25 October, 2007
Julian Bucknall (DevExpress)

Thorsten

Now there's an idea. Nicely thought out. Gotta play with that one a little bit.

Cheers, Julian

25 October, 2007
Julian Bucknall (DevExpress)

Robert

Using that particular solution makes me yearn for a code analyzer (FXCop?) that would check for me that the property is always used and not the field, even internally to the class (apart from the property itself of course). That could be usable elsewhere of course, the use-only-the-property pattern is useful all over the place.

Cheers, Julian

25 October, 2007
Thorsten Engler

Actually, with that change it becomes more then just a "lazy load". What you get is basically a generic implementation of lazy evaluation with caching (the code behind the delegate isn't limited to just call a constructor, can be as complex as you wish).

As delegate you can use Func<T> from System.Linq. The check in the getter that decides if the delegate should be called or the already cached value returned directly should check the delegate for null (instead of the cached value) so that you can have a delegate which returns null as a valid value.

25 October, 2007
Thorsten Engler

This is working quite nicely and also gets rid of the double dereference:

using System;

using System.Collections.Generic;

using System.Linq;

using System.Text;

namespace ConsoleApplication2

{

   public class FuncCache<T>

   {

       private Func<T> func;

       private T cache;

       public FuncCache(Func<T> func)

       {

           this.func = func;

       }

       private T Value()

       {

           if (this.func != null)

           {

               this.cache = this.func();

               this.func = null;

           }

           return this.cache;

       }

       public static implicit operator Func<T>(FuncCache<T> fc)

       {

           return new Func<T>(fc.Value);

       }

   }

   public class FuncCache

   {

       public static FuncCache<T> New<T>(Func<T> func)

       {

           return new FuncCache<T>(func);

       }

   }

   public class HeavyFoo

   {

       public HeavyFoo()

       {

           // big honking constructor

           Console.WriteLine("HeavyFoo()");  

       }

       public void DoSomething()

       {

           Console.WriteLine("DoSomething()");

       }

   }

   public class UserOfHeavyFoo

   {

       private Func<HeavyFoo> heavyFoo;

       public UserOfHeavyFoo()

       {

           heavyFoo = FuncCache.New( () => new HeavyFoo() );

       }

       public void UseHeavyFoo()

       {

           heavyFoo().DoSomething();

       }

   }

   class Program

   {

       static void Main(string[] args)

       {

           UserOfHeavyFoo uohf = new UserOfHeavyFoo();

           Console.WriteLine("start...");

           uohf.UseHeavyFoo();

           uohf.UseHeavyFoo();

           Console.ReadLine();

       }

   }

}

25 October, 2007
Rollie Claro

anyone had attempted to use ObjectBuilder?

26 October, 2007
David Shannon

This is what I sometimes use in VB if I know the parameters (stuff and moreStuff) for the constructor before the fact:

   Private Function HeavyFoo() As HeavyFoo

       Static hiddenFoo As HeavyFoo = Nothing

       If hiddenFoo Is Nothing Then

           hiddenFoo = New HeavyFoo(stuff, moreStuff)

       End If

       Return hiddenFoo

   End Function

I'm not sure if there is a similar structure in C#, but this conceals hiddenFoo from the rest of the code.

26 October, 2007
Thorsten Engler

David,

I might be wrong (not very familar with VB..), but wouldn't that "Static" imply that there is only one such variable (for the complete program) as opposed to one per instance?

27 October, 2007
David Shannon

Thorsten,

Nope, not in VB.  In this context, "Static" just means that the variable holds its value between calls to the method.  For the purpose you describe -- class variables as opposed to instance variables -- VB uses the term "Shared."

28 October, 2007
David Shannon

BTW, I don't think C# has this "method-static" syntax.

I think they should modify C# so that it uses "Shared" for this purpose.  Then Static in C# would always mean Shared in VB, and vice-versa.

After that would could simply reverse the meanings of ">" and "<" in one of the two languages -- and maybe "+" and "-" for good measure -- and all would be come clear...

28 October, 2007

Please login or register to post comments.