VBA Class Modules: gateway to SOLID code

Core contributor to the Rubberduck project, co-author of Microsoft Access in a Sharepoint World (2011), Professional Access 2013 Development (2013), and Effective SQL: 61 Specific Ways to Write Better SQL (2016), 10-times Microsoft Access MVP award recipient (2009-2019), Ben Clothier wrote a paper about class modules and OOP that makes a great on-topic addition to this blog. Enjoy!


Introduction

There are popular misconceptions surrounding VBA and object-oriented programming (OOP), usually in 2 forms:

  1. VBA isn’t really OOP, so you can’t really use OOP principles with VBA
  2. OOP makes things too complicated; procedural programming is all you need anyway

Both are incorrect because OOP is not a language feature but rather a principle of how we should design our code. In modern programming languages and IDE, there are features that makes it easier to apply & enforce the principles. In the end, it is still up to us, the programmers, to actually apply & enforce the principle. Thus, the language of choice has no bearing on whether you can apply the principles of OOP or not. If you are still skeptical, consider that the C programming language predated the development of OOP but there is a demonstration on writing OOP code in C.

The 2nd objection is commonly raised because when looking at the OOP code, it can seem frustrating because it refers to several other objects and you find yourselves looking at more files in order to see what a program does. Coming from a procedural mindset, that can feel like you’re dealing with several layers of lasagna. That does require a change in how you perceive the code.

That is the goal of this article, to help you discover how applying OOP principle can help you write better code, not just for VBA but for any programming language. If you’ve worked on a complex project, you might have had an episode where when you fixed a bug in one spot, 2 new bugs appeared in 2 unrelated places. Surely, you’d find that quite frustrating, taking out all the fun in the programming. Procedural design enables you to solve business problems quickly so that you can get on with other stuff. However, what if it’s so successful, that they come back for more; asking you for more features? How many changes do you have to make? With procedural programming, the upkeep is cumulative; first few feature requests are easy and put in action quickly. Next few, it takes more time and more tweaking. Some more, then it feels a bit harder and harder. But coding should not be like that! Adding a new feature should not scale on an exponential scale! That is what the OOP promises; by keeping a clean codebase, it is easy to describe the new feature and integrate it into the codebase with minimum change.

In fact, most programmers nowadays should be emphasizing writing refactor-friendly code. What do we mean by refactor-friendly? Basically, it is a codebase that is easy to change because you are able to change only pieces that actually needs to change and no more than that. That is very difficult to do in a purely procedural system. For long time, refactoring has not been something on average VBA developers’ mind because there were no tools to refactor VBA. Rubberduck exists to provide those tools. To get the most from refactoring, we do need to raise our level of code writing and apply good design to our VBA codebase.

The other important aspect to learn is that we want to make the wrong code look obviously and blatantly wrong. I highly encourage you to also read Joel Spolsky’s article on that subject. His article deals with the Hungarian notations, but we want to go beyond just the naming conventions. Taking up on OOP principles can significantly help us with making wrong code look wrong which means it becomes easier for us to fix the code. You’ve probably had to deal with a giant hundreds-line procedure with the great wall of declarations and deeply nested code and had the thrill of debugging it and cursing while your minor change cascades into something catastrophic. Well, there’s a better way!

The article assumes that you are familiar with VBA and procedural programming but otherwise have never or rarely used classes or interfaces. It further assumes that you might have had heard of object-oriented programming but otherwise are unfamiliar with the design and use of such. Also, it assumes that you are familiar and comfortable with using code-behinds and events in the document modules. (e.g. Excel’s workbook or worksheet, Access’ form or reports, Word’s document, etc.) We will build up on the class design and eventually apply OOP principles in designing our classes. To reinforce the concepts, we will do a build-up starting with familiar approach and transforming it into a clean codebase that is very refactor-friendly. The benefit is that you end up with a codebase that is easy to read, understand and maintain. Because this assumes you are familiar with procedural procedure (e.g. writing small functions or routines that perform a complex task by breaking it down into small steps), we need to provide a good transition from procedural mindset to object-oriented mindset. For that reason, we will take a route around the town instead of a direct route. I believe the indirect route will be beneficial in seeing what we would accomplish with a clean object-oriented codebase. This is not intended to be an exhaustive treatise but rather provide enough of fundamentals for you to see the advantages of the OOP principles in a VBA codebase.

We will start with creating custom types, and doing work with it, then use it as a basis for our first class.

Creating your custom types

You may have already used a user-defined type (UDT), which is a convenient way to create a structure of closely related properties together. You may have used it before especially if you’ve ever had to use certain API functions via the Declare statements. Let’s start with a Person UDT. We can create a new standard module and define a UDT within the module:

Public Type Person
  FirstName As String
  LastName As String
  BirthDate As Date
End Type
'figure 2-1: a Person UDT

The UDT provides us with 3 members that tells us something about a Person; namely the first & last name and the birth date. Obviously, we can have more but we want the example to stay simple. The calling code to use a Person UDT could look something like this:

Public Sub Test()
  Dim p1 As Person
  Dim p2 As Person
  
  p1.FirstName = "John"
  p1.LastName = "Smith"
  p1.BirthDate = #1970-01-01#
  
  p2.FirstName = "Jane"
  p2.LastName = "Doe"
  p2.BirthDate = #1970-01-01#

  Debug.Print VarPtr(p1), VarPtr(p2)
End Sub
'figure 2-2: testing code for using a Person UDT

This should demonstrate clearly that with a UDT, we could create several “instances”, which are independent of one other. Our setting of p2.FirstName does not affect the p1.FirstName. Note the last line printing out the VarPtr(p1) and VarPtr(p2). This prints out the variable’s memory address, which demonstrates that the p1 and p2 variables resides in a different region of memory and thus do not share anything. As an exercise, you can check the VarPtr for members of p1 and p2. For example, you could look at VarPtr(p1.FirstName) or VarPtr(p2.LastName) and compare to their respective counterpart.

In typical cases, we might want to have a collection of persons so that we can work with them in bulk or something similar. You might have done this but using a worksheet or a database table as the backing data structure. There is one crucial difference between a UDT and a database table or worksheet; the UDT is always ethereal, resides in the memory whereas the same data saved into a worksheet, a database table, or XML file are persisted and requires a specific method to read or write data to those source.

One thing about a UDT is that it cannot have any methods. We’ll start with creating a new person. In the above code example, we declared a Person variable for each object we needed, but we can do better than that. Let’s have a procedure that returns a new Person instead, as demonstrated below:

Public Function Create( _
  FirstName As String, _
  LastName As String, _
  BirthDate As Date _
) As Person
  Dim NewPerson As Person
  
  NewPerson.FirstName = FirstName
  NewPerson.LastName = LastName
  NewPerson.BirthDate = BirthDate
  
  Create = NewPerson
End Function
'figure 2-3: a Create function for a Person UDT

Now the calling code looks like this:

Public Sub Test()
  Dim p1 As Person
  Dim p2 As Person
  
  p1 = Create("John", "Smith", #1970-01-01#)
  p2 = Create("Jane", "Doe", #1970-01-01#)

  Debug.Print VarPtr(p1), VarPtr(p2)
End Sub
'figure 2-4: calling code using a Create function to create Person UDTs

Much more compact code, yes? More importantly, when we read the code, it is easy to understand what it is doing because we separate the mechanics of the creation from the current context which just needs something created without knowing the particular details in the act of creation.

But it doesn’t have to be just about the creation. Let’s say we want to provide name change. Perhaps because Ms. Doe got married and is now Mrs. Holly. We could then write a new function to help us:

Public Function ChangeLastName( _
  Person As Person, _
  NewLastName As String _
) As String
  ChangeLastName = Person.LastName
  Person.LastName = NewLastName
End Function
'figure 2-5: a ChangeLastName function to mutate a Person UDT

Thus, we could have our calling code do the name change:

Public Sub Test()
  Dim p1 As Person
  Dim p2 As Person
  
  p1 = Create("John", "Smith", #1970-01-01#)
  p2 = Create("Jane", "Doe", #1970-01-01#)

  Debug.Print VarPtr(p1), VarPtr(p2)

  Dim OldName As String
  OldName = ChangeLastName(p2, "Holly")
  
  Debug.Print OldName, p2.LastName
  Debug.Print VarPtr(p1), VarPtr(p2)
End Sub
'figure 2-6: calling code using ChangeLastName on a Person UDT.

The calling code demonstrates that when we change the last name for Ms. Doe to Mrs. Holly, the variable p2 is still the same; only its content has changed. If you are now wondering why we couldn’t have just assigned the UDT member directly instead of calling ChangeLastName, that’s exactly one of the problems we face with using UDTs:

p2.LastName = "Holly" 'Overwriting the original "Doe" entry
'figure 2-7: bypassing the ChangeLastName function by writing directly to the UDT member.

There is no way for us to control the access. VBA does not allow us to create an UDT that cannot be edited once created. That is often referred to being “immutable”. So, when we pass around UDT, we are always trusting that everyone will follow the same convention we build around the type. However, as human beings, we excel at being inconsistent and forgetful, so it’s too easy to fail to follow the convention, especially because the compiler won’t care whether you do a p2.LastName = "Holly" or ChangeLastName(p2, "Holly"). Both are legal syntax; but we don’t want it to be that way.

If you are wondering why we should want to control access and require use of a ChangeLastName instead, consider that in a typical business process, nothing stays the same for very long. What may have been true yesterday may be no longer true today. To stick to our example, we could suppose that we have a requirement that name change must be approved and is restricted to only those who are 18 years or older, based on the birth date. If we directly set the LastName property, there’s no guarantee that the checks have been enforced. We can write down a sticky note “Use ChangeLastName”, but that won’t be enforced by the compiler. One key to writing a clean code base is to have the compiler do as much work as possible in telling you that some certain action is verboten.

To recap what we’ve learned so far. We’ve seen how we can define a custom user-defined type to group a set of closely related properties. We saw that the UDT can be instantiated multiple times, enabling us to juggle more than one instances of same type at the same time. We wrote some procedures that interacts with the UDT to compensate for the shortcomings of the UDT such as making creation easy or managing some sensitive change such as changing person’s last name which may have additional constraints beyond just the code itself. We saw that an UDT does not really do a good job of managing the access to properties, which requires us to follow conventions that are not enforced by the compiler, which can make the coding around an UDT highly prone to errors or omissions.


Creating our first class module

With all that information we’ve learned, we now have enough working knowledge to start creating a class. Let’s get started by creating a new class module. We’ll call it “Person”. We create a class module via the VBIDE’s toolbar and choosing the Class Module command.

Figure 3-1: “Insert” dropdown menu from the VBIDE main toolbar

The very first thing we want to do with our first class is to define the private data it needs to have to work correctly. We could start with nothing but public fields, like this:

Public FirstName As String
Public LastName As String
Public BirthDate As Date
'figure 3-2: initial class design with fields only.

The class module is probably still unnamed and may have a default name of Class1. To provide it with a name, you can fill in the name via the Properties toolwindow:

Figure 3-3: Specifying a name for a newly created class module

However, this is no better than the original UDT we started with in Figure 2‑1. We would still have the same problem with controlling the access. For example, we might not want to allow arbitrary changes to LastName but rather control it via a dedicated ChangeLastName procedure. We could use Property statements instead. If you’ve never used one before, they are a way to provide a procedural access to a member of the data structure, which grants us additional control on how the property may be accessed. We could revise the code accordingly:

Private mFirstName As String
Private mLastName As String
Private mBirthDate As Date

Public Property Get FirstName() As String
  FirstName = mFirstName
End Property

Public Property Let FirstName(NewValue As String)
  mFirstName = FirstName
End Property

Public Property Get LastName() As String
  LastName = mLastName
End Property

Public Property Let LastName(NewValue As String)
  mLastName = LastName
End Property

Public Property Get BirthDate() As String
  BirthDate = mBirthDate
End Property

Public Property Let BirthDate(NewValue As String)
  mBirthDate = BirthDate
End Property
'figure 3-4: a class design using Property statements instead of public fields.

This is still functionally equivalent to the original version of class and the UDT; all members are readable & writable. However, because property is a procedure, we can add additional logic beyond just setting a field. A typical example might be to require validation such as the code in Figure 3‑5:

Public Property Let FirstName(NewValue As String)
  If Len(NewValue) = 0 Then
    Err.Raise 5, Description:="First name cannot be blank"
  Else
    mFirstName = NewValue
  End If
End Property 
'figure 3-5: an example of a property procedure with validation enforced.

Note that we had to define 3 private fields (mFirstName, mLastName, and mBirthDate). We had to use prefixes because we don’t want name collisions with the public-facing properties of same name. However, this is problematic for two reasons:

  1. All private fields are now sorted together in the IntelliSense, requiring you to type in a “m” to locate the module level field. This becomes annoying when you have other public members that might start with the letter “M” but will now mingle with those various backing fields. That hampers the discoverability of the code.
  2. Because of intermingling, you can’t tell quickly whether a public member LastName has a backing field or not; you’d need to look in two different places to make that determination. While the examples above have shown us properties that provide data, we could create a get-only property that is calculated. For instance, we can create an Age property that is calculated based off the mBirthDate backing field rather than having its own mAge backing field.

We can do better! Let’s use a UDT instead of a bunch of fields. Since we are improving upon the original UDT, we will start with a UDT, but this time we’ll make it Private, and we will only need a single instance of it. We’ll also take the opportunity to make all properties read-only by not providing a Property Let like we did in the original example:

Private Type TPerson
  FirstName As String
  LastName As String
  BirthDate As Date
End Type
Private This As TPerson

Public Property Get FirstName() As String
  FirstName = This.FirstName
End Property

Public Property Get LastName() As String
  LastName = This.LastName
End Property

Public Property Get BirthDate() As String
  BirthDate = This.BirthDate
End Property
'figure 3-6: a class using an UDT as its backing field for several properties.

With this approach, we have only a single module-level variable, called This. This enables us to get a nicely filtered IntelliSense listing only the backing field, which can be now the same name as the public member and this is now much easier to enforce with compiler’s help.

This is obviously an improvement, but we now have no way of setting the data to the This instance. Let’s add a procedure to do just that:

Public Sub FillData( _
  FirstName As String, _
  LastName As String, _
  BirthDate As Date _
) 
  If Len(This.FirstName) = 0 Then
    This.FirstName = FirstName
    This.LastName = LastName
    This.BirthDate = BirthDate
  End If
End Sub
'figure 3-7: a FillData procedure to write to the private data of the class.

Now we can create the persons with this revised calling code:

Public Sub Test()
  Dim p1 As Person
  Dim p2 As Person
  
  Set p1 = New Person
  Set p2 = New Person
  
  p1.FillData "John", "Smith", #1970-01-01#
  p2.FillData "Jane", "Doe", #1970-01-01#

  Debug.Print VarPtr(p1), VarPtr(p2)
End Sub
'figure 3-8: revised testing code using the Person class.

Hopefully this illustrates how much cleaner the code is. More importantly, because the fields in the class are now private and can be only set via the FillData method, and exposed as read-only, we make it possible to leverage the compiler to help us enforce guarantees about the access to those fields. That becomes important in a more complex class where we need to be able to make safe assumptions about the class’ internal state. We would not have that with an UDT.

We also have a validation check that This.FirstName is not already a zero-length string, and throwing a runtime error to prevent erroneous use of the FillData procedure. However, this is a runtime validation, rather than compile-time validation. We want the compiler to do the work for us.

Can we do that? Absolutely! We will look at how we can achieve this using interfaces next to help us hide the methods.


Controlling access to methods via interfaces

Above, you saw how we could use a class module to protect the internal data structure and thus control the access to the data, which helps us write code that we can verify at the compile time to do the correct thing. However, we still need to deal with the methods themselves. As noted, we needed to create a FillData method to write data to the internal state. As it is, it would expose the procedure to all consumers and there’s nothing preventing them from inappropriately calling it. We can use a runtime validation but the objective of this paper is to convert as much errors we can from run-time to compile-time. So we need to do something about the FillData method. We want to basically hide the FillData member once the instance has been created. How do we do that? With interfaces. What are interfaces in VBA? They’re actually just class modules. VBA does not actually make a semantic distinction between a class and interface. To further muddy the water, all VBA classes also have a default interface – that is, the class itself (i.e. its Public members). To keep it simple, we will say that an interface is basically a VBA class module with no code. Here’s how we will set up our IPerson interface:

Public Property Get FirstName() As String
End Property

Public Property Get LastName() As String
End Property

Public Property Get BirthDate() As String
End Property

Public Function ChangeLastName(NewLastName As String) As String
End Function
'figure 4-1: IPerson interface.

You might note that this has similar properties like we saw in Figure 3‑6, with the addition of a modified version of ChangeLastName we saw in Figure 2‑5. More importantly, the FillData method is not present on the IPerson interface. By itself, it does not do much because there is no implementation for the interface. We will now make the Person class implement the IPerson interface. This is done with the Implements statement:

Implements IPerson

Private Type TPerson
  FirstName As String
  LastName As String
  BirthDate As Date
End Type
Private This As TPerson

Private Property Get IPerson_FirstName() As String
  IPerson_FirstName = This.FirstName
End Property

Private Property Get IPerson_LastName() As String
  IPerson_LastName = This.LastName
End Property

Private Property Get IPerson_BirthDate() As String
  IPerson_BirthDate = This.BirthDate
End Property

Private Function IPerson_ChangeLastName(NewLastName As String) As String
  IPerson_ChangeLastName = This.LastName
  This.LastName = NewLastName
End Function

Public Sub FillData( _
  FirstName As String, _
  LastName As String, _
  BirthDate As Date _
)
  If Len(This.FirstName) = 0 Then
    This.FirstName = FirstName
    This.LastName = LastName
    This.BirthDate = BirthDate
  End If
End Sub
'figure 4-2: Person class implementing the IPerson interface.

If you compare the original class in figure 3‑6 with the code above, you should note the following differences:

  • The members are now Private rather than Public.
  • The members now have the prefix IPerson_. You might have seen similar setup with event handlers. Obviously this is special in the sense that if you have an IPerson variable, you are able to access the implementation even though it’s Private because the IPerson interface defines the member as Public (see figure 4‑1).
  • The FillData procedure is not on the IPerson interface, but is still Public and doesn’t have an IPerson_ prefix.

Let’s set up some testing code to demonstrate that we’ve in fact hidden the FillData procedure:

Public Sub Test()
  Dim p As Person
  Set p = New Person
  p.FillData "John", "Doe", #1970-01-01#

  Dim i As IPerson
  Set i = p 'We can assign a Person to IPerson because of Implements
  
  i.FillData "Invalid", "Invalid", #9999-12-31#
End Sub
'figure 4-3: testing code to demonstrate that FillData on a IPerson variable is disallowed at compile-time.

Once we’ve set up the code above, we should try to compile the code. This should yield a compile-time error, like this:

Figure 4-4: Compile error: method or data member not found.

This is an immense improvement over the original code from the Figure 3‑7, which would only be enforced at run-time, not at compile-time. By writing code that we can get the compiler to aid us in checking, we reduce the likelihood of introducing bugs due to an incorrect use of methods.

However, you might be wondering if we have to create a Person variable, what’s there to stop us from accidentally using a Person variable when we should be using an IPerson variable? Indeed, the testing code in figure 4‑3 is suboptimal. Ideally we would have a method that will provide us with an IPerson variable so that we don’t actually need to create the Person variable at all. Thus, we need to learn about creating a factory, and also learn about separation of concerns.


Factory Design Pattern

We saw how we can use interfaces to hide the methods that shouldn’t be available to consumers but we also saw that it does little good if we create the implementation then cast it in the same routine. We need to be able to separate the creation so that we will get the interface, rather than creating the implementation ourselves. By separating the two concerns, or the scope of work, we achieve cleaner code. To that end, we will need to create a factory. There are few different ways we can create a factory, with their pros and cons. Because this article intends to build up on the lessons we learned, we will start with the most simple possible way to implement a factory.

One way to create a factory in VBA is to create a standard module and treat it like a class. By using a standard module in this manner, we avoid the need to create an instance of the factory itself and keep things straightforward. Let’s create a PersonFactory module:

Public Function Create( _
  FirstName As String, _
  LastName As String, _
  BirthDate As Date _
) As IPerson
  Dim NewPerson As Person
  Set NewPerson = New Person
  NewPerson.FillData FirstName, LastName, BirthDate
  Set Create = NewPerson
End Function
'figure 5-1: PersonFactory module.

We can then revise the calling code from figure 4‑3 to look like this:

Public Sub Test()
  Dim p As IPerson
  Set p = PersonFactory.Create("John", "Doe", #1970-01-01#)
End Sub 
'figure 5-2: revised calling code using a PersonFactory module.

Note that we now only have an IPerson variable; we don’t even need to know that it’s the Person class that is the implementation for the variable.  More importantly, we are able to hide the members of Person class:

Figure 5-3: the object is only exposing the IPerson members, not the Person members.

At this point, we’ve successfully hidden the FillData member and the changes we have introduced makes it much easier for us to analyze our codebase and make wrong code look obviously wrong. You may be wondering what’s stopping us from simply creating a new Person variable and casting to it. The answer is actually nothing, really. However, the same answer applies to preventing us from making a Private procedure a Public one or promoting a local variable to a global one: we know to not make a private procedure public because that breaks the encapsulation and makes for more messy codebase. By the same token, we want to become comfortable using abstractions (e.g. the interfaces) as a perfectly normal way to work with classes where we need to hide the implementation details from the consumers.

You might have noticed that in the figure 5-3, we used the syntax PersonFactory.Create(). A common misconception regarding naming of procedures in VBA is that they cannot be the same name when they are Public and in a standard module. Thus, a common approach would be to name the method something like CreatePerson, which is a OK name. However, trying to come up with unique names for everything can quickly become a hassle. However, there is nothing preventing us from creating a Public Create() As IPerson in a PersonFactory module and a Public Create() As IWidget in a WidgetFactory module! By qualifying our calls to the Create() methods with the module name, it immediately becomes obvious what we are using to create our objects and without the overhead of creating yet another class. This ought to illustrate why semantic naming matters much more than using notations which only add more noise without making wrong code look wrong.

As already has been mentioned, there are different ways we can create factories. It could be an actual class that you could allow the use of New on. It could be a member on the predeclared class module. It depends on what you need. The key takeaway here is that by separating the concern of creating objects from the code that consumes them, we are able to make good use of interfaces to hide the members of the implementations.


Conclusion

You’ve learned how to create a class module and apply some of the good design principle including encapsulating and separating the concerns. You’ve also learned how you can use interfaces to control the access to members that shouldn’t be used by the calling code promiscuously. All those combine up to enable us to write a codebase that compiler is able to validate and enforce for us. Having the compiler do the work for us means we have less work when we review and try to understand what the code is doing. You’ve seen that does require more objects. We’ve had to create 2 class modules, 1 standard module and another standard module to do the testing/calling. With the original UDT approach outlined in section 2, it might be possible to do all in one module. A number of VBA developers may feel that keeping everything in one module makes for easier porting of code from one project to another. They may also feel that protecting the internals is easier to do with Private methods. However, as you saw, sometimes that is simply not possible using a UDT which need public methods to provide “API” around it. Also, too many private members within a module usually makes the code untestable and brittle. Finally, using a Worksheet, UserForm, or an Access form as the “class” only encourages highly coupled code that itself cannot be verified at compile-time. It is possible to set up a convention of doing things but if the convention can’t be enforced by the compiler, we would have do additional work in analyzing the code to ensure it is doing what it is intended. We will analyze those issues in a future post.

By applying good OOP design principles, we are able to make it easier on ourselves to write verifiable and testable code. The concern with managing the number of modules and being able to easily refactor code is actually a problem with the IDE, not with the language. For that reason, Rubberduck exists to alleviate those shortcomings and make it easier to apply good OOP design principles to your VBA code.


Code Name: Sheet1

There are so many ways to get ahold of a Worksheet reference: you can dereference it from a Sheets collection, and even then you need to decide whether that’ll be off Workbook.Sheets or Workbook.Worksheets, two properties that both return a Sheets collection that will contain the worksheet you’re looking for. The Workbook might be the ActiveWorkbook, or it could be some object variable that was assigned earlier, with the result of Workbooks.Open. Or you might like living on the edge, and activate the Window that has some path/filename as a caption, and then work off the ActiveWorkbook. Every single one of these cases have a thing in common: the Workbook involved isn’t necessarily ThisWorkbook.

ActiveWorkbook vs. ThisWorkbook

In Excel, only one single Workbook is ever the ActiveWorkbook at any given point in time. If all workbooks are closed, then ActiveWorkbook will be Nothing (add-ins in particular, need to mind this). When a workbook is activated, it fires an Activate event; if another workbook was active before that, then that workbook fired a Deactivate event.

The ActiveWorkbook can change in the middle of a loop that uses a DoEvents statement to keep Excel responsive, because the user clicked somewhere and that click was allowed to be handled, because Excel remains responsive: if the user can interact with Excel, you can never assume what ActiveWorkbook is referring to – it can be literally any workbook, or none at all. And after the next instruction it might be something else.

For all these reasons, ActiveWorkbook and ActiveSheet are object you will want to capture into a local variable at the beginning of whatever it is that you need to do, and then use that variable and never refer to ActiveSheetexplicitly or not, for the rest of that procedure. For example instead of this:

Public Sub DoSomething()
    ActiveSheet.Range("A1").Value = 42
    ActiveSheet.Range("A2").Value = VBA.DateTime.Date
End Sub

You’d do that:

Public Sub DoSomething()
    Dim sheet As Worksheet
    Set sheet = ActiveSheet
    sheet.Range("A1").Value = 42
    sheet.Range("A2").Value = VBA.DateTime.Date
End Sub

Of course that’s just an example: if I had to write such a small procedure in real code, I’d skip the local variable and have a With block withold the object reference for me – note the . dereferencing operator qualifying the Range member calls:

Public Sub DoSomething()
    With ActiveSheet
        .Range("A1").Value = 42
        .Range("A2").Value = VBA.DateTIme.Date
    End With
End Sub

This would be very, very different:

Public Sub DoSomething()
    With ActiveSheet
        Range("A1").Value = 42
        Range("A2").Value = VBA.DateTIme.Date
    End With
End Sub

Note the missing dereferencing . operator now: the With ActiveSheet block variable is never actually accessed here. So what sheet is it that these Range member calls are referring to? If that code is written anywhere other than in some worksheet module, then they’re implicitly referring to ActiveSheet. If that same identical code is written in some worksheet module (say, Sheet1), then it refers to that sheet (that’s Me, aka Sheet1).

Implicit qualifiers are evil: they strip vital context out of the code, and suddenly you need to do more than just read the code to understand what’s going on. If you’re going to be referring to ActiveSheet, you might as well be explicit about it.

So what’s ThisWorkbook then? In a word, it’s the host document: the Excel workbook in which your VBA project is hosted. ThisWorkbook always refers specifically to this host document, even if your VBA project is an add-in. Maybe it’s the ActiveWorkbook. Maybe it isn’t.

A very common mistake, is to treat the worksheets of ThisWorkbook like the worksheets of any other workbook (active or not).

Compile-Time, Run-Time

Another common mistake, is to treat worksheets of ThisWorkbook that already exist in ThisWorkbook.Worksheets at compile-time, the same way you’d treat worksheets that only come into existence at run-time.

If the sheet is already in the workbook when your VBA project is in design mode, then at compile-time a project-scope automagic Workbook variable (constant?) exists, named after the (Name) property of the module:

The “Name” property (bottom) is the sheet tab caption that the user can modify as they please; users don’t even get to see the “(Name)” property (top) unless they bring up the VBE.

By default the code name of the first sheet of an empty workbook, is Sheet1, same as its Name property value. When you do this:

Dim sheet As Worksheet
Set sheet = ThisWorkbook.Worksheets("Sheet1")
sheet.Range("A1").Value = 42

You are using this Name property, …and if a user renames the sheet, the statements suddenly starts raising run-time error 9 subscript out of range.

But if you gave the (Name) property a nice meaningful identifier name, say SummarySheet, then you could do this instead:

SummarySheet.Range("A1").Value = 42

SummarySheet is a programmatic identifier that is much harder to tamper with than the sheet tab’s caption, if you’re the worksheet’s end user.

You can’t use worksheets’ code names to access any other sheets than those that exist in ThisWorkbook at compile-time, so a really good habit to take early on, is to name things. Leave ThisWorkbook alone, but name every worksheet module in your project. And then use these names whenever you can: these worksheets are part of your VBA project, they never need to be obtained from a Sheets collection.

In fact, Set sheet = Sheets("Sheet1") is at best a missed opportunity, when the “Sheet1” in question exists in ThisWorkbook. At worst, it’s an outright bug… and that’s the reasoning behind Rubberduck’s sheet accessed using string inspection.

Document Modules

The first time you discovered the Visual Basic Editor and visualized the VBA project in the Project Explorer, when you first met ThisWorkbook and Sheet1, document modules were the world: your baby steps in VBA were very likely in the code-behind of ThisWorkbook or Sheet1 – document modules are the gateway to VBA-land.

In the VBIDE Extensibility object model, the modules are accessed via the VBComponents property of a VBProject object, and the VBComponent.Type property has a value of vbext_ComponentType.vbext_ct_Document (an enum member value) for both ThisWorkbook and Sheet1 modules, but would be vbext_ct_StdModule for a standard module like Module1. Clearly if document modules are class modules, there’s enough “special” about them that they need to be treated differently by the VBA compiler.

Document?

VBA itself is host-agnostic: whether it’s hosted in Excel, PowerPoint, AutoCAD, Sage, Corel Draw, or any other one of over 200 licensed hosts, it’s the same VBA (there’s no such thing as “Excel VBA”, really), and it doesn’t have any intrinsic knowledge of what an Excel.Worksheet is – only the Excel library does; when your VBA project is hosted in Excel, then the Excel type library is automatically added to (and locked; you can’t remove it from) your project. Document modules are a special kind of module that the VBIDE treats as part of the VBA project, but it can’t add or remove them: If you want to add or remove a Worksheet module from a VBA project, you need to actually add or remove a worksheet from the host Excel workbook.

So, other than we can’t directly add such components to a VBA project without involving the host application’s object model, what’s special about them?

They’re in the VBA project, because the host application decided that’s what’s in a VBA project hosted in – here Excel, but each host application gets to decide whether a project includes a module to represent the host document, and under what circumstances to add or remove other types of modules, like Worksheet, or Chart.

Document modules can’t be instantiated or destroyed, so there’s no Initialize or Terminate event handler for them. However, you get to create a handler for any or every one of many convenient events that the host application fires at various times and that the authors of the object model deemed relevant to expose as programmatic extensibility points. For ThisWorkbook, this includes events like Workbook.Open and Workbook.NewSheet, for example. For Worksheet modules, that’s worksheet-specific events like Worksheet.SelectionChange or Worksheet.Change, that you can use to “hook” custom code when such or such thing happens in the host document.

ThisWorkbook

Your VBA project is hosted inside an Excel document. In the VBA project, the host document is referred to with an identifier, and that identifier is (by default anyway, and renaming it is not something I’d recommend doing) ThisWorkbook.

The Excel workbook that contains the current VBA project: that’s ThisWorkbook. When you’re in the code-behind of that module, you’re extending a Workbook object: if you type Me., the VBE’s IntelliSense/autocomplete list will display all the public members you’d find on any other Workbook object, plus any Public (explicitly or not) member. That’s what’s special about a document module: it literally inherits members from another class, as in inheritance – something VBA user code cannot do. Isn’t it fascinating that, under the hood, Visual Basic for Applications apparently has no problem with class inheritance? Something similar happens with UserForm code: the UserForm1 class inherits the members of any other UserForm, “for free”. And of course every Sheet1 inherits the members of every other Worksheet in the world.

So, procedures you write in a document module, should logically be very closely related to that particular document. And because host-agnostic logic can’t add/remove these modules, you’ll want to have as little code as possible in them – and then as a bonus, your VBA project becomes easier to keep under source control, because the code is in modules that VBE add-ins (wink wink) are able to properly import back in and synchronize to & from the file system.


What about ActiveWorkbook?

ActiveWorkbook refers to the one and only workbook that is currently active in the Excel Application instance, which may or may not be ThisWorkbook / the host document. It’s easy to confuse the two, and even easier to write code that assumes one is the other: the macro recorder does it, many documentation examples and Stack Overflow answers do it too. But reliable code is code that makes as few assumptions as possible – sooner or later, built-in assumptions are broken, and you’re faced with an apparently intermittent error 1004 that sometimes happens when you’re debugging and stepping through the code, and it happened to a user once or twice but the problem always seemed to vaporize just because you showed up at the user’s cubicle and stood there watching as nothing blew up and everything went fine. *Shrug*, right?


Accessing Worksheets

You shouldn’t be dereferencing worksheets all the time. In fact, you rarely even need to. But when you do, it’s important to do it right, and for the right reasons. The first thing you need to think of, is whether the sheet exists in ThisWorkbook at compile-time. Meaning, it’s there in the host document, you can modify it in Excel and there’s a document module for it in the VBA project.

That’s the first thing you need to think about, because if the answer to that is “yep, it’s right there and it’s called Sheet3“, then you already have your Worksheet object and there’s no need to dereference it from any Sheets or Worksheets collection!

Dim sheet As Worksheet
Set sheet = ThisWorkbook.Worksheets("Sheet1") '<~ bad if Sheet1 exists at compile-time!

Set sheet = Sheet1 '<~ bad because redundant: the variable obfuscates the target!
sheet.Range("A1").Value = 42 '<~ bad if sheet is a local variable, but good if a parameter

Sheet1.Range("A1").Value = 42 '<~ most reliable way to refer to a worksheet

The magic Sheet1 identifier comes from the (Name) property of the Sheet1 document module under ThisWorkbook in the VBA project: set that property to a valid and meaningful name for that specific worksheet, and you have a user-proof way to refer to your ConfigurationSheet, the SummarySheet, and that DataSheet. If the user decides to rename the DataSheet to “Data (OLD)” for some reason, this code is now broken:

ThisWorkbook.Worksheets("Data").Range("A1").Value = 42

Meanwhile this code will survive any user-induced sheet-name tampering:

DataSheet.Range("A1").Value = 42

Wait, is it .Sheets() or .Worksheets()?

The first thing to note, is that they aren’t language-level keywords, but member calls. If you don’t qualify them, then in the ThisWorkbook module you’ll be referring to Me.Worksheets (i.e. ThisWorkbook.Worksheets), and anywhere else in the VBA project that same code be implicitly referring to ActiveWorkbook.Worksheets: that’s why it’s important to properly qualify member calls. Worksheets is a member of a Workbook object, so you explicitly qualify it with a Workbook object.

Now, Sheets and Worksheets both return an Excel.Sheets collection object, whose default member Item returns an Object. Both are happy to take a string with a sheet name, or an integer with a sheet index. Both will be unhappy (enough to raise runtime error 9 “subscript out of range”) with an argument that refers to a sheet that’s not in the (implicitly or not) qualifying workbook object. Both will return a Sheets collection object if you give it an array of sheet names: that’s one reason why the Item member returns an Object and not a Worksheet. Another reason is that sometimes a sheet is a Chart, not a Worksheet.

Use the Worksheets collection to retrieve Worksheet items; the Sheets collection contains all sheets in the qualifying workbook, regardless of the type, so use it e.g. to retrieve the Chart object for a chart sheet. Both are equivalent, but Worksheets is semantically more specific and should be preferred over Sheets for the common Worksheet-dereferencing scenarios.


Dereferencing Workbooks

If you only need to work with ThisWorkbook, then you don’t need to worry about any of this. But as soon as your code starts opening other workbooks and manipulating sheets that are in these other workbooks, you need to either go nuts over what workbook is currently the ActiveWorkbook as you Activate workbooks and repeatedly go Workbooks("foo.xlsm").Activate…. or properly keep a reference to the objects you’re dealing with.

When you open another workbook with Application.Workbooks.Open, that Open is a function, a member of the Excel.Workbooks class that returns a Workbook object reference if it successfully opens the file.

Workbooks.Open is also side-effecting: successfully opening a workbook makes that workbook the new ActiveWorkbook, and so global state is affected by its execution.

When you then go and work off ActiveWorkbook or unqualified Worksheets(...) member calls, you are writing code that is heavily reliant on the side effects of a function, and global state in general.

The right thing to do, is to capture the function’s return value, and store the object reference in a local variable:

Dim book As Workbook
Set book = Application.Workbooks.Open(path) '<~ global-scope side effects are irrelevant!

Dim dataSheet As Worksheet
Set dataSheet = book.Worksheets("DATA")

If a workbook was opened by your VBA code, then your VBA code has no reason to not have a Workbook reference to that object.

So when is ActiveWorkbook useful then?

As an argument to a procedure that takes a Workbook parameter because it doesn’t care what workbook it’s working with, or when you need to assign a Workbook object variable (presumably a WithEvents module-scope private variable in some class module) to whatever workbook is currently active. With few specific exceptions, that’s all.

ActiveWorkbook.Whatever is just not code that you normally want to see anywhere.


Cha-cha-cha-Chaining Calls

Strive to keep the VBA compiler aware of everything that’s going on. Moreover if all the code you write is early-bound, then Rubberduck understands it as completely as it can. But implicit late binding is alas too easy to introduce, and the primary cause for it is chained member calls:

book.Worksheets("Sheet1").Range("A1").Value = 42 '<~ late bound at .Range

Everything after the Worksheets("Sheet1") call is late-bound, because as described above, Excel.Sheets.Item returns an Object, and member calls against Object can only ever be resolved at run-time.

By introducing a Worksheet variable to collect the Object, we cast it to a usable compile-time interface (that’s Worksheet), and now the VBA compiler can resume validating the .Range member call:

Dim sheet As Worksheet
Set sheet = book.Worksheets("Sheet1")
sheet.Range("A1").Value = 42 '<~ all early bound

Chained early-bound member calls are fine: the compiler will be able to validate the Range.Value member call, because the Excel.Worksheet.Range property getter returns a Range reference. If it returned Object, we would have to declare a Range local variable to capture the Excel.Range object we want to work with, like this:

Dim sheet As Worksheet
Set sheet = book.Worksheets("Sheet1") '<~ good: casts Object into Worksheet.

Dim cell As Range
Set cell = sheet.Range("A1") '<~ redundant: sheet.Range("A1") returns an early-bound Range.
cell.Value = 42 '<~ early-bound, but sheet.Range("A1").Value = 42 would also be early-bound.

Avoid declaring extraneous variables, but never hesitate to use a local variable to turn an Object into a compile-time type that gives you IntelliSense, autocompletion, and parameter quick-info: you’ll avoid accidentally running into run-time error 438 for typos Option Explicit can’t save you from. Using the compiler to validate everything it can validate, is a very good idea.

If you need to repeatedly invoke members off an early-bound object, introducing a local variable reduces the dereferencing and helps make the code feel less repetitive:

sheet.Range("A1").Value = "Rubberduck"
sheet.Range("A1").Font.Bold = True
sheet.Range("A1").Font.Size = 72
sheet.Range("A1").Font.Name = "Showcard Gothic"

By introducing a local variable, we reduce the cognitive load and no longer repeatedly dereference the same identical Range object pointer every time:

Dim cell As Range
Set cell = sheet.Range("A1")

cell.Value = "Rubberduck"
cell.Font.Bold = True
cell.Font.Size = 72
cell.Font.Name = "Showcard Gothic"

Arguably, nested With blocks could hold all the object references involved, and reduces the dereferencing to a strict minimum (.Font is only invoked once, and the reference is witheld), but it’s very debatable whether it enhances or hurts readability:

With sheet.Range("A1")
    .Value = "Rubberduck"
    With .Font
        .Bold = True
        .Size = 72
        .Name = "Showcard Gothic"
    End With
End With

Avoiding nested With blocks seems a fair compromise:

Dim cell As Range
Set cell = sheet.Range("A1")

cell.Value = "Rubberduck"
With cell.Font
    .Bold = True
    .Size = 72
    .Name = "Showcard Gothic"
End With

All this is fairly subjective, of course, and really applies to everything you ever write in VBA (not just when coding against the Excel object model), but any of it is better than this (very cleaned-up) macro-recorder code:

    Range("A1").Select
    ActiveCell.Value = "Rubberduck"
    With Selection.Font
        .Name = "Showcard Gothic"
        .Bold = True
        .Size = 72
    End With

We note the implicit ActiveSheet reference with the implicitly-qualified Range member call; we note the use of Range.Select followed by a use of ActiveCell; we note a With block holding a late-bound reference to Range.Font through the Selection object, and the compiler gets to validate absolutely nothing inside that With block.

The macro recorder doesn’t declare local variables: instead, it Selects things and then works late-bound against the Selection. That’s why it’s a bad teacher: while it’s very useful to show us what members to use to accomplish something, it does everything without leveraging any compile-time checks, and teaches to Activate the sheet you want to work with so that your unqualified Range and Cells member calls can work off the correct sheet… but now you know why, how, and when to dereference a Worksheet object into a local variable, you don’t need any Select and Activate!

The Macro Recorder Curse

The macro recorder is a wonderful thing. It’s one of the tools at your disposal to assist you in your journey, a good way to observe code that does exactly what you just did, and learn what parts of the object model to use for doing what. The problems begin when you see macro recorder code constantly invoking Range.Select and Worksheet.Activate, working against the Selection object, generating dozens of repetitive statements and redundant code that can easily trick a neophyte into thinking “so that’s how it’s done!” – the macro recorder is a great way to familiarize with a number of APIs, …and that’s all it needs to be.

There are very few ways to write more inefficient and bug-prone code, than to use the macro recorder’s output as a model of how VBA code should be written. How to avoid Select and Activate has to be the single most linked-to post in the VBA tag on Stack Overflow, yet an untold number of young souls remain to be saved from the curse of the macro recorder.

Of course, we have all the tools we need to defeat that curse. I’m not going to repeat everything in that very good SO thread, but the crux of it boils down to, in my opinion, a few simple things.

Early Binding and Guard Clauses

From an automation standpoint, Selection is an interesting object. In the Excel object model, Selection is a Shape that’s selected, the Chart you just clicked on, the Range of cells you navigate to. If the current selection is relevant to your code, consider making it an input with an explicit object type: the selection you’re expecting very likely has a very specific type, like a Range. Simple example:

Public Sub MyMacro()
    Selection.Value = 42 'multiple possible errors
End Sub

If Selection is pulled from that code and taken in as a Range parameter instead, we eliminate all ambiguities and restore the natural balance of the universe by coding against the Range interface rather than against Object – which means compile-time validation and IntelliSense:

Public Sub MyMacro()
    If Not TypeOf Selection Is Excel.Range Then Exit Sub '<~ that's a *guard clause*
    DoSomething Selection
End Sub

Private Sub DoSomething(ByVal target As Range)
    target.Value = 42
End Sub

Note the similarities between MyMacro in the first snippet, and DoSomething in the second one – it’s what they do differently that makes… all the difference. Now the procedure can work with any Range object, whether it’s actually selected or not.

Working with Selection is never really needed: what you can do against Selection you can do with any Range if what you mean to work with is a Range, or any Chart if what you mean to work with is a Chart.

It might look like it’s more code, more complicated to write – it might even be. But binding these types at compile-time makes things much simpler, really. When we make a member call against Object, the compiler doesn’t even care that the member exists. This involves overhead at run-time, and a non-zero chance of error 438 being raised when the member does not exist. Like Variant, Object is very flexible… too much for its own good.

A member call against Selection is inherently late-bound. Exactly like dynamic in C#, you want to break out of it, as soon as possible: if you’re expecting a Range, declare a Range and run with it, be explicit. In turn, you’ll be rewarded with VBA blowing up with a type mismatch error (13) early on, rather than with some object doesn’t support property or method error (438), possibly far removed from where the problem really stems from – the instruction that wrongly assumed a Selection could be treated like any old Range:

Public Sub MyMacro()
    Dim cell As Range
    Set cell = Selection '<~ type mismatch if Selection isn't a Range
End Sub

The macro recorder will never generate a single control flow statement. No loops, no conditionals, no variables, no structure. If you take something that’s tedious, and make a computer do it tediously for you, you may be “getting the job done”, but you’re 1) being very mean to your computer, and 2) you could easily be misusing the object model in ways that make it inefficient. The best way to tell a computer to do something 20 times isn’t to tell it 20 times to do one thing!

By simply introducing/declaring variables to hold the Worksheet and Range objects we’re working with, we eliminate the need to Select and Activate everything we touch, and Selection becomes useless, for the most part – if your macro is kicked off with a keyboard shortcut and works with whatever is selected at that time, more power to you – but that doesn’t mean your entire code needs to work with Selection, only that you need to validate what’s selected and switch to early binding as early as possible, using properly typed local variables.

Use Range.Select when you need to programmatically visually select cells in the worksheet, for the user to see that selection being made and later perhaps interacted with.