Core contributor to the Rubberduck project, co-author of Microsoft Access in a Sharepoint World (2011), Professional Access 2013 Development (2013), and Effective SQL: 61 Specific Ways to Write Better SQL (2016), 10-times Microsoft Access MVP award recipient (2009-2019), Ben Clothier wrote a paper about class modules and OOP that makes a great on-topic addition to this blog. Enjoy!
There are popular misconceptions surrounding VBA and object-oriented programming (OOP), usually in 2 forms:
- VBA isn’t really OOP, so you can’t really use OOP principles with VBA
- OOP makes things too complicated; procedural programming is all you need anyway
Both are incorrect because OOP is not a language feature but rather a principle of how we should design our code. In modern programming languages and IDE, there are features that makes it easier to apply & enforce the principles. In the end, it is still up to us, the programmers, to actually apply & enforce the principle. Thus, the language of choice has no bearing on whether you can apply the principles of OOP or not. If you are still skeptical, consider that the C programming language predated the development of OOP but there is a demonstration on writing OOP code in C.
The 2nd objection is commonly raised because when looking at the OOP code, it can seem frustrating because it refers to several other objects and you find yourselves looking at more files in order to see what a program does. Coming from a procedural mindset, that can feel like you’re dealing with several layers of lasagna. That does require a change in how you perceive the code.
That is the goal of this article, to help you discover how applying OOP principle can help you write better code, not just for VBA but for any programming language. If you’ve worked on a complex project, you might have had an episode where when you fixed a bug in one spot, 2 new bugs appeared in 2 unrelated places. Surely, you’d find that quite frustrating, taking out all the fun in the programming. Procedural design enables you to solve business problems quickly so that you can get on with other stuff. However, what if it’s so successful, that they come back for more; asking you for more features? How many changes do you have to make? With procedural programming, the upkeep is cumulative; first few feature requests are easy and put in action quickly. Next few, it takes more time and more tweaking. Some more, then it feels a bit harder and harder. But coding should not be like that! Adding a new feature should not scale on an exponential scale! That is what the OOP promises; by keeping a clean codebase, it is easy to describe the new feature and integrate it into the codebase with minimum change.
In fact, most programmers nowadays should be emphasizing writing refactor-friendly code. What do we mean by refactor-friendly? Basically, it is a codebase that is easy to change because you are able to change only pieces that actually needs to change and no more than that. That is very difficult to do in a purely procedural system. For long time, refactoring has not been something on average VBA developers’ mind because there were no tools to refactor VBA. Rubberduck exists to provide those tools. To get the most from refactoring, we do need to raise our level of code writing and apply good design to our VBA codebase.
The other important aspect to learn is that we want to make the wrong code look obviously and blatantly wrong. I highly encourage you to also read Joel Spolsky’s article on that subject. His article deals with the Hungarian notations, but we want to go beyond just the naming conventions. Taking up on OOP principles can significantly help us with making wrong code look wrong which means it becomes easier for us to fix the code. You’ve probably had to deal with a giant hundreds-line procedure with the great wall of declarations and deeply nested code and had the thrill of debugging it and cursing while your minor change cascades into something catastrophic. Well, there’s a better way!
The article assumes that you are familiar with VBA and procedural programming but otherwise have never or rarely used classes or interfaces. It further assumes that you might have had heard of object-oriented programming but otherwise are unfamiliar with the design and use of such. Also, it assumes that you are familiar and comfortable with using code-behinds and events in the document modules. (e.g. Excel’s workbook or worksheet, Access’ form or reports, Word’s document, etc.) We will build up on the class design and eventually apply OOP principles in designing our classes. To reinforce the concepts, we will do a build-up starting with familiar approach and transforming it into a clean codebase that is very refactor-friendly. The benefit is that you end up with a codebase that is easy to read, understand and maintain. Because this assumes you are familiar with procedural procedure (e.g. writing small functions or routines that perform a complex task by breaking it down into small steps), we need to provide a good transition from procedural mindset to object-oriented mindset. For that reason, we will take a route around the town instead of a direct route. I believe the indirect route will be beneficial in seeing what we would accomplish with a clean object-oriented codebase. This is not intended to be an exhaustive treatise but rather provide enough of fundamentals for you to see the advantages of the OOP principles in a VBA codebase.
We will start with creating custom types, and doing work with it, then use it as a basis for our first class.
Creating your custom types
You may have already used a user-defined type (UDT), which is a convenient way to create a structure of closely related properties together. You may have used it before especially if you’ve ever had to use certain API functions via the Declare statements. Let’s start with a Person UDT. We can create a new standard module and define a UDT within the module:
Public Type Person FirstName As String LastName As String BirthDate As Date End Type 'figure 2-1: a Person UDT
The UDT provides us with 3 members that tells us something about a Person; namely the first & last name and the birth date. Obviously, we can have more but we want the example to stay simple. The calling code to use a Person UDT could look something like this:
Public Sub Test() Dim p1 As Person Dim p2 As Person p1.FirstName = "John" p1.LastName = "Smith" p1.BirthDate = #1970-01-01# p2.FirstName = "Jane" p2.LastName = "Doe" p2.BirthDate = #1970-01-01# Debug.Print VarPtr(p1), VarPtr(p2) End Sub 'figure 2-2: testing code for using a Person UDT
This should demonstrate clearly that with a UDT, we could create several “instances”, which are independent of one other. Our setting of
p2.FirstName does not affect the
p1.FirstName. Note the last line printing out the
VarPtr(p2). This prints out the variable’s memory address, which demonstrates that the
p2 variables resides in a different region of memory and thus do not share anything. As an exercise, you can check the
VarPtr for members of
p2. For example, you could look at
VarPtr(p2.LastName) and compare to their respective counterpart.
In typical cases, we might want to have a collection of persons so that we can work with them in bulk or something similar. You might have done this but using a worksheet or a database table as the backing data structure. There is one crucial difference between a UDT and a database table or worksheet; the UDT is always ethereal, resides in the memory whereas the same data saved into a worksheet, a database table, or XML file are persisted and requires a specific method to read or write data to those source.
One thing about a UDT is that it cannot have any methods. We’ll start with creating a new person. In the above code example, we declared a
Person variable for each object we needed, but we can do better than that. Let’s have a procedure that returns a new
Person instead, as demonstrated below:
Public Function Create( _ FirstName As String, _ LastName As String, _ BirthDate As Date _ ) As Person Dim NewPerson As Person NewPerson.FirstName = FirstName NewPerson.LastName = LastName NewPerson.BirthDate = BirthDate Create = NewPerson End Function 'figure 2-3: a Create function for a Person UDT
Now the calling code looks like this:
Public Sub Test() Dim p1 As Person Dim p2 As Person p1 = Create("John", "Smith", #1970-01-01#) p2 = Create("Jane", "Doe", #1970-01-01#) Debug.Print VarPtr(p1), VarPtr(p2) End Sub 'figure 2-4: calling code using a Create function to create Person UDTs
Much more compact code, yes? More importantly, when we read the code, it is easy to understand what it is doing because we separate the mechanics of the creation from the current context which just needs something created without knowing the particular details in the act of creation.
But it doesn’t have to be just about the creation. Let’s say we want to provide name change. Perhaps because Ms. Doe got married and is now Mrs. Holly. We could then write a new function to help us:
Public Function ChangeLastName( _ Person As Person, _ NewLastName As String _ ) As String ChangeLastName = Person.LastName Person.LastName = NewLastName End Function 'figure 2-5: a ChangeLastName function to mutate a Person UDT
Thus, we could have our calling code do the name change:
Public Sub Test() Dim p1 As Person Dim p2 As Person p1 = Create("John", "Smith", #1970-01-01#) p2 = Create("Jane", "Doe", #1970-01-01#) Debug.Print VarPtr(p1), VarPtr(p2) Dim OldName As String OldName = ChangeLastName(p2, "Holly") Debug.Print OldName, p2.LastName Debug.Print VarPtr(p1), VarPtr(p2) End Sub 'figure 2-6: calling code using ChangeLastName on a Person UDT.
The calling code demonstrates that when we change the last name for Ms. Doe to Mrs. Holly, the variable
p2 is still the same; only its content has changed. If you are now wondering why we couldn’t have just assigned the UDT member directly instead of calling
ChangeLastName, that’s exactly one of the problems we face with using UDTs:
p2.LastName = "Holly" 'Overwriting the original "Doe" entry 'figure 2-7: bypassing the ChangeLastName function by writing directly to the UDT member.
There is no way for us to control the access. VBA does not allow us to create an UDT that cannot be edited once created. That is often referred to being “immutable”. So, when we pass around UDT, we are always trusting that everyone will follow the same convention we build around the type. However, as human beings, we excel at being inconsistent and forgetful, so it’s too easy to fail to follow the convention, especially because the compiler won’t care whether you do a
p2.LastName = "Holly" or
ChangeLastName(p2, "Holly"). Both are legal syntax; but we don’t want it to be that way.
If you are wondering why we should want to control access and require use of a
ChangeLastName instead, consider that in a typical business process, nothing stays the same for very long. What may have been true yesterday may be no longer true today. To stick to our example, we could suppose that we have a requirement that name change must be approved and is restricted to only those who are 18 years or older, based on the birth date. If we directly set the
LastName property, there’s no guarantee that the checks have been enforced. We can write down a sticky note “Use ChangeLastName”, but that won’t be enforced by the compiler. One key to writing a clean code base is to have the compiler do as much work as possible in telling you that some certain action is verboten.
To recap what we’ve learned so far. We’ve seen how we can define a custom user-defined type to group a set of closely related properties. We saw that the UDT can be instantiated multiple times, enabling us to juggle more than one instances of same type at the same time. We wrote some procedures that interacts with the UDT to compensate for the shortcomings of the UDT such as making creation easy or managing some sensitive change such as changing person’s last name which may have additional constraints beyond just the code itself. We saw that an UDT does not really do a good job of managing the access to properties, which requires us to follow conventions that are not enforced by the compiler, which can make the coding around an UDT highly prone to errors or omissions.
Creating our first class module
With all that information we’ve learned, we now have enough working knowledge to start creating a class. Let’s get started by creating a new class module. We’ll call it “Person”. We create a class module via the VBIDE’s toolbar and choosing the Class Module command.
The very first thing we want to do with our first class is to define the private data it needs to have to work correctly. We could start with nothing but public fields, like this:
Public FirstName As String Public LastName As String Public BirthDate As Date 'figure 3-2: initial class design with fields only.
The class module is probably still unnamed and may have a default name of
Class1. To provide it with a name, you can fill in the name via the Properties toolwindow:
However, this is no better than the original UDT we started with in Figure 2‑1. We would still have the same problem with controlling the access. For example, we might not want to allow arbitrary changes to
LastName but rather control it via a dedicated
ChangeLastName procedure. We could use
Property statements instead. If you’ve never used one before, they are a way to provide a procedural access to a member of the data structure, which grants us additional control on how the property may be accessed. We could revise the code accordingly:
Private mFirstName As String Private mLastName As String Private mBirthDate As Date Public Property Get FirstName() As String FirstName = mFirstName End Property Public Property Let FirstName(NewValue As String) mFirstName = FirstName End Property Public Property Get LastName() As String LastName = mLastName End Property Public Property Let LastName(NewValue As String) mLastName = LastName End Property Public Property Get BirthDate() As String BirthDate = mBirthDate End Property Public Property Let BirthDate(NewValue As String) mBirthDate = BirthDate End Property 'figure 3-4: a class design using Property statements instead of public fields.
This is still functionally equivalent to the original version of class and the UDT; all members are readable & writable. However, because property is a procedure, we can add additional logic beyond just setting a field. A typical example might be to require validation such as the code in Figure 3‑5:
Public Property Let FirstName(NewValue As String) If Len(NewValue) = 0 Then Err.Raise 5, Description:="First name cannot be blank" Else mFirstName = NewValue End If End Property 'figure 3-5: an example of a property procedure with validation enforced.
Note that we had to define 3 private fields (
mBirthDate). We had to use prefixes because we don’t want name collisions with the public-facing properties of same name. However, this is problematic for two reasons:
- All private fields are now sorted together in the IntelliSense, requiring you to type in a “m” to locate the module level field. This becomes annoying when you have other public members that might start with the letter “M” but will now mingle with those various backing fields. That hampers the discoverability of the code.
- Because of intermingling, you can’t tell quickly whether a public member
LastNamehas a backing field or not; you’d need to look in two different places to make that determination. While the examples above have shown us properties that provide data, we could create a get-only property that is calculated. For instance, we can create an
Ageproperty that is calculated based off the
mBirthDatebacking field rather than having its own
We can do better! Let’s use a UDT instead of a bunch of fields. Since we are improving upon the original UDT, we will start with a UDT, but this time we’ll make it
Private, and we will only need a single instance of it. We’ll also take the opportunity to make all properties read-only by not providing a
Property Let like we did in the original example:
Private Type TPerson FirstName As String LastName As String BirthDate As Date End Type Private This As TPerson Public Property Get FirstName() As String FirstName = This.FirstName End Property Public Property Get LastName() As String LastName = This.LastName End Property Public Property Get BirthDate() As String BirthDate = This.BirthDate End Property 'figure 3-6: a class using an UDT as its backing field for several properties.
With this approach, we have only a single module-level variable, called
This. This enables us to get a nicely filtered IntelliSense listing only the backing field, which can be now the same name as the public member and this is now much easier to enforce with compiler’s help.
This is obviously an improvement, but we now have no way of setting the data to the
This instance. Let’s add a procedure to do just that:
Public Sub FillData( _ FirstName As String, _ LastName As String, _ BirthDate As Date _ ) If Len(This.FirstName) = 0 Then This.FirstName = FirstName This.LastName = LastName This.BirthDate = BirthDate End If End Sub 'figure 3-7: a FillData procedure to write to the private data of the class.
Now we can create the persons with this revised calling code:
Public Sub Test() Dim p1 As Person Dim p2 As Person Set p1 = New Person Set p2 = New Person p1.FillData "John", "Smith", #1970-01-01# p2.FillData "Jane", "Doe", #1970-01-01# Debug.Print VarPtr(p1), VarPtr(p2) End Sub 'figure 3-8: revised testing code using the Person class.
Hopefully this illustrates how much cleaner the code is. More importantly, because the fields in the class are now private and can be only set via the
FillData method, and exposed as read-only, we make it possible to leverage the compiler to help us enforce guarantees about the access to those fields. That becomes important in a more complex class where we need to be able to make safe assumptions about the class’ internal state. We would not have that with an UDT.
We also have a validation check that
This.FirstName is not already a zero-length string, and throwing a runtime error to prevent erroneous use of the
FillData procedure. However, this is a runtime validation, rather than compile-time validation. We want the compiler to do the work for us.
Can we do that? Absolutely! We will look at how we can achieve this using interfaces next to help us hide the methods.
Controlling access to methods via interfaces
Above, you saw how we could use a class module to protect the internal data structure and thus control the access to the data, which helps us write code that we can verify at the compile time to do the correct thing. However, we still need to deal with the methods themselves. As noted, we needed to create a
FillData method to write data to the internal state. As it is, it would expose the procedure to all consumers and there’s nothing preventing them from inappropriately calling it. We can use a runtime validation but the objective of this paper is to convert as much errors we can from run-time to compile-time. So we need to do something about the
FillData method. We want to basically hide the
FillData member once the instance has been created. How do we do that? With interfaces. What are interfaces in VBA? They’re actually just class modules. VBA does not actually make a semantic distinction between a class and interface. To further muddy the water, all VBA classes also have a default interface – that is, the class itself (i.e. its
Public members). To keep it simple, we will say that an interface is basically a VBA class module with no code. Here’s how we will set up our
Public Property Get FirstName() As String End Property Public Property Get LastName() As String End Property Public Property Get BirthDate() As String End Property Public Function ChangeLastName(NewLastName As String) As String End Function 'figure 4-1: IPerson interface.
You might note that this has similar properties like we saw in Figure 3‑6, with the addition of a modified version of
ChangeLastName we saw in Figure 2‑5. More importantly, the
FillData method is not present on the
IPerson interface. By itself, it does not do much because there is no implementation for the interface. We will now make the
Person class implement the
IPerson interface. This is done with the
Implements IPerson Private Type TPerson FirstName As String LastName As String BirthDate As Date End Type Private This As TPerson Private Property Get IPerson_FirstName() As String IPerson_FirstName = This.FirstName End Property Private Property Get IPerson_LastName() As String IPerson_LastName = This.LastName End Property Private Property Get IPerson_BirthDate() As String IPerson_BirthDate = This.BirthDate End Property Private Function IPerson_ChangeLastName(NewLastName As String) As String IPerson_ChangeLastName = This.LastName This.LastName = NewLastName End Function Public Sub FillData( _ FirstName As String, _ LastName As String, _ BirthDate As Date _ ) If Len(This.FirstName) = 0 Then This.FirstName = FirstName This.LastName = LastName This.BirthDate = BirthDate End If End Sub 'figure 4-2: Person class implementing the IPerson interface.
If you compare the original class in figure 3‑6 with the code above, you should note the following differences:
- The members are now
- The members now have the prefix
IPerson_. You might have seen similar setup with event handlers. Obviously this is special in the sense that if you have an
IPersonvariable, you are able to access the implementation even though it’s
IPersoninterface defines the member as
Public(see figure 4‑1).
FillDataprocedure is not on the
IPersoninterface, but is still
Publicand doesn’t have an
Let’s set up some testing code to demonstrate that we’ve in fact hidden the
Public Sub Test() Dim p As Person Set p = New Person p.FillData "John", "Doe", #1970-01-01# Dim i As IPerson Set i = p 'We can assign a Person to IPerson because of Implements i.FillData "Invalid", "Invalid", #9999-12-31# End Sub 'figure 4-3: testing code to demonstrate that FillData on a IPerson variable is disallowed at compile-time.
Once we’ve set up the code above, we should try to compile the code. This should yield a compile-time error, like this:
This is an immense improvement over the original code from the Figure 3‑7, which would only be enforced at run-time, not at compile-time. By writing code that we can get the compiler to aid us in checking, we reduce the likelihood of introducing bugs due to an incorrect use of methods.
However, you might be wondering if we have to create a
Person variable, what’s there to stop us from accidentally using a
Person variable when we should be using an
IPerson variable? Indeed, the testing code in figure 4‑3 is suboptimal. Ideally we would have a method that will provide us with an
IPerson variable so that we don’t actually need to create the
Person variable at all. Thus, we need to learn about creating a factory, and also learn about separation of concerns.
Factory Design Pattern
We saw how we can use interfaces to hide the methods that shouldn’t be available to consumers but we also saw that it does little good if we create the implementation then cast it in the same routine. We need to be able to separate the creation so that we will get the interface, rather than creating the implementation ourselves. By separating the two concerns, or the scope of work, we achieve cleaner code. To that end, we will need to create a factory. There are few different ways we can create a factory, with their pros and cons. Because this article intends to build up on the lessons we learned, we will start with the most simple possible way to implement a factory.
One way to create a factory in VBA is to create a standard module and treat it like a class. By using a standard module in this manner, we avoid the need to create an instance of the factory itself and keep things straightforward. Let’s create a
Public Function Create( _ FirstName As String, _ LastName As String, _ BirthDate As Date _ ) As IPerson Dim NewPerson As Person Set NewPerson = New Person NewPerson.FillData FirstName, LastName, BirthDate Set Create = NewPerson End Function 'figure 5-1: PersonFactory module.
We can then revise the calling code from figure 4‑3 to look like this:
Public Sub Test() Dim p As IPerson Set p = PersonFactory.Create("John", "Doe", #1970-01-01#) End Sub 'figure 5-2: revised calling code using a PersonFactory module.
Note that we now only have an
IPerson variable; we don’t even need to know that it’s the
Person class that is the implementation for the variable. More importantly, we are able to hide the members of
At this point, we’ve successfully hidden the
FillData member and the changes we have introduced makes it much easier for us to analyze our codebase and make wrong code look obviously wrong. You may be wondering what’s stopping us from simply creating a new
Person variable and casting to it. The answer is actually nothing, really. However, the same answer applies to preventing us from making a
Private procedure a
Public one or promoting a local variable to a global one: we know to not make a private procedure public because that breaks the encapsulation and makes for more messy codebase. By the same token, we want to become comfortable using abstractions (e.g. the interfaces) as a perfectly normal way to work with classes where we need to hide the implementation details from the consumers.
You might have noticed that in the figure 5-3, we used the syntax
PersonFactory.Create(). A common misconception regarding naming of procedures in VBA is that they cannot be the same name when they are
Public and in a standard module. Thus, a common approach would be to name the method something like
CreatePerson, which is a OK name. However, trying to come up with unique names for everything can quickly become a hassle. However, there is nothing preventing us from creating a
Public Create() As IPerson in a
PersonFactory module and a
Public Create() As IWidget in a
WidgetFactory module! By qualifying our calls to the
Create() methods with the module name, it immediately becomes obvious what we are using to create our objects and without the overhead of creating yet another class. This ought to illustrate why semantic naming matters much more than using notations which only add more noise without making wrong code look wrong.
As already has been mentioned, there are different ways we can create factories. It could be an actual class that you could allow the use of
New on. It could be a member on the predeclared class module. It depends on what you need. The key takeaway here is that by separating the concern of creating objects from the code that consumes them, we are able to make good use of interfaces to hide the members of the implementations.
You’ve learned how to create a class module and apply some of the good design principle including encapsulating and separating the concerns. You’ve also learned how you can use interfaces to control the access to members that shouldn’t be used by the calling code promiscuously. All those combine up to enable us to write a codebase that compiler is able to validate and enforce for us. Having the compiler do the work for us means we have less work when we review and try to understand what the code is doing. You’ve seen that does require more objects. We’ve had to create 2 class modules, 1 standard module and another standard module to do the testing/calling. With the original UDT approach outlined in section 2, it might be possible to do all in one module. A number of VBA developers may feel that keeping everything in one module makes for easier porting of code from one project to another. They may also feel that protecting the internals is easier to do with
Private methods. However, as you saw, sometimes that is simply not possible using a UDT which need public methods to provide “API” around it. Also, too many private members within a module usually makes the code untestable and brittle. Finally, using a
UserForm, or an Access form as the “class” only encourages highly coupled code that itself cannot be verified at compile-time. It is possible to set up a convention of doing things but if the convention can’t be enforced by the compiler, we would have do additional work in analyzing the code to ensure it is doing what it is intended. We will analyze those issues in a future post.
By applying good OOP design principles, we are able to make it easier on ourselves to write verifiable and testable code. The concern with managing the number of modules and being able to easily refactor code is actually a problem with the IDE, not with the language. For that reason, Rubberduck exists to alleviate those shortcomings and make it easier to apply good OOP design principles to your VBA code.