Globals and Ambient Context

Most of the time, we don’t need any global variables. State can usually be neatly encapsulated in an object, and a reference to this object can easily be passed as an argument to any procedure scope that needs it. But global scope is neither a necessary evil, nor necessarily evil. Like many things in programming, it’s a tool, and like many other tools, misusing it can cause pain.

The VBA code and host Excel workbook accompanying this article can be found on GitHub.


What is Global Scope?

When we declare a variable inside a procedure, we call it a “local variable” in reference to its scope being local to the procedure. “Module variables” are accessible within any procedure scope within the module they’re declared in. Public members of private modules (and Friend members of public modules) are only accessible within the project they live in, and Public members of public modules are global and can be accessed from other projects.

The different scopes of VBA: Global, project, module, and local.

Because in VBA class modules are private by default, and a public class is only PublicNotCreatable (as in, a referencing project cannot create a New instance of a class, factory methods must be provided), and also because “actually global” is in reality slightly more complicated than that (the VB_GlobalNamespace attribute is always going to be False for a VBA class), for the sake of simplicity when I talk about “global scope” and “globals” in this article, I’m treating global and project scopes as one and the same – but it’s important to know the difference, especially more so in scenarios where a VBA/Excel add-in/library is being referenced by other VBA projects, where a tidy public API is handy.

Keywords
Rubberduck recommends using the Dim keyword only in local scope, and to use the Private keyword to declare module-level variables. It also recommends using Public over Global, because nothing is really “global” in VBA and that makes the deprecated keyword potentially confusing. The Global keyword really means Public in VBA, and should be avoided.

Picture the VBA runtime executing some macro procedure and some variable needs to be incremented by 1. Scope determines whether that variable identifier is referring to a local, module, or global declaration. Accessibility is how we use code to restrict scope, using keywords like Private, Public, or Friend: if the variable identifier exists in a public module but is declared with the Private keyword, then it’s inaccessible and not in scope for the procedure we’re in.

So in search for the variable’s declaration we look for a local scope declaration by that name. If there isn’t any, we look for a module scope declaration for that name. Not there? We look at everything we can see in project scope. If we still haven’t found it then, we look for the declaration in the referenced libraries and projects, in priority order (so, the VBA standard library, then the host application’s own object model library, then everything else).

That’s scoping. Scopes and accessibility are closely related, but they’re different things. Think of accessibility as a tool to shape your private and public interfaces and APIs, keeping in mind that in VBA all module members are implicitly Public unless their declaration states otherwise.


Globals and Testability

Global variables are very useful: having a piece of data that is accessible from anywhere in the code does have its advantages. Used wisely, globals can very elegantly address cross-cutting concerns. Instead of having every method responsible for its own logging, or instead of passing a Logger instance to every method, each scope can access a single global Logger object (or invoke the same Log utility procedure), and there really isn’t any problem with that, …until you realize that your unit tests are all needlessly writing logs to some file under C:\Dev\VBA because the global logger is doing its job whether or not the code invoking it is being executed from a test runner… and this is making tests run code that isn’t related to these tests’ purpose: if there’s a bug in the logger code, it’s a test about the logger code that should be failing, not every single other test that couldn’t care less for the logging functionality.

From a testability standpoint, code with global dependencies can be difficult, if not impossible to test. In the case of a global Logger dependency, the logger’s interface would need to expose some kind of “kill switch” that tests can invoke to disable logging… but then modifying an otherwise perfectly good interface for the sake of making the object aware of whether it’s being invoked from a test or not, isn’t ideal at all (we’ll see why in a bit).

This Logger is a good example of a legitimate global service, but it’s “user code” that could always be pragmatically modified to accommodate testing. What about code that depends on global-scope services that aren’t “user code”?

Treating the Excel Object Model as a Dependency

Imagine needing to write tests for user-defined functions (UDF) that store a number of values in a global Dictionary and then schedule a macro that then runs (asynchronously!) and sends these values over to some web API that returns data that then ends up on the worksheet, underneath the calling UDF; the functions have dependencies on Application.Caller and Application.OnTime: we don’t own the Application global object, and we can’t modify its code to accommodate testing – what then?

Writing tests for a UDF is normally trivial: the function takes inputs, computes a result, and then returns it. Tests can supply various inputs and run the function through all kinds of cases and assert that it handles them correctly, by simply comparing its return value with what’s expected, and exceptional edge cases can have tests asserting that the expected error is thrown.

Writing tests for a side-effecting UDF that temporarily stores data in global scope is a lot more challenging, for many reasons. Remember, unit tests:

  • Should reliably produce the same outcome regardless of any external factors;
  • Should be fast, and not involve any I/O or network activity;
  • Should be able to be executed individually or in any given order without affecting outcome;
  • Should be able to be executed concurrently (at least in theory – VBA won’t run concurrent code).

With state shared between the tests, we have to be careful to correctly setup and clean-up that state before & after each test, so that each test gets a fresh canvas in a controlled environment… and then we can live with VBA unit tests that would likely break if executed concurrently, because VBA can’t run them concurrently anyway.


Testing Untestable Things

Back to this not-so-crazy UDF scenario with the Application.OnTime hack: it wouldn’t be acceptable for a test to literally wait for Excel to decide it’s time to invoke a macro, not any more than a test should be sending any actual HTTP requests (although that would be very a good way to actually be testing an API’s rate limits and get acquainted with throttling, I guess), let alone parse and process an actual HTTP response.

Such a user-defined function involves too many moving parts soldered together to be testable: making the code testable involves making the parts moving parts again, and yes it involves a proverbial blowtorch and lots of proverbial sparks flying everywhere.

Refactoring code to make it testable is a lot of fun, but the first step is, ideally, to fully grasp what’s going on and why.

If you aren’t familiar with using Application.OnTime in user-defined functions (only indirectly, because Application.OnTime calls, like run-time errors and many other members in the Excel object model, get “swallowed” when Excel is evaluating a UDF), it’s a pretty cool process that goes like this:

The calling cell contains the UDF’s return value just before the macro gets asynchronously invoked and produces its own output.

So if a UDF stored its arguments as key/value pairs in a global-scope dictionary, if all goes well and according to plan, the macro that runs a moment later gets to consume this data.

By storing the Application.Caller cell object reference in global scope, the side-effecting macro gets to know where to put its results table. There’s always the possibility that a second UDF overwrites this global state during the split-second between the moment a first UDF writes it and the moment the scheduled asynchronous read of this global state actually happens: it’s important to keep in mind that Ambient Context does not inherently address this particular problem; the state is still global and mutable from anywhere in the code, and there is never any guarantee that any scope will run to completion before the VBA runtime decides it’s an asynchronous callback’s turn to run.

The Application.Caller member isn’t going to return a Range reference when it’s not a worksheet cell invoking the function, we can’t afford to wait for Application.OnTime, and we’d like to avoid actually invoking any Win32 API functions during a test. That UDF simply isn’t testable as-is.

The solution is to introduce an abstraction to wrap the Application members we need, and make the side-effecting UDFs depend on that abstraction instead of invoking Application members directly.

AbstractionThe untestable code might look something like this:

Public Function SideEffectingUDF(ByVal FirstParameter As String, ByVal SecondParameter As Long) As Variant
    Set SomeGlobalRange = Application.Caller.Offset(RowOffset:=1)
    With SomeGlobalDictionary
        .Clear
        .Add "FirstParameter", FirstParameter
        .Add "SecondParameter", SecondParameter
    End With
    ScheduleMacro
End Function

Where ScheduleMacro involves a Win32 API call to schedule the execution of an Execute procedure that handles the Application.OnTime scheduling of the actual side-effecting procedure.

We want to be able to write a test that invokes this SideEffectingUDF function, and determines whether Application.Caller was invoked: Application.Caller is a dependency here, and for the test to be able to fulfill its purpose we must find a way to inject the dependencies so they can be controlled by the test, from outside the function.

Note how narrow such a test would be: it asserts that the UDF gets the Application.Caller reference, nothing more. Other tests would be similarly narrow, but for other things, and we don’t want a failing Application.Caller member call to interfere with these other tests by throwing error 91 before the test gets to do its thing. Whether or not we need to know if a UDF does or does not invoke Application.Caller, we still need a way to abstract the dependency away, to stub it.

You may be thinking “oh that’s easy” and be tempted go down this path:

Public Function SideEffectingUDF(ByVal FirstParameter As String, ByVal SecondParameter As Long) As Variant
    If TypeOf Application.Caller Is Excel.Range Then
        ' caller is a worksheet cell
        Set ThatGlobalCell = Application.Caller.Offset(RowOffset:=1)
        With ThatGlobalDictionary
            .Clear
            .Add "FirstParameter", FirstParameter
            .Add "SecondParameter", SecondParameter
        End With
        ScheduleMacro "SideEffectingMacro"
    Else
        ' caller is a unit test
        Set ThatGlobalCell = Sheet1.Cells(1, 1) ' tests can read as "Application.Caller was invoked"
        With ThatGlobalDictionary
            .Clear
            .Add "FirstParameter", FirstParameter
            .Add "SecondParameter", SecondParameter
        End With
        SideEffectingUDF = True ' tests can read this as "macro was scheduled"
    End If
End Function

While it does solve the problem of avoiding to involve Application.Caller and actually scheduling the macro in tests, there are several reasons why this is a terrible idea:

  • Function now has a higher Cyclomatic Complexity metric by virtue of now needing more execution paths to accomplish the same thing: the code is objectively and measurably more complex now, on top of being repetitive (copying & pasting any code is usually a sign something is off!).
  • Tests are no longer executing the same code as normal execution does, which means tests are now testing code that only exists because there are tests: the normal execution path remains untested, and that makes the tests worthless busy-work.
  • Tests now need to be making assumptions about how the function is implemented, which effectively casts the code into concrete instead of making it simpler & safer to modify.
  • Dependencies should be abstractions, and code should be working with these abstractions without regards to their actual implementation: code that acts differently when the runtime type of an abstraction is X vs when it’s Y, violates the Liskov Substitution Principle, the “L” of “SOLID” that essentially states that all implementations of a given abstraction should be treated the same.

The killer is the second bullet: if the sole purpose of a test is to determine whether Application.Caller was invoked, and the UDF says “oh we’re in a test, here yeah everything is all right, see”, then a UDF that does nothing but returning True would pass that test, and that is why the test is useless, as is the code duplication.

When we write a test whose purpose is to determine whether the Application.Caller dependency was invoked, the test should FAIL when it isn’t, otherwise that test is just as good as a deleted one.

Now picture the UDF looking like this instead:

Public Function SideEffectingUDF(ByVal FirstParameter As String, ByVal SecondParameter As Long) As Variant
    With AppContext.Current
        Set .Target = .Caller.Offset(RowOffset:=1)
        .Property("FirstParameter") = FirstParameter
        .Property("SecondParameter") = SecondParameter
        .ScheduleMacro
    End With
End Function

The UDF now only has one dependency, AppContext.Current, which is global state by virtue of being accessible from the default instance of the AppContext class; we’re tightly coupled with the AppContext class, but only because we specifically want to access global state in a controlled manner, and the rest of the function is working against the IAppContext abstraction. The state that was formerly a Range and a Dictionary globally-scoped declaration is now properly encapsulated in an object, and the “current” AppContext is coming into existence from outside the UDF scope (but still from within our own code), which is exactly what we want: now unit tests get to inject a TestContext instead of manipulating global state.

So how do we get there?


Implementation

The basic idea is to pull our dependencies from global scope, encapsulate them in a class module, …and then making an instance of that class an “ambient context” that’s still globally accessible, but that introduces the necessary abstraction needed to make that UDF fully testable.

We want to leverage the default instance of the AppContext class, so we’re going to need an AppContext class with a @PredeclaredId annotation and a Current property getter that returns some IAppContext instance. If you’re familiar with factory methods this will feel a bit like something you’ve already seen:

'@PredeclaredId
Option Explicit
Implements IAppContext
Private Type TState
    Factory As IAppContextFactory
    Current As IAppContext
    '...    
End Type
Private This As TState
'@Description "Gets the current (or default) context."
Public Property Get Current() As IAppContext
    Errors.GuardNonDefaultInstance Me, AppContext, TypeName(Me)
    
    If This.Current Is Nothing Then
        Set This.Current = This.Factory.Create
        Errors.GuardNullReference This.Factory, TypeName(Me), "IAppContextFactory.Create returned Nothing."
    End If
    
    Set Current = This.Current
End Property
Private Property Get IsDefaultInstance() As Boolean
    IsDefaultInstance = Me Is AppContext
End Property
Private Sub Class_Initialize()
    If IsDefaultInstance Then
        'must initialize context with sensible defaults:
        Set This.Factory = New AppContextFactory
        Set This.TimerProvider = New TimerProvider
    Else
        Set This.Properties = New Scripting.Dictionary
        'we want all instances to have the same provider instance:
        Set This.TimerProvider = AppContext.TimerProvider
    End If
End Sub

We don’t normally want Property Get procedures to be side-effecting, but with an Ambient Context what we want is to yield a cached instance of the context class, so when no instance already exists, the getter caches the created object so it’s readily available next time, making it accessible from anywhere in the project (aka “global”).

Abstract Factory

The default instance of the AppContext class does not know what the actual runtime type of the Current context is, and this polymorphism is the cornerstone making it all work: the Current property getter is responsible for caching the new context instance, but not for actually creating it. That’s the job of an abstract factory (the IAppContextFactory dependency) that we conveniently initialize to a concrete factory type that creates instances of… the AppContext class.

Why involve an abstract factory to create an instance of the class we’re in, you might ask? Because that’s only the default implementation, and with ability to Set the Factory reference from outside the class, tests can inject a different factory implementation, say, this one named TestContextFactory:

'@Folder "Tests.Stubs"
'@ModuleDescription "A factory that creates TestContext instances."
Option Explicit
Implements IAppContextFactory
Private Function IAppContextFactory_Create() As IAppContext
    Set IAppContextFactory_Create = New TestContext
End Function

Meanwhile the actual UDFs would be using this AppContextFactory implementation by default:

'@Folder "AmbientContext"
'@ModuleDescription "A factory that creates AppContext instances."
Option Explicit
Implements IAppContextFactory
Private Function IAppContextFactory_Create() As IAppContext
    Set IAppContextFactory_Create = New AppContext
End Function

The AppContext.Current property will happily cache an instance of any class whatsoever, as long as it implements the IAppContext interface. The abstract factory pattern allows us to spawn an instance of a class at run-time, of which we don’t necessarily know the actual “concrete” type at compile-time.

In other words just by reading the UDF code, there is no way to tell whether AppContext.Current is going to be an AppContext or a TestContext instance, and that is exactly what we want.

What this abstraction achieves, is the decoupling that is necessary for a test to be able to inject a TestContextFactory and take control of everything UDFs can do with an IAppContext object.

Context State

We know the context needs to wrap Application.Caller and Application.OnTime functionality. We know we need a Target cell, we need some Properties in an encapsulated Scripting.Dictionary. If we crammed all that into a single interface, we would get a somewhat crowded IAppContext interface that doesn’t quite adhere to the Interface Segregation Principle and Open/Closed Principle guidelines.

By abstracting away the macro-scheduling functionality into its own IAppTimer interface, and making that interface an abstract dependency of the context class, we can stub that abstract dependency and write tests for the logic of the context class itself. Without this extra step, the context can be stubbed to test the code that uses it, but the macro-scheduling bits would remain untestable.

Treating IAppTimer as a dependency of the context makes the IAppContext interface look like this:

'@Folder "AmbientContext.Abstract"
'@ModuleDescription "Encapsulates the data and macro invocation mechanism for a side-effecting UDF."
'@Interface
Option Explicit
'@Description "Gets the cell that invoked the currently-running user-defined function (UDF), if applicable; Nothing otherwise."
Public Property Get Caller() As Range
End Property
'@Description "Gets or sets the target reference cell that the side-effecting macro shall use."
Public Property Get Target() As Range
End Property
Public Property Set Target(ByVal Value As Range)
End Property
'@Description "Gets or sets a named value representing data passed between the UDF and the side-effecting macro."
Public Property Get Property(ByVal Name As String) As Variant
End Property
Public Property Let Property(ByVal Name As String, ByVal Value As Variant)
End Property
'@Description "Gets an array of all property names."
Public Property Get Properties() As Variant
End Property
'@Description "Gets or sets the IAppTimer dependency."
Public Property Get Timer() As IAppTimer
End Property
Public Property Set Timer(ByVal Value As IAppTimer)
End Property
'@Description "Clears all held state."
Public Sub Clear()
End Sub

Note that we’re not exposing the dictionary itself: rather we expose an indexed property to get/set the key/values, then by exposing the dictionary keys, the calling code gets to do everything it needs to do, without ever directly interacting with a Scripting.Dictionary, a bit as if the AppContext class were a custom collection.

Now, there’s something special about the IAppTimer dependency: we absolutely cannot have each context instance spawn timers willy-nilly, because a leaking Win32 timer is a nice way to send Excel up in flames. Yet, we need each context instance to be able to access the same IAppTimer reference.

A good way to solve this is by introducing a Provider mechanism. The interface looks like this:

'@ModuleDescription "A service that ensures all clients get the same IAppTimer instance."
'@Interface
Option Explicit
'@Description "Gets an IAppTimer instance."
Public Property Get Timer() As IAppTimer
End Property

What I’m calling a “provider” here is exactly the same mechanism that provides the IAppContext instance (a Property Get procedure that gets a cached object or creates the object and caches it), except no abstract factory needs to get involved here. The class also makes a very convenient place to put the name of the Win32 callback macro procedure:

Option Explicit
Implements ITimerProvider
Private Const MacroName As String = "Execute"
Private Property Get ITimerProvider_Timer() As IAppTimer
    Static Instance As AppTimer
    If Instance Is Nothing Then
        Set Instance = New AppTimer
        Instance.MacroName = MacroName
    End If
    Set ITimerProvider_Timer = Instance
End Property

TimerProvider the only object that creates a New AppTimer: as a result, every AppContext instance created from this factory is going to use the same IAppTimer reference, and if we need to write tests for AppContext we can inject a TestTimerProvider that returns a TestTimer.

Note that the “provider” mechanism is an implementation detail of AppContext: the TestContext doesn’t need this, because it just initializes itself with a TestTimer, while AppContext initializes itself with a TimerProvider that gets the IAppTimer instance. Being an implementation detail, there’s no ITimerProvider dependency on the abstract interface.


The Tests

The previously-untestable user-defined functions now look like this:

Public Function TestUDF(ByVal SomeParameter As Double) As Boolean
    On Error GoTo CleanFail
    
    With AppContext.Current
        
        Set .Target = .Caller.Offset(RowOffset:=1)
        .Property("Test1") = 42
        .Property("Test2") = 4.25 * SomeParameter
        .Timer.ExecuteMacroAsync
        
    End With
    
    TestUDF = True
CleanExit:
    Exit Function
CleanFail:
    TestUDF = False
    Resume CleanExit
    Resume
End Function

The code isn’t very far off from the original, but now we can write a test that passes when a UDF invokes the Caller member; when the UDF is invoked from a worksheet cell, IAppContext.Caller returns the Range reference returned by Application.Caller; when the exact same code is invoked from a test, IAppContext.Caller returns a bogus/test cell reference.

Similarly, when a UDF invokes IAppTimer.ExecuteMacroAsync, a Win32 API call schedules the execution of a callback macro that itself invokes Application.OnTime to schedule the execution of a side-effecting macro that can consume the state and alter the target range and worksheet; when the exact same code is invoked from a test, IAppTimer.ExecuteMacroAsync simply notes that it was invoked, …and does nothing else.

This test passes when IAppTimer.ExecuteMacroAsync is invoked from a UDF, and would fail if the UDF didn’t invoke it:

'@TestMethod("Infrastructure")
Private Sub TestUDF_SchedulesMacro()
    'inject the test factory:
    Set AppContext.Factory = New TestContextFactory
    
    'get the test context:
    Dim Context As TestContext
    Set Context = AppContext.Current
    
    'test factory already stubbed the timer:
    Dim StubTimer As TestTimer
    Set StubTimer = AppContext.Current.Timer
    
    'run the UDF:
    Dim Result As Boolean
    Result = Functions.TestUDF(0)
    
    'Assert that the UDF has invoked IAppContext.ScheduleMacro once:
    Const Expected As Long = 1
    Assert.AreEqual Expected, StubTimer.ExecuteMacroAsyncInvokes, "IAppTimer.ExecuteMacroAsync was invoked " & StubTimer.ExecuteMacroAsyncInvokes & " times; expected " & Expected
End Sub

Cohesion

Ambient Context is a fantastic tool to address cross-cutting concerns and leverage global scope in a way that does not hinder testing. It’s also useful for storing state and dependencies that would otherwise be held in global scope, when passing that state and dependencies as normal parameters isn’t possible.

That makes it a somewhat dangerous pattern: one must keep in mind that the state is still global, and globals that don’t need to be global, should not be global. By defining an explicit interface for the context (like IAppContext), we not only end up with neat abstractions: we also make it harder for the context interface to grow new members and for the class to become an over-engineered Globals.bas module.

Interfaces shouldn’t be designed to change. In .NET the IDisposable interface only mandates a parameterless Dispose method; IEquatable is all about an Equals method. A factory interface shouldn’t need more than a carefully parameterized Create method that only takes arguments that can’t be dependencies of the factory instance: we want to avoid modifying existing interfaces as much as possible, and since none of us can really predict the future… the best way to do that is to keep interfaces as slim as possible. Cohesion is what we’re after: a module that is cohesive will feel like everything is exactly where it should be.

If the members of a module don’t feel like they’re a cohesive and complete group of closely related methods, there’s a greater chance that more members need to be added in the future – and you will want to avoid that. Of course the “and complete” part can mean a few growing pains, but in general naming things is a great way to avoid the pitfalls of treating the context as some “state bag” where we just lazily stuff state without thinking it through. In that sense AppContext is probably one of the worst possible names for this: perhaps a FunctionContext that only exposes the Caller member would be a cleaner approach?

In the real world, ambient context is for things like System.Threading.Thread.CurrentThread in .NET: it’s very specialized, with a very specific purpose, and we don’t see it very often. Authorization mechanisms might use it too.

In VBA-land, I’ve never once needed to implement it until I came upon this side-effecting UDF scenario needing unit tests; macros are definitely much simpler to refactor for testability!

Model, View, ViewModel

100% VBA, 100% OOP

We’ve seen in UserForm1.Show what makes a Smart UI solution brittle, and how to separate the UI concerns from rest of the logic with the Model-View-Presenter (MVP) UI pattern. MVP works nicely with the MSForms library (UserForms in VBA), just like it does with its .NET Windows Forms successor. While the pattern does a good job of enhancing the testability of application logic, it also comes with its drawbacks: the View’s code-behind (that is, the code module “behind” the form designer) is still littered with noisy event handlers and boilerplate code, and the back-and-forth communication between the View and the Presenter feels somewhat clunky with events and event handlers.

Rubberduck’s UI elements are made with the Windows Presentation Foundation (WPF) UI framework, which completely redefines how everything about UI programming works, starting with the XML/markup-based (XAML) design, but the single most compelling element is just how awesome its data binding capabilities are.

We can leverage in VBA what makes Model-View-ViewModel (MVVM) awesome in C# without going nuts and writing a whole UI framework from scratch, but we’re still going to need a bit of an abstract infrastructure to work with. It took the will to do it and only costed a hair or two, but as far as I can tell this works perfectly fine, at least at the proof-of-concept stage.

This article is the first in a series that revolves around MVVM in VBA as I work (very much part-time) on the rubberduckdb content admin tool. There’s quite a bit of code to make this magic happen, so let’s kick this off with what it does and how to use it – subsequent articles will dive into how the MVVM infrastructure internals work. As usual the accompanying code can be found in the examples repository on GitHub (give it a star, and fork it, then make pull requests with your contributions during Hacktoberfest next month and you can get a t-shirt, stickers, and other free stuff, courtesy of Digital Ocean!).

Overview

The code in the examples repository isn’t the reason I wrote this: I mentioned in the previous post that I was working on an application to maintain the website content, and decided to explore the Model-View-ViewModel pattern for that one. Truth be told, MVVM is hands-down my favorite UI pattern, by far. This is simply the cleanest UI code I’ve ever written in VBA, and I love it!

A screenshot of a carefully-crafted dialog form for managing content served by rubberduckvba.com. A modal prompts the user for SQL Server credentials, all commands but the "reload" button are disabled.
The app is work in progress, but the property and command bindings work!

The result is an extremely decoupled, very extensible, completely testable architecture where every user action (“command”) is formally defined, can be programmatically simulated/tested with real, stubbed, or faked dependencies, and can be bound to multiple UI elements and programmatically executed as needed.

MVVM Quick Checklist

These would be the rules to follow as far a relationships go between the components of the MVVM pattern:

  • View (i.e. the UserForm) knows about the ViewModel, but not the Model;
  • ViewModel knows about commands, but nothing about a View;
  • Exactly what the Model actually is/isn’t/should/shouldn’t be, is honestly not a debate I’m interested in – I’ll just call whatever set of classes is responsible for hydrating my ViewModel with data my “model” and sleep at night. What matters is that whatever you call the Model knows nothing of a View or ViewModel, it exists on its own.

Before we dive into bindings and the infrastructure code, we need to talk about the command pattern.

Commands

A command is an object that implements an ICommand interface that might look like this:

'@Folder MVVM.Infrastructure
'@ModuleDescription "An object that represents an executable command."
'@Interface
'@Exposed
Option Explicit

'@Description "Returns True if the command is enabled given the provided binding context (ViewModel)."
Public Function CanExecute(ByVal Context As Object) As Boolean
End Function

'@Description "Executes the command given the provided binding context (ViewModel)."
Public Sub Execute(ByVal Context As Object)
End Sub

'@Description "Gets a user-friendly description of the command."
Public Property Get Description() As String
End Property

In the case of a CommandBinding the Context parameter is always the DataContext / ViewModel (for now anyway), but manual invokes could supply other kinds of parameters. Not all implementations need to account for the ViewModel, a CanExecute function that simply returns True is often perfectly fine. The Description is used to set a tooltip on the target UI element of the command binding.

The implementation of a command can be very simple or very complex, depending on the needs. A command might have one or more dependencies, for example a ReloadCommand might want to be injected with some IDbContext object that exposes a SelectAllTheThings function and the implementation might pull them from a database, or make them up from hard-coded strings: the command has no business knowing where the data comes from and how it’s acquired.

Each command is its own class, and encapsulates the logic for enabling/disabling its associated control and executing the command. This leaves the UserForm module completely devoid of any logic that isn’t purely a presentation concern – although a lot can be achieved solely with property bindings and validation error formatters.

The infrastructure code comes with AcceptCommand and CancelCommand implementations, both useful to wire up [Ok], [Cancel], or [Close] dialog buttons.

AcceptCommand

The AcceptCommand can be used as-is for any View that can be closed with a command involving similar semantics. It is implemented as follows:

'@Exposed
'@Folder MVVM.Infrastructure.Commands
'@ModuleDescription "A command that closes (hides) a View."
'@PredeclaredId
Option Explicit
Implements ICommand

Private Type TState
    View As IView
End Type

Private this As TState

'@Description "Creates a new instance of this command."
Public Function Create(ByVal View As IView) As ICommand
    Dim result As AcceptCommand
    Set result = New AcceptCommand
    Set result.View = View
    Set Create = result
End Function

Public Property Get View() As IView
    Set View = this.View
End Property

Public Property Set View(ByVal RHS As IView)
    GuardClauses.GuardDoubleInitialization this.View, TypeName(Me)
    Set this.View = RHS
End Property

Private Function ICommand_CanExecute(ByVal Context As Object) As Boolean
    Dim ViewModel As IViewModel
    If TypeOf Context Is IViewModel Then
        Set ViewModel = Context
        If Not ViewModel.Validation Is Nothing Then
            ICommand_CanExecute = ViewModel.Validation.IsValid
            Exit Function
        End If
    End If
    ICommand_CanExecute = True
End Function

Private Property Get ICommand_Description() As String
    ICommand_Description = "Accept changes and close."
End Property

Private Sub ICommand_Execute(ByVal Context As Object)
    this.View.Hide
End Sub

CancelCommand

This command is similar to the AcceptCommand in that it simply invokes a method in the View. This implementation could easily be enhanced by making the ViewModel track “dirty” (modified) state and prompting the user when they are about to discard unsaved changes.

'@Folder MVVM.Infrastructure.Commands
'@ModuleDescription "A command that closes (hides) a cancellable View in a cancelled state."
'@PredeclaredId
'@Exposed
Option Explicit
Implements ICommand

Private Type TState
    View As ICancellable
End Type

Private this As TState

'@Description "Creates a new instance of this command."
Public Function Create(ByVal View As ICancellable) As ICommand
    Dim result As CancelCommand
    Set result = New CancelCommand
    Set result.View = View
    Set Create = result
End Function

Public Property Get View() As ICancellable
    Set View = this.View
End Property

Public Property Set View(ByVal RHS As ICancellable)
    GuardClauses.GuardDoubleInitialization this.View, TypeName(Me)
    Set this.View = RHS
End Property

Private Function ICommand_CanExecute(ByVal Context As Object) As Boolean
    ICommand_CanExecute = True
End Function

Private Property Get ICommand_Description() As String
    ICommand_Description = "Cancel pending changes and close."
End Property

Private Sub ICommand_Execute(ByVal Context As Object)
    this.View.OnCancel
End Sub

This gives us very good indications about how the pattern wants user actions to be implemented:

  • Class can have a @PredeclaredId annotation and expose a factory method to property-inject any dependencies; here a IView object, but a custom SaveChangesCommand would likely get injected with some DbContext service class.
  • All commands need a description; that description is user-facing as a tooltip on the binding target (usually a CommandButton).
  • CanExecute can be as simple as an unconditional ICommand_CanExecute = True, or as complex as needed (it has access to the ViewModel context); keep in mind that this method can be invoked relatively often, and should perform well and return quickly.

It’s a simple interface with a simple purpose: attach a command to a button. The EvaluateCanExecute method invokes the command’s CanExecute function and accordingly enables or disables the Target control.

By implementing all UI commands as ICommand objects, we keep both the View and the ViewModel free of command logic and Click handlers. By adopting the command pattern, we give ourselves all the opportunities to achieve low coupling and high cohesion. That is, small and specialized modules that depend on abstractions that can be injected from the outside.

Property Bindings

In XAML we use a special string syntax (“markup extensions”) to bind the value of, say, a ViewModel property, to that of a UI element property:

<TextBox Text="{Binding SomeProperty, Mode=TwoWay, UpdateSourceTrigger=PropertyChanged}" />

As long as the ViewModel implements INotifyPropertyChanged and the property fires the PropertyChanged event when its value changes, WPF can automatically keep the UI in sync with the ViewModel and the ViewModel in sync with the UI. WPF data bindings are extremely flexible and can also bind to static and dynamic resources, or other UI elements, and they are actually slightly more complex than that, but this captures the essence.

Obviously MVVM with MSForms in VBA isn’t going to involve any kind of special string syntax, but the concept of a PropertyBinding can very much be encapsulated into an object (and XAML compiles down to objects and methods, too). At its core, a binding is a pretty simple thing: a source, a target, and a method to update them.

Technically nothing prevents binding a target to any object type (although with limitations, since non-user code won’t be implementing INotifyPropertyChanged), but for the sake of clarity:

  • The binding Source is the ViewModel
  • The SourcePropertyPath is the name of a property of the ViewModel
  • The binding Target is the MSForms control
  • The binding TargetProperty is the name of a property of the MSForms control

Note that the SourcePropertyPath resolves recursively and can be a property of a propertyof a property – as long as the string ultimately resolves to a non-object member.

.BindPropertyPath ViewModel, "SourcePath", Me.PathBox, _
    Validator:=New RequiredStringValidator, _
    ErrorFormat:=AggregateErrorFormatter.Create(ViewModel, _
        ValidationErrorFormatter.Create(Me.PathBox) _ 
            .WithErrorBackgroundColor _
            .WithErrorBorderColor, _
        ValidationErrorFormatter.Create(Me.InvalidPathIcon) _
            .WithTargetOnlyVisibleOnError("SourcePath"), _                
        ValidationErrorFormatter.Create(Me.ValidationMessage1) _
            .WithTargetOnlyVisibleOnError("SourcePath"))

The IBindingManager.BindPropertyPath method is pretty flexible and accepts a number of optional parameters while implementing sensible defaults for common MSForms controls’ “default property binding”. For example, you don’t need to specify a TargetProperty when binding a ViewModel property to a MSForms.TextBox: it will automatically binds to the Text property, but will accept to bind any other property.

The optional arguments are especially useful for custom data validation, but some of them also control various knobs that determine what and how the binding updates.

ValueBehavior
TwoWayBindingBinding will update the source when the target changes, and will update the target when the source changes.
OneWayBindingBinding will update the target when the source changes.
OneWayToSourceBinding will update the source when the target changes.
OneTimeBindingBinding will only update the target once.
The BindingMode enum values
ValueBehavior
OnPropertyChangedBinding will update when the bound property value changes.
OnKeyPressBinding will update the source at each keypress. Only available for TextBox controls. Data validation may prevent the keypress from reaching the UI element.
OnExitBinding will update the source just before target loses focus. Data validation may cancel the exit and leave the caret inside. This update source trigger is the most efficient since it only updates bindings when the user has finished providing a value.
The UpdateSourceTrigger enum values

Property Paths

The binding manager is able to recursively resolve a member path, so if your ViewModel has a ThingSection property that is itself a ViewModel with its own bindings and commands, that itself has a Thing property, know that the binding path can legally be “ThingSection.Thing“, and as long as the Source is the ViewModel object where a ThingSection property exists, and that the ThingSection porperty yields an object that has a Thing property, then all is good and the binding works. If ThingSection were to be Nothing when the binding is updated, then the target is assigned with a default value depending on the type. For example if ThingSection.Thing was bound to some TextBox1 control and the ThingSection property of the ViewModel was Nothing, then the Text property would end up being an empty string – note that this default value may be illegal, depending on what data validation is in place.

Data Validation

Every property binding can attach any IValueValidator implementation that encapsulates specialized, bespoke validation rules. The infrastructure code doesn’t include any custom validator, but the example show how one can be implemented. The interface mandates an IsValid function that returns a Boolean (True when valid), and a user-friendly Message property that the ValidationManager uses to create tooltips.

'@Folder MVVM.Example
Option Explicit
Implements IValueValidator

Private Function IValueValidator_IsValid(ByVal Value As Variant, ByVal Source As Object, ByVal Target As Object) As Boolean
    IValueValidator_IsValid = Len(Trim$(Value)) > 0
End Function

Private Property Get IValueValidator_Message() As String
    IValueValidator_Message = "Value cannot be empty."
End Property

The IsValid method provides you with the Value being validated, the binding Source, and the binding Target objects, which means every validator has access to everything exposed by the ViewModel; note that the method being a Function strongly suggests that it should not have side-effects. Avoid mutating ViewModel properties in a validator, but the message can be constructed dynamically if the validator is made to hold module-level state… although I would really strive to avoid making custom validators stateful.

While the underlying data validation mechanics are relatively complex, believe it or not there is no other step needed to implement custom validation for your property bindings: IBindingManager.BindPropertyPath is happy to take in any validator object, as long as it implements the IValueValidator interface.

Presenting Validation Errors

Without taking any steps to format validation errors, commands that can only execute against a valid ViewModel will automatically get disabled, but the input field with the invalid value won’t give the user any clue. By providing an IValidationErrorFormatter implementation when registering the binding, you get to control whether hidden UI elements should be displayed when there’s a validation error.

The ValidationErrorFormatter class meets most simple scenarios. Use the factory method to create an instance with a specific target UI element, then chain builder method calls to configure the formatting inline with a nice, fluent syntax:

Set Formatter = ValidationErrorFormatter.Create(Me.PathBox) _
                                        .WithErrorBackgroundColor(vbYellow) _
                                        .WithErrorBorderColor
MethodPurpose
CreateFactory method, ensures every instance is created with a target UI element.
WithErrorBackgroundColorMakes the target have a different background color given a validation error. If no color is specified, a default “error background color” (light red) is used.
WithErrorBorderColorMakes the target have a different border color given a validation error. If no color is specified, a default “error border color” (dark red) is used. Method has no effect if the UI control isn’t “flat style” or if the border style isn’t “fixed single”.
WithErrorForeColorMakes the target have a different fore (text) color given a validation error. If no color is specified, a default “error border color” (dark red) is used.
WithErrorFontBoldMakes the target use a bold font weight given a validation error. Method has no effect if the UI element uses a bolded font face without a validation error.
WithTargetOnlyVisibleOnErrorMakes the target UI element normally hidden, only to be made visible given a validation error. Particularly useful with aggregated formatters, to bind the visibility of a label and/or an icon control to the presence of a validation error.
The factory and builder methods of the ValidationErrorFormatter class.

The example code uses an AggregateErrorFormatter to tie multiple ValidationErrorFormatter instances (and thus possibly multiple different target UI controls) to the the same binding.

Value Converters

IBindingManager.BindPropertyPath can take an optional IValueConverter parameter when a conversion is needed between the source and the target, or between the target and the source. One useful value converter can be one like the InverseBooleanConverter implementation, which can be used in a binding where True in the source needs to bind to False in the target.

The interface mandates the presence of Convert and ConvertBack functions, respectively invoked when the binding value is going to the target and the source. Again, pure functions and performance-sensitive implementations should be preferred over side-effecting code.

'@Folder MVVM.Infrastructure.Bindings.Converters
'@ModuleDescription "A value converter that inverts a Boolean value."
'@PredeclaredId
'@Exposed
Option Explicit
Implements IValueConverter

Public Function Default() As IValueConverter
    GuardClauses.GuardNonDefaultInstance Me, InverseBooleanConverter
    Set Default = InverseBooleanConverter
End Function

Private Function IValueConverter_Convert(ByVal Value As Variant) As Variant
    IValueConverter_Convert = Not CBool(Value)
End Function

Private Function IValueConverter_ConvertBack(ByVal Value As Variant) As Variant
    IValueConverter_ConvertBack = Not CBool(Value)
End Function

Converters used in single-directional bindings don’t need to necessarily make both functions return a value that makes sense: sometimes a value can be converted to another but cannot round-trip back to the original, and that’s fine.

String Formatting

One aspect of property bindings I haven’t tackled yet, is the whole StringFormat deal. Once that is implemented and working, the string representation of the target control will be better separated from its actual value. And a sensible default format for some data types (Date, Currency) can even be inferred from the type of the source property!

Another thing string formatting would enable, is the ability to interpolate the value within a string. For example there could be a property binding defined like this:

.BindPropertyPath ViewModel, "NetAmount", Me.NetAmountBox, StringFormat:="USD$ {0:C2}"

And the NetAmountBox would read “USD$ 1,386.77” given the value 1386.77, and the binding would never get confused and would always know that the underlying value is a numeric value of 1386.77 and not a formatted string. Now, until that is done, string formatting probably needs to involve custom value converters. When string formatting works in property bindings, any converter will get invoked before: it’s always going to be the converted value that gets formatted.

ViewModel

Every ViewModel class is inherently application-specific and will look different, but there will be recurring themes:

  • Every field in the View wants to bind to a ViewModel property, and then you’ll want extra properties for various other things, so the ViewModel quickly grows more properties than comfort allows. Make smaller “ViewModel” classes by regrouping related properties, and bind with a property path rather than a plain property name.
  • Property changes need to propagate to the “main” ViewModel (the “data context”) somehow, so making all ViewModel classes fire a PropertyChanged event as appropriate is a good idea. Hold a WithEvents reference to the “child” ViewModel, and handle propagation by raising the “parent” ViewModel’s own PropertyChanged event, all the way up to the “main” ViewModel, where the handler nudges command bindings to evaluate whether commands can execute. One solution could be to register all command bindings with some CommandManager object that would have to implement IHandlePropertyChanged and would relieve the ViewModel of needing to do this.

Each ViewModel should implement at least two interfaces:

  • IViewModel, because we need a way to access the validation error handler and this interface makes a good spot for it.
  • INotifyPropertyChanged, to notify data bindings when a ViewModel property changes.

Here is the IViewModel implementation for the example code – the idea is really to expose properties for the view to bind, and we must not forget to notify handlers when a property value changes – notice the RHS-checking logic in the Property Let member:

'@Folder MVVM.Example
'@ModuleDescription "An example ViewModel implementation for some dialog."
'@PredeclaredId
Implements IViewModel
Implements INotifyPropertyChanged
Option Explicit

Public Event PropertyChanged(ByVal Source As Object, ByVal PropertyName As String)

Private Type TViewModel
    
    'INotifyPropertyChanged state:
    Handlers As Collection
    
    'CommandBindings:
    SomeCommand As ICommand
    
    'Read/Write PropertyBindings:
    SourcePath As String
    SomeOption As Boolean
    SomeOtherOption As Boolean
    
End Type

Private this As TViewModel
Private WithEvents ValidationHandler As ValidationManager

Public Function Create() As IViewModel
    GuardClauses.GuardNonDefaultInstance Me, ExampleViewModel, TypeName(Me)
    
    Dim result As ExampleViewModel
    Set result = New ExampleViewModel
    
    Set Create = result
End Function

Public Property Get Validation() As IHandleValidationError
    Set Validation = ValidationHandler
End Property

Public Property Get SourcePath() As String
    SourcePath = this.SourcePath
End Property

Public Property Let SourcePath(ByVal RHS As String)
    If this.SourcePath <> RHS Then
        this.SourcePath = RHS
        OnPropertyChanged "SourcePath"
    End If
End Property

Public Property Get SomeOption() As Boolean
    SomeOption = this.SomeOption
End Property

Public Property Let SomeOption(ByVal RHS As Boolean)
    If this.SomeOption <> RHS Then
        this.SomeOption = RHS
        OnPropertyChanged "SomeOption"
    End If
End Property

Public Property Get SomeOtherOption() As Boolean
    SomeOtherOption = this.SomeOtherOption
End Property

Public Property Let SomeOtherOption(ByVal RHS As Boolean)
    If this.SomeOtherOption <> RHS Then
        this.SomeOtherOption = RHS
        OnPropertyChanged "SomeOtherOption"
    End If
End Property

Public Property Get SomeCommand() As ICommand
    Set SomeCommand = this.SomeCommand
End Property

Public Property Set SomeCommand(ByVal RHS As ICommand)
    Set this.SomeCommand = RHS
End Property

Public Property Get SomeOptionName() As String
    SomeOptionName = "Auto"
End Property

Public Property Get SomeOtherOptionName() As String
    SomeOtherOptionName = "Manual/Browse"
End Property

Public Property Get Instructions() As String
    Instructions = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
End Property

Private Sub OnPropertyChanged(ByVal PropertyName As String)
    RaiseEvent PropertyChanged(Me, PropertyName)
    Dim Handler As IHandlePropertyChanged
    For Each Handler In this.Handlers
        Handler.OnPropertyChanged Me, PropertyName
    Next
End Sub

Private Sub Class_Initialize()
    Set this.Handlers = New Collection
    Set ValidationHandler = ValidationManager.Create
End Sub

Private Sub INotifyPropertyChanged_OnPropertyChanged(ByVal Source As Object, ByVal PropertyName As String)
    OnPropertyChanged PropertyName
End Sub

Private Sub INotifyPropertyChanged_RegisterHandler(ByVal Handler As IHandlePropertyChanged)
    this.Handlers.Add Handler
End Sub

Private Property Get IViewModel_Validation() As IHandleValidationError
    Set IViewModel_Validation = ValidationHandler
End Property

Private Sub ValidationHandler_PropertyChanged(ByVal Source As Object, ByVal PropertyName As String)
    OnPropertyChanged PropertyName
End Sub

Nothing much of interest here, other than the INotifyPropertyChanged implementation and the fact that a ViewModel is really just a fancy word for a class that exposes a bunch of properties that magically keep in sync with UI controls!

View

In a Smart UI, that module is, more often than not, a complete wreck. In Model-View-Presenter it quickly gets cluttered with many one-liner event handlers, and something just feels clunky about the MVP pattern. Now, I’m trying really hard, but I can’t think of a single reason to not want UserForm code-behind to look like this all the time… this is absolutely all of it, there’s no cheating going on:


'@Folder MVVM.Example
'@ModuleDescription "An example implementation of a View."
Implements IView
Implements ICancellable
Option Explicit

Private Type TView
    'IView state:
    ViewModel As ExampleViewModel
    
    'ICancellable state:
    IsCancelled As Boolean
    
    'Data binding helper dependency:
    Bindings As IBindingManager
End Type

Private this As TView

'@Description "A factory method to create new instances of this View, already wired-up to a ViewModel."
Public Function Create(ByVal ViewModel As ExampleViewModel, ByVal Bindings As IBindingManager) As IView
    GuardClauses.GuardNonDefaultInstance Me, ExampleView, TypeName(Me)
    GuardClauses.GuardNullReference ViewModel, TypeName(Me)
    GuardClauses.GuardNullReference Bindings, TypeName(Me)
    
    Dim result As ExampleView
    Set result = New ExampleView
    
    Set result.Bindings = Bindings
    Set result.ViewModel = ViewModel
    
    Set Create = result
    
End Function

Private Property Get IsDefaultInstance() As Boolean
    IsDefaultInstance = Me Is ExampleView
End Property

'@Description "Gets/sets the ViewModel to use as a context for property and command bindings."
Public Property Get ViewModel() As ExampleViewModel
    Set ViewModel = this.ViewModel
End Property

Public Property Set ViewModel(ByVal RHS As ExampleViewModel)
    GuardClauses.GuardExpression IsDefaultInstance, TypeName(Me)
    GuardClauses.GuardNullReference RHS
    
    Set this.ViewModel = RHS
    InitializeBindings

End Property

'@Description "Gets/sets the binding manager implementation."
Public Property Get Bindings() As IBindingManager
    Set Bindings = this.Bindings
End Property

Public Property Set Bindings(ByVal RHS As IBindingManager)
    GuardClauses.GuardExpression IsDefaultInstance, TypeName(Me)
    GuardClauses.GuardDoubleInitialization this.Bindings, TypeName(Me)
    GuardClauses.GuardNullReference RHS
    
    Set this.Bindings = RHS

End Property

Private Sub BindViewModelCommands()
    With Bindings
        .BindCommand ViewModel, Me.OkButton, AcceptCommand.Create(Me)
        .BindCommand ViewModel, Me.CancelButton, CancelCommand.Create(Me)
        .BindCommand ViewModel, Me.BrowseButton, ViewModel.SomeCommand
        '...
    End With
End Sub

Private Sub BindViewModelProperties()
    With Bindings
        
        .BindPropertyPath ViewModel, "SourcePath", Me.PathBox, _
            Validator:=New RequiredStringValidator, _
            ErrorFormat:=AggregateErrorFormatter.Create(ViewModel, _
                ValidationErrorFormatter.Create(Me.PathBox).WithErrorBackgroundColor.WithErrorBorderColor, _
                ValidationErrorFormatter.Create(Me.InvalidPathIcon).WithTargetOnlyVisibleOnError("SourcePath"), _
                ValidationErrorFormatter.Create(Me.ValidationMessage1).WithTargetOnlyVisibleOnError("SourcePath"))
        
        .BindPropertyPath ViewModel, "Instructions", Me.InstructionsLabel
        
        .BindPropertyPath ViewModel, "SomeOption", Me.OptionButton1
        .BindPropertyPath ViewModel, "SomeOtherOption", Me.OptionButton2
        .BindPropertyPath ViewModel, "SomeOptionName", Me.OptionButton1, "Caption", OneTimeBinding
        .BindPropertyPath ViewModel, "SomeOtherOptionName", Me.OptionButton2, "Caption", OneTimeBinding
        
        '...
        
    End With
End Sub

Private Sub InitializeBindings()
    If ViewModel Is Nothing Then Exit Sub
    BindViewModelProperties
    BindViewModelCommands
    Bindings.ApplyBindings ViewModel
End Sub

Private Sub OnCancel()
    this.IsCancelled = True
    Me.Hide
End Sub

Private Property Get ICancellable_IsCancelled() As Boolean
    ICancellable_IsCancelled = this.IsCancelled
End Property

Private Sub ICancellable_OnCancel()
    OnCancel
End Sub

Private Sub IView_Hide()
    Me.Hide
End Sub

Private Sub IView_Show()
    Me.Show vbModal
End Sub

Private Function IView_ShowDialog() As Boolean
    Me.Show vbModal
    IView_ShowDialog = Not this.IsCancelled
End Function

Private Property Get IView_ViewModel() As Object
    Set IView_ViewModel = this.ViewModel
End Property

Surely some tweaks will be made over the next couple of weeks as I put the UI design pattern to a more extensive workout with the Rubberduck website content maintenance app – but having used MVVM in C#/WPF for many years, I already know that this is how I want to be coding VBA user interfaces going forward.

I really love how the language has had the ability to make this pattern work, all along.

To be continued…

Secure ADODB

Unless you’re hosted in Access, your VBA project doesn’t have access to a database engine. If you’re in Excel, it’s easy to treat the host workbook as a database and each worksheet as a table. While we can build an application that uses Excel worksheets to store data, we probably shouldn’t do that. The reasons are many, but primarily (pun …yeah, intended), we want to be able to establish bullet-proof referential integrity between records/tables; while Excel is great for many things, it’s useless for that: it’s the job of a relational database system (RDBMS), not that of any worksheet software, no matter how powerful. Power Query is very much worth looking into, but if you’re building a small CRUD (Create/Read/Update/Delete) application in VBA, you’ll want VBA code responsible for the data access – enter ADODB, …and every pitfall that comes with it.

In this article we will explore a heavily object-oriented solution to querying a database securely with the ADODB library.


Securely?

Querying a database with ADODB is easy: just set up a connection, open it, then execute whatever SQL statement you need through the Connection, and you get the results in a Recordset object:

Dim conn As ADODB.Connection
Set conn = New ADODB.Connection
conn.Open "ConnectionString"

Dim rs As ADODB.Recordset
Set rs = conn.Execute("SELECT Field1, Field2 FROM Table1")

'...

rs.Close
conn.Close

That is great for one-timer, ad-hoc queries: things quickly get messy when you start needing multiple queries, or when your SQL statement needs to be invoked repeatedly with different values:

Dim conn As ADODB.Connection
Set conn = New ADODB.Connection
conn.Open "ConnectionString"

Dim i As Long
For i = 1 To 10
    Dim rs As ADODB.Recordset
    Set rs = conn.Execute("SELECT Field1, Field2 FROM Table1 WHERE Field3 = " & i)
    '...
    rs.Close
Next
conn.Close

This right here – WHERE SomeField = " & i, is making the database engine work harder than it needs to… and it’s costing server-side performance, because as far as the engine knows, it’s getting a different query every time – and thus computes the same execution plan over and over, every time… when it could just be reusing it. Databases are smart. Like, wicked smart… but yeah we still need to ask for the right thing!

Compare to something like this:

Const sql As String = "SELECT Field1, Field2 FROM Table1 WHERE Field3 = ?"
Dim conn As ADODB.Connection
Set conn = New ADODB.Connection
conn.Open "ConnectionString"

Dim i As Long
For i = 1 To 10
    Dim cmd As ADODB.Command
    Set cmd = New ADODB.Command
    cmd.CommandType = adCmdText
    cmd.CommandText = sql
    cmd.Parameters.Append cmd.CreateParameter(Type:=adInteger, Value:= i)

    Dim rs As ADODB.Recordset
    Set rs = cmd.Execute
    '...
    rs.Close
Next
conn.Close

Oh my, so much more code, so little gain – right?

Using ADODB.Command when queries involve a WHERE (and/or VALUES) clause and user-provided (directly or not) values is not only more efficient (the cached execution plan is reused because the command string is identical every time), it’s also more secure. Concatenating user inputs into SQL command strings is a common rookie mistake, and it’s a practice that is way more widespread than it should be (regardless of the language, paradigm, or platform); your code becomes vulnerable to SQL Injection Attacks – something that may or may not be in your threat model, but that inevitably turns into… easily avoidable bugs: think of what might happen if a user entered O'Connor in that LastName field. If you’re thinking “oh that’s easy! I’ll just double-up single quotes, and fixed!“, then you’re playing a needlessly exhausting game of cat-and-mouse with the next thing that will break your clever escaping: the mouse wins.

Abstract thoughts

Much simpler to just use an ADODB.Command every time, and when you need it parameterized, to Append any number of ADODB.Parameter objects to its Parameters collection. Except, it does make a lot of code to write, every time.

What do we do when we see repetitive patterns in code? If you’re thinking “we put it in a function!” then you’re thinking abstraction and that’s exactly the right train of thought.

We’re just going to take this abstraction… and make it an object. Then think of what objects it needs in order to do its job, and abstract these objects behind interfaces too, and take these abstractions in as constructor parameters of our Create “static” factory method. Rinse & repeat until all dependencies are property-injected and all responsibilities are nicely encapsulated into their own classes. It was fun!

I wrote an original version of this functionality little while ago – you can find the original version on Code Review, and see how different/similar it is to this simplified/improved version in our Examples repository on GitHub.

The original was just an ADODB wrapper class though, couldn’t really be unit-tested, and was annoying to maintain because it felt very repetitive. This version is separating the type mappings from the parameter-providing logic, which makes configuring these mappings is done through an object that’s solely responsible for these mappings; it also separates the command from the connection, and abstracts away that connection enough to enable unit testing and cover quite a large part of the API – but most importantly, this version exposes adequate abstractions for the calling code to use and stub in its own unit tests.

VBA code written with this API (and the principles it demonstrates) can easily be fully testable, without ever actually hitting any database.

I can do this in the immediate pane:

?UnitOfWork.FromConnectionString("connection string").Command.GetSingleValue("SELECT Field1 FROM Table1 WHERE Id=?", 1)

I mean, it’s a contrived example, but with a valid connection string, query, and arguments, that’s all we need to get an actual parameterized ADODB command sending that 1 as an actual ADODB parameter, …and the following debug output:

Begin connect...
Connect completed. Status: 1
Begin transaction completed. 
Begin execute...
Execute completed, -1 record(s) affected.
{whatever value was in Field1}
Rollback transaction completed.
Disconnect completed. Status: 1

I made DbConnection listen in on whatever events the ADODB connection is firing, pending the implementation of an adapter to expose some IDbConnectionEvents members – the idea is to end up with client code that can inject its own callbacks and do things like log such messages. In the meantime Debug.Print statements are producing this debug output, but that’s it’s an implementation detail: it doesn’t publicly expose any of these events. It couldn’t, either: the rest of the code needs to work with the IDbConnection interface, and interfaces unfortunately can’t expose events in VBA.


SecureADODB

Some might call it layered spaghetti. Others call it lasagna. I call it well-abstracted code that reads and maintains like a charm and provably works as intended. There is nothing, absolutely nothing wrong with having many class modules in a VBA project: the only problem is… well, the VBE itself:

Project Explorer is making OOP rather painful. In fact it makes any kind of modularization painful.
Code Explorer makes the VBE more OOP-friendly: now you can have folders regrouping modules by functionality rather than just by module type.

Nice, rich APIs involve many related objects, interfaces, methods – members that make up the object model the API’s client code will be working with. As long as we can keep all these classes organized, there’s no problem having many of them.

Before we look at the implementation, let’s review the interfaces and the overall structure.

Only two interfaces aren’t being stubbed for unit tests. IUnitOfWork because as the top-level object nothing in the object model consumes it. It is needed though, because client code can inject it as a dependency of some FooRepository class, and then tests can provide it with a StubUnitOfWork that implements IUnitOfWork.

The other “façade” interface is ITypeMap. This one isn’t really needed (neither is the predeclared instance of AdoTypeMappings or its Default factory method), something felt wrong with the client code without it. While the class is essentially just a dictionary / literally a map, there’s something rather elegant about depending on an ITypeMap rather than some Scripting.Dictionary.

The two dark blue interfaces are abstract factory interfaces, each with a “real” and a “stub” implementation for tests: these are very simple classes whose entire purpose is to create an object of a particular type.

If we consider IParameterProvider an implementation detail of IDbCommandBase, that leaves us with only the core stuff: IDbCommandBase, IDbCommand, and IDbConnection – everything else just revolves around these.

DbCommandBase

The old SqlCommand code had two sets of commands: “Execute” for methods you could pass a Connection to, and “QuickExecute” for methods that created a connection on-the-spot. I decided to split the two behaviors into two distinct implementation of the same interface, and that’s how I ended up with DefaultDbCommand and AutoDbCommand. As I was cleaning up the two new classes, I had to notice these two classes needed a number of common bits of functionality… as would any other implementation of IDbCommand.

In a language that supports inheritance, I would probably make the two classes inherit a third abstract “base” class where I’d implement the IDbCommand interface. In VBA, we can’t derive a class from another, or inherit members from another class: inheritance is flat-out unavailable. There’s an alternative though, and it’s arguably even better than inheritance: composition. We can put the common functionality in a third class, and then have the two implementations take an instance of that “base” class as we would any other dependency – effectively achieving what we wanted out of inheritance, but through composition.

Code is said to be “decoupled” when none of its concrete components are inter-dependent, as is apparent with the solid black “depends on” arrows here. Decoupled components can easily be swapped for other implementations, like …test stubs.

What’s wrong with inheritance?

Don’t get me wrong, inheritance is very cool: with an abstract class you can have templated methods, where a method in the base class (typically a method that implements some interface member) invokes an abstract or virtual method (typically with protected scope) that the inherited class must override and provide an implementation for. Rubberduck uses this pattern in quite a few places (inspections, notably). Without inheritance, it’s just not something that’s possible.

Inheritance is described as a “is a” relationship, while composition is more of a “has a” relationship. This is important, because when the only consideration weighting in favor of inheritance is the need for two classes to share some functionality, it’s exactly why inheritance should not be used.


Decoupling FTW

The “base” class appeared as a need to have a place for IDbCommand implementations to access shared functionality. I wanted to return disconnected recordsets, and retrieving the value of the first field of the first record of a recordset isn’t something that’s glaringly implementation-specific. The other piece of functionality I needed, was a function that creates the ADODB.Command object and adds the parameters.

Because I wanted this class to create the ADODB.Command, I needed it to be able to turn a Variant into an ADODB.Parameter through some mapping, and since I didn’t want my class to be necessarily coupled with that mapping, or anything remotely related to configuring ADODB parameters… I’m property-injecting an IParameterProvider dependency:

Public Function Create(ByVal provider As IParameterProvider) As IDbCommandBase
    Errors.GuardNonDefaultInstance Me, DbCommandBase
    Errors.GuardNullReference provider
    
    Dim result As DbCommandBase
    Set result = New DbCommandBase
    Set result.ParameterProvider = provider
    
    Set Create = result

End Function

Validating the command string / arguments

Since the commands are given an SQL command string to execute, and a ParamArray array of arguments that should have the same number of items as there are ? ordinal parameters in the SQL command string, we have an opportunity to catch a missing or extraneous argument before we even send the command string to the database server. And because this validation logic would have to be the same regardless of what IDbCommand implementation we’re looking at, DbCommandBase makes the best place to put it.

This implementation is probably too naive for a number of edge cases, but sufficient for most: we’re simply counting the number of ? characters in the sql string, and comparing that with the number of elements in the args array. We need to handle errors here, because if the args array is empty, evaluating UBound(args) and/or LBound(args) will throw a “subscript out of range” run-time error 9.

Public Function ValidateOrdinalArguments(ByVal sql As String, ByRef args() As Variant) As Boolean
    On Error GoTo CleanFail
    Dim result As Boolean
    
    Dim expected As Long
    expected = Len(sql) - Len(Replace(sql, "?", vbNullString))
    
    Dim actual As Long
    On Error GoTo CleanFail 'if there are no args, LBound/UBound are both out of bounds
    actual = UBound(args) + (1 - LBound(args))
    
CleanExit:
    result = (expected = actual)
    ValidateOrdinalArguments = result
    Exit Function
CleanFail:
    actual = 0
    Resume CleanExit
End Function

Getting a disconnected Recordset

If we created a database connection, issued a command against it, and received the recordset from ADODB.Command.Execute, and then we close the connection and return that recordset, then the calling code can’t use the data anymore: a connected recordset only works if the calling code owns the connection. So we need a way to issue a disconnected recordset, while still using an ADODB.Command. The way to do this, is to pass the command as the Source argument to Recordset.Open, and to use a static, client-side cursor:

Private Function GetDisconnectedRecordset(ByVal cmd As ADODB.Command) As ADODB.Recordset
    Errors.GuardNullReference cmd
    Errors.GuardNullReference cmd.ActiveConnection
    
    Dim result As ADODB.Recordset
    Set result = New ADODB.Recordset
    
    result.CursorLocation = adUseClient
    result.Open Source:=cmd, CursorType:=adOpenStatic
    
    Set result.ActiveConnection = Nothing
    Set GetDisconnectedRecordset = result
End Function

Getting a single value result

With functions to validate the parameters, create commands and get a disconnected recordset, we have everything we need for IDbCommand implementations to do their job, but if we leave it like this, we’ll end up with all implementations copying the logic of IDbCommand.GetSingleValue: best have that logic in DbCommandBase and avoid as much repetition as possible.

Private Function GetSingleValue(ByVal db As IDbConnection, ByVal sql As String, ByRef args() As Variant) As Variant
    Errors.GuardEmptyString sql
    
    Dim cmd As ADODB.Command
    Set cmd = CreateCommand(db, adCmdText, sql, args)
    
    Dim results As ADODB.Recordset
    Set results = GetDisconnectedRecordset(cmd)
    
    GetSingleValue = results.Fields.Item(0).value
End Function

Creating the command

A few things can go wrong when creating the ADODB.Command object: we need an ADODB.Connection that’s open, and the parameters must be valid. Since we’re not executing the command just yet, we don’t have to worry about everything that could go wrong actually executing the command string and processing the parameters on the server. So the strategy here is to guard against invalid inputs as much as possible, and then to handle errors when we add the parameters, and return the Command object with whatever parameters were successfully added. We don’t need to try salvaging the rest of the parameters if one blows up, since that failing parameter will fail command execution anyway, but there isn’t much we can do about it, other than perhaps throw an error and have the caller not even try to run the command – but here I decided that the server-side errors would be more useful than any custom “invalid parameter” error.

Note that the ADODB.Command object is actually created by the method-injected IDbConnection dependency. This creates a seam between the class and ADODB, despite the inherent coupling with the ADODB.Command type: it makes the command’s ActiveConnection an implementation detail of IDbConnection.CreateCommand, and that’s all I needed to make this method work with a stub connection that isn’t actually connecting to anything:

Private Function CreateCommand(ByVal db As IDbConnection, ByVal commandType As ADODB.CommandTypeEnum, ByVal sql As String, ByRef args() As Variant) As ADODB.Command
    Errors.GuardNullReference db
    Errors.GuardEmptyString sql
    Errors.GuardExpression db.State <> adStateOpen, message:="Connection is not open."
    Errors.GuardExpression Not ValidateOrdinalArguments(sql, args), message:="Arguments supplied are inconsistent with the provided command string parameters."
    
    Dim cmd As ADODB.Command
    Set cmd = db.CreateCommand(commandType, sql)
    
    On Error GoTo CleanFail
    Dim arg As ADODB.Parameter
    For Each arg In this.ParameterProvider.FromValues(args)
        cmd.parameters.Append arg
    Next
    
CleanExit:
    Set CreateCommand = cmd
    Exit Function
CleanFail:
    Resume CleanExit
End Function

DbCommand

As mentioned before, there are two implementations for the IDbCommand interface: one that creates and owns its own IDbConnection, the other that takes it in as a dependency.

This abstraction represents an object that can take an SQL statement and parameters, and return the result(s) to its caller.

DefaultDbCommand receives its IDbConnection dependency through property injection in its Create factory method.

AutoDbCommand takes a connection string and an IDbConnectionFactory instead.

UnitOfWork uses a DefaultDbCommand because the unit of work needs to own the connection, but AutoDbCommand could be used instead of a unit of work, if we just need a quick SELECT and no transaction.

Abstract Factory

IDbConnectionFactory is an Abstract Factory here. This is needed, because unit tests need to be able to inject a stub factory that produces stub connections: an abstract factory is a factory interface that creates objects of a type that is also an abstraction – in this case, IDbConnectionFactory.Create returns an IDbConnection object. Implementing this factory class is exactly as simple as you’d think – here’s DbConnectionFactory:

'@Exposed
'@Folder("SecureADODB.DbConnection")
'@ModuleDescription("An implementation of an abstract factory that creates DbConnection objects.")
Option Explicit
Implements IDbConnectionFactory

Private Function IDbConnectionFactory_Create(ByVal connString As String) As IDbConnection
    Set IDbConnectionFactory_Create = DbConnection.Create(connString)
End Function

And here’s StubDbConnectionFactory:

'@Folder("Tests.Stubs")
'@ModuleDescription("A stub acting as a IDbConnectionFactory implementation.")
Option Explicit
Implements IDbConnectionFactory
Private Type TInvokeState
    CreateConnectionInvokes As Long
End Type
Private this As TInvokeState

Private Function IDbConnectionFactory_Create(ByVal connString As String) As IDbConnection
    this.CreateConnectionInvokes = this.CreateConnectionInvokes + 1
    Set IDbConnectionFactory_Create = New StubDbConnection
End Function

Public Property Get CreateConnectionInvokes() As Long
    CreateConnectionInvokes = this.CreateConnectionInvokes
End Property

The test stub is more “complex” because it tracks method invocations, so that tests can know whether & how many times any given member was invoked during a test run.

The Abstract Factory pattern is very useful with Dependency Injection: it gives us an abstraction to inject when a class needs a dependency that just cannot be injected when the object is created – the alternative would be tight coupling: if we weren’t injecting a connection factory, then the command class would’ve had to be the one invoking DbConnection.Create – tightly coupling it with the DbConnection class and instantly making unit testing impossible. An abstract factory removes the coupling and allows unit tests to inject an alternative/stub implementation of the factory that creates StubDbConnection objects.

Wrapping it all up

AutoDbConnection can very well be consumed as-is by the client code:

Dim results As ADODB.Recordset
Set results = AutoDbConnection.Create(connString, New DbConnectionFactory, DbCommandBase.Create(AdoParameterProvider.Create(AdoTypeMappings.Default))).Execute(sql)

The only problem is that, well, the dependencies need to be resolved somehow, and that means the client code is now responsible for wiring everything up. While each component has a clear purpose, explicitly creating all these objects quickly gets old and redundant: we need an object that simplifies this – enter IUnitOfWork, and now we can use this much simpler code:

Dim results As ADODB.Recordset
Set results = UnitOfWork.FromConnectionString(connString).Command.Execute(sql)

Unit of Work is a design pattern that encapsulates a transaction: each individual operation can succeed or fail, and the unit of work either succeeds or fails as a whole. These notions are abstracted in the IUnitOfWork interface:

'@Folder("SecureADODB.UnitOfWork")
'@ModuleDescription("Represents an object encapsulating a database transaction.")
'@Interface
'@Exposed
Option Explicit

'@Description("Commits the transaction.")
Public Sub Commit()
End Sub

'@Description("Rolls back the transaction.")
Public Sub Rollback()
End Sub

'@Description("Creates a new command to execute as part of the transaction.")
Public Function Command() As IDbCommand
End Function

When a UnitOfWork is created, it initiates a database transaction. When it is destroyed before the transaction is committed, the transaction gets rolled back and from the database’s point of view, it’s like nothing happened.

Transaction?

If you’re unfamiliar with database transactions, there’s an easy example to illustrate what they do: imagine you have an Accounts table, and you’re processing a transfer – you need to UPDATE the record for the source account to deduct the transfer amount, then UPDATE the record for the destination account to add the transferred amount. In a happy world where everything goes well that would be the end of it… but the world is a cruel place, and assuming the 1st command goes through, nothing guarantees nothing will blow up when sending the 2nd command. Without transactions, the funds would simply vanish: they’re gone from the first account, and they were never added to the second account. With a transaction, we can rollback everything when the 2nd operation completes, no funds vanish and the data is exactly the way it was before the transaction started.


Again, the implementation is pretty straightforward – the only peculiarity is that the class has two factory methods – one named Create that takes all the dependencies in, and another named FromConnectionString that conveniently wires up a default set of dependencies (and then passes them to the Create method to avoid duplicating code).

'@Folder("SecureADODB.UnitOfWork")
'@ModuleDescription("An object that encapsulates a database transaction.")
'@PredeclaredId
'@Exposed
Option Explicit
Implements IUnitOfWork
Private Type TUnitOfWork
    Committed As Boolean
    RolledBack As Boolean
    Connection As IDbConnection
    CommandFactory As IDbCommandFactory
End Type
Private this As TUnitOfWork

'@Description("Creates a new unit of work using default configurations.")
'@Ignore ProcedureNotUsed
Public Function FromConnectionString(ByVal connString As String) As IUnitOfWork
    
    Dim db As IDbConnection
    Set db = DbConnection.Create(connString)
    
    Dim provider As IParameterProvider
    Set provider = AdoParameterProvider.Create(AdoTypeMappings.Default)
    
    Dim baseCommand As IDbCommandBase
    Set baseCommand = DbCommandBase.Create(provider)
    
    Dim factory As IDbCommandFactory
    Set factory = DefaultDbCommandFactory.Create(baseCommand)
    
    Set FromConnectionString = UnitOfWork.Create(db, factory)
    
End Function

'@Inject: just an idea.. see #https://github.com/rubberduck-vba/Rubberduck/issues/5463
Public Function Create(ByVal db As IDbConnection, ByVal factory As IDbCommandFactory) As IUnitOfWork
    Errors.GuardNonDefaultInstance Me, UnitOfWork
    Errors.GuardNullReference factory
    Errors.GuardNullReference db
    Errors.GuardExpression db.State <> adStateOpen, message:="Connection should be open."
    
    Dim result As UnitOfWork
    Set result = New UnitOfWork
    Set result.CommandFactory = factory
    Set result.Connection = db
    
    Set Create = result
End Function

'@Inject: this member should only be invoked by Me.Create, where Me is the class' default/predeclared instance.
'@Ignore ProcedureNotUsed: false positive with v2.5.0.5418
Friend Property Set Connection(ByVal value As IDbConnection)
    Errors.GuardDoubleInitialization this.Connection
    Errors.GuardNullReference value
    Set this.Connection = value
    this.Connection.BeginTransaction
End Property

'@Inject: this member should only be invoked by Me.Create, where Me is the class' default/predeclared instance.
'@Ignore ProcedureNotUsed: false positive with v2.5.0.5418
Friend Property Set CommandFactory(ByVal value As IDbCommandFactory)
    Errors.GuardDoubleInitialization this.CommandFactory
    Errors.GuardNullReference value
    Set this.CommandFactory = value
End Property

Private Sub Class_Terminate()
    On Error Resume Next
    If Not this.Committed Then this.Connection.RollbackTransaction
    On Error GoTo 0
End Sub

Private Sub IUnitOfWork_Commit()
    Errors.GuardExpression this.Committed, message:="Transaction is already committed."
    Errors.GuardExpression this.RolledBack, message:="Transaction was rolled back."
    On Error Resume Next ' not all providers support transactions
    this.Connection.CommitTransaction
    this.Committed = True
    On Error GoTo 0
End Sub

Private Function IUnitOfWork_Command() As IDbCommand
    Set IUnitOfWork_Command = this.CommandFactory.Create(this.Connection)
End Function

Private Sub IUnitOfWork_Rollback()
    Errors.GuardExpression this.Committed, message:="Transaction is already committed."
    On Error Resume Next ' not all providers support transactions
    this.Connection.RollbackTransaction
    this.RolledBack = True
    On Error GoTo 0
End Sub

Errors

If you paid close attention to the code listings so far, you likely already noticed the many Errors.GuardXxxxx member calls scattered throughout the code. There are probably as many ways to deal with custom errors as there are VBA classes out there, this is one way. Probably not the best way, but it feels “just right” for me in this case and I think I like it enough to keep using it until the problems it creates become clearer (there’s always something). Errors is a standard private module in the project, that defines custom error codes. Okay I was lazy and deemed SecureADODBCustomError all I needed, but it could also have been an Enum with descriptive names for each custom error code.

The module simply exposes a small number of very simple Sub procedures that make it easy for the rest of the code to raise meaningful custom errors:

'@Folder("SecureADODB")
'@ModuleDescription("Global procedures for throwing common errors.")
Option Explicit
Option Private Module

Public Const SecureADODBCustomError As Long = vbObjectError Or 32

'@Description("Re-raises the current error, if there is one.")
Public Sub RethrowOnError()
    With VBA.Information.Err
        If .Number <> 0 Then
            Debug.Print "Error " & .Number, .Description
            .Raise .Number
        End If
    End With
End Sub

'@Description("Raises a run-time error if the specified Boolean expression is True.")
Public Sub GuardExpression(ByVal throw As Boolean, _
Optional ByVal Source As String = "SecureADODB.Errors", _
Optional ByVal message As String = "Invalid procedure call or argument.")
    If throw Then VBA.Information.Err.Raise SecureADODBCustomError, Source, message
End Sub

'@Description("Raises a run-time error if the specified instance isn't the default instance.")
Public Sub GuardNonDefaultInstance(ByVal instance As Object, ByVal defaultInstance As Object, _
Optional ByVal Source As String = "SecureADODB.Errors", _
Optional ByVal message As String = "Method should be invoked from the default/predeclared instance of this class.")
    Debug.Assert TypeName(instance) = TypeName(defaultInstance)
    GuardExpression Not instance Is defaultInstance, Source, message
End Sub

'@Description("Raises a run-time error if the specified object reference is already set.")
Public Sub GuardDoubleInitialization(ByVal instance As Object, _
Optional ByVal Source As String = "SecureADODB.Errors", _
Optional ByVal message As String = "Object is already initialized.")
    GuardExpression Not instance Is Nothing, Source, message
End Sub

'@Description("Raises a run-time error if the specified object reference is Nothing.")
Public Sub GuardNullReference(ByVal instance As Object, _
Optional ByVal Source As String = "SecureADODB.Errors", _
Optional ByVal message As String = "Object reference cannot be Nothing.")
    GuardExpression instance Is Nothing, Source, message
End Sub

'@Description("Raises a run-time error if the specified string is empty.")
Public Sub GuardEmptyString(ByVal value As String, _
Optional ByVal Source As String = "SecureADODB.Errors", _
Optional ByVal message As String = "String cannot be empty.")
    GuardExpression value = vbNullString, Source, message
End Sub

Most of these procedures are invoked as the first executable statement in a given scope, to raise an error given invalid parameters or internal state, such as these:

Private Sub IUnitOfWork_Commit()
    Errors.GuardExpression this.Committed, message:="Transaction is already committed."
    Errors.GuardExpression this.RolledBack, message:="Transaction was rolled back."
    On Error Resume Next ' not all providers support transactions
    this.Connection.CommitTransaction
    this.Committed = True
    On Error GoTo 0
End Sub

Consistently raising such errors is the single best way to ensure our objects are always in a known and usable state, because we outright forbid them to be invalid. These validation clauses are called guard clauses, hence the GuardXxxxx procedure names.

A lot of the unit tests simply verify that, given the specified conditions, the expected error is raised:

'@TestMethod("Factory Guard")
Private Sub Create_ThrowsIfNotInvokedFromDefaultInstance()
    On Error GoTo TestFail
    
    With New AutoDbCommand
        On Error GoTo CleanFail
        Dim sut As IDbCommand
        Set sut = .Create("connection string", New StubDbConnectionFactory, New StubDbCommandBase)
        On Error GoTo 0
    End With
    
CleanFail:
    If Err.Number = ExpectedError Then Exit Sub
TestFail:
    Assert.Fail "Expected error was not raised."
End Sub

If each guard clause has a unit test, then the tests are effectively documenting how the objects are meant to be used. With more specific custom errors, the tests would be more accurate, but there’s a point where you need to look at what you’ve got and say “I think I can work with that”, and move on.


Audience

Obviously, one doesn’t import 20 classes into their VBA project just to send one ADODB command to a database server. However if you’re maintaining a VB6 application that uses ADODB all over the place, leaks connections, leaves recordsets dangling, …then importing this API can really help tighten up the data access code in that legacy app. Or maybe you’re writing a complex data-driven system in VBA for Excel because that’s all you’ve got, and a UnitOfWork abstraction makes sense for you.

The goal here is mostly to 1) demonstrate proper usage of ADODB.Command for secure, parameterized queries, and 2) demonstrate that Classic VB (VB6/VBA) has always had everything everyone ever needed to write full-blown object-oriented code that leverages abstraction, encapsulation, and polymorphism – making it possible to write clean and fully unit-tested code.

…and of course, it makes a great practical application of the OOP concepts discussed in many other articles on this blog. Studying the code in this project gives you insight on…

  • OOP foundations: abstraction, encapsulation, polymorphism.
  • SOLID principles: single responsibility, dependency inversion, etc.
  • DI techniques: property injection, abstract factory.
  • Unit testing: what to test, how to test, stubbing dependencies, etc.
  • Using custom errors, guard clauses, input validation.
  • Leveraging Rubberduck annotations, minimizing inspection results.

Office-JS & Script Lab

Apparently this is this blog’s 100th article (!), and since Rubberduck is also about the future of Office automation in VBA, I wanted to write about what’s increasingly being considered a serious contender for an eventual replacement of Visual Basic for Applications. Just recently Mr.Excel (Bill Jelen) uploaded a video on YouTube dubbing it the “VBA killer”, and without being over-dramatic, I can’t help but to pragmatically agree with the sentiment… to an extent.

Forget VBA, think Win32 and COM: the Web has been “threatening” the future of Windows desktop applications of all kinds for about as long as VBA has been around. Windows desktop development went from COM-based to .NET, and now to cross-platform .NET Core, and there’s still COM interoperability built into .NET. It’s 2020 and Microsoft SQL Server runs perfectly fine on Linux, and you can use Microsoft Visual Studio on your Mac now, and a lot of what Microsoft does is open-sourced and accepts contributions, including .NET itself… and TypeScript is up there, too.

VBA isn’t going anywhere.

COM hasn’t gone anywhere either. If you used any Declare statements in VBA you probably know about user32.dll and kernel32.dll. The Win32 API is here to stay; COM is here to stay. My reading is that as long as the Windows plumbing exists to make it possible, VBA has no reason to go anywhere. The problem is that VBA and its COM-based Win32 infrastructure are essentially a dead end: it’s literally not going anywhere. The VBE has long been abandoned, and VBA as a language is stuck 20 years ago… but it’s likely going to stick around for a long time in desktop-land, even if (when?) the Excel COM type library stops getting new members – as the freezing of the GitHub repository holding the official VBA documentation suggests:

“This repo is no longer accepting PRs or new issues.”

Maybe (probably) I’m reading way too much into this, but to me that is a sign that we’ve reached a critical point in VBA’s history/lifetime. I do note that the repository wasn’t made read-only and that it’s still possible to submit a pull request, but the wording strongly suggests not to.

Meanwhile, the Office Extensibility team is hard at work getting the Excel Online automation capabilities of Office-JS on par with what can be achieved on Win32/desktop with VBA. As time marches forward, eventually we’ll reach a tipping point where Office-JS stabilizes while more and more enterprises embrace the Web as a platform: maybe I’m over-estimating how long that transition will take, but even well beyond that tipping point, COM and VBA will very likely still be around for a long, long time. It’s just that eventually the Excel team will have to stop updating (but not necessarily stop shipping) the COM type library, and focus on cross-platform extensibility.

Now, have you tried Excel Online? Personally, I don’t use it a lot (Rubberduck is Win32-only), but functions like XLOOKUP and SORT (and dynamic arrays in general) are a massive game-changer, and I will neither confirm nor deny that there are even more amazing capabilities to come. Things like this should make anyone seriously think twice before opting for a plain old perpetual desktop license: Excel 2016 isn’t going to get XLOOKUP anymore than Excel 2010 ever will…


This week I decided I was tired of seeing proof-of-concept “hello world” code demonstrating what Office-JS can do, and went on to explore and scratch more than just the surface. I found a Tetris game and decided to port my OOP Battleship from VBA to TypeScript… a language I know next to nothing about (and, looking at that Tetris game code and comparing it to mine… it shows!).

Script Lab

If you’re a VBA enthusiast, the first thing you notice in Excel Online, is the absence of a Developer tab. To automate Excel on the Web, you need to add Script Lab, a free add-on that brings up a task pane titled “Code”, that is very simple to use and that looks like this:

The default snippet merely sets up a “Run” UI button and wires it up to invoke a run async function that… does nothing but bring up a little banner at the top of the task pane that says “Your code goes here”.

As VBA developers, we’re used to having an actual IDE with an edit-and-continue debugger, dividing our projects into modules, and dragging and dropping controls onto a UserForm visual designer. So, your first impression of Script Lab might very well be that it’s even less of a code editor than the VBE is – especially with Rubberduck! You have to walk into it with an open mind, and with an open heart you just might discover a new friend, like I did.

Paradigm Shift

I’ve written code for a long time, but I’m not a web developer. HTML, JavaScript and CSS have scared me ever since they came into existence: too many things to think about, too many browsers, too many little things that break here but work there. I’ve been telling myself “I should try to do this” for years now, and I have to say that the project in the screenshot below is really my first [somewhat] serious attempt at anything web, …if we exclude what little ASP.NET/MVC I wrote for the rubberduckvba.com website (I’m more of a backend guy okay!).

So here’s the paradigm: that task pane is your playground, your sandbox – you have full control over everything that happens in there, the only limit is really just how bad you can be at CSS and HTML:

It’s not playable yet. I’ll definitely share it when it is …after a code review and a refactoring!

You can pop the code editor panel out into a separate browser window, which I warmly recommend doing – the code window docked on one side, the worksheet on the other. Another thing you’ll want to do is tweak your user settings to set editor.wordwrap: 'off', because for some reason the default setting is to word-wrap long lines of code, …which makes an interesting [CSS] tab when you have base-64 encoded .png image resources.

You’ll definitely want to pop the code editor into its own separate browser window.

There are a couple minor annoyances with the editor itself. Working with a single-file script for any decent-sized project, means you’re going to be scrolling up and down a lot. Hard to reliably reproduce, but I’m finding the editor tends to frequently (but thankfully, harmlessly) crash given a syntax error, like if you deleted a semicolon or something. Navigating between tabs loses your caret position, which means more scrolling. Could be just my machine (or my way-too-large-for-its-own-good script), but I’ve also experienced frequent and significant slow-downs and delays when typing.

Not having edit-and-continue debugging capabilities is a major paradigm shift as well, but then Script Lab isn’t meant to be a full-blown Integrated Development Environment… and the code that runs isn’t the code you’re editing; TypeScript compiles down to pure JavaScript, and mapping files need to get involved to help a TypeScript editor map the compiled JavaScript to the source TypeScript instructions.

On the bright side, like in Visual Studio { scopes } can be folded /collapsed, which does help reduce the amount of scrolling around and is a very useful and welcome editor feature. Also I couldn’t help but notice with utter glee that the editor auto-completes parentheses, braces, brackets, single and double quotes, …but while it does highlight matching parenthesis, unlike Rubberduck’s self-closing pair feature, backspacing onto a ( will not delete the matching closing ) character. One nice thing it does that Rubberduck autocompletion doesn’t, is that it wraps the selection: you can select an expression, type ( and instead of overwriting your selection with that character, it “wraps” the selected expression and you end up with (your selection).

As a programming language, TypeScript feels very much like the single best way to approach JavaScript: it supports strong types like a statically-typed language, and dynamic types, …like JavaScript (think Variant in VBA, but one to which you can tack-on any member you like at run-time). Coming from C# I’m finding myself surprisingly capable in this language that supports inherently object-oriented structures like classes and interfaces, and where even string literals have a ton of useful members (built-in support for regular expressions! lookbehinds in regex patterns!). Learning how string interpolation works will quickly make VBA concatenations feel clunky. Array methods will quickly become second-nature and you’ll realize just how much looping we do in VBA just because the types we’re dealing with have so little functionality.

But the most significant thing has to be how functions are now first-class citizens that can be passed around as parameters like any other object, just like we do in C# with delegates and lambda expressions. For example, in the constructor of my Ship class, I’m populating a Map<GridCoord, boolean> to hold the ship’s internal hit-state:

this.state = new Map<GridCoord, boolean>(
  new Array(this.size).fill(false).map((value: boolean, index: number): [GridCoord, boolean] => {
    let p = orientation === ShipOrientation.Horizontal 
      ? position.offset(index - 1, 0) 
      : position.offset(0, index - 1);
    return [p, false];
  }
);

We’re creating a ship of a particular size and orientation, and the state means to hold the hit-state (true: hit) of each individual grid coordinate occupied by the ship. new Array(this.size).fill(false) creates an array of the appropriate length, filled with false Boolean values; but I wanted to map the array indices to actual grid coordinates to make my life easier, so I simply use .map((value, index):[GridCoord, boolean] => {...}) to do exactly that!

Reads like character soup? Don’t worry, that code is more verbose than it needs to be, and the lambda syntax is confusing to everyone that never worked with it. In a nutshell, (i) => {...} represents a function that takes an i parameter. There is no As keyword to specify data types in TypeScript, instead we would declare a numeric variable with e.g. var i: number. That means (value, index):[GridCoord, boolean] => {...} represents a function that takes a value and an index parameter (their values are provided by the map method), and returns a tuple (the square-bracketed part; can be thought of as some kind of anonymous type that’s defined on-the-spot with unnamed but typed members) made of a GridCoord and a boolean value. Therefore, the body of that function works out what GridCoord/boolean value to yield for each item of the Array(this.size) array.

Ternary (i.e. 3-operands) operators are another nice thing VBA doesn’t have. foo = bar ? a : b; assigns a to foo if bar evaluates to true, and assigns b otherwise. The closest we have in VBA is the IIf function, but because the provided true-value and false-value arguments are arguments, they both need to be evaluated before the function is even invoked.

I could go on and on about every little language feature TypeScript has that VBA doesn’t, but the truth is, there’s simply no possible comparison to be made: as a language (I’m not talking about the capabilities of the Excel object model here), VBA loses on every single aspect. And while VBA is essentially constrained to the VBE, TypeScript is in no way constrained to Script Lab. In fact if I wanted to make an actual serious Office-JS project, I’d likely be using VSCode, which I admittedly have yet to use for anything, but I’ve heard only good things about this lightweight IDE… and if I didn’t like it then I could just stick to good old Visual Studio.


VBA will very likely remain the uncontested King of Office automation on desktop for a very long time still: programming in TypeScript is a lot of fun to me, but I’m not Joe-in-Accounting – I write code (C#, T-SQL, VBA, …) for a living, and I doubt Script Lab, HTML, CSS, JavaScript and Chrome developer tools appeal as much to someone that isn’t enthusiastic about not just automating Office, not just VBA, but about programming in general. And for that, and that alone, I posit that VBA will continue to rule as King of Win32 Office automation for many years to come, and Rubberduck will be there to keep adding modern-IDE functionalities to the Visual Basic Editor.

The King is dead, long live the King!

To be continued…

Password Authentication

Authenticating the user of our application is a common problem, with common pitfalls – some innocuous, some fatal. It’s also a solved problem, with a fairly standard solution. Unfortunately, it’s also a problem that’s too often solved with naive, “good-enough” solutions that make any security expert twitch.

The vast majority of scenarios don’t need any custom authentication. Accessing a SQL Server database? Use Windows Authentication! Windows Auth not possible? Use SQL Authentication over a secure network! App authentication isn’t for authenticating a user with a server. More like, the application itself needs a concept of users and privileges granted to certain groups of users, and so we need to prompt the user for a user name and a password. What could possibly go wrong?

Security First: Threat Model Assessment

The first question we need to ask ourselves, is literally “what could possibly go wrong?” — as in, what are we trying to do? If the answer is along the lines of:

  • Enhance user experience with tailored functionality
  • Grouping users into “roles” for easier management
  • Prevent accidental misuse of features

…then you’re on the right track. However if you’re thinking more in terms of…

  • Prevent intentional misuse of features
  • Securely prevent groups of users from accessing functionalities
  • Securely $(anything)

…then you’re going to need another kind of approach. VBA code is not secure, period. Even if the VBA project is password-protected, the VBE can be tricked into unlocking it with some clever Win32 API calls. So, the threat model should take into account that a power user that wants to see your code… will likely succeed …pretty easily, too.

That doesn’t mean VBA code gets a pass to do everything wrong! If you’re going to do password authentication, you might as well do it right.

Where to store users’ passwords?

We’ve all done this:

Private Const ADMIN_PWD As String = "@Dm!n"

…without realizing that the code of a VBA project – even locked – is compressed into a binary file that’s zipped with the rest of the Excel host document. But nothing prevents anyone from peeking at it, say, with Notepad++

Of course it had to be on line 42.

Obviously, hard-coding passwords is the worst possible idea: we need somewhere safe, right?

Truth is, not really. You could have everything you need in a hidden worksheet that anyone can see if they like; a database server is ideal, of course, but not necessary, if parts of your host document can be used as one (looking at you too, Microsoft Access).

The reason it doesn’t matter if the “passwords table” is compromised, is because you do not store passwords, period. Not even encrypted: the “passwords table” contains nothing that can be processed (decrypted) and then used as a password.

What you store is a hash of the users’ passwords, never the passwords themselves. For example, if a user’s password was password and we hashed it with the SHA256 hashing algorithm, we would be storing the following value:

5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8

Contrary to encryption and encoding, there is by definition no way to revert a hash value back to the original string password. It’s possible that some random string that’s not password might produce the same hash value (i.e. a hash collision) – but very (very very) unlikely, at least with SHA256 or higher.

There are many different hashing algorithms, producing values of varying length, at varying speeds: with cryptographically secure requirements, using slow algorithms that produce values with a low risk of collision will be preferred (harder/longer to brute-force). Other applications might use a faster MD5 hash that’s “good enough” if not very secure, for many things but a password.

Now obviously, if any two users have the same password, their SHA256 hash would be the same. If that’s a concern (it should be), then the solution is to use a salt: prepend a random string to the password, and hash the salted password string – assuming all users use a different salt value (it can be safely stored alongside the user record), then it becomes impossible to tell whether any two users have the same password just by looking at the table contents… and this is why a hidden worksheet is a perfectly fine place to store your user passwords if you can’t use a database for whatever reason.

Even if you know every user’s hash and salt value, you can’t know what password could possibly be prefixed with that particular salt value for a SHA256 hash algorithm to give exactly these specific bytes. These users all have the same password.

Storing a salted password hash prevents “translating” the hash values wholesale, using a lookup/”rainbow” table that contains common passwords and their corresponding hash representation. Even if one password is compromised, other users with the same password wouldn’t be, because their hash is different, thanks to the “salt” bytes.

Whether we code in C#, PHP, JavaScript, Python, Java, ..or VBA, there’s simply not a single valid reason to store user passwords in plain text. But how do we get that hash value out of a password string in the first place?

Hashing with VBA

There’s… no built-in support whatsoever for hashing in VBA… but nothing says we can’t make explicit late binding and the .NET Framework work for us! Note that we’re invoking the ComputeHash_2 method, because it’s an overload of the ComputeHash method that takes the byte array we want to give it. COM/VBA doesn’t support method overloading, so when .NET exposes overloads to COM, it appends _2 to the method name, _3, _4, and so on for each overload. The order depends on… the order they were written to the IDL, which means you could… just trust Stack Overflow on that one, and go with ComputeHash_2:

Public Function ComputeHash(ByVal value As String) As String
    Dim bytes() As Byte
    bytes = StrConv(value, vbFromUnicode)
    
    Dim algo As Object
    Set algo = CreateObject("System.Security.Cryptography.SHA256Managed")
    
    Dim buffer() As Byte
    buffer = algo.ComputeHash_2(bytes)
        
    ComputeHash = ToHexString(buffer)
End Function
Private Function ToHexString(ByRef buffer() As Byte) As String
    Dim result As String
    Dim i As Long
    For i = LBound(buffer) To UBound(buffer)
        result = result & Hex(buffer(i))
    Next
    ToHexString = result
End Function

This code would feel right at home in a SHA256Managed standard module, or it could be a class that implements some IHashAlgorithm interface with a ComputeHash method – and with it we have everything we need to start handling password-based authentication in VBA …by today’s best practices.

What follows is an object-oriented approach to leveraging this function in a VBA project that needs to authenticate a user. An online copy of this code can be downloaded from GitHub: https://github.com/rubberduck-vba/examples/tree/master/Authentication


IHashAlgorithm

I like having functionality neatly abstracted, so instead of just having a public ComputeHash function that computes the SHA256 hash for a given string, I’ll have a class module formalizing what a hash algorithm does:

'@Folder("Authentication.Hashing")
'@ModuleDescription("An interface representing a hashing algorithm.")
'@Interface
Option Explicit
'@Description("Computes a hash for the given string value.")
Public Function ComputeHash(ByVal value As String) As String
End Function

One implementation would be this SHA256Managed class module:

'@Folder("Authentication.Hashing")
'@PredeclaredId
Option Explicit
Implements IHashAlgorithm
Private base As HashAlgorithmBase
'@Description("Factory method creates and returns a new instance of this class.")
Public Function Create() As IHashAlgorithm
    Set Create = New SHA256Managed
End Function
Private Sub Class_Initialize()
    Set base = New HashAlgorithmBase
End Sub
Private Function IHashAlgorithm_ComputeHash(ByVal value As String) As String
    Dim bytes() As Byte
    bytes = StrConv(value, vbFromUnicode)
    
    Dim algo As Object
    Set algo = CreateObject("System.Security.Cryptography.SHA256Managed")
    
    Dim buffer() As Byte
    buffer = algo.ComputeHash_2(bytes)
        
    IHashAlgorithm_ComputeHash = base.ToHexString(buffer)
End Function

By coding against an interface (i.e. by invoking ComputeHash off the IHashAlgorithm interface), we are making the code easier to modify later without breaking things: if a functionality needs a MD5 hash algorithm instead of SHA256, we can implement a MD5Managed class and inject that, and no client code needs to be modified, because the code doesn’t care what specific algorithm it’s working with, as long as it implements the IHashAlgorithm interface.

The HashAlgorithmBase class is intended to be used by all implementations of IHashAlgorithm, so we’re using composition to simulate inheritance here (the coupling is intended, there’s no need to inject that object as a dependency). The class simply exposes the ToHexString function, so that any hashing algorithm can get a hex string out of a byte array:

'@Folder("Authentication.Hashing")
'@ModuleDescription("Provides common functionality used by IHashAlgorithm implementations.")
Option Explicit
'@Description("Converts a byte array to a string representation.")
Public Function ToHexString(ByRef buffer() As Byte) As String
    Dim result As String
    Dim i As Long
    For i = LBound(buffer) To UBound(buffer)
        result = result & Hex(buffer(i))
    Next
    ToHexString = result
End Function

At this point we can already test the hashing algorithm in the immediate pane:

?SHA256Managed.Create().ComputeHash("abc")
BA7816BF8F1CFEA414140DE5DAE2223B0361A396177A9CB410FF61F2015AD

The next step is to create an object that’s able to take user credentials, and tell its caller whether or not the credentials are good. This is much simpler than it sounds like.

UserAuthModel

The first thing we need to address, is the data we’re going to be dealing with – the model. In the case of a dialog that’s prompting for a user name and a password, our model is going to be a simple class exposing Name and Password read/write properties, and here an IsValid property returns True if the Name and Password values aren’t empty:

'@Folder("Authentication")
Option Explicit
Private Type TAuthModel
    Name As String
    Password As String
    IsValid As Boolean
End Type
Private this As TAuthModel
Public Property Get Name() As String
    Name = this.Name
End Property
Public Property Let Name(ByVal value As String)
    this.Name = value
    Validate
End Property
Public Property Get Password() As String
    Password = this.Password
End Property
Public Property Let Password(ByVal value As String)
    this.Password = value
    Validate
End Property
Public Property Get IsValid() As Boolean
    IsValid = this.IsValid
End Property
Private Sub Validate()
    this.IsValid = Len(this.Name) > 0 And Len(this.Password) > 0
End Sub

Since this isn’t a model for changing a password, the validation logic doesn’t need to worry about the password’s length and/or content – only that a non-empty value was provided; your mileage may vary!

If we wanted the UI to provide a ComboBox dropdown to pick a user name, then the model class would need to encapsulate an array or collection that contains the user names, and that array or collection would be provided by another component.

IAuthService

When my object-oriented brain thinks “authentication”, what shapes up in my mind is a simple interface that exposes a single Boolean-returning function that takes user credentials, and returns True when authentication succeeds with the provided credentials.

Something like this:

'@Folder("Authentication")
'@ModuleDescription("An interface representing an authentication mechanism.")
'@Interface
Option Explicit
'@Description("True if the supplied credentials are valid, False otherwise.")
Public Function Authenticate(ByVal model As UserAuthModel) As Boolean
End Function

If we have a hidden worksheet with a table containing the user names, salt values, and hashed passwords for all users, then we could implement this interface with some WorksheetAuthService class that might look like this:

'@Folder("Authentication")
'@ModuleDescription("A service responsible for authentication.")
'@PredeclaredId
Option Explicit
Implements IAuthService
Private Type TAuthService
    Algorithm As IHashAlgorithm
End Type
Private Type TUserAuthInfo
    Salt As String
    Hash As String
End Type
Private this As TAuthService
Public Function Create(ByVal hashAlgorithm As IHashAlgorithm)
    With New WorksheetAuthService
        Set .Algorithm = hashAlgorithm
        Set Create = .Self
    End With
End Function
Public Property Get Self() As IHashAlgorithm
    Set Self = Me
End Property
Public Property Get Algorithm() As IHashAlgorithm
    Set Algorithm = this.Algorithm
End Property
Public Property Set Algorithm(ByVal value As IHashAlgorithm)
    Set this.Algorithm = value
End Property
Private Function GetUserAuthInfo(ByVal user As String, ByRef outInfo As TUserAuthInfo) As Boolean
'gets the salt value & password hash for the specified user; returns false if user can't be retrieved.
    On Error GoTo CleanFail
    With PasswordsSheet.Table
        
        Dim nameColumnIndex As Long
        nameColumnIndex = .ListColumns("Name").Index
        
        Dim saltColumnIndex As Long
        saltColumnIndex = .ListColumns("Salt").Index
        
        Dim hashColumnIndex As Long
        hashColumnIndex = .ListColumns("PasswordHash").Index
        
        Dim userRowIndex As Long
        userRowIndex = Application.WorksheetFunction.Match(user, .ListColumns(nameColumnIndex).DataBodyRange, 0)
        
        outInfo.Salt = Application.WorksheetFunction.Index(.ListColumns(saltColumnIndex).DataBodyRange, userRowIndex)
        outInfo.Hash = Application.WorksheetFunction.Index(.ListColumns(hashColumnIndex).DataBodyRange, userRowIndex)
    
    End With
    GetUserAuthInfo = True
CleanExit:
    Exit Function
CleanFail:
    Debug.Print Err.Description
    Debug.Print "Unable to retrieve authentication info for user '" & user & "'."
    outInfo.Salt = vbNullString
    outInfo.Hash = vbNullString
    GetUserAuthInfo = False
    Resume CleanExit
End Function
Private Function IAuthService_Authenticate(ByVal model As UserAuthModel) As Boolean
    
    Dim info As TUserAuthInfo
    If Not model.IsValid Or Not GetUserAuthInfo(model.Name, outInfo:=info) Then Exit Function
    
    Dim pwdHash As String
    pwdHash = this.Algorithm.ComputeHash(info.Salt & model.Password)
    
    IAuthService_Authenticate = (pwdHash = info.Hash)
    
End Function

If we only look at the IAuthService_Authenticate implementation, we can easily tell what’s going on:

  • If for any reason we can’t identify the specified user / get its authentication info, we bail
  • Using the user’s Salt string, we use the hashing algorithm’s ComputeHash method to get a hash string for the specified password.
  • Authentication succeeds if the hashed salted password matches the stored hash string for that user.

Note how the provided model.Password string isn’t being copied anywhere, or compared against anything.

The GetUserAuthInfo function is being considered an implementation detail here, but could easily be promoted to its own IUserAuthInfoProvider interface+implementation: the role of that function is to get the Salt and PasswordHash values for a given user, and here we’re pulling that from a table on a worksheet, but other implementations could be pulling it from a database: this is a concern in its own right, and could very easily be argued to belong in its own class, abstracted behind its own interface.

IAuthView

If we’re going to have a dialog for the user to enter their credentials into, then everything we’ve seen about the Model-View-Presenter UI design pattern is applicable here – we already have our model, and now we need an abstraction for a view.

'@Folder("Authentication")
'@Interface
Option Explicit
'@Description("Shows the view as a modal dialog. Returns True unless the dialog is cancelled.")
Public Function ShowDialog() As Boolean
End Function
Public Property Get UserAuthModel() As UserAuthModel
End Property

From an abstract standpoint, the view is nothing more than a function that displays the dialog and returns False if the dialog was cancelled, True otherwise.

The concrete implementation will be a UserForm that includes two textboxes, two command buttons, and a few labels – like this:

The code-behind for the form is very simple:

  • Change handlers for the textboxes assign the corresponding model property
  • Click handlers for the command buttons simply Hide the form
  • A Create factory method takes a UserAuthModel object reference
  • Model is exposed for property injection (only the factory method uses this property)
'@Folder("Authentication")
'@PredeclaredId
Option Explicit
Implements IAuthView
Private Type TAuthDialog
    UserAuthModel As UserAuthModel
    IsCancelled As Boolean
End Type
Private this As TAuthDialog
Public Function Create(ByVal model As UserAuthModel) As IAuthView
    If model Is Nothing Then Err.Raise 5, TypeName(Me), "Model cannot be a null reference"
    Dim result As AuthDialogView
    Set result = New AuthDialogView
    Set result.UserAuthModel = model
    Set Create = result
End Function
Public Property Get UserAuthModel() As UserAuthModel
    Set UserAuthModel = this.UserAuthModel
End Property
Public Property Set UserAuthModel(ByVal value As UserAuthModel)
    Set this.UserAuthModel = value
End Property
Private Sub OnCancel()
    this.IsCancelled = True
    Me.Hide
End Sub
Private Sub Validate()
    OkButton.Enabled = this.UserAuthModel.IsValid
End Sub
Private Sub CancelButton_Click()
    OnCancel
End Sub
Private Sub OkButton_Click()
    Me.Hide
End Sub
Private Sub NameBox_Change()
    this.UserAuthModel.Name = NameBox.Text
    Validate
End Sub
Private Sub PasswordBox_Change()
    this.UserAuthModel.Password = PasswordBox.Text
    Validate
End Sub
Private Sub UserForm_QueryClose(Cancel As Integer, CloseMode As Integer)
    If CloseMode = VbQueryClose.vbFormControlMenu Then
        Cancel = True
        OnCancel
    End If
End Sub
Private Function IAuthView_ShowDialog() As Boolean
    Me.Show vbModal
    IAuthView_ShowDialog = Not this.IsCancelled
End Function
Private Property Get IAuthView_UserAuthModel() As UserAuthModel
    Set IAuthView_UserAuthModel = this.UserAuthModel
End Property

The important thing to note, is that the form itself doesn’t do anything: it’s just an I/O device your code uses to interface with the user – nothing more, nothing less. It collects user-provided data into a model, and ensures the dialog validates that model.

The form knows about the UserAuthModel and its properties (Name, Password, IsValid), and nothing else. It doesn’t know how to get a list of user names to populate a dropdown so that the user can pick a name from a list (that could be done, but then the model would need a UserNames property). It doesn’t know how to verify whether the provided password string is correct. It’s …just not its job to do anything other than relay messages to & from the user.

IAuthPresenter

We have a UserAuthModel that holds the user-supplied credentials. We have a WorksheetAuthService that can take these credentials and tell us if they’re good, using any IHashAlgorithm implementation. We’re missing an object that pieces it all together, and that’s the job of a presenter.

What we want is for the code that needs an authenticated user, to be able to consume a simple interface, like this:

'@Folder("Authentication")
'@ModuleDescription("Represents an object that can authenticate the current user.")
'@Interface
Option Explicit
'@Description("True if user is authenticated")
Public Property Get IsAuthenticated() As Boolean
End Property
'@Description("Prompts for user credentials")
Public Sub Authenticate()
End Sub

Now, any class that encapsulates functionality that involves authenticating the current user can be injected with an IAuthPresenter interface, and when IsAuthenticated is True we know our user is who they say they are. And if we inject the same instance everywhere, then the user only needs to enter their credentials once for the authentication state to be propagated everywhere – without using any globals!

'@Folder("Authentication")
'@PredeclaredId
'@ModuleDescription("Represents an object responsible for authenticating the current user.")
Option Explicit
Implements IAuthPresenter
Private Type TPresenter
    View As IAuthView
    AuthService As IAuthService
    IsAuthenticated As Boolean
End Type
Private this As TPresenter
Public Function Create(ByVal service As IAuthService, ByVal dialogView As IAuthView) As IAuthPresenter
    Dim result As AuthPresenter
    Set result = New AuthPresenter
    Set result.AuthService = service
    Set result.View = dialogView
    Set Create = result
End Function
Public Property Get AuthService() As IAuthService
    Set AuthService = this.AuthService
End Property
Public Property Set AuthService(ByVal value As IAuthService)
    Set this.AuthService = value
End Property
Public Property Get View() As IAuthView
    Set View = this.View
End Property
Public Property Set View(ByVal value As IAuthView)
    Set this.View = value
End Property
Private Sub IAuthPresenter_Authenticate()
    If Not this.View.ShowDialog Then Exit Sub
    this.IsAuthenticated = this.AuthService.Authenticate(this.View.UserAuthModel)
End Sub
Private Property Get IAuthPresenter_IsAuthenticated() As Boolean
    IAuthPresenter_IsAuthenticated = this.IsAuthenticated
End Property

At this point any standard module macro (aka entry point) can create the presenter and its dependencies:

Public Sub DoSomething()
    
    Dim model As UserAuthModel
    Set model = New UserAuthModel
    
    Dim dialog As IAuthView
    Set dialog = AuthDialogView.Create(model)
    
    Dim algo As IHashAlgorithm
    Set algo = SHA256Managed.Create()
    
    Dim service As IAuthService
    Set service = WorksheetAuthService.Create(algo)
    
    Dim presenter As IAuthPresenter
    Set presenter = AuthPresenter.Create(service, dialog)
    
    presenter.Authenticate
    If presenter.IsAuthenticated Then
        MsgBox "Welcome!", vbInformation
    Else
        MsgBox "Access denied", vbExclamation
    End If
    
End Sub 

If this were real application code, instead of consuming the presenter it would be injecting it into some class instance, and invoking a method on that class. This composition root (where we compose the application / instantiate and inject all the dependencies) would probably be in the Workbook_Open handler, so that the authentication state can be shared between components.


Authorisation

Up to this point, we only cared for authentication, i.e. identifying the current user. While very useful, it doesn’t tell us who’s authorized to do what. Without some pretty ugly code that special-cases specific users (e.g. “Admin”), we’re pretty limited here.

One proven solution, is to use role-based authorisations. Users belong to a “group” of users, and it’s the “group” of users that’s authorized to do things, not users themselves.

In order to do this, the WorksheetAuthService implementation needs to be modified to add a RoleId member to the TUserAuthInfo, and the IAuthService.Authenticate method could return a Long instead of a Boolean, where 0 would still mean a failed authentication, but any non-zero value would be the authenticated user’s RoleId.

Roles could be defined by an enum (note the default / 0 value):

Public Enum AuthRole
    Unauthorized = 0
    Admin
    Maintenance
    Auditing
End Enum

Or, role membership could be controlled in Active Directory (AD), using security groups – in that case you’ll want your IAuthService implementation to query AD instead of a worksheet, and the IAuthPresenter implementation to hold the current user’s role ID along with its authentication status.


There are many ways to go about implementing authentication, and many implementation-specific concerns. For example, if you’re querying a database for this, you’ll want to use commands and proper parameterization to avoid the problems associated with SQL Injection vulnerabilities: maybe a user named Robert');DROP TABLE USERS;-- isn’t part of your threat model, but can Tom O'Neil log onto your system without breaking anything?

Regardless of the approach, if you’re comparing the user’s plain-text password input with the plain-text password stored in $(storage_medium), you’re doing it wrong – whether that’s in VBA or not.

Code Name: Sheet1

There are so many ways to get ahold of a Worksheet reference: you can dereference it from a Sheets collection, and even then you need to decide whether that’ll be off Workbook.Sheets or Workbook.Worksheets, two properties that both return a Sheets collection that will contain the worksheet you’re looking for. The Workbook might be the ActiveWorkbook, or it could be some object variable that was assigned earlier, with the result of Workbooks.Open. Or you might like living on the edge, and activate the Window that has some path/filename as a caption, and then work off the ActiveWorkbook. Every single one of these cases have a thing in common: the Workbook involved isn’t necessarily ThisWorkbook.

ActiveWorkbook vs. ThisWorkbook

In Excel, only one single Workbook is ever the ActiveWorkbook at any given point in time. If all workbooks are closed, then ActiveWorkbook will be Nothing (add-ins in particular, need to mind this). When a workbook is activated, it fires an Activate event; if another workbook was active before that, then that workbook fired a Deactivate event.

The ActiveWorkbook can change in the middle of a loop that uses a DoEvents statement to keep Excel responsive, because the user clicked somewhere and that click was allowed to be handled, because Excel remains responsive: if the user can interact with Excel, you can never assume what ActiveWorkbook is referring to – it can be literally any workbook, or none at all. And after the next instruction it might be something else.

For all these reasons, ActiveWorkbook and ActiveSheet are object you will want to capture into a local variable at the beginning of whatever it is that you need to do, and then use that variable and never refer to ActiveSheetexplicitly or not, for the rest of that procedure. For example instead of this:

Public Sub DoSomething()
    ActiveSheet.Range("A1").Value = 42
    ActiveSheet.Range("A2").Value = VBA.DateTime.Date
End Sub

You’d do that:

Public Sub DoSomething()
    Dim sheet As Worksheet
    Set sheet = ActiveSheet
    sheet.Range("A1").Value = 42
    sheet.Range("A2").Value = VBA.DateTime.Date
End Sub

Of course that’s just an example: if I had to write such a small procedure in real code, I’d skip the local variable and have a With block withold the object reference for me – note the . dereferencing operator qualifying the Range member calls:

Public Sub DoSomething()
    With ActiveSheet
        .Range("A1").Value = 42
        .Range("A2").Value = VBA.DateTIme.Date
    End With
End Sub

This would be very, very different:

Public Sub DoSomething()
    With ActiveSheet
        Range("A1").Value = 42
        Range("A2").Value = VBA.DateTIme.Date
    End With
End Sub

Note the missing dereferencing . operator now: the With ActiveSheet block variable is never actually accessed here. So what sheet is it that these Range member calls are referring to? If that code is written anywhere other than in some worksheet module, then they’re implicitly referring to ActiveSheet. If that same identical code is written in some worksheet module (say, Sheet1), then it refers to that sheet (that’s Me, aka Sheet1).

Implicit qualifiers are evil: they strip vital context out of the code, and suddenly you need to do more than just read the code to understand what’s going on. If you’re going to be referring to ActiveSheet, you might as well be explicit about it.

So what’s ThisWorkbook then? In a word, it’s the host document: the Excel workbook in which your VBA project is hosted. ThisWorkbook always refers specifically to this host document, even if your VBA project is an add-in. Maybe it’s the ActiveWorkbook. Maybe it isn’t.

A very common mistake, is to treat the worksheets of ThisWorkbook like the worksheets of any other workbook (active or not).

Compile-Time, Run-Time

Another common mistake, is to treat worksheets of ThisWorkbook that already exist in ThisWorkbook.Worksheets at compile-time, the same way you’d treat worksheets that only come into existence at run-time.

If the sheet is already in the workbook when your VBA project is in design mode, then at compile-time a project-scope automagic Workbook variable (constant?) exists, named after the (Name) property of the module:

The “Name” property (bottom) is the sheet tab caption that the user can modify as they please; users don’t even get to see the “(Name)” property (top) unless they bring up the VBE.

By default the code name of the first sheet of an empty workbook, is Sheet1, same as its Name property value. When you do this:

Dim sheet As Worksheet
Set sheet = ThisWorkbook.Worksheets("Sheet1")
sheet.Range("A1").Value = 42

You are using this Name property, …and if a user renames the sheet, the statements suddenly starts raising run-time error 9 subscript out of range.

But if you gave the (Name) property a nice meaningful identifier name, say SummarySheet, then you could do this instead:

SummarySheet.Range("A1").Value = 42

SummarySheet is a programmatic identifier that is much harder to tamper with than the sheet tab’s caption, if you’re the worksheet’s end user.

You can’t use worksheets’ code names to access any other sheets than those that exist in ThisWorkbook at compile-time, so a really good habit to take early on, is to name things. Leave ThisWorkbook alone, but name every worksheet module in your project. And then use these names whenever you can: these worksheets are part of your VBA project, they never need to be obtained from a Sheets collection.

In fact, Set sheet = Sheets("Sheet1") is at best a missed opportunity, when the “Sheet1” in question exists in ThisWorkbook. At worst, it’s an outright bug… and that’s the reasoning behind Rubberduck’s sheet accessed using string inspection.

Rubberduck Annotations

I wrote about this unfortunately hard-to-discover feature in 2017, but a lot has happened since then, and there’s 5 times more of you now! The wiki is essentially up-to-date, but I’m not sure of its viewership. So here’s a recap of annotations in the late Rubberduck 2.4.1.x pre-release builds, that 2.5.0.x will launch with.

What we call “annotations” are special comments that look like these:

'@Folder("MyProject.Abstract")
'@ModuleDescription("An interface that describes an object responsible for something.")
'@Interface
'@Exposed

'@Description("Does something")
Public Sub DoSomething()
End Sub

Syntax

Rubberduck’s parser includes a grammar rule that captures these special comments, such that we “see” them like any other language syntax element (tokens), and can analyze them as such, too.

The syntax is rather simple, and is made to look like a procedure call – note that string arguments must be surrounded with double quotes:

'@AnnotationName arg1, arg2, "string argument"

If desired, parentheses can be used, too:

'@AnnotationName(arg1, arg2)
'@AnnotationName("string argument")

Whether you use one notation or the other is entirely up to personal preference, both are completely equivalent. As with everything else, consistency should be what matters.

There’s an inspection that flags illegal/unsupported annotations that you, if you’re using this @PseudoSyntax for other purposes, will probably want to disable: that’s done by setting its severity level to DoNotShow in the inspection settings, or by simply clicking “disable this inspection” from the inspection results toolwindow.

Keep in mind that while they are syntactically comments as far as VBA is concerned, to Rubberduck parsing the argument list of an annotation needs to follow strict rules. This parses correctly:

'@Folder "Some.Sub.Folder" @ModuleDescription "Some description" : some comment

Without the : instruction separator token, the @ModuleDescription annotation parses as a regular comment. After : though, anything goes.

There are two distinct types of annotation comments: some annotations are only valid at module level, and others are only valid at member level.

Module Annotations

Module-level annotations apply to the entire module, and must appear in that module’s declarations section. Personally, I like having them at the very top, above Option Explicit. Note that if there’s no declaration under the last annotation, and no empty line, then the placement becomes visually ambiguous – even though Rubberduck correctly understands it, avoid this:

Option Explicit
'@Description("description here")
Public Sub DoSomething() '^^^ is this the module's or the procedure's annotation?
End Sub

Let it breathe – always have an empty line between the end of the module’s declarations section (there should always at least be Option Explicit there) and the module’s body:

Option Explicit
'@Folder("MyProject") : clearly belongs to the module

'@Description("description here")
Public Sub DoSomething() '^^^ clearly belongs to the procedure
End Sub

What follows is a list of every single module-level annotation currently supported (late v2.4.1.x pre-release builds), that v2.5.0 will launch with.

@Folder

The Visual Basic Editor regroups modules in its Project Explorer toolwindow, by component type: you get a folder for your “Modules”, another folder for your “Class Modules”; if you have userforms they’re all under a “Forms” folder, and then the document modules are all lumped under some “Microsoft Excel Objects” folder (in an Excel host, anyway). While this grouping is certainly fine for tiny little automation scripts, it makes navigation wildly annoying as soon as a project starts having multiple features and responsibilities.

In a modern IDE like Visual Studio, code files can be regrouped by functionality into a completely custom folder hierarchy: you get to have a form in the same folder as the presenter class that uses it, for example. With Rubberduck’s Code Explorer toolwindow, you get to do exactly the same, and the way you do this is with @Folder annotations.

'@Folder("Root.Parent.Child")
Option Explicit

The @Folder annotation takes a single string argument representing the “virtual folder” a module should appear under, where a dot (.) denotes a sub-folder – a bit like .NET namespaces. Somewhere deep in the history of this annotation, there’s a version that’s even named @Namespace. “Folder” was preferred though, because “Namespace” was deemed too misleading for VBA/VB6, given the language doesn’t support them: all module names under a given project must still be unique. The Code Explorer toolwindow uses these annotations to build the folder hierarchy to organize module nodes under, but the folders don’t actually exist: they’re just a representation of the annotation comments in existing modules – and that is why there is no way to create a new, empty folder to drag-and-drop modules into.

It is strongly recommended to adopt a standard and consistent PascalCase naming convention for folder names: future Rubberduck versions might very well support exporting modules accordingly with these folder annotations, so these “virtual folders” might not be “virtual” forever; by using a PascalCase naming convention, you not only adopt a style that can be seamlessly carried into the .NET world; you also make your folders future-proof. Avoid spaces and special characters that wouldn’t be legal in a folder name under Windows.

The ModuleWithoutFolder inspection (under “Rubberduck Opportunities”), if enabled, will warn you of modules where this annotation is absent. By default, Rubberduck’s Code Explorer will put all modules under a single root folder named after the VBA project. While this might seem rather underwhelming, it was a deliberate decision to specifically not re-create the “by component type” grouping of the VBE and encourage our users to instead regroup modules by functionality.

@IgnoreModule

The @IgnoreModule annotation is automatically added by the “Ignore in Module” inspection quick-fix, which effectively disables a specific code inspection, but only in a specific module. This can be useful for inspections that have false positives, such as procedure not used firing results in a module that contains public parameterless procedures that are invoked from ActiveX controls on a worksheet, which Rubberduck isn’t seeing (hence the false positives), but that are otherwise useful, such that you don’t necessarily want to completely disable the inspection (i.e. set its severity level to DoNotShow).

If no arguments are specified, this annotation will make all inspections skip the module. To skip a specific inspection, you may provide its name (minus the Inspection suffix) as an argument. To ignore multiple inspections, you can separate them with commas like you would any other argument list:

'@IgnoreModule ProcedureNotUsed, ParameterNotUsed

Alternatively, this annotation may be supplied multiple times:

'@IgnoreModule ProcedureNotUsed
'@IgnoreModule ParameterNotUsed

Use the : instruction separator to terminate the argument list and add an explanatory comment as needed:

'@IgnoreModule ProcedureNotUsed : These are public macros attached to shapes on Sheet1

Note that the arguments (inspection names) are not strings: enclosing the inspection names in string literals will not work.

@TestModule

This was the very first annotation supported by Rubberduck. This annotation is only legal in standard/procedural modules, and marks a module for test discovery: the unit testing engine will only scan these modules for unit tests. This annotation does not support any parameters.

@ModuleDescription(“value”)

Given a string value, this annotation can be used to control the value of the module’s hidden VB_Description attribute, which determines the module’s “docstring” – a short description that appears in the VBE’s Object Browser, and that Rubberduck displays in its toolbar and in the Code Explorer.

Because Rubberduck can’t alter module attributes in document modules, this annotation is illegal in modules representing objects owned by the host application (i.e. “document” modules), such as Worksheet modules and ThisWorkbook.

@PredeclaredId

This annotation does not support any parameters, and can be used to control the value of the hidden VB_PredeclaredId attribute, which determines whether a class has a default instance. When a class has a default instance, its members can be invoked without an instance variable (rather, using an implicit one named after the class itself), like you did every single time you’ve ever written UserForm1.Show – but now you get to have a default instance for your own classes, and this opens up a vast array of new possibilities, most notably the ability to now write factory methods in the same class module as the class being factory-created, effectively giving you the ability to initialize new object instances with parameters, just like you would if VBA classes had parameterized constructors:

Dim something As Class1
Set something = Class1.Create("test", 42)

@Exposed

VBA classes are private by default: this means if you make a VBA project that references another, then you can’t access that class from the referencing project. By setting the class’ instancing property to PublicNotCreatable, a referencing project is now able to consume the class (but the class can only be instantiated inside the project that defines it… and that’s where factory methods shine).

This annotation visibly documents that the class’ instancing property has a non-default value (this can easily be modified in the VBE’s properties toolwindow).

@Interface

In VBA every class modules defines a public interface: every class can Implements any other class, but not all classes are created equal, and in the vast majority of the time what you want to follow the Implements keyword will be the name of an abstract interface. An abstract interface might look like this:

'@Interface
Option Explicit

Public Sub DoSomething()
End Sub

Adding this annotation to a module serves as metadata that Rubberduck uses when analyzing the code: the Code Explorer will display these modules with a dedicated “interface” icon, and an inspection will be able to flag procedures with a concrete implementation in these modules.

@NoIndent

Rubberduck’s Smart Indenter port can indent your entire VBA project in a few milliseconds, but automatically indenting a module can have undesirable consequences, such as losing hidden member attributes. Use this annotation to avoid accidentally wiping hidden attributes in a module: the indenter will skip that module when bulk-indenting the project.


Member Annotations

Member-level annotations apply to the entire procedure they’re annotating, and must be located immediately over the procedure’s declaration:

'@Description("Does something")
Public Sub DoSomething()
    '...
End Sub

As with module annotations, multiple member annotations can be specified for the same procedure – either by stacking them, or enumerating them one after the other:

'@DefaultMember
'@Description("Gets the item at the specified index")
Public Property Get Item(ByVal index As Long) As Object
    '...
End Property

Member annotations that aren’t immediately above the procedure declaration, will be flagged as illegal by the IllegalAnnotation inspection:

'@Description("Does something") : <~ annotation is illegal/misplaced

Public Sub DoSomething()
    '...
End Sub

@Description

This very useful annotation controls the value of the member’s hidden VB_Description attribute, which defines a docstring that appears in the bottom panel of the Object Browser when the member is selected – Rubberduck also displays this content in the context-sensitive (selection-dependent) label in the Rubberduck VBIDE toolbar.

Toolbar label content is dependent on the current selection in the editor and includes the value of the hidden attribute’s value.

@Ignore

Similar to @IgnoreModule, the purpose of the member-level @Ignore annotation is to get specific inspections to ignore the annotated procedure: it works identically.

@DefaultMember

Only one single member of a class can be the class’ default member. Default members should generally be avoided, but they are very useful for indexed Item properties of custom collection classes. This annotation takes no arguments.

@Enumerator

Custom collections that need to support For Each enumeration are required to have a member that returns an IUnknown, and hidden flags and attributes: this annotation clearly identifies the special member, and gets the hidden flags and attributes right every time.

'@Enumerator
Public Property Get NewEnum() As IUnknown
    Set NewEnum = encapsulatedCollection.[_NewEnum]
End Property

@ExcelHotkey

This rather specific annotation works in Excel-hosted VBA projects (as of this writing its absence may cause inspection false positives in other host applications, like Microsoft Word).

When the VBA project is hosted in Microsoft Excel, you can use this annotation to assign hotkeys using the same mechanism Excel uses to map hotkeys to recorded macros.

'@ExcelHotkey "D" : Ctrl+Shift+D will invoke this procedure in Excel
Public Sub DoSomething()
    '...
End Sub

'@ExcelHotkey "d" : Ctrl+D will invoke this procedure in Excel
Public Sub DoSomethingElse()
    '...
End Sub

Note that the annotation will work regardless of whether the argument is treated as a string literal or not – only the first character of the annotation argument is used, and its case determines whether the Shift key is involved in the hotkey combination (all hotkeys involve the Ctrl key): use an uppercase letter for a Ctrl+Shift hotkey.

@Obsolete

Code under continued maintenance is constantly evolving, and sometimes in order to avoid breaking existing call sites, a procedure might need to be replaced by a newer version, while keeping the old one around: this annotation can be used to mark the old version as obsolete with an explanatory comment, and inspections can flag all uses of the obsolete procedure:

'@Obsolete("Use DoSomethingElse instead.")
Public Sub DoSomething()
    '...
End Sub

Public Sub DoSomethingElse()
    '...
End Sub
The argument string appears in the inspection results for each use of the obsolete member.

Test Method Annotations

These annotations have been in Rubberduck for a very long time, and they are actually pretty easy to discover since they are automatically added by Rubberduck when adding test modules and test methods using the UI commands – but since Test Settings can be configured to not include setup & teardown stubs, it can be easy to forget they exist and what they do.

@TestMethod

This annotation is used in test modules to identify test methods: every test must be marked with this annotation in order to be discoverable as a test method. It is automatically added by Rubberduck’s “add test method” commands, but needs to be added manually if a test method is typed manually in the editor rather than inserted by Rubberduck.

This annotation supports a string argument that determines the test’s category, which appears in the Test Explorer toolwindow and enables grouping by category. If no category argument is specified, “Uncategorized” is used as a default:

@TestMethod("Some Category")
Private Sub TestMethod1()
'...
End Sub

The other @TestXxxxx member annotations are used for setup & teardown. If test settings have the “Test module initialization/cleanup” option selected, then @ModuleInitialize and @ModuleCleanup procedure stubs are automatically added to a new test module. If test settings have “Test method initialization/cleanup” selected, then @TestInitialize and @TestCleanup procedure stubs are automatically added a new test modules.

@TestInitialize

In test modules, this annotation marks procedures that are invoked before every single test in the module. Use that method to run setup/initialization code that needs to execute before each test. Each annotated procedure is invoked, but the order of invocation cannot be guaranteed… however there shouldn’t be a need to have more than one single such initialization method in the module.

@TestCleanup

Also used in test modules, this annotation marks methods that are invoked after every single test in that test module. Use these methods to run teardown/cleanup code that needs to run after each test. Again, each annotated procedure is invoked, but the order of invocation cannot be guaranteed – and there shouldn’t be a need to have more than one single such cleanup method in the module.

@ModuleInitialize

Similar to @TestInitialize, but for marking procedures that are invoked once for the test module, before the tests start running. Use these procedures to run setup code that needs to run before the module’s tests begin to run; each annotated procedure will be invoked, but the order of invocation cannot be guaranteed. Again, only one such initialization procedure should be needed, if any.

@ModuleCleanup

Similar to @TestCleanup, but for marking procedures that are invoked once for the test module, after all tests in the module have executed. Use these procedures to run teardown/cleanup code that needs to run after all module’s tests have completed; each annotated procedure will be invoked, but the order of invocation isn’t guaranteed. Only one such cleanup procedure should be needed, if any.


Annotations are one of Rubberduck’s most useful but unfortunately also one of its most obscure and hard-to-discover features. Fortunately, we have plans to surface them as right-click context menu commands in the 2.5.x release cycle.

Document Modules

The first time you discovered the Visual Basic Editor and visualized the VBA project in the Project Explorer, when you first met ThisWorkbook and Sheet1, document modules were the world: your baby steps in VBA were very likely in the code-behind of ThisWorkbook or Sheet1 – document modules are the gateway to VBA-land.

In the VBIDE Extensibility object model, the modules are accessed via the VBComponents property of a VBProject object, and the VBComponent.Type property has a value of vbext_ComponentType.vbext_ct_Document (an enum member value) for both ThisWorkbook and Sheet1 modules, but would be vbext_ct_StdModule for a standard module like Module1. Clearly if document modules are class modules, there’s enough “special” about them that they need to be treated differently by the VBA compiler.

Document?

VBA itself is host-agnostic: whether it’s hosted in Excel, PowerPoint, AutoCAD, Sage, Corel Draw, or any other one of over 200 licensed hosts, it’s the same VBA (there’s no such thing as “Excel VBA”, really), and it doesn’t have any intrinsic knowledge of what an Excel.Worksheet is – only the Excel library does; when your VBA project is hosted in Excel, then the Excel type library is automatically added to (and locked; you can’t remove it from) your project. Document modules are a special kind of module that the VBIDE treats as part of the VBA project, but it can’t add or remove them: If you want to add or remove a Worksheet module from a VBA project, you need to actually add or remove a worksheet from the host Excel workbook.

So, other than we can’t directly add such components to a VBA project without involving the host application’s object model, what’s special about them?

They’re in the VBA project, because the host application decided that’s what’s in a VBA project hosted in – here Excel, but each host application gets to decide whether a project includes a module to represent the host document, and under what circumstances to add or remove other types of modules, like Worksheet, or Chart.

Document modules can’t be instantiated or destroyed, so there’s no Initialize or Terminate event handler for them. However, you get to create a handler for any or every one of many convenient events that the host application fires at various times and that the authors of the object model deemed relevant to expose as programmatic extensibility points. For ThisWorkbook, this includes events like Workbook.Open and Workbook.NewSheet, for example. For Worksheet modules, that’s worksheet-specific events like Worksheet.SelectionChange or Worksheet.Change, that you can use to “hook” custom code when such or such thing happens in the host document.

ThisWorkbook

Your VBA project is hosted inside an Excel document. In the VBA project, the host document is referred to with an identifier, and that identifier is (by default anyway, and renaming it is not something I’d recommend doing) ThisWorkbook.

The Excel workbook that contains the current VBA project: that’s ThisWorkbook. When you’re in the code-behind of that module, you’re extending a Workbook object: if you type Me., the VBE’s IntelliSense/autocomplete list will display all the public members you’d find on any other Workbook object, plus any Public (explicitly or not) member. That’s what’s special about a document module: it literally inherits members from another class, as in inheritance – something VBA user code cannot do. Isn’t it fascinating that, under the hood, Visual Basic for Applications apparently has no problem with class inheritance? Something similar happens with UserForm code: the UserForm1 class inherits the members of any other UserForm, “for free”. And of course every Sheet1 inherits the members of every other Worksheet in the world.

So, procedures you write in a document module, should logically be very closely related to that particular document. And because host-agnostic logic can’t add/remove these modules, you’ll want to have as little code as possible in them – and then as a bonus, your VBA project becomes easier to keep under source control, because the code is in modules that VBE add-ins (wink wink) are able to properly import back in and synchronize to & from the file system.


What about ActiveWorkbook?

ActiveWorkbook refers to the one and only workbook that is currently active in the Excel Application instance, which may or may not be ThisWorkbook / the host document. It’s easy to confuse the two, and even easier to write code that assumes one is the other: the macro recorder does it, many documentation examples and Stack Overflow answers do it too. But reliable code is code that makes as few assumptions as possible – sooner or later, built-in assumptions are broken, and you’re faced with an apparently intermittent error 1004 that sometimes happens when you’re debugging and stepping through the code, and it happened to a user once or twice but the problem always seemed to vaporize just because you showed up at the user’s cubicle and stood there watching as nothing blew up and everything went fine. *Shrug*, right?


Accessing Worksheets

You shouldn’t be dereferencing worksheets all the time. In fact, you rarely even need to. But when you do, it’s important to do it right, and for the right reasons. The first thing you need to think of, is whether the sheet exists in ThisWorkbook at compile-time. Meaning, it’s there in the host document, you can modify it in Excel and there’s a document module for it in the VBA project.

That’s the first thing you need to think about, because if the answer to that is “yep, it’s right there and it’s called Sheet3“, then you already have your Worksheet object and there’s no need to dereference it from any Sheets or Worksheets collection!

Dim sheet As Worksheet
Set sheet = ThisWorkbook.Worksheets("Sheet1") '<~ bad if Sheet1 exists at compile-time!

Set sheet = Sheet1 '<~ bad because redundant: the variable obfuscates the target!
sheet.Range("A1").Value = 42 '<~ bad if sheet is a local variable, but good if a parameter

Sheet1.Range("A1").Value = 42 '<~ most reliable way to refer to a worksheet

The magic Sheet1 identifier comes from the (Name) property of the Sheet1 document module under ThisWorkbook in the VBA project: set that property to a valid and meaningful name for that specific worksheet, and you have a user-proof way to refer to your ConfigurationSheet, the SummarySheet, and that DataSheet. If the user decides to rename the DataSheet to “Data (OLD)” for some reason, this code is now broken:

ThisWorkbook.Worksheets("Data").Range("A1").Value = 42

Meanwhile this code will survive any user-induced sheet-name tampering:

DataSheet.Range("A1").Value = 42

Wait, is it .Sheets() or .Worksheets()?

The first thing to note, is that they aren’t language-level keywords, but member calls. If you don’t qualify them, then in the ThisWorkbook module you’ll be referring to Me.Worksheets (i.e. ThisWorkbook.Worksheets), and anywhere else in the VBA project that same code be implicitly referring to ActiveWorkbook.Worksheets: that’s why it’s important to properly qualify member calls. Worksheets is a member of a Workbook object, so you explicitly qualify it with a Workbook object.

Now, Sheets and Worksheets both return an Excel.Sheets collection object, whose default member Item returns an Object. Both are happy to take a string with a sheet name, or an integer with a sheet index. Both will be unhappy (enough to raise runtime error 9 “subscript out of range”) with an argument that refers to a sheet that’s not in the (implicitly or not) qualifying workbook object. Both will return a Sheets collection object if you give it an array of sheet names: that’s one reason why the Item member returns an Object and not a Worksheet. Another reason is that sometimes a sheet is a Chart, not a Worksheet.

Use the Worksheets collection to retrieve Worksheet items; the Sheets collection contains all sheets in the qualifying workbook, regardless of the type, so use it e.g. to retrieve the Chart object for a chart sheet. Both are equivalent, but Worksheets is semantically more specific and should be preferred over Sheets for the common Worksheet-dereferencing scenarios.


Dereferencing Workbooks

If you only need to work with ThisWorkbook, then you don’t need to worry about any of this. But as soon as your code starts opening other workbooks and manipulating sheets that are in these other workbooks, you need to either go nuts over what workbook is currently the ActiveWorkbook as you Activate workbooks and repeatedly go Workbooks("foo.xlsm").Activate…. or properly keep a reference to the objects you’re dealing with.

When you open another workbook with Application.Workbooks.Open, that Open is a function, a member of the Excel.Workbooks class that returns a Workbook object reference if it successfully opens the file.

Workbooks.Open is also side-effecting: successfully opening a workbook makes that workbook the new ActiveWorkbook, and so global state is affected by its execution.

When you then go and work off ActiveWorkbook or unqualified Worksheets(...) member calls, you are writing code that is heavily reliant on the side effects of a function, and global state in general.

The right thing to do, is to capture the function’s return value, and store the object reference in a local variable:

Dim book As Workbook
Set book = Application.Workbooks.Open(path) '<~ global-scope side effects are irrelevant!

Dim dataSheet As Worksheet
Set dataSheet = book.Worksheets("DATA")

If a workbook was opened by your VBA code, then your VBA code has no reason to not have a Workbook reference to that object.

So when is ActiveWorkbook useful then?

As an argument to a procedure that takes a Workbook parameter because it doesn’t care what workbook it’s working with, or when you need to assign a Workbook object variable (presumably a WithEvents module-scope private variable in some class module) to whatever workbook is currently active. With few specific exceptions, that’s all.

ActiveWorkbook.Whatever is just not code that you normally want to see anywhere.


Cha-cha-cha-Chaining Calls

Strive to keep the VBA compiler aware of everything that’s going on. Moreover if all the code you write is early-bound, then Rubberduck understands it as completely as it can. But implicit late binding is alas too easy to introduce, and the primary cause for it is chained member calls:

book.Worksheets("Sheet1").Range("A1").Value = 42 '<~ late bound at .Range

Everything after the Worksheets("Sheet1") call is late-bound, because as described above, Excel.Sheets.Item returns an Object, and member calls against Object can only ever be resolved at run-time.

By introducing a Worksheet variable to collect the Object, we cast it to a usable compile-time interface (that’s Worksheet), and now the VBA compiler can resume validating the .Range member call:

Dim sheet As Worksheet
Set sheet = book.Worksheets("Sheet1")
sheet.Range("A1").Value = 42 '<~ all early bound

Chained early-bound member calls are fine: the compiler will be able to validate the Range.Value member call, because the Excel.Worksheet.Range property getter returns a Range reference. If it returned Object, we would have to declare a Range local variable to capture the Excel.Range object we want to work with, like this:

Dim sheet As Worksheet
Set sheet = book.Worksheets("Sheet1") '<~ good: casts Object into Worksheet.

Dim cell As Range
Set cell = sheet.Range("A1") '<~ redundant: sheet.Range("A1") returns an early-bound Range.
cell.Value = 42 '<~ early-bound, but sheet.Range("A1").Value = 42 would also be early-bound.

Avoid declaring extraneous variables, but never hesitate to use a local variable to turn an Object into a compile-time type that gives you IntelliSense, autocompletion, and parameter quick-info: you’ll avoid accidentally running into run-time error 438 for typos Option Explicit can’t save you from. Using the compiler to validate everything it can validate, is a very good idea.

If you need to repeatedly invoke members off an early-bound object, introducing a local variable reduces the dereferencing and helps make the code feel less repetitive:

sheet.Range("A1").Value = "Rubberduck"
sheet.Range("A1").Font.Bold = True
sheet.Range("A1").Font.Size = 72
sheet.Range("A1").Font.Name = "Showcard Gothic"

By introducing a local variable, we reduce the cognitive load and no longer repeatedly dereference the same identical Range object pointer every time:

Dim cell As Range
Set cell = sheet.Range("A1")

cell.Value = "Rubberduck"
cell.Font.Bold = True
cell.Font.Size = 72
cell.Font.Name = "Showcard Gothic"

Arguably, nested With blocks could hold all the object references involved, and reduces the dereferencing to a strict minimum (.Font is only invoked once, and the reference is witheld), but it’s very debatable whether it enhances or hurts readability:

With sheet.Range("A1")
    .Value = "Rubberduck"
    With .Font
        .Bold = True
        .Size = 72
        .Name = "Showcard Gothic"
    End With
End With

Avoiding nested With blocks seems a fair compromise:

Dim cell As Range
Set cell = sheet.Range("A1")

cell.Value = "Rubberduck"
With cell.Font
    .Bold = True
    .Size = 72
    .Name = "Showcard Gothic"
End With

All this is fairly subjective, of course, and really applies to everything you ever write in VBA (not just when coding against the Excel object model), but any of it is better than this (very cleaned-up) macro-recorder code:

    Range("A1").Select
    ActiveCell.Value = "Rubberduck"
    With Selection.Font
        .Name = "Showcard Gothic"
        .Bold = True
        .Size = 72
    End With

We note the implicit ActiveSheet reference with the implicitly-qualified Range member call; we note the use of Range.Select followed by a use of ActiveCell; we note a With block holding a late-bound reference to Range.Font through the Selection object, and the compiler gets to validate absolutely nothing inside that With block.

The macro recorder doesn’t declare local variables: instead, it Selects things and then works late-bound against the Selection. That’s why it’s a bad teacher: while it’s very useful to show us what members to use to accomplish something, it does everything without leveraging any compile-time checks, and teaches to Activate the sheet you want to work with so that your unqualified Range and Cells member calls can work off the correct sheet… but now you know why, how, and when to dereference a Worksheet object into a local variable, you don’t need any Select and Activate!

“I’m not a programmer”

If you’re an accountant, a sales analyst, or in any other office position where writing VBA code helps you do your job faster, you may have uttered the words “I’m not a programmer” before, and that wouldn’t have been wrong: once the code is written, you’d tweak it every once in a while to fix a bug here or there, and then move on to do your actual job as soon as things look like they work as they should. If you’re finding that as weeks and months pass, you’re spending more and more time debugging that code, it might pay off to learn a bit more about how “actual programmers” do things, but be warned: it’s a bit of a rabbit hole – in a good way, but still, it goes as deep as you’re willing to go.

If you’re a consultant delivering advanced Excel solutions to your business clients, you may have thought, said, or written the same words, too. But you like your worksheets efficient, flexible, reliable, easy to maintain; over the years you’ve become an expert at dissecting and modeling a business problem into a fine solution that will live on and grow with the client. You are a “power user”, a professional, and it shows.

Writing code involves exactly the same identical problem-solving thought process: dissecting a problem into small steps, and modeling it into …a sequence of executable statements. When you use a SUM function in a worksheet, you assess whether the range of cells you’re adding up will need to grow over time, and you make sure it’s as simple as possible to add or insert a new row without breaking the integrity of the worksheet: you’re not just solving the problem at hand, you’re anticipating the extension points, facilitating them, making it harder to break things. You shade the background of cells intended for data entry, use borders around fields, conditional formatting, and data validation to ensure everything is obvious and remains consistent; you source data validation lists from named ranges pointing to a column in a table, so that adding new possible valid entries is easy as pie and requires no other step than… adding the new possible valid entries in the table. Next to another table that requires a particular sort order because it’s being used by dynamic named ranges that source data validation lists for co-dependent dropdowns, you might put up a very obvious formatted shape with an inner text that explains why that table needs to be sorted by this column then that column, and what happens to the associated validation dropdowns when the sort is broken.

See, you are a programmer. Worksheets are programs – even more intensely so with the amazing new features coming to Excel: dynamic arrays are changing the entire paradigm and turning the very thinking of worksheet functions into something that really isn’t very far from the mindset you’d have in functional programming.

So why is it that VBA code is so often seen as merely a sequence of executable statements then? Why is it that “it works, therefore it’s good enough” is so often where the bar is? You could have made that data validation list work off a standard range of cells in some (hidden?) column somewhere off-screen, and that would have worked too… but “well, it works” isn’t where the bar should be at, and you know it.

Depending on what the code needs to do, VBA code can become much more than just a macro once you start not just solving the problem at hand, but also anticipating the extension points, facilitating them, making it easier to maintain, and harder to break things. If writing code is part of what you do for a living, then you might as well write good code. Good code isn’t just code that works. It’s code that adheres to a number of language-agnostic principles and modern-day guidelines that even plain procedural code should follow. It’s code that Joe from accounting probably couldn’t have written by themselves, but that they could likely read and understand (at least at the higher abstraction levels), and if they know enough VBA to be dangerous, they could likely even maintain and extend it!

I’m not saying every piece of VBA code needs OOP and dependency injection and inversion of control and 20 class modules with 1 method each ought to get involved in sorting a ListObject. Just that maybe, VBA code would be a little less dreadful to the eventual IT staff inheriting it, if instead of saying “I’m not a programmer”, we cared about the quality of our code in the exact same way we care about the quality of our worksheets and dashboards, or Access databases and reports, or whatever it is that we’re doing.

The Macro Recorder Curse

The macro recorder is a wonderful thing. It’s one of the tools at your disposal to assist you in your journey, a good way to observe code that does exactly what you just did, and learn what parts of the object model to use for doing what. The problems begin when you see macro recorder code constantly invoking Range.Select and Worksheet.Activate, working against the Selection object, generating dozens of repetitive statements and redundant code that can easily trick a neophyte into thinking “so that’s how it’s done!” – the macro recorder is a great way to familiarize with a number of APIs, …and that’s all it needs to be.

There are very few ways to write more inefficient and bug-prone code, than to use the macro recorder’s output as a model of how VBA code should be written. How to avoid Select and Activate has to be the single most linked-to post in the VBA tag on Stack Overflow, yet an untold number of young souls remain to be saved from the curse of the macro recorder.

Of course, we have all the tools we need to defeat that curse. I’m not going to repeat everything in that very good SO thread, but the crux of it boils down to, in my opinion, a few simple things.

Early Binding and Guard Clauses

From an automation standpoint, Selection is an interesting object. In the Excel object model, Selection is a Shape that’s selected, the Chart you just clicked on, the Range of cells you navigate to. If the current selection is relevant to your code, consider making it an input with an explicit object type: the selection you’re expecting very likely has a very specific type, like a Range. Simple example:

Public Sub MyMacro()
    Selection.Value = 42 'multiple possible errors
End Sub

If Selection is pulled from that code and taken in as a Range parameter instead, we eliminate all ambiguities and restore the natural balance of the universe by coding against the Range interface rather than against Object – which means compile-time validation and IntelliSense:

Public Sub MyMacro()
    If Not TypeOf Selection Is Excel.Range Then Exit Sub '<~ that's a *guard clause*
    DoSomething Selection
End Sub

Private Sub DoSomething(ByVal target As Range)
    target.Value = 42
End Sub

Note the similarities between MyMacro in the first snippet, and DoSomething in the second one – it’s what they do differently that makes… all the difference. Now the procedure can work with any Range object, whether it’s actually selected or not.

Working with Selection is never really needed: what you can do against Selection you can do with any Range if what you mean to work with is a Range, or any Chart if what you mean to work with is a Chart.

It might look like it’s more code, more complicated to write – it might even be. But binding these types at compile-time makes things much simpler, really. When we make a member call against Object, the compiler doesn’t even care that the member exists. This involves overhead at run-time, and a non-zero chance of error 438 being raised when the member does not exist. Like Variant, Object is very flexible… too much for its own good.

A member call against Selection is inherently late-bound. Exactly like dynamic in C#, you want to break out of it, as soon as possible: if you’re expecting a Range, declare a Range and run with it, be explicit. In turn, you’ll be rewarded with VBA blowing up with a type mismatch error (13) early on, rather than with some object doesn’t support property or method error (438), possibly far removed from where the problem really stems from – the instruction that wrongly assumed a Selection could be treated like any old Range:

Public Sub MyMacro()
    Dim cell As Range
    Set cell = Selection '<~ type mismatch if Selection isn't a Range
End Sub

The macro recorder will never generate a single control flow statement. No loops, no conditionals, no variables, no structure. If you take something that’s tedious, and make a computer do it tediously for you, you may be “getting the job done”, but you’re 1) being very mean to your computer, and 2) you could easily be misusing the object model in ways that make it inefficient. The best way to tell a computer to do something 20 times isn’t to tell it 20 times to do one thing!

By simply introducing/declaring variables to hold the Worksheet and Range objects we’re working with, we eliminate the need to Select and Activate everything we touch, and Selection becomes useless, for the most part – if your macro is kicked off with a keyboard shortcut and works with whatever is selected at that time, more power to you – but that doesn’t mean your entire code needs to work with Selection, only that you need to validate what’s selected and switch to early binding as early as possible, using properly typed local variables.

Use Range.Select when you need to programmatically visually select cells in the worksheet, for the user to see that selection being made and later perhaps interacted with.