Introducing Rubberduck 2.5.2

Version 2.5.1 was released August 22, 2020. Since then, the installer was downloaded over 11,600 times; we are now 420 commits and 650 modified files later, and the time has come to deliver all that work into a convenient little installer package and move on to the next dev/release cycle.

What’s New?

If you’ve kept up with latest pre-release builds (especially in the last few weeks), nothing much. If you’ve been patiently waiting for the next release, you’re in for a treat!

The first thing you’ll probably notice is the shiny new splash screen design:

It’s the same old yellow splash made with Paint.NET, with a tiled reflection distortion effect against the background, a semi-transparent white bottom panel, and a finer font. Do you like it?

Fixed Bugs

50-some issues labelled “bug” were closed between 2020-08-22 and mid-April 2021, many of them thanks to flicking the switch on leveraging our internal ITypeLib API for user code – thanks to earlier invaluable contributions from the amazing Wayne Phillips (vbWatchdog, twinBASIC), Rubberduck is now able to tap into the actual in-memory COM type library compiled from the VBA code and, eventually, fill the remaining the gaps in Rubberduck’s understanding of the code: Rubberduck now understands enough to be able to tell that ThisWorkbook has a _Workbook subtype, and that Sheet1 has a _Worksheet subtype, …and that’s enough to identify the ThisWorkbook module at long last, and as a result Rubberduck’s ImplicitActiveSheetReference and ImplicitActiveWorkbookReference inspections get to work exactly as intended, and the door is now opened for so many interesting things…

New Inspections

A Rubberduck release wouldn’t be a Rubberduck release without at least a handful of new inspections. The IllegalAnnotation inspection is being replaced by InvalidAnnotation, UnrecognizedAnnotation, and together with the new AnnotationInIncompatibleComponentType inspection they allow Rubberduck to better convey exactly what’s wrong with a given “illegal” annotation comment.

Annotation in Incompatible Component Type

Some annotations cannot be used in certain types of modules. For example, attribute-related annotations cannot be used in document modules (because Rubberduck cannot import back the modified modules), and a @TestModule annotation is only meaningful in a standard module.

Note that the @Description, @ModuleDescription and @VariableDescription annotations do work in document modules now, because Rubberduck is now reading docstrings off annotations rather than hidden attributes.

Implicit Containing Workbook Reference

Code in the ThisWorkbook module (Excel) referring to members of the Workbook class, have an implicit Me qualifier. This makes an unqualified Worksheets(1) retrieval in ThisWorkbook refer to ThisWorkbook.Worksheets(1), but an identical statement in any other module would be (implicitly) referring to ActiveWorkbook. By qualifying such member calls with Me, the intent is clarified.

Implicit Containing Worksheet Reference

Code in a worksheet module (Excel) referring to members of the Worksheet class, have an implicit Me qualifier. This makes an unqualified Range member call in the Sheet2 module refer to Sheet2, but an identical statement in any other module would be (implicitly) referring to ActiveSheet. By qualifying such member calls with Me, the intent is clarified.

Invalid Annotation

Flags unbound annotations; that is, annotation comments that were correctly parsed as Rubberduck annotations but that could not be associated with a target element. This would happen when a module annotation is used in local scope, or a member annotation at module level. This inspection only flags annotation comments that parsed as Rubberduck annotations.

Misleading ByRef Parameter

The RHS/Value parameter of a Property Let procedure is always passed by value. As such, an explicit ByRef modifier on such a parameter definition is misleading. From MS-VBAL (VBA language specifications) section 5.3.1.7 Property Declarations:

§ If the <value-param> of a <property-LHS-declaration> does not have a <parameter-mechanism> element or has a <parameter-mechanism> consisting of the keyword ByRef, it has the same meaning as if it instead had a <parameter-mechanism> element consisting of the keyword ByVal.
§ The <value-param> of a <property-LHS-declaration> always has the runtime semantics of a ByVal parameter.

Unrecognized Annotation

This inspection flags comments that parsed like a Rubberduck annotation, but aren’t recognized or supported. It picks up typos in Rubberduck annotations, and annotation-like comments that aren’t Rubberduck annotations but parse as such. Splicing this specific scenario from other invalid annotations is particularly useful when you want to mute inspection results for non-Rubberduck annotations while still validating the supported ones.


New Quick Fixes

This release also introduces a handful of new quick-fixes:

AnnotateEntryPoint

This fix is now available for ProcedureNotUsed inspection results in standard and document modules; it simply annotates a member with the new @EntryPoint annotation which specifically instructs ProcedureNotUsed to ignore that member. Use this quick-fix for UDFs and macro procedures that are attached to document objects and don’t need an Excel hotkey. If your project is hosted in an Excel workbook, macros annotated with @ExcelHotkey are also considered as entry points now.

DeclareAsExplicitType

VariableTypeNotDeclared inspection results could always be “fixed” by making the declared type an explicit Variant; this new quick-fix makes Rubberduck infer the declared type from usage where possible, which is objectively awesome.

QualifyWithMe

This new quick-fix is available for the new implicit containing workbook/worksheet reference inspections, making the reference to the containing module explicit.

Introduce Get Accessor

The Write-Only Property inspection gets a new quick-fix with this release; this iteration does not try to infer the backing field, so further manual edits are needed, but it’s a start.


New UI Language: Italian

Thanks to a timely contribution by @PhilCattivocaratere, we are thrilled to announce that this release introduces Italian as a UI language:

Every single UI string in Rubberduck comes from a localized resource file. Translating all the resources for a new language can take 3-5 hours, and then it’s only a matter of keeping the translations up-to-date by creating a small pull request when new resource strings are added for new features.

In a nutshell

Here’s a quick summary of the most significant pull requests and commits merged this cycle:

  • Encapsulate Field enhancements
  • We are now leveraging our internal ITypeLib API
  • We are now building Rubberduck with the latest version of Visual Studio 2019
  • Precompiler directives now parse correctly with line continuations
  • Internal CodeBuilder API honors indenter settings when generating code
  • Fixed a number of issues with name conflict validation
  • Test methods now support a @TestIgnore annotation to ignore a test
  • Specific projects can now be ignored by the parser
  • Users no longer need to accept the GPLv3 as if it were an End User License Agreement (EULA)
  • Custom templates extensions is changing from .rdt to .template
  • Implicit Variant inspection quick-fix will now infer the best type from usage instead of just making the variable an explicit Variant
  • For...Next loop variables no longer trigger a variable not used inspection
  • Implicit Public Member inspection will now flag Enum types and Type structures
  • Branch “master” was renamed to “main”
  • New Property Group indenter settings
  • Arrays declared with ReDim now correctly resolve the declared type
  • @Description, @VariableDescription, and @ModuleDescription can now be used in document modules (cannot be synchronized)
    • Documentation strings are now read from annotations when missing from attributes
  • Start menu link to website now uses https
  • Fixed context menu positioning
  • New @EntryPoint annotation marks a standard or document module member as invoked from outside the code; as such the Procedure Not Used inspection will no longer flags members annotated with @EntryPoint or @ExcelHotkey (Excel only).
  • Several other opportunistic fixes left & right, improved overall stability.
  • Shiny new splash screen; debug builds now indicate “debug” instead of a meaningless local build number (build version# is controlled by the AppVeyor CI build server; local builds are all .0).
  • Expand/collapse all in Code Explorer
  • Rubberduck CommandBar label will now show the corresponding parameter declaration for a selected argument, and Find all References will now include arguments at call sites for parameter declarations (previous versions would only count named arguments).
  • Find Symbol navigation tool works again.
  • Find all References search results will now highlight the target reference in its context.
  • Added Italian UI resources.

Possible (Silent) Crash on Exit

I haven’t personally experienced it in a long time in Excel, but Rubberduck may run into issues tearing down, sometimes causing an AccessViolationException when it unloads, which can either crash the host process or leave it hanging as a ghosted process that will interfere with reloading: verify that the host process (e.g. ACCESS.EXE) has shut down completely using Task Manager when you close everything, and make sure to kill any such ghosted processes before loading Rubberduck in a new process.

Sounds familiar? If you’ve been following the project all along, you probably remember similar behavior in earlier releases – at one point during this development cycle we thought the problem was finally under control, but the cure was worse than the disease and there was a chance that the host document / project gets completely corrupted and impossible to open in the VBE: because we think it’s much better to sometimes crash on teardown than to corrupt our users’ host documents forever, we have reverted that “fix” and will have to come up with something else.


What’s Next?

Lots of good stuff, including a new peek definition command to the code pane, Code Explorer, and the VBE’s own Project Explorer‘s context menus – the feature was developed too late to make the cut for this release, but will be available in 2.5.2.x pre-release builds very soon:

Peek Definition commands pop a panel that shows you the syntax-highlighted source code for a type or member. The pop-up panel can then be dragged around to keep it in sight while editing.

In the Unit Testing department, a mocking framework is about to debut as an experimental feature with a number of technical limitations.

I’m going to be turning my attention towards code path analysis this cycle; this internal API is needed to implement the more advanced inspection ideas, and an Extract Method refactoring needs it too.

To be continued…

Globals and Ambient Context

Most of the time, we don’t need any global variables. State can usually be neatly encapsulated in an object, and a reference to this object can easily be passed as an argument to any procedure scope that needs it. But global scope is neither a necessary evil, nor necessarily evil. Like many things in programming, it’s a tool, and like many other tools, misusing it can cause pain.

The VBA code and host Excel workbook accompanying this article can be found on GitHub.


What is Global Scope?

When we declare a variable inside a procedure, we call it a “local variable” in reference to its scope being local to the procedure. “Module variables” are accessible within any procedure scope within the module they’re declared in. Public members of private modules (and Friend members of public modules) are only accessible within the project they live in, and Public members of public modules are global and can be accessed from other projects.

The different scopes of VBA: Global, project, module, and local.

Because in VBA class modules are private by default, and a public class is only PublicNotCreatable (as in, a referencing project cannot create a New instance of a class, factory methods must be provided), and also because “actually global” is in reality slightly more complicated than that (the VB_GlobalNamespace attribute is always going to be False for a VBA class), for the sake of simplicity when I talk about “global scope” and “globals” in this article, I’m treating global and project scopes as one and the same – but it’s important to know the difference, especially more so in scenarios where a VBA/Excel add-in/library is being referenced by other VBA projects, where a tidy public API is handy.

Keywords
Rubberduck recommends using the Dim keyword only in local scope, and to use the Private keyword to declare module-level variables. It also recommends using Public over Global, because nothing is really “global” in VBA and that makes the deprecated keyword potentially confusing. The Global keyword really means Public in VBA, and should be avoided.

Picture the VBA runtime executing some macro procedure and some variable needs to be incremented by 1. Scope determines whether that variable identifier is referring to a local, module, or global declaration. Accessibility is how we use code to restrict scope, using keywords like Private, Public, or Friend: if the variable identifier exists in a public module but is declared with the Private keyword, then it’s inaccessible and not in scope for the procedure we’re in.

So in search for the variable’s declaration we look for a local scope declaration by that name. If there isn’t any, we look for a module scope declaration for that name. Not there? We look at everything we can see in project scope. If we still haven’t found it then, we look for the declaration in the referenced libraries and projects, in priority order (so, the VBA standard library, then the host application’s own object model library, then everything else).

That’s scoping. Scopes and accessibility are closely related, but they’re different things. Think of accessibility as a tool to shape your private and public interfaces and APIs, keeping in mind that in VBA all module members are implicitly Public unless their declaration states otherwise.


Globals and Testability

Global variables are very useful: having a piece of data that is accessible from anywhere in the code does have its advantages. Used wisely, globals can very elegantly address cross-cutting concerns. Instead of having every method responsible for its own logging, or instead of passing a Logger instance to every method, each scope can access a single global Logger object (or invoke the same Log utility procedure), and there really isn’t any problem with that, …until you realize that your unit tests are all needlessly writing logs to some file under C:\Dev\VBA because the global logger is doing its job whether or not the code invoking it is being executed from a test runner… and this is making tests run code that isn’t related to these tests’ purpose: if there’s a bug in the logger code, it’s a test about the logger code that should be failing, not every single other test that couldn’t care less for the logging functionality.

From a testability standpoint, code with global dependencies can be difficult, if not impossible to test. In the case of a global Logger dependency, the logger’s interface would need to expose some kind of “kill switch” that tests can invoke to disable logging… but then modifying an otherwise perfectly good interface for the sake of making the object aware of whether it’s being invoked from a test or not, isn’t ideal at all (we’ll see why in a bit).

This Logger is a good example of a legitimate global service, but it’s “user code” that could always be pragmatically modified to accommodate testing. What about code that depends on global-scope services that aren’t “user code”?

Treating the Excel Object Model as a Dependency

Imagine needing to write tests for user-defined functions (UDF) that store a number of values in a global Dictionary and then schedule a macro that then runs (asynchronously!) and sends these values over to some web API that returns data that then ends up on the worksheet, underneath the calling UDF; the functions have dependencies on Application.Caller and Application.OnTime: we don’t own the Application global object, and we can’t modify its code to accommodate testing – what then?

Writing tests for a UDF is normally trivial: the function takes inputs, computes a result, and then returns it. Tests can supply various inputs and run the function through all kinds of cases and assert that it handles them correctly, by simply comparing its return value with what’s expected, and exceptional edge cases can have tests asserting that the expected error is thrown.

Writing tests for a side-effecting UDF that temporarily stores data in global scope is a lot more challenging, for many reasons. Remember, unit tests:

  • Should reliably produce the same outcome regardless of any external factors;
  • Should be fast, and not involve any I/O or network activity;
  • Should be able to be executed individually or in any given order without affecting outcome;
  • Should be able to be executed concurrently (at least in theory – VBA won’t run concurrent code).

With state shared between the tests, we have to be careful to correctly setup and clean-up that state before & after each test, so that each test gets a fresh canvas in a controlled environment… and then we can live with VBA unit tests that would likely break if executed concurrently, because VBA can’t run them concurrently anyway.


Testing Untestable Things

Back to this not-so-crazy UDF scenario with the Application.OnTime hack: it wouldn’t be acceptable for a test to literally wait for Excel to decide it’s time to invoke a macro, not any more than a test should be sending any actual HTTP requests (although that would be very a good way to actually be testing an API’s rate limits and get acquainted with throttling, I guess), let alone parse and process an actual HTTP response.

Such a user-defined function involves too many moving parts soldered together to be testable: making the code testable involves making the parts moving parts again, and yes it involves a proverbial blowtorch and lots of proverbial sparks flying everywhere.

Refactoring code to make it testable is a lot of fun, but the first step is, ideally, to fully grasp what’s going on and why.

If you aren’t familiar with using Application.OnTime in user-defined functions (only indirectly, because Application.OnTime calls, like run-time errors and many other members in the Excel object model, get “swallowed” when Excel is evaluating a UDF), it’s a pretty cool process that goes like this:

The calling cell contains the UDF’s return value just before the macro gets asynchronously invoked and produces its own output.

So if a UDF stored its arguments as key/value pairs in a global-scope dictionary, if all goes well and according to plan, the macro that runs a moment later gets to consume this data.

By storing the Application.Caller cell object reference in global scope, the side-effecting macro gets to know where to put its results table. There’s always the possibility that a second UDF overwrites this global state during the split-second between the moment a first UDF writes it and the moment the scheduled asynchronous read of this global state actually happens: it’s important to keep in mind that Ambient Context does not inherently address this particular problem; the state is still global and mutable from anywhere in the code, and there is never any guarantee that any scope will run to completion before the VBA runtime decides it’s an asynchronous callback’s turn to run.

The Application.Caller member isn’t going to return a Range reference when it’s not a worksheet cell invoking the function, we can’t afford to wait for Application.OnTime, and we’d like to avoid actually invoking any Win32 API functions during a test. That UDF simply isn’t testable as-is.

The solution is to introduce an abstraction to wrap the Application members we need, and make the side-effecting UDFs depend on that abstraction instead of invoking Application members directly.

AbstractionThe untestable code might look something like this:

Public Function SideEffectingUDF(ByVal FirstParameter As String, ByVal SecondParameter As Long) As Variant
    Set SomeGlobalRange = Application.Caller.Offset(RowOffset:=1)
    With SomeGlobalDictionary
        .Clear
        .Add "FirstParameter", FirstParameter
        .Add "SecondParameter", SecondParameter
    End With
    ScheduleMacro
End Function

Where ScheduleMacro involves a Win32 API call to schedule the execution of an Execute procedure that handles the Application.OnTime scheduling of the actual side-effecting procedure.

We want to be able to write a test that invokes this SideEffectingUDF function, and determines whether Application.Caller was invoked: Application.Caller is a dependency here, and for the test to be able to fulfill its purpose we must find a way to inject the dependencies so they can be controlled by the test, from outside the function.

Note how narrow such a test would be: it asserts that the UDF gets the Application.Caller reference, nothing more. Other tests would be similarly narrow, but for other things, and we don’t want a failing Application.Caller member call to interfere with these other tests by throwing error 91 before the test gets to do its thing. Whether or not we need to know if a UDF does or does not invoke Application.Caller, we still need a way to abstract the dependency away, to stub it.

You may be thinking “oh that’s easy” and be tempted go down this path:

Public Function SideEffectingUDF(ByVal FirstParameter As String, ByVal SecondParameter As Long) As Variant
    If TypeOf Application.Caller Is Excel.Range Then
        ' caller is a worksheet cell
        Set ThatGlobalCell = Application.Caller.Offset(RowOffset:=1)
        With ThatGlobalDictionary
            .Clear
            .Add "FirstParameter", FirstParameter
            .Add "SecondParameter", SecondParameter
        End With
        ScheduleMacro "SideEffectingMacro"
    Else
        ' caller is a unit test
        Set ThatGlobalCell = Sheet1.Cells(1, 1) ' tests can read as "Application.Caller was invoked"
        With ThatGlobalDictionary
            .Clear
            .Add "FirstParameter", FirstParameter
            .Add "SecondParameter", SecondParameter
        End With
        SideEffectingUDF = True ' tests can read this as "macro was scheduled"
    End If
End Function

While it does solve the problem of avoiding to involve Application.Caller and actually scheduling the macro in tests, there are several reasons why this is a terrible idea:

  • Function now has a higher Cyclomatic Complexity metric by virtue of now needing more execution paths to accomplish the same thing: the code is objectively and measurably more complex now, on top of being repetitive (copying & pasting any code is usually a sign something is off!).
  • Tests are no longer executing the same code as normal execution does, which means tests are now testing code that only exists because there are tests: the normal execution path remains untested, and that makes the tests worthless busy-work.
  • Tests now need to be making assumptions about how the function is implemented, which effectively casts the code into concrete instead of making it simpler & safer to modify.
  • Dependencies should be abstractions, and code should be working with these abstractions without regards to their actual implementation: code that acts differently when the runtime type of an abstraction is X vs when it’s Y, violates the Liskov Substitution Principle, the “L” of “SOLID” that essentially states that all implementations of a given abstraction should be treated the same.

The killer is the second bullet: if the sole purpose of a test is to determine whether Application.Caller was invoked, and the UDF says “oh we’re in a test, here yeah everything is all right, see”, then a UDF that does nothing but returning True would pass that test, and that is why the test is useless, as is the code duplication.

When we write a test whose purpose is to determine whether the Application.Caller dependency was invoked, the test should FAIL when it isn’t, otherwise that test is just as good as a deleted one.

Now picture the UDF looking like this instead:

Public Function SideEffectingUDF(ByVal FirstParameter As String, ByVal SecondParameter As Long) As Variant
    With AppContext.Current
        Set .Target = .Caller.Offset(RowOffset:=1)
        .Property("FirstParameter") = FirstParameter
        .Property("SecondParameter") = SecondParameter
        .ScheduleMacro
    End With
End Function

The UDF now only has one dependency, AppContext.Current, which is global state by virtue of being accessible from the default instance of the AppContext class; we’re tightly coupled with the AppContext class, but only because we specifically want to access global state in a controlled manner, and the rest of the function is working against the IAppContext abstraction. The state that was formerly a Range and a Dictionary globally-scoped declaration is now properly encapsulated in an object, and the “current” AppContext is coming into existence from outside the UDF scope (but still from within our own code), which is exactly what we want: now unit tests get to inject a TestContext instead of manipulating global state.

So how do we get there?


Implementation

The basic idea is to pull our dependencies from global scope, encapsulate them in a class module, …and then making an instance of that class an “ambient context” that’s still globally accessible, but that introduces the necessary abstraction needed to make that UDF fully testable.

We want to leverage the default instance of the AppContext class, so we’re going to need an AppContext class with a @PredeclaredId annotation and a Current property getter that returns some IAppContext instance. If you’re familiar with factory methods this will feel a bit like something you’ve already seen:

'@PredeclaredId
Option Explicit
Implements IAppContext
Private Type TState
    Factory As IAppContextFactory
    Current As IAppContext
    '...    
End Type
Private This As TState
'@Description "Gets the current (or default) context."
Public Property Get Current() As IAppContext
    Errors.GuardNonDefaultInstance Me, AppContext, TypeName(Me)
    
    If This.Current Is Nothing Then
        Set This.Current = This.Factory.Create
        Errors.GuardNullReference This.Factory, TypeName(Me), "IAppContextFactory.Create returned Nothing."
    End If
    
    Set Current = This.Current
End Property
Private Property Get IsDefaultInstance() As Boolean
    IsDefaultInstance = Me Is AppContext
End Property
Private Sub Class_Initialize()
    If IsDefaultInstance Then
        'must initialize context with sensible defaults:
        Set This.Factory = New AppContextFactory
        Set This.TimerProvider = New TimerProvider
    Else
        Set This.Properties = New Scripting.Dictionary
        'we want all instances to have the same provider instance:
        Set This.TimerProvider = AppContext.TimerProvider
    End If
End Sub

We don’t normally want Property Get procedures to be side-effecting, but with an Ambient Context what we want is to yield a cached instance of the context class, so when no instance already exists, the getter caches the created object so it’s readily available next time, making it accessible from anywhere in the project (aka “global”).

Abstract Factory

The default instance of the AppContext class does not know what the actual runtime type of the Current context is, and this polymorphism is the cornerstone making it all work: the Current property getter is responsible for caching the new context instance, but not for actually creating it. That’s the job of an abstract factory (the IAppContextFactory dependency) that we conveniently initialize to a concrete factory type that creates instances of… the AppContext class.

Why involve an abstract factory to create an instance of the class we’re in, you might ask? Because that’s only the default implementation, and with ability to Set the Factory reference from outside the class, tests can inject a different factory implementation, say, this one named TestContextFactory:

'@Folder "Tests.Stubs"
'@ModuleDescription "A factory that creates TestContext instances."
Option Explicit
Implements IAppContextFactory
Private Function IAppContextFactory_Create() As IAppContext
    Set IAppContextFactory_Create = New TestContext
End Function

Meanwhile the actual UDFs would be using this AppContextFactory implementation by default:

'@Folder "AmbientContext"
'@ModuleDescription "A factory that creates AppContext instances."
Option Explicit
Implements IAppContextFactory
Private Function IAppContextFactory_Create() As IAppContext
    Set IAppContextFactory_Create = New AppContext
End Function

The AppContext.Current property will happily cache an instance of any class whatsoever, as long as it implements the IAppContext interface. The abstract factory pattern allows us to spawn an instance of a class at run-time, of which we don’t necessarily know the actual “concrete” type at compile-time.

In other words just by reading the UDF code, there is no way to tell whether AppContext.Current is going to be an AppContext or a TestContext instance, and that is exactly what we want.

What this abstraction achieves, is the decoupling that is necessary for a test to be able to inject a TestContextFactory and take control of everything UDFs can do with an IAppContext object.

Context State

We know the context needs to wrap Application.Caller and Application.OnTime functionality. We know we need a Target cell, we need some Properties in an encapsulated Scripting.Dictionary. If we crammed all that into a single interface, we would get a somewhat crowded IAppContext interface that doesn’t quite adhere to the Interface Segregation Principle and Open/Closed Principle guidelines.

By abstracting away the macro-scheduling functionality into its own IAppTimer interface, and making that interface an abstract dependency of the context class, we can stub that abstract dependency and write tests for the logic of the context class itself. Without this extra step, the context can be stubbed to test the code that uses it, but the macro-scheduling bits would remain untestable.

Treating IAppTimer as a dependency of the context makes the IAppContext interface look like this:

'@Folder "AmbientContext.Abstract"
'@ModuleDescription "Encapsulates the data and macro invocation mechanism for a side-effecting UDF."
'@Interface
Option Explicit
'@Description "Gets the cell that invoked the currently-running user-defined function (UDF), if applicable; Nothing otherwise."
Public Property Get Caller() As Range
End Property
'@Description "Gets or sets the target reference cell that the side-effecting macro shall use."
Public Property Get Target() As Range
End Property
Public Property Set Target(ByVal Value As Range)
End Property
'@Description "Gets or sets a named value representing data passed between the UDF and the side-effecting macro."
Public Property Get Property(ByVal Name As String) As Variant
End Property
Public Property Let Property(ByVal Name As String, ByVal Value As Variant)
End Property
'@Description "Gets an array of all property names."
Public Property Get Properties() As Variant
End Property
'@Description "Gets or sets the IAppTimer dependency."
Public Property Get Timer() As IAppTimer
End Property
Public Property Set Timer(ByVal Value As IAppTimer)
End Property
'@Description "Clears all held state."
Public Sub Clear()
End Sub

Note that we’re not exposing the dictionary itself: rather we expose an indexed property to get/set the key/values, then by exposing the dictionary keys, the calling code gets to do everything it needs to do, without ever directly interacting with a Scripting.Dictionary, a bit as if the AppContext class were a custom collection.

Now, there’s something special about the IAppTimer dependency: we absolutely cannot have each context instance spawn timers willy-nilly, because a leaking Win32 timer is a nice way to send Excel up in flames. Yet, we need each context instance to be able to access the same IAppTimer reference.

A good way to solve this is by introducing a Provider mechanism. The interface looks like this:

'@ModuleDescription "A service that ensures all clients get the same IAppTimer instance."
'@Interface
Option Explicit
'@Description "Gets an IAppTimer instance."
Public Property Get Timer() As IAppTimer
End Property

What I’m calling a “provider” here is exactly the same mechanism that provides the IAppContext instance (a Property Get procedure that gets a cached object or creates the object and caches it), except no abstract factory needs to get involved here. The class also makes a very convenient place to put the name of the Win32 callback macro procedure:

Option Explicit
Implements ITimerProvider
Private Const MacroName As String = "Execute"
Private Property Get ITimerProvider_Timer() As IAppTimer
    Static Instance As AppTimer
    If Instance Is Nothing Then
        Set Instance = New AppTimer
        Instance.MacroName = MacroName
    End If
    Set ITimerProvider_Timer = Instance
End Property

TimerProvider the only object that creates a New AppTimer: as a result, every AppContext instance created from this factory is going to use the same IAppTimer reference, and if we need to write tests for AppContext we can inject a TestTimerProvider that returns a TestTimer.

Note that the “provider” mechanism is an implementation detail of AppContext: the TestContext doesn’t need this, because it just initializes itself with a TestTimer, while AppContext initializes itself with a TimerProvider that gets the IAppTimer instance. Being an implementation detail, there’s no ITimerProvider dependency on the abstract interface.


The Tests

The previously-untestable user-defined functions now look like this:

Public Function TestUDF(ByVal SomeParameter As Double) As Boolean
    On Error GoTo CleanFail
    
    With AppContext.Current
        
        Set .Target = .Caller.Offset(RowOffset:=1)
        .Property("Test1") = 42
        .Property("Test2") = 4.25 * SomeParameter
        .Timer.ExecuteMacroAsync
        
    End With
    
    TestUDF = True
CleanExit:
    Exit Function
CleanFail:
    TestUDF = False
    Resume CleanExit
    Resume
End Function

The code isn’t very far off from the original, but now we can write a test that passes when a UDF invokes the Caller member; when the UDF is invoked from a worksheet cell, IAppContext.Caller returns the Range reference returned by Application.Caller; when the exact same code is invoked from a test, IAppContext.Caller returns a bogus/test cell reference.

Similarly, when a UDF invokes IAppTimer.ExecuteMacroAsync, a Win32 API call schedules the execution of a callback macro that itself invokes Application.OnTime to schedule the execution of a side-effecting macro that can consume the state and alter the target range and worksheet; when the exact same code is invoked from a test, IAppTimer.ExecuteMacroAsync simply notes that it was invoked, …and does nothing else.

This test passes when IAppTimer.ExecuteMacroAsync is invoked from a UDF, and would fail if the UDF didn’t invoke it:

'@TestMethod("Infrastructure")
Private Sub TestUDF_SchedulesMacro()
    'inject the test factory:
    Set AppContext.Factory = New TestContextFactory
    
    'get the test context:
    Dim Context As TestContext
    Set Context = AppContext.Current
    
    'test factory already stubbed the timer:
    Dim StubTimer As TestTimer
    Set StubTimer = AppContext.Current.Timer
    
    'run the UDF:
    Dim Result As Boolean
    Result = Functions.TestUDF(0)
    
    'Assert that the UDF has invoked IAppContext.ScheduleMacro once:
    Const Expected As Long = 1
    Assert.AreEqual Expected, StubTimer.ExecuteMacroAsyncInvokes, "IAppTimer.ExecuteMacroAsync was invoked " & StubTimer.ExecuteMacroAsyncInvokes & " times; expected " & Expected
End Sub

Cohesion

Ambient Context is a fantastic tool to address cross-cutting concerns and leverage global scope in a way that does not hinder testing. It’s also useful for storing state and dependencies that would otherwise be held in global scope, when passing that state and dependencies as normal parameters isn’t possible.

That makes it a somewhat dangerous pattern: one must keep in mind that the state is still global, and globals that don’t need to be global, should not be global. By defining an explicit interface for the context (like IAppContext), we not only end up with neat abstractions: we also make it harder for the context interface to grow new members and for the class to become an over-engineered Globals.bas module.

Interfaces shouldn’t be designed to change. In .NET the IDisposable interface only mandates a parameterless Dispose method; IEquatable is all about an Equals method. A factory interface shouldn’t need more than a carefully parameterized Create method that only takes arguments that can’t be dependencies of the factory instance: we want to avoid modifying existing interfaces as much as possible, and since none of us can really predict the future… the best way to do that is to keep interfaces as slim as possible. Cohesion is what we’re after: a module that is cohesive will feel like everything is exactly where it should be.

If the members of a module don’t feel like they’re a cohesive and complete group of closely related methods, there’s a greater chance that more members need to be added in the future – and you will want to avoid that. Of course the “and complete” part can mean a few growing pains, but in general naming things is a great way to avoid the pitfalls of treating the context as some “state bag” where we just lazily stuff state without thinking it through. In that sense AppContext is probably one of the worst possible names for this: perhaps a FunctionContext that only exposes the Caller member would be a cleaner approach?

In the real world, ambient context is for things like System.Threading.Thread.CurrentThread in .NET: it’s very specialized, with a very specific purpose, and we don’t see it very often. Authorization mechanisms might use it too.

In VBA-land, I’ve never once needed to implement it until I came upon this side-effecting UDF scenario needing unit tests; macros are definitely much simpler to refactor for testability!

Synchronizing your VBA project with files in a folder

Sync Project commands in the Code Explorer context menu.

VBA code being embedded in a host document might be very practical for certain aspects of both development and deployment, but let’s face it, it also makes using source control (e.g. git, SVN, mercurial, etc.) with VBA projects rather frustrating. As a developer, committing source code to a repository is usually a very simple task, because the code files live in the file system, and git can track changes and additions. With VBA, we commit the code that’s exported on the file system, the host document may contain different code, and merging remote changes implies exporting your code again, working out any merge conflicts with the exported code, then re-importing the merged changes into the host document.

Which wouldn’t be so bad… if the VBE had a nice way of exporting more than one single file at a time, and if importing files had an option to replace modules when they already exist… instead of importing the module with a “1” suffix as if that were something anyone ever needed to do!

Did You Know?
The VBE’s “Import File…” command doesn’t make it very obvious, but it does support importing multiple files at once. Simply select multiple files when prompted for what file to import!
Another little known feature of the VBE (one of the few Rubberduck hasn’t enhanced yet) is that its Project Explorer toolwindow is a drag-and-drop destination that can accept files you dragged from the Windows Explorer (⊞+E).

Rubberduck’s Export Project… command prompts for a folder, and then proceeds to export all modules there – overwriting any existing files in that folder. By default, the hotkey for that command is Ctrl+Shift+E, but it can be reconfigured to any key combination you like.

The context menu of Rubberduck’s Code Explorer toolwindow has a Sync Project sub-menu that offers two commands:

  • Update Components from Files…
  • Replace Contents from Files…

Rubberduck in general needs more documentation, but exactly what these “Sync Project” commands do is something that goes well beyond just using Rubberduck and they really deserve all the attention they can get, since they exist to facilitate an actual development workflow that looks something like this:

Import source files; make code changes; export source files; commit, push, pull, merge; rinse & repeat!

Update Components

This command prompts for source code files to import into your project.

  • If the project already contains a module with the same name as one of the imported files, the module is considered the same, and replaced with the imported version.
  • If the project does not already contain the imported modules, they’re simply added to the project.
  • If the project contains modules with different names than the imported files, these modules remain in the project.

Replace Contents

This command also prompts for source code files to import into your project, but the selected files will replace everything in the current project. Because this command is potentially destructive, a confirmation is required.

  • The entire project becomes the selected files.
  • If the project contained (non-document) modules before, they are removed before the import is performed.

We have a number of open issues (here, here, and here) about getting the “export project” command to take the @Folder annotations into account, and transpose the virtual folder hierarchy into an actual folder hierarchy on the file system, which would play nicely with version control and would help better organize a VBA repository.

Rubberduck 2.5.1

Here we are again, some 580+ commits and 1000+ modified files later, with 10 contributors involved (with particular thanks to @MDoerner and @BZngr, and honorable mentions to @IvenBach and @testingoutgith1) in over 60 pull requests since the last release: time to look back at what was done and call it version 2.5.1! If you’ve been keeping up with pre-release builds, none of this is going to be news to you, but with over 9.1K downloads of v2.5.0 a lot of you seem to prefer to upgrade less often but more significantly, so here’s a timely recap.

But first, let’s get the known problems out of the way.

Known Issues

Making a VBIDE add-in means we can’t know or assume what our host application is going to be, and different hosts sometimes wire things up differently – and this can spell trouble under certain circumstances. Making a VBIDE add-in in .NET has even further implications: while it’s how we can extend a 64-bit VBE, it’s also causing various type cast errors/exceptions when other add-ins are loaded.

Possible Crash

The Visual Basic Editor has a peculiar way of loading its add-ins: Rubberduck’s (and any other VBIDE add-in’s) entry point is invoked by the VBE before the VBE has completely finished constructing itself – accessing the object model too early can throw COM exceptions that take down Rubberduck as it initializes.

Normally Rubberduck initializes itself, then proceeds to parse the project (if it’s an empty project then the bulk of that is Rubberduck loading everything defined in VBA7.DLL and the type library for the host application’s object model) – normally if the VBE isn’t ready for this yet, we bail out and don’t access any objects and the “Refresh” button says “Pending” instead of “Ready”, and by the time you manually run that command the VBE has finished initializing and the only annoyance is that the initial parse isn’t automatic.

But in certain host applications (Microsoft Access being a known one, but I’ve seen it happen in Excel as well, although not with a recent build), sometimes the VBE actually isn’t ready to take member calls against its own object model, and the result is a COM exception that is either caught and then Rubberduck says it can’t initialize, or thrown several layers deeper, uncaught, and then everything goes up in flames.

Loading Rubberduck manually from the VBE’s Add-Ins Manager is sure annoying, but is really the only 100% sure-shot way to load any VBIDE add-in with a properly initialized VBE, regardless of the host application. Note that the installer registers Rubberduck as a VBE add-in with the LoadBehavior flag set to load at startup. If Rubberduck blows up at startup or fails to initialize, consider editing this configuration to make it load manually (exception details should normally be logged for the first start-up).

Heavy on Memory (RAM)

Rubberduck has always used a lot of memory to keep all the code metadata handy and cache a lot of things to improve processing performance. Working on a large legacy project that generates lots of inspection results can grind the main thread of the host process to a halt as the toolwindow renders the many objects (whether the toolwindow is displayed or not).

Unless you are discovering Rubberduck with a new, empty VBA project, consider first reviewing the settings – can’t hurt to review them either way:

  • Disable “run inspections automatically on successful parse”, so that they only run if you explicitly refresh them from the Inspection Results toolwindow;
  • Set inspection severity to “Do not Show” for inspections that could produce thousands upon thousands of results, like “use meaningful names” if you’re into Hungarian Notation for example, or “use of bang operator” if that’s the only way you’re ever accessing recordset fields in Access;

Other general performance tips:

  • Rubberduck parses per-module, so when you leave a module after modifying it, trigger a parse – by the time you’re in the other module and have scrolled to where you want to be and are in that mindset, the modified module will have processed.
  • Reduce coupling: the more modules are inter-dependent, the more modifying a module requires re-resolving identifier references in the dependent modules.
  • Avoid complex grammar: bang operators, among other code constructs, are somewhat ambiguously defined and ultimately parse in two passes, with the first one failing. The standard member call syntax parses faster, in a single parser pass.

Undesirable Interactions

If you are using the free but rather old 32-bit MZ-Tools 3.x productivity add-in, this section shouldn’t be a concern. However MZ-Tools 8.x was rewritten from the ground up, ported from VB6 into .NET-land, and while its author Carlos Quintero took extraordinary steps to isolate MZ-Tools from other in-process .NET add-ins and has issued recommendations for Rubberduck to do the same, …there is still a chance the two add-ins bump into each other; if MZ wins, RD is essentially bricked.

MZ-Tools normally runs inside its own .NET AppDomain, except when hosted in AutoDesk products (Inventor, AutoCAD), which implement VBE initialization in a way that breaks MZ-Tools’ startup mechanics – up until recently it was assumed this collision only happens in AutoDesk hosts, but a recent support ticket involving Microsoft Access was filed and implicates interactions with MZ-Tools.

This issue manifests itself with InvalidCastException being thrown at various points, often during initialization, or later during parse: the exception message involves attempting to cast COM objects like Microsoft.Vbe.interop._VBProject into types such as VBClassicExtensibility.VBProjectClass, where VBClassicExtensibility is defined by MZ-Tools, not Rubberduck.

One thing that can be attempted to mitigate this problem, would be to set MZ-Tools to not load on start-up, and manually load it after Rubberduck has initialized… but sadly this cross-add-in confused COM marshaling is simply not supposed to happen given MZ-Tools’ AppDomain mechanics, and we don’t really have any solutions for this – same as we don’t really have any solution for cases where COM registrations are broken (e.g. when multiple Microsoft Office product versions are running side-by-side but were not installed in chronological order – that’s an officially unsupported scenario, per Microsoft).

As a result, using Rubberduck together with other .NET-based add-ins cannot be considered a completely fail-safe scenario, and we have to treat this as a “known issue” here, and the work-around sucks and boils down to “drop other add-ins, or drop Rubberduck”. This is actually probably true at various degrees of all .NET-based VBIDE add-ins.

On the bright side, we have taken several steps in this release cycle to prepare the ground not only to get Rubberduck to build correctly in the latest & greatest Visual Studio 2019, but also to get most of our build process ready for .NET Core – so when .NET Core 5 is released in a few weeks, we can try to get Rubberduck to run on the shiny new Core framework, which theoretically makes AppDomain completely moot, and so we have very little incentive to work on getting Rubberduck to load its own AppDomain the way MZ-Tools does: if we can make Rubberduck build and run on .NET Core 5, then this problem should simply disappear… in theory.


Enhancements & New Features

This release does not introduce any new top-level Rubberduck features, but makes a number of very useful user-facing additions nonetheless, on top of the many under-the-hood enhancements made this cycle.

Surfacing Annotations

One of the most useful and powerful features of Rubberduck, annotations are special comments that use a particular but relatively simple syntax – these are all grammatically valid:

'@AnnotationName("text")
'@AnnotationName "text"
'@AnnotationName("text", 123) : there can be comments here
'@AnnotationName "text", 123
'@AnnotationName Identifier1, Identifier2, ...IdentifierN

While the syntax itself is reasonably simple to use, the problem was that unless you knew every supported annotation, well then the @AnnotationName part kind of had to be a guess.

Rubberduck uses these annotations for various purposes, from identifying Rubberduck test modules to keeping hidden module/member attributes in sync with these comments (this includes the ability to document and literally map Excel hotkeys using VBA comments). You can read everything we’ve documented about them on the project’s website.

In Rubberduck 2.5.1.x builds, we finally get new commands in the code pane and Code Explorer context menus, that bring up a dialog that gives us all the options to easily and safely annotate everything that can be annotated, using the correct syntax and arguments every time:

a window describing the member to annotate and the annotation to add
Select the annotation to add, and supply the argument values. Magic!
Adding an @Ignore annotation to ignore a specific inspection, can now be done without needing to know the exact name we decided to call that inspection class in Rubberduck’s source code!

Encapsulate Field Enhancements

This particular refactoring has seen a terrific enhancement that makes it very easy to cleanly and quickly turn a set of public fields into Property Get/Let members, with a Private Type TClassName and a module-scope Private this As TClassName instance variable – and all properties automatically reading/writing from it. You can see this feature in action in the previous article.

The new Wrap fields in Private Type functionality leverages the very useful Private Type pattern.

Unit Testing

The Test Explorer now makes it easier to ignore one or more specific selected tests, or all tests under a given category/group, by exposing the context menu commands that add or remove the @TestIgnore annotation as appropriate; having this command in the Test Explorer makes it possible to annotate a test method while a completely different and unrelated module is maximized in the code editor.

Hmm, …these icons are out of control, aren’t they… expect that to change soon-ish…

Running tests while results are regrouped by outcome is still a known issue (tests run painfully slow because the UI thread is busy re-sorting and re-rendering the list every time a test finishes running), but everything works much more smoothly when the tests are regrouped any other way.

Code Inspections

Ok the logic for that is currently broken on the website (working on that… somewhat) so this is much harder than it will be in the future when I’ll just look at the [New] tab on the inspections page of the website and every inspection that is in [next] but not in [main] will be listed right there. From skimming through every pull request merged since the last release:

  • Function return value not used inspection now more clearly targets call sites, and now ignores non-user code.
  • Function return value is always discarded inspection is the old “return value not used” logic targeting the function itself, when none of the call sites capture the function’s return value.
  • Implicitly typed const inspection was added to flag Const declarations without an As clause to specify an explicit type.
  • Assignment not used inspection now correctly handles an assignment that is overridden in the next statement but first read in the RHS expression of the assignment.

Not user-facing but critically important nonetheless, is all the behind-the-scenes work done to simplify inspecting VBA code as much as possible. This cycle saw a tremendous amount of technical debt paid in the code inspections department, that pave the way for future enhancements like, say, having the ability to run inspections per-module; as the number of implemented inspections continues to grow, the ability to scope inspections in a more granular way is going to be very useful for our plan to eventually report inspection results in a custom code pane, with colored squiggly lines (that’s v3.x stuff, though).

Applying Quick-Fixes

The Code Inspections toolwindow has been updated with a context menu that makes it much simpler to apply a quick-fix to one or more inspection results; all available quick-fixes appear in the context menu, each with various options to apply to the selection. This menu is also shown by clicking the “Fix” drop-down menu from the toolwindow’s toolbar.

The weird “fix all occurrences in procedure/module/project” link buttons in the bottom panel are now gone, completely replaced with a more flexible menu system.

Code Explorer Enhancements

The Code Explorer toolwindow context menu now includes a move to folder command to easily organize your project components, and there’s a new setting to enable drag and drop in Code Explorer (disabled by default):

A new setting enables dragging & dropping capabilities in the Code Explorer toolwindow.

With that setting enabled, you can now move a code file to an existing folder, simply by dragging it from its location and dropping it onto a folder node! The setting was made to require being explicitly enabled, to avoid discovering that feature by accidentally dragging a code file somewhere (that did happen in beta/testing).

Website Integration

You may have noticed rubberduckbva.com is “under reconstruction”. Before that, the site’s content was mostly static, with only the inspections list assembled from content parsed from the Rubberduck.CodeAnalysis.xml, downloaded periodically off GitHub through unauthenticated REST API requests. That worked relatively well until a spike in traffic occurred following the release of Excel Insights, a book collectively authored by 24 Microsoft MVP Award recipients, including myself: suddenly a bug in the caching mechanism became very apparent when the site’s home page started getting served as a wonderful HTTP 500 error page.

Since the website hosting came with a SQL Server database that I wasn’t using, I decided to start using it and make the site pull the content from there rather than directly off the GitHub API. I wrote a small console application, got myself a private API key to make properly authenticated REST API requests, and now there’s a scheduled task running on a virtual machine in my garage, that runs this application every 30 minutes to update the installer asset download counts and verify whether the XML documentation assets are up-to-date for the latest pre-release build, and then proceeds to parse the XML docs and generate/update the database records: the website simply pulls the data from the database at every request, and now the website couldn’t bust GitHub’s REST API limits even if it tried.

Documenting Rubberduck is challenging: there are a lot of features, and there isn’t really any user guide that’s constantly being kept up-to-date. The wiki on the repository is terribly outdated in several parts, and the feature announcements on this blog are nice when you’re following the project along its journey, but in a dream world using Rubberduck would be content found on the website, and contributing to Rubberduck would be content found in the repository’s wiki.

Parsing the xml-docs into website content is a step in that direction. Nobody wants to maintain documentation, but xml-doc comments are part of the source code, and we even put source code analyzers in place that will break the build if we try to introduce an inspection, quick-fix, or annotation, without properly documenting it with xml-docs.

Every single inspection, quick-fix, and annotation has thorough documentation, including code examples that may span multiple modules. But best of all, every single page generated from source code includes an “edit this page” link that points to a GitHub page where you can literally edit the xml documentation for the inspection you were looking at (and review its source code if you like – it’s the same file!) – and just like that, all you need to contribute to Rubberduck (yes, single-character typo fixes and additional useful code examples are welcome!) is a GitHub login!

Every page generated from xml documentation includes an “Edit this page” link at the bottom.

Because of how the request routing on the website was setup, it was easy to make Rubberduck link in-app inspection results to this website content – you can now click a URL at the bottom of the inspection results toolwindow (this will likely change one way or another in the future) to bring up the details page with the xml documentation and code examples:

The URL at the bottom of the adjustable panel (it’s possible you need to scroll the content or make the bottom panel tall enough to show it) opens that URL in a new tab in whatever your default browser is.

The in-app content exists as localized resources, lovingly translated by our international contributors; the website content however, is only available in English, because we’re absolutely not going to start translating XML comments in the source code. But the processed content actually resides in a database, so it wouldn’t be impossible to eventually localize it at that level, as well – we’re just not there yet at all at the moment.

The website content is often different than the in-app content, and over time it should be expected to grow more and more in-depth, thorough and descriptive.

The revamped rubberduckvba.com domain will ultimately span 3 sites, including api.rubberduckvba.com, which will eventually expose REST endpoints for various purposes, including Rubberduck’s “check for newer version at startup” feature; for example something like api.rubberduckvba.com/indenter.json that might accept some VBA code in a JSON object in the request’s body, and respond with a JSON object in the response body containing the indented VBA code. Or api.rubberduckvba.com/inspect.json that might also accept some VBA code (presumably along with some metadata about the module type) in a JSON object in the request’s body, but could respond with a JSON object representing all inspection results for it. It’s still all just brewing ideas at this stage. The other subdomain, admin.rubberduckvba.com, is going to host a web-based, GitHub-authenticated version of the VBA program I’m going to present in my next article: a tool for managing and editing most of the website’s content.

Moving Forward

Rubberduck is becoming a pretty mature code base and has an ever-increasingly better abstracted internal framework/API to understand and manipulate VBA code. The project now builds with the latest version of Microsoft Visual Studio 2019, and we’re hoping COM Automation support in .NET Core 5 will allow us to build an increasing number of the project’s components with it; I’m thinking the “main” type library is better off under the old tech, but I’ll be more than happy to be proven wrong here!

A rough roadmap for v2.5.1.x might include…

  • More resolver capabilities unlocked by fully leveraging our internal ITypeLib API
  • Syntax-highlighted preview of the changes for all refactorings (and quick-fixes?)
  • Some Code Path Analysis API, to help implement the more complex inspection ideas
  • The Moq wrapper mocking framework
  • Block Completion, maybe
  • Anything else anyone feels like contributing to the project!

The goal for the rest of the 2.x cycle is to prepare everything that needs to happen in order to implement our own custom code editor window – giving us full, complete control over every single token and everything that can possibly happen in that custom code pane. We’re talking code folding, custom theming, that kind of thing.


‘Main’ vs ‘Master’ – Why it Matters

You may have noticed (or not) that the website is now labeling “main” the branch formerly known as “master”. As a French native, “master/slave” terminology in any non-actual master/slave context has always sounded a bit weird to me, but I’m a white man in North America (although not in the US) and I get the luxury to read these words and decide that they don’t affect me, and reflecting on the events of this summer has taught me that this is part of what white privilege is.

I don’t do political & editorial commenting, I prefer to leave that space to others – but I warmly recommend watching 13th on Netflix and, if you can handle it, When They See Us. Black lives matter, it’s simple – and no, it doesn’t say “but white lives don’t” anywhere between the lines.

So yes, we’re going to be taking steps to alter the language in Rubberduck a bit in this cycle. The “master” branch will be renamed to “main”, yes, but we’ll also come up with a better term for “white-listing” identifier names. It won’t stop racism, no, indeed. But it won’t hurt anyone, either.

Peace!

Builder Walkthrough

Note: this article was updated 2021-04-13 with screenshots from the latest v2.5.1.x pre-release build; the extract interface enhancements shown will green-release with v2.5.2.

We’ve seen how to leverage the default instance of a class module to define a stateless interface that’s perfect for a factory method. At the right abstraction level, most objects will not require more than just a few parameters. Often, parameters are related and can be abstracted/regrouped into their own object. Sometimes that makes things expressive enough. Other times, there’s just nothing we can do to work around the fact that we need to initialize a class with a dozen or more values.

The example code for this article can be found in our Examples repository.

A class with many properties

Such classes are actually pretty common; any entity object representing a database record would fit the bill. Let’s make a User class. We’re using Rubberduck, so this will be quick!

We start with a public field for each property we want:

Option Explicit
Public Id As String
Public UserName As String
Public FirstName As String
Public LastName As String
Public Email As String
Public EmailVerified As Boolean
Public TwoFactorEnabled As Boolean
Public PhoneNumber As String
Public PhoneNumberVerified As Boolean
Public AvatarUrl As String

Now we hit Ctrl+` to trigger a parse, right-click any of the variables and select Encapsulate Field from the Refactor menu (or Ctrl+Shift+F if you haven’t tweaked the default hotkeys):

The command will be disabled if the caret/selection isn’t on a public field, or if the module has been modified since the last time Rubberduck parsed it.

Check the wrap fields in private type box, then click the Select all button and hit OK.

The Encapsulate Field refactoring also lets you rename the properties and their respective backing field.

Now the module looks like this, and all you had to do was to declare a bunch of public fields:

Option Explicit
Private Type TUser
    Id As String
    UserName As String
    FirstName As String
    LastName As String
    Email As String
    EmailVerified As Boolean
    TwoFactorEnabled As Boolean
    PhoneNumber As String
    PhoneNumberVerified As Boolean
    AvatarUrl As String
End Type
Private this As TUser
Public Property Get Id() As String
    Id = this.Id
End Property
Public Property Let Id(ByVal value As String)
    this.Id = value
End Property
Public Property Get UserName() As String
    UserName = this.UserName
End Property
Public Property Let UserName(ByVal value As String)
    this.UserName = value
End Property
Public Property Get FirstName() As String
    FirstName = this.FirstName
End Property
Public Property Let FirstName(ByVal value As String)
    this.FirstName = value
End Property
Public Property Get LastName() As String
    LastName = this.LastName
End Property
Public Property Let LastName(ByVal value As String)
    this.LastName = value
End Property
Public Property Get Email() As String
    Email = this.Email
End Property
Public Property Let Email(ByVal value As String)
    this.Email = value
End Property
Public Property Get EmailVerified() As Boolean
    EmailVerified = this.EmailVerified
End Property
Public Property Let EmailVerified(ByVal value As Boolean)
    this.EmailVerified = value
End Property
Public Property Get TwoFactorEnabled() As Boolean
    TwoFactorEnabled = this.TwoFactorEnabled
End Property
Public Property Let TwoFactorEnabled(ByVal value As Boolean)
    this.TwoFactorEnabled = value
End Property
Public Property Get PhoneNumber() As String
    PhoneNumber = this.PhoneNumber
End Property
Public Property Let PhoneNumber(ByVal value As String)
    this.PhoneNumber = value
End Property
Public Property Get PhoneNumberVerified() As Boolean
    PhoneNumberVerified = this.PhoneNumberVerified
End Property
Public Property Let PhoneNumberVerified(ByVal value As Boolean)
    this.PhoneNumberVerified = value
End Property
Public Property Get AvatarUrl() As String
    AvatarUrl = this.AvatarUrl
End Property
Public Property Let AvatarUrl(ByVal value As String)
    this.AvatarUrl = value
End Property

I love this feature! Rubberduck has already re-parsed the module, so next we right-click anywhere in the module and select the Extract Interface refactoring, and check the box to select all Property Get accessors (skipping Property Let):

Extract Interface can automatically implement the extracted interface for you, and you can extract a public interface out of a private class.

Having a read-only interface for client code that doesn’t need the Property Let accessors makes an objectively cleaner API: assignments are recognized as invalid at compile time.

We get a read-only IUser interface for our efforts (!), and now the User class has an Implements IUser instruction at the top, …and these new members at the bottom:

Private Property Get IUser_ThingId() As String
    IUser_ThingId = ThingId
End Property

Private Property Get IUser_UserName() As String
    IUser_UserName = UserName
End Property

Private Property Get IUser_FirstName() As String
    IUser_FirstName = FirstName
End Property

Private Property Get IUser_LastName() As String
    IUser_LastName = LastName
End Property

Private Property Get IUser_Email() As String
    IUser_Email = Email
End Property

Private Property Get IUser_EmailVerified() As Boolean
    IUser_EmailVerified = EmailVerified
End Property

Private Property Get IUser_TwoFactorEnabled() As Boolean
    IUser_TwoFactorEnabled = TwoFactorEnabled
End Property

Private Property Get IUser_PhoneNumber() As String
    IUser_PhoneNumber = PhoneNumber
End Property

Private Property Get IUser_PhoneNumberVerified() As Boolean
    IUser_PhoneNumberVerified = PhoneNumberVerified
End Property

Private Property Get IUser_AvatarUrl() As String
    IUser_AvatarUrl = AvatarUrl
End Property

The scary part is that it feels as though if Extract Interface accounted for the presence of a Private Type in a similar way Encapsulate Field does, then even the TODO placeholder bits could be fully automated. Might be something to explore there… Update: automagic implementation completed!

Now we have our read-only interface worked out, if we go by previous posts’ teachings, , that is where we make our User class have a predeclared instance, and expose a factory method that I’d typically name Create:

'@Description "Creates and returns a new user instance with the specified property values."
Public Function Create(ByVal Id As String, ByVal UserName As String, ...) As IUser
    '...
End Function

Without Rubberduck, in order to have a predeclared instance of your class you would have to export+remove the class module, locate the exported .cls file, open it in Notepad++, edit the VB_PredeclaredId attribute value to True, save+close the file, then re-import it back into your VBA project.

With Rubberduck, there’s an annotation for that: simply add '@PredeclaredId at the top of the class module, parse, and there will be a result for the AttributeValueOutOfSync inspection informing you that the class’ VB_PredeclaredId attribute value disagrees with the @PredeclaredId annotation, and then you apply the quick-fix you want, and you just might have synchronized hidden attributes across the with a single click.

'@PredeclaredId
Option Explicit

When it’s a factory method for a service class that takes in dependencies, 2-3 parameters is great, 5+ is suspicious. But here we’re taking in values, pure data – not some IFileWriter or other abstraction. And we need quite a lot of them (here 10, but who knows how many that can be!), and that’s a problem, because this is very ugly:

Set identity = User.Create("01234", "Rubberduck", "contact@rubberduckvba.com", False, ...)

Using named parameters can help:

Set identity = User.Create( _
    Id:="01234", _
    UserName:="Rubberduck", _
    Email:="contact@rubberduckvba.com", _
    EmailVerified:=False, _
    Phone:="555-555-5555", _
    PhoneVerified:=False, _
    ...)

But the resulting code still feels pretty loaded, and that’s with consistent line breaks. Problem is, that limits the number of factory method parameters to 20-ish (if we’re nice and stick to one per line), since that’s how many line continuations the compiler will handle for a single logical line of code.

Surely there’s a better way.

Building the Builder

I wrote about this pattern in OOP Design Patterns: The Builder, but in retrospect that article was really just a quick overview. Let’s explore the builder pattern.

I like to design objects from the point of view of the code that will be consuming them. In this case what we want to end up with, is something like this:

Set identity = UserBuilder.Create("01234", "Rubberduck") _
    .WithEmail("contact@rubberduckvba.com", Verified:=False) _
    .WithPhone("555-555-5555", Verified:=False) _
    .Build

This solves a few problems that the factory method doesn’t:

  • Optional arguments become explicitly optional member calls; long argument lists are basically eliminated.
  • Say Id and UserName are required, i.e. a User object would be invalid without these values; the builder’s own Create factory method can take these required values as arguments, and that way any User instance that was built with a UserBuilder is guaranteed to at least have these values.
  • If we can provide a value for EmailVerified but not for Email, or for PhoneVerified but not for Phone, and neither are required… then with individual properties the best we can do is raise some validation error after the fact. With a UserBuilder, we can have WithEmail and WithPhone methods that take a Verified Boolean parameter along with the email/phone, and guarantee that if EmailVerified is supplied, then Email is supplied as well.

I like to start from abstractions, so let’s add a new class module – but don’t rename it just yet, otherwise Rubberduck will parse it right away. Instead, copy the IUser interface into the new Class1 module, select all, and Ctrl+H to replace “Property Get ” (with the trailing space) with “Function With” (without the trailing space). Still with the whole module selected, we replace “String” and “Boolean” with “IUserBuilder”. The result should look like this:

'@Interface
Option Explicit
Public Function WithId() As IUserBuilder
End Function
Public Function WithUserName() As IUserBuilder
End Function
Public Function WithFirstName() As IUserBuilder
End Function
Public Function WithLastName() As IUserBuilder
End Function
Public Function WithEmail() As IUserBuilder
End Function
Public Function WithEmailVerified() As IUserBuilder
End Function
Public Function WithTwoFactorEnabled() As IUserBuilder
End Function
Public Function WithPhoneNumber() As IUserBuilder
End Function
Public Function WithPhoneNumberVerified() As IUserBuilder
End Function
Public Function WithAvatarUrl() As IUserBuilder
End Function

We’re missing a Build method that returns the IUser we’re building:

Public Function Build() As IUser
End Function

Now we add the parameters and remove the members we don’t want, merge the related ones into single functions – this is where we define the shape of our builder API: if we want to make it hard to create a User with a LastName but without a FirstName, or one with TwoFactorEnabled and PhoneNumberVerified set to True but without a PhoneNumber value… then with a well-crafted builder interface we can make it do exactly that.

Once we’re done, we can rename the class module to IUserBuilder, and that should trigger a parse. The interface might look like this now:

'@Interface
'@ModuleDescription("Incrementally builds a User instance.")
Option Explicit
'@Description("Returns the current object.")
Public Function Build() As IUser
End Function
'@Description("Builds a user with a first and last name.")
Public Function WithName(ByVal FirstName As String, ByVal LastName As String) As IUserBuilder
End Function
'@Description("Builds a user with an email address.")
Public Function WithEmail(ByVal Email As String, Optional ByVal Verified As Boolean = False) As IUserBuilder
End Function
'@Description("Builds a user with SMS-based 2FA enabled.")
Public Function WithTwoFactorAuthentication(ByVal PhoneNumber As String, Optional ByVal Verified As Boolean = False) As IUserBuilder
End Function
'@Description("Builds a user with an avatar at the specified URL.")
Public Function WithAvatar(ByVal Url As String) As IUserBuilder
End Function

Then we can add another class module, and type Implements IUserBuilder under Option Explicit, then hit Ctrl+` to parse. Unless you disabled the “check if code compiles before parsing” setting (it’s enabled by default), you should be seeing this warning:

The project can’t compile, because the interface isn’t implemented.

Click Yes to parse anyway (normally we only want compilable code, but in this case we know what we’re doing, I promise), then right-click somewhere in the Implements IUserBuilder statement, and select the Implement Interface refactoring:

Creating all these method stubs manually, or… letting Rubberduck create them all at once in a split-second?

The result is as follows, and makes a good starting point:

Option Explicit
Implements IUserBuilder
Private Function IUserBuilder_Build() As IUser
    Err.Raise 5 'TODO implement interface member
End Function
Private Function IUserBuilder_WithName(ByVal FirstName As String, ByVal LastName As String) As IUserBuilder
    Err.Raise 5 'TODO implement interface member
End Function
Private Function IUserBuilder_WithEmail(ByVal Email As String, Optional ByVal Verified As Boolean = False) As IUserBuilder
    Err.Raise 5 'TODO implement interface member
End Function
Private Function IUserBuilder_WithTwoFactorAuthentication(ByVal PhoneNumber As String, Optional ByVal Verified As Boolean = False) As IUserBuilder
    Err.Raise 5 'TODO implement interface member
End Function
Private Function IUserBuilder_WithAvatar(ByVal Url As String) As IUserBuilder
    Err.Raise 5 'TODO implement interface member
End Function

We’re “building” an IUser object. So we have a module-level User object (we need the class’ default interface here, so that we can access the Property Let members), and each With method sets one property or more and then returns the current object (Me). That last part is critical, it’s what makes the builder methods chainable. We’ll need a Build method to return an encapsulated IUser object. So the next step will be to add a @PredeclaredId annotation and implement a Create factory method that takes the required values and injects the IUser object into the IUserBuilder instance we’re returning; then we can remove the members for these required values, leaving only builder methods for the optional ones. We will also add a value parameter of the correct type to each builder method, and make them all return the current object (Me). Once the class module looks like this, we can rename it to UserBuilder, and Rubberduck parses the code changes – note the @PredeclaredId annotation (needs to be synchronized to set the hidden VB_PredeclaredId attribute to True:

'@PredeclaredId
'@ModuleDescription("Builds a User object.")
Option Explicit
Implements IUserBuilder
Private internal As User
'@Description("Creates a new UserBuilder instance.")
Public Function Create(ByVal Id As String, ByVal UserName As String) As IUserBuilder
    Dim result As UserBuilder
    Set result = New UserBuilder
    
    '@Ignore UserMeaningfulName FIXME
    Dim obj As User
    Set obj = New User
    obj.Id = Id
    obj.UserName = UserName
    
    Set result.User = internal
    Set Create = result
End Function
'@Ignore WriteOnlyProperty
'@Description("For property injection of the internal IUser object; only the Create method should be invoking this member.")
Friend Property Set User(ByVal value As IUser)
    If Me Is UserBuilder Then Err.Raise 5, TypeName(Me), "Member call is illegal from default instance."
    If value Is Nothing Then Err.Raise 5, TypeName(Me), "'value' argument cannot be a null reference."
    Set internal = value
End Property
Private Function IUserBuilder_Build() As IUser
    If internal Is Nothing Then Err.Raise 91, TypeName(Me), "Builder initialization error: use UserBuilder.Create to create a UserBuilder."
    Set IUserBuilder_Build = internal
End Function
Private Function IUserBuilder_WithName(ByVal FirstName As String, ByVal LastName As String) As IUserBuilder
    internal.FirstName = FirstName
    internal.LastName = LastName
    Set IUserBuilder_WithName = Me
End Function
Private Function IUserBuilder_WithEmail(ByVal Email As String, Optional ByVal Verified As Boolean = False) As IUserBuilder
    internal.Email = Email
    internal.EmailVerified = Verified
    Set IUserBuilder_WithEmail = Me
End Function
Private Function IUserBuilder_WithTwoFactorAuthentication(ByVal PhoneNumber As String, Optional ByVal Verified As Boolean = False) As IUserBuilder
    internal.TwoFactorEnabled = True
    internal.PhoneNumber = PhoneNumber
    internal.PhoneNumberVerified = Verified
    Set IUserBuilder_WithTwoFactorAuthentication = Me
End Function
Private Function IUserBuilder_WithAvatar(ByVal Url As String) As IUserBuilder
    internal.AvatarUrl = Url
    Set IUserBuilder_WithAvatar = Me
End Function

Now, when I said default instances and factory methods (here too) are some kind of fundamental building block, I mean we’re going to be building on top of that, starting with this builder pattern; the Create method is intended to be invoked off the class’ default instance, like this:

Set builder = UserBuilder.Create(internalId, uniqueName)

The advantages are numerous, starting with the possibility to initialize the builder with everything it needs (all the required values), so that the client code can call Build and consume a valid User object right away.

Side note about this FIXME comment – there’s more to it than it being a signpost for the reader/maintainer:

    '@Ignore UserMeaningfulName FIXME
    Dim obj As User

By default only TODO, BUG, and NOTE markers are picked up, but you can easily configure Rubberduck to find any marker you like in comments, and then the ToDo Explorer lets you easily navigate them all:

Rubberduck has a ToDo Explorer toolwindow that can be configured (click the cogwheel icon) to pick up “FIXME” anywhere in comments, anywhere in the project. Or “HERE BE DRAGONS”.

Another noteworthy observation:

'@Ignore WriteOnlyProperty
'@Description("For property injection of the internal IUser object; only the Create method should be invoking this member.")
Friend Property Set User(ByVal value As IUser)
    If Me Is UserBuilder Then Err.Raise 5, TypeName(Me), "Member call is illegal from default instance."
    If value Is Nothing Then Err.Raise 5, TypeName(Me), "'value' argument cannot be a null reference."
    Set internal = value
End Property

Me is always the current object, as in, an instance of this class module, presenting the default interface of this class module: the If Me Is UserBuilder condition evaluates whether Me is the object known as UserBuilder – and right now there’s no such thing and the code doesn’t compile.

Synchronizing Attributes & Annotations

Rubberduck knows we mean that class to have a VB_PredeclaredId attribute value of True because of the @PredeclaredId annotation, but it’s still just a comment at this point. Bring up the inspection results toolwindow, and find the results for the MissingAttribute inspection under Rubberduck Opportunities:

Clicking Fix all occurrences in project will automatically add all the missing attributes.

That didn’t fix the VB_PredeclaredId attributes! Why?! The reason is that the attribute isn’t missing, only its value is out of sync. We’ll have to change this (pull requests welcome!), but for now you’ll find the AttributeValueOutOfSync inspection results under the Code Quality Issues group. If you group results by inspection, its miscategorization doesn’t matter though:

When attributes and annotations contradict each other, the AttributeValueOutOfSync inspection starts issuing results.

Adjust the attribute value accordingly (right-click the inspection result, or select “adjust attribute value(s)” from the “Fix” dropdown menu), and now your UserBuilder is ready to use:

Dim identity As IUser
Set identity = UserBuilder.Create(uniqueId, uniqueName) _
                          .WithName(first, last) _
                          .WithEmail(emailAddress) _
                          .Build

…and misuse:

Set UserBuilder.User = New User '<~ runtime error, illegal from default instance
Debug.Print UserBuilder.User.AvatarUrl '<~ compile error, invalid use of property
Set builder = New UserBuilder
Set identity = builder.Build '<~ runtime error 91, builder state was not initialized
Set builder = New UserBuilder
Set builder = builder.WithEmail(emailAddress) '<~ runtime error 91

Conclusions

Model classes with many properties are annoying to write, and annoying to initialize. Sometimes properties are required, other times properties are optional, others are only valid if another property has such or such value. This article has shown how effortlessly such classes can be created with Rubberduck, and how temporal coupling and other state issues can be solved using the builder creational pattern.

Using this pattern as a building block in the same toolbox as factory methods and other creational patterns previously discussed, we can now craft lovely fluent APIs that can chain optional member calls to build complex objects with many properties without needing to take a gazillion parameters anywhere.

Hello, Rubberduck 2.5.0

Creating the pull request to merge the current [next] branch into [master] is always thrilling: the incredible amount of work that goes into Rubberduck, release after release, never ceases to amaze me. This time (again!), the pull request is well over 1.2K commits. Green-release version 2.4.1.0 was all the way back on March 25, 2019 – which was the Monday that immediately followed the last MVP Global Summit.

What’s new?

If you’ve been keeping up with pre-release builds, you already know. If you’re still using v2.4.1.0 and have the check for newer version at startup setting enabled, your ducky will be telling you about the new build next time you fire up the VBE.

When you update to v2.5, you’ll notice a new option for the check for newer version at startup setting: there’s a new “check for pre-release builds” option that can let you know not only of a new minor version bump, but also for every pre-release build – which effectively means you now get to keep Rubberduck as up-to-date as possible (every merged pull request), without needing to subscribe to GitHub email notifications.

Splash Screen

But the first thing you’ll notice (assuming you haven’t disabled it) will be the splash screen going back to the 2.4.0 yellow ducky splash – if you didn’t know, v2.4.1 was “ThunderFrame Edition” and all this time the splash screen was a nod to our dear friend Andrew Jackson:

Rubberduck’s repository is still filled with hundreds of Andrew’s ideas, and his impact on the project will remain with us forever. This ducky is based on Andrew’s work, too:

I’m not a fan of the font (it’s the same as on the ThunderSplash), but SHOWCARD GOTHIC was getting old and annoyingly too playful-looking. If a graphic artist is reading this and has a nice idea they’d like to contribute, they’re welcome to do so!

But you’re not here to read about the splash screen, are you?

Website/GitHub Integration

In the past, a new green-release meant Rubberduck needed to be deployed to the project’s website itself, so that the /version/build pages could respond with the assembly version of the Rubberduck.dll file deployed. Today the website only needs a Rubberduck build to support the online indenter page, and we only need to update that build to keep the online indenter preview tool up-to-date: if no indenter changes are made, then nothing needs to be updated – the website uses GitHub’s REST API to get the latest pre-release and official “green release” version numbers, but also to download the latest xml-doc from the Rubberduck.CodeAnalysis project, and with that the website’s /inspections/list page will now start identifying the newer inspections that are only available in a pre-release build, versus those present in the latest “green release” (this hasn’t kicked in yet, only because the [master] branch didn’t have any xml-docs to download). The /inspections/details pages are also entirely generated from the in-code xml documentation, including the many examples: we’ll eventually start linking to these pages in the inspection results toolwindow, with “why am I seeing this?” links/buttons.

New Features?

New inspections and new quickfixes, of course – but mostly lots of bugs fixed, and extremely important enhancements to the resolver logic effectively warrant the minor version bump. As mentioned in What’s Cooking for Rubberduck 2.5.x, special attention was given to the resolution of implicit default member calls and bang notation – and with that there’s very, very little early-bound code (if any) that Rubberduck isn’t understanding.

Self-closing pairs aren’t a new feature, but Rubberduck will now ship with the feature enabled by default (was opt-in before). We have been able to hijack and suppress the annoying “beep” that the VBE sounds when the Parameter Quick-Info command doesn’t have anything to show, and this has unlocked restoring automatic quick-info when typing the argument list of a function or procedure call: before that, using self-closing pairs worked pretty nicely, but parameter quick-info had to be manual, which was rather disturbing.

VBA + Source Control

If you’ve been following the project for some time, you probably remember the defunct source control panel – a toolwindow that essentially implemented Visual Studio’s Team Explorer and let you synchronize your VBA project with the files in a git repository. It would also list modified files and let you commit, push, pull, fetch, create new branches, merge them, etc. It failed and isn’t coming back, but the Code Explorer in v2.5 brings back the ability to synchronize the contents of your VBA project from the file system:

Update Components from Files will update existing modules from files in a selected folder, and Replace Contents from Files will make the VBA project mirror the contents of the selected folder (creating new project components/modules as needed). Because Visual Basic 6.0 already works off the file system, in VB6 we only offer the Update Components from Files command.

Keep in mind that while the contents of document modules can be imported, new document modules can’t be added to the host project by the VBE (the host application owns these modules: see this article): for this reason you will want to minimize the amount of code you have in modules like ThisWorkbook and other Worksheet modules in Excel, or in reports & forms in Access. Implementing the actual functionality in separate modules will make things much easier to work with this feature in conjunction with source control (whether you use git, mercurial, SVN, or any other VCS technology).

Visual Studio 2019

Rubberduck has been built with Visual Studio 2017 for quite some time: we have successfully updated all projects in the solution to the awesome new .csproj format, and until now the WPF (Windows Presentation Foundation – the .NET UI framework we use to design our toolwindows and dialogs) dependencies made it impossible to upgrade our build process to work in Visual Studio 2019 until the release of .NET Core 3 last September. This release marks the milestone where we flip the page, sunset Visual Studio 2017 – the first pull request to be merged after v2.5.0, will be one that updates the build process to work with Visual Studio 2019.

If you have forked or cloned Rubberduck, please note that Rubberduck will no longer build in VS2017, as soon as it builds in VS2019.


What Next?

One of the biggest road blocks that’s currently keeping us from implementing a lot of the amazing inspection ideas (and bringing back a proper Extract Method refactoring!), is the lack of proper code path analysis. With that, we’ll have standard tooling that all these inspections can share and reuse (rather than reinvent a rather complex wheel everytime), and then we can tackle the many open Code Path Analysis issues. I’ll be posting an “Inside Rubberduck” article about the architecture and thinking behind this at some point.

Another road block, that’s currently keeping Rubberduck from fully understanding the interfaces it’s looking at, is flicking the switch for our internal TypeLib API, which taps deep into the VBIDE’s guts and gives us visibility on the internal ITypeLib of the VBA project. Rubberduck is already leveraging some of these capabilities (that’s how unit testing works in every VBA host application), but by flicking that switch we’ll be able to, among many other things, pick up the Workbook interface of the ThisWorkbook module… which unlocks fixing a number of long-standing issues and inspection false positives.

Block Completion is another upcoming feature that will possibly be getting my attention in 2020, but not before Code Path Analysis does.

In order to address the growing concerns of performance and memory consumption (especially in larger projects, which currently work best in 64-bit hosts, and possibly not at all in 32-bit hosts), we are exploring implementing a Language Server to offload parsing & resolution out of the host process, similar to how VSCode & Roslyn works, and possibly also moving a lot of the in-memory storage of referenced type libraries’ declarations to an out-of-process database.

Hacktoberfest 2019 & Rubberduck

For the 3rd year in 2019, Rubberduck celebrated open-source with Digital Ocean’s (6th) Hacktoberfest event. This year’s was our busiest ever, with ~50 pull requests created, 11 new forks and 24 new stars in the past month; 67 issues were closed since October 1st, by 11 authors – including 4 first-time contributors!

This means… quite a lot was accomplished. Recap.

Parser & Resolver Tweaks

I’m always impressed with these, because such tweaks always enhance Rubberduck’s understanding of VBA code, so every such enhancement means better and more accurate inspections and overall functionality. This time, attention was given to Debug.Print and other statements that support a particular “output list” syntax, which would previously break the parser and fail to resolve identifier references in such expressions. Debug is no longer being treated as a “fake class”; rather, we’re now treating it as a more accurate “fake module”. VBA is special-casing Print methods in a way that forces us to parse it differently than a standard member call, and if you look carefully at how it’s syntax-highlighted, you’ll notice Debug.Print is really treated as a keyword, which is rather awkward. Print statements will now correctly parse and process their output list arguments.

It was also brought to our attention that under some regional settings OS configurations, exported forms could format decimal numbers in form properties with a comma for a decimal separator, and this tripped the parser since decimal literals are expected to be US-formatted! This was also fixed.

Default members and let-coercion behavior received significant attention too, and now bang notation (! operators) correctly resolve… and Rubberduck is now able to expand this notation (and other implicit default member calls) into explicit, early-bound code that preserves compiler validation and prevents the various problems associated with inadvertent late binding. Implicit procedure calls are now being identified, suspicious Let assignments are now being flagged, and jagged arrays now resolve correctly, too.

The broken resolution of Enum members has been corrected, and Next i will now correctly resolve the reference to i.

Inspection & Refactoring Fixes

An oversight in the implementation of the “Parameter can be ByVal” inspection was causing false positives with Enum types – a rather trivial fix. Much less trivial, the handling of @Ignore and @IgnoreModule annotations across all inspections was refactored and centralized, making it much harder to implement an inspection that doesn’t heed these annotations. Bugs were fixed in EncapsulatePublicField, UntypedFunction, and EmptyMethod inspections.

Name validation (e.g. when renaming an identifier) was refined and consolidated, and will now be more consistent and less prone to falsely claim name clashes.

UI & UX Enhancements

Autocompletion features will now be enabled by default for a new installation (note that the updated default will not affect an upgrade install). The caption of the “Ok” button of the settings dialog was changed to “Save & Close”, and the Extract Interface refactoring can now be invoked from the Code Explorer context menu. Inspection grouping was improved, every search box can now be cleared with the ESC key, and a long-standing issue with undocked toolwindows has been fixed: toolwindows made undockable can now easily be made dockable again.

The version check feature was updated, too: the wording of the notification message now includes the currently-running version number, and a new setting can now enable notifications for every pre-release build in-between. This update was made possible with the deployment of important changes to the project’s website: previously, the /Build/Version/stable URL would return the version of the Rubberduck build that was running on the web server – now, we hit the GitHub REST API and grab all releases: that’s how the home page can display this table with the breakdown of download stats; it’s how the /Build/Version endpoints know what version number to return, and how the /Inspections content is now generated directly from XML-doc comments in the source code. Note that caching/timing issues may cause the feature to announce a version number, and the actual latest build is actually more recent than that.

Miscellaneous

Our AppVeyor builds now generate a SHA256 hash for installer builds, that we issue as an asset that can be downloaded separately from the installer itself. Downloadable assets also include the .xml file that the website uses to generate the /Inspections content; because this content needs to follow certain specific conventions, a suite of Roslyn analyzers was implemented to inspect the XML documentation in the source code, and fail the build if a new inspection couldn’t properly show up on the website.

The PermissiveAssertClass is now leveraging Variant conversion in its IEqualityComparer<object> implementation, making Assert.AreEqual behave exactly as you would expect VBA to work. The settings dialog lets you select which IAssertClass implementation to use for new test modules.

Rubberduck’s own test suite is now up to well over 7,100 unit tests, and there are many refactoring opportunities, and a long-time contributor proposed to introduce an InspectionTestsBase abstract class, enhance our MockVbeBuilder, and boil writing test cases for inspections down to a one-liner, in two pull requests that netted a significant and welcome cut in our test code (diffs were +426/-1,759 and +1,431/-5,262, respectively), making a good payment on some technical debt.

Coming Up

With Hacktoberfest behind us, we’re getting ready to release v2.5.0 in November; translation rounds have begun, so now is a very good time to submit a pull request to add a new supported language, or to help complete other translations.

Rubberduck is currently built with Visual Studio 2017; there’s a draft pull request that changes the AppVeyor build environment to use Visual Studio 2019 and the newest SDK – the current and new build processes aren’t compatible: once that pull request is merged, Rubberduck will build correctly with Visual Studio 2019, and will start failing in VS 2017. The plan is to proceed with this IDE upgrade as v2.5.0 is released later this month.

Here’s to a successful Hacktoberfest, thanks to everyone that participated!

Dependency Injection + Inversion of Control + VBA

Whether VBA can do serious OOP isn’t a question – it absolutely can: none of the SOLID principles have implications that disqualify VBA as a language, and this means we can implement dependency injection and inversion of control. This article will go over the general principles, and then subsequent articles will dive into various dependency injection techniques you can use in VBA code.

A quick summary of these fundamental guidelines, before we peek at DI and IoC:

SOLID

Single Responsibility Principle

Split things up, and then some. Write loop bodies in another procedure, extract if/else blocks into other small specialized procedures. Do as little as possible, aim for each procedure to have a well-defined single responsibility.

Open/Closed Principle

Designing classes that are “open for extension, but closed for modification” is much easier said than done, but definitely worth striving for; by adhering to the other SOLID principles, this one just naturally falls into place. In a nutshell, you’ll want to be able to add features by extending a class rather than modifying it (and risk breaking something) – the only code you need to think about is the code for the new feature… and how you’re going to be testing it.

Liskov Substitution Principle

Say you write a procedure that takes an IFooRepository parameter. Whether you invoke it with some SqlFooRepository, MySqlFooRepository, or FakeFooRepository, should make no difference whatsoever: each implementation fulfills the interface’s contract, each implementation could be swapped for another without altering the logic of the procedure.

Interface Segregation Principle

Write small, specialized interface with a clear purpose, that won’t likely need to grow new members in the future: IFooRepository.GetById is probably fine, but IFooRepository.GetByName looks like someone had a specific or particular implementation in mind when they designed the interface, and now you need to implement a GetByName method for a repository where that makes no sense.

Dependency Inversion Principle

Depend on abstractions, not concrete implementations – your code has dependencies, and you want them abstracted away behind interfaces that you receive as parameters.


What is a dependency?

You’re writing a procedure, and you need to invoke a method that belongs to another object or module – say, MsgBox: with it your procedure can warn the user of an error, or easily get a yes/no answer. But this ability comes with a cost: now there’s no way to invoke that procedure without popping a message box and stopping execution until it’s dismissed. Hard-wired dependencies make unit testing difficult (if not impossible), so we inject them instead, as abstractions.

And dependency injection?

MsgBox is a bad example – Rubberduck’s FakesProvider already lets you configure MsgBox calls any way your testing requires, and no pop-up ..pops up. But let’s say the procedure needs to do things to a Worksheet.

We could make the procedure take a Worksheet parameter, and that would be method injection.

Since we’re in a class module (right?), we could have a Property Set member that takes a Worksheet value argument and assigns it to a Worksheet instance field that our method can work with, and that would be property injection.

We could have a factory method on our class’ default instance, that receives a Worksheet argument and property-injects it to a New instance of the class, then returns an instance of the class that’s ready to use (behind an interface that doesn’t expose any Property Set accessor for the injected dependencies), and that would be as close to the ideal constructor injection as you could get in a language without constructors.

What “control” is inverted, and why?

When a method News up all its dependencies, it’s a control freak that doesn’t let the outside world know anything about what objects it needs to do its job: it’s a black box that other code needs to take as “it just works”, and we can’t do much to alter how it works.

With inversion of control (IoC), you give up that control and let something else New things up for you, and that’s why Dependency Injection (DI) goes hand-in-hand with it. IoC implies completely reversing the dependency graph. Take a UserForm that reads from / writes to a worksheet, with code-behind that implements every little bit of what needs to happen in CommandButton1_Click handlers – a “Smart UI” – reversing the dependency graph means the form’s code-behind is now only concerned about the data it needs to present to the user, and the data it needs to collect from the user; the CommandButton1 button was renamed to AcceptButton, and its Click handler does one single thing: it invokes a SaveChangesCommand object’s Execute method, and everything that was in that click handler is now in that ICommand implementation. The command knows nothing of any userform; it works with the model, that it receives in an Object parameter to its Execute method.

It all comes down to one thing: testability. You want to test your commands and what they do, how they manipulate the model – so you pull as much as possible out of UI-dependent code and into specialized classes that only know as much as they need to know. The form’s code-behind (aka the view) knows about the model, the commands; the model knows about nothing but itself; the commands know about the model; a presenter would know about both the view and the model, but shouldn’t need to care for commands.

If none of the components create their dependencies / if all components have their dependencies injected, then if we follow the dependency chain we arrive to an entry point: in VB6 that would be some Public Sub Main(); in VBA, that could be any Public Sub procedure / “macro” in a standard module, or any Worksheet or Workbook event handler. These entry points all need to New up (or otherwise provide) everything in the dependency graph (e.g. class1 depends on class2 which depends on class3 and class4, …), and then invoke the desired functionality.

What is testable code?

Testable code is code for which you can fully control/inject all the dependencies of that code. This is where coding against abstractions pays off: you can leverage polymorphism and implement test doubles / stubs / fakes as needed. The presence of a New keyword in a method is a rather obvious indicator of a dependency; it’s the implicit dependencies that are harder to spot. These could be a MsgBox prompt, a UserForm dialog, but also Open, Close, Write, Kill, Name keywords, or maybe ActiveSheet, or ActiveWorkbook implicit member calls against a hidden global object; even the current Date can be a hidden dependency if it’s involved in behavior you want to cover with one or more unit tests. The Rnd function is definitely a dependency as well.

SOLID code is inherently testable code. If you write the tests first (Test-Driven Development / TDD), you could even conceivably end up with SOLID-compliant code out of necessity.

Say you want to bring up a dialog that collects some inputs, and one of these inputs needs to be a decimal value greater than or equal to 0 but less than 1 – what are the odds that such validation logic ends up buried in some TextBox12_Change handler (and duplicated in 3 places) in the UserForm module if the problem is tackled from a testability standpoint? That’s right: exactly none.

If the first thing you do is create a MyViewModelTests module with a MySpecialDecimal_InvalidIfGreaterThanOne test method, there’s a good chance your next move could be to add a MyViewModel class with a MySpecialDecimal property – be it only so that the test method can compile:

'@TestMethod("ValidationTests")
Public Sub MySpecialDecimal_InvalidIfGreaterThanOne()
    Dim sut As MyViewModel
    Set sut = New MyViewModel
    sut.MySpecialDecimal = 42
    Assert.IsFalse sut.IsValid
End Sub

So we need this MyViewModel.IsValid member now:

Public Property Get IsValid() As Boolean
End Property

At this point we can run the test in Rubberduck’s Test Explorer, and see it fail. Never trust a test you’ve never seen fail! The next step is to write just enough code to make the test pass:

Public Property Get IsValid() As Boolean
    IsValid = MySpecialDecimal < 1
End Property

This prompts us to write another test that we know would fail:

'@TestMethod("ValidationTests")
Public Sub MySpecialDecimal_InvalidIfNegative()
    Dim sut As MyViewModel
    Set sut = New MyViewModel
    sut.MySpecialDecimal = -1
    Assert.IsFalse sut.IsValid
End Sub

So we tweak the code to make it pass:

Public Property Get IsValid() As Boolean
    IsValid = MySpecialDecimal >= 0 And MySpecialDecimal < 1
End Property

We then run the whole test suite, to validate that this change didn’t break any green test, which would mean a regression bug was introduced – and the red test is telling you exactly which input scenario broke.


In a vanilla VBE, OOP quickly gets out of hand, for any decently-sized project: wading through many class modules in the legacy editor, locating implementations of the interfaces you’re coding against – things that you would seamlessly deal with in a modern IDE, become excruciatingly painful when modules are all listed alphabetically under one single “classes” folder, and when Ctrl+F “Implements {name}” is the only thing that can help you locate interface implementations.

Rubberduck not only addresses the organization of your OOP project (with “@Folder” annotations that let you organize & regroup modules by functionality) and enhances navigation tooling (“find all implementations”, “find all references”, “find symbol”, etc.), it also provides a unit testing framework, so that testing your VBA code is done the same way it’s done in other languages and modern IDEs, with Assert expressions that make or break a green test.

But if you write unit tests for your object-oriented VBA code, you’ll quickly notice that when your tests need to inject a fake implementation of a dependency, a consequence is that you often end up with a lot of “test fake” classes whose sole purpose is to support unit testing. This is double-edged, because you need to be careful that you’re testing the right thing (i.e. the actual object/method under test) and not whether your test fake/stub is behaving correctly.

Rubberduck has well over 5K unit tests, and most of them would be very hard to implement without the ability to setup proper mocking. Using the popular Moq framework, we are able to create and configure these “test fakes” without actually writing a class that implements the interface we need to inject into the component we’re testing.

Soon, these capabilities will land in the VBA landscape, with Rubberduck’s unit testing tools wrapping up Moq to let VBA code do exactly that.

What’s Cooking for Rubberduck 2.5.x

If you’ve been following the project all along, this isn’t going to be news, but we kind of missed the v2.4.2 milestone we were slated to release back in April, and here we are with our [next] branch (“pre-release” builds) being a whopping 580+ commits ahead of [master] (“green-release” builds). These commits change a lot of things… so much that v2.4.1 will end up being the only “green-release” of the 2.4.x release cycle, and we’ve decided next release will have to be 2.5.0 – but what is it specifically that warrants such delays and the +1 on the minor version number?

ITypeLib

Perhaps the most important set of changes since v1.2 where we introduced an ANTLR-generated parser, this internal API was actually introduced last year, but until relatively recently it was only used to make the unit testing feature fully host-agnostic (i.e. unit testing works in every host application since) and to retrieve project-level precompiler constants, which closed an otherwise desperate gaping hole in Rubberduck’s understanding of the code that’s in the editor. We are also using it to retrieve and manipulate project references, and possibly in other places I don’t recall at the moment.

But this internal API unlocks much more power than that, and until very recently we hadn’t really started tapping into it. During the v2.5.x cycle, we’ll be using it to instantly populate the Code Explorer toolwindow with tree nodes that still drill down to member level – of course Rubberduck won’t know where a procedure is referenced or be able to refactor anything until parsing has actually occurred, but the project should be instantly navigatable regardless.

We have already begun leveraging this ITypeLib API to augment resolver capabilities, notably with regards to member and module attributes: we can now read most of their values without needing to export anything to any temp file.

So what this API does, is that it taps into VBA/VB6’s internal storage: you may not realize, but compiling your VBA code, internally, creates a COM type library. With this API we can safely query this type library and model user code modules and their members just like any other COM type library, e.g. project references. This means Rubberduck is be able to know what interfaces a document module implements – in other words, when we fully leverage this API we will be able to tell that Sheet1 is a Worksheet and that ThisWorkbook is a Workbook… which means a library-specific inspection like “sheet accessed using string” can now work exactly as intended. We already correctly identify event handler procedures in document modules thanks to these new capabilities; it might seem simple on the surface, but knowing that Sheet1 is a Worksheet and that this Worksheet_Change procedure is handling the Change event of that Worksheet interface, requires looking well beyond the code… and a side-effect of this, is that “procedure not used” no longer fires inspection results for them (the inspection already ignored event handler procedures… all it needed was for the resolver to recognize event handlers in document modules as such).

Default Member Resolution

Once again, a tremendous amount of effort went into augmenting resolver capabilities. This piece of the puzzle is the cornerstone that makes everything else fall into place: if we’re able to issue an inspection result when a variable is never referenced, it’s because the resolver processed all the parse trees and located no references to that variable. But it’s much more than just unused variables: the resolver is the literal central nervous system of Rubberduck – if the resolver doesn’t work well, everything else falls apart.

Except, resolving VBA code correctly is hard. We have an inspection that means to flag missing Set keywords, and until recently it would fire false positives whenever implicit default member calls were involved. What’s that? Picture this code:

Range("A1") = Range("B1")

What’s really happening is this:

Global.Range("A1").[_Default] = Global.Range("A1").[_Default]

But in order to know that, Rubberduck needs to know much more about the code than just what the code is saying: it needs to know that Range is an implicitly-qualified member call on Global (or is it? what if that very same code is in the code-behind of the Sheet3 module?), and that it has a default member that’s the target of this assignment on the left-hand side, and the provider of a value on the right-hand side; it needs to know that this default member yields a Variant (and not another object, which may have its own default member, which might yield an object, which may have a default member, which… so yeah, recursive resolution). And once it knows all that, it can warn you about implicit default member assignments, and soon about any implicit default member call – and help you make them explicit!

Bang notation now also resolves correctly. You write this:

Dim rs As ADODB.Recordset
Set rs = conn.Execute(procName)
Debug.Print rs!Field1

Rubberduck sees this:

Dim rs As ADODB.Recordset
Set rs = conn.Execute(procName)
Debug.Print rs.Fields.Item("Field1").Value

…and this means we’ll soon be able to offer quickfixes/refactorings that turn one notation into the other, and vice-versa.

This is where Rubberduck’s resolver is at, and I need to pinch myself to believe just how crazy wicked awesome it’s becoming – it’s not perfect, but I’m positive, and I’ll repeat this even though it’s been the case for a very long while, but no other VBIDE add-in understands VBA as deeply as Rubberduck.

Moq.Mock<T>

Rubberduck uses the Moq framework for its thousands of unit tests. With it, we’re able to inject “mock” implementations of any abstract dependency: unit testing isn’t complete without a mocking framework, and there’s none for VBA, …for now.

The amount of work involved is astounding, but the important and hard parts are working and we’re just a few road-bumps away from having a COM-visible Moq wrapper API that VBA code can consume to mock any class – your Class1 module or ISomething interface, a ListObject Excel table, any Word.Range, ADODB.Connection, or Scripting.FileSystemObject. This is a massive and complete game-changer that takes unit testing VBA code to a whole new level of credibility.


Timeline

To be honest, there isn’t really any timeline on the table: the 2.5.0 green-release will happen when it does. In the meantime you’ll want to keep an eye on pre-release builds: in the next couple of weeks we’ll be polishing the new features, reviewing what few inspection false positives remain, address a number of prioritized bugs (the all-or-nothing collapsing/expanding of grouping grids, for one), and then we’ll be ready.

There’s plenty of work for all levels and skills, you’re welcome to help us!

Introducing the Reference Explorer

Back in the 2.1.x announcement post over a whole year ago, one of the bullet points about the upcoming roadmap said we were going to “make you never want to use the VBE’s Project References dialog ever again“; it took a bit longer than expected, but as far as we can tell, this feature does exactly that.

If you’ve been following the project on social media recently, you already know that the next version of Rubberduck will introduce a very exciting, unique new feature: the Reference Explorer dialog, and the addition of a references node in the Code Explorer tree.

Vanilla-VBE

Since forever, adding a reference to the active project in the VBE is a rather… vanilla experience. Functional, but somewhere between bland and tedious.

What’s wrong with it?

Regardless of what we think of the very 1998-era buttons docked on the side, the dialog works. There’s a list of available libraries (sorted alphabetically), we can browse for unlisted ones, cancel or accept changes, and the libraries that are selected when the dialog is displayed, are conveniently shown at the top of the list!

On a closer look though…

The vanilla-VBE project references dialog
  1. The list of available libraries has the available libraries listed in alphabetical order. You can’t resize the dialog to show more, but you get first-key search. The Scripting runtime’s library name starts with “Microsoft”… which happens to also be the case for a few other libraries; this makes the extremely useful Scripting.Dictionary and Scripting.FileSystemObject classes pretty much hidden until you stumble upon a blog post or a Stack Overflow answer that introduces them.
  2. The selected libraries show up at the top of the list, in priority order. Locked libraries are stacked at the top. You use the up/down arrow buttons to move the selected library up or down, but you can’t move the locked ones.
  3. The priority buttons are used to determine the identifier resolution order; if an identifier exists in two or more libraries, VBA/VB6 binds to the type defined in the library with the highest priority. There’s no visual cue in the list itself to identify the locked-in type libraries, so the Enabled state of these buttons is used to convey that information: you can’t move the locked-in, default references.
  4. The bottom panel is useful… but the path gets cropped if it’s longer than the rather narrow dialog can fit, and you can’t select or copy the text. The actual library version number isn’t shown.

Visual Studio

Let’s take a look at what adding a project reference using the latest version of Microsoft Visual Studio feels like:

The Microsoft Visual Studio 2017 Reference Manager dialog

The dialog can be resized, search is no longer limited to a single character, but still limited to the beginning of the [Name]. The library info is now richer; it moved to the right side, and a panel on the left side determines the contents of the list. Other than that, besides a new [Version] column and a nice dark theme, …the mechanics are pretty much the same as they were 20 years ago: check boxes in a list. Priority is no longer relevant in .NET though – namespaces fixed that.

Rubberduck

This screenshot was taken shortly before the pull request was opened:

The Rubberduck ‘Add/Remove References’ dialog (work in process: release build may differ)
  1. Available libraries appear in a list on the left-hand side of the dialog. Like in Visual Studio, the version number appears next to the library name, and the list is sorted alphabetically. There is no checkbox: instead, the selected library can be moved into the list of referenced libraries.
  2. Referenced libraries appear in a list on the right-hand side of the dialog. Since there is no checkbox, the selected library can be moved back into the list of available libraries.
  3. Priority up/down buttons appear for the selected referenced library, unless it’s locked.
  4. Icons differentiate locked libraries, libraries that were already referenced when the dialog was shown, and libraries that were newly added. In the list of available libraries, recent and pinned libraries have an icon too.
  5. Search works on a “contains” basis, and matches the library name, description, and path. It immediately filters the list of available libraries.
  6. Tabs for quickly accessing type libraries recently referenced, or pinned libraries, or registered. Host-specific project types are in a separate tab, as applicable.
  7. Bottom panel displays the full name and path of the selected type library. The text can be selected and copied into the clipboard.
  8. Browse button allows referencing any project/library that isn’t listed anywhere. If a library can’t be loaded, it will appear in the list as a broken reference, before it’s even tentatively added to the project.

If you haven’t seen it in action yet, here’s a sneak peek:

Of course that’s just the beginning: layout is not completely final, drag-and-drop functionality remains to-do, among other enhancements.

A first iteration of this feature will likely be merged some time next week, and since this is a major, completely new feature, we’ll bump the minor version and that will be Rubberduck 2.4.0, to be released by the end of 2018…

…not too long after the imminent 2.3.1 hotfix release.

If you think this is one of the coolest things a VBE add-in could possibly do, you’re probably not alone. Share the news, and star us on GitHub!