Exploring version 1.10 - The rename engine

clock October 10, 2008 07:50

Yesterday we finally released VB Migration Partner 1.10 BETA, which registered users can download from our site. Being a beta, we decided to keep the more recent "official" version (1.00.06) online, so that customers can have a choice.

The new version includes so many exciting refactoring features that allow you produce the kind of high-quality VB.NET code that experienced .NET develors can write, only much much faster. For example, you can now reliably merge nested IF blocks or convert On Error Goto and On Error Resume Next statements into Try-Catch blocks by just adding a new project-level pragma.

In this article I show how to leverage the brand-new symbol rename engine. A few customers asked for the ability to rename classes, methods, and controls during the migration process so that the VB.NET code would abide by their naming coding standards. Being the authors of a successful book entirely devoted to coding standard, we couldn't ignore the request.

Earlier versions of VB Migration Partner already included a rename engine - necessary to rename VB6 members that happened to be named after VB.NET keywords, among other things - and more expert users were able to hook into that engine by writing a VB Migration Partner Extender, i.e. a separate DLL that interacts with our tool during the three stages of the migration process (parsing, interpreting, code generation).

DECLARATIVE RULES
Writing an extender requires some familiarity with VB Migration Partner's object model, something most customers don't want toget involved into. (VB Migration Partner already comes with a few extenders that allow you to do complex auxiliary tasks, such as polishing UserControl classes and generating cumulative migration reports when migrating multiple projects in batch mode.)

Thus we decided to expose the inner rename engine to the outside and provide our users with the ability to write declarating rename rules based on regular expressions and stored in an XML file. Thanks to the internal rename engine the entire process was very simple and took a couple of days, most of which were spent to write a series of unit tests and a sample XML file containing a set of standard rename rules for you to use, study, and modify as you see fit.

Enabling this new feature is as simple as adding one ApplyRenameRules pragma anywhere in the VB6 project's source code:

    '## ApplyRenameRules  c:\myrules.xml

The argument is the path of the XML file containing all the rename rules in declarative format. If you omit the argument, the standard RenameRules.xml file that comes with VB Migration Partner 1.10 is used. Instead of adding an ApplyRenameRules pragma in each VB6 project you can store it in a VBMigrationPartner.pragmas file shared by one or more projects. Even better, you can store it in the master pragmas file located in VB Migration Partner's install folder, in which case all yourmigrated projects will use it. Here's a fragment of the XML file that comes with the standard installation:

 <?xml version="1.0" encoding="utf-8" ?>
<NamingRules>
  
   <!-- Components section -->
   <Symbol kind="Component" type="VB.Label" >
      <!-- ensure that we have a "lbl" prefix -->
      <Rule pattern="^(lbl)?" replace="lbl" />
      <!-- drop the "Label" suffix -->
      <Rule pattern="Label$" replace="" />
   </Symbol>

   <!-- more rules for all common controls -->

</NamingRules>

The kind attribute is used first to determine the kind of symbol the rule applies to. Valid names are Form, Class, Module, Type, UserControl, Enum, Component, Method (includes subs and functions), Field, Property, Event, EnumItem (the members of an Enum declaration), Constant, and Variables (includes, local variables and parameters.)

If kind is equal to Component or Variable, you can optionally include a type attribute, that specifies the VB6 type of the member to which the rule applies.In the above example, the type attribue is used to specify that the rule applies to all components of type VB.Label (the standard VB6 label). But you aren't confined to renaming controls: for example, the following rule renames all Integer variables named “i" into “index”

   <Symbol kind="Variable" type="Integer" >
      <Rule pattern="^i$" replace="index" ignoreCase="false" />
   </Symbol>

Renaming rules are based on regular expressions. The pattern attribute is used to decide whether the rule applies to a given member, and the replace attribute specifies how the member should be renamed. The ignoreCase optional attribute should be false if you want the pattern to be applied in case-sensitive mode (by default comparisons are case-insensitive). Finally, the changeCase optional attribute allows you to change the case of the result and can be equal to “lower”, “upper”, “pascal” (first char is uppercase), or “camel” (first char is lowercase).

Being Visual Basic a case-insensitive language, in most cases the default behavior is fine. However, consider the following renaming requirement: many VB6 developers preferred to use “C” as a prefix for all class names, but this guideline has been deprecated in VB.NET. Here’s a rule that drops the leading “C”, but only if it is followed by another uppercase character:

   <Symbol kind="Class" >
      <Rule pattern="^C(?=[A-Z])" replace="" ignoreCase="false" />
   </Symbol>

The changeCase attribute is useful to enforce naming rules that involve the case of identifiers. For example, a few developers like all-uppercase constant names; here's a rule that enforce this guideline:

   <Symbol kind="Constant" >
      <Rule pattern="^.+$" replace="${0}" changeCase="upper" />
   </Symbol>

 

There are a few other options that I don't cover here, if you are interested just have a look at the Renaming Program Members section of our manual. 

PROGRAMMATIC RULES
Declarative rules are easy to write and test - if you are familiar with regexes, at least - but there are times when you want to be in full control of how a symbol or a category of symbols are renamed. Achieving this feat with VB Migration Partner is quite easy:

1) Create a Class Library project inside Visual Studio 2005 or 2008 (you can use any .NET language, including VB.NET or C#)
2) Add a reference to the CodeArchitects.VBMigrationEngine.Dll
3) Create a public class, name it as you prefer, and make it inherit from CodeArchitects.VBMigrationLibrary.SpecialRuleBase
4) Override the GetSymbolNetName method
5) write the code that analyzes the properties of the VBSymbol passed as an argument and return its new name.
6) compile the DLL and drop it in VB Migration Partner's install folder.

For example, let's say that you want to replace the "lng" prefix with the "int" for all 32-bit integer variables defined inside a method named MyMethod. Here's the code that does it:

Public Class MyRenameRules
   Inherits SpecialRuleBase

   Public Overrides Function GetSymbolNetName(ByVal symbol As VBSymbol, _
         ByVal netName As String) As String
      If symbol.Kind = SymbolKind.LocalVar _
            AndAlso symbol.TypeName = "Long" _
            AndAlso symbol.ParentSymbol.Name = "MyMethod" Then
         If netName.StartsWith("lng") Then netName = "int" & netName.Substring(3)
      End If
      Return netName
   End Function
End Class


It couldn't be simpler, uh?

A word of caution: when you supercede VB Migration Partner's rename engine you are responsible for generating names that are both valid and unique. For example, if you drop either a prefix or a suffix, you should ensure that the new name is still unique, otherwise the generated VB.NET project won't compile. 



Subtle array cloning issues

clock October 5, 2008 10:46

As you may know, VB6 performs a copy of the array when assigning an array to another array variable. After the assignment, the source and destination arrays are distinct objects, as this code demonstrates:

Dim source(10) As Integer
source(1) = 111
Dim dest() As Integer
dest() = source()
Debug.Print source(1), dest(1)    ' displays "111,111"
' changing an element in either array doesn’t affect the other
source(1) = 222
Debug.Print source(1), dest(1)    ' displays "222,111"

Conversely, an array assignment under VB.NET just copies the array address:, the source and destination array variables point to the same set of elements. In other words, changing an element through either variable would affect the value seen by the other array variable.

For these reasons, both the Upgrade Wizard and earlier versions of VB Migration Partner automatically force a copy operation when converting an array assignment. This is how Upgrade Wizard converts the above VB6 code snippet:

Dim source(10) As Integer
source(1) = 111
Dim dest() As Integer  
dest = source.Clone()
Debug.Print source(1), dest(1)    ' displays "111,111"
' changing an element in either array doesn’t affect the other
source(1) = 222
Debug.Print source(1), dest(1)    ' displays "222,111"

It seems that adding a call to the Clone method solves the problem once and for all, but unfortunately this isn’t the case. As a matter of fact, there are several cases when the Upgrade Wizard (and probably other VB6-to-VB.NET conversion tools) doesn’t preserve functional equivalence with the original VB6 code.

For startes, a plain Clone method doesn't work correctly if the array you are assigning is an array of arrays (Under VB6, an array of arrays is a Variant array whose individual elements are arrays.) The problem here is that the Array.Clone method performs a shallow copy and doesn’t correctly copy nested arrays and the converted VB.NET application doesn’t work as it should. For this reason, starting with version 1.10, VB Migration Partner supports the CloneArray6 helper method, which can perform both a shallow copy and a deep copy (if the second argument is True, also nested arrays are copied):

   Dim arrayOfArrays() As Variant
    ' …
    Dim dest() As Variant
    dest() = CloneArray6(arrayOfArrays(), True)

The CloneArray6 method is available both in VB6 and VB.NET, but the VB6 version is a do-nothing version and doesn't change the application behavior. The CloneArray6 method can correctly handle arrays of arrays of any nesting level, arrays of structures, and more in general arrays of any ICloneable object.

Another case when UpgradeWizard generates bugged VB.NET code is when assigning a structure that contain one or more arrays. Consider this VB.NET code, generated by Upgrade Wizard when converting an assignment between two structures:

Structure MyUDT
    Public arr() As Integer
End Structure

Sub Test()
    Dim udt As MyUDT
    Dim udt2 As MyUDT
    udt2 = udt
    ' udt and udt2 point to different structures, but
    ' they share the same set of array elements
    ' …
End Sub

It is obvious that the assignment between structures doesn’t work as in the original VB6 code, because the udt and udt2 structures are pointing to the same array: changing one element in the array in one structure affects the array pointed to by the other structure.Unlike Upgrade Wizard and other conversion tools, VB Migration Partner works around this issue by rendering the assignment as follows:

Structure MyUDT
    Public arr() As Integer

    Function Clone() As MyUDT
        Dim copy As MyUDT = Me
        copy.arr = CloneArray6(Me.arr)
       Return copy
    End Function

End Structure

Sub Test()
    Dim udt As MyUDT
    Dim udt2 As MyUDT
    udt2 = udt.Clone()
    ' udt and udt2 are now completely disjoint objects, and share no arrays
    ' …
End Sub

You might bump into another elusive bug when invoking methods that store a reference to the array passed as an argument. The simplest example of this problem is when you store an array into multiple items of a Collection, as this VB6 code demonstrates:

Dim col As New Collection
Dim values(1) As String
values(0) = "Francesco": values(1) = "Balena"
col.Add(values)
' notice that here we reuse the same array…
values(0) = "Giuseppe": values(1) = "Dimauro"
col.Add(values)
Debug.Print col(1)(0) & "," & col(2)(0)   ' => "Francesco,Giuseppe"

The Upgrade Wizard converts this code verbatim, but the resulting VB.NET code just doesn’t work as expected: both col(1) and col(2) elements actually point to the same array, therefore the text displayed in the Output window is not the one you hoped for:

Debug.WriteLine(col(1)(0) & "," & col(2)(0))  ' => "Giuseppe,Giuseppe"

A solution to this problem is to explicitly ReDim the array after each Add method, as follows:

Dim col As New Collection
Dim values(1) As String
values(0) = "Francesco": values(1) = "Balena"
col.Add(values)
' allocate a new memory block for array elements
ReDim values(1)
values(0) = "Giuseppe": values(1) = "Dimauro"
col.Add(values)

However, this workaround adds unnecessary overhead to the VB6 code. Starting with VB Migration Partner version 1.10 you can avoid this overhead and have the most efficient behavior in both VB6 and VB.NET by using the CloneArray6 method:

Dim col As New Collection
Dim values(1) As String
values(0) = "Francesco": values(1) = "Balena"
col.Add(CloneArray6(values))
values(0) = "Giuseppe": values(1) = "Dimauro"
col.Add(CloneArray6(values))

The bottom line: pay close attention to how arrays are assigned and copied in your VB.NET project and carefully scrutinize the code that your VB6-to-VB.NET migration tool generates in these cases. Else you might be spending hours trying to track down these elusive bugs.



A better cure for a common VB6 misconception

clock October 3, 2008 02:48

In an article I wrote about two months ago I showed how you can fix a programming mistake that is quite common among VB6 developers, who tend to believe that the following declaration

     Dim x1, y1, x2, y2 As Integer

declares four integer variables, whereas it actually declares only one integer (y2) and three Variant variables. The solution I proposed required one PreProcess pragma for statements declaring two variables, another PreProcess pragma for statements declaring three variables, and so forth.

Yesterday Marco Giampetruzzi - from the VB Migration Partner Team - surprised me with a better solution that uses just one pragma to account for all Dim, Public, Private, and Static statements, regardless of the number of variables they declare:

    '## PreProcess "(?<=\b(Dim|Public|Private|Static)\b.*)
                    (?<!\b(Sub|Function|Property)\b.*)
                    (?<!As\s+)(?<var>\b\w+\b)(?=,.*\bAs\s+(?<type>\w+))",
                    "${var} As ${type}", True

Let me briefly explain how it works. The first regex tells VB Migration Partner to search an identifier and name it as "var" (?<var>\w+). This identifier must preceded on the same line by one of the declaration keywords (?<=\b(Dim|Public|Private|Static)\b.*). To prevent that parameters on a method declaration be mistakenly matched, we also check that the identifier not be preceded by a keyword used in method and property declarations (?<!\b(Sub|Function|Property)\b.*). Next, the regex ensures that the indentifier be immediately followed by a comma and (after some characters) by an As clause (=?,.*\bAs\b\s+(?<type>\w+)). Notice that above pragma doesn't support array declarations.

Very smart indeed... thanks Marco!

You can build similar PreProcess pragmas to account for similar cases. For example, the following pragma adds an As clause to parameters inside method declarations:

    '## PreProcess "(?<=\b(Sub|Function|Property)\b.*)
                    (?<!As\s+)(?<var>\b\w+\b)(?=,.*\bAs\s+(?<type>\w+))",
                    "${var} As ${type}", True

If you aren't familiar with regexes, drop a line to our tech support and we'll cook one for you!


Announcing VB Migration Partner 1.10 - the refactoring release!

clock September 21, 2008 10:26

We released VB Migration Partner 1.0 in May and the product has been an instant success, thanks to its great features that were long waited for by all VB6 developers wishing to move to the .NET world. In these months we have fixed some residual bugs but, even more important, we took note of all the requests from our customers. The result is the brand-new VB Migration Partner 1.10 version, which introduces many new features that help you to produce better and faster VB.NET. For this reason we have internally dubbed it as the refactoring release.

code migration that just works

GoSub refactoring
VB Migration Partner has always fully supported the conversion of the Gosub keyword, by interspersing the code with additional labels and a simulated return stack. This approach guarantees perfect functional equivalence with the original VB6 code, but admittedly delivers VB.NET code that isn’t very readable.

VB Migration Partner 1.10 goes further by letting you convert Gosub into call to separate private methods, i.e. the same kind of refactoring a skilled developer would perform. The private method receives all the parameters it requires, and these parameters are passed with ByRef if the callee actually modifies them.

We are very proud of this exciting feature. No other migration tool on the market supports anything similar, as far as we know. You can enable it by means of the new ConvertGosubs pragma, which can be scoped at the project, file, and method level.

Once again, we privilege functional equivalence over “beautiful” code, therefore the pragma has no effect if the Gosub can’t be refactored correctly, such as when the method contains one or more Resume statements or the target label is also referenced by a Goto or On Error statement.

Structured Exception Handling (SEH)
While VB.NET fully supports the On Error and Resume keywords, it is true that VB.NET programmers should use Structured Exception Handling (SEH) and the Try-Catch block whenever possible. You can now use the UseTryCatch pragma to have perform the substitution automatically. The pragma can be scoped at the project, file, and method scope, therefore you can precisely state where you want it and where On Error statements should be left in code.

The conversion isn’t possible in some cases – for example, if the method contains multiple On Error Goto statements or if the statement appears inside a conditional block (e.g. If or Select block) – thus VB Migration Partner performs the conversion only if the functional equivalence between VB6 and VB.NET can be fully preserved.

Type inference
VB Migration Partner 1.10 is capable, in most cases, to correctly deduce a less generic data type for Variant local variables, parameters, class fields, functions and properties. For example, if a Variant local variable is always assigned a floating point value, it can be automatically rendered as a Double variable instead of a more generic Object variable.

The new feature is enabled by means of the new InferType pragma. Like all other pragmas, you can set the scope for this action to the entire project, a specific file, or a given method, therefore it’s quite easy to decide exactly where type inference should be used. You can decide whether to limit type inference to only the members that lacks an explicit “As” clause or extend it to all Variant elements.

Unreachable code and unused member removal
Since its early betas, VB Migration Partner included a sophisticated code analysis engine that was able to spot both unreachable code (e.g. the code that immediately follows an Exit Sub statement, unless the code begins with a label referenced by a Goto o Gosub) and unused methods and properties. Developers could then decide to manually remove the code in question, to make the converted VB.NET project leaner and faster.

The new 1.10 version adds two pragmas – RemoveUnreachableCode and RemoveUnusedMembers – which eliminate the need for any manual action. Both pragmas let you specify whether the code in question should be completely removed or just commented out, therefore you can easily make your simulations and check whether the removal would have any consequence of your code.

Please notice that the code analysis engine might mistakenly flag as “unused” a method that is actually invoked via late-binding. You can quickly realize if this is the case, and if so you can use the MarkAsReferenced or MarkPublicAsReferenced pragmas to let VB Migration Partner know that the method shouldn’t be touched. (These pragmas have been available since early beta versions.)

If merging
VB Migration Partner has always supported the selective replacement of And/Or keywords with AndAlso/OrElse, but this feature had to be performed “semi-manually” by including a proper LogicalOps pragma.

VB Migration Partner 1.10 is capable of analyzing nested If blocks and decide whether it is safe to merge two nested If blocks into a single block whose condition concatenates the individual conditions by means of the AndAlso keyword. This feature is enabled by means of the MergeIfs pragma and you decide whether to scope it at the project, file, or method level. (In general, using project scope is always safe, therefore you can just place this pragma in the master VBMigrationPartner.pragmas file.)

There is no limit to the number of Ifs that can be automatically merged by this feature.
 
Naming guidelines enforcement
Since version 1.0, VB Migration Partner has allowed users to rename any member in the converted VB.NET by writing an extension DLL. However, in the forthcoming version 1.10 we made things even simpler by supporting rename rules stored in an XML file in a very simple syntax.

This new feature is enabled by means of the ApplyRenameRules pragma, which lets you specify the path of the XML file that holds the rename rules. We provide an XML file that contains the standard .NET naming guidelines and that you can extend or modify to match *your* guidelines. (Don’t forget that Giuseppe Dimauro and I are the authors of Practical Guidelines and Best Practices for VB.NET and C# Developers, thus we know what we are talking about.)

For example, a typical naming guideline might dictate that all push buttons should have a name with “btn” prefix and that any other prefix – such as the “cmd” prefix, often used in VB6 – should be dropped off. A guideline isn’t restricted to forms and controls, though, and you are in controls of how VB Migration Partner should rename variables, methods, fields, properties, parameters, and so forth. You can even decide to restrict rules to certain project items.

And yes, if declarative XML isn’t enough flexible for your renaming requirements, you can still implement custom rename rules using any .NET programming language.

Support for Infragistic’s ActiveThreed Plus library
Many VB6 applications use one or more SS** controls, namely SSCommand, SSOption, SSCheck, SSFrame, and SSPanel. These controls were originally included in earlier Visual Basic releases to support features such as 3-D effects. More powerful versions of those controls were later gathered by their manufacturer – Sheridan, and then Infragistics – into a commercial library named ActiveThreed Plus.

Not only does VB Migration Partner support all the original SS** controls that were included in Microsoft Visual Basic, it also supports other controls in the ActiveThreed Plus library, namely SSResizer, SSScroll, SSRibbon, and SSTransition. (It doesn’t support only two of the 11 controls included in the library: SSSplash and SSSplitter.) Notice that these controls are supported by means of fully managed .NET controls and you don’t need to deploy the original ActiveX controls with the converted VB.NET application (but you need the original controls when running the migration process.)

Preserve your investement! VB Migration Partner users can adopt the convert-test-fix methodology and solve all compile/runtime errors exclusively by means of pragmas and without any manual intervention on the generated VB.NET. This means that they can leverage all the new refactoring features by simply adding the new seven pragmas to the master VBMigrationPartner.pragmas file, and restart the migration process. With VB Migration Partner it's that simple! Laughing

That’s all for now, but I will post more details in the future. VB Migration Partner 1.10 will be released and made available to registered users in a couple weeks.



A common VB6 misconception... and how to cure it

clock July 31, 2008 12:17

One of our users is facing the migration of a large VB6 application and found that the coders often declared local variables using this "concise" syntax:

    Dim x1, x2, y1, y2 As Integer

It is evident that the original intention was to declare four Integer variables. The truth is, this statement declares only y2 as an integer variable, whereas the type of x1, x2, and y1 variables is affected by Defxxx statements (e.g. DefInt or DefDbl). If the current file contains no Defxxx statement, then the type of these variable is Variant. This was a common misconception among less experienced VB6 developers or developers that grew up with C or C++.

Both the Upgrade Wizard and VB Migration Partner convert the first three Variant variables into VB.NET "As Object" variables, which is formally correct but isn't what the original developer meant. Our user asked whether there was anything he could do about it. Quite surprisingly, VB Migration Partner offers a neat and elengant solution to this issue.

In fact, if you are migrating a VB6 application written by developers who consistently applied this sloppy coding practice, you can help VB Migration Partner to restore the intended data type by means of one or more PreProcess pragmas. More precisely, you need one pragma to account for Dim statements with two variables, another pragma to account for Dim statements with three variables, and so forth:

'## PreProcess "\b(?<kw>Private|Public|Dim|Static)\b\s+(?<v1>\w+)\s*,
\s*(?<v2>\w+)\s+As\s+(?<type>\w+(\.\w+)?)",
"${kw} ${v1} As ${type}, ${v2} As ${type}", true

'## PreProcess "\b(?<kw>Private|Public|Dim|Static)\b\s+(?<v1>\w+)\s*,
\s*(?<v2>\w+)\s*,\s*(?<v3>\w+)\s+As\s+(?<type>\w+(\.\w+)?)",
"${kw} ${v1} As ${type}, ${v2} As ${type}, ${v3} As ${type}", true

'## PreProcess "\b(?<kw>Private|Public|Dim|Static)\b\s+(?<v1>\w+)\s*,
\s*(?<v2>\w+)\s*,\s*(?<v3>\w+)\s*,
\s*(?<v4>\w+)\s+As\s+(?<type>\w+(\.\w+)?)",
"${kw} ${v1} As ${type}, ${v2} As ${type}, ${v3} As ${type}, ${v4} As ${type}", true

'## PreProcess "\b(?<kw>Private|Public|Dim|Static)\b\s+(?<v1>\w+)\s*,
\s*(?<v2>\w+)\s*,\s*(?<v3>\w+)\s*,\s*(?<v4>\w+)\s*,
\s*(?<v5>\w+)\s+As\s+(?<type>\w+(\.\w+)?)",
"${kw} ${v1} As ${type}, ${v2} As ${type}, ${v3} As ${type},
 ${v4} As ${type}, ${v5} As ${type}", true

Notice that these pragmas account for variables and class fields declared with Dim, Static, Public, and Private keywords. You can easily create similar PreProcess to account for more than five variables in a single line.

When these pragmas are used, the above Dim statement is expanded into the following VB6 code immediately before the migration begins:
    Dim x1 As Integer, x2 As Integer, y1 As Integer, y2 As Integer
which, of course, is translated to VB.NET as:
    Dim x1 As Short, x2 As Short, y1 As Short, y2 As Short

Important Note: Keep in mind that these pragmas affect may mistakenly affect the type of variables that SHOULD remain Object variables. Always double check that the resulting VB.NET code works as intended.

Problem solved! 

Update (October 3, 2008): Marco Giampetruzzi (from the VB Migration Partner Team) found a simpler and better way to solve the same problem, as you can read here.



Refactoring VB.NET code with regular expressions

clock July 21, 2008 10:22

One of the key features of VB Migration Partner is migration pragmas, and possibly the most powerful pragma is PostProcess, which enables you to modify the VB.NET being generated. The PostProcess pragma is based on regular expressions and works much like the Regex.Replace method.

The PostProcess pragma can really do wonders in the hands of a developer who is familiar with regexes. As a matter of fact, we have already uploaded several KB articles that show how you can refactor your code using PostProcess pragmas:

[HOWTO] Modify project-level options in VB.NET programs
[HOWTO] Move the declaration of a variable into a For or For Each loop
[HOWTO] Transform “Not x Is y” expressions into “x IsNot y” expressions
[HOWTO] Replace App6 properties with native VB.NET members
[HOWTO] Get rid of warnings related to Screen.MousePointer property
[HOWTO] Convert While loops into Do loops
[HOWTO] Simplify code produced for UserControl classes
[HOWTO] Change the base class of a VB.NET class

[HOWTO] Avoid compilation errors caused by unsupported designers

In this post I am going to show you how you can use the PostProcess pragma in a creative way to optimize common Boolean expressions. The regex patterns I will show aren't exactly simple and I won't explain each in depth. But trust me, they work and they work great!

Important note #1: like all substitutions based on regular expressions, there is a small probability that the techniques discussed in this article might suffer from false matches and, therefore, might produce invalid VB.NET code. Ensure that you add these pragmas to your VB6 code only after you have checked that the conversion works well and be prepared to remove these pragmas if you notice that they produce invalid VB.NET code.

Important note #2: most regex patterns in this article are too long for the page width and will be split in two or more lines. If you are in trouble, at the bottom of this KB article you will find a link to a text file that gathers all of the PostProcess pragmas I am introducing here.

1. Comparisons with True and False

Some VB6 developers like to explicitly compare a Boolean variable with True or False, as in this example:
    Dim x As Integer, res As Boolean, res2 As Boolean
    ' ...
    If res = True And res2 = False Then x = 0


Explicit comparison of a Boolean variable with True or False adds a little overhead, which can be avoided by re-rewriting the expression as follows:
    If res And Not res2 Then x = 0

Here are the two PostProcess pragmas that do the trick, the former accounting for comparisons with True and the latter for comparisons with False:

'## PostProcess "(?<=\n[ \t]*)(?<key>If|ElseIf)\b(?<pre>.+)\b(?<cond>\S+)
    [ \t]*=[ \t]*\bTrue\b(?<post>.+)\bThen\b",
    "${key}${pre}${cond}${post}Then", True

'## PostProcess "(?<=\n[ \t]*)(?<key>If|ElseIf)\b(?<pre>.+)\b(?<cond>\S+)
    [ \t]*=[ \t]*\bFalse\b(?<post>.+)\bThen\b",
    "${key}${pre}Not ${cond}${post}Then", True


The condition can appear also inside Do, Loop, and While statements, thus we need two more PostProcess pragmas:

'## PostProcess "(?<=\n[ \t]*)(?<key>(Do|Loop)[ \t]+(While|Until)|While)\b
    (?<pre>.+)\b(?<cond>\S+)\s*=\s*\bTrue\b",
    "${key}${pre}${cond}", True

'## PostProcess "(?<=\n[ \t]*)(?<key>(Do|Loop)[ \t]+(While|Until)|While)\b
    (?<pre>.+)\b(?<cond>\S+)\s*=\s*\bFalse\b",
    "${key}${pre}Not ${cond}", True


2. Not keyword in Do While and Loop While statements

If a Do While or Loop While statement is followed by a condition that is prefixed by the Not keyword, you can drop the Not keyword and transform the statement into Do Until or Loop Until, respectively. For example, the following code
    Do While Not res
        ' ...
    Loop

can be transformed into:
    Do Until res
        ' ...
    Loop


This is the PostProcess pragma that can perform this substitution for you:

'## PostProcess "(?<=\n[ \t]*)(?!.*\b(And|Or|Xor)\b)(?<key>Do|Loop)
    [ \t]+While[ \t]+Not[ \t]+(?<cond>.+)", "${key} Until ${cond}", True


Notice that the regex pattern fails if the condition contains an And, Or, or Xor operator. This check is necessary to avoid to mistakenly match an expression such as this:
    Do While Not res And x = 10

3. Not keyword in Do Until and Loop Until statements

If a Do Until or Loop Until statement is followed by a condition that is prefixed by the Not keyword, you can drop the Not keyword and transform the statement into Do While or Loop While, respectively. For example, the following code
    Do Until Not res
        ' ...
    Loop

can be transformed into:
    Do While res
        ' ...
    Loop


The PostProcess pragma that can perform this substitution is as follows:

'## PostProcess "(?<=\n[ \t]*)(?!.*\b(And|Or|Xor)\b)(?<key>Do|Loop)
    [ \t]+Until[ \t]+Not[ \t]+(?<cond>.+)", "${key} While ${cond}", True


4. Assigning True/False to a variable in a If...Then...Else block

A common (yet inefficient) coding pattern among developers is to use an If…Then…Else block to assign True or False to a Boolean variable. For example, consider the following VB6 code:
    Dim x As Integer, res As Boolean, res2 As Boolean
    ' ...
    If x <> 0 Then
        res = True
    Else
        res = False
    End If            

    If x > 10 And x < 20 Then
        res2 = False
    Else
        res2 = True
    End If


It is evident that it can be simplified (and optimized) as follows:

    res = (x <> 0)
    res2 = Not (x > 10 And x < 20)


Creating a regular expression pattern that allows you capture and transform the If…Then…Else block isn’t a trivial task, also because we need a different PostProcess pragma to account for single-line If statements:

'## PostProcess "(?<=\n[ \t]*)If[ \t]+(?<cond>.+?)[ \t]+Then[ \t]*\r?\n[ \t]
    *\b(?<var>(Return\b|\S+[ \t]*=))[ \t]*True[ \t]*\r?\n[ \t]*Else
    [ \t]*\r?\n[ \t]*\k<var>[ \t]*False[ \t]*\r?\n[ \t]*End[ \t]+If
    [ \t]*\r?\n", "${var} (${cond})\r\n", True

'## PostProcess "(?<=\n[ \t]*)If[ \t]+(?<cond>.+?)[ \t]+Then[ \t]+
    (?<var>(Return\b|\S+[ \t]*=))[ \t]*True[ \t]+Else[ \t]+\k<var>
    [ \t]*False[ \t]*\r?\n", "${var} (${cond})\r\n", True


The next two PostProcess pragmas account for the cases when the variable is assigned False when the condition is true. Again, we need a second pragma to account for single-line IF statements.

'## PostProcess "(?<=\n[ \t]*)If[ \t]+(?<cond>.+?)[ \t]+Then[ \t]*\r?\n[ \t]
    *\b(?<var>(Return\b|\S+[ \t]*=))[ \t]*False[ \t]*\r?\n[ \t]*Else
    [ \t]*\r?\n[ \t]*\k<var>[ \t]*True[ \t]*\r?\n[ \t]*End[ \t]+If
    [ \t]*\r?\n", "${var} Not (${cond})\r\n", True

'## PostProcess "(?<=\n[ \t]*)If[ \t]+(?<cond>.+?)[ \t]+Then[ \t]+
    (?<var>(Return\b|\S+[ \t]*=))[ \t]*False[ \t]+Else[ \t]+\k<var>
    [ \t]*True[ \t]*\r?\n", "${var} Not (${cond})\r\n", True


Notice that the patterns that follow account both for variable assignements and for the Return keyword used to return a value from a function (such Return keyword is used by VB Migration Partner if possible).

5. Assigning different values to a variable inside an If...Then...Else block

Another common coding pattern - which is actually a variant of the previous one - is to use an If…Then…Else block to assign one of two possible values to the same variable, as in these examples:
    If x <> 0 Then
        y = x + 10
    Else
        y = 20
    End If

    ' single-line variant
    If x <> 0 Then y = x + 10 Else y = 20


There is a more concise – though not necessarily more efficient – way to perform the same assignment:
    y = IIf(x <> 0, x + 10, 20)

If you are interested in performing this substitution automatically, you can use the following two pragmas, where the latter works for single-line IF statements:

'## PostProcess "(?<!\bEnd[ \t]+)\bIf[ \t]+(?<cond>.+?)[ \t]+Then[ \t]*\r?\n
    [ \t]*\b(?<var>(Return\b|\S+[ \t]*=))[ \t]*(?<v1>[^\r\n]+)\r?\n
    [ \t]*Else[ \t]*\r?\n[ \t]*\k<var>[ \t]*(?<v2>[^\r\n]+)[ \t]*\r?\n
    [ \t]*End[ \t]+If[ \t]*\r?\n",
    "${var} IIf(${cond}, ${v1}, ${v2})\r\n", True

'## PostProcess "(?<!\bEnd[ \t]+)\bIf[ \t]+(?<cond>.+?)[ \t]+Then[ \t]+
    (?<var>(Return\b|\S+[ \t]*=))[ \t]*(?<v1>[^\r\n]+)\bElse[ \t]+\k<var>
    [ \t]*(?<v2>[^\r\n]+)\r?\n",
    "${var} IIf(${cond}, ${v1}, ${v2})\r\n", True


---------------------------

That's it, folk! If you ever wondered why you should care about regular expressions, I hope I gave you something to think about Wink



Neat tricks for smooth migration of calls to Windows API methods

clock June 13, 2008 03:13

VB Migration Partner does a superb in dealing with Windows API calls. Here's a summary of the features that it supports and that are out of reach for the Upgrade Wizard included in Visual Studio 2005/2008:

  • converts As Any parameters, by creating all the necessary overloads
  • deals correctly with API methods that take a callback address (e.g. EnumWindows, EnumFonts)
  • provides recommendation about the .NET object/method that can effectively replace the API method; we cover 300+ different API calls
  • in a few cases, it automatically replace API calls with calls to the corresponding .NET object/method
  • ensures that string immutability doesn't prevent the VB.NET code from working correctly (see this article)
  • generates the correct MarshalAs attributes for elements in Type (Structure) blocks
  • correctly translates fixed-length strings inside Type blocks, so that they work correctly when passed to the Windows API method
  • automatically initializes static arrays inside Type blocks, so that you don't get unexpected crashes when invoking an API method that expects to find a buffer there
  • creates a wrapper method that ensures that orphaned delegates don't cause an unexpected runtime exception, an advanced programming technique discussed in this KB article
  • includes the VB6WindowsSubclasser class that helps you correctly migrate subclassing-based techniques (as explained here)


In spite of all these innovations, there are cases when you still need to manually edit either the original VB6 code or the converted VB.NET. This happens, for example, if the original code uses the VarPtr, StrPtr, or ObjPtr functions to pass memory pointers to an external API method. These three functions aren't supported under VB.NET and there is no simple way to simulate them.

The good news is that in the vast majority of cases you don't need to deal with memory pointers under .NET, because the .NET Framework offer a valid "pure" alternative to the API method in question. In this article I'll illustrate how you can take advantage of VB Migration Partner features to reduce manual edits to the very minimum and take advantage of the convert-test-fix methodology.

For simplicity's sake, let's focus on one of the simplest API methods, the GetSystemDirectory Windows API. Here's a piece of VB6 code that displays the system directory path:

    ' Main.Bas module
    Public Declare Function GetSystemDirectory Lib "kernel32.dll" Alias "GetSystemDirectoryA" _
        (ByVal lpBuffer As String, ByVal nSize As Long) As Long

    Sub Main()
        Dim buffer As String, length As Long, windir As String
        buffer = Space(256)
        length = GetSystemDirectory(buffer, Len(buffer))
        winDir = Left(buffer, length)
        MsgBox winDir
    End Sub

The first step is in refactoring this code so that you make all the Declares private and move them to another BAS module, that exposes them by means of standard VB6 methods. (If you usually write tidy and maintainable VB6 code, odds are that you have already taken this step.)

    ' This is the APIHelpers.Bas file
    Private Declare Function GetSystemDirectory Lib "kernel32.dll" Alias "GetSystemDirectoryA" _
        (ByVal lpBuffer As String, ByVal nSize As Long) As Long

    ' returns the Windows directory
    Public Function SystemDirectory() As String
        Dim buffer As String, length As Long
        buffer = Space(256)
        length = GetSystemDirectory(buffer, Len(buffer))
        SystemDirectory = Left(buffer, length)
    End Function

The code that actually displays or otherwise uses the Windows directory path is now simpler. Notice that we explicitly include the module name (APIHelpers) in the method call. This tip reduces the odds that another method with same name exists elsewhere in the project, but the technique explained later works even if you don't include such a name:

    ' the Main.Bas module
    Sub Main()
        Dim windir As String
        winDir = APIHelpers.SystemDirectory
        MsgBox winDir
    End Sub

At this point you have a VB6 project that works exactly like the original one, but it is better organized and structured, with all Declares statements gathered in one single module. Let's see how to migrate this code to VB.NET and get rid of all dependencies from non-NET code.

First and foremost, we prepare a VB.NET module that exposes the same methods as the original APIHelpers.bas but doesn't use any Declare statement. Here's how we can render the SystemDirectory function using native .NET calls:

    ' This is the APIHelpers.vb file (VB.NET)
    Module APIHelpers
        Public Function SystemDirectory() As String
            Return Environment.SystemDirectory
        End Function
    End Module

Next, we use an ExcludeCurrentFile pragma to exclude the APIHelpers.bas VB6 module from migration process and we use an AddSourceFile pragma to add the APIHelpers.vb VB.NET file to the converted Visual Studio project. The neat result is that the code in Main now calls the .NET version of the method, which doesn't use any unmanaged calls:

    ' This is the APIHelpers.Bas file
    '## ExcludeCurrentFile
    '## AddSourceFile "c:\vbnet\modules\apihelpers.vb"

    Private Declare Function GetSystemDirectory Lib "kernel32.dll" Alias "GetSystemDirectoryA" _
        (ByVal lpBuffer As String, ByVal nSize As Long) As Long
    ' ... remainder of module as before...

This solution works great, but we can improve it. In fact, a (minor) problem is that the resulting VB.NET code still uses wrapper methods and doesn't look like the "native" .NET code that an experienced VB.NET developer would write. Fear not, because all you need is a project-level PostProcess pragma:

    ' This is the APIHelpers.Bas file
    '## ExcludeCurrentFile
    '## project:PostProcess "(APIHelpers\.)?SystemDirectory", "Environment.SystemDirectory"

Notice that I dropped the AddSourceFile pragma because you don't need the wrapper method any longer (at least in this simplified example). Using similar techniques you can provide a .NET equivalent for most methods that require API calls under VB6, including methods that take arguments.

One of the long-terms goals we have in Code Architects is to apply these concepts on a larger scale to create VB6 helper modules and their corresponding VB.NET versions, to help all VB6 developers to easily migrate their API-intensive applications. Just stay tuned, as usual!



Comparing the & operator, the String.Concat method and the String.Format method

clock February 20, 2008 12:16

You might be aware that VB.NET supports at least 4 techniques to append strings:

    1) the & operator (or the + operator)

    2) the String.Concat static method

    3) the String.Format static method

    4) the StringBuilder object and its Append and Insert methods

By peeking at the code that the VB.NET compiler generates for the & (or +) operator, you can quickly realize that this operator is rendered as a call to the String.Concat method, thus using either technique #1 or #2 has no impact on performance and is just a matter of coding style. I usually prefer the & operator over String.Concat for readibility's sake, and I guess most VBers do the same, but it should be clear that the two are perfectly equivalent if you are concerned only about speed. (NOTE: when working with values other than strings, you might find String.Concat more readable, because it doesn't force you to explicitly invoke the ToString() method of each operand.)

When working with long strings that undergo many concatenations, the StringBuilder object is the fastest technique, period. Too much digital ink has been spilled on this topic, and I won't repeat those well-known concepts here. (BTW, if you're looking for a smart way to replace the & operator with the StringBuilder object without messing up your existing code, read here.)

However, when you have to perform a few concatenations on many short strings, the overhead needed to setup the StringBuilder object often shadows the benefits of its Append method. In these scenarios, the choice is among techniques #1 (or #2, which is equivalent) and technique #3. Today I decided to take some time to compare their performance. I run 100,000 concatenations over two or more 10-char strings. Here are the results on my on a 2.20 GHz Core Duo, 2G RAM Dell notebook running Vista (timings are in milliseconds and are averaged over several runs):

# of operands   Concat    Format
2                  12         60
3                  22         95
4                  28        150
5                 125        160
6                 185        192
7                 210        215
8                 235        239
9                 270        272

To summarise, the Concat method runs 4-5 times faster than Format with 4 operands of fewer, but there is no significant difference with 6 or more operands. It is interesting to notice the steep increase (from 28 to 125 milliseconds) for the Concat method when passing fro 4 to 5 arguments. The reason: there is no String.Concat overload that takes 4 arguments, therefore the VB.NET (and C#) compiler has to build a temporary array with 5 elements and pass it to the String.Concat overload that takes a ParamArray. The same thing happens with the String.Format method when you pass from 3 to 4 arguments.

In general, I prefer to use the String.Format method when appending 3 or more strings, except inside time-critical code. For example, I use it for building error messages and other UI elements; in this case the loss of speed rarely (or never) affects the overall execution speed. Another advantage of String.Format is that you can easily create a table of messages and store them on a database, an XML file, or just a plain text file (possibly stored as a resource). For example, all VB Migration Partner error and warning messages are stored in a format like this:

        Using the '{0}' Windows API method as an argument to the '{1}' function might result in an string filled with spaces. Please split the next line in two statements.

This approach makes the code more readable and maintenable. Plus, localizing the code for a language other than English will be a breeze, if I ever need to.

---------------------------------------

Speaking ofconcatenations, the fact that Concat, Format and other static methods of the System.String class have an overload that takes a ParamArray (and therefore a standard array) makes it possible to implement a few nice tricks. For example, you can quickly concatenate all the elements in an array using this code:

    Dim arr() As String = {"one", "two", "three", "four"}
    Dim res As String = String.Concat(arr)    ' => onetwothreefour

If you need to use a delimiter between each element or concatenate onlya subset of the array, use the String.Join method:

    res = String.Join(" ", arr)         ' => one two three four
    res = String.Join(" ", arr, 1, 2)   ' => two three

When working with arrays of 10 elements or fewer, or you are interested in just the first 10 elements, you can use the Format method in creative ways, as in this code:

    ' display elements in random order
    res = String.Format("{3} {1} {0} {2}", arr)    ' => four two one three 

 


Speed up string concatenations after the migration from VB6

clock December 22, 2007 03:41

When you migrate code from VB6 - regardless of whether you are using VB Migration Partner, the Upgrade Wizard, or another automatic conversion tool - chances are that string-intensive code will actually run slower under VB.NET, if it uses a lot of string concatenation operations. For example, this code takes 2.8 seconds when it runs in VB6 and 27 seconds after its conversion to VB.NET on my 3GHz system:

Dim s As String = ""
For i As Integer = 1 To 100000
    s = s + "*"
Next

The solution is of course trivial: just replace the string variable with a StringBuilder object. Alas, this fix requires that you completely revise the VB.NET code, because you need to replace all + and & operators with the Append method, not to mention cases where the StringBuilder is used as an argument to string functions such as Trim or Len.

Is there a way to speed up the previous code with a minimal impact on the code itself? The answer is yes and the solution is actually very simple: you just need to create a VB.NET class that is based on the System.Text.StringBuilder object, that redifines the + and & operators, and that supports the implicit conversion to and from the System.String type. Authoring such a StringBuilder6 class is a matter of minutes:

' a wrapper for the StringBuilder object, with support for + and & operators

Public Class StringBuilder6

    Private buffer As New System.Text.StringBuilder

    ' return the inner string

    Public Overrides Function ToString() As String
       
Return buffer.ToString()
    End Function

    Public Shared Operator +(ByVal op1 As StringBuilder6, ByVal op2 As String) As StringBuilder6
        op1.buffer.Append(op2)
       
Return op1
    End Operator

    Public Shared Operator &(ByVal op1 As StringBuilder6, ByVal op2 As String) As StringBuilder6
        op1.buffer.Append(op2)
        Return op1
    End Operator

    ' convert to string

    Public Shared Widening Operator CType(ByVal op As StringBuilder6) As String
       
Return op.ToString()
    End Operator

    ' convert from string

    Public Shared Widening Operator CType(ByVal str As String) As StringBuilder6
        Dim op As New StringBuilder6()
        op.buffer.Append(str)
        Return op
    End Operator

End Class

Once the StringBuilder6 class is in place, you just need to replace the type of the String variable:

Dim s As StringBuilder6 = ""

After this edit, the loop runs in 0.008 seconds, that is about 2000 times faster!!! Not bad, for such a simple fix :-)

The StringBuilder6 class is so useful that we decided to include it in the forthcoming version 0.92 of VB Migration Partner. Even better, you can use a SetType pragma right in the original VB6 code to tell our tool that a given string variable must be rendered with this new type:

'## s.SetType StringBuilder6
Dim s As String = ""
For i As Integer = 1 To 100000
    s = s + "*"
Next