Ty Anderson

Populating Excel workbooks and ranges with arrays

Microsoft Excel, with its almost endless supply of rows and columns, is data DisneyLand. It’s fun to work with data and make it easy to consume. Part of the fun, at least for developers, is populating Excel with data. Today, I want to take another look at how to use arrays to populate Excel with data.

A tool

 

The scenario

You have an array filled with data and you want to insert that data into an Excel spreadsheet. You would be considered quite reasonable if you attempted to create an array and insert it. Something like this:

Dim myArray as Object () = GetMyData
Excel.Range(myRange).Value = MyArray

This strategy will insert data but it will not make you happy. What you will see is each cell within the range receives the same data… the first field in the array.

We want to populate Excel with data as efficiently as possible. This means we are not going to navigate a worksheet row-by-row, column-by-column to insert data. That would be what you call “slower than a West Texas tumbleweed“. What we need is a strategy that allows us to insert the data in a single bound… or one fell swoop… or lickety split.

The basic strategy to work with Excel arrays

I touched on the strategy two articles ago in Working with Excel tables and ranges. If you charge ahead and try to do this on your own, you might be tempted to create a 1D array and insert it into a cell of your choosing. Don’t do this. It will hurt you because it won’t work. Instead, heed this wisdom and create at least a 2D array. For the convenience’s sake, I’ve include my code sample from two articles ago (I measure time not in minutes, but in articles).

Private Sub ArrayByRandomLoop(targetRange As Excel.Range)
    'We'll use a dynamically silly array (or is it silly-ly dynamic?)
    'But it is easy to imagine creating your own array dynamically
    Dim dataToInsert As Object(,) = New Object(5, 1) {}
 
    For i As Integer = 0 To 5
        dataToInsert(i, 0) = "Row " + i.ToString()
    Next
 
    targetRange.Value2 = dataToInsert
End Sub

This sample is ideal for benign scenarios (i.e. somewhat lame code samples) that 1) are not dynamic and 2) require only a single column of data. Let’s turn up the volume just a little.

How to prepare your arrays for Excel ranges

Arrays are not created equally. They vary in shape, size, and purpose. Some are completely filled with data; others are less filled and/or empty. When inserting into Excel, we want to work an arrays that have every element filled. Thus, before we insert an array into a spreadsheet, we should prepare them.

Empty arrays

You can check if an array is completely empty by checking each element for Nothing. If each one is Nothing, we know we have an empty array.

Private Function CheckIsEmptyArray(theArray As System.Array) As Boolean
    For i As Integer = 0 To theArray.GetUpperBound(0)
        For j As Integer = 0 To theArray.GetUpperBound(1)
            If theArray.GetValue(i, j) IsNot Nothing Then
                Return False
            End If
        Next
    Next
 
    Return True
End Function

If so much as one element within the array is not Nothing, we know the array has at least a little value.

Empty values

To find empty values, we do the same as with looking for empty arrays. We look at each element to determine if its value is empty (Nothing). But here there is a twist, can you spot it?

Private Sub InsertValuesWhereEmpty(theArray As System.Array)
    For iRow As Integer = 0 To theArray.GetUpperBound(0) 'The number of rows
        For iColumn As Integer = 0 To theArray.GetUpperBound(1) 'The number of columns
            Dim val As Object = theArray(iRow, iColumn)
            If val Is Nothing Then
                theArray(iRow, iColumn) = CVErr(ADXxlCVError.xlErrNull)
            End If
        Next
    Next
End Sub
 
Public Function CVErr(ByVal xlError As ADXxlCVError) As Object
    Dim hr As Integer = &H800A0000 + xlError
    Return New ErrorWrapper(hr)
End Function

The purpose of this method is to prepare the array by ensuring every element has a value. The twist is that when it finds an empty element, it inserts an Excel error using one of the ADXxlCVError constants. The effect is that, when eventually inserted into a spreadsheet, the user will quickly notice the cells with missing data.

There is a call to another CVErr function to convert the chosen constant to an error recognized by Excel.

Using Value versus Value2 versus Formula

When inserting arrays in to Excel spreadsheets, you have three property choices: Value, Value2, and Formula. Which one is the correct one? Before I give my opinion, let’s consider each of them first.

  • Value: This is a parameterized property that lets you get and set the value for a specified range.
  • Value2: Same as value BUT it doesn’t support Currency and Date data types. Also, it does not require a parameter.

Whether you use one over the other probably depends on your preferred language. Developers using C# tend to use Value2 as Value doesn’t even show up in Intellisense. Those of you who prefer VB.NET probably use Value2. For more background, you can read Eric Carter’s comments of Value versus Value2.

But there is another…

  • Formula: This is the formula string of the range. This is exactly what you see displayed in the Excel formula bar when you select a cell in the Excel window. It’s common to insert a valid formula into this property… and thus set its value. I mention it here because if you select a 1D or 2D array, you can use an Excel array of the same dimensions to set the range’s Formula.

That’s that theory. I tend to utilize Value2 no matter what language I use.

Fill array with a query and insert into a worksheet

Lots of people like to fill an Excel workbook with data from a database. They then like to go completely nuts analyzing that data. For these types of users we can do some automation.

We can connect to a database.
We can query the database.
We can build an array and insert that array into the spreadsheet.

Take a look…

Private Sub ArrayByQuery(targetRange As Excel.Range)
    Dim cnn As OleDbConnection = New OleDbConnection()
    cnn.ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;" + _
        "Data Source=[INSERT YOUR PATH HERE]\Northwind2007.accdb;Persist Security Info=False;"
 
    Dim sql As String = "SELECT Top 11 Customers.Company, Customers.[First Name], " + _
        "Customers.[Job Title], Customers.[State/Province] From Customers;"
    Dim cmd As OleDbCommand = New OleDbCommand(sql, cnn)
    cnn.Open()
 
    Dim reader As OleDbDataReader = cmd.ExecuteReader()
 
    If reader.HasRows Then
        'BUILD the array
        Dim dataToInsert As Object(,) = New Object(10, 3) {}
        Dim i As Integer = 0
 
        While reader.Read
            dataToInsert(i, 0) = reader(0).ToString()
            dataToInsert(i, 1) = reader(1).ToString()
            dataToInsert(i, 2) = reader(2).ToString()
            dataToInsert(i, 3) = reader(3).ToString()
            i = i + 1
        End While
 
        targetRange.Value2 = dataToInsert
    Else
        'Maybe give the user a message to tell them "yo, you don't have any data."
    End If
End Sub

The ArrayByQuery method does all these things. I admit I cheated and hard-coded the array dimensions. There are lots of ways to build arrays. The point here is that after populating the array with data, you simply insert it into a range. That’s it and that’s all.

Fill the array with a range and insert into another worksheet

Databases are fine and dandy and are a common contributor to worksheet data. But, all too often users need to copy data from other sections of an Excel workbook. The ArrayByRange sample accepts a string that represents a named range. It then copies it to the current selection.

Private Sub ArrayByRange(rangeName As String)
    Dim dataToInsert(,) As Object
 
    Dim destinationRange As Excel.Range = ExcelApp.Selection
    Dim sourceRange As Excel.Range = Nothing
    Try
        sourceRange = ExcelApp.Range(rangeName)
    Catch
    End Try
 
    If sourceRange IsNot Nothing Then
        ReDim dataToInsert(0 To sourceRange.Rows.Count, 0 To sourceRange.Columns.Count)
        dataToInsert = sourceRange.Value
        destinationRange.Value2 = dataToInsert
 
        Marshal.ReleaseComObject(sourceRange)
    End If
 
    Marshal.ReleaseComObject(destinationRange)
End Sub

It works perfectly. Give it a try and you’ll see what I mean.

The fastest way – generate Excel files directly

These samples work with the Excel object model and its overhead. This is a suitable strategy if you are not looking to do bulk, array-insertion actions. But if you do need some to scale the solution, I recommend you re-architect to build Excel files directly.

To do this, you need to avoid the Excel object and utilize other APIs. Here are a few that let you do this.

  • Open XML SDK: This is the Office team’s API for creating Microsoft Office documents without the need for Office. It’s free and fairly well documented.
  • EPPlus: This .NET library is specific to Excel. It simplifies usage of the OpenXML SDK. It has a 5-star rating on Codeplex and I recommend it over trying directly working with the Open XML SDK.
  • ASPOSE Cells for .NET: This is a commercial (i.e. not free) .NET library that also allows you to build Excel files directly. It has a wider feature set than EPPlus (it should, given it is not free) including specific use cases for inserting arrays.

***

Given these are samples and are meant to wet your whistle but not quench your thirst. I think you can easily extend these code examples to do more than just copy data. For example, in the second sample, you could copy the range, perform a calculation, then insert the results. As I heard in a conference call recently, “The possibilities are unlimitless!” No kidding.

Available downloads:

This sample Excel add-in was developed using Add-in Express for Office and .net:

Excel Populate With Arrays Add-in (VB.NET)

You may also be interested in:

5 Comments

  • https://secure.gravatar.com/avatar/0221b06217daa8ff7d5dfc4ac5cb3020?s=32&d=https%3A%2F%2Fsecure.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D32&r=G Benny says:

    hi. I would like to ask is it possible to insert data into Excel using vb coding from Infopath form in Sharepoint? If possible, can you please show me the step as well as coding to do it? Thank you in advance for your feedback. :)

  • https://secure.gravatar.com/avatar/c68f03884efb33fd95bcfbd9f1078c07?s=32&d=https%3A%2F%2Fsecure.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D32&r=G Ty Anderson says:

    Hi Benny,

    I believe you can using Excel Automation:
    http://support.microsoft.com/kb/219151

    You can automate Excel from InfoPath to create a spreadsheet and write your data to it.
    Another option, is to avoid Excel Automation and use the Open Office XML API to build the spreadsheet without any need to automate Excel.

  • https://secure.gravatar.com/avatar/52f584d8ef9d74547e26e350f25eee71?s=32&d=https%3A%2F%2Fsecure.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D32&r=G Bjorn says:

    Hi, thank you for the explanation about the multi dimensional array being used to fill the excel cells accordingly when parsing the array to a range. Big help, thanks.

  • https://secure.gravatar.com/avatar/f45e46a087c466eeec4f2f7584060098?s=32&d=https%3A%2F%2Fsecure.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D32&r=G Samsudeen says:

    My visual studio code is not working and need some help please.

    I have created a template on Excel and am planning to populate the table via a code in Visual studio. Code created is;

    “Actuals”: [
    {
    “startDate”:”2017-03-01″,
    “production_tasp_mwh”: 222762.4241,
    “availability_contractual”: 0.9830,
    “wind_energy_content”: 114.1,
    “capacity_factor”: 0.475,
    “availability_park”: 0.9516,
    “availability_production_based”: “0.9244”
    },
    ],
    “Budget”: [
    {
    “startDate”:”2017-03-01″,
    “production_tasp_mwh”: 208636.0,
    “availability_contractual”: 0.975,
    “wind_energy_content”: 108.0,
    “capacity_factor”: 0.548,
    “availability_park”: null,
    “availability_production_based”: “null”

    ],
    “Productions”: [
    {
    “month_start_date”: “2017-03-01″,
    “budget”: 208636.0,
    “actual”: 222762.0,
    “budget_YTD”: 695452.0,
    “actual_YTD”: “621272”
    }

    The table template created is

    Production
    Month Budget Actual
    [[Production.month_start_date]] [[Budget.production]] [[Actual.production

    Budget YTD Actual YTD
    [[Budget_YTD.production]] [[Actual_YTD.production]]

    The code is working but the values are not populated in the template. Not sure why and any help much appreciated.

  • https://secure.gravatar.com/avatar/29957f26ad2d8ba527fd9cc8cfa7b2e0?s=32&d=https%3A%2F%2Fsecure.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D32&r=G Andrei Smolin (Add-in Express Team) says:

    Hello Samsudeen,

    > Code created is

    What you show is *not* an array. Please re-check the article.

Post a comment

Have any questions? Ask us right now!