0

I have hierarchical data like this

Country Region  Category       ProgramName
USA     North   SchoolName     A
USA     North   SchoolName     B
USA     South   SchoolName     C
Brasil  East    SchoolName     D
Brasil  East    CollegeName    E
Brasil  West    CollegeName    F

I would like to pivot it into a user readable format.

Pivot

I am able to build the pivot table, however I would like to use nonnumeric data as the pivot. The VBA code in this answer seems promising but it can only pivot a single nonhierarchical column. How can I achieve my goal?

jjjjjjjjjjj
  • 417
  • 8
  • 28
  • You could add a data column translating the letters to numbers, e.g. 1..6 in your case, do the pivot on this column, copy/paste the Pivot Table's content and substitute back the letters. – Excelosaurus Nov 10 '17 at 21:42
  • @Excelosaurus The list is always changing so unfortunately I can't create a numerical reference table. If there is more than one `SchoolName` for a region and country the then number is aggregated. – jjjjjjjjjjj Nov 10 '17 at 21:46
  • I'm confused: you say that the number is aggregated. What number? – jeffreyweir Nov 10 '17 at 21:50
  • Argh, had missed the "concatenated" A & B. @jeffrey, OP was responding to my (unusable) suggestion. – Excelosaurus Nov 10 '17 at 21:50
  • If I understand correctly, 'Data' isn't actually data. It's an attribute. So give it another name (I'll use "Program" here) and add a new column called 'Data' and put a 1 in it for every line. Then put that numerical Data field into the PivotTable Data area, and put the Program field in the rows area. Then you can use number formatting to make the 1 values in the Values area look like tick marks or something. – jeffreyweir Nov 10 '17 at 21:54
  • @jeffreyweir That's correct, it's an attribute. Edited for clarity. – jjjjjjjjjjj Nov 10 '17 at 22:14
  • I have the hierarchical data as column attribute, and Category as row attribute. If I use your suggested method, I can only pivot along the row OR column attribute, but I need both. – jjjjjjjjjjj Nov 10 '17 at 22:15
  • Should I try modifying the macro in my link to use VLOOKUP for the row and hierarchical column names? – jjjjjjjjjjj Nov 10 '17 at 22:16
  • Either I still don't understand your requirements, or you simply can't do what you are trying to do. How can both A and B show in the Data area of the Pivot at the same time? Pivots are aggregation machines. Yet in your sample output, A and B are shown separately. You just can't do this, code or no code. The nearest you will get using a PivotTable is as I show in the image of my answer, but I'm unclear if this is what you want. – jeffreyweir Nov 10 '17 at 23:11
  • Clarification on the above: if A and B were numbers, a PivotTable would add them together and show the result at the relevant 'intersection' in the Values area. I understand you want text to show in the data area of the Pivot, which is why you are linking to my code at the other thread. But that code is still 'effectively' converting aggregated numbers that a PivotTable serves up to text, by mapping the aggregation to a text string. But you are doing something different: trying to feed the VALUES area text, in a way where the PivotTable doesn't aggregate duplicates. My code won't help here. – jeffreyweir Nov 10 '17 at 23:22
  • How many distinct Program Names might there be in your source table? – Excelosaurus Nov 11 '17 at 00:31
  • @jeffreyweir I want to pivot it into the mockup in the post. – jjjjjjjjjjj Nov 14 '17 at 16:08
  • @Excelosaurus I can't say for sure. Program Names are added and changed regularly. Let's say several hundred. – jjjjjjjjjjj Nov 14 '17 at 16:09
  • I've built a crude text transposition class (my answer). – Excelosaurus Nov 14 '17 at 16:21

2 Answers2

2

I couldn't find code lying around on the 'net to do just what you're looking for. It might be possible through some Get & Transform sorcery, but that's not my field of expertise. Because it's an interesting problem and because I can think of use cases for my own projects, here's my take on it.

Disclaimer: this code is hot off the stove and hasn't been thoroughly tested. Use at your own risk.

First, create a new workbook and, on Sheet1, set these values starting at cell A1 (I've added the SubCategory column for testing purposes):

Country Region  Category     SubCategory  ProgramName
USA     North   SchoolName   X            A
USA     North   SchoolName   X            B
USA     South   SchoolName   Y            C
Brasil  East    SchoolName   Y            D
Brasil  East    CollegeName  X            E
Brasil  West    CollegeName  Y            F

Then, create a class module named CTextTransposer and paste this code into it:

Option Explicit

Private Const DEFAULT_VALUES_SEPARATOR As String = ", "

Private m_rngSource As Excel.Range
Private m_dicAcrossSourceColumnIndexes As Object 'Scripting.Dictionary
Private m_dicDownSourceColumnIndexes As Object 'Scripting.Dictionary
Private m_lDataSourceColumnIndex As Long
Private m_bRepeatAcrossHeaders As Boolean
Private m_bRepeatDownHeaders As Boolean
Private m_sKeySeparator As String
Private m_sValuesSeparator As String

Private Sub Class_Initialize()
    Set m_dicAcrossSourceColumnIndexes = CreateObject("Scripting.Dictionary")
    Set m_dicDownSourceColumnIndexes = CreateObject("Scripting.Dictionary")
    m_sKeySeparator = ChrW(&HFFFF)
    m_sValuesSeparator = DEFAULT_VALUES_SEPARATOR
End Sub

Private Sub Class_Terminate()
    On Error Resume Next
    Set m_rngSource = Nothing
    Set m_dicAcrossSourceColumnIndexes = Nothing
    Set m_dicDownSourceColumnIndexes = Nothing
End Sub

Public Sub Init(ByVal prngSource As Excel.Range)
    Set m_rngSource = prngSource
End Sub

Public Sub SetAcross(ByVal psSourceColumnHeader As String)
    StoreHeaderColumnIndex m_dicAcrossSourceColumnIndexes, psSourceColumnHeader
End Sub

Public Sub SetDown(ByVal psSourceColumnHeader As String)
    StoreHeaderColumnIndex m_dicDownSourceColumnIndexes, psSourceColumnHeader
End Sub

Public Sub SetData(ByVal psSourceColumnHeader As String)
    m_lDataSourceColumnIndex = GetHeaderColumnIndex(psSourceColumnHeader)
End Sub

Public Property Let RepeatAcrossHeaders(ByVal value As Boolean)
    m_bRepeatAcrossHeaders = value
End Property

Public Property Get RepeatAcrossHeaders() As Boolean
    RepeatAcrossHeaders = m_bRepeatAcrossHeaders
End Property

Public Property Let RepeatDownHeaders(ByVal value As Boolean)
    m_bRepeatDownHeaders = value
End Property

Public Property Get RepeatDownHeaders() As Boolean
    RepeatDownHeaders = m_bRepeatDownHeaders
End Property

Public Property Let ValuesSeparator(ByVal value As String)
    m_sValuesSeparator = value
End Property

Public Property Get ValuesSeparator() As String
    ValuesSeparator = m_sValuesSeparator
End Property

Private Sub StoreHeaderColumnIndex(ByRef pdicTarget As Object, ByVal psColumnHeader As String)
    pdicTarget(GetHeaderColumnIndex(psColumnHeader)) = True
End Sub

Private Function GetHeaderColumnIndex(ByVal psColumnHeader As String) As Long
    GetHeaderColumnIndex = Application.WorksheetFunction.Match(psColumnHeader, m_rngSource.Rows(1), 0)
End Function

Public Sub TransposeTo( _
    ByVal prngDestinationTopLeftCell As Excel.Range, _
    ByRef prngDownColumnHeaders As Excel.Range, _
    ByRef prngAcrossColumnHeaders As Excel.Range, _
    ByRef prngRowColumnHeaders As Excel.Range, _
    ByRef prngData As Excel.Range)

    Dim dicAcrossArrays As Object 'Scripting.Dictionary
    Dim dicDownArrays As Object 'Scripting.Dictionary
    Dim dicDistinctAcross As Object 'Scripting.Dictionary
    Dim dicDistinctDown As Object 'Scripting.Dictionary
    Dim vntSourceData As Variant
    Dim vntSourceColumnIndex As Variant
    Dim lSourceRowIndex As Long
    Dim lDestinationColumnIndex As Long
    Dim lDestinationRowIndex As Long
    Dim sAcrossKey As String
    Dim sDownKey As String
    Dim vntKey As Variant
    Dim vntKeyParts As Variant
    Dim lKeyPartIndex As Long

    If m_rngSource Is Nothing Then
        prngDestinationTopLeftCell.Value2 = "(Not initialized)"
    ElseIf (m_dicAcrossSourceColumnIndexes.Count = 0) Or (m_dicDownSourceColumnIndexes.Count = 0) Or (m_lDataSourceColumnIndex = 0) Then
        prngDestinationTopLeftCell.Value2 = "(Not configured)"
    ElseIf m_rngSource.Rows.Count = 1 Then
        prngDestinationTopLeftCell.Value2 = "(No data)"
    Else
        InitColumnIndexDictionaries m_dicAcrossSourceColumnIndexes, dicAcrossArrays, dicDistinctAcross
        InitColumnIndexDictionaries m_dicDownSourceColumnIndexes, dicDownArrays, dicDistinctDown
        vntSourceData = m_rngSource.Columns(m_lDataSourceColumnIndex)

        'Down column headers.
        ReDim downColumnHeaders(1 To 1, 1 To m_dicDownSourceColumnIndexes.Count) As Variant
        lDestinationColumnIndex = 1
        For Each vntSourceColumnIndex In m_dicDownSourceColumnIndexes.Keys
            downColumnHeaders(1, lDestinationColumnIndex) = m_rngSource.Cells(1, vntSourceColumnIndex).value
            lDestinationColumnIndex = lDestinationColumnIndex + 1
        Next
        Set prngDownColumnHeaders = prngDestinationTopLeftCell.Resize(1, m_dicDownSourceColumnIndexes.Count)
        prngDownColumnHeaders.value = downColumnHeaders

        'Across column headers.
        ReDim acrossColumnHeaders(1 To m_dicAcrossSourceColumnIndexes.Count, 1 To dicDistinctAcross.Count) As Variant
        lDestinationColumnIndex = 1
        For Each vntKey In dicDistinctAcross.Keys
            vntKeyParts = Split(vntKey, m_sKeySeparator, Compare:=vbBinaryCompare)
            For lKeyPartIndex = 0 To UBound(vntKeyParts)
                acrossColumnHeaders(lKeyPartIndex + 1, lDestinationColumnIndex) = vntKeyParts(lKeyPartIndex)
            Next
            lDestinationColumnIndex = lDestinationColumnIndex + 1
        Next
        If Not m_bRepeatAcrossHeaders Then
            For lDestinationRowIndex = 1 To m_dicAcrossSourceColumnIndexes.Count
                For lDestinationColumnIndex = dicDistinctAcross.Count To 2 Step -1
                    If acrossColumnHeaders(lDestinationRowIndex, lDestinationColumnIndex) = acrossColumnHeaders(lDestinationRowIndex, lDestinationColumnIndex - 1) Then
                        acrossColumnHeaders(lDestinationRowIndex, lDestinationColumnIndex) = Empty
                    End If
                Next
            Next
        End If
        Set prngAcrossColumnHeaders = prngDestinationTopLeftCell.Cells(1, m_dicDownSourceColumnIndexes.Count + 1).Resize(m_dicAcrossSourceColumnIndexes.Count, dicDistinctAcross.Count)
        prngAcrossColumnHeaders.value = acrossColumnHeaders

        'Down row headers.
        ReDim downRowHeaders(1 To dicDistinctDown.Count, 1 To m_dicDownSourceColumnIndexes.Count) As Variant
        lDestinationRowIndex = 1
        For Each vntKey In dicDistinctDown.Keys
            vntKeyParts = Split(vntKey, m_sKeySeparator, Compare:=vbBinaryCompare)
            For lKeyPartIndex = 0 To UBound(vntKeyParts)
                downRowHeaders(lDestinationRowIndex, lKeyPartIndex + 1) = vntKeyParts(lKeyPartIndex)
            Next
            lDestinationRowIndex = lDestinationRowIndex + 1
        Next
        If Not m_bRepeatDownHeaders Then
            For lDestinationRowIndex = dicDistinctDown.Count To 2 Step -1
                For lDestinationColumnIndex = 1 To m_dicDownSourceColumnIndexes.Count
                    If downRowHeaders(lDestinationRowIndex, lDestinationColumnIndex) = downRowHeaders(lDestinationRowIndex - 1, lDestinationColumnIndex) Then
                        downRowHeaders(lDestinationRowIndex, lDestinationColumnIndex) = Empty
                    End If
                Next
            Next
        End If
        Set prngRowColumnHeaders = prngDestinationTopLeftCell.Cells(m_dicAcrossSourceColumnIndexes.Count + 1, 1).Resize(dicDistinctDown.Count, m_dicDownSourceColumnIndexes.Count)
        prngRowColumnHeaders.value = downRowHeaders

        'Data.
        ReDim vntDestinationData(1 To dicDistinctDown.Count, 1 To dicDistinctAcross.Count) As Variant
        For lSourceRowIndex = 2 To m_rngSource.Rows.Count
            sAcrossKey = GetKey(m_dicAcrossSourceColumnIndexes, dicAcrossArrays, lSourceRowIndex)
            sDownKey = GetKey(m_dicDownSourceColumnIndexes, dicDownArrays, lSourceRowIndex)
            lDestinationColumnIndex = dicDistinctAcross(sAcrossKey)
            lDestinationRowIndex = dicDistinctDown(sDownKey)
            vntDestinationData(lDestinationRowIndex, lDestinationColumnIndex) = vntDestinationData(lDestinationRowIndex, lDestinationColumnIndex) & m_sValuesSeparator & vntSourceData(lSourceRowIndex, 1)
        Next
        For lDestinationRowIndex = 1 To dicDistinctDown.Count
            For lDestinationColumnIndex = 1 To dicDistinctAcross.Count
                If Not IsEmpty(vntDestinationData(lDestinationRowIndex, lDestinationColumnIndex)) Then
                    vntDestinationData(lDestinationRowIndex, lDestinationColumnIndex) = Mid$(vntDestinationData(lDestinationRowIndex, lDestinationColumnIndex), Len(m_sValuesSeparator) + 1)
                End If
            Next
        Next
        Set prngData = prngDestinationTopLeftCell.Cells(1 + m_dicAcrossSourceColumnIndexes.Count, 1 + m_dicDownSourceColumnIndexes.Count).Resize(dicDistinctDown.Count, dicDistinctAcross.Count)
        prngData.value = vntDestinationData
    End If

    Set dicAcrossArrays = Nothing
    Set dicDownArrays = Nothing
    Set dicDistinctAcross = Nothing
    Set dicDistinctDown = Nothing
End Sub

Private Sub InitColumnIndexDictionaries(ByVal pdicSourceColumnIndexes As Object, ByRef pdicArrays As Object, ByRef pdicDistinct As Object)
    Dim vntSourceColumnIndex As Variant
    Dim lSourceRowIndex As Long
    Dim sKey As String

    Set pdicArrays = CreateObject("Scripting.Dictionary")
    Set pdicDistinct = CreateObject("Scripting.Dictionary")

    For Each vntSourceColumnIndex In pdicSourceColumnIndexes.Keys
        pdicArrays(vntSourceColumnIndex) = m_rngSource.Columns(vntSourceColumnIndex).value
    Next

    For lSourceRowIndex = 2 To m_rngSource.Rows.Count
        sKey = GetKey(pdicSourceColumnIndexes, pdicArrays, lSourceRowIndex)
        If Not pdicDistinct.Exists(sKey) Then
            pdicDistinct(sKey) = pdicDistinct.Count + 1
        End If
    Next
End Sub

Private Function GetKey(ByVal pdicSourceColumnIndexes As Object, ByVal pdicArrays As Object, ByVal plSourceRowIndex As Long) As String
    Dim sResult As String
    Dim vntSourceColumnIndex As Variant

    sResult = ""

    For Each vntSourceColumnIndex In pdicSourceColumnIndexes.Keys
        sResult = sResult & m_sKeySeparator & CStr(pdicArrays(vntSourceColumnIndex)(plSourceRowIndex, 1))
    Next
    sResult = Mid(sResult, 2)

    GetKey = sResult
End Function

Finally, create a module and paste this code into it:

Option Explicit

Public Sub TestTextTransposer()
    On Error GoTo errHandler

    Dim oTT As CTextTransposer
    Dim rngDownColumnHeaders As Excel.Range
    Dim rngAcrossColumnHeaders As Excel.Range
    Dim rngDownRowHeaders As Excel.Range
    Dim rngData As Excel.Range

    Application.ScreenUpdating = False
    Application.EnableEvents = False

    Set oTT = New CTextTransposer
    With oTT
        .Init Sheet1.Cells(1, 1).CurrentRegion

        .SetAcross "Country"
        .SetAcross "Region"

        .SetDown "Category"
        .SetDown "SubCategory"

        .SetData "ProgramName"

        .RepeatAcrossHeaders = False
        .RepeatDownHeaders = False
        .ValuesSeparator = vbLf

        .TransposeTo Sheet1.Cells(10, 8), rngDownColumnHeaders, rngAcrossColumnHeaders, rngDownRowHeaders, rngData
    End With

    Application.Union(rngDownRowHeaders, rngAcrossColumnHeaders).EntireColumn.AutoFit
    Application.Union(rngAcrossColumnHeaders, rngDownRowHeaders).EntireRow.AutoFit
    rngDownRowHeaders.VerticalAlignment = xlTop

Recover:
    On Error Resume Next
    Set rngData = Nothing
    Set rngDownRowHeaders = Nothing
    Set rngAcrossColumnHeaders = Nothing
    Set rngDownColumnHeaders = Nothing
    Set oTT = Nothing
    Application.EnableEvents = True
    Application.ScreenUpdating = True
    Exit Sub

errHandler:
    MsgBox Err.Description, vbExclamation + vbOKOnly, "Error"
    Resume Recover
End Sub

Run the TestTextTransposer sub and observe the results starting on Sheet1, cell H10. Looking at the test code, you'll see I've used all options offered by the class, plus I've made use of the ranges it returns to do some basic formatting.

I won't explain all of the details here, but you'll see it boils down to a few dictionaries and some array manipulations. Hope it helps.

Note: as posted, the classe's dictionaries keyed by strings are case-sensitive, so your source data has to be prepared with this in mind. This can easily parameterized by adding another property to the class.

Here's the end result (with a little more formatting applied): enter image description here

jeffreyweir
  • 4,668
  • 1
  • 16
  • 27
Excelosaurus
  • 2,789
  • 1
  • 14
  • 20
  • 1
    I'd still run with a PivotTable myself, but this is really cool! Looking forward to learning from your code. – jeffreyweir Nov 14 '17 at 20:16
  • 1
    Thanks for your edit. You'll only learn about bad Hungarian notation habits ;-) – Excelosaurus Nov 14 '17 at 20:36
  • I copied the test data to cell A1 in Sheet1. I get `Unable to get the Match property of the WorksheetFunction class` and have narrowed it down to `GetHeaderColumnIndex = Application.WorksheetFunction.Match(psColumnHeader, m_rngSource.Rows(1), 0)`. Not sure what the cause is. – jjjjjjjjjjj Nov 16 '17 at 16:28
  • I run the macro from the Visual Basic window. Would this be a cause? – jjjjjjjjjjj Nov 16 '17 at 16:33
  • You have to distribute the data across rows and columns, so A1 contains "Country", B1 contains "Region", A2 contains "USA", etc. – Excelosaurus Nov 16 '17 at 17:04
  • Don't give up! @jeffreyweir could make it work (he's the one who added an image of the results). `MATCH()` is looking for an exact match; do you have trailing spaces in the column headers maybe? When the error occurs, what's the value of psColumnHeader and what's the address of m_rngSource? – Excelosaurus Nov 16 '17 at 17:17
  • The error shows up in a window and doesn't tell me the variable values. I'll check how to output to the Immediate window. – jjjjjjjjjjj Nov 16 '17 at 17:54
  • Sheet1 isn't the only sheet in the workbook. Could it be this? – jjjjjjjjjjj Nov 16 '17 at 17:58
  • 1
    Sure could. Replace Sheet1 by whichever sheet codename contains your data. A sheet's CodeName is its (Name) property as seen from the VB editor (click on the worksheet in the Project Explorer; you can change (Name) to anything you see fit). Or you could use `ThisWorkbook.Worksheets("YourSheetName")` but that ties your code to the worksheet's tab name in Excel, which isn't desirable. – Excelosaurus Nov 16 '17 at 18:02
  • I never knew that! The user facing name was Sheet1, but the MS object name was Sheet11. The code worked after I modified the MS object name in the module. – jjjjjjjjjjj Nov 16 '17 at 20:36
  • @jeffreyweir and Excelosaurus, thanks very much to both of you for your help! I wish I could vote you up more than once. – jjjjjjjjjjj Nov 17 '17 at 16:17
1

So from your answer it sounds like you want this:

enter image description here

But PivotTables actually give you a much better way of viewing the exact same information natively, like this:

enter image description here

...the bonus being that there's no repetition of those G rows...instead you get a count. But other than that, you get the exact same information from either. Any particular reason why you don't want the 'native' PivotTable layout?

jeffreyweir
  • 4,668
  • 1
  • 16
  • 27
  • Not quite. I'd like the numbers replaced by text. There can be more than one ProgramName per Country, Region and Category. – jjjjjjjjjjj Nov 14 '17 at 16:12
  • 1
    Re "there can be more than one ProgramName pre Country, Region and Category": The above Pivot would handle that. Let's say you had two more ProgramNames called G for "CollegeName", "Brasil" and "North". When you add that to your data and refresh the Pivot, you've got a new line showing just that, with a count of 2. Isn't that what you are trying to show? – jeffreyweir Nov 14 '17 at 19:37
  • 1
    Sure, you could have some way to replace each number with a letter, and instead of having the one line for "CollegeName", "Brasil", and "North" you could have two lines, with F shown in the values area. But that ultimately does not represent any different 'dimension' of your data than shown above. – jeffreyweir Nov 14 '17 at 19:39
  • The native layout works to convey the information, but this chart will be presented to others. I'd like to avoid forcing people to look back and forth for the name if possible. – jjjjjjjjjjj Nov 16 '17 at 16:31
  • Two words: Freeze Panes. – jeffreyweir Nov 16 '17 at 19:32
  • It may be printed out, so I want to make sure it works in Excel and as a static table. – jjjjjjjjjjj Nov 16 '17 at 20:21
  • Apologies for all the requirements, I didn't think this would be such a complicated task! – jjjjjjjjjjj Nov 16 '17 at 20:21
  • Okay. So you face a trade-off: use the native functionality, but make the user move their eyes. Or use VBA code, that saves on eye movements, but adds significant complexity. Complexity that you and whoever inherits your work had best understand completely, lest something break in future. Personally I'd err for the simpler option. I'd probably add two columns on the right hand side that just point at the two columns on the left hand side, lest someone find themselves looking at data at the right of the PivotTable. That way their eyes have less distance to travel. – jeffreyweir Nov 16 '17 at 20:26