1

I am receiving a large file 500k+ lines but all the content is in column A. I need to run a macro that will transpose the data into matrix form but will only create a new row when it finds "KEY*" in the ActiveCell. For example:

| KEY 4759839 | asljhk | 35049 | | sklahksdjf|
| KEY 359     | skj    | 487   |y| 2985789   |

The above data in my file would originally look like this in column A:

KEY 4759839
asljhk
35049

sklahksdjf
KEY 359
skj
487
y
2985789

Considerations:

  • Blank cells need to be transposed as well, so the macro cant stop based on emptyCell
  • The number of cells between KEY's is not constant so it actually needs to read the cell to know if it should create a new row
  • It can either stop based on say 20 empty cells in a row or prompt for a max row number
  • (Optional) It would be nice if there was some sort of visual indicator for the last item in a row so that its possible to tell if the last item(s) were blank cells

I searched around and found a macro that had the same general theme but it went based on every 6 lines and I did not know enough to try to modify it for my case. But in case it helps here it is:

Sub kTest()
    Dim a, w(), i As Long, j As Long, c As Integer
    a = Range([a1], [a500000].End(xlUp))
    ReDim w(1 To UBound(a, 1), 1 To 6)
    j = 1
    For i = 1 To UBound(a, 1)
        c = 1 + (i - 1) Mod 6: w(j, c) = a(i, 1)
        If c = 6 Then j = j + 1
    Next i
    [c1].Resize(j, 6) = w
End Sub

I would greatly appreciate any help you can give me!

assylias
  • 321,522
  • 82
  • 660
  • 783
tehaaron
  • 2,250
  • 10
  • 31
  • 54

2 Answers2

2

This works with the sample data you provided in your question - it outputs the result in a table starting in B1. It runs in less than one second for 500k rows on my machine.

Sub kTest()
    Dim originalData As Variant
    Dim result As Variant
    Dim i As Long
    Dim j As Long
    Dim k As Long
    Dim countKeys As Long
    Dim countColumns As Long
    Dim maxColumns As Long

    originalData = Range([a1], [a500000].End(xlUp))

    countKeys = 0
    maxColumns = 0

    'Calculate the number of lines and columns that will be required
    For i = LBound(originalData, 1) To UBound(originalData, 1)
        If Left(originalData(i, 1), 3) = "KEY" Then
            countKeys = countKeys + 1
            maxColumns = IIf(countColumns > maxColumns, countColumns, maxColumns)
            countColumns = 1
        Else
            countColumns = countColumns + 1
        End If
    Next i

    'Create the resulting array
    ReDim result(1 To countKeys, 1 To maxColumns) As Variant

    j = 0
    k = 1
    For i = LBound(originalData, 1) To UBound(originalData, 1)
        If Left(originalData(i, 1), 3) = "KEY" Then
            j = j + 1
            k = 1
        Else
            k = k + 1
        End If
        result(j, k) = originalData(i, 1)
    Next i

    With ActiveSheet
        .Cells(1, 2).Resize(UBound(result, 1), UBound(result, 2)) = result
    End With

End Sub
assylias
  • 321,522
  • 82
  • 660
  • 783
1

Tested and works:

    Sub test()
    Row = 0
    col = 1

    'Find the last not empty cell by selecting the bottom cell and moving up
    Max = Range("A650000").End(xlUp).Row 'Or whatever the last allowed row number is

    'loop through the data
    For i = 1 To Max
        'Check if the left 3 characters of the cell are "KEY" and start a new row if they are
        If (Left(Range("A" & i).Value, 3) = "KEY") Then
             Row = Row + 1
             col = 1
        End If

        Cells(Row, col).Value = Range("A" & i).Value
        If (i > Row) Then
            Range("A" & i).Value = ""
        End If
        col = col + 1

    Next i
End Sub
Dan
  • 45,079
  • 17
  • 88
  • 157
  • It's fixed now. Tested and runs correctly even if the KEY values are not spaced constantly – Dan May 10 '12 at 15:09
  • For your optional requirement try this in the above code: If (Left(Range("A" & i).Value, 3) = "KEY") Then Cells(Row + 1, col) = "END" Row = Row + 1 col = 1 End If – Dan May 10 '12 at 15:15
  • Doing this over 500k records will take several minutes (actually around 7 minutes on my machine). – assylias May 10 '12 at 15:21
  • This worked perfectly the very first time although it takes quite awhile (that is not a very big deal). I will try your option bit now as well. Thanks!! – tehaaron May 10 '12 at 15:25
  • If the other method runs in a second compared to 7 minutes then I highly recommend trying to get @assylias's code to work for you! – Dan May 10 '12 at 15:29
  • @Dan Haha - if he only needs it once, and needs to spend more than 7 minutes on my solution to get it to work, then yours is better! – assylias May 10 '12 at 15:30
  • Also turn screen updating off for this code, that should speed it up. – Dan May 10 '12 at 15:33
  • I will actually be running it on a regular basis. Possibly daily but there are a million other things outside of Excel that I could be doing for that 7min if need be :) – tehaaron May 10 '12 at 15:34
  • @assylias Sure but it could be useful for others for the future – Dan May 10 '12 at 15:35
  • It looks like it was a typo and we got a happy ending ;-) – assylias May 10 '12 at 16:17