1

I realized after manual scanning 90 records, that this was going to be painful and tedious unless I made use of automation.

I have this set of data, about 4000 records that occurs in a pattern that I want to track. The first column is the important one. I want to scan through the column and record in a NEW column how man times that number has occurred. Is it possible to do this programatically in Excel?

Note: I'm not just looking for a single pattern or single occurrence of a pattern.

E.g. in this sample 313 occurs 1 time, 314 occurs 6 times, 315 occurs 2 times, etc.

At the end of the occurrences I want it to look like

--- Desired Output -------

313 1       343  1
314 1   344  
314 2   344 
314 3   344
314 4   344
314 5   344  
314 1   345  6
315 2   345  
315 1   346  2


-- Sample Data ------------------------------------
313 1   343
314 1   344
314 2   344
314 3   344
314 4   344
314 5   344
314 1   345
315 2   345
315 1   346
316 2   346
316 1   347
317 2   347
318 1   348
318 2   348
319 1   349
319 2   349
319 3   349  

5/23/13 The data is delimited by the spaces. It is not all in one cell. I don't know how to create a grid picture here. The leftmost cell is the one I want counted.

The desired output is the example of what I want. There are six occurrences of 314, I want the count summary cell to be compiled in the row of the last occurrence.

Community
  • 1
  • 1
Ken Ingram
  • 1,538
  • 5
  • 27
  • 52

4 Answers4

1

I backed up, slowed down and went to some basic programming principles, as slow as they feel at times.

  1. Flowchart
  2. Pseudocode
  3. prototype
  4. test
  5. repeat 3 and 4 as needed.

I found that the following code did EXACTLY what I needed. I share it for any who follow.

Sub countFoo()
Dim startCell As Range
Dim preCell As Range
Dim counter As Integer
Dim startPoint As Range, endPoint As Range
Dim fileName As String, delimitingCharacter As String, SQLpre As String, SQLpost As String
Dim SQL As String
Dim outfile As Integer

fileName = "update_foo.sql"
SQLpre = "UPDATE foo SET foodata = "
SQLpost = " WHERE details = '"
outfile = FreeFile()
Open fileName For Output As outfile
counter = 1

Set startPoint = Cells(2, 4)
startPoint.Activate

Debug.Print "Start Point:" & startPoint.Address
Debug.Print startPoint.Value

Set startCell = ActiveCell
Set preCell = startCell.Offset(-1, 0)


Do While startCell.Value <> "END"

If (startCell.Value = preCell.Value) Then
  counter = counter + 1
  Set preCell = startCell
  Set startCell = startCell.Offset(1, 0)
ElseIf ((startCell.Value <> preCell.Value) Or (startCell.Value = "END")) Then
  startCell.Offset(-1, 3).Value = counter
  If counter > 1 Then
    startCell.Offset(-1, 0).Interior.Color = 5296274
    startCell.Offset(-1, 1).Interior.Color = 5296274
    startCell.Offset(-1, 2).Interior.Color = 5296274
    startCell.Offset(-1, 3).Font.Bold = True
    With startCell.Offset(-1, 3).Interior
      .Pattern = xlGray8
      .PatternColor = 65535
      .Color = 5296274
    End With
  End If
  SQL = SQLpre & counter & SQLpost & startCell.Offset(-1, 0).Value & "';"
  Print #outfile, SQL
  counter = 1
  Set preCell = startCell
  Set startCell = startCell.Offset(1, 0)
End If
Loop
Close #outfile
End Sub
Ken Ingram
  • 1,538
  • 5
  • 27
  • 52
0

If all you want to do is count the number of ocurrences of a certain numer in a certain range all you have to do is use COUNTIF(range,criteria)

where range is the cells where you want to check ( according to you it would be "A1:A4000") and criteria is the number you are loonking for, it can also be an ocrrence like ">55" where it counts how many cells the value is bigger than 55.

Hope it helps, Bruno

The code i mentioned in the comment:

CurrentRowA = 1
LastRowA = Range("A50000").End(xlUp).Row
Dim r As Range
While CurrentRowA <= LastRowA
    CurrentRowB = 1
    LastRowB = Range("B50000").End(xlUp).Row
    Do While CurrentRowB <= LastRowB
        If Cells(CurrentRowA, "A").Value = Cells(CurrentRowB, "B").Value Then
            Exit Do
        Else
        CurrentRowB = CurrentRowB + 1
        End If
    Loop
    If CurrentRowB > LastRowB Then
        Cells(CurrentRowB, "B").Value = Cells(CurrentRowA, "A").Value
        Set r = Range("A1", "A" & LastRowA)
        Cells(CurrentRowB, "C").Value = Application.CountIf(r, Cells(CurrentRowA, "A").Value)
    End If
    CurrentRowA = CurrentRowA + 1
Wend
LastRowB = Range("B50000").End(xlUp).Row
Range("B2", "C" & LastRowB).Cut
Range("B1").Select
ActiveSheet.Paste

If what i described in my latest comment is what you really want all you have to do is paste this formulas in B1 =COUNTIF($A$1:A1;A1) and drag it to the last cell or double click in that blac square on B1 bottomtight corner, then if the calcution is automatic it's done, if it's manual you have to click calculate now and it's done

Hope it helps, Bruno

Newbie
  • 863
  • 1
  • 7
  • 16
  • Close Bruno, but I'm not just looking for one pattern or a specific pattern. I want to parse through the entire set and each time a pattern occurs, I want to record how many times that pattern occurrs: 1 or more. – Ken Ingram May 19 '13 at 21:51
  • OH, then i guess you can use a macro that willstart in A1 until it reachs A4000, and for each cell value you check if it is already in column B, if not you go the the first empty position of B and place the formula there, i'll put a code example in my answer, just gonna edit it :) – Newbie May 19 '13 at 21:55
  • I saw your new edit, you want to say if it's the first ocurrence second and so on? – Newbie May 19 '13 at 22:29
  • I want to know it is a single occurrence or when it is the last of several, which number is it. – Ken Ingram May 19 '13 at 22:35
  • I'll try your formula. Thanks. – Ken Ingram May 19 '13 at 22:35
  • edited my answer again, Hope it helps, any other question just ask ;) – Newbie May 19 '13 at 22:39
  • 1
    You might want to look into the "Scripting.Dictionary", you would be able to remove the inner loop which would it up a fair bit! – NickSlash May 20 '13 at 01:10
0

Paste this in D1 and drag down.

=IF(A2<>A1,COUNTIF($A$1:$A$100000,A1),"")

Adjust the range as you need. This formula assumes that the first 3 digits are in there own cell.

If your sample data is all in one column then you will have to use a Sumproduct with a Left function in place of the countif. You can use the following Formula in this case, But if your sample data is in 3 columns definatly use my fast formula.

=IF(LEFT(A1,3)<>LEFT(A2,3),SUMPRODUCT(--(LEFT($A$1:$A$100000,3)=LEFT(A1,3))),"")

EDIT Based on your comments and answer I have made a full guide on using the countif method as VBA should ALWAYS be avoided if possible. You had issues because your sample data provided in your question did not contain headers/ Column Labels here is the fixed guide.

Starting with your 3 columns with headers I wqould create a named range on the column youd like the counts for to do this use built in Name Manager and click on new:

Name Manager New

Then from this Set the Name to CountColumn and in the Formula use the following:

=OFFSET($A$2,0,0,COUNTA($A$2:$A$1000000),1)

Name Manager Formula

Now using a modified version of my original answer type the following in cell D2:

=IF(A3<>A2,COUNTIF(CountColumn,A2),"")

Formula

AS shown above this is IDENTICAL to what your original question Asked for in Desired Output.

Now to further this to get the highlights as your VBA Code looks to do I would use the following.

Go Back to the Name Manager, as we did for the CountColumn, and Create another new Named Range called Sums And then change all the A references to D like follows:

=OFFSET($D$2,0,0,COUNTA($D$2:$D$1000000),1)

SumsNamed

And you Name Manager Should look like the following:

Name Manager2

Now in the Name Box (top left box next to formula bar) type in the word Sums to select the entire sum area so we can format it:

Sums

Then ****while sums area is highlighted*** go to Conditional Formatting ~~> New Rule:

enter image description here And use the built in No Blanks Function:

NO Blanks

Then for the format Use Fill and the color you want, Based on your posted formula I used the Green Color:

Fill Green

Now you should be done and your Data should look as the picture below does:

Finish

user2140261
  • 7,855
  • 7
  • 32
  • 45
  • The first formula seems to do the trick, but it puts the final count of the occurrrences in the next row, not in the last occurrence row. – Ken Ingram May 23 '13 at 20:03
  • I started in row 2, under the labels. It more or less did what I needed. Not perfect, but close enough to be useful for the moment. – Ken Ingram May 29 '13 at 08:00
  • @KenIngram You did not mention You had labels in your sample data I have updated my answer to reflect. – user2140261 Jun 03 '13 at 14:01
  • Nice guide. I ultimately got what I needed. "VBA should ALWAYS be avoided if possible." Why? – Ken Ingram Jun 06 '13 at 08:16
  • @KenIngram Plain and simple? Speed and reliability. This of coarse goes much deeper, but the main notes are that code working from the excel data will always be slower then code (excel functions) that is run at the data level. i'd put money on it that if you ran your vba against my solution on a large range of data, you would completely understand. You function will look at a single cell,make a decision then work, then move on to the next for each and every cell. Excel is built around checking an entire range. There is much more to this, try looking at past answer here on SO. – user2140261 Jun 06 '13 at 11:29
  • @KenIngram Just to put this into numbers for you I ran testing on the 2 methods. Setting up the worksheet the way I did it and Duplicating your sample data 5000 Rows long. When using my method and measuring the worksheet caluclation 10 times the avergae calculation time was 0.044281667 seconds. When Re-Calculating the worksheet the average was 0.022606667 seconds. When running your macro the average was 0.145542 seconds. That's 650% Slower then my recalculate time. Obviously in this situation that is pretty insignificant but in other application 650% performance decrease is hard to swallow. – user2140261 Jun 06 '13 at 13:10
0

The following assumes that your data is all in one column (eg: "315 1 344" is one cell)

It will look at sheet1 starting from A1, generate a list of unique cell values and count any duplicates. Once all the records have been checked, it outputs the results to sheet2.

Sub Main()
' this requires you to add a reference to "Microsoft Scripting Runtime" (usefull if you do not know the methods of scripting.dictionary)
'Dim Results As New Scripting.Dictionary
' the line does not require you to add any extra references (there is no code-completion, you must know the methods and their arguments)
Dim Results As Object: Set Results = CreateObject("Scripting.Dictionary")
Dim Data As Variant
Dim Key As Variant
Dim Source As Worksheet: Set Source = ThisWorkbook.Worksheets("Sheet1") ' the sheet where your data is
Dim Destination As Worksheet: Set Destination = ThisWorkbook.Worksheets("Sheet2") ' where the results will be put
Dim Row As Long: Row = 1 ' the row number to start from
Dim Item As String
Data = Source.UsedRange.Value2
' iterate over the data
Do
    Item = Data(Row, 1)
    If Results.Exists(Item) = True Then
        Results(Item) = Results(Item) + 1
    Else
        Results(Item) = 1
    End If
    Row = Row + 1
Loop While Not Data(Row, 1) = ""
' display the output
Destination.Cells.Clear ' reset the worksheet
For Each Key In Results.Keys ' loop through the results
    Destination.Range("A1:B1").Insert xlShiftDown ' move the previous results down
    Destination.Cells(1, 1) = Key
    Destination.Cells(1, 2) = Results(Key)
Next Key

End Sub
NickSlash
  • 4,758
  • 3
  • 21
  • 38