Ok, that was very fun. Code first, talk later:
Public Sub MergeAndFormat()
Const originalColumn As Long = 1
Const formattedColumn As Long = 3
Dim lastRow As Long
lastRow = Sheet1.Range("A" & Sheet1.Rows.Count).End(xlUp).Row
Dim currentEntry As Long
Dim currentRow As Long
For currentRow = 1 To lastRow
Dim currentCell As Range
Set currentCell = Sheet1.Cells(currentRow, originalColumn)
Dim currentText As String
currentText = currentCell.Value
' ensure we have a space at the end of the line
If Right$(currentText, 1) <> " " Then currentText = currentText & " "
Dim isNewEntry As Boolean 'new entry if first char is bold
isNewEntry = currentCell.Characters(1, 1).Font.Bold
Dim currentCharOffset As Long
Dim currentEntryText As String
If isNewEntry Then
currentEntry = currentEntry + 1
currentEntryText = currentText
currentCharOffset = 1
Else
currentCharOffset = Len(currentEntryText) + 1
currentEntryText = currentEntryText & currentText
End If
Dim entryCell As Range
Set entryCell = Sheet1.Cells(currentEntry, formattedColumn)
If isNewEntry Then entryCell.Value = vbNullString
'append the source characters, without losing formatting in the entryCell
entryCell.Characters(currentCharOffset + 1).Insert currentText
Dim currentIndex As Long
For currentIndex = 1 To currentCell.Characters.Count
entryCell.Characters(currentCharOffset + currentIndex - 1, 1).Font.Bold = currentCell.Characters(currentIndex, 1).Font.Bold
entryCell.Characters(currentCharOffset + currentIndex - 1, 1).Font.Italic = currentCell.Characters(currentIndex, 1).Font.Italic
entryCell.Characters(currentCharOffset + currentIndex - 1, 1).Font.Underline = currentCell.Characters(currentIndex, 1).Font.Underline
entryCell.Characters(currentCharOffset + currentIndex - 1, 1).Font.Strikethrough = currentCell.Characters(currentIndex, 1).Font.Strikethrough
Next
Next
End Sub
The entire loop logic was obscured by single-letter variable names, data types involved were obscured with type hint characters, and the intent of the variables was obscured because the meaning of a variable changed depending on what line of code you were looking at (e.g. uneditedColumn
with a value of 1
coincidentally making sense as a Length
argument for the Range.Characters
property.
So I burned everything to the ground, and rewrote the whole logic.
We know where the "original" text begins, and where it ends - we don't need a near-infinite Do While
loop: instead we figure out what the lastRow
is, and we use a For...Next
loop that starts at the top and finishes at whatever the lastRow
is, using currentRow
as out counter.
Since we use currentRow
for counting where we're at in the original column, we'll use currentCell
for the Range
object representing that particular "current cell", and currentText
will hold the string value of that cell's text.
Then we need to know if we're looking at a "new entry", or if we're continuing the previous one - isNewEntry
is True
if the first character of the currentCell
is bold.
When isNewEntry
is True
, we increment the currentEntry
counter (which is 0 until we first assign it with the first "new entry") so we know what row we're going to be writing to; the currentEntryText
will then match the currentText
, and the character-formatting offset will be at position 1.
When isNewEntry
is False
, we don't increment the currentEntry
counter (we'll be appending to that cell's text instead), and we compute the new character-formatting offset by adding 1 to the length of the entire text for the current entry - then we update the currentEntryText
to append the currentText
- not because we need the text itself, but because we'll need it next iteration to compute the new character offset.
At this point we know what to write, and where to write it - only, if we work at the Range
level, we're going to overwrite everything we did in the previous iteration, and lose the formatting... and we don't want that, so that's why we track these offsets...
We Insert
the currentText
at the end of the entryCell
's current content, and then we begin iterating the characters in the currentCell
, and literally copy the formatting - offsetting the characters by what we've tracked.
The above code preserves Bold
, Italic
, Underline
, and Strikethrough
formatting; changing it to also support Subscript
and Superscript
formats should be trivial.