I am trying to write a helper script for a colleague that will automatically open all .doc(x) files in a directory, find every Chinese character, set its font, then save and close the file.
I already have a working version of this script. The file opening/saving/closing part is handled in Python/win32com and works fine. My main sticking point is still the VBA macro.
I know there is a regex class (\p{Han}) that should catch all Chinese characters, but this does not seem to work in VBA. Similarly, I have tried using Unicode ranges and Chr/ChrW. Nothing so far has produced any output, let alone correct output. Out of frustration, I made one last-ditch attempt and simply inverted the search parameters. This is how it is now:
    Sub FindReplace_zh(Rng As Range)
        With Rng.Find
            ' Match any single character that is NOT in the listed set
            Do While .Execute(FindText:="[!A-ZÄÖÜa-zäöü0-9><_ ^11^13§$²³%#&/\+-]", MatchWildcards:=True)
                If Rng.Font.Bold = True And Rng.Font.Name Like "Arial*" Then
                    Rng.Font.Name = "SimHei"
                ElseIf Rng.Font.Bold = False And Rng.Font.Name Like "Arial*" Then
                    Rng.Font.Name = "SimSun"
                End If
                ' 0 = wdCollapseEnd: continue searching after the current match
                Rng.Collapse 0
            Loop
        End With
    End Sub
AT LEAST THIS WORKS, but it's far from elegant and still produces some undesired output.
I have yet to understand how I can substitute "[!A-ZÄÖÜa-zäöü0-9><_ ^11^13§$²³%#&/\+-]" with a variable, or almost anything else. Many characters are not covered by this pattern, such as "(" and ")", but adding them (even escaped with \) results in runtime errors in VBA. I found a lot of tutorials and questions dealing with removing or inserting text, but my case of finding text and then changing only its font, while leaving everything else untouched, seems rather unusual.
Fun fact: I had to add ^11 and ^13 to the excluded characters, as leaving them out would lead to the macro inserting new line breaks in random positions of the .doc.
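As a rough illustration of why the inverted class gives undesired output, here is a small Python sketch (not VBA). It assumes that a Python `re` character class approximates Word's wildcard class closely enough for this purpose, with "\x0b" standing in for ^11 (manual line break) and "\r" for ^13 (paragraph mark). The names LETTER_RANGES, SINGLE_CHARS, WORD_BREAKS and matched_chars are made up for this example.

```python
import re

# Ranges and literal characters copied from the macro's FindText class.
LETTER_RANGES = "A-Za-z0-9"
SINGLE_CHARS = "ÄÖÜäöü><_ §$²³%#&/\\+-"
# Assumed stand-ins for Word's ^11 (manual line break) and ^13 (paragraph mark).
WORD_BREAKS = "\x0b\r"


def matched_chars(text, exclude_breaks=True):
    """Return every character that the inverted class would match."""
    singles = SINGLE_CHARS + (WORD_BREAKS if exclude_breaks else "")
    # re.escape keeps '-', '\\' and friends literal inside the class.
    pattern = re.compile("[^" + LETTER_RANGES + re.escape(singles) + "]")
    return pattern.findall(text)


if __name__ == "__main__":
    sample = "Abc (中文)\rx"
    print(matched_chars(sample))                        # breaks excluded
    print(matched_chars(sample, exclude_breaks=False))  # breaks not excluded
```

Under these assumptions, anything that is not listed counts as a match, so parentheses are caught alongside the Chinese characters, and the paragraph mark is caught as well once ^13 is left out of the class.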
EDIT: New attempt, based on a comment:
    Dim searchPattern As String
    searchPattern = "[" & ChrW(&H2E80) & "-" & ChrW(&HFFED) & "]{1,}"
    With Rng.Find
        Do While .Execute(FindText:=searchPattern, MatchWildcards:=True)
This gives an "Invalid operation" error on the final line! I also would not have concatenated a string like this. I am not sure how VBA parses it, but apparently not the way we hoped.
EDIT2: FIX
Removing "{1,}" from searchPattern did it. Now it works exactly as I expected it to :)
    searchPattern = "[" & ChrW(&H2E80) & "-" & ChrW(&HFFED) & "]"
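For anyone wondering what this range actually covers, here is a minimal Python sketch that rebuilds the same pattern string and lists which characters of a sample fall between U+2E80 and U+FFED. The helper names build_pattern and chars_in_range are hypothetical and only serve this illustration; the sketch checks code points and does not call Word.

```python
FIRST = 0x2E80  # start of the CJK Radicals Supplement block
LAST = 0xFFED   # near the end of Halfwidth and Fullwidth Forms


def build_pattern(first=FIRST, last=LAST):
    """Rebuild the wildcard string that the VBA concatenation produces."""
    return "[" + chr(first) + "-" + chr(last) + "]"


def chars_in_range(text, first=FIRST, last=LAST):
    """Return the characters of text whose code point lies in first..last."""
    return [ch for ch in text if first <= ord(ch) <= last]


if __name__ == "__main__":
    print(repr(build_pattern()))
    # Han characters, kana and fullwidth brackets are all inside the range;
    # ASCII text, the vertical tab (^11) and carriage return (^13) are not.
    print(chars_in_range("Abc 中文 ひらがな （x）\x0b\r"))
```

Note that the range is broader than Han alone: it also takes in kana, Hangul and fullwidth punctuation, which may or may not be desirable depending on the documents. Since it starts far above ^11 and ^13, line breaks and paragraph marks are never matched.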