-1

I have some text I need remove from a string, but I cannot use the normal Replace() because it is a timestamp that will always be changing.

Text to remove <09:35:40> (could be any time, but always the same format <HH:MM:SS>).

These time stamps could occur in multiple locations throughout a string, all need to be removed (replaced with "").

I've seen regular expressions used for similar applications on other posts, but I don't really understand them, so cannot validate which one to use for my use case here.

Edit: The < and > also need to be removed.

If feedback could be provided as to the -1, that would be great. Help me improve.

ApolloENG
  • 1
  • 2
  • Are there the starting character (<) and the ending one (>) as you show in your question, or tried emphasizing somehow the string to be eliminated? If so, are there some other strings between the two mentioned characters? I mean, not being needed to eliminate them... – FaneDuru Oct 03 '22 at 11:27
  • Hi FaneDuru, I dont quite understand your question, sorry. – ApolloENG Oct 03 '22 at 11:34
  • I mean, are there other strings between the mentioned characters, which must not be rplaced? As "", "1234>" etc. Should "<09:35:40>" be replaced, or only "09:35:40"? – FaneDuru Oct 03 '22 at 11:38
  • Thanks for explaining. "" Should be replaced with "" – ApolloENG Oct 03 '22 at 11:40
  • And no other strings between the mentioned characters, as the examples shown above (", "1234>" etc. )? – FaneDuru Oct 03 '22 at 11:46
  • The stuff between the timestamp is data from equipment: "Data from equipment" "Data from equipment" – ApolloENG Oct 03 '22 at 11:47

5 Answers5

1

You don't need regular expressions I don't think. What about:

Range("A:A").Replace "<??:??:??>", "", xlPart
  • Use Application.Trim() to deal with double spaces after replacement;
  • Range("A:A") is just my placeholder for whatever is your range-object.
JvdV
  • 70,606
  • 8
  • 39
  • 70
  • When you use Range, you are stuck with Data that is already on the spreadsheet/in the workbook. Using the regular expression helps you amend the data before it hits the spreadsheet. – ApolloENG Oct 03 '22 at 12:14
0

You could use Split:

Text = "Text to remove <09:35:40> (could be any time, but always the same format)"
NewText = Split(Text, "<")(0) + Split(Text, "> ")(1)

? NewText
Text to remove (could be any time, but always the same format)
Gustav
  • 53,498
  • 7
  • 29
  • 55
  • The problem with that, is it only does the split, if that exact time is met. I would have to make a lot of the split functions to 'catch' all possible times. – ApolloENG Oct 03 '22 at 11:42
  • No, that's exactly not the (or a) problem - as the output shows. Why don't you try it? – Gustav Oct 03 '22 at 11:57
  • @ApolloENG Are there multiple occurrences in the text to be processed? – FaneDuru Oct 03 '22 at 12:08
  • I understand how this Split works now, and I think FaneDuru is trying to nudge me to say this. Your Split will work for the first instance, but will not iterate for every instance throughout the length of the string, I do not believe. Whereas, a regular expression neatly goes through the whole string so long as global is set to true. – ApolloENG Oct 03 '22 at 13:17
0

Use this regular expression to select all the sub strings in HH:MM:SS format. Then just replace it with empty string ("")

\d\d:\d\d:\d\d

And use this one to remove select it including these characters <>

\<\d\d:\d\d:\d\d\>
0

In excel, you can find-replace between two characters using the normal Ctrl+F replace, and searching for <*> (in my use case). However, special characters such as * cannot be used in the Replace() function within VBA code. If you want to perform the same operation, replacing anything between characters, I believe a regular expression is a good way of achieving this. This following code works for me in Excel VBA. Note, I am working on a string before it hits the spreadsheet (E.g. I am formatting the string before I print it to any cells).

Dim regExp As ObjectSet 
regExp = CreateObject("vbscript.regexp")  'This way, you do not have enable VBScript Regular Expressions 5.5 in the references.
        
With regExp
     .Global = True                          'Get all matches.
     .Pattern = "\<\d\d:\d\d:\d\d\>"         'Search for any string that contains the pattern entered in quotes. As per the guide Jayadul Shuvo links.
     newString = .Replace(prevString, "")    'Replace instances of strings that contain the pattern, with a nothing ""
End With
ApolloENG
  • 1
  • 2
  • 1
    Remember that Stack Overflow isn't just intended to solve the immediate problem, but also to help future readers find solutions to similar problems, which requires understanding the underlying code. This is especially important for members of our community who are beginners, and not familiar with the syntax. Given that, **can you [edit] your answer to include an explanation of what you're doing** and why you believe it is the best approach? – Jeremy Caney Oct 04 '22 at 00:06
0

I had another requirement where "\start " and "\stop " were before and after the timestamps. I was not aware this could have been the case to start with, new information came to light. E.g. "\start <HH:MM:SS> \stop ". This could also be spread across newlines, so I had to consider the removal of the newline as well.

This essentially meant I had to remove a string between two substrings (and be able to remove the newline) and I have used the following pattern:


"\\\bstart\b\s((.|\n)*?)\\\bstop\b\s"

'\ removes the special operation of \ and matches the \
'\b followed by \b matches the whole word between the \b and \b
'\s matches the space
'(.|\n) matches any single character and newlines
'*? matches zero or more occurrences, but as fewer as possible

I would recommend using a regular expression tester such as: https://regexr.com/3hmb6when creating these patterns, it is so helpful! Use the tabs on the bottom right to see what is replaced and to get an explanation of what is going on.

Picture Snippet of the explanation tab for ((.|\n)*?) on the tester website

ApolloENG
  • 1
  • 2