Regular expression to find a lowercase letter followed by an uppercase

Question

I am having difficulty using a regular expression (Grep) in TextWrangler to find occurrences of a lowercase letter followed by an uppercase one. For example:

This announcement meansStudents are welcome.

In fact, I want to split the occurrence by adding a colon so that it becomes means: Students

I have tried:

[a-z][A-Z]

But this expression does not work in TextWrangler.

*EDIT: here are the exact contexts in which the occurrences appear (I mean only with these font colors).*

<font color =#48B700>  - Stột jlăm wẻ baOne hundred and three<br></font>

<font color =#C0C0C0>     »» Qzống pguộc lyời ba yghìm fảy dyổiTo live a life full of vicissitudes, to live a life marked by ups and downs<br></font>

"baOne" and "dyổiTo" must be "ba: One" and "dyổi: To"

Could anyone help? Many thanks.

score 3 · Accepted Answer · answered Jan 06 '12 at 10:36

I do believe (don't have TextWrangler at hand though) that you need to search for ([a-z])([A-Z]) and replace it with: \1: \2
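For readers who want to try the substitution outside TextWrangler, here is a minimal sketch in Python's `re` module. It is not TextWrangler's engine, and the helper name `add_colon` is made up for illustration, but the pattern and the replacement string are the ones suggested above.

```python
import re

# The suggested pattern: one lowercase ASCII letter, then one uppercase one.
PATTERN = re.compile(r"([a-z])([A-Z])")


def add_colon(text: str) -> str:
    """Insert ': ' between a lowercase letter and the uppercase letter after it."""
    return PATTERN.sub(r"\1: \2", text)


sample = "This announcement meansStudents are welcome."
result = add_colon(sample)
print(result)
```

Note that the same call also rewrites every other lowercase/uppercase boundary, so a word such as FileMaker becomes "File: Maker".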

Hope this helps.

answered Jan 06 '12 at 10:36

Igor Korkhov


Nope! It just finds any adjacent letters. – Niamh Doyle Jan 06 '12 at 10:39

Any adjacent letters, even two lowercase ones? Then maybe you need to tick 'Case sensitive' box then? – Igor Korkhov Jan 06 '12 at 10:44
That's exactly the problem. Thank you so much! But it now turns to another problem: it finds and replaces all the values, even the unwanted one FileMaker into File: Maker. – Niamh Doyle Jan 06 '12 at 10:52
Unfortunately, you haven't described the nature of your text. Of course, the expression I suggested looks for any lowercase letter followed by any uppercase one, regardless of context. Maybe if you give us an example of your text we will be able to provide a better solution. – Igor Korkhov Jan 06 '12 at 11:02
Still not clear what must be separated by the colon, and what should be left unchanged. – Igor Korkhov Jan 06 '12 at 12:18
Well Igor, inside each of these two font-color tags, there is one occurrence of a lowercase letter followed by an uppercase one that needs separating by the colon. All other occurrences outside these two font-color tags are left unchanged. – Niamh Doyle Jan 06 '12 at 12:34

score 2 · Answer 2 · answered May 24 '17 at 18:34

This question is ages old, but I stumbled upon it, so someone else might as well. The OP's comment on Igor's response clarified how the task was meant to be described (and could have been added to the description).

To match only those font-specific lines of the HTML, replace

(?<=<font color =#(?:48B700|C0C0C0)>)(.*?[a-z])([A-Z])

with \1: \2

Explanation:

(?<=[fixed-length regex]) is a positive lookbehind and means "if my match has this just before it"
(?:48B700|C0C0C0) is a non-capturing group that matches only these two colours. Because both alternatives have the same length, the group works inside a lookbehind (which needs to be of fixed length).
(.*?[a-z])([A-Z]) matches everything after the > of those opening font tags up to and including the first lowercase letter that is directly followed by a capital letter, and then the capital letter itself.
The \1: \2 replacement is the same as in Igor's response, except that \1 now holds the entire string before the capital letter rather than a single letter.
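The lookbehind approach can be tried out with a rough sketch in Python's `re` module, which also requires fixed-width lookbehinds. This only approximates TextWrangler's behaviour; the second input line is a made-up control to show that text outside the two font tags is left alone.

```python
import re

# Fixed-width lookbehind: both colour codes are six characters long,
# which Python's `re` module requires for a lookbehind.
TAGGED = re.compile(r"(?<=<font color =#(?:48B700|C0C0C0)>)(.*?[a-z])([A-Z])")

lines = [
    "<font color =#48B700>  - Stột jlăm wẻ baOne hundred and three<br></font>",
    "<p>FileMaker is mentioned outside the font tags</p>",  # made-up control line
]

fixed = [TAGGED.sub(r"\1: \2", line) for line in lines]
for line in fixed:
    print(line)
```

Because a match can only start immediately after one of the two opening tags, at most one boundary per tag is rewritten.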

Addition:

Your input strings contain special characters, and the part you want to split may very well end in one. In that case it won't be caught by [a-z] alone. You will need to add a character range that covers all the letters you care about, something like

(?<=<font color =#(?:48B700|C0C0C0)>)(.*?[a-zḁ-ῼ])([A-Z])
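To see the difference the extra range makes, here is a small sketch in Python's `re` module (again only a stand-in for TextWrangler). The test line is invented so that the letter before the capital is the accented ẻ, and the range ḁ-ῼ is written with `\u` escapes to avoid encoding surprises. One caveat worth knowing: that range also contains uppercase accented letters, so it is broader than "lowercase only".

```python
import re

PREFIX = r"(?<=<font color =#(?:48B700|C0C0C0)>)"
ascii_only = re.compile(PREFIX + r"(.*?[a-z])([A-Z])")
# \u1e01-\u1ffc is the same range as ḁ-ῼ in the answer, written as escapes.
extended = re.compile(PREFIX + r"(.*?[a-z\u1e01-\u1ffc])([A-Z])")

# Made-up line: the letter just before the capital "O" is the accented ẻ.
line = "<font color =#48B700>St\u1ed9t w\u1ebbOne<br></font>"

ascii_result = ascii_only.sub(r"\1: \2", line)
extended_result = extended.sub(r"\1: \2", line)
print(ascii_result == line)  # the ASCII-only pattern finds nothing to change
print(extended_result)
```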

Amarghosh · Answer 3 · 2012-01-06T10:38:13.430

2

Replace ([a-z])([A-Z]) with \1:\2 - I don't have TextWrangler, but it works on Notepad++

The parentheses capture the matched text, which is then referred to with the \1 syntax in the replacement string

edited Jan 06 '12 at 10:38

answered Jan 06 '12 at 10:30

Amarghosh


Thanks, Amarghosh. But it still does not work. Anyway, my document contains HTML tags and the expression seems to include everything between the font tags. – Niamh Doyle Jan 06 '12 at 10:36
Thanks, but still no luck in TextWrangler. I don't have Notepad++ for Mac :( to try. – Niamh Doyle Jan 06 '12 at 10:44

score 0 · Answer 4 · answered Nov 26 '15 at 07:59

That is the correct pattern for identifying a lowercase letter followed by an uppercase one; however, you will need to enable the Case Sensitive option in the Find/Replace dialogue.
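The effect of case sensitivity can be illustrated with Python's `re` module, where the `re.IGNORECASE` flag plays roughly the role of an unticked Case Sensitive box. This is an analogy under that assumption, not a reproduction of TextWrangler's dialogue.

```python
import re

pattern = r"[a-z][A-Z]"
sample = "This announcement meansStudents are welcome."

# Case-insensitive search: the very first two adjacent letters already match.
insensitive = re.search(pattern, sample, re.IGNORECASE).group()
# Case-sensitive search (Python's default): only the real boundary matches.
sensitive = re.search(pattern, sample).group()
print(insensitive, sensitive)
```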

answered Nov 26 '15 at 07:59

Joshua Cook

