Remove lowercase letter if it is followed by an uppercase letter

Question

The goal is to take a string such as $a = "NewYork" and produce a new string in which any lowercase letter that stands directly before an uppercase letter is removed.

In this example, the output should be "NeYork".

I tried to do this by comparing the positions of lowercase and uppercase letters in the ASCII table, but it doesn't work. I'm not sure whether it is possible to do this in a similar way, through positions in the ASCII table.

function delete_char($a)
{
 global $b;
    $a = 'NewYork';
   
    for($i =0; $i<strlen($a); $i++)
    {
         if( ord($a[$i])< ord($a[$i+1])){//this solves only part of a problem 
           chop($a,'$a[$i]');
         }
         else{
            $b.=$a[$i];
         }
    }
    return $b;
}

Opposite duplicate https://stackoverflow.com/q/3455232/2943403 — mickmackusa, Apr 02 '21 at 13:53
@mickmackusa Future researchers who are in pursuit of a readable solution will assume that this task is more complicated than it actually seems, which is not the case. Regex one liners are great, but you could rather add that as a follow-up solution than as a only solution and explain to OP what he is missing in his initial comparison in the loop. — nice_dev, Apr 02 '21 at 14:12
I discounted the OP's snippet (though I always appreciate when an attempt is included) because I find it to be unnecessarily hard to follow compared to the basic regex required. I would never use a non-regex solution for this task in professional code. — mickmackusa, Apr 02 '21 at 14:16
It is more of an academic exercise to review the OP's code (which is riddled with flaws) because I would never recommend using such a verbose and convoluted technique in my own project. Anyhow, I bothered to provide a non-regex snippet. I am assuming that's what you were seeking. I think your first comment should say "All regex answers have shown how simply this task can be performed". @nice — mickmackusa, Apr 02 '21 at 22:15
@mickmackusa +1 for adding the explanation. The other solution is verbose but ain't convoluted. The OP would be able to comprehend your answer much better now and so would future researchers. Regex is of course a professional way of solving such a problem but, unfortunately, not all devs are on the same level. Hence, including both answers is a good idea in any case. — nice_dev, Apr 03 '21 at 05:01
I guess we will have to agree to disagree on a few things then. — mickmackusa, Apr 03 '21 at 05:21

score 2 · Answer 1 · answered Apr 02 '21 at 13:34

This is something a regular expression handles with ease:

<?php

$a ="NewYorkNewYork";
$reg="/[a-z]([A-Z])/";
echo preg_replace($reg, "$1", $a); // NeYorNeYork

The regular expression searches for a lowercase letter followed by an uppercase letter, and captures the uppercase one. preg_replace() then replaces that combination with just the captured letter ($1).

See https://3v4l.org/o43bO

mickmackusa · Answer 2 · 2021-04-03T06:27:51.103

You don't need to capture the uppercase letter and use a backreference in the replacement string.

More simply, match the lowercase letter then use a lookahead for an uppercase letter -- this way you only replace the lowercase character with an empty string. (Demo)

echo preg_replace('~[a-z](?=[A-Z])~', '', 'NewYork');
// NeYork

As for a review of your code, there are multiple issues.

global $b doesn't make sense to me. You need the variable to be instantiated as an empty string within the scope of the custom function only. It more simply should be $b = '';.
The variable and function naming is unhelpful. A function's name should specifically describe the function's action. A variable should intuitively describe the data that it contains. Generally speaking, don't sacrifice clarity for brevity.
As a matter of best practice, you should not repeatedly call a function when you know that the value has not changed. Calling strlen() on each iteration of the loop is not beneficial. Declare $length = strlen($input) and use $length over and over.
$a[$i+1] is going to generate an "Uninitialized string offset" warning on the last iteration of the loop because no character exists at that offset: the last character of a string has an offset of "length - 1". There is more than one way to address this, but I'll use the null coalescing operator to set a fallback character that will not qualify the previous letter for removal.
Most importantly, you cannot just check that the current ord value is less than the next ord value. In the ASCII table, lowercase letters have an ordinal range of 97 through 122 and uppercase letters have an ordinal range of 65 through 90, so a lowercase letter always has a greater ord value than any uppercase letter. You will need to check that the current letter is lowercase and that the next letter is uppercase before excluding the current letter from the result string.
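To make the last point concrete, here is a minimal sketch of how the question's ord() comparison behaves. The helper name isOrdLessThanNext() is hypothetical and only mirrors the condition from the question's loop; it is not part of any proposed solution.

```php
<?php
// Hypothetical helper that mirrors the comparison used in the question's loop.
function isOrdLessThanNext(string $current, string $next): bool
{
    return ord($current) < ord($next);
}

var_dump(ord('w'));                    // int(119)
var_dump(ord('Y'));                    // int(89)
var_dump(isOrdLessThanNext('w', 'Y')); // bool(false): lowercase before uppercase never qualifies
var_dump(isOrdLessThanNext('e', 'w')); // bool(true): two lowercase letters qualify by accident
```

The comparison therefore fires for the wrong pairs, which is why both letters have to be range-checked separately.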

Rewrite: (Demo)

function removeLowerCharBeforeUpperChar(string $input): string
{
    $output = '';
    $length = strlen($input);
    for ($offset = 0; $offset < $length; ++$offset) {
        $currentOrd = ord($input[$offset]);
        $nextOrd = ord($input[$offset + 1] ?? '_');

        if ($currentOrd < 97
            || $currentOrd > 122
            || $nextOrd < 65
            || $nextOrd > 90
        ){
            $output .= $input[$offset];
        }
    }
    return $output;
}

echo removeLowerCharBeforeUpperChar('MickMacKusa');
// MicMaKusa

Or with ctype_ functions: (Demo)

function removeLowerCharBeforeUpperChar(string $input): string
{
    $output = '';
    $length = strlen($input);
    for ($offset = 0; $offset < $length; ++$offset) {
        $nextLetter = $input[$offset + 1] ?? '';
        if (ctype_lower($input[$offset]) && ctype_upper($nextLetter)) {
            $output .= $nextLetter; // omit current letter, save next
            ++$offset; // double iterate
        } else {
            $output .= $input[$offset]; // save current letter
        }
    }
    return $output;
}

To clarify, I would not use the above custom functions in a professional script, and neither snippet is built to process strings containing multibyte characters.
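If multibyte input does matter, one possible approach is to keep the lookahead pattern and swap the ASCII ranges for Unicode letter properties with the u modifier. This is only a sketch: the function name is hypothetical, and it assumes the input (and the source file) is valid UTF-8.

```php
<?php
// Hypothetical function name. \p{Ll} matches a lowercase letter and \p{Lu}
// an uppercase letter in any script; the "u" modifier enables UTF-8 mode.
function removeLowerBeforeUpperUnicode(string $input): string
{
    // preg_replace() returns null on failure (for example, invalid UTF-8),
    // so fall back to the unmodified input in that case.
    return preg_replace('~\p{Ll}(?=\p{Lu})~u', '', $input) ?? $input;
}

echo removeLowerBeforeUpperUnicode('ÉcoleÉté');
// ÉcolÉté
```

For plain ASCII input this behaves the same as the [a-z](?=[A-Z]) pattern shown earlier.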

XMehdi01 · Answer 3 · 2021-04-03T09:16:35.307

1

I create a new variable $s to hold the string to be returned and loop over the $a string. I use ctype_upper() to check the next character: if it is not uppercase, the current character is appended to $s. At the end, I return $s concatenated with the last character of the string.

function delete_char(string $a): string
{
  if(!strlen($a))
  {
     return '';
  }

  $s='';
  for($i = 0; $i < strlen($a)-1; $i++)
  {
      if(!ctype_upper($a[$i+1])){
        $s.=$a[$i];
      }
  }
  return $s.$a[-1];
}
echo delete_char("NewYork");//NeYork

edited Apr 03 '21 at 09:16

answered Apr 02 '21 at 23:36


Using php7.1's negative offset as well as `ctype_upper()` are good ideas to reduce the complexity of a non-regex snippet, but the question is ambiguous about the requirement of removing only lowercase letters. The sample data only offers a lowercase letter before an uppercase letter, but the coding attempt seems to care about the `ord()` value of the "current letter" as well as the "next letter". – mickmackusa Apr 03 '21 at 05:37
The only fringe case scenario of concern is when an empty string is passed in. `Warning: Uninitialized string offset -1` – mickmackusa Apr 03 '21 at 05:45
thank you! for your notes. i don't pay attention to edge case of empty string, i fixed it. – XMehdi01 Apr 03 '21 at 08:31
`empty()` is not a good fix -- for a couple reasons. `empty()` checks if a variable is "not set" or "falsey", but the incoming variable is guaranteed to "be set" AND "0" is a falsey string that does have length. I recommend `!strlen()` as the early return condition. You should also return a string-type value (empty string) instead of void for consistency in the return type. – mickmackusa Apr 03 '21 at 08:39
Also, https://www.php-fig.org/psr/psr-12/#:~:text=The%20body%20of%20each%20structure%20MUST%20be%20enclosed%20by%20braces – mickmackusa Apr 03 '21 at 09:13
Thank you @mickmackusa again! i fixed it by using `strlen()` , i also return empty string and use a type of return to be string! – XMehdi01 Apr 03 '21 at 09:15
also i added curly braces! but if there is one statement no need for {braces}. but i think for readability the code isn't it ? – XMehdi01 Apr 03 '21 at 09:19
To obey the professional coding standard, always use braces. Yes, it is for clarity's sake. We must always endeavor to teach best practices and professional standards. – mickmackusa Apr 03 '21 at 09:21
Yes you're right. the code must be always clean and easy to read. i learn so many things from you and thank you for sharing the source php-fig.org it gives tips in coding at php professionally. – XMehdi01 Apr 03 '21 at 09:26
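
The empty() versus strlen() point raised in the comments above can be checked with a short sketch. The variable name $input is only illustrative.

```php
<?php
// "0" is a non-empty string, but PHP treats it as falsey.
$input = '0';

var_dump(empty($input));   // bool(true): empty() would wrongly trigger the early return
var_dump(!strlen($input)); // bool(false): the string has length 1, so processing continues
var_dump(!strlen(''));     // bool(true): only a zero-length string triggers the early return
```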

score -1 · Answer 4 · answered Apr 02 '21 at 13:32

Something like this maybe?

<?php
    $word = 'NewYork';
    preg_match('/.[A-Z].*/', $word, $match);
    if($match){
        $rlen = strlen($match[0]); //length from character before capital letter
        $start = strlen($word)-$rlen; //first lower case before the capital
        $edited_word = substr_replace($word, '', $start, 1); //removes character
        echo $edited_word; //prints NeYork
    }
?>

Onimusha


This would only replace the first instance. `NewYorkNewYork` comes out as `NeYorkNewYork` – Tangentially Perpendicular Apr 02 '21 at 13:41
Why was my comment removed? My comment clarified that the OP did not make it clear that it's for recurring cases. If it's recurring then mickmackusa's answer is the way to go. If you delete a response to a comment then delete the comment that the response is for as well. – Onimusha Apr 16 '21 at 15:12
