-3

I have few dummy function names and I want to transform them as follows:

sample case 1

input : getDataType
output: Data_Type

sample case 2

input: getDatasetID
output: Dataset_ID

My code is as below:

def apiNames(funcName):
 name = funcName.split("get")[1]

 print(''.join('_' + char if char.isupper() else char
          for char in name).lstrip('_'))

apiNames("getDataType")
apiNames("getDatasetID")

it works for case 1 but not for case 2

case 1 Output:

Data_Type

case 2 Output:

Dataset_I_D
  • 1
    Good for you, although I'd recommend following [PEP 8 -- The Style Guide for Python Code](https://www.python.org/dev/peps/pep-0008/) for naming. But what is your actual question? – MattDMo Apr 05 '23 at 14:47
  • Are you looking for a way to break up a string like `getDataType` at the capitals, and insert an underscore, or do you want to actually call existing functions by a different name? – alexis Apr 05 '23 at 14:47
  • Is this a known list? You could use a dict that maps old to new. This is easy. Or are you generically trying to convert things that match the `getSomething` pattern? A bit harder. – tdelaney Apr 05 '23 at 14:48
  • @alexis, I want a insert underscore when it finds a capital letter but not when consecutive capital letter as shown in sample case – Sushant Shelke Apr 05 '23 at 14:52
  • What if you have a variable `getDatasetIDOrder` that returns the order of IDs of the Dataset? You will not always have the consecutive capital letters at the end, so you have to consider what happens then – Sembei Norimaki Apr 05 '23 at 15:08

2 Answers2

0

I think regex is a better route to solving this.

Consider:

def apiNames(funcName):
    name = funcName.split("get")[1]

    print('_'.join([found for found in re.findall(r'[A-Z]*[a-z]*', name)][:-1]))
JNevill
  • 46,980
  • 4
  • 38
  • 63
  • this works fine for my requirements but yes there are many corner cases so I have to deal with them separately... thanks for the answer. cheers – Sushant Shelke Apr 05 '23 at 15:11
  • 1
    Makes sense. Often time folks pop up asking how to split address strings into their components. This sounds like a similar requirement. Death from a thousand edge cases. I believe going the regex route will make writing patterns for your different edge cases easier though instead of relying on more rigid functions like `split()`. – JNevill Apr 05 '23 at 15:44
0

Just for the fun of it, here's a more robust camelCASEParser that takes care of all the "corner cases". A little bit of lookahead goes a long way.

>>> pattern = '^[a-z]+|[A-Z][a-z]+|[A-Z]+(?![a-z])'
>>> "_".join(re.findall(pattern, "firstWordsUPPERCASECornerCasesEtc"))
'first_Words_UPPERCASE_Corner_Cases_Etc'

And yes, it also works if the pattern ends in an uppercase span. Negative lookahead succeeds at end of line.

>>> "_".join(re.findall(pattern, "andAShortPS"))
'and_A_Short_PS'

I'm pretty sure it works for any sequence matching ^[a-zA-Z]+. Try it.

alexis
  • 48,685
  • 16
  • 101
  • 161