
I am currently creating a Java program to rewrite some outdated Java classes in our software. Part of the conversion includes changing variable names from containing underscores to using camelCase instead. The problem is, I cannot simply replace all underscores in the code. We have some classes with constants and for those, the underscore should remain.
How can I replace instances like string_label with stringLabel, but DO NOT replace underscores that occur after the prefix "Parameters."?

I am currently using the following which obviously does not handle excluding certain prefixes:

public String stripUnderscores(String line) { 
  Pattern p = Pattern.compile("_(.)");
  Matcher m = p.matcher(line);         
  StringBuffer sb = new StringBuffer(); 
  while(m.find()) { 
    m.appendReplacement(sb, m.group(1).toUpperCase()); 
  } 
  m.appendTail(sb); 
  return sb.toString(); 
}
4b0
Tommo
  • Is there a question here? This sounds more like a status report than a request for help; I don't see a question mark or a statement of what actual problem you're having, nor what you've tried. – dcsohl Apr 28 '15 at 16:07
  • What is the actual rule that's followed? Any identifier qualified with `Parameters`? What about inside the `Parameters` class where an identifier might be referenced with a simple name? – Radiodef Apr 28 '15 at 16:08
  • I thought it was pretty clear that I am able to replace something like "_a" with "A", but I am unsure how to skip instances where the underscores occur directly after a specific prefix. – Tommo Apr 28 '15 at 16:12
  • @Radiodef Here are some examples: Convert `field_name` to `fieldName` Convert `field_name.setText(Parameters.is_module_installed ? "Label1" : "Label2");` to `fieldName.setText(Parameters.is_module_installed ? "Label1" : "Label2");` – Tommo Apr 28 '15 at 16:16

2 Answers


You could possibly try something like:

Pattern.compile("(?<!(class\\s+Parameters.+|Parameters\\.[\\w_]+))_(.)")

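which uses a negative lookbehind. Be aware that java.util.regex has to work out a maximum length for a lookbehind, so older versions such as Java 8 may reject the unbounded .+ and + here with a PatternSyntaxException; a bounded quantifier such as {1,100} avoids that.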
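
A bounded variant of the lookbehind idea might look like the sketch below. It assumes that whatever follows Parameters. is at most 100 word characters long (an arbitrary limit picked for illustration), and the class name LookbehindSketch is made up. It only covers the qualified Parameters. case, not the class Parameters alternative from the pattern above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and the 100-character bound are assumptions.
class LookbehindSketch {
    // An underscore is skipped when "Parameters." plus up to 100 word characters
    // ends directly before it; \b keeps "myParameters." from counting as the prefix.
    private static final Pattern UNDERSCORE =
            Pattern.compile("(?<!\\bParameters\\.\\w{0,100})_(.)");

    static String stripUnderscores(String line) {
        Matcher m = UNDERSCORE.matcher(line);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // quoteReplacement guards against '$' or '\' in the captured character
            m.appendReplacement(sb, Matcher.quoteReplacement(m.group(1).toUpperCase()));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripUnderscores(
                "field_name.setText(Parameters.is_module_installed ? \"Label1\" : \"Label2\");"));
    }
}
```

Like any purely textual approach, this also touches underscores inside string literals and comments.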

You would probably be better served using some kind of refactoring tool that understood scoping semantics.

If all you check for is a qualified name like Parameters.is_module_installed then you will replace

class Parameters {
    static boolean is_module_installed;
}

by mistake. There are more corner cases like this, for example import static Parameters.*;.

Using regular expressions alone seems troublesome to me. One way you can make the routine smarter is to use regex just to capture an expression of identifiers and then you can examine it separately:

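import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Qualifiers whose members must keep their underscores.
static List<String> exclude = Arrays.asList("Parameters");

// Converts one matched expression to camelCase unless it starts with an excluded qualifier.
static String getReplacement(String in) {
    for(String ex : exclude) {
        if(in.startsWith(ex + "."))
            return in;
    }

    StringBuffer b = new StringBuffer();
    Matcher m = Pattern.compile("_(.)").matcher(in);
    while(m.find()) {
        // quoteReplacement keeps a captured '$' from being read as a group reference
        m.appendReplacement(b, Matcher.quoteReplacement(m.group(1).toUpperCase()));
    }

    m.appendTail(b);
    return b.toString();
}

static String stripUnderscores(String line) {
    // Matches a run of dot-separated identifiers, e.g. Parameters.is_module_installed
    Pattern p = Pattern.compile("([_$\\w][_$\\w\\d]+\\.?)+");
    Matcher m = p.matcher(line);
    StringBuffer sb = new StringBuffer();
    while(m.find()) {
        // Identifiers may contain '$', so the replacement has to be quoted
        m.appendReplacement(sb, Matcher.quoteReplacement(getReplacement(m.group())));
    }
    m.appendTail(sb);
    return sb.toString();
}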

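But that will still fail for a declaration inside the class itself, e.g. class Parameters { is_module_installed; }, because there the field appears without the Parameters. qualifier.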

It could be made more robust by further breaking down each expression:

static String getReplacement(String in) {
    if(in.contains(".")) {
        StringBuilder result = new StringBuilder();

        // Limit -1 keeps a trailing empty part, so a match ending in "." keeps its dot
        String[] parts = in.split("\\.", -1);

        for(int i = 0; i < parts.length; ++i) {
            if(i > 0) {
                result.append(".");
            }

            String part = parts[i];

            // Convert a part only if the part before it is not an excluded qualifier
            if(i == 0 || !exclude.contains(parts[i - 1])) {
                part = getReplacement(part);
            }

            result.append(part);
        }

        return result.toString();
    }

    StringBuffer b = new StringBuffer();
    Matcher m = Pattern.compile("_(.)").matcher(in);
    while(m.find()) {
        // quoteReplacement keeps a captured '$' from being read as a group reference
        m.appendReplacement(b, Matcher.quoteReplacement(m.group(1).toUpperCase()));
    }

    m.appendTail(b);
    return b.toString();
}

That would handle a situation like

Parameters.a_b.Parameters.a_b.c_d

and output

Parameters.a_b.Parameters.a_b.cD

That's impossible Java syntax but I hope you see what I mean. Doing a little parsing yourself goes a long way.

Radiodef
  • I understand that. The goal is to really replace the field names that were defined using _ instead of camel case without changing fields/constants that exist from another class. – Tommo Apr 28 '15 at 16:27
  • This does exactly what I need! Thank you. After reviewing all instances I need to be replaced, I almost think anytime the match begins with a period, regardless of what the prefix is, then I don't want to replace it since that likely means it is a field/method that belongs to another class. – Tommo Apr 28 '15 at 17:04
  • I was hoping you would be able to assist me further. I think instead of checking for particular prefixes, it is safe for me to assume a string should be replaced if it is not directly preceded by a ".". So something like: `class.getPrimary_Id(primary_idField);` Should become: `class.getPrimary_Id(primaryIdField);` Skipping where the underscore containing value is directly preceded by a "." would prevent changing any methods/fields that are being referenced in the code from another class where the underscores should not be replaced. – Tommo Apr 29 '15 at 20:39
  • Possibly something like `"(?<!\\.)[\\w_]*_(.)"`. Also `"[^.][\\w_]*_(.)"`. I recommend reading the lookbehind link. – Radiodef Apr 29 '15 at 20:45
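
The rule discussed in the comments above (leave any identifier that directly follows a "." alone, convert the rest) could be sketched roughly as follows. This is an illustration rather than code from the thread: the class name DotPrefixSketch and the identifier pattern are assumptions, and it reuses the _(.) replacement from the question, so it would also rewrite constants such as MAX_SIZE when they are referenced without a qualifier.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names and patterns are assumptions for illustration.
class DotPrefixSketch {
    // A whole identifier that is not preceded by '.', '$' or another word character,
    // i.e. a simple name or the first segment of a qualified name.
    private static final Pattern UNQUALIFIED =
            Pattern.compile("(?<![\\w$.])[A-Za-z_$][\\w$]*");
    private static final Pattern UNDERSCORE = Pattern.compile("_(.)");

    // Drops each underscore and upper-cases the character that follows it.
    static String toCamelCase(String identifier) {
        Matcher m = UNDERSCORE.matcher(identifier);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, Matcher.quoteReplacement(m.group(1).toUpperCase()));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Converts only identifiers that do not directly follow a '.'.
    static String stripUnderscores(String line) {
        Matcher m = UNQUALIFIED.matcher(line);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, Matcher.quoteReplacement(toCamelCase(m.group())));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripUnderscores("class.getPrimary_Id(primary_idField);"));
    }
}
```

Matching whole identifiers first, and only then looking for underscores inside them, avoids having to put a variable-length expression inside the lookbehind.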

Maybe you can have another Pattern:

Pattern p = Pattern.compile("^Parameters.*"); // ^ anchors the match at the beginning of the line

If this matches, don't replace anything on that line.
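
Combined with the replacement loop from the question, such a guard might look like the sketch below. The class name LineGuardSketch is hypothetical, and the method is assumed to receive one line at a time. Note that the check applies to the line as a whole, so a line that only contains Parameters. somewhere after its start is still converted in full.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; assumes the caller passes one line at a time.
class LineGuardSketch {
    private static final Pattern STARTS_WITH_PARAMETERS = Pattern.compile("^Parameters.*");
    private static final Pattern UNDERSCORE = Pattern.compile("_(.)");

    static String stripUnderscores(String line) {
        // Leave the entire line untouched when it begins with "Parameters"
        if (STARTS_WITH_PARAMETERS.matcher(line).find()) {
            return line;
        }
        Matcher m = UNDERSCORE.matcher(line);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, Matcher.quoteReplacement(m.group(1).toUpperCase()));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripUnderscores("Parameters.is_module_installed = true;"));
        System.out.println(stripUnderscores("field_name = 1;"));
    }
}
```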

Nick Allen