Adding synonym words different code numbers - java

Question

I have this part of a program, which puts specific code numbers in front of all sections. What I would like is for the iterator to keep counting, even if the same word occurs again. The current code assigns the same number whenever the same word comes up again. For example, the result should look approximately like this:

Before -------------------->>>> After

AAAA -------------------->>>> 001 AAAA

BBB --------------------->>>> 002 BBB

CCCC ----------------->>>>>> 003 CCCC

BBB ---------------------->>>>>> 004 BBB

Can I somehow solve it with an if? For example, for the case above:

int i = 0; if (val[1].equals("BBB")) { i++; if (i == 2) { sections.put("004", "BBB"); } } // only the second BBB must be changed every time, which is why I thought that sections.put(...) could be used when i == 2.

I hope I explained clearly. Any suggestions?

// List of all sections, [0]=Code, [1]=Expression
static final ArrayList<String[]> sections = new ArrayList<String[]>();

private static String convert (final String s) { 
    String temp = s;
    //replaces the defined section keywords with a unique section code identifier(3 Bits)
    final Iterator<String[]> sic = sections.iterator();
    while (sic.hasNext()) {
        final String[] val = sic.next();
        temp = temp.replaceAll("\n(" + val[1] + ")\r?\n--*\r?\n",  val[0].length()>0 ? "\n\u25b4\u25ba"+val[0]+"\n$1\n": "\n\n\n"); //$NON-NLS-1$ //$NON-NLS-2$ //$NON-NLS-3$ //$NON-NLS-4$ //$NON-NLS-5$
    }
    return (temp + "\u25b4").replaceAll("\u25ba212(\n" + ratioRam +"\u25b4)","\u25ba222$1"); //$NON-NLS-1$ //$NON-NLS-2$ //$NON-NLS-3$ //$NON-NLS-4$
}

// In another part of the program, the sections are read from a file: readList(sections);

I really don't understand what you're trying to do. Can you describe what your code does in more detail? Please also explain why your current code doesn't work correctly. — Duncan Jones, Apr 30 '14 at 10:22
Is `CCCC ----------------->>>>>> 003 BBB` correct, or a typo? Should it not be `003 CCCC`? — Teetrinker, Apr 30 '14 at 10:46
I think that if you kept the frequencies of the strings in some data structure and not as part of the string this would be a lot easier and more efficient. — Simon, Apr 30 '14 at 11:26
If I understand this correctly, you want to add section numbers to a document (markdown?) or something like that, where there might be repeating section names. Maybe use `replaceFirst` instead of `replaceAll`? — tobias_k, Apr 30 '14 at 11:39

tobias_k · Answer 1 · 2014-04-30T12:20:45.027

0

If I understand this correctly, you want to add section numbers to a document, where there might be repeating section names. Your sections list seems to hold arrays describing the sections, with the first element being the section number and the second element the section name.

You do so by constructing a regular expression from the section title and the expected surrounding markup, and then replacing the match with the title preceded by the respective number. The problem seems to be that section titles can be repeated: replaceAll finds all sections with that title and puts the same number in front of each of them.

Instead of using replaceAll, you could use replaceFirst to replace just the first occurrence of that section title. Of course, this requires the sections list to be sorted in the order in which the sections appear in the document: the replacement drops the dashed underline, so a heading that is already numbered no longer matches, and each call picks up the first heading that is still unnumbered.

Here's a minimal example (slightly simplified, but using the same regex as in your question).

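import java.util.Map;
import java.util.TreeMap;

public class SectionDemo {
    public static void main(String[] args) {
        String text = "bla bla \nsection\n------\nmore text\nanother section"
                + "\n----\ntext again\nsection\n---\nfinal bunch of text";
        // TreeMap iterates its keys in sorted order, so the codes are applied as 001, 002, 003.
        Map<String, String> sections = new TreeMap<>();
        sections.put("001", "section");
        sections.put("002", "another section");
        sections.put("003", "section");
        for (Map.Entry<String, String> entry : sections.entrySet()) {
            // replaceFirst numbers only the first heading that still has its dashed underline.
            text = text.replaceFirst("\n(" + entry.getValue() + ")\r?\n--*\r?\n",
                    "\n\u25b4\u25ba" + entry.getKey() + "\n$1\n");
        }
        System.out.println(text);
    }
}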

In the output, all the sections are correctly numbered.
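
The same idea can be sketched on the `{code, title}` array layout from the question. This is only an illustration under a few assumptions: the class name `SectionNumbering` and the helper `numberSections` are made up for this example, the list is assumed to be in document order already, and the use of `Pattern.quote` and `Matcher.quoteReplacement` is an addition that is not in the original code, included so that titles or codes containing regex metacharacters are treated literally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SectionNumbering {

    // Hypothetical helper: applies each {code, title} pair, in list order,
    // to the first matching heading that still has its dashed underline.
    static String numberSections(String text, List<String[]> sections) {
        String result = text;
        for (String[] section : sections) {
            String code = section[0];
            String title = section[1];
            // Pattern.quote keeps regex metacharacters in the title literal.
            String regex = "\n(" + Pattern.quote(title) + ")\r?\n--*\r?\n";
            // quoteReplacement protects '$' and '\' in the code; "$1" remains a group reference.
            String replacement = "\n\u25b4\u25ba" + Matcher.quoteReplacement(code) + "\n$1\n";
            result = result.replaceFirst(regex, replacement);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> sections = new ArrayList<>();
        sections.add(new String[] {"001", "section"});
        sections.add(new String[] {"002", "another section"});
        sections.add(new String[] {"003", "section"});

        String text = "bla bla \nsection\n------\nmore text\nanother section"
                + "\n----\ntext again\nsection\n---\nfinal bunch of text";
        System.out.println(numberSections(text, sections));
    }
}
```

Because a numbered heading loses its underline, the second `{"003", "section"}` entry skips the heading that already carries 001 and lands on the later one. If the list were not in document order, the codes would still be assigned, just to the wrong headings.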

edited Apr 30 '14 at 12:20

answered Apr 30 '14 at 11:51

tobias_k


My code replaces the defined section keywords with a unique section code identifier (3 Bits). I changed the calls to replaceFirst, but now the second word does not get any code in front of it. :/ – tolgazinho Apr 30 '14 at 12:02
Please add a minimal reproducible example, e.g. a string with three section headers, one unique and two duplicates, and the respective `sections` list. – tobias_k Apr 30 '14 at 12:11
Would you please give an example? I am not sure how I can do it. – tolgazinho Apr 30 '14 at 12:14
@tolgazinho I cannot give you an example of how your text and your sections look; you have to do that. But I made a minimal example for my solution, and there everything seems to work just fine. – tobias_k Apr 30 '14 at 12:21
Nope Tobias, unfortunately not. There are plenty of sections with plenty of code numbers, and I believe doing it like this would take a lot of effort. But I have added to the question what could be done, if it is implementable. – tolgazinho May 02 '14 at 05:49
@tolgazinho I do not understand. Please specify the problem. Does it cause an error, or does it not behave properly, or is it too slow, or what? How large is your body of text, and how many sections are there? Again, please provide some sort of reproducible example. – tobias_k May 02 '14 at 11:04
