I am reading a Wikipedia XML file, from which I have to delete everything enclosed in double curly braces ({{...}}). For example, given the following string:
String text = "{{Use dmy dates|date=November 2012}} {{Infobox musical artist <!-- See Wikipedia:WikiProject_Musicians --> | name
= Russ Conway | image = | caption = Russ Conway, pictured on the front of his 1959 [[Extended play|EP]] ''More Party Pops''. | image_size = | background = non_vocal_instrumentalist | birth_name = Trevor Herbert Stanford | alias = | birth_date = {{birth date|1925|09|2|df=y}} | birth_place = [[Bristol]], [[England]], UK | death_date = {{death date and age|2000|11|16|1925|09|02|df=y}} | death_place = [[Eastbourne]], [[Sussex]], England, UK | origin = | instrument = [[Piano]] | genre = | occupation = [[Musician]] | years_active = | label = EMI (Columbia), Pye, MusicMedia, Churchill | associated_acts = | website = | notable_instruments = }}";
Each {{...}} block should be replaced with an empty string. Note that the example spans multiple lines and contains nested {{...}} blocks.
I am using the following code:
Pattern p1 = Pattern.compile(".*\\({\\{.+\\}\\}).*", Pattern.DOTALL);
Matcher m1 = p1.matcher(text);
while(m1.find()){
String text1 = text.replaceAll(m1.group(1), "");
}
I am new to regex. Can you please tell me what I am doing wrong?
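To make the expected result concrete, here is a minimal sketch of the behaviour I am after, written with a plain depth counter instead of a regex. It is only an illustration, not a known-good solution: the names TemplateStripper and stripTemplates are made up for this example, and it assumes the {{ and }} markers are balanced.

```java
final class TemplateStripper {

    // Hypothetical helper: drops every {{...}} block, including nested ones,
    // and keeps all text that lies outside such blocks.
    static String stripTemplates(String text) {
        StringBuilder out = new StringBuilder();
        int depth = 0; // number of {{ currently open
        int i = 0;
        while (i < text.length()) {
            if (text.startsWith("{{", i)) {
                depth++;
                i += 2;
            } else if (depth > 0 && text.startsWith("}}", i)) {
                depth--;
                i += 2;
            } else {
                // Copy the character only when we are outside every block.
                if (depth == 0) {
                    out.append(text.charAt(i));
                }
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String sample = "{{Use dmy dates|date=November 2012}} {{Infobox\n"
                + "| birth_date = {{birth date|1925|09|2|df=y}} }}";
        // Only the single space between the two top-level blocks should remain.
        System.out.println("[" + stripTemplates(sample) + "]");
    }
}
```

With this approach, links such as [[Bristol]] outside a {{...}} block are left untouched, and newlines inside a block need no special handling.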