I am learning Nutch. I have set it up and started crawling sites, but I cannot figure out how to exclude URLs that contain #. These URLs are causing a lot of duplicate pages in the crawl. I checked regex-urlfilter.txt, which contains this rule:
# skip URLs containing certain characters as probable queries, etc.
-[*!@]
Conceptually, adding # to the character class on this line should work, but after I add it the URLs containing # are still not filtered out. Is that because # is also used to start comment lines in this file? If so, how can I fix it?
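
To show what I expect the modified rule to do, here is a minimal Python sketch of the regex on its own. It is not Nutch code: the helper name is_skipped and the example URLs are made up, and I am assuming that Java's regex engine (which Nutch uses) treats this simple character class the same way Python does. It only checks the pattern itself and says nothing about how Nutch reads the filter file.

```python
import re

# Hypothetical rule body: the existing character class with '#' appended.
# In regex-urlfilter.txt this would be the line: -[*!@#]
SKIP_CHARS = re.compile(r"[*!@#]")


def is_skipped(url: str) -> bool:
    """Return True if the URL contains any character from the class."""
    return SKIP_CHARS.search(url) is not None


# Example URLs are invented for illustration only.
print(is_skipped("http://example.com/page#section"))
print(is_skipped("http://example.com/page"))
```

If the pattern matches the first URL and not the second, then the regex itself is fine and the problem would have to be somewhere else in how Nutch handles the rule or the URL.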