
I am trying to extract only city names from a text using the geograpy library in Python, but the output also contains other names. Here is my code:

from geograpy.extraction import Extractor
text6 = u"""Some text..."""
e6 = Extractor(text=text6)
e6.find_entities()
print(e6.places)

Input text:

Opposition Leader Mahinda Rajapaksa says that the whole public administration has collapsed due to the constitution council’s arbitrary actions. The Opposition Leader said so in response to a query a journalised raised after a meeting held...

Output:

['Opposition', 'Leader Mahinda Rajapaksa', 'Opposition Leader']

There are no city names in this text, so the output should be empty.

Wolfgang Fahl
  • 15,016
  • 11
  • 93
  • 186

1 Answer


As a committer of geograpy3, I reproduced your issue by adding a test to the most recent geograpy3 (https://github.com/somnathrakshit/geograpy3/blob/master/tests/test_extractor.py) and opened an issue:

https://github.com/somnathrakshit/geograpy3/issues/3, which was fixed with this commit

so that the result is now:

[]

as requested. The test is:

    def testStackoverflow54712198(self):
        '''
        see https://stackoverflow.com/questions/54712198/not-only-extracting-places-from-a-text-but-also-other-names-in-geograpypython
        '''
        text='''Opposition Leader Mahinda Rajapaksa says that the whole public administration has collapsed due to the constitution council’s arbitrary actions. The Opposition Leader said so in response to a query a journalised raised after a meeting held...'''
        e=Extractor(text)
        places=e.find_geoEntities()
        if self.debug:
            print(places)
        self.assertEqual([],places)
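
If upgrading is not an option, one possible workaround is to post-filter the extracted candidates against a list of known city names. The following is only a minimal sketch and does not reflect how geograpy3 works internally: `KNOWN_CITIES` and `keep_known_cities` are hypothetical names introduced for illustration, and a real gazetteer would come from a proper city database.

```python
# Hypothetical, tiny gazetteer used only for illustration;
# a real one would be loaded from a city database.
KNOWN_CITIES = {"colombo", "kandy", "galle"}


def keep_known_cities(candidates, known_cities=KNOWN_CITIES):
    """Return only the candidates whose normalized name is a known city."""
    return [
        name for name in candidates
        if name.strip().lower() in known_cities
    ]


# Candidates as reported in the question's output
candidates = ['Opposition', 'Leader Mahinda Rajapaksa', 'Opposition Leader']
print(keep_known_cities(candidates))  # []

# A mixed list keeps only the entry found in the gazetteer
print(keep_known_cities(['Colombo', 'Opposition Leader']))  # ['Colombo']
```

The trade-off is that such a filter is only as good as the list behind it: cities missing from the list are silently dropped.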
Wolfgang Fahl