I got gensim to work in Google Colab by following this process:

!pip install gensim
from gensim.summarization import summarize

Then I was able to call summarize(some_text)

Now I'm trying to run the same thing in VS Code:

I've installed gensim: pip3 install gensim

but when I run

from gensim.summarization import summarize

I get the error

Import "gensim.summarization" could not be resolved Pylance(reportMissingImports)

I've also tried from gensim.summarization.summarizer import summarize, with the same error. Either way, I haven't been able to call summarize(some_text) outside of Google Colab.

halfer
Katie Melosto
  • Note: when using `inline formatting`, single backticks are fine. Triple backticks work, but they are more effort to write, and more fiddly to edit. – halfer Sep 06 '21 at 20:35

2 Answers

The summarization code was removed from Gensim 4.0. See:

https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4#12-removed-gensimsummarization

12. Removed gensim.summarization

Despite its general-sounding name, the module will not satisfy the majority of use cases in production and is likely to waste people's time. See this Github ticket for more motivation behind this.

If you need it, you could try:

  • installing an older gensim version (such as 3.8.3, the last official release in which it remained); or…
  • copying the source code out to your own local module

However, I expect you'd likely be disappointed by its inflexibility and how little it can do.

It was only extractive summarization - choosing a few key sentences from those that already exist. That only gives impressive results when the source text was already well-written in an expository style mixing high-level overview sentences with separate detail sentences. And, its method of analyzing/ranking words was very crude & hard-to-customize – totally unconnected to the more generic/configurable/swappable approaches used elsewhere in Gensim or in other text libraries.
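To make "extractive" concrete, here is a toy sketch that scores each sentence by the average frequency of its words and returns the top few, unchanged and in their original order. This is not Gensim's algorithm (the removed module used a TextRank-style ranking); the function name `extractive_summary` and the naive regex sentence splitter are assumptions made purely for illustration.

```python
import re
from collections import Counter


def extractive_summary(text: str, num_sentences: int = 2) -> str:
    """Pick the highest-scoring existing sentences; nothing is rewritten."""
    # Naive split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Word frequencies over the whole text act as a crude importance signal.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    # Rank sentence indices by score, keep the best, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    sample = "Cats sleep a lot. Cats like cats. Dogs bark."
    print(extractive_summary(sample, num_sentences=1))
```

Because the output can only ever contain sentences copied from the input, the result is only as good as the best sentences the source already has, which is the limitation described above.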

gojomo

So I had to install a specific older version:

pip3 install gensim==3.6.0

I had been using gensim==4.1.0, and this function is no longer available in that later version.
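To confirm which Gensim version the active interpreter actually sees (handy when Colab and a local VS Code setup differ), a small check along these lines may help. It is only a sketch: the helper names `installed_gensim_version` and `has_summarization` are made up here, it assumes Python 3.8+ for `importlib.metadata`, and it relies solely on the fact that the module was removed in 4.0.

```python
from importlib import metadata
from typing import Optional


def installed_gensim_version() -> Optional[str]:
    """Return the installed gensim version string, or None if it is absent."""
    try:
        return metadata.version("gensim")
    except metadata.PackageNotFoundError:
        return None


def has_summarization(version: str) -> bool:
    """gensim.summarization exists only in releases before 4.0."""
    major = int(version.split(".")[0])
    return major < 4


version = installed_gensim_version()
if version is None:
    print("gensim is not installed for this interpreter")
elif has_summarization(version):
    print(f"gensim {version}: gensim.summarization should be importable")
else:
    print(f"gensim {version}: gensim.summarization was removed in 4.0")
```

Running this with the same interpreter that VS Code has selected shows whether the import error comes from the installed version or from gensim being missing in that environment.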

Katie Melosto
  • If choosing to roll back to an older Gensim, you'd probably prefer to get `gensim==3.8.3`, the latest version that still had the `summarization` module, rather than the even-older `3.6.0`. – gojomo Sep 05 '21 at 20:16
  • Thanks so much. Do you know why they removed it? It seemed to work well. Oh never mind. I see you posted the link! – Katie Melosto Sep 05 '21 at 20:17
  • My answer & the linked project issue talk about some of the reasons it was removed. If it's working well for you, that's great. I suspect some variant of the algorithm could someday return to Gensim, if the code were improved to integrate better with the rest of the project, and had an engaged contributor/maintainer. – gojomo Sep 05 '21 at 20:26
  • @gojomo do you have any recommendations on libraries in Python that do a better job at summarization than the older Gensim models? I'm curious about what else is out there that might be better. – Katie Melosto Sep 06 '21 at 11:46
  • There's been a lot of progress in deeper-neural-network language models, but I don't know the space well enough to recommend anything. Via an HN post about a Chrome extension (https://news.ycombinator.com/item?id=28322429), I'd heard recently about a free summarization model based on the BART language model which appears usable in Python - see https://huggingface.co/sshleifer/distilbart-cnn-12-6, which also offers a test text box. That may be worth looking into, directly, or as a jumping-off point to other related work. – gojomo Sep 07 '21 at 00:14
  • Ok. Thanks so much - will take a look at that – Katie Melosto Sep 07 '21 at 00:20