16

I have installed gensim (through pip) in Python. After the installation is over I get the following warning:

C:\Python27\lib\site-packages\gensim\utils.py:855: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
  warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")

How can I rectify this?

I am unable to import word2vec from gensim.models due to this warning.

I have the following configurations: Python 2.7, gensim-0.13.4.1, numpy-1.11.3, scipy-0.18.1, pattern-2.6.

user7420652

2 Answers

34

You can suppress the message with this code before importing gensim:

import warnings
warnings.filterwarnings(action='ignore', category=UserWarning, module='gensim')

import gensim
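
If the goal is specifically to get word2vec imported, the same filter works immediately before that import as well. A minimal sketch, assuming only the standard gensim.models.word2vec module path (the filter call is the one from above):

import warnings
warnings.filterwarnings(action='ignore', category=UserWarning, module='gensim')

from gensim.models import word2vec  # should now import without printing the warning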
Roland Pihlakas
  • @user7420652 Hey, thanks for your reply and happy to know! Stack Overflow works like this: instead of commenting (unless you want to add more info), you can upvote answers that are helpful, and if the problem is solved, choose one of the answers as "the solution" by clicking the check mark to the left of that answer. – Roland Pihlakas Feb 16 '17 at 13:14
  • 2
    Anyone knows what's the point of the warning though? it currently pops up also upon the first time the gensim is imported in code after installation. – matanster Jun 12 '18 at 06:20
16

I think this is not a big problem. Gensim is just letting you know that it will alias chunkize to a different function because you are using a specific OS (Windows).

Check out this code from gensim.utils:

if os.name == 'nt':
    logger.info("detected Windows; aliasing chunkize to chunkize_serial")

    def chunkize(corpus, chunksize, maxsize=0, as_numpy=False):
        for chunk in chunkize_serial(corpus, chunksize, as_numpy=as_numpy):
            yield chunk
else:
    def chunkize(corpus, chunksize, maxsize=0, as_numpy=False):
    """
    Split a stream of values into smaller chunks.
    Each chunk is of length `chunksize`, except the last one which may be smaller.
    A once-only input stream (`corpus` from a generator) is ok, chunking is done
    efficiently via itertools.

    If `maxsize > 1`, don't wait idly in between successive chunk `yields`, but
    rather keep filling a short queue (of size at most `maxsize`) with forthcoming
    chunks in advance. This is realized by starting a separate process, and is
    meant to reduce I/O delays, which can be significant when `corpus` comes
    from a slow medium (like harddisk).

    If `maxsize==0`, don't fool around with parallelism and simply yield the chunksize
    via `chunkize_serial()` (no I/O optimizations).

    >>> for chunk in chunkize(range(10), 4): print(chunk)
    [0, 1, 2, 3]
    [4, 5, 6, 7]
    [8, 9]

    """
    assert chunksize > 0

    if maxsize > 0:
        q = multiprocessing.Queue(maxsize=maxsize)
        worker = InputQueue(q, corpus, chunksize, maxsize=maxsize, as_numpy=as_numpy)
        worker.daemon = True
        worker.start()
        while True:
            chunk = [q.get(block=True)]
            if chunk[0] is None:
                break
            yield chunk.pop()
    else:
        for chunk in chunkize_serial(corpus, chunksize, as_numpy=as_numpy):
            yield chunk
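
The doctest in that docstring already shows the behaviour you get on any OS, and on Windows the serial fallback produces exactly the same chunks. A quick sketch to check this yourself, using only the public gensim.utils.chunkize shown above (the warning filter is optional and just hides the message):

import warnings
warnings.filterwarnings(action='ignore', category=UserWarning, module='gensim')

from gensim.utils import chunkize

# Splits the stream into lists of length 4, with a shorter final chunk.
for chunk in chunkize(range(10), 4):
    print(chunk)
# [0, 1, 2, 3]
# [4, 5, 6, 7]
# [8, 9]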
Ayush Jain