Only last line of output being retrieved from stdout (contextlib, sys.stdout, and capture magic)

Question

I'm trying to capture terminal output to store it elsewhere. Specifically, I'm trying to retrieve the training loss from sklearn.linear_model.LogisticRegression. This is unfortunately not stored anywhere, but is printed when verbose=1.

I am trying to use contextlib.redirect_stdout to do this. However, it only captures the last line of output.

Full terminal output without redirection:

[Parallel(n_jobs=8)]: Using backend ThreadingBackend with 8 concurrent workers.
Epoch 1, change: 1.00000000
Epoch 2, change: 0.22057155
Epoch 3, change: 0.12789722
Epoch 4, change: 0.06779011
Epoch 5, change: 0.04539726
Epoch 6, change: 0.03062415
Epoch 7, change: 0.02026218
Epoch 8, change: 0.01470303
Epoch 9, change: 0.00691171
Epoch 10, change: 0.00415176
Epoch 11, change: 0.00230851
Epoch 12, change: 0.00145497
Epoch 13, change: 0.00110029
convergence after 14 epochs took 1 seconds
[Parallel(n_jobs=8)]: Done   1 out of   1 | elapsed:    0.5s finished

With redirect_stdout, only the following line is captured:

convergence after 14 epochs took 1 seconds

Code sample:

from contextlib import redirect_stdout
import io

from sklearn.linear_model import LogisticRegression

model = LogisticRegression(
    random_state=42,
    C=0.001,
    penalty='l1',
    max_iter=1000,
    solver='saga',
    verbose=1,
)

# X_train and y_train are defined elsewhere (not shown)
with redirect_stdout(io.StringIO()) as f:
    model.fit(X_train, y_train)
s = f.getvalue()
print(s)

I've had the same results when using sys.stdout directly or in a Jupyter notebook using %%capture.

Does anyone know what is causing this or how to extract the full output? Thanks!

The "Epoch" messages seem likely to be printed by native code, over which `redirect_stdout` has no control... — AKX, Jun 01 '23 at 10:55
This could be it -- I thought it might just be getting overwritten, but I tried outputting directly to a file and it successfully saves text before and after the actual epochs. — neverreally, Jun 01 '23 at 12:48
My guess would be that this is due to the use of multiprocessing. See [here](https://docs.python.org/3.11/library/contextlib.html#contextlib.redirect_stdout): _"... is not suitable for use in library code and most **threaded applications**. It also has **no effect on the output of subprocesses**."_ — Timus, Jun 01 '23 at 14:33
I suspect it is what Timus says. If it were what AKX says, you'd use [wurlitzer](https://github.com/minrk/wurlitzer). I don't think it will help in this case, but for anyone encountering a case like the one AKX seems to be referencing, i.e., actual C-level output, see especially "Forward C-level stdout/stderr to Python sys.stdout/stderr, which may already be forwarded somewhere by the environment, e.g. IPython" on [the wurlitzer documentation page under 'Usage'](https://github.com/minrk/wurlitzer#usage). — Wayne, Jun 01 '23 at 16:17
@Timus I'm about 97.6% sure this doesn't spawn subprocesses. — AKX, Jun 01 '23 at 21:26
@AKX That's a rather specific confidence level :)) It was just a guess, since I ran into a similar problem a while ago when using multiple processes. — Timus, Jun 02 '23 at 15:43
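
To illustrate the native-code hypothesis raised in the comments: `redirect_stdout` only swaps the Python object `sys.stdout`, so anything written straight to file descriptor 1 bypasses it. The sketch below is not taken from the thread and does not confirm where scikit-learn's "Epoch" lines come from. `capture_fd_stdout` and `native_style_print` are hypothetical helpers, and `os.write(1, ...)` stands in for a C-level `printf`. Real C code may also keep its own stdio buffer, which this sketch does not flush; wurlitzer handles that case.

```python
import contextlib
import io
import os
import sys
import tempfile


@contextlib.contextmanager
def capture_fd_stdout():
    """Hypothetical helper: point file descriptor 1 at a temp file and collect its contents."""
    captured = {"text": ""}
    sys.stdout.flush()                  # flush Python-level buffer before switching
    saved_fd = os.dup(1)                # keep a handle on the real stdout
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        os.dup2(tmp.fileno(), 1)        # fd 1 now writes into the temp file
        try:
            yield captured
        finally:
            sys.stdout.flush()
            os.dup2(saved_fd, 1)        # restore the real stdout
            os.close(saved_fd)
            tmp.seek(0)
            captured["text"] = tmp.read().decode(errors="replace")


def native_style_print(msg):
    # Stand-in for native output: writes to fd 1 directly, bypassing sys.stdout
    os.write(1, msg.encode())


buf = io.StringIO()
with capture_fd_stdout() as cap:
    with contextlib.redirect_stdout(buf):
        print("python-level line")            # goes through sys.stdout, lands in buf
        native_style_print("fd-level line\n")  # skips sys.stdout, lands in the temp file

py_text = buf.getvalue()   # what redirect_stdout saw
fd_text = cap["text"]      # what was written to fd 1
```

If the hypothesis holds, this matches the symptom in the question: the Python-level "convergence" message is captured by `redirect_stdout`, while the lines written at file-descriptor level are not.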
