
I'm trying to capture terminal output to store it elsewhere. Specifically, I'm trying to retrieve the training loss from sklearn.linear_model.LogisticRegression. This is unfortunately not stored anywhere, but is printed when verbose=1.

I am trying to use contextlib.redirect_stdout to do this. However, it only captures the last line of output.

Full terminal output without redirection:

[Parallel(n_jobs=8)]: Using backend ThreadingBackend with 8 concurrent workers.
Epoch 1, change: 1.00000000
Epoch 2, change: 0.22057155
Epoch 3, change: 0.12789722
Epoch 4, change: 0.06779011
Epoch 5, change: 0.04539726
Epoch 6, change: 0.03062415
Epoch 7, change: 0.02026218
Epoch 8, change: 0.01470303
Epoch 9, change: 0.00691171
Epoch 10, change: 0.00415176
Epoch 11, change: 0.00230851
Epoch 12, change: 0.00145497
Epoch 13, change: 0.00110029
convergence after 14 epochs took 1 seconds
[Parallel(n_jobs=8)]: Done   1 out of   1 | elapsed:    0.5s finished

With redirect_stdout, only the following line is captured:

convergence after 14 epochs took 1 seconds

Code sample:

from sklearn.linear_model import LogisticRegression
from contextlib import redirect_stdout
import io

model = LogisticRegression(
    random_state=42,
    C=0.001,
    penalty='l1',
    max_iter=1000,
    solver='saga',
    verbose=1,
)

# X_train and y_train are the training data (not shown here)
with redirect_stdout(io.StringIO()) as f:
    model.fit(X_train, y_train)
s = f.getvalue()
print(s)

I've had the same results when using sys.stdout directly or in a Jupyter notebook using %%capture.

Does anyone know what is causing this or how to extract the full output? Thanks!

  • The "Epoch" messages seem likely to be printed by native code, over which `redirect_stdout` has no control... – AKX Jun 01 '23 at 10:55
  • This could be it -- I thought it might just be getting overwritten, but I tried outputting directly to a file and it successfully saves text before and after the actual epochs. – neverreally Jun 01 '23 at 12:48
  • My guess would be that this is due to the use of multiprocessing. See [here](https://docs.python.org/3.11/library/contextlib.html#contextlib.redirect_stdout): _"... is not suitable for use in library code and most **threaded applications**. It also has **no effect on the output of subprocesses**."_ – Timus Jun 01 '23 at 14:33
  • I suspect it is what Timus says. If it were what AKX says, you'd use [wurlitzer](https://github.com/minrk/wurlitzer). I don't think it will help in this case, but for anyone encountering a case like the one AKX seems to be referencing, i.e., actual C-level output, see especially the section "Forward C-level stdout/stderr to Python sys.stdout/stderr, which may already be forwarded somewhere by the environment, e.g. IPython" on [the wurlitzer documentation page under 'Usage'](https://github.com/minrk/wurlitzer#usage). – Wayne Jun 01 '23 at 16:17
  • @Timus I'm about 97.6% sure this doesn't spawn subprocesses. – AKX Jun 01 '23 at 21:26
  • @AKX That's a rather specific confidence level :)) It was just a guess, since I ran into a similar problem a while ago when using multiple processes. – Timus Jun 02 '23 at 15:43
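
If the first comment is right and the "Epoch" lines are written by native code straight to file descriptor 1, one option is to redirect the descriptor itself rather than `sys.stdout`. The following is a minimal sketch of that idea, not something confirmed against scikit-learn's internals. `capture_fd_stdout` and `parse_epoch_changes` are hypothetical helper names, the C-library flush through `ctypes.CDLL(None)` is assumed to work only on POSIX-like systems, and the regular expression assumes the line format shown in the question.

```python
import ctypes
import os
import re
import sys
import tempfile
from contextlib import contextmanager


def _flush_c_stdio():
    """Best-effort flush of the C library's stdio buffers (POSIX assumption)."""
    try:
        ctypes.CDLL(None).fflush(None)
    except Exception:
        pass  # e.g. on platforms where CDLL(None) is not supported


@contextmanager
def capture_fd_stdout():
    """Hypothetical helper: capture everything written to file descriptor 1."""
    captured = {"text": ""}
    sys.stdout.flush()
    _flush_c_stdio()
    saved_fd = os.dup(1)              # keep a handle on the real stdout
    with tempfile.TemporaryFile() as tmp:
        os.dup2(tmp.fileno(), 1)      # fd 1 now points at the temp file
        try:
            yield captured
        finally:
            sys.stdout.flush()
            _flush_c_stdio()
            os.dup2(saved_fd, 1)      # restore the real stdout
            os.close(saved_fd)
            tmp.seek(0)
            captured["text"] = tmp.read().decode(errors="replace")


# Assumes lines shaped like "Epoch 3, change: 0.12789722", as in the question.
EPOCH_RE = re.compile(r"Epoch (\d+), change: (\d+\.\d+)")


def parse_epoch_changes(text):
    """Return a list of (epoch, change) pairs found in the captured text."""
    return [(int(epoch), float(change)) for epoch, change in EPOCH_RE.findall(text)]


if __name__ == "__main__":
    with capture_fd_stdout() as captured:
        # os.write bypasses sys.stdout, standing in for native-code output here.
        os.write(1, b"Epoch 1, change: 1.00000000\n")
    print(parse_epoch_changes(captured["text"]))
```

In the question's setup, `model.fit(X_train, y_train)` would take the place of the `os.write` call. Because the redirect acts on the descriptor rather than on a Python object, it is process-wide, so it also applies to output written by other threads while the block is active.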
