Here is a minimal example of how to capture the output of the Twitter API filtered stream and send it to another API, in this case a FastAPI project, using HTTPie, following an example use case in its docs:

http --stream get \
    "https://api.twitter.com/2/tweets/search/stream" \
    "Authorization: Bearer $BEARER_TOKEN" |
    while read -r line; do
        echo "$line" | http post localhost:8888/posts
    done

This works fine, but now I am trying to alter the output of the Twitter stream using the command-line JSON processor jq:

http --stream get \
    "https://api.twitter.com/2/tweets/search/stream" \
    "Authorization: Bearer $BEARER_TOKEN" |
    jq --compact-output --monochrome-output '. | {id: .data.id, rules: .matching_rules}' |
    while read -r line; do
        echo "$line" | http post localhost:8888/posts
    done

This does not work, unfortunately. Running only http … | jq … works, but as soon as I append | while … nothing is printed at all, even if I omit the trailing | http … inside the loop and only echo the line.

So I assume that jq somehow breaks the streaming behavior, even though I use --compact-output to keep each JSON object on a single line and --monochrome-output to rule out any color issues with the shell. Why is that? How can I fix this?

How could I filter the Twitter stream otherwise?

mcnesium
  • I think that `jq` doesn’t operate on each line as it arrives but rather buffers the input. So you should probably move the `jq` call inside the loop and have it operate on `$line`. – Jakub Roztocil Sep 28 '21 at 16:56
  • You may want to consider jq's `--stream` option. According to the manual, it `parse[s] the input in streaming fashion, outputting arrays of path and leaf values (scalars and empty arrays or empty objects). For example, "a" becomes [[],"a"], and [[],"a",["b"]] becomes [[0],[]], [[1],"a"], and [[1,0],"b"]` – pmf Sep 28 '21 at 18:02
  • Alternatively, if the output from the first stream is an array, try changing your jq filter to `.[] | {id: .data.id, rules: .matching_rules}` in order to iterate over the elements – pmf Sep 28 '21 at 18:05
  • Please follow the [mcve] guidelines as much as possible. In particular, a (brief but informative) glimpse into the input as seen by jq would be helpful to those of us who don't already know much about the Twitter stream. – peak Sep 28 '21 at 19:53
  • Some things you're missing that might help to diagnose this: what does the input to jq look like; what does it actually do (e.g. does it exit with no output, does it sit there with no output but still running, is there an error message, is it using CPU) and what did you expect to see; which version of jq are you using; what shell are you using? Broadly speaking it _seems_ like this should work. The API is documented to return a stream of CRLF separated JSON objects and jq happily processes that form of input without buffering the whole thing. – Weeble Sep 29 '21 at 13:01
  • @mcnesium: what happens if you add an `echo` or `printf` in your loop? Is it printed? Have you tried the `--unbuffered` mode of jq? – knittl Sep 21 '22 at 11:11
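
Two of the suggestions in the comments above, jq's `--unbuffered` flag and running jq inside the loop, can be sketched roughly as follows. A likely explanation for the silence is that jq buffers its output in blocks when it writes to a pipe instead of a terminal, so on a low-volume stream nothing reaches `while read` until that buffer fills; `--unbuffered` asks jq to flush after each object. The function names `filter_tweets`, `filter_each_line`, and `forward_lines` are hypothetical helpers made up for this sketch, and it has not been run against the live Twitter stream.

```shell
# Sketch only: the helper names below are hypothetical, not part of jq or HTTPie.

# Variant 1: one long-running jq that flushes after every object it prints.
filter_tweets() {
    jq --unbuffered --compact-output '{id: .data.id, rules: .matching_rules}'
}

# Variant 2: start a fresh jq for each input line, so each result is
# written out when that jq process exits. Blank keep-alive lines are
# whitespace-only input, for which jq prints nothing.
filter_each_line() {
    while IFS= read -r line; do
        printf '%s\n' "$line" |
            jq --compact-output '{id: .data.id, rules: .matching_rules}'
    done
}

# Pass each non-empty line on stdin to the command given as arguments.
forward_lines() {
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        printf '%s\n' "$line" | "$@"
    done
}

# Intended use with the pipeline from the question (not executed here):
#   http --stream get \
#       "https://api.twitter.com/2/tweets/search/stream" \
#       "Authorization: Bearer $BEARER_TOKEN" |
#       filter_tweets |
#       forward_lines http post localhost:8888/posts
```

Variant 1 keeps a single jq process alive and is the smaller change to the original pipeline. Variant 2 spawns one jq per line, which costs more per tweet but does not depend on how jq buffers its output.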

0 Answers