A Node.js example for a Google API contains the following server code. It is taken from Google's examples on GitHub and simplified for this question:

const audio = {
    content: fs.readFileSync(filename).toString('base64'),
};

const request = {
    audio: audio,
};

return client
    .longRunningRecognize(request)
    .then((data) => {
       ...
    })

The use of fs.readFileSync surprises me, since I thought async was the norm for Node.js server code. Can I replace it with fs.readFile, the async version?

EDIT: The above part seems to be answered at least partly by a simple "yes", thanks. But I do not understand how to fix the base64 part (and avoid having big files in memory).

However what I really want to do is to stream the file with sockets to this API. Can I do that (if I struggle with how to do it long enough...)?

EDIT: I would appreciate some example of using sockets here. How do I stream to something like the structure above?

EDIT 2: There is a library, socket.io-file, but I have no idea how to fit this into the API interface above. Can it be done?

EDIT 3: In the code for the API that I have, the file is already on the server, but I want to upload the file. (Sorry for the confusion.)

Leo
  • Yes. Yes you can. – gforce301 Apr 11 '18 at 19:52
  • @gforce301 That gives me some hope. (But was that about the `readFile` part?) Ah, 2 "yes". More hope! – Leo Apr 11 '18 at 19:53
  • Possible duplicate of [Difference between readFile and readFileSync](https://stackoverflow.com/questions/17604866/difference-between-readfile-and-readfilesync) – Taki Apr 11 '18 at 19:54
  • You aren't going to be able to call it with a chained function like this, however. – Mark Apr 11 '18 at 19:54
  • It was about the entire post. What you are asking about can be done with nodejs. – gforce301 Apr 11 '18 at 19:54
  • @Mark_M Do you mean the `base64` conversion? – Leo Apr 11 '18 at 19:55
  • Yes. Because `readFile` doesn't return anything. Also worth looking at https://nodejs.org/api/fs.html#fs_class_fs_readstream if you will be turning around and streaming the contents. – Mark Apr 11 '18 at 19:57
  • @Taki Thanks, I saw that, but I do not understand whether `readFile` and `readFileSync` are always interchangeable. – Leo Apr 11 '18 at 19:57
  • @Mark_M Is there anything that can help me base64-convert the stream (so I do not have to go through all the mistakes myself)? – Leo Apr 11 '18 at 20:01
  • @Mark_M Ah, yes, there is something here: https://www.npmjs.com/package/base64-stream – Leo Apr 11 '18 at 20:11
  • `fs.readFile()` does not "avoid having big files in memory". Maybe you were referring to `fs.createReadStream()`? – Patrick Roberts Apr 11 '18 at 20:34
  • @PatrickRoberts Hm, yes probably. I am very unsure about these things yet. It does not seem easy to avoid having the file in memory with `fs.createReadStream()` either: https://stackoverflow.com/questions/43256505/replacing-fs-readfile-with-fs-createreadstream-in-node-js – Leo Apr 11 '18 at 20:40
  • @Leo any situation where you need to send the whole file in one chunk will have that issue. Of course, that's a contrived example, and in most cases, simply `.pipe()`ing the stream will minimize memory usage and eventually emit the whole file to the destination writable stream in chunks. – Patrick Roberts Apr 11 '18 at 20:46
  • @PatrickRoberts Yes, and it is very different from my case where I want to upload (not download). My confusion. :-( But I don't understand how to connect all this to the API in my question. – Leo Apr 11 '18 at 20:55
  • There is no API in your question. It's just an object literal with a value loaded from file content encoded into a base64 string. There's no indication of when this object needs to be made available, or if it's necessary to have the entire string at once. Your question is very open-ended. – Patrick Roberts Apr 11 '18 at 20:57
  • @PatrickRoberts The object in my question is used in calling the API. It is just a parameter in the call example I have. The only parameter, actually. Is there anything unclear there? – Leo Apr 11 '18 at 21:00
  • Yes, every point I just made. Where is the code using this object? Where's the documentation, or at least a linked reference to the particular API call you're making? Is the object converted to JSON before it's sent? If so, does the API call allow it to be streamed, or does it have to be buffered and sent in one chunk? You haven't answered any of those questions. – Patrick Roberts Apr 11 '18 at 21:04
  • @PatrickRoberts Thanks. I have added what I think could be the relevant parts from Google's example now. – Leo Apr 11 '18 at 21:11
  • Without a link like I requested, are we simply supposed to guess the implementation for `client.longRunningRecognize()`? – Patrick Roberts Apr 11 '18 at 21:12
  • @PatrickRoberts That is the API from Google. Is the implementation of this public? Then it is on GitHub. It must be here then: https://github.com/googleapis/nodejs-speech – Leo Apr 11 '18 at 21:14
  • [Here's the link you should have provided in the original draft of your question](https://cloud.google.com/nodejs/docs/reference/speech/1.3.x/v1.SpeechClient#longRunningRecognize). – Patrick Roberts Apr 11 '18 at 21:15
  • @PatrickRoberts Oh, I see, but that documentation does not fit with the code from Google. All this is in beta. The documentation says that only a gcs URI could be used, but the example is using `fs.readFileSync`. ... Eh, no. I was wrong. Now I see you provide either `content` or `uri`. I admit you were very right. – Leo Apr 11 '18 at 21:18
  • @PatrickRoberts So, back to my original (second) question: Can I somehow stream to this API? (Or do I have to use gcs?) – Leo Apr 11 '18 at 21:22

1 Answer

If you had looked at the documentation for client.longRunningRecognize(), you would have seen that request.audio could instead contain a URI pointing to an audio file in Google Cloud Storage.

Based on this approach, you could stream fs.createReadStream(filename) as an upload to that location, then call client.longRunningRecognize() with a request object that sets RecognitionAudio#uri to the destination of the streamed upload, instead of setting RecognitionAudio#content to a base64 string of the file.
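A minimal sketch of that upload-then-recognize flow might look like the following. The function name recognizeViaGcs and its parameter names are hypothetical. It assumes `bucket` is a Bucket object from the @google-cloud/storage package and `client` is a SpeechClient from @google-cloud/speech, both passed in by the caller, and that `config` holds whatever recognition settings your audio needs (the question's simplified snippet omitted them).

```javascript
const fs = require('fs');
const { pipeline } = require('stream');
const { promisify } = require('util');

// Promise-returning wrapper around stream.pipeline.
const pipelineAsync = promisify(pipeline);

// Hypothetical helper: streams a local file into a bucket object, then asks
// the speech client to recognize it by URI instead of by inline content.
// Assumption: `bucket` behaves like a @google-cloud/storage Bucket and
// `client` like a @google-cloud/speech SpeechClient.
async function recognizeViaGcs({ filename, bucket, objectName, client, config }) {
  // The file flows through in chunks; it is never held in memory as a whole.
  await pipelineAsync(
    fs.createReadStream(filename),
    bucket.file(objectName).createWriteStream()
  );

  const request = {
    audio: { uri: `gs://${bucket.name}/${objectName}` },
    config,
  };

  // longRunningRecognize resolves to an array whose first item is the
  // operation; operation.promise() resolves once recognition has finished.
  const [operation] = await client.longRunningRecognize(request);
  const [response] = await operation.promise();
  return response;
}

// Example wiring (assumes both Google packages are installed and credentials
// are configured; bucket name and config values are placeholders):
// const { Storage } = require('@google-cloud/storage');
// const speech = require('@google-cloud/speech');
// recognizeViaGcs({
//   filename: 'audio.raw',
//   bucket: new Storage().bucket('my-bucket'),
//   objectName: 'audio.raw',
//   client: new speech.SpeechClient(),
//   config: { encoding: 'LINEAR16', sampleRateHertz: 16000, languageCode: 'en-US' },
// }).then(console.log);
```

Passing the bucket and client in as arguments keeps the helper independent of how credentials are set up.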

Streaming the upload this way would avoid blocking the event loop, as fs.readFileSync() does, and it would avoid buffering the whole file in memory, as both fs.readFileSync() and fs.readFile() do.
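To make the difference concrete, here is a small sketch contrasting the two read styles. The helper names buildInlineRequest and countBytes are hypothetical and only serve as illustration: the first is the async drop-in for the question's fs.readFileSync() call, which no longer blocks but still holds the entire file in memory; the second reads the same file as a stream, one chunk at a time.

```javascript
const fs = require('fs');

// Non-blocking replacement for the readFileSync version in the question.
// The whole file, plus its base64 copy, is still held in memory at once.
async function buildInlineRequest(filename) {
  const buffer = await fs.promises.readFile(filename);
  return { audio: { content: buffer.toString('base64') } };
}

// Streamed read: each chunk is handled and then released, so memory use
// does not grow with the file size. Counting bytes stands in for real work
// such as piping the chunks to an upload.
async function countBytes(filename) {
  let total = 0;
  for await (const chunk of fs.createReadStream(filename)) {
    total += chunk.length;
  }
  return total;
}
```

Because fs.promises.readFile() returns a promise, the request object has to be built after awaiting it. That is why the original chained call cannot simply swap readFileSync for readFile.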

It also appears that passing a protocol buffer to RecognitionAudio#content would allow you to stream the audio data directly to the API request, though looking into that approach would require more knowledge about how this works.

Patrick Roberts
  • Thanks. Yes, I was aware of that option, but wanted to avoid storing those files on gcs. But after all the comments my conclusion is I can't do that. So I have to use gcs. – Leo Apr 11 '18 at 21:25
  • This long discussion actually helped me a lot. Thanks for the patience. – Leo Apr 11 '18 at 21:30