
I'm trying to read a CSV text file from S3 and then send each of its lines to a distributed queue to be processed.

When trying to read it, I'm getting a "java.net.SocketException: Socket is closed" exception at different points of the file (at different points in different executions). This is the code:

    AmazonS3 s3 = new AmazonS3Client(new PropertiesCredentials(
            MyClass.class.getResourceAsStream("myCredentials.properties")));

    String bucketName = "myBucket";
    String key = "myFile";

    S3Object object = s3.getObject(new GetObjectRequest(bucketName, key));

    InputStream in = object.getObjectContent();

    BufferedReader readerS3 = new BufferedReader(
            new InputStreamReader(in, Charset.forName(fileInfo.getEncoding())));

    try {
        String line = null;
        while ((line = readerS3.readLine()) != null) {
            // Sending the line to a distributed queue
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            in.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

Any idea on how to solve this issue?

UPDATE:

This exception occurs from the second time I run the method onward; if I stop the whole program and run it again, the method works fine on its first run.

Fgblanch
  • No, but it's inside a Struts Action class – Fgblanch Jul 11 '12 at 20:27
  • Try running this while retaining explicit references to all these objects (if being used and this hasn't already been tried): `S3Client`, `S3Object` and `AmazonS3Client`. There may be a problem with GC picking up objects and closing connections. – jn1kk Jul 11 '12 at 20:32
  • All those objects are in the same scope while running, so their references should be maintained, right? What about moving its execution to a different thread? – Fgblanch Jul 11 '12 at 20:36
  • I mean also make sure you do not do something like this: new `Conn c = AmazonS3ClientInstance.getConn()`. In this case, as you may know, `AmazonS3ClientInstance` may be picked up by GC (where `AmazonS3ClientInstance` is of type `AmazonS3Client`). Would be helpful to post complete code, if possible. – jn1kk Jul 11 '12 at 20:38
  • No, AmazonS3 and S3Object are not modified. I've updated the code – Fgblanch Jul 11 '12 at 20:42
  • How big is the file? Perhaps you need to change the socket timeout (maybe 0 -> infinite)? `http://docs.amazonwebservices.com/AWSJavaSDK/latest/javadoc/com/amazonaws/ClientConfiguration.html#getSocketTimeout()` – jn1kk Jul 11 '12 at 20:53
  • Bad URL (can't edit comment), good URL -> http://tinyurl.com/d75xffa – jn1kk Jul 11 '12 at 20:58
  • I'll give it a try, although the file is only 10 MB and the socket sometimes gets closed after only 20 lines have been read – Fgblanch Jul 11 '12 at 21:00
  • @jsn It has nothing to do with (a) multithreading, (b) garbage-collection, (c) file size, or (d) luck. The only relevant phrase here is 'in different executions'. – user207421 Jul 12 '12 at 00:09
  • @jsn The only variable here that doesn't appear to be local is 'in', the socket input stream. Local variables are subject to neither GC nor multithreading, and the local streams wrapped around the socket input stream will preserve the latter and its socket from GC as well. Network problems manifest themselves as exceptions or infinite blocks. File size doesn't affect this code, as he isn't reading it all into memory. There's not much left except a premature close by this very code or some other code. – user207421 Jul 12 '12 at 13:24
  • @jsn Increasing the socket timeout does not solve the problem. It's really strange: right after the project compiles it works, but the subsequent runs don't – Fgblanch Jul 12 '12 at 21:01
  • Local *variables* aren't subject to GC. The objects they point to are. You are closing the socket prematurely somewhere. This is a bug in your code, not a server or network problem. – user207421 Sep 23 '13 at 10:00
  • @jsn Changing the read timeout won't fix this problem. You seem to be just guessing. – user207421 Sep 23 '13 at 21:24

7 Answers


As suggested by "jn1kk" in the comments on the question, the problem is that you need to configure the AmazonS3 client with a ClientConfiguration:

    ClientConfiguration config = new ClientConfiguration();
    config.setSocketTimeout(0); // 0 means infinite: the read will no longer time out
    AmazonS3 s3 = new AmazonS3Client(/* credentials */, config);
yegor256

Thanks @jsn, your suggestion was exactly my issue.

I have a method that returns just the InputStream, so the AmazonS3 object goes out of scope and gets garbage collected, which causes the InputStream to be closed.

I've made the method keep a reference to the AmazonS3 object, and that fixed my issue.
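A minimal sketch of that fix pattern, using only plain java.io (the AWS client is stood in for by a generic `Object`, and `ClientPinningStream` is a hypothetical helper name): a `FilterInputStream` that holds a strong reference to the client that produced it, so the client stays reachable, and its connection stays open, for as long as the stream is in use.

```java
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical helper: wraps a stream and pins a strong reference to the
// client object that produced it, so the client cannot be garbage-collected
// (taking its open connection with it) while the stream is still being read.
class ClientPinningStream extends FilterInputStream {
    private final Object client; // e.g. the AmazonS3 instance from the question

    ClientPinningStream(InputStream in, Object client) {
        super(in);
        this.client = client;
    }
}
```

Returning `new ClientPinningStream(object.getObjectContent(), s3)` instead of the bare stream keeps the client alive until the caller closes the stream.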

Sarel Botha

Closing either the input stream or the output stream of a socket, or any stream/reader/writer wrapper around them, closes the socket (and therefore the other stream as well).
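The same propagation can be seen with plain java.io wrappers alone; a small sketch with no AWS classes, using a stream that records whether it was closed:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class CloseDemo {
    // A stream that records whether close() was called on it.
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;
        TrackingStream(byte[] b) { super(b); }
        @Override public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    public static void main(String[] args) throws IOException {
        TrackingStream in = new TrackingStream("line1\nline2\n".getBytes("UTF-8"));
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, "UTF-8"));
        System.out.println(reader.readLine()); // prints "line1"
        reader.close();                 // closing the outermost wrapper...
        System.out.println(in.closed);  // ...closes the wrapped stream: prints "true"
    }
}
```

So if any code path closes `in`, the reader, or (with a socket) the matching output stream, every later read fails with a closed-socket/stream error.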

user207421
  • But no stream/reader/writer is closed during the loop execution. – Fgblanch Jul 12 '12 at 06:44
  • @Fgblanch It is closed *after* the loop execution, *before* the next call to the method. And/or you are closing the socket, or one of its streams, somewhere else. – user207421 Jul 12 '12 at 06:49

Maybe you should be closing readerS3 in your finally instead of 'in'. I.e. close the outermost object, which can close its wrapped children.

If you close 'in' first, then the InputStreamReader and BufferedReader are still open, and if they try to do anything with the stream they wrap, it will already be closed.
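A sketch of the corrected cleanup, with the question's S3 stream replaced by an in-memory stream so it runs standalone (`readAllLines` is an illustrative name, not from the question):

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class ReadAndClose {
    // Reads every line from 'in', closing only the outermost wrapper in
    // 'finally'; BufferedReader.close() closes the InputStreamReader and
    // the underlying stream for us.
    static int readAllLines(InputStream in, String encoding) throws IOException {
        BufferedReader readerS3 = new BufferedReader(new InputStreamReader(in, encoding));
        int lines = 0;
        try {
            String line;
            while ((line = readerS3.readLine()) != null) {
                lines++; // send the line to the distributed queue here
            }
        } finally {
            readerS3.close(); // not in.close()
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("a\nb\nc\n".getBytes("UTF-8"));
        System.out.println(readAllLines(in, "UTF-8")); // prints 3
    }
}
```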

Marc

There's no need to keep reinitializing s3.

In your onCreate, make the calls that initialize the s3Object and s3Client.

Then in your AsyncTask just use them.

By doing it this way, your s3Client keeps the same connection and never closes the socket to S3 while the read loop is running.

    AmazonS3Client s3Client;
    S3Object s3Object;

    onCreate() {
        s3Client = new AmazonS3Client(new BasicSessionCredentials(
                Constants.ACCESS_KEY_ID, Constants.SECRET_KEY, Constants.TOKEN));
        s3Object = new S3Object();
    }

    doInBackground() {
        s3Object = s3Client.getObject(new GetObjectRequest(Constants.getBucket(), id + ".png"));
    }

I had the same problem, and this thread helped me resolve it: S3 Java client fails a lot with "Premature end of Content-Length delimited message body" or "java.net.SocketException Socket closed"

Basically, I was creating a new S3Client object for every file, but at some point that object was garbage collected. So instead of doing that, I changed my class to use a singleton:

    private static AmazonS3 s3Client;

    static {
        s3Client = new AmazonS3Client(new BasicAWSCredentials(AWSKey, AWSSecretKey));
    }

    public AmazonS3 getService() {
        return s3Client;
    }
lildesert

In my case, I got this error due to high JVM memory usage.

Reducing memory usage of the application or increasing the memory available to the JVM solved the issue.

Ermiya Eskandary