UPDATE: 04 Jan 2015
I still have these issues. Our user base has grown, and I see all kinds of network errors. Our app sends out an email every time a network-related error occurs in the app.
Our app performs financial transactions, so re-submits are not idempotent, and I am very wary of enabling HttpClient's retry feature. We have added some response caching on the server to handle re-submits done explicitly by the user. However, we still have no solution that works without a bad user experience.
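To make the re-submit handling concrete, here is a minimal sketch of server-side de-duplication keyed by a request ID that the client generates once per user operation. This is only an illustration under assumptions, not our actual server code: the class name, the in-memory map, and the string "receipt" response are all hypothetical, and a real service would need persistent storage and expiry.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

/** Sketch: run a transaction at most once per client-generated request ID. */
class IdempotentSubmitSketch {

    // Hypothetical in-memory store; a real service would persist and expire entries.
    private final ConcurrentMap<String, String> responsesByRequestId =
            new ConcurrentHashMap<>();

    /**
     * Executes the transaction only if this requestId has not been seen.
     * A re-submit with the same ID gets the stored response back instead
     * of triggering a second execution.
     */
    String submit(String requestId, Supplier<String> transaction) {
        return responsesByRequestId.computeIfAbsent(requestId, id -> transaction.get());
    }

    public static void main(String[] args) {
        IdempotentSubmitSketch server = new IdempotentSubmitSketch();
        int[] executions = {0};

        // The client creates the ID once and reuses it on every retry.
        String requestId = UUID.randomUUID().toString();

        String first = server.submit(requestId, () -> "receipt-" + (++executions[0]));
        String retry = server.submit(requestId, () -> "receipt-" + (++executions[0]));

        System.out.println(first.equals(retry)); // true: the retry got the cached response
        System.out.println(executions[0]);       // 1: the transaction ran only once
    }
}
```

With something along these lines in place, a retry after a timeout would be safe as long as the client keeps sending the same request ID, whether the retry comes from the user or from the HTTP layer.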
Original Question
I have an Android app that posts data as part of a user operation. The data includes a few images, which I package as a Protobuf message (a byte array, in effect) and post to the server over an HTTPS connection.
Although the app works fine for the most part, we occasionally see connection errors. The issue has become more pronounced now that we have some users in areas with relatively slow networks (2G connections). However, it is not limited to slow connections; it is also seen by customers on WiFi and 3G connections.
Here are a few exceptions we see in our app logs.
The one below happens after 5 minutes, because I had set the socket timeout to 5 minutes. The app was trying to post 145 KB of data in this case.
Stack trace
java.net.SocketTimeoutException: Read timed out
    at org.apache.harmony.xnet.provider.jsse.NativeCrypto.SSL_read(Native Method)
    at org.apache.harmony.xnet.provider.jsse.OpenSSLSocketImpl$SSLInputStream.read(OpenSSLSocketImpl.java:662)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:103)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:191)
The one below happened after 2.5 minutes (the socket timeout was set to 5 minutes); the client was sending 144 KB of data.
javax.net.ssl.SSLException: Write error: ssl=0x5e4f4640: I/O error during system call, Broken pipe
    at org.apache.harmony.xnet.provider.jsse.NativeCrypto.SSL_write(Native Method)
    at org.apache.harmony.xnet.provider.jsse.OpenSSLSocketImpl$SSLOutputStream.write(OpenSSLSocketImpl.java:704)
    at org.apache.http.impl.io.AbstractSessionOutputBuffer.write(AbstractSessionOutputBuffer.java:109)
    at org.apache.http.impl.io.ContentLengthOutputStream.write(ContentLengthOutputStream.java:113)
The one below happened after 1 minute.
Stack trace
javax.net.ssl.SSLException: Connection closed by peer
    at org.apache.harmony.xnet.provider.jsse.NativeCrypto.SSL_do_handshake(Native Method)
    at org.apache.harmony.xnet.provider.jsse.OpenSSLSocketImpl.startHandshake(OpenSSLSocketImpl.java:378)
    at org.apache.harmony.xnet.provider.jsse.OpenSSLSocketImpl$SSLInputStream.<init>(OpenSSLSocketImpl.java:634)
    at org.apache.harmony.xnet.provider.jsse.OpenSSLSocketImpl.getInputStream(OpenSSLSocketImpl.java:605)
The one below happened after 77 seconds.
Stack trace
javax.net.ssl.SSLException: SSL handshake aborted: ssl=0x5e2baf00: I/O error during system call, Connection reset by peer
    at org.apache.harmony.xnet.provider.jsse.NativeCrypto.SSL_do_handshake(Native Method)
    at org.apache.harmony.xnet.provider.jsse.OpenSSLSocketImpl.startHandshake(OpenSSLSocketImpl.java:378)
    at org.apache.harmony.xnet.provider.jsse.OpenSSLSocketImpl$SSLInputStream.<init>(OpenSSLSocketImpl.java:634)
    at org.apache.harmony.xnet.provider.jsse.OpenSSLSocketImpl.getInputStream(OpenSSLSocketImpl.java:605)
    at org.apache.http.impl.io.SocketInputBuffer.<init>(SocketInputBuffer.java:70)
The one below happened after 15 seconds (the connect timeout is set to 15 seconds).
Time Taken : 15081
Stack trace
org.apache.http.conn.ConnectTimeoutException: Connect to /103.xx.xx.xx:443 timed out
    at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:121)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:144)
    at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:164)
    at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:119)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:365)
Here are the source code snippets that I use for posting the request:
HttpParams params = new BasicHttpParams();
HttpConnectionParams.setConnectionTimeout(params, 15000); //15 seconds
HttpConnectionParams.setSoTimeout(params, 300000); // 5 minutes
HttpClient client = getHttpClient(params);
HttpPost post = new HttpPost(uri);
post.setEntity(new ByteArrayEntity(requestByteArray));
HttpResponse httpResponse = client.execute(post);
....
public static HttpClient getHttpClient(HttpParams params) {
    try {
        KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType());
        trustStore.load(null, null);
        SSLSocketFactory sf = new TrustAllCertsSSLSocketFactory(trustStore);
        sf.setHostnameVerifier(SSLSocketFactory.STRICT_HOSTNAME_VERIFIER);
        HttpProtocolParams.setVersion(params, HttpVersion.HTTP_1_1);
        HttpProtocolParams.setContentCharset(params, HTTP.UTF_8);
        SchemeRegistry registry = new SchemeRegistry();
        registry.register(new Scheme("http", PlainSocketFactory.getSocketFactory(), 80));
        registry.register(new Scheme("https", sf, 443));
        ClientConnectionManager ccm = new ThreadSafeClientConnManager(params, registry);
        DefaultHttpClient client = new DefaultHttpClient(ccm, params);
        // The line below disables retrying of the HTTP request when the
        // connection times out.
        client.setHttpRequestRetryHandler(new DefaultHttpRequestRetryHandler(0, false));
        return client;
    } catch (Exception e) {
        return new DefaultHttpClient();
    }
}
I have read on some forums that we should use the HttpURLConnection class. As a hot fix, I changed the code to use https://code.google.com/p/basic-http-client/. Although it worked on my Samsung phone, it had some issue on the phone a customer was using: it could not even connect to our site. I had to roll it back, though I can revisit it if the root cause can be pinned on DefaultHttpClient.
Our web server is nginx, and our web service runs on Apache Tomcat. Customers are mostly using Android 4.1+ phones. The customer whose phone I retrieved the above stack traces from is using a Micromax A110Q with Android 4.2.1.
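For comparison, a plain HttpURLConnection version of the same POST might look roughly like the sketch below. It is written against the standard java.net API only and reuses the 15-second connect timeout and 5-minute socket timeout from my snippet. The class and method names are made up for illustration, and the Content-Type value is an assumption, since I have not stated what the server expects.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

/** Sketch: POST a byte array with HttpURLConnection using the timeouts from the question. */
class UrlConnectionPostSketch {

    /** Applies timeouts and body settings; no network I/O happens here. */
    static void configure(HttpURLConnection conn, int bodyLength) throws IOException {
        conn.setConnectTimeout(15000);  // 15 seconds, as in the question
        conn.setReadTimeout(300000);    // 5 minutes, as in the question
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        // Stream the body with a known length instead of buffering a second copy.
        conn.setFixedLengthStreamingMode(bodyLength);
        // Assumed content type for the Protobuf payload (not stated in the question).
        conn.setRequestProperty("Content-Type", "application/octet-stream");
    }

    /** Sends the body and returns the HTTP status code. */
    static int post(URL url, byte[] body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            configure(conn, body.length);
            OutputStream out = conn.getOutputStream();
            try {
                out.write(body);
            } finally {
                out.close();
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

With the names from my snippet, the call would be something like `int status = UrlConnectionPostSketch.post(new URL(uri), requestByteArray);`. This sketch does no certificate customization, so it relies on the platform's default trust handling.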
Any inputs on this will be highly appreciated. Thanks a lot!
Update:
- I noticed that we were not shutting down the connection manager, so I added the code below to the finally block of the code where I use the HTTP client.
if (client != null) { client.getConnectionManager().shutdown(); }
- Updated the nginx configuration to accept request bodies of up to 5 MB. The default limit is 1 MB, and some clients were submitting more than 1 MB, so the server was severing the connection with a 413 error.
client_max_body_size 5M;
- Also increased the nginx proxy read timeout so that it waits longer to receive data from the client.
proxy_read_timeout 300;
With the above changes, the errors have reduced a bit. In the last week, I have seen the following two types of errors:
org.apache.http.conn.ConnectTimeoutException: Connect to /103.xx.xx.xxx:443 timed out
- This happens after 15 seconds, which is my connect timeout. I assume it happens because the client cannot reach the server due to network slowness or, as @JaySoyer pointed out, possibly due to network switching.
java.net.SocketTimeoutException: SSL handshake timed out at org.apache.harmony.xnet.provider.jsse.NativeCrypto.SSL_do_handshake(Native Method)
- This happens when the socket timeout expires. I now use a socket timeout of 1 minute for small requests, and 3 and 6 minutes for packets up to 75 KB and larger, respectively.
However, these errors have reduced considerably: I now see 1 failure in 100 requests, compared with 1 in 10 with the earlier version of my code.
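The size-based socket timeouts mentioned above can be sketched as a small helper. The 1, 3, and 6 minute values and the 75 KB boundary come from my description; the cutoff for a "small" request is not something I have pinned down here, so it is passed in as a parameter, and the 10 KB used in the example is purely hypothetical. I am also assuming 1 KB = 1024 bytes.

```java
/** Sketch: choose a socket timeout from the payload size, per the tiers described above. */
class SocketTimeoutTiers {

    // "Up to 75 KB" from the post, assuming 1 KB = 1024 bytes.
    static final int LARGE_THRESHOLD_BYTES = 75 * 1024;

    /**
     * @param payloadBytes        size of the request body in bytes
     * @param smallThresholdBytes cutoff for a "small" request (an assumed, caller-chosen value)
     * @return socket timeout in milliseconds
     */
    static int socketTimeoutMillis(int payloadBytes, int smallThresholdBytes) {
        if (payloadBytes <= smallThresholdBytes) {
            return 60 * 1000;       // 1 minute for small requests
        }
        if (payloadBytes <= LARGE_THRESHOLD_BYTES) {
            return 3 * 60 * 1000;   // 3 minutes for packets up to 75 KB
        }
        return 6 * 60 * 1000;       // 6 minutes for anything larger
    }

    public static void main(String[] args) {
        int smallCutoff = 10 * 1024; // hypothetical cutoff, for illustration only
        System.out.println(socketTimeoutMillis(2 * 1024, smallCutoff));   // 60000
        System.out.println(socketTimeoutMillis(50 * 1024, smallCutoff));  // 180000
        System.out.println(socketTimeoutMillis(144 * 1024, smallCutoff)); // 360000
    }
}
```

The returned value would then be passed to HttpConnectionParams.setSoTimeout(params, ...) before creating the client for that request.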