SVN + Apache HTTPD - 500 Internal Server Error after several checkouts using Jenkins

Question

Backstory:
We decided to migrate the SVN from On-Prem to Cloud.
Both servers run CentOS 7. The SVN version On-Prem is 1.8.15, while on Cloud it's 1.8.19.
The access protocol changed from SVN (port 3690) to HTTPS (443), so the httpd setup is new.

To migrate the repository, I first used a plain 'rsync' between the servers to copy the whole repository. That worked in the sense that the functionality and all the revisions were there, but I got the error described below.
I thought it might be some kind of DB issue, so I then re-imported the repository with the SVN-native 'svnadmin dump' and 'svnadmin load' commands. The issue persists.

SVN is accessed over HTTPS through Apache HTTPD. Everything seems to work fine and all the functionality is there, but after several checkouts I start getting a 500 Internal Server Error.

Currently, the issue is triggered by a Jenkins pipeline that checks out from SVN. Here is the error output:

ERROR: Failed to check out https://svn-repo/path/to/files
org.tmatesoft.svn.core.SVNException: svn: E175002: OPTIONS of '/path/to/files': 500 Internal Server Error (https://svn-repo)
svn: E175002: OPTIONS request failed on '/path/to/files'

I don't think this is a client-side (Jenkins) problem at the moment, because the same error occurred when I checked out with the SVN client on my PC.

Here are the logs from HTTPD:

10.10.10.16 - - [17/Aug/2020:12:45:21 +0300] "OPTIONS /path/to/files HTTP/1.1" 401 381
10.10.10.16 - user [17/Aug/2020:12:45:21 +0300] "OPTIONS /path/to/files HTTP/1.1" 500 541

As you can see, I receive a 401 before the 500. However, as I said, the checkouts run one after the other, so the earlier ones could not have succeeded if the credentials were invalid (the permissions are identical for the whole repo, not path-based).

Side note: The 401 is a consequence of how the WebDAV protocol works: it allows unauthenticated access, so the client always tries that first. If it gets back a 401, it then sends the credentials.
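To separate the expected 401 challenge from the real failure in a longer access log, a small script can help. The following is a minimal sketch, assuming the log uses the same Common Log Format as the two lines above; the names `parse_line` and `failed_after_challenge` are hypothetical helpers, not part of any SVN or Apache tooling.

```python
import re

# Pattern for the Common Log Format lines shown in the access log excerpt.
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+)$'
)


def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    entry["status"] = int(entry["status"])
    return entry


def failed_after_challenge(lines):
    """Return 500 entries whose previous request for the same
    host, method, and path was answered with a 401 challenge."""
    previous = {}
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip lines in a different format
        key = (entry["host"], entry["method"], entry["path"])
        if entry["status"] == 500 and previous.get(key) == 401:
            hits.append(entry)
        previous[key] = entry["status"]
    return hits


# The two access log lines from the question.
SAMPLE = [
    '10.10.10.16 - - [17/Aug/2020:12:45:21 +0300] "OPTIONS /path/to/files HTTP/1.1" 401 381',
    '10.10.10.16 - user [17/Aug/2020:12:45:21 +0300] "OPTIONS /path/to/files HTTP/1.1" 500 541',
]

if __name__ == "__main__":
    for e in failed_after_challenge(SAMPLE):
        print(e["time"], e["user"], e["method"], e["path"], e["status"])
```

Run against the full access log, this lists only the authenticated requests that failed, which makes it easier to line their timestamps up with the error log.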

---- Progress Report ----
It's been brought to my attention that 'SVNAllowBulkUpdates On' could be the cause of this issue. I tried running the pipeline with both 'Prefer' and 'Off', but neither fixed the issue.

Possibly related issue: Large SVN checkout fails sporadically

I upgraded the SVN to version 1.10 successfully. After upgrading and running the pipeline once more, I saw the following error in the SVN error log:

[Thu Oct 01 17:25:55.268333 2020] [dav:error] [pid 9465] [client 11.11.11.11:39580] Provider encountered an error while streaming a REPORT response. [500, #0]
[Thu Oct 01 17:25:55.268355 2020] [dav:error] [pid 9465] [client 11.11.11.11:39580] A failure occurred while driving the update report editor [500, #104]
[Thu Oct 01 17:25:55.268360 2020] [dav:error] [pid 9465] [client 11.11.11.11:39580] Connection reset by peer [500, #104]

Since the log points to a client-side issue, I started searching for configuration changes related to the client. I added the following to "~/.subversion/servers":

http-timeout = 259200

Source: https://confluence.atlassian.com/fishkb/svn-operations-taking-longer-than-an-hour-time-out-229180362.html
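For reference, 259200 seconds is 72 hours. The option only takes effect if it sits under a section of the servers file such as `[global]` (or a server group). Below is a minimal sketch for checking what a given servers file actually sets, assuming the file parses as plain INI; `http_timeout` is a hypothetical helper name, and whether the Jenkins SVN client reads this particular file is not established here.

```python
import configparser


def http_timeout(servers_text, section="global"):
    """Return http-timeout (in seconds) from the given section,
    or None if the section or option is absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(servers_text)
    if not parser.has_option(section, "http-timeout"):
        return None
    return parser.getint(section, "http-timeout")


# Assumed minimal content of ~/.subversion/servers for illustration.
SAMPLE = """\
[global]
http-timeout = 259200
"""

if __name__ == "__main__":
    value = http_timeout(SAMPLE)
    print("http-timeout:", value, "seconds")
```

To check a real file, pass the text of the servers file that belongs to the user account running the checkout instead of `SAMPLE`.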

Unfortunately, this still did not help.

Later, I ran 'tcpdump' on port 443 (tcpdump -nnS -i ens5 port 443) to inspect the headers of the incoming and outgoing packets. I ran the capture on the Jenkins Slave and the SVN server simultaneously, and found that at a certain point they stopped exchanging data for almost exactly one minute. After that, the SVN server sent one more data packet followed by a FIN, and the Jenkins Slave, which then tried to send more data, reset the connection:

17:14:56.976631 IP SVN > Jenkins-Slave: Flags [.], ack 4264260017, win 235, options [nop,nop,TS val 1054806523 ecr 1461582542], length 0
17:14:56.976961 IP SVN > Jenkins-Slave: Flags [P.], seq 394455454:394456190, ack 4264260017, win 235, options [nop,nop,TS val 1054806523 ecr 1461582542], length 736
17:14:56.983612 IP Jenkins-Slave > SVN: Flags [P.], seq 4264260017:4264260557, ack 394456190, win 279, options [nop,nop,TS val 1461582631 ecr 1054806523], length 540
17:14:56.983688 IP Jenkins-Slave > SVN: Flags [P.], seq 4264260557:4264260693, ack 394456190, win 279, options [nop,nop,TS val 1461582631 ecr 1054806523], length 136
17:14:57.065351 IP SVN > Jenkins-Slave: Flags [.], ack 4264260693, win 252, options [nop,nop,TS val 1054806611 ecr 1461582631], length 0
17:15:57.124806 IP SVN > Jenkins-Slave: Flags [P.], seq 394456190:394457011, ack 4264260693, win 252, options [nop,nop,TS val 1054866672 ecr 1461582631], length 821
17:15:57.124832 IP SVN > Jenkins-Slave: Flags [F.], seq 394457011, ack 4264260693, win 252, options [nop,nop,TS val 1054866672 ecr 1461582631], length 0
17:15:57.125768 IP Jenkins-Slave > SVN: Flags [P.], seq 4264260693:4264260724, ack 394457012, win 300, options [nop,nop,TS val 1461642773 ecr 1054866672], length 31
17:15:57.125804 IP Jenkins-Slave > SVN: Flags [R.], seq 4264260724, ack 394457012, win 300, options [nop,nop,TS val 1461642774 ecr 1054866672], length 0

I obfuscated the IPs for obvious reasons.
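The pause in the capture above can also be found mechanically in a longer trace. The following is a minimal sketch, assuming tcpdump output in the same text format as shown; `parse_packets` and `find_gaps` are hypothetical helper names, and the 5-second threshold is an arbitrary choice. The capture alone does not establish whether the pause corresponds to a timeout configured on either side.

```python
import re
from datetime import datetime, timedelta

# Matches the start of a tcpdump line: time, source, destination, TCP flags.
TS_RE = re.compile(
    r'^(\d{2}:\d{2}:\d{2}\.\d{6}) IP (\S+) > (\S+): Flags \[([^\]]+)\]'
)


def parse_packets(lines):
    """Return (timestamp, source, destination, flags) tuples."""
    packets = []
    for line in lines:
        m = TS_RE.match(line.strip())
        if m:
            # Timestamps carry no date, so a capture crossing midnight
            # is not handled by this sketch.
            ts = datetime.strptime(m.group(1), "%H:%M:%S.%f")
            packets.append((ts, m.group(2), m.group(3), m.group(4)))
    return packets


def find_gaps(packets, threshold=timedelta(seconds=5)):
    """Return (previous, current, delta) for consecutive packets
    separated by at least `threshold`."""
    gaps = []
    for prev, cur in zip(packets, packets[1:]):
        delta = cur[0] - prev[0]
        if delta >= threshold:
            gaps.append((prev, cur, delta))
    return gaps


# Three consecutive lines taken from the capture in the question.
SAMPLE = [
    "17:14:57.065351 IP SVN > Jenkins-Slave: Flags [.], ack 4264260693, win 252, options [nop,nop,TS val 1054806611 ecr 1461582631], length 0",
    "17:15:57.124806 IP SVN > Jenkins-Slave: Flags [P.], seq 394456190:394457011, ack 4264260693, win 252, options [nop,nop,TS val 1054866672 ecr 1461582631], length 821",
    "17:15:57.124832 IP SVN > Jenkins-Slave: Flags [F.], seq 394457011, ack 4264260693, win 252, options [nop,nop,TS val 1054866672 ecr 1461582631], length 0",
]

if __name__ == "__main__":
    for prev, cur, delta in find_gaps(parse_packets(SAMPLE)):
        print(f"{delta.total_seconds():.3f}s of silence before "
              f"{cur[1]} > {cur[2]} [{cur[3]}]")
```

Feeding the whole capture file through `find_gaps` shows every stall and which side spoke first afterwards, which can then be compared against the timestamps in the httpd error log.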

These are logs from the client's perspective, here Jenkins. What do the logs on httpd say? Verify that your httpd is still running. If it crashes by itself, debug that, and if required post here. But as is, it could be so many things that it is impossible to really tell. — Nic3500, Aug 18 '20 at 02:17
You're right, forgot to add the HTTPD logs here because they don't give out too much info as well. I added them to the OP now, thanks for your input. — just-another-dude, Aug 18 '20 at 08:50
ah ok, you will have to dig further and look at the Tomcat logs. — Nic3500, Aug 19 '20 at 22:57
Unfortunately, that is the SVN Access Log, there is no info in the error log or other access logs at all. I will also add the backstory to the post which should give some necessary context. — just-another-dude, Aug 20 '20 at 07:37
Your Jenkins runs under a servlet container, like Tomcat. It could be running under an application server as well (like JBoss). Jenkins in itself cannot run alone. So the logs of the server must be checked, not the logs of the httpd server in front of it. Since it stops working after a couple successful transfers, it might be running out of memory in the JVM, or some other resource. — Nic3500, Aug 21 '20 at 01:29
The reason I don't think it's relevant to the issue is that it usually fails on checkouts from the slave, which doesn't seem to have any RAM / CPU issues, as well as the fact that it still works normally with the On-Prem server. In addition, the same error happens to me occasionally when checking out from my PC SVN client. I suppose that the most likely scenario is that there is some kind of bug / issue with the SVN due to the move and/or versions. Nonetheless, I will take a look at the Jenkins web logs as well and post them here ASAP. Thank you. — just-another-dude, Aug 23 '20 at 08:40
By the way, I see you wrote "So the logs of the server must be checked, not the logs of the httpd server in front of it". The logs I sent were from the httpd server in front of the SVN, not the Jenkins, and in fact there are no other SVN log options available at the moment. I thought of compiling it with the debug mode enabled to get some more info. — just-another-dude, Aug 23 '20 at 10:53
