Say that one wishes to convert all absolute svn:externals URLs to relative URLs throughout their repository.

Alternatively, if heeding the tip in the svn:externals docs ("You should seriously consider using explicit revision numbers..."), one might find themselves needing to periodically pull new revisions for externals in many places throughout the repository.

What's the best way to programmatically update a large number of svn:externals properties?

My solution is posted below.

kostmo

2 Answers

Here's my class to extract parts from a single line of an svn:externals property:

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2
import re
class SvnExternalsLine:
    '''Consult https://subversion.apache.org/docs/release-notes/1.5.html#externals for parsing algorithm.
    The old svn:externals format consists of:
        <local directory> [revision] <absolute remote URL>

    The NEW svn:externals format consists of:
        [revision] <absolute or relative remote URL> <local directory>

    Therefore, "relative" remote paths always come *after* the local path.
    One complication is the possibility of local paths with spaces.
    We just assume that the remote path cannot have spaces, and treat all other
    tokens (except the revision specifier) as part of the local path.
    '''

    REVISION_ARGUMENT_REGEXP = re.compile(r"-r(\d+)")

    def __init__(self, original_line):
        self.original_line = original_line

        self.pinned_revision_number = None
        self.repo_url = None
        self.local_pathname_components = []

        for token in self.original_line.split():

            revision_match = self.REVISION_ARGUMENT_REGEXP.match(token)
            if revision_match:
                self.pinned_revision_number = int(revision_match.group(1))
            elif urlparse(token).scheme or token.startswith(("^", "//", "/", "../")):
                self.repo_url = token
            else:
                self.local_pathname_components.append(token)

    # ---------------------------------------------------------------------
    def constructLine(self):
        '''Reconstruct the externals line in the Subversion 1.5+ format'''

        tokens = []

        # Update the revision specifier if one existed
        if self.pinned_revision_number is not None:
            tokens.append( "-r%d" % (self.pinned_revision_number) )

        tokens.append( self.repo_url )
        tokens.extend( self.local_pathname_components )

        if self.repo_url is None:
            raise Exception("Found a bad externals property: %s; Original definition: %s" % (str(tokens), repr(self.original_line)))

        return " ".join(tokens)

I use the pysvn library to iterate recursively through all of the directories possessing the svn:externals property, then split that property value by newlines, and act upon each line according to the parsed SvnExternalsLine.

The process must be performed on a local checkout of the repository. Here's how pysvn (propget) can be used to retrieve the externals:

client.propget( "svn:externals", base_checkout_path, recurse=True)

Iterate through the return value of this function and, after modifying the property for each directory, write it back:

client.propset("svn:externals", new_externals_property, path)
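The two calls above can be tied together in a loop. The following is only a minimal sketch: it assumes `client` is a `pysvn.Client()` whose `propget` returns a mapping of path to property value, and `rewrite_externals` and `transform_line` are hypothetical names introduced here (for example, `transform_line` could wrap `SvnExternalsLine(line).constructLine()`).

```python
def rewrite_externals(client, base_checkout_path, transform_line):
    """Apply transform_line to every svn:externals definition under a checkout.

    client is assumed to be a pysvn.Client() (or any object with compatible
    propget/propset methods). transform_line is a hypothetical callable that
    takes one externals line and returns its replacement.
    Returns the list of paths whose property was rewritten.
    """
    changed = []
    # Assumption: with recurse=True, propget returns {path: property value}
    externals = client.propget("svn:externals", base_checkout_path, recurse=True)
    for path, value in externals.items():
        new_lines = []
        for line in value.splitlines():
            stripped = line.strip()
            # Leave blank lines and comment lines untouched
            if not stripped or stripped.startswith("#"):
                new_lines.append(line)
            else:
                new_lines.append(transform_line(stripped))
        new_value = "\n".join(new_lines)
        # Preserve a trailing newline so unchanged properties compare equal
        if value.endswith("\n"):
            new_value += "\n"
        if new_value != value:
            client.propset("svn:externals", new_value, path)
            changed.append(path)
    return changed
```

Taking the client as a parameter keeps the loop testable without a working copy; the modified properties still have to be committed afterwards.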
kostmo
  • The script assumes there is no space between `-r` and the number in an external's revision specifier, but a space is allowed: `-r 1234 http://example.com/repos/zag foo/bar1` is a [valid external definition](https://subversion.apache.org/docs/release-notes/1.5.html#externals). The link with such examples is actually in the docstring at the top of the script itself. – Alex Che Apr 16 '23 at 21:33
  • Also, the sentence `Therefore, "relative" remote paths always come *after* the local path.` in the script's docstring is incorrect. It should be the other way around: `Therefore, "relative" remote paths always come *before* the local path.` This matches what the doc linked above says: `When Subversion sees an svn:externals without an absolute URL, it takes the first argument as a relative URL and the second as the target directory.` – Alex Che Apr 16 '23 at 21:43

kostmo's implementation (posted more than 10 years ago) does not support pinned revisions written with a space between `-r` and the number. For example, in the valid external definitions below:

   foo/bar -r 1234 http://example.com/repos/zag
   -r 1234 http://example.com/repos/zag foo/bar1

the "-r 1234" part will be parsed as a local path, instead of a revision specification.
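The failure can be reproduced without Subversion. The sketch below reuses the `-r(\d+)` pattern from the earlier answer; `classify` is a hypothetical helper added only for this illustration. Splitting on whitespace separates `-r` from `1234`, and neither token matches the pattern on its own.

```python
import re

# Same pattern as in the earlier answer
REVISION_ARGUMENT_REGEXP = re.compile(r"-r(\d+)")


def classify(line):
    """Return the tokens that the token-by-token check recognizes as revisions."""
    return [t for t in line.split() if REVISION_ARGUMENT_REGEXP.match(t)]


print(classify("-r1234 http://example.com/repos/zag foo/bar1"))   # ['-r1234']
print(classify("-r 1234 http://example.com/repos/zag foo/bar1"))  # []
```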

My implementation below fixes this issue:

#!/usr/bin/python3

import re
from urllib.parse import urlparse


class SvnExternalsLine:
   '''Consult https://subversion.apache.org/docs/release-notes/1.5.html#externals for parsing algorithm.
   The old svn:externals format consists of:
     <local directory> [revision] <absolute remote URL>

   The NEW svn:externals format consists of:
     [revision] <absolute or relative remote URL> <local directory>

   Therefore, "relative" remote paths always come *before* the local path.
   When Subversion sees an svn:externals without an absolute URL,
   it takes the first argument as a relative URL and the second as the target directory.
   One complication is the possibility of local paths with spaces.
   We just assume that the remote path cannot have spaces, and treat all other
   tokens (except the revision specifier) as part of the local path.
   '''

   OLD_FORMAT_REGEXP = re.compile(r'^\s*(?P<loc>.*?)(\s*-r\s*(?P<rev>\d+))?\s+(?P<url>\S+)\s*$')
   NEW_FORMAT_REGEXP = re.compile(r'^\s*(-r\s*(?P<rev>\d+)\s*)?(?P<url>\S+)\s+(?P<loc>.*?)\s*$')

   def __init__(self, original_line):
      self.original_line = original_line

      self.pinned_revision_number = None
      self.repo_url = None
      self.local_pathname = None

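      tokens = original_line.split()
      if not tokens:
         raise ValueError('Empty externals line')

      # An absolute URL in the last position means the old (pre-1.5) format
      is_old = bool(urlparse(tokens[-1]).scheme)
      regexp = SvnExternalsLine.OLD_FORMAT_REGEXP if is_old else SvnExternalsLine.NEW_FORMAT_REGEXP

      m = regexp.fullmatch(original_line)
      if m is None:
         raise ValueError(f'Cannot parse externals line: {original_line!r}')
      self.repo_url = m.group('url')
      self.local_pathname = m.group('loc')
      # Kept as a string, or None when the external is not pinned
      self.pinned_revision_number = m.group('rev')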


   def constructLine(self):
      '''Reconstruct the externals line in the Subversion 1.5+ format'''
      line = f'{self.repo_url} {self.local_pathname}'
      if self.pinned_revision_number is not None:
          line = f'-r{self.pinned_revision_number} {line}'

      return line
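Once a line is split into URL, revision, and local path, the two tasks from the question become small transformations. The following is a rough sketch rather than part of the class above: `to_repo_relative`, `build_line`, and the `repo_root_url` argument are hypothetical names, and the repository root is assumed to be known in advance (for instance from `svn info`).

```python
def to_repo_relative(url, repo_root_url):
    """Rewrite an absolute URL as '^/...' when it lives under repo_root_url.

    repo_root_url is an assumption of this sketch: the repository root URL.
    URLs outside that root are returned unchanged.
    """
    root = repo_root_url.rstrip("/")
    if url == root:
        return "^/"
    if url.startswith(root + "/"):
        return "^/" + url[len(root) + 1:]
    return url


def build_line(url, local_path, revision=None):
    """Assemble a Subversion 1.5+ externals line, optionally pinned."""
    line = f"{url} {local_path}"
    if revision is not None:
        line = f"-r{revision} {line}"
    return line


relative = to_repo_relative("http://example.com/repos/zag", "http://example.com/repos")
print(build_line(relative, "foo/bar", 1234))  # -r1234 ^/zag foo/bar
```

Passing a new `revision` to `build_line` covers the periodic re-pinning case; passing `None` leaves the external unpinned.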
Alex Che