I have the following URLs.

https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258
https://docs.google.com/a/example.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY/edit#gid=1842172258
https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY

For each URL, I need to extract the sheet ID (for example 1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY) into a Java String.

I was thinking of using split, but it doesn't work for all of the test cases:

String string = "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258";
String[] parts = string.split("/");
String res = parts[parts.length-2];
Log.d("hello res",res );

How can I do this so that it works for all three URLs?

TSR
  • Does the ID always follow /spreadsheets/d/? If so, then you can write a regex that looks for /spreadsheets/d/ and then captures the component following that. You don't need to use `split`. You could still use `split` and search for array elements that equal "spreadsheets" and "d". If there are other cases where the ID doesn't follow this, you'll have to figure out what the possibilities are. – ajb Aug 07 '17 at 04:43
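The split-based variant suggested in the comment above could be sketched roughly as follows. This is only an illustration under the assumption that the ID is always the path segment right after "spreadsheets" and "d"; the class name SheetIdBySplit and the method extractSheetId are hypothetical names chosen for this example.

```java
import java.util.Optional;

public class SheetIdBySplit {

    // Hypothetical helper: returns the segment that follows ".../spreadsheets/d/".
    // Assumption: the ID always sits directly after those two path segments.
    static Optional<String> extractSheetId(String url) {
        // Drop the query string and the fragment ("#gid=...") before splitting.
        String withoutFragment = url.split("[#?]", 2)[0];
        String[] parts = withoutFragment.split("/");
        for (int i = 0; i + 2 < parts.length; i++) {
            if (parts[i].equals("spreadsheets")
                    && parts[i + 1].equals("d")
                    && !parts[i + 2].isEmpty()) {
                return Optional.of(parts[i + 2]);
            }
        }
        // Marker segments not found, or nothing follows them.
        return Optional.empty();
    }

    public static void main(String[] args) {
        String[] urls = {
            "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258",
            "https://docs.google.com/a/example.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY/edit#gid=1842172258",
            "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY"
        };
        for (String url : urls) {
            System.out.println(extractSheetId(url).orElse("<no id found>"));
        }
    }
}
```

Returning an Optional rather than indexing from the end of the array means a URL without the /spreadsheets/d/ marker yields an empty result instead of a wrong segment or an exception.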

4 Answers


You can use the regex \/d\/(.*?)(\/|$) (regex demo) to solve your problem. If you look closely, the ID always sits between d/ and either the next / or the end of the line, so you can capture everything in between. Check this code demo:

String[] urls = new String[]{
    "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258",
    "https://docs.google.com/a/example.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY/edit#gid=1842172258",
    "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY"
};

String regex = "\\/d\\/(.*?)(\\/|$)";
Pattern pattern = Pattern.compile(regex);

for (String url : urls) {
    Matcher matcher = pattern.matcher(url);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
    }
}

Outputs

1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY
1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY
1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY
Martin Zeitler
Youcef LAIDANI

It looks like the ID you are looking for always follows "/spreadsheets/d/". If that is the case, you can update your code like this:

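String string = "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258";
// parts[1] is everything after the marker; parts has length 1 if the marker is absent.
String[] parts = string.split("spreadsheets/d/");
String result;
if (parts.length < 2) {
    result = null; // marker not found in the URL
} else if (parts[1].contains("/")) {
    // Keep only the segment before the next slash.
    result = parts[1].split("/")[0];
} else {
    result = parts[1];
}
System.out.println("hello " + result);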
Abdessamad139

Using regex

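Pattern pattern = Pattern.compile("(?<=/d/)[^/]*");
Matcher matcher = pattern.matcher(url);
// find() must succeed before group() is called; the pattern has no capturing
// group, so use group() (the whole match). The lookbehind keeps "/d/" out of it.
if (matcher.find()) {
    System.out.println(matcher.group());
}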

Using plain String methods

String result = url.substring(url.indexOf("/d/") + 3);
int slash = result.indexOf("/");
result =  slash == -1 ? result
                      : result.substring(0, slash);
System.out.println(result);
Vanna

Google appears to use fixed-length IDs. In your case they are 44 characters long and consist of alphanumeric characters, - and _, so you can use this regex (shown here in Python):

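import re

# \w already covers letters, digits and "_"; "-" is added explicitly.
regex = r"[\w-]{44}"
match = re.search(regex, url)
sheet_id = match.group(0) if match else None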
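
Since the question asks for a Java String, the same idea might look like the sketch below. It rests on the assumption stated above that IDs are exactly 44 characters from [A-Za-z0-9_-], which holds for the example URLs but is not something this thread confirms in general. The class name SheetIdByLength and the method extractSheetId are hypothetical, and the lookarounds are an addition so that a longer run of such characters is not cut down to its first 44.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SheetIdByLength {

    // Assumption: a sheet ID is exactly 44 characters from [A-Za-z0-9_-].
    // The lookbehind/lookahead reject runs that are longer than 44 characters.
    private static final Pattern ID_PATTERN =
            Pattern.compile("(?<![\\w-])[\\w-]{44}(?![\\w-])");

    // Hypothetical helper: returns the first 44-character token in the URL, if any.
    static Optional<String> extractSheetId(String url) {
        Matcher matcher = ID_PATTERN.matcher(url);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String url = "https://docs.google.com/a/example.com/spreadsheets/d/"
                + "1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY/edit#gid=1842172258";
        System.out.println(extractSheetId(url).orElse("<no id found>"));
    }
}
```

This version does not depend on the /d/ marker at all, so it would stop working if the ID length ever differs from 44; anchoring on /spreadsheets/d/ is the more conservative choice.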