-1

I would like to extract a substring from an S3 URL using a regular expression rather than string manipulation functions.

My requirement is to retrieve dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19 out of a URL s3://s3bucket/dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19

However, I have not been able to construct a regular expression that gives me what I want.

I would like the regex to have the form below, but I know that I am missing something in the regex line.

const url = 's3://s3bucket/dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19';
const patternMatches = url.match(new RegExp(s3://${s3bucket}/${dynamodbtablename}/([a-f\d-]+)));
const migrationDataFileS3Key = patternMatches[indexOfResultingArrayWithDesiredSubstring]

I was able to come up with the expression below to retrieve the UUID/GUID and have had to concatenate it with ${s3bucket} to form the S3 bucket key. However, I am not happy with this solution. I require the above.

const url = 's3://s3bucket/dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19';
const patternMatches = url.match(/([a-f\d-]+)/g);
const migrationDataFileS3Key = massiveTableItem + '/' + patternMatches[patternMatches.length - 1];

Thank you very much for your help.

sage
  • Try ``let patternMatches = url.match(new RegExp(String.raw`s3://${s3bucket}/${dynamodbtablename}/([a-fA-F\d-]+)`));``, then, `patternMatches[1]` will hold `05abd315-2e0b-4717-919d-1cc6576ebe19` – Wiktor Stribiżew Mar 19 '20 at 10:03
  • @WiktorStribiżew, I believe that this question is different because of the format that I wanted the answer to be in. The answer to the question is also simpler and easier for most people to follow than what is in the post that you shared. My question should also not be downvoted. – sage Mar 20 '20 at 13:37
  • I also specified my question clearly before your attempt at answering it. If you had read it through in the first instance, you could have proposed the duplicate at that time and not after I proposed an answer that still gave you and another responder credit. – sage Mar 20 '20 at 13:39

3 Answers

1

You may not need a regular expression: split the URL on / and take the elements you need from the result. For example:

{
  console.log(`s3://s3bucket/dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19`
    .split(`/`)  // split on forward slash
    .slice(-2)   // take the last 2 elements from the resulting array
    .join(`/`)   // rejoin them with a forward slash
    );
  // alternatively
  console.log(`s3://s3bucket/dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19`
    .match(/([\w\-])+/g)
    .slice(-2)
    .join(`/`)
    );
  // or (use capture groups)
  const {groups: {root, hashpath}} =
    /(?<root>s3:\/\/s3bucket\/)(?<hashpath>[\w\-\/]+)/
       .exec(`s3://s3bucket/dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19`);
  console.log(hashpath);
  // or (just substring from known index)
  const url = `s3://s3bucket/dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19`;
  console.log(url.substring(url.indexOf(`/`, 5) + 1));
}
KooiInc
  • Hi, thank you for your speedy response. However, I have to do it with a regex expression. I noted that in my question. The second thing is that I want to match `dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19` and not only the UUID alone. – sage Mar 19 '20 at 08:19
  • Ok, provided a few regex alternatives – KooiInc Mar 19 '20 at 08:57
  • Thank you very much. I appreciate what you have provided but I do not want a solution that does any string or array manipulation (i.e. slice) except to index in and read the result. – sage Mar 19 '20 at 09:16
  • This seems to work with template strings but I will test it a little bit more before approving this as answer. – sage Mar 19 '20 at 09:37
  • Well, both `String.prototype.match` and `RegExp.prototype.exec` return an array – KooiInc Mar 19 '20 at 13:04
0

You can use a capture group, like this:

const str = "s3://s3bucket/dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19";
// The capture group (.*) holds everything after the bucket name
const myRegexp = /s3:\/\/s3bucket\/(.*)/;
const match = myRegexp.exec(str);
console.log(match[1]);
// logs 'dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19'
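If the bucket name is only known at run time, the same capture-group idea can be expressed with the `RegExp` constructor and a template literal. The following is a minimal sketch, not a definitive implementation: `escapeRegExp`, `s3bucket`, `bucketPattern`, and `s3Key` are names assumed here for illustration, and the escaping helper is included only so that a bucket name containing regex metacharacters (such as `.`) is matched literally.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const s3bucket = 's3bucket'; // assumed to come from configuration
const str = 's3://s3bucket/dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19';

// Build the pattern at run time; the capture group holds everything after the bucket name.
const bucketPattern = new RegExp(`^s3://${escapeRegExp(s3bucket)}/(.+)$`);
const bucketMatch = bucketPattern.exec(str);

// exec returns null when nothing matches, so guard before indexing.
const s3Key = bucketMatch ? bucketMatch[1] : null;
console.log(s3Key);
// logs 'dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19'
```

Forward slashes do not need escaping when the pattern is passed to the constructor as a string, which keeps the template literal readable.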
Sudhir Bastakoti
  • Thank you very much. This works. I should check out capture groups in greater detail as they appear very convenient. However, if you were to do a match that also verifies the UUID/GUID at the end, as I indicated, how would you go about it? Do you have a solution for that? I have already marked this as a solution but would appreciate your answer to what I just asked. Thank you very much. – sage Mar 19 '20 at 09:14
  • Wait a minute, I haven't yet checked if using template string works in here. I will check this and get back. – sage Mar 19 '20 at 09:29
  • This solution does not work with template string variables. I have unlisted it as an answer. – sage Mar 19 '20 at 09:31
-2

I eventually arrived at a solution that is closest to the format I asked for in my question. I did it by combining the solutions of @sudhir-bastakoti and @wiktor-stribiżew, as neither answer addressed my question completely on its own.

I am grateful to everyone who answered my question, including @kooiinc. I checked out his last answer options and they worked. However, I wanted the answer in a certain format.

const s3bucket = 's3bucket';
const url = 's3://s3bucket/dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19';
const migrationDataFileS3Key = url.match(new RegExp(String.raw`s3://${s3bucket}/(.*)`))[1];
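A stricter variant can also verify that the URL ends in a UUID-shaped segment, as raised in the comments on an earlier answer. This is only a sketch under stated assumptions: `bucketName`, `tableName`, `uuidPattern`, `keyPattern`, and `extractKey` are illustrative names, and the bucket and table names are assumed to contain no regex metacharacters, so they are interpolated without escaping.

```javascript
const bucketName = 's3bucket';          // assumed to be known up front
const tableName = 'dynamodbtablename';  // assumed to be known up front
const fileUrl = 's3://s3bucket/dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19';

// 8-4-4-4-12 hexadecimal digits: the canonical textual UUID layout.
const uuidPattern = String.raw`[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}`;

// The capture group spans the table name and the UUID, which together form the key.
const keyPattern = new RegExp(String.raw`^s3://${bucketName}/(${tableName}/${uuidPattern})$`);

// Returns the key, or null when the URL does not have the expected shape.
function extractKey(candidateUrl) {
  const match = candidateUrl.match(keyPattern);
  return match ? match[1] : null;
}

console.log(extractKey(fileUrl));
// logs 'dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19'
console.log(extractKey('s3://s3bucket/dynamodbtablename/not-a-uuid'));
// logs null
```

Anchoring the pattern with `^` and `$` means a URL with extra path segments is rejected rather than partially matched, and the `null` check avoids the `TypeError` that indexing a failed match would throw.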

sage