I am trying to filter a set of data and I have to deal with multiple entries of ~5000 characters.

What I need is 100 characters before and after some keyword.

I have looked into regex code for search and replace but only found functions to get one keyword, not the surrounding characters.

Example Input:

abc123cde345fgh678ijk910keywordbc123cde345fgh678ijk910

Desired output with ±5 characters:

jk910keywordbc123
CertainPerformance

2 Answers


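Match N characters, followed by the keyword, followed by N more characters. The snippet uses N = 5 to reproduce the example from the question; replace both 5s with 100 for the real data: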

const str = 'abc123cde345fgh678ijk910keywordbc123cde345fgh678ijk910';
const match = str.match(/.{5}keyword.{5}/);
console.log(match[0]);
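Two edge cases are worth guarding against with ~5000-character entries: a keyword closer than N characters to either end of the string (the fixed `{5}` quantifier then fails to match at all), and newlines inside the text (`.` skips them unless the `s` flag is set). A minimal sketch, assuming a bounded quantifier `{0,5}` and the `s` (dotAll) flag are acceptable; `contextOf` is a hypothetical helper name, not part of the original answer:

```javascript
// Hypothetical helper: returns the keyword with up to 5 characters on each side,
// or null when the keyword does not occur.
const contextOf = text => {
  // {0,5} allows fewer than 5 characters near the edges;
  // the s flag lets . match line terminators as well.
  const match = text.match(/.{0,5}keyword.{0,5}/s);
  return match ? match[0] : null;
};

console.log(contextOf('abc123cde345fgh678ijk910keywordbc123cde345fgh678ijk910'));
console.log(contextOf('ab\nkeywordcd'));
console.log(contextOf('nothing here'));
```

Because the regex engine returns the leftmost match, the greedy `.{0,5}` still captures the full 5 characters whenever they are available.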

If you need to construct the pattern dynamically, then:

const str = 'abc123cde345fgh678ijk910keywordbc123cde345fgh678ijk910';
const keyword = 'keyword';
const pattern = new RegExp(`.{5}${keyword}.{5}`);
const match = str.match(pattern);
console.log(match[0]);

If the keyword may contain characters with a special meaning in a regular expression, such as $, escape them before passing the keyword to new RegExp:

// https://stackoverflow.com/questions/3561493/is-there-a-regexp-escape-function-in-javascript
const escape = s => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

const str = 'abc123cde345fgh678ijk910keyw$ordbc123cde345fgh678ijk910';
const keyword = 'keyw$ord';
const pattern = new RegExp(`.{5}${escape(keyword)}.{5}`);
const match = str.match(pattern);
console.log(match[0]);
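If an entry can contain the keyword more than once, the same pattern can be applied globally. This is a rough sketch, assuming `String.prototype.matchAll` is available (ES2020); `escapeRegExp` and `allContexts` are hypothetical names, with the escape helper copied from above and renamed so it does not shadow the legacy global `escape()`:

```javascript
// Same escaping as above, under a different (hypothetical) name.
const escapeRegExp = s => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

// Hypothetical helper: collects every occurrence of keyword together with
// up to `radius` characters on each side.
function allContexts(text, keyword, radius) {
  const pattern = new RegExp(
    `.{0,${radius}}${escapeRegExp(keyword)}.{0,${radius}}`,
    'gs' // g: required by matchAll, s: let . match newlines
  );
  return Array.from(text.matchAll(pattern), m => m[0]);
}

console.log(allContexts('12345key$67890----abcdekey$fghij', 'key$', 3));
```

One limitation of this approach: each match consumes its trailing context, so a second occurrence that starts inside the context of the previous one is not reported separately.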
CertainPerformance

Another solution is to use String.prototype.indexOf() to find the index of the keyword inside the input string, and then String.prototype.slice() to extract the characters within a given radius around it.

const str = 'abc123cde345fgh678ijk910keywordbc123cde345fgh678ijk910';

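const getKeyword = (str, keyword, radius) =>
{
    const idx = str.indexOf(keyword);
    if (idx === -1) return null; // keyword not found
    // Clamp the start at 0: slice() treats a negative start as an offset from the end.
    const start = Math.max(0, idx - radius);
    return str.slice(start, idx + keyword.length + radius);
}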

console.log(getKeyword(str, "keyword", 5));
console.log(getKeyword(str, "keyword", 15));
console.log(getKeyword(str, "keyword", 1000));

Note that this also works when the radius exceeds the length of the string: slice() clamps out-of-range indices, so the entire string is returned in that case.
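The same idea extends to entries that contain the keyword several times. A minimal sketch, assuming overlapping contexts should each be reported; `allKeywordContexts` is a hypothetical name. The start index is clamped at 0 explicitly, because slice() interprets a negative start as an offset from the end of the string:

```javascript
// Hypothetical helper: returns one context string per occurrence of keyword.
function allKeywordContexts(str, keyword, radius) {
  const results = [];
  if (keyword === '') return results; // an empty keyword would match at every index

  let idx = str.indexOf(keyword);
  while (idx !== -1) {
    const start = Math.max(0, idx - radius); // never pass a negative start to slice()
    results.push(str.slice(start, idx + keyword.length + radius));
    idx = str.indexOf(keyword, idx + 1); // continue searching after this hit
  }
  return results;
}

console.log(allKeywordContexts('xxkeywordyykeywordzz', 'keyword', 5));
```

Unlike a global regex match, this reports every occurrence even when two contexts overlap, since the search position advances by one character rather than past the extracted context.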

Shidersz