I am stuck parsing the following text into an object. I have created two separate regexes, but I want to use only one. Below are the sample text and my current regex patterns.

PAYER:\r\n\r\n   MCNA \r\n\r\nPROVIDER:\r\n\r\n   MY KHAN \r\n   Provider ID: 115446397114\r\n   Tax ID: 27222193992\r\n\r\nINSURED:\r\n\r\n   VICTORY OKOYO\r\n   Member ID: 60451158048\r\n   Birth Date: 05/04/2008\r\n   Gender: Male\r\n\r\nCOVERAGE TYPE:\r\n\r\n   Dental Care

REGEX:

const re = new RegExp('(.*?):\r\n\r\n(.*?)(?:\r\n|$)', 'g');
const re2 = new RegExp('(.*?):(.*?)(?:\r\n|$)', 'g');

Expected result:

{
  payer: 'MCNA',
  provider: 'MY KHAN'
}

1 Answer

This turns your input into an object that contains all key/value pairs:

const input = 'PAYER:\r\n\r\n   MCNA \r\n\r\nPROVIDER:\r\n\r\n   MY KHAN \r\n   Provider ID: 115446397114\r\n   Tax ID: 27222193992\r\n\r\nINSURED:\r\n\r\n   VICTORY OKO\r\n   Member ID: 60451158048\r\n   Birth Date: 05/04/2009\r\n   Gender: Male\r\n\r\nCOVERAGE TYPE:\r\n\r\n   Dental Care';

let result = Object.fromEntries(input
  .replace(/([^:]+):\s+([^\n\r]+)\s*/g, (m, c1, c2) => c1.toLowerCase() + '\r' + c2 + '\n')
  .split('\n')
  .filter(Boolean)
  .map(item => item.trim().split('\r'))
);
console.log(result);

Output:

{
  "payer": "MCNA",
  "provider": "MY KHAN",
  "provider id": "115446397114",
  "tax id": "27222193992",
  "insured": "VICTORY OKO",
  "member id": "60451158048",
  "birth date": "05/04/2009",
  "gender": "Male",
  "coverage type": "Dental Care"
}

Explanation:

  • Object.fromEntries() -- converts a 2D array of [key, value] pairs to an object, e.g. [ ['a', 1], ['b', 2] ] => {a: 1, b: 2}
  • .replace() regex /([^:]+):\s+([^\n\r]+)\s*/g -- two capture groups, one for key, one for value
  • replace action c1.toLowerCase() + '\r' + c2 + '\n' -- convert key to lowercase, separate key from value with '\r', and terminate each key/value pair with a newline
  • .split('\n') -- split by newline
  • .filter(Boolean) -- remove empty items
  • .map(item => item.trim().split('\r')) -- change each array item to [key, value], i.e. change the flat array to a 2D array
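
To make the steps above easier to follow, here is a rough sketch that runs the same chain one stage at a time and keeps each intermediate value. The shortened input and the variable names (sample, flattened, lines, entries, stepResult) are assumptions made for illustration only; the regex and the replace action are the ones from the answer.

```javascript
// Shortened sample input (an assumption for illustration; same layout as the original text).
const sample = 'PAYER:\r\n\r\n   MCNA \r\n\r\nPROVIDER:\r\n\r\n   MY KHAN \r\n   Tax ID: 27222193992';

// Step 1: rewrite every "KEY: value" match as "key\rvalue\n".
const flattened = sample.replace(
  /([^:]+):\s+([^\n\r]+)\s*/g,
  (match, key, value) => key.toLowerCase() + '\r' + value + '\n'
);

// Step 2: one string per pair; filter(Boolean) drops the empty string left after the last '\n'.
const lines = flattened.split('\n').filter(Boolean);

// Step 3: trim each line (removes the trailing space after the value) and split on '\r'.
const entries = lines.map(line => line.trim().split('\r'));

// Step 4: turn the [key, value] pairs into an object.
const stepResult = Object.fromEntries(entries);

console.log(JSON.stringify(flattened));
console.log(entries);
console.log(stepResult);
```

Logging the intermediate string with JSON.stringify makes the '\r' and '\n' separators visible, which helps when the input layout changes and a pair stops matching.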

You could add one more filter after the .map() to keep only keys of interest.
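
A minimal sketch of that extra filter, assuming only the payer and provider keys from the question's expected result are wanted. The names text, wanted and picked are hypothetical, and the input is shortened for brevity:

```javascript
// Shortened sample input (an assumption; same layout as the original text).
const text = 'PAYER:\r\n\r\n   MCNA \r\n\r\nPROVIDER:\r\n\r\n   MY KHAN \r\n   Provider ID: 115446397114';

// Hypothetical allow-list of keys to keep; adjust as needed.
const wanted = new Set(['payer', 'provider']);

const picked = Object.fromEntries(text
  .replace(/([^:]+):\s+([^\n\r]+)\s*/g, (m, c1, c2) => c1.toLowerCase() + '\r' + c2 + '\n')
  .split('\n')
  .filter(Boolean)
  .map(item => item.trim().split('\r'))
  // Keep only the [key, value] pairs whose key is in the allow-list.
  .filter(([key]) => wanted.has(key))
);

console.log(picked); // { payer: 'MCNA', provider: 'MY KHAN' }
```

The keys are compared after lowercasing, so the allow-list has to use lowercase names such as 'provider id' rather than 'Provider ID'.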

Peter Thoeny
  • Hello Peter, thanks for helping out, but I need one more thing. I am posting a slightly changed structure below. – Muhammad Faisal Dec 22 '22 at 08:27
  • @MuhammadFaisal: I believe I answered your original question, let me know if not. The example input in your new question is more complex, it's unlikely that it can be done in one regex. The first one with address that spans multiple lines possibly can, but `COVERAGE DATES: Plan Date: 05/01/2022 - 02/28/2223` has two consecutive keys, thus is non-deterministic and needs clarification. – Peter Thoeny Dec 22 '22 at 19:16
  • You answered my original question without a doubt. – Muhammad Faisal Dec 23 '22 at 03:31