Is it possible to write a Windows batch file that deletes all text between two characters, including the characters themselves?

I am dynamically generating text files that include a piece of text in HTML format. I want to extract only the non-HTML part of the text; that is, I want to remove all HTML tags from it.

So, I want a Windows batch file that takes a text file as input, removes all characters between < and > (inclusive), and creates an output file. Can you please help me with this?

Guhan Murugesan
  • You would be better off using python – Monacraft Dec 03 '14 at 21:22
  • ... or JScript. JScript can even parse the HTML as a [collection of DOM elements](http://msdn.microsoft.com/en-us/library/ms755628%28v=vs.85%29.aspx) if you wish. Or it can do a `string.replace(/<[^>]+>/g,'')` to strip all tags if you prefer. – rojo Dec 03 '14 at 21:29
  • Thanks for your response. I need to hand this batch script to the user. If it is a simple batch file, the user can just run it and be done. If it is in Python or JScript, it will be more complex. – Guhan Murugesan Dec 03 '14 at 22:34
  • [Incorrect](https://gist.github.com/DavidRuhmann/5199433). (I typically use the Hybrid2.bat style in my hybrid batch / JScript scripts.) – rojo Dec 04 '14 at 03:01
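
As a rough illustration of the regex approach from rojo's comment, here is a minimal sketch. The function name `stripTags` and the sample string are hypothetical, chosen only for this example; the pattern is the one quoted in the comment. It treats everything from a `<` to the next `>` as a tag, so it is not a real HTML parser and will misbehave on input where `>` appears inside an attribute value, a comment, or a script block.

```javascript
// Hypothetical helper: removes every "<...>" run from a string,
// using the pattern quoted in the comment above.
// Note: [^>]+ requires at least one character between the brackets,
// so a bare "<>" is left untouched.
function stripTags(text) {
    return text.replace(/<[^>]+>/g, '');
}

// Sample input invented for illustration.
var sample = '<p>Hello <b>world</b></p>';
var result = stripTags(sample);
// result is now 'Hello world'
```

The sketch sticks to `var` and a plain function declaration so that the same function body should also be usable from JScript under Windows Script Host, which is what the hybrid batch / JScript style linked above relies on. Reading the input file and writing the output file are left out here.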

0 Answers