Is it possible to split a bs4.element.Tag into several bs4.element.Tag instances?
A possible use case is the following:
1- The original bs4.element.Tag contains a paragraph.
2- We want to split that paragraph into sentences and get one bs4.element.Tag for each sentence.
Example:
paragraphs = soup.find_all('p')
gives all the paragraphs in an HTML file.
Suppose a paragraph (which is also a bs4.element.Tag instance) is the following:
<p><i><a href="/wiki/Le_Bassin_Aux_Nymph%C3%A9as" title="Le Bassin Aux Nymphéas">Le Bassin Aux Nymphéas</a></i>, 1919. Monet's late series of water lily paintings are among his best-known works.</p>
I would like to turn this bs4.element.Tag instance (the paragraph) into two bs4.element.Tag instances, one for each sentence, as follows:
The first bs4.element.Tag should correspond to the first sentence:
<i><a href="/wiki/Le_Bassin_Aux_Nymph%C3%A9as" title="Le Bassin Aux Nymphéas">Le Bassin Aux Nymphéas</a></i>, 1919.
The second bs4.element.Tag should correspond to the second sentence:
Monet's late series of water lily paintings are among his best-known works.
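To make the desired behaviour concrete, here is a rough sketch of one way it might be done. It rests on several assumptions that are not part of the question: sentence boundaries fall only inside the paragraph's top-level text nodes (never inside a nested tag such as <a>), a naive regular expression is good enough to detect sentence ends, and each sentence is wrapped in a new <span>. The names split_paragraph, SENTENCE_BOUNDARY and the wrapper parameter are hypothetical helpers, not part of Beautiful Soup.

```python
import copy
import re

from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_paragraph(paragraph, soup, wrapper="span"):
    """Return a list of new Tags, one per sentence found in `paragraph`.

    The original paragraph is left untouched; nested tags are copied.
    """
    sentences = [soup.new_tag(wrapper)]
    for child in paragraph.children:
        if isinstance(child, NavigableString):
            pieces = SENTENCE_BOUNDARY.split(str(child))
            for i, piece in enumerate(pieces):
                if i > 0:
                    # A boundary was crossed: start a new sentence tag.
                    sentences.append(soup.new_tag(wrapper))
                if piece:
                    sentences[-1].append(NavigableString(piece))
        else:
            # Nested tags (<i>, <a>, ...) go whole into the current sentence.
            sentences[-1].append(copy.copy(child))
    # Drop wrappers left empty by leading or trailing whitespace.
    return [tag for tag in sentences if tag.contents]


html = (
    '<p><i><a href="/wiki/Le_Bassin_Aux_Nymph%C3%A9as" '
    'title="Le Bassin Aux Nymphéas">Le Bassin Aux Nymphéas</a></i>, 1919. '
    "Monet's late series of water lily paintings are among his "
    "best-known works.</p>"
)
soup = BeautifulSoup(html, "html.parser")
paragraph = soup.find("p")
sentences = split_paragraph(paragraph, soup)

for sentence in sentences:
    print(sentence)
```

For the paragraph above this should produce two tags, the first keeping the <i><a>...</a></i> markup followed by ", 1919." and the second holding the plain text of the second sentence. The regular expression will split wrongly on abbreviations such as "Mr." or "e.g.", so a dedicated sentence tokenizer (for example NLTK's sent_tokenize) would probably be a better fit for real text, and sentences that end inside a nested tag would need extra handling.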