I am trying to extract text from the HTML below using Python and Beautiful Soup, but I am having trouble. Each <p> contains a title inside a <b> tag, but I want the text outside of that tag. For instance, the first row is titled "Assessment Type", but I only want the value "Capacity curve". Here is what I have so far:
modelinginfo = soup.find("div", {"id": "genInfo"})  # this is my raw data
rows = modelinginfo.findChildren(['p'])  # this is the data displayed below
for row in rows:
    print(row)
    print('\n')
    cells = row.findChildren('p')
    for cell in cells:
        value = cell.string
        print("The value in this cell is %s" % value)
[<p><b>Assessment Type: </b>Capacity curve</p>,
<p><b>Name: </b>Borzi et al (2008) - Capacity-Xdir 4Storeys InfilledFrame NonSismicallyDesigned</p>,
<p><b>Category: </b>Structure specific - Building</p>,
<p><b>Taxonomy: </b>CR/LFINF+DNO/HEX:4 (GEM)</p>,
<p><b>Reference: </b>The influence of infill panels on vulnerability curves for RC buildings (Borzi B., Crowley H., Pinho R., 2008) - Proceedings of the 14th World Conference on Earthquake Engineering, Beijing, China</p>,
<p><b>Web Link: </b><a href="http://www.iitk.ac.in/nicee/wcee/article/14_09-01-0111.PDF" style="color:blue" target="_blank"> http://www.iitk.ac.in/nicee/wcee/article/14_09-01-0111.PDF</a></p>,
<p><b>Methodology: </b>Analytical</p>,
<p><b>General Comments: </b>Sample Data: A 4-storey building designed according to the 1992 Italian design code (DM, 1992), considering gravity loads only, and the Decreto Ministeriale 1996 (DM, 1996) when considering seismic action (the seismically designed building has been designed assuming a lateral force equal to 10% of the seismic weight, c=10%, and with a triangular distribution shape).
The Y axis in the capacity curve represent the collapse multiplier: Base shear resistance over seismic weight.</p>,
<p><b>Geographical Applicability: </b> Italy</p>]
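
For reference, here is a minimal sketch of the result I am after. It assumes that every <p> starts with exactly one <b> label and that everything after the label is the value. The shortened `html` string and the `values` dictionary are only there for illustration and are not part of my real page or script.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the real page (assumption: same structure as the rows above).
html = """
<div id="genInfo">
<p><b>Assessment Type: </b>Capacity curve</p>
<p><b>Methodology: </b>Analytical</p>
<p><b>Web Link: </b><a href="http://www.iitk.ac.in/nicee/wcee/article/14_09-01-0111.PDF"> http://www.iitk.ac.in/nicee/wcee/article/14_09-01-0111.PDF</a></p>
<p><b>Geographical Applicability: </b> Italy</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
modelinginfo = soup.find("div", {"id": "genInfo"})

values = {}
for row in modelinginfo.find_all("p"):
    label_tag = row.find("b")
    if label_tag is None:
        continue  # skip rows without a <b> label
    label_text = label_tag.get_text()
    # Assumption: the <b> label comes first, so the value is whatever
    # text follows it inside the same <p>.
    value = row.get_text()[len(label_text):].strip()
    values[label_text.strip().rstrip(":")] = value

for label, value in values.items():
    print("%s -> %s" % (label, value))
```

Slicing off the label text leaves the parsed tree untouched. If the <b> tag were not always first in the row, this would break, and something based on `row.strings` would probably be needed instead.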