Suppose I have a paragraph:
Str_wrds ="Power curve, supplied by turbine manufacturers, are extensively used in condition monitoring, energy estimation, and improving operational efficiency. However, there is substantial uncertainty linked to power curve measurements as they usually take place only at hub height. Data-driven model accuracy is significantly affected by uncertainty. Therefore, an accurate estimation of uncertainty gives the confidence to wind farm operators for improving performance/condition monitoring and energy forecasting activities that are based on data-driven methods. The support vector machine (SVM) is a data-driven, machine learning approach, widely used in solving problems related to classification and regression. The uncertainty associated with models is quantified using confidence intervals (CIs), which are themselves estimated. This study proposes two approaches, namely, pointwise CIs and simultaneous CIs, to measure the uncertainty associated with an SVM-based power curve model. A radial basis function is taken as the kernel function to improve the accuracy of the SVM models. The proposed techniques are then verified by extensive 10 min average supervisory control and data acquisition (SCADA) data, obtained from pitch-controlled wind turbines. The results suggest that both proposed techniques are effective in measuring SVM power curve uncertainty, out of which, pointwise CIs are found to be the most accurate because they produce relatively smaller CIs."
And have the following test_wrds
,
Test_wrds = ['Power curve', 'data-driven','wind turbines']
I would like to select before and after 1 sentence whenever Test_wrds
found it in a paragraph and list them as a separate string. For example, Test_wrds
Power curve
appeared first in 1st sentence hence but when we select 2nd sentence there are another Power curve
words thus the output would be something like this
Power curve, supplied by turbine manufacturers, are extensively used in condition monitoring, energy estimation, and improving operational efficiency. However, there is substantial uncertainty linked to power curve measurements as they usually take place only at hub height. Therefore, an accurate estimation of uncertainty gives the confidence to wind farm operators for improving performance/condition monitoring and energy forecasting activities that are based on data-driven methods.
And likewise, I would like to slice sentences for data-driven
and wind turbines
and saved them in separate strings.
How can I implement this using Python in a simple way?
So far I found code which basically removes the entire sentence whenever any Text_wrds
is in.
def remove_sentence(Str_wrds , Test_wrds):
return ".".join((sentence for sentence in input.split(".")
if Test_wrds not in sentence))
But I don't understand how to use this for my problem.
update on the problem: Basically, whenever there is test_wrds
present in the paragraph, I would like to slice that sentence as well as before and after one sentence and saved it on a single string. So for example for three text_wrds
I am expected to get three strings which basically covers sentences with text_wrds
individually. I attached pdf, for example, the output, I am looking for