I am parsing text weather data from http://www.nws.noaa.gov/view/prodsByState.php?state=OH&prodtype=hourly and only want to grab the data for my county/area. The trick is that each text report also contains the previous reports from earlier in the day, and I'm only interested in the latest one, which appears towards the beginning of the file. I tried the "print section of file between two regular expressions (inclusive)" recipe from the sed one-liners, but I couldn't figure out how to make it stop after the first occurrence:
sed -n '/OHZ061/,/OHZ062/p' /tmp/weather.html
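To show what I mean without downloading the real page, here is a minimal reproduction. The sample file is made up (the city names and temperatures are placeholders, not real NWS output); it only assumes that the zone codes repeat once per report, newest report first.

```shell
# Made-up stand-in for the real page: two reports, newest first.
sample=$(mktemp)
cat > "$sample" <<'EOF'
HOURLY REPORT 2PM
OHZ061
DAYTON 72
OHZ062
COLUMBUS 70
HOURLY REPORT 1PM
OHZ061
DAYTON 70
OHZ062
COLUMBUS 68
EOF

# The range starts again at every later OHZ061, so the block from
# the older report is printed as well as the one I actually want.
first_attempt=$(sed -n '/OHZ061/,/OHZ062/p' "$sample")
printf '%s\n' "$first_attempt"
rm -f "$sample"
```

On this sample I get six lines back (both OHZ061 blocks) when I only want the first three.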
I then found the question "Sed print between patterns the first match result", whose answer works for me in the following form:
sed -n '/OHZ061/,$p;/OHZ062/q' /tmp/weather.html
However, I suspect it isn't the most robust solution. I can't point to a concrete failure, but my gut feeling is that something sturdier exists.
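For reference, this is how that second command behaves on a small made-up sample (again, the contents are placeholders and not real NWS output). The comments reflect my current understanding of the command, which may be incomplete.

```shell
# Made-up stand-in for the real page: two reports, newest first.
sample=$(mktemp)
cat > "$sample" <<'EOF'
HOURLY REPORT 2PM
OHZ061
DAYTON 72
OHZ062
COLUMBUS 70
HOURLY REPORT 1PM
OHZ061
DAYTON 70
OHZ062
COLUMBUS 68
EOF

# -n            : do not print lines unless told to
# /OHZ061/,$p   : print from the first OHZ061 line to the end of the file
# /OHZ062/q     : quit as soon as a line matching OHZ062 has been processed
second_attempt=$(sed -n '/OHZ061/,$p;/OHZ062/q' "$sample")
printf '%s\n' "$second_attempt"
rm -f "$sample"
```

On this sample it prints only the three lines of the newest block, which is the result I'm after.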
So are there any better solutions out there? Also, is it possible to get my first attempt to work? And if you post a solution, please explain all the switches/backreferences/magic, as I'm still trying to discover all the power of sed and the command line tools.
And to help start you off:
wget -q "http://www.nws.noaa.gov/view/prodsByState.php?state=OH&prodtype=hourly" -O /tmp/weather.html
P.S. I looked at this post: http://www.unix.com/shell-programming-scripting/167069-solved-sed-awk-print-between-patterns-first-occurrence.html but the sed was completely Greek to me, and I couldn't muddle through it to get it working for my problem.