
Here is what I'm specifically trying to do:

I have written an HttpModule to do some site-specific tracking. Some old .aspx pages on our site are hard-coded with no real controls, but because they are .aspx files, my module still runs when they are requested.

My module's handler is attached to the PostRequestHandlerExecute event, so I believe the content that will be sent back to the requester has already been determined by then.

I need to be able to extract whatever string is in the title tag.

So if

<title>Chunky Bacon</title>

is sent to the requester in the final rendered HTML, then I want "Chunky Bacon".

Ideas?

spilliton
  • what do you mean "extract whatever string is in the tag"? Are you trying to manipulate the response being sent back to the requester? It isn't clear what you are trying to do. – NerdFury Sep 04 '09 at 17:32
  • sorry, I forgot that my HTML tag wouldn't show up unless I spaced it over into a code block. I don't need to manipulate the response, just extract the string inside the title tag. – spilliton Sep 04 '09 at 17:53
  • To clarify, are you trying to get the content from the response or trying to parse the tag from the content? – Scott Lance Sep 04 '09 at 18:01
  • I'm trying to get the text content of the title html tag. If this html is sent to the requester's browser: `<title>Chunky Bacon</title>` then I want "Chunky Bacon" – spilliton Sep 04 '09 at 18:08

2 Answers


Fun little challenge.

Here's the code:

StreamWatcher.cs

    using System;
    using System.IO;
    using System.Text;

    public class StreamWatcher : Stream
    {
        private Stream _base;
        private MemoryStream _memoryStream = new MemoryStream();

        public StreamWatcher(Stream stream)
        {
            _base = stream;
        }

        public override void Flush()
        {
            _base.Flush();
        }

        public override int Read(byte[] buffer, int offset, int count)
        {
            return _base.Read(buffer, offset, count);
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            _memoryStream.Write(buffer, offset, count);
            _base.Write(buffer, offset, count);
        }

        public override string ToString()
        {
            return Encoding.UTF8.GetString(_memoryStream.ToArray());
        }

        #region Rest of the overrides
        // Forward to the wrapped stream rather than throwing; a commenter
        // reports that .NET 4.6.1 needs these members to be implemented.
        public override bool CanRead
        {
            get { return _base.CanRead; }
        }

        public override bool CanSeek
        {
            get { return _base.CanSeek; }
        }

        public override bool CanWrite
        {
            get { return _base.CanWrite; }
        }

        public override long Seek(long offset, SeekOrigin origin)
        {
            return _base.Seek(offset, origin);
        }

        public override void SetLength(long value)
        {
            _base.SetLength(value);
        }

        public override long Length
        {
            get { return _base.Length; }
        }

        public override long Position
        {
            get { return _base.Position; }
            set { _base.Position = value; }
        }
        #endregion
    }

TitleModule.cs

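    using System.Diagnostics;
    using System.Text.RegularExpressions;
    using System.Web;

    public class TitleModule : IHttpModule
    {
        // Matches titles made of word characters and whitespace only;
        // widen the character class if your titles contain punctuation.
        private static readonly Regex regex = new Regex(@"(?<=<title>)[\w\s\r\n]*?(?=</title)", RegexOptions.Compiled | RegexOptions.IgnoreCase);

        public void Dispose()
        {
        }

        public void Init(HttpApplication context)
        {
            // Wrap the response filter so every byte written is also captured.
            context.BeginRequest += (o, e) =>
            {
                context.Response.Filter = new StreamWatcher(context.Response.Filter);
            };

            context.EndRequest += (o, e) =>
            {
                // Read the watcher back from the response instead of keeping it
                // in a module field, which could be shared between requests.
                StreamWatcher watcher = context.Response.Filter as StreamWatcher;
                if (watcher == null) return;
                Trace.WriteLine(regex.Match(watcher.ToString()).Value.Trim());
            };
        }
    }
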
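
For anyone who wants to try the idea outside of ASP.NET, here is a minimal console sketch of the same two steps: copy everything written to a stream into a side buffer, then pull the title out of the captured text. `TeeStream`, `Demo`, and `ExtractTitle` are hypothetical names made up for this illustration, and the regex is a looser variant than the one above (it also accepts punctuation and attributes on the tag), so treat it as an assumption rather than a drop-in replacement.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical stand-in for StreamWatcher: forwards each write to the
// wrapped stream and keeps a copy of the bytes in memory.
public class TeeStream : Stream
{
    private readonly Stream _inner;
    private readonly MemoryStream _copy = new MemoryStream();

    public TeeStream(Stream inner) { _inner = inner; }

    public override void Write(byte[] buffer, int offset, int count)
    {
        _copy.Write(buffer, offset, count);
        _inner.Write(buffer, offset, count);
    }

    // Everything written so far, decoded as UTF-8 (assumes a UTF-8 response).
    public string Captured
    {
        get { return Encoding.UTF8.GetString(_copy.ToArray()); }
    }

    public override void Flush() { _inner.Flush(); }
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}

public static class Demo
{
    // Captures any text between <title ...> and </title>, across line breaks.
    private static readonly Regex TitleRegex = new Regex(
        @"<title[^>]*>(.*?)</title>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the trimmed title text, or null when no title tag is found.
    public static string ExtractTitle(string html)
    {
        Match m = TitleRegex.Match(html);
        return m.Success ? m.Groups[1].Value.Trim() : null;
    }

    public static void Main()
    {
        byte[] page = Encoding.UTF8.GetBytes(
            "<html><head><title>Chunky Bacon - Home</title></head></html>");
        using (var output = new MemoryStream())
        {
            var tee = new TeeStream(output);
            tee.Write(page, 0, page.Length);
            Console.WriteLine(ExtractTitle(tee.Captured)); // prints "Chunky Bacon - Home"
        }
    }
}
```

The side buffer holds the whole response in memory, so for large pages it may be worth stopping the capture once the closing title tag has been seen.
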
Richard Nienaber
  • That does it, thanks broseph! I'm still super surprised this takes so many lines of code to perform... – spilliton Sep 04 '09 at 21:35
  • With 4.6.1 AFAIK you must implement the methods that are raising the NotImplementedException. The easiest way to do this is of course to defer all of the calls to _memoryStream – João Antunes Sep 16 '16 at 00:37
  • `TitleModule` is a singleton in the scope of the Application, so storing `_watcher` as a field of `TitleModule` is a bad idea since it can be shared between different requests. You don't need to store the `StreamWatcher` since you are assigning it to `context.Response.Filter` and you can take it from there later. – AlbertK Jun 29 '20 at 09:59

There is an article on 4GuysFromRolla that talks about creating HttpResponse filters which are basically streams that process the response before passing it through to the final output stream (an interceptor).

https://web.archive.org/web/20210510022033/https://aspnet.4guysfromrolla.com/articles/120308-1.aspx

NerdFury
  • Cool, I read a little about these on google when looking for a solution, it seems the main purpose of writing one of these is to manipulate the HTML that is sent before it is sent. Since I'm not manipulating and just need access to the HTML, I figured this would be overkill, but if it's the only way... – spilliton Sep 04 '09 at 18:24