
I've been trying to implement server-side XSLT transformations as an IIS HttpModule. My basic approach is to install a new filter at BeginRequest that diverts writes into a MemoryStream, and then at PreSendRequestContent to transform the document using XSLT and write the result to the original output stream. However, even without performing the transformation, the HttpModule appears to work for the first page load and then I get no response from the server at all until I restart the application pool. With the transformation in place I get an empty page the first time and then no response. I'm clearly doing something stupid, but this is the first C# code I've written in years (and my first attempt at an HttpModule) and I have no idea what the problem might be. What mistakes am I making? (I've commented out the XSLT part in the code below and uncommented a line that writes the contents of the cache to the response.)

using System;
using System.IO;
using System.Text;
using System.Web;
using System.Xml;
using System.Xml.Xsl;

namespace Onyx {

    public class OnyxModule : IHttpModule {

        public String ModuleName {
            get { return "OnyxModule"; }
        }

        public void Dispose() {
        }

        public void Init(HttpApplication application) {

            application.BeginRequest += (sender, e) => {
                HttpResponse response = HttpContext.Current.Response;
                response.Filter = new CacheFilter(response.Filter);
                response.Buffer = true;
            };

            application.PreSendRequestContent += (sender, e) => {

                HttpResponse response = HttpContext.Current.Response;
                CacheFilter cache = (CacheFilter)response.Filter;

                response.Filter = cache.originalStream;
                response.Clear();

 /*               XmlReader xml = XmlReader.Create(new StreamReader(cache), new XmlReaderSettings() {
                    ProhibitDtd = false,
                    ConformanceLevel = ConformanceLevel.Auto
                });

                XmlWriter html = XmlWriter.Create(response.OutputStream, new XmlWriterSettings() {
                    ConformanceLevel = ConformanceLevel.Auto
                });

                XslCompiledTransform xslt = new XslCompiledTransform();
                xslt.Load("http://localhost/transformations/test_college.xsl", new XsltSettings() {
                    EnableDocumentFunction = true
                }, new XmlUrlResolver());
                xslt.Transform(xml, html); */

                response.Write(cache.ToString());

                response.Flush();

            };

        }


    }

    public class CacheFilter : MemoryStream {

        public Stream originalStream;
        private MemoryStream cacheStream;

        public CacheFilter(Stream stream) {
            originalStream = stream;
            cacheStream = new MemoryStream();
        }

        public override int Read(byte[] buffer, int offset, int count) {
            return cacheStream.Read(buffer, offset, count);
        }

        public override void Write(byte[] buffer, int offset, int count) {
            cacheStream.Write(buffer, offset, count);
        }

        public override bool CanRead {
            get { return cacheStream.CanRead; }
        }

        public override string ToString() {
            return Encoding.UTF8.GetString(cacheStream.ToArray());
        }

    }

}
Rich
  • No offense, but your sample is miles away from that fancy "Clean Code" thing. – Filburt Feb 17 '10 at 16:48
  • @Filburt - please be more specific, or you're just spamming. – Jeff Sternal Feb 17 '10 at 16:59
  • @Jeff: Okay, just two obvious points: 1) Naming - (HttpApplication context) is misleading. 2) cramming all the stuff into Init() which according to its name should Init the Module. – Filburt Feb 17 '10 at 17:12
  • No offence taken. I've corrected the naming of the argument, but I think I actually find the version with the lambda expressions easier to read and more maintainable (although that might be because I'm not a C# developer). At the moment I wouldn't even call it a prototype but I just want to see whether it's possible to do what I want to do with adequate performance so I've been going through many iterations today trying to just get some output rather than worrying about which classes and methods have which responsibilities. I spent a while attaching the handlers to different events... – Rich Feb 17 '10 at 19:35
  • ...and it was quite tedious doing so when there was a lot of application.SomeEvent += (new EventHandler(this.Application_SomeEvent))... private void Application_SomeEvent(...) all over the place. Having said that, I'm sure that the production version won't have more than two or three lines in each lambda. – Rich Feb 17 '10 at 19:38
  • @Rich: Glad my comment didn't come out too snotty for you. Very much like Uncle Bob, I consider it important how to name things and where to put them. Of course hacking down all these event handlers isn't really sexy, but "syntactical sugar" like lambda expressions tends to go bad. – Filburt Feb 17 '10 at 21:05
  • @Filburt: I think that in this case, though, having an event handler called Application_BeginRequest isn't really naming something that needs to be named. I tend to think that there are three reasons for defining a named function: (1) to re-use a piece of functionality without duplicating it; (2) to improve readability or simplify complex logic; (3) to abstract out parts of an algorithm to improve future maintainability. In the case of the BeginRequest handler, I don't think any of these three apply, but of course my idea of simplicity or readability is more important to me than Uncle Bob's... – Rich Feb 17 '10 at 21:22
  • ...but I have little idea what idiomatic C# code looks like so I'm not really in a position to judge what other developers would find easiest to read or maintain in the future. I agree that the other event handler in my sample is a monstrosity though. I tend to think that when I make it work in the simple case I'll move all the XSLT stuff into another object whose sole responsibility is transforming the document. I need to do quite a lot more than is shown: for example I have to determine which stylesheet to use either from the request or processing directives in the document. – Rich Feb 17 '10 at 21:25
  • @Rich: Did you already check whether this invalid character at position 1 is caused by a BOM? (See my comment on Mikael's answer.) – Filburt Feb 18 '10 at 18:33
  • @Filburt: I didn't have much time for development today as our operations team had our VMware cluster down for maintenance so I haven't tried that. I did try a variant that performed the transformation in the CacheStream's "Flush" method and directly read the stream using an XmlReader, which had the same "work on first load but not thereafter" problem. XmlReaders are supposed to handle BOMs correctly, aren't they? – Rich Feb 18 '10 at 20:13

3 Answers


When you are done reading the data into your MemoryStream the position is at the end of the stream. Before sending the stream to the StreamReader/XmlReader you need to reset the position to 0.

stream.Position = 0;
/* or */
stream.Seek(0, SeekOrigin.Begin);
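
As a minimal, self-contained sketch of the effect described above (the class and method names are hypothetical, chosen only for illustration): a reader attached to a MemoryStream that has just been written sees nothing until the position is reset. Note that in the question's CacheFilter the bytes live in the inner cacheStream, so it would be that stream's position that needs resetting, not the wrapper's.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical demo class, not part of the question's module.
public static class RewindDemo {

    // Reads everything from the stream's current position as UTF-8 text.
    public static string ReadFromCurrentPosition(Stream stream) {
        return new StreamReader(stream, Encoding.UTF8).ReadToEnd();
    }

    public static void Main() {
        MemoryStream cache = new MemoryStream();
        byte[] xml = Encoding.UTF8.GetBytes("<doc/>");
        cache.Write(xml, 0, xml.Length);

        // The position is now at the end, so there is nothing left to read.
        string withoutRewind = ReadFromCurrentPosition(cache);

        // After rewinding, the same stream yields the full document.
        cache.Position = 0;
        string withRewind = ReadFromCurrentPosition(cache);

        Console.WriteLine(withoutRewind.Length); // 0
        Console.WriteLine(withRewind);           // <doc/>
    }
}
```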
Mikael Svenson
  • The second version has the arguments in the wrong order. I've now modified the code to do that and I get the XSLT output once. The second time I load the page I get "'', hexadecimal value 0x1F, is an invalid character. Line 1, position 1" which is certainly a step forward. – Rich Feb 17 '10 at 16:06
  • Corrected my code.. that's what I get without intellisense ;) For further debugging you should dump your incoming stream to disk and examine that it is in fact valid xml. Could it be that it is gzipped? – Mikael Svenson Feb 17 '10 at 17:19
  • 0x1F seems to be the BOM for your utf-8 encoded response. You can instruct UTF8Encoding to omit the BOM: http://msdn.microsoft.com/en-us/library/s064f8w2.aspx – Filburt Feb 17 '10 at 22:41
  • The problem was static content compression. When I turned that off it worked fine (although I moved the XSLT processing into the filter's "Write" method.) – Rich Feb 19 '10 at 14:51
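
The 0x1F in the error message is consistent with that diagnosis: a gzip stream begins with the two bytes 0x1F 0x8B, whereas a UTF-8 byte order mark is EF BB BF. If turning compression off is not an option, one possible approach is to sniff the buffered bytes and decompress them before handing them to the XmlReader. The sketch below assumes gzip specifically (not deflate), and the class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper, not part of the question's module.
public static class GzipSniffer {

    // True when the buffer starts with the gzip magic number 0x1F 0x8B.
    public static bool LooksGzipped(byte[] data) {
        return data.Length >= 2 && data[0] == 0x1F && data[1] == 0x8B;
    }

    // Returns the data unchanged unless it looks gzipped, in which case
    // it is decompressed first.
    public static byte[] DecompressIfNeeded(byte[] data) {
        if (!LooksGzipped(data)) {
            return data;
        }
        using (GZipStream input = new GZipStream(new MemoryStream(data), CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream()) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = input.Read(chunk, 0, chunk.Length)) > 0) {
                output.Write(chunk, 0, read);
            }
            return output.ToArray();
        }
    }

    // Only used here to produce test input for the demo.
    public static byte[] Compress(byte[] data) {
        using (MemoryStream output = new MemoryStream()) {
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress)) {
                gzip.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    public static void Main() {
        byte[] plain = Encoding.UTF8.GetBytes("<doc/>");
        byte[] zipped = Compress(plain);

        Console.WriteLine(LooksGzipped(plain));   // False
        Console.WriteLine(LooksGzipped(zipped));  // True
        Console.WriteLine(Encoding.UTF8.GetString(DecompressIfNeeded(zipped)));
    }
}
```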

I'm a little surprised this works at all (even after resetting the stream's position). I poked around the HttpApplication code a bit, and though I don't fully understand it, it looks like you may be modifying the output stream too late in the request handling process.

If you still haven't figured this out, try attaching your second handler function to one of the events after PostReleaseRequestState - either UpdateRequestCache or PostUpdateRequestCache. Neither sounds especially suitable, but read on!

For some reason, the MSDN documentation for HttpApplication doesn't include PreSendRequestContent in its list of events, but Reflector shows that its handlers don't get called until HttpResponse.Flush.

If I'm reading the code right, Response.Flush calculates the content length before the handlers are called, so it thinks the response is empty when it gets to this code:

if (contentLength > 0L) {
    byte[] bytes = Encoding.ASCII.GetBytes(Convert.ToString(contentLength, 0x10) + "\r\n");
    this._wr.SendResponseFromMemory(bytes, bytes.Length);
    this._httpWriter.Send(this._wr);
    this._wr.SendResponseFromMemory(s_chunkSuffix, s_chunkSuffix.Length);
}

There are some alternate paths that may get called depending on your entry point and initial conditions, and that might explain why this works some of the time but not others. But at the end of the day you probably shouldn't be modifying the response stream once you're in Flush.

You're doing something a little unusual - you're not really filtering the response stream in the traditional sense (where you pass some bytes along to another stream), so you may have to do something a bit hackish to make your current design work.
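
One way to keep the whole-document transformation inside the filter itself, rather than in a late pipeline event, is to buffer every Write and emit the transformed result once when the filter is closed. The following is only a sketch under stated assumptions: the class name BufferingFilter and the transform delegate are hypothetical, the uppercase transform stands in for the XSLT step, and it assumes ASP.NET closes the installed filter at the end of the response, which is worth verifying for your IIS version and pipeline mode.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical write-only filter: buffers all output, then writes the
// transformed bytes to the wrapped stream exactly once, on Close.
public class BufferingFilter : Stream {

    private readonly Stream inner;
    private readonly Func<byte[], byte[]> transform;
    private readonly MemoryStream buffer = new MemoryStream();
    private bool emitted;

    public BufferingFilter(Stream inner, Func<byte[], byte[]> transform) {
        this.inner = inner;
        this.transform = transform;
    }

    public override void Write(byte[] data, int offset, int count) {
        buffer.Write(data, offset, count);
    }

    // Nothing is forwarded on Flush; the complete document is needed first.
    public override void Flush() {
    }

    public override void Close() {
        if (!emitted) {
            emitted = true;
            byte[] result = transform(buffer.ToArray());
            inner.Write(result, 0, result.Length);
            inner.Flush();
        }
        base.Close();
    }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] data, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }

    // Stand-in for the XSLT step: uppercases the buffered text.
    public static byte[] Uppercase(byte[] input) {
        return Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(input).ToUpperInvariant());
    }

    public static void Main() {
        MemoryStream original = new MemoryStream();
        BufferingFilter filter = new BufferingFilter(original, Uppercase);

        byte[] part1 = Encoding.UTF8.GetBytes("<doc>hel");
        byte[] part2 = Encoding.UTF8.GetBytes("lo</doc>");
        filter.Write(part1, 0, part1.Length);
        filter.Write(part2, 0, part2.Length);
        filter.Close();

        Console.WriteLine(Encoding.UTF8.GetString(original.ToArray()));
    }
}
```

In a module this would be installed in BeginRequest in place of the question's CacheFilter, for example `response.Filter = new BufferingFilter(response.Filter, someTransform)`, where `someTransform` is whatever wraps the XslCompiledTransform call. Because the filter never reads back from itself, the stream position problem does not arise.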

An alternative would be to implement this using an IHttpHandler instead of a module - there's a good example here. It deals with transforming output from a database query, but should be easy to adapt to a file system data source.

Jeff Sternal
  • I've tried attaching the handlers to other events but to no avail. I think I might be doing something wrong with streams though. In this case an HttpHandler implementation isn't the best option as I want to be able to transform files, WCF data services, the output of other applications and so on transparently. At the moment we're doing this using the URL rewriter to map external URLs to a PHP script that forwards requests to another endpoint and then transforms the responses using libxslt, but the performance could be better and it's very far from transparent. – Rich Feb 18 '10 at 18:16
  • Blast - ah well, I'll leave this here to ward away anyone else that might be similarly misguided. – Jeff Sternal Feb 18 '10 at 18:57

Even if you don't stick to the MSDN example, you should handle HttpApplication.EndRequest:

application.EndRequest += (sender, e) => {
    HttpResponse response = HttpContext.Current.Response;
    response.Flush();
};

Or, more cleanly, with a named handler:

// ...

public void Init(HttpApplication application)
{
    // ...
    application.EndRequest += (new EventHandler(this.Application_EndRequest));
}

private void Application_EndRequest(object sender, EventArgs e)
{
    // The event source is the HttpApplication that raised EndRequest.
    HttpApplication application = (HttpApplication)sender;
    HttpContext context = application.Context;
    context.Response.Flush();
}
Filburt
  • Ironically, this is how I started out but then changed to the lambda version because I thought it was cleaner. – Rich Feb 17 '10 at 22:58