16

I'm developing a Web application using F#, and I'm thinking about how to protect it against SQL injection, XSS, and other vulnerabilities that come from user input strings.

In short, I need some compile-time constraints that would allow me to discriminate plain strings from those representing SQL, URL, XSS, XHTML, etc.

Many languages have it, e.g. Ruby’s native string-interpolation feature #{...}.
In F#, Units of Measure seem well suited to this, but they are only available for numeric types.
There are several solutions employing runtime UoM (link), but I think they add too much overhead for my goal.

I've looked into FSharpPowerPack, and it seems quite possible to come up with something similar for strings:

[<MeasureAnnotatedAbbreviation>] type string<[<Measure>] 'u> = string
// Similarly to Core.LanguagePrimitives.IntrinsicFunctions.retype
[<NoDynamicInvocation>]
let inline retype (x:'T) : 'U = (# "" x : 'U #)
let StringWithMeasure (s: string) : string<'u> = retype s

[<Measure>] type plain
let fromPlain (s: string<plain>) : string =
    // of course, this one should be implemented properly
    // by invalidating special characters and then assigning a proper UoM
    retype s

// Supposedly populated from user input
let userName:string<plain> = StringWithMeasure "John'); DROP TABLE Users; --"
// the following line does not compile
let sql1 = sprintf "SELECT * FROM Users WHERE name='%s';" userName
// the following line compiles fine
let sql2 = sprintf "SELECT * FROM Users WHERE name='%s';" (fromPlain userName)

Note: It's just a sample; don't suggest using SqlParameter. :-)

My questions are: Is there a decent library that does it? Is there any possibility to add syntax sugar?
Thanks.

Update 1: I need compile-time constraints, thanks Daniel.

Update 2: I'm trying to avoid any runtime overhead (tuples, structures, discriminated unions, etc).

Be Brave Be Like Ukraine
  • 2
    See http://blog.moertel.com/articles/2006/10/18/a-type-based-solution-to-the-strings-problem for an interesting Haskell take on the problem. – kvb Feb 23 '12 at 16:53
  • 1
    Well, if you do try to implement this, I'd be interested to see it! – Benjol Feb 24 '12 at 06:40
  • 2
    @kvb, your link seems to have gone dead... let me put a working link here, just for myself :) http://blog.moertel.com/posts/2006-10-18-a-type-based-solution-to-the-strings-problem.html – moudrick May 03 '16 at 08:00

5 Answers

7

A bit late (I'm sure there's a time format where there is only one bit different between February 23rd and November 30th), but I believe these one-liners are compatible with your goal:

type string<[<Measure>] 'm> = string * int<'m>

type string<[<Measure>] 'm> = { Value : string }

type string<[<Measure>] 'm>(Value : string) = struct end
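
As a rough sketch of how the record variant might be used: the type name `TaggedString`, the measures `plain` and `sql`, and the helpers `ofUserInput`, `escapeSql`, and `selectUser` are all hypothetical names chosen for illustration, and the quote-doubling is a naive stand-in for real escaping. Note that the record is still a heap-allocated wrapper at runtime.

```fsharp
[<Measure>] type plain   // raw, untrusted user input (hypothetical tag)
[<Measure>] type sql     // escaped for use inside a SQL string literal (hypothetical tag)

// Record with a phantom measure parameter: 'm exists only at compile time.
type TaggedString<[<Measure>] 'm> = { Value : string }

// Everything that comes from the user starts out tagged as plain.
let ofUserInput (s : string) : TaggedString<plain> = { Value = s }

// Naive escaping, for illustration only: doubles single quotes.
let escapeSql (s : TaggedString<plain>) : TaggedString<sql> =
    { Value = s.Value.Replace("'", "''") }

// Only strings tagged as sql are accepted here.
let selectUser (name : TaggedString<sql>) : string =
    sprintf "SELECT * FROM Users WHERE name='%s';" name.Value

let userName = ofUserInput "O'Brien"
// let bad = selectUser userName   // rejected: TaggedString<plain> is not TaggedString<sql>
let query = selectUser (escapeSql userName)
```

The commented-out line is the point of the exercise: the mismatch between `plain` and `sql` is reported by the compiler rather than discovered at runtime.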
Ramon Snir
  • Thank you for the answer, but the first one forces constructing a `Tuple<_,_>`, and the latter ones are actually `struct`'s. I'm trying to avoid any runtime overhead. – Be Brave Be Like Ukraine Dec 15 '12 at 19:52
3

In theory it's possible to use 'units' to provide various kinds of compile-time checks on strings (is this string 'tainted' user input, or sanitized? is this filename relative or absolute? ...)

In practice, I've personally not found it to be too practical, as there are so many existing APIs that just use 'string' that you have to exercise a ton of care and manual conversions plumbing data from here to there.

I do think that 'strings' are a huge source of errors, and that type systems that deal with taintedness/canonicalization/etc on strings will be one of the next leaps in static typing for reducing errors, but I think that's like a 15-year horizon. I'd be interested in people trying an approach with F# UoM to see if they get any benefit, though!

Brian
  • Maybe I don't fully understand it, but what would a UoM approach to this provide beyond wrapper classes with validation? The level of type safety seems the same but the latter is (currently) much easier to implement. – Daniel Feb 23 '12 at 19:25
  • Daniel: The reasons are very similar to those for numeric types. UoM prevent accidentally mixing variables that serve different _purposes_ yet are stored as the same _runtime type_. There's a perfect explanation of the "code smell" problem, written by Joel: http://www.joelonsoftware.com/articles/Wrong.html – Be Brave Be Like Ukraine Feb 23 '12 at 21:04
  • I understand not wanting to mix different string types, but you could achieve that various ways (wrapper classes, etc). I'm just not clear on the advantage of the UoM approach to this. – Daniel Feb 23 '12 at 21:37
  • 4
    UoM can be made generic in ways wrappers cannot. And they are erased, so no runtime cost. Do you see an advantage to UoM for numeric code? The same advantages apply; you could write struct wrappers around floats for kilograms and seconds, but UoM are superior. I think? Right? – Brian Feb 23 '12 at 23:08
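
To make the last comment concrete for the numeric case, here is a small sketch. The measures `kg` and `sec` and the function `square` are illustrative names, not taken from the thread. It shows one definition that is generic over the unit, and that the unit is gone at runtime.

```fsharp
[<Measure>] type kg
[<Measure>] type sec

// One generic definition covers every unit; the compiler computes the result unit.
let square (x : float<'u>) : float<'u ^ 2> = x * x

let massSquared : float<kg ^ 2> = square 3.0<kg>
let timeSquared : float<sec ^ 2> = square 2.0<sec>

// Units are erased: the boxed value is an ordinary System.Double.
let runtimeType = (box massSquared).GetType()
```

A struct wrapper per unit would need a separate `square` for each wrapper type (or an interface), which is the kind of genericity the comment refers to.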
3

The simplest solution to not being able to do

"hello"<unsafe_user_input>

would be to write a type which had some numeric type to wrap the string like

type mystring<[<Measure>] 't>(s:string) =
    // 't has to be declared as a measure to be usable in 1<'t>
    let dummyint = 1<'t>
    member this.Value = s

Then you have a compile time check on your strings

John Palmer
  • 1
    I would suggest using a struct instead of a class, to avoid the (admittedly) small overhead of the wrapper. – Joh May 04 '12 at 20:00
2

You can use discriminated unions:

type ValidatedString = ValidatedString of string
type SmellyString = SmellyString of string

let validate (SmellyString s) =
  if (* ... *) then Some(ValidatedString s) else None

You get a compile-time check, and adding two validated strings won't generate a validated string (which units of measure would allow).

If the added overhead of the reference types is too big, you can use structs instead.
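
A minimal sketch of the struct variant, assuming F# 4.1 or later (where `[<Struct>]` is allowed on discriminated unions). The validation rule below (letters and digits only) is a made-up placeholder, not a real sanitizer, and the `option` result is still a reference type.

```fsharp
// Single-case unions compiled as value types instead of classes.
[<Struct>]
type SmellyString = SmellyString of string

[<Struct>]
type ValidatedString = ValidatedString of string

// Placeholder rule for illustration: accept only letters and digits.
let validate (SmellyString s) : ValidatedString option =
    if s |> Seq.forall (fun c -> System.Char.IsLetterOrDigit c)
    then Some (ValidatedString s)
    else None

let accepted = validate (SmellyString "John42")
let rejected = validate (SmellyString "John'); DROP TABLE Users; --")
```

The wrapper no longer costs a heap allocation per string, but it is still a distinct runtime type, so it does not give the full erasure that units of measure give for numbers.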

Joh
  • 2
    Thank you for the suggestion. The problem is that DU is a class, you cannot avoid constructing it. `int` is actually an `int` at runtime, which makes no overhead at all. – Be Brave Be Like Ukraine May 04 '12 at 20:27
2

It's hard to tell what you're trying to do. You said you "need some runtime constraints" but you're hoping to solve this with units of measure, which are strictly compile-time. I think the easy solution is to create SafeXXXString classes (where XXX is Sql, Xml, etc.) that validate their input.

type SafeSqlString(sql: string) =
  do
    // check `sql` for injection, etc.
    // raise exception if validation fails
    if isNull sql then nullArg "sql" // placeholder check so the `do` block is not empty
  member __.Sql = sql

It gives you run-time, not compile-time, safety. But it's simple, self-documenting, and doesn't require reading the F# compiler source to make it work.

But, to answer your question, I don't see any way to do this with units of measure. As far as syntactic sugar goes, you might be able to encapsulate it in a monad, but I think it will make it more clunky, not less.

Daniel
  • But your question mentions needing run-time constraints--thus the confusion. However, I attempted to address both possibilities. – Daniel Feb 23 '12 at 17:51