I need a regular expression that makes sure a string does not start with or end with a space. I don't care if it has a space in the "middle" just not at the beginning or the end.

I have a regular expression that almost works:

^\S.*\S$

Here are some example results:

"HELLO" (Match)
"HEL LO" (Match)
" HELLO" (No Match)
"HELLO " (No Match)
"H" (No Match)

As you can see, the issue I am having is that when the string is only 1 character long ("H" in the example above) it doesn't return a match.

How do I modify my regular expression to handle the case where the string length is 1?

Thank you

NOTE - I am saving this data to an Xml file so I need a pattern to match the same thing in Xml schema. I am not sure if it's the same as whatever Regex in C# uses or not.

If anyone could provide me with the pattern to use in my schema that would be greatly appreciated!

Jan Tacci
  • That's a tricky task. According to [this link](http://www.regular-expressions.info/xml.html), the XML schema flavor of Regex doesn't support anchors or lookaheads/lookbehinds. – pcnThird Jul 21 '13 at 02:04
  • Ah. Thanks I will have to investigate this further then. – Jan Tacci Jul 21 '13 at 02:06

2 Answers

You could do this:

^\S(.*\S)?$

It matches a single non-space character, optionally followed by zero or more characters that end in another non-space character. The optional group is what lets a one-character string such as "H" match.
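As a quick check, here is a minimal Python sketch that runs this pattern against the example strings from the question. Python's `re` module is an assumption here (the question uses C#'s Regex), though it should behave the same way for these inputs. The helper name `is_trimmed` is made up for illustration.

```python
import re

# Pattern from the answer: one non-space character, optionally followed by
# anything that ends in another non-space character.
PATTERN = re.compile(r"^\S(.*\S)?$")

def is_trimmed(text: str) -> bool:
    """Hypothetical helper: True if single-line text is non-empty and has
    no leading or trailing whitespace."""
    return PATTERN.match(text) is not None

# Example strings from the question.
for sample in ["HELLO", "HEL LO", " HELLO", "HELLO ", "H"]:
    print(repr(sample), "Match" if is_trimmed(sample) else "No Match")
```

The empty string is rejected, because the leading `\S` must consume at least one character.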


Update

Given that you said this was for XML schema validation I tested it with this schema:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="xml">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="test" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="value">
              <xs:simpleType>
                <xs:restriction base="xs:string">
                  <xs:pattern value="\S(.*\S)?"/>
                </xs:restriction>
              </xs:simpleType>
            </xs:attribute>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

I validated it against this sample document:

<xml>
  <test value="HELLO"/>    <!-- MATCH -->
  <test value="HEL LO"/>   <!-- MATCH -->
  <test value="HELLO "/>   <!-- ERROR -->
  <test value=" HELLO"/>   <!-- ERROR -->
  <test value="H"/>        <!-- MATCH -->
</xml>

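So it appears that if you simply remove the start / end anchors (`^` and `$`), it works. XML Schema patterns are implicitly anchored: the pattern always has to match the entire value, so the anchors are not needed there.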
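The effect of dropping the anchors can be approximated outside of a schema validator. The sketch below assumes Python's `re.fullmatch` as a stand-in for XML Schema's whole-value matching. It is not an XSD validator, and Python's `\S` is not guaranteed to agree with the schema flavor on every Unicode character. The function name `xsd_like_match` is hypothetical.

```python
import re

# The schema pattern, written without ^ and $.
XSD_PATTERN = r"\S(.*\S)?"

def xsd_like_match(value: str) -> bool:
    """Hypothetical helper: require the pattern to cover the whole value,
    which is how an xs:pattern facet is applied."""
    return re.fullmatch(XSD_PATTERN, value) is not None

# Same values as the sample document above.
for value in ["HELLO", "HEL LO", "HELLO ", " HELLO", "H"]:
    print(repr(value), "MATCH" if xsd_like_match(value) else "ERROR")

# For contrast: a substring search with the unanchored pattern still finds
# "HELLO" inside "HELLO ", which is why whole-value matching matters.
print(re.search(XSD_PATTERN, "HELLO ") is not None)
```

In C#, where patterns are not implicitly anchored, the `^` and `$` still have to stay in the expression.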

p.s.w.g
  • Thank you. I am getting really sick of this regular expression stuff. My head is about to explode. :) – Jan Tacci Jul 21 '13 at 01:47
  • I am saving this data to an Xml file so I need a schema that validates the data the same exact way. I tried using the pattern that you gave me in my Xml schema but it did not like it. I guess Xml uses different regular expression formats than C# Regex(?). – Jan Tacci Jul 21 '13 at 01:51
  • @JanTacci According to [this table](http://www.regular-expressions.info/refflavors.html), XML schema's regular expression flavor doesn't understand non-capturing groups (`(?:...)`). Try using a regular group instead. See my updated answer. – p.s.w.g Jul 21 '13 at 02:01
  • I tried your updated pattern in my Xml schema and it still did not work. :( (But it does work in my C# Regex expression!) – Jan Tacci Jul 21 '13 at 02:04
  • I'm not sure. From what I can tell, this *should* work. Perhaps you can solve this by adding more ``'s to your schema? – p.s.w.g Jul 21 '13 at 02:09
  • I must move on in my coding so I am going to simplify my schema to just require the string length be greater than zero. I will periodically check back here or try some more Google searching later. Thanks! – Jan Tacci Jul 21 '13 at 02:11
  • @Jan Tacci, what happens when you remove `^` and `$` from the expression? – pcnThird Jul 21 '13 at 02:11
  • @JanTacci / pcnThird I tested it out by removing `^`/`$`, and it seems to work. See my updated answer. – p.s.w.g Jul 21 '13 at 02:25

You can use lookaround assertions, because they're zero-width: they check the first and last characters without consuming them.

^(?=\S).*(?<=\S)$

It might be better to use negative assertions and positive character classes, though:

^(?!\s).*(?<!\s)$
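
To compare the two lookaround styles, here is a small Python sketch (Python is an assumption; the question targets C#). It uses a positive pair, `(?=\S)` / `(?<=\S)`, and a negative pair, `(?!\s)` / `(?<!\s)`. One difference worth knowing is that the negative form also accepts the empty string, because there is no whitespace character to reject. As noted in the comments on the question, the XML Schema flavor does not support lookaround, so this applies only to the regex used in code.

```python
import re

# Positive form: first and last characters must be non-space.
POSITIVE = re.compile(r"^(?=\S).*(?<=\S)$")
# Negative form: first and last characters must not be whitespace.
NEGATIVE = re.compile(r"^(?!\s).*(?<!\s)$")

samples = ["HELLO", "HEL LO", " HELLO", "HELLO ", "H", ""]
for s in samples:
    pos = POSITIVE.match(s) is not None
    neg = NEGATIVE.match(s) is not None
    print(f"{s!r:10} positive={pos} negative={neg}")
```

The two forms agree on every sample except the empty string.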
Andrew Cheong