There are a few ways to achieve this:
Inheriting from pydantic.ConstrainedStr
Instead of using constr to specify the regex constraint (which uses pydantic.ConstrainedStr internally), you can inherit from pydantic.ConstrainedStr directly:
import re
import pydantic
from pydantic import Field
from typing import List
class Regex(pydantic.ConstrainedStr):
    regex = re.compile("^[0-9a-z_]*$")

class Data(pydantic.BaseModel):
    regex: List[Regex]

data = Data(**{"regex": ["abc", "123", "asdf"]})
print(data)
# regex=['abc', '123', 'asdf']
print(data.json())
# {"regex": ["abc", "123", "asdf"]}
Mypy accepts this happily and pydantic validates correctly. The type of data.regex[i] is Regex, but because pydantic.ConstrainedStr itself inherits from str, it can be used as a string in most places.
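To illustrate why a str subclass can stand in for a plain string, here is a minimal stdlib-only sketch. PatternStr and its validate classmethod are hypothetical names made up for this illustration; this is not how pydantic implements ConstrainedStr internally.

```python
import re


class PatternStr(str):
    """Hypothetical stand-in for a regex-constrained str subclass."""

    pattern = re.compile(r"^[0-9a-z_]*$")

    @classmethod
    def validate(cls, value: str) -> "PatternStr":
        # Reject values that do not match the class-level pattern.
        if not cls.pattern.match(value):
            raise ValueError(f"{value!r} does not match {cls.pattern.pattern}")
        return cls(value)


item = PatternStr.validate("abc_123")

# The instance is still a str, so ordinary string code accepts it.
plain: str = item
upper = item.upper()
is_str = isinstance(item, str)
```

Assigning item to a variable annotated as str typechecks because PatternStr is a subclass of str, which is the same reason data.regex[i] can be passed around as a string.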
Using pydantic.Field
The regex constraint can also be specified as an argument to Field:
import pydantic
from pydantic import Field
from typing import List
class Regex(pydantic.BaseModel):
    __root__: str = Field(regex="^[0-9a-z_]*$")

class Data(pydantic.BaseModel):
    regex: List[Regex]

data = Data(**{"regex": ["abc", "123", "asdf"]})
print(data)
# regex=[Regex(__root__='abc'), Regex(__root__='123'), Regex(__root__='asdf')]
print(data.json())
# {"regex": ["abc", "123", "asdf"]}
Because Regex is not directly used as a field in a pydantic model (but as an entry in a list in your example), we need to introduce a model by force. __root__ makes the Regex model act as its single field when validating and serializing (more details here).
But it has a drawback: the type of data.regex[i] is again Regex, but this time it does not inherit from str. As a result, something like foo: str = data.regex[0] does not typecheck; foo: str = data.regex[0].__root__ has to be used instead.
I'm still mentioning this here because it might be the simplest solution when the constraint is applied directly to a field and not to a list entry (and typing.Annotated is not available, see below). For example:
class DataNotList(pydantic.BaseModel):
    regex: str = Field(regex="^[0-9a-z_]*$")
Using typing.Annotated with pydantic.Field
Instead of using constr to specify the regex constraint, you can specify it as an argument to Field and then use it in combination with typing.Annotated:
import pydantic
from pydantic import Field
from typing import Annotated
Regex = Annotated[str, Field(regex="^[0-9a-z_]*$")]

class DataNotList(pydantic.BaseModel):
    regex: Regex

data = DataNotList(**{"regex": "abc"})
print(data)
# regex='abc'
print(data.json())
# {"regex": "abc"}
Mypy treats Annotated[str, Field(regex="^[0-9a-z_]*$")] as a type alias of str. But it also tells pydantic to do validation.
This is described in the pydantic docs here.
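As a rough illustration of how one annotation can serve both a type checker and a validating library, the stdlib-only sketch below inspects an Annotated alias. It assumes Python 3.9 or newer; the metadata string is a placeholder standing in for Field(...), and Holder is a hypothetical class, not a pydantic model.

```python
from typing import Annotated, get_args, get_type_hints

# Placeholder metadata; in the answer above this slot holds Field(regex=...).
Regex = Annotated[str, "regex=^[0-9a-z_]*$"]


class Holder:
    regex: Regex


# Without extras the annotation collapses to plain str (roughly what a type checker sees).
plain_hints = get_type_hints(Holder)

# With extras the metadata is preserved for libraries that want to read it.
rich_hints = get_type_hints(Holder, include_extras=True)

base_type, metadata = get_args(rich_hints["regex"])
print(plain_hints["regex"])
print(base_type, metadata)
```

The metadata is ignored by static type checkers but remains available at runtime, which is what allows a library to attach validation to an otherwise ordinary str annotation.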
Unfortunately it does not currently work with the following:
class Data(pydantic.BaseModel):
    regex: List[Regex]
The validation simply does not run for the list entries. This is an open bug (github issue). Once the bug is fixed, this might be the best solution overall.
Note that typing.Annotated is only available since Python 3.9. For older Python versions typing_extensions.Annotated can be used.
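One common way to support both cases is a version-guarded import, sketched below. The metadata string is only a placeholder so the snippet runs without pydantic; on interpreters older than 3.9 it assumes the third-party typing_extensions package is installed.

```python
import sys

if sys.version_info >= (3, 9):
    from typing import Annotated
else:
    # Requires the third-party typing_extensions package on older interpreters.
    from typing_extensions import Annotated

# Placeholder metadata; substitute Field(regex=...) when using pydantic.
Regex = Annotated[str, "placeholder-metadata"]
```

An explicit version check, rather than try/except ImportError, tends to be easier for type checkers to follow.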
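As a side note: I've used ^[0-9a-z_]*$ instead of [0-9a-z_]* for the regex. The latter would accept any string as valid, because pydantic uses re.match for validation, which only anchors at the start of the string, and the unanchored pattern can match zero characters there.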
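The difference can be checked with the re module alone. This is a small sketch using plain re.match, independent of pydantic:

```python
import re

unanchored = re.compile(r"[0-9a-z_]*")
anchored = re.compile(r"^[0-9a-z_]*$")

bad_value = "NOT valid!"

# re.match only anchors at the start; "*" lets the pattern match zero characters there.
unanchored_match = unanchored.match(bad_value)
# With "$" the pattern must also reach the end of the string, so this fails.
anchored_match = anchored.match(bad_value)

print(unanchored_match is not None)    # True
print(repr(unanchored_match.group()))  # ''
print(anchored_match is None)          # True
```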