A good solution is to parse into an AST containing dependency information, then resolve the dependencies separately, outside the parser. For example, suppose you have a format that may be an #include
line or a content line:
data WithIncludes = WithIncludes [ContentOrInclude]
data ContentOrInclude
= Content String
| Include FilePath
And a parser parse :: String -> WithIncludes
so that these files:
Parse to these representations:
file1 = WithIncludes
[ Content "before"
, Include "file2"
, Content "after"
]
file2 = WithIncludes
[ Content "between"
]
You can add another type representing a flattened file with the imports resolved:
data WithoutIncludes = WithoutIncludes [String]
And separately from parsing, load and recursively flatten includes:
flatten :: WithIncludes -> IO WithoutIncludes
flatten (WithIncludes ls) = WithoutIncludes . concat <$> traverse flatten' ls
where
flatten' :: ContentOrInclude -> IO [String]
flatten' (Content content) = pure [content]
flatten' (Include path) = do
contents <- readFile path
let parsed = parse contents
flatten parsed
Then the result is:
flatten file1 == WithoutIncludes
[ "before"
, "between"
, "after"
]
Parsing remains pure, and you just have an IO
wrapper around it driving which files to load. You can even reuse the logic here for loading a single file:
load :: FilePath -> IO WithoutIncludes
load path = flatten $ WithIncludes [Include path]
It’s also a good idea to add logic here to check for import cycles, for example by adding an accumulator to flatten
containing a Set
of canonicalised FilePath
s, and checking at each Include
that you haven’t seen the same FilePath
already.
For a more complex AST, you may want to share most of the structure between the unresolved and resolved types. In that case, you can parameterise the type by whether it’s resolved, and have the unresolved & resolved types be aliases for the underlying AST type with different arguments, for instance:
data File i = File [ContentOrInclude i]
data ContentOrInclude i
= Content String
| Include i
type WithIncludes = File FilePath
type WithoutIncludes = File [String]