GHC Contributors’ Workshop
Sam Derbyshire, Well-Typed
June 7th, 2023
{-# COMPLETE U #-}
pattern U :: ()
pattern U = ()
foo :: Identity Char
foo = do
U <- return ()
return 'c'
error:
* No instance for `MonadFail Identity'
arising from a do statement
with the failable pattern `U'
* In a stmt of a 'do' block: U <- return ()
⭲
This has been reported many times: #15681 #16470 #22004 #23458 (related: #16618).
Fundamental problem:
do
-notation,However, in this situation, we don’t need types to be able to see that the pattern match succeeds (this is different from situation which involve GADT pattern matches).
⭲
From the error message, we can see that we emit a
MonadFail Identity
Wanted constraint when typechecking
foo
.
Where? We can search for MonadFail
to find out!
⭲
GHC.Rename.Expr
:
rnStmt ctxt rnBody (L loc (BindStmt _ pat (L lb body))) thing_inside
= do { -- ...
; (fail_op, fvs2) <- monadFailOp pat ctxt
; -- ...
}
monadFailOp :: LPat GhcPs
-> HsStmtContext GhcRn
-> RnM (FailOperator GhcRn, FreeVars)
monadFailOp pat ctxt = do
dflags <- getDynFlags
if | isIrrefutableHsPat dflags pat -> return (Nothing, emptyFVs)
| not (isMonadStmtContext ctxt) -> return (Nothing, emptyFVs)
| otherwise -> getMonadFailOp ctxt
⭲
GHC.Tc.Gen.Match
(Gen
= constraint
generator):
tcDoStmt ctxt (BindStmt xbsrn pat rhs) res_ty thing_inside
= do { -- ...
; fail_op' <- fmap join . forM (xbsrn_failOp xbsrn) $ \fail ->
tcMonadFailOp (DoPatOrigin pat) pat' fail new_res_ty
; -- ...
; return (BindStmt xbstc pat' rhs', thing) }
tcMonadFailOp orig pat fail_op res_ty = do
dflags <- getDynFlags
if isIrrefutableHsPat dflags pat
then return Nothing
else Just . snd <$> (tcSyntaxOp orig fail_op [synKnownType stringTy]
(mkCheckExpType res_ty) $ \_ _ -> return ())
tcSyntaxOp
is going to instantiate
fail :: MonadFail m => String -> m a
, which will emit
a MonadFail Identity
Wanted constraint.
⭲
isIrrefutableHsPat :: Bool -- ^ Is @-XStrict@ enabled?
-> Pat (GhcPass p) -> Bool
isIrrefutableHsPat is_strict = \case
WildPat {} -> True
VarPat {} -> True
-- ...
ConPat
{ pat_con = con
, pat_args = details } ->
case ghcPass @p of
GhcPs -> False -- Conservative
GhcRn -> False -- Conservative
GhcTc -> case con of
L _ (PatSynCon _pat) -> False -- Conservative
L _ (RealDataCon con) ->
isJust (tyConSingleDataCon_maybe (dataConTyCon con))
&& all goL (hsConPatArgs details)
⭲
Let’s fix isIrrefutableHsPat
for ConPat
at
GhcRn
and GhcTc
stages.
We need two things:
DataCon
, whether it is irrefutable
(tyConSingleDataCon_maybe
);PatSyn
, whether it is the unique member of a
COMPLETE set.For PatSyn
s, we have the choice of storing whether the
PatSyn
is irrefutable in the PatSyn
itself, or
of threading through the COMPLETE
pragmas to
isIrrefutableHsPat
.
Idea: let’s store whether the constructor pattern is a
DataCon
or a PatSyn
, and if it’s a
DataCon
whether it’s irrefutable, in
XConPat GhcRn
.
(Alternatively we could change ConLikeP GhcRn
to
ConLikeName
, but that wouldn’t handle the
tyConSingleDataCon
situation.)
COMPLETE
pragmas to isIrrefutableHsPat
.
⭲
We want to pass COMPLETE
sets to
isIrrefutableHsPat
.
So let’s start off by figuring out how these are represented in the
compiler.
Now let’s look into passing COMPLETE
sets to
isIrrefutableHsPat
.
Let’s audit its calls; the HLS call hierarchy functionality is very
useful for that.
stmtTreeToStmts .. (StmtTreeApplicative ..)
monadFailOp
in rnStmt .. (BindStmt ..)
and
rn_rec_stmt .. (BindStmt ..)
PatSyn
in tcPatSynMatcher
(from tcPatSynDecl
).
tcMonadFailOp
in tcApplicativeStmts
,
tcDoStmt
, tcMcStmt
.
COMPLETE
pattern
information in the renamer, and before we have actually typechecked
PatSyn
s in the typechecker.
⭲
Let’s add a field to the typechecker environment (used for both the typechecker and the renamer):
data TcGblEnv
= TcGblEnv { -- ...
, tcg_complete_matches :: !CompleteMatches
{- NEW -} , tcg_complete_matches_rn :: !CompleteMatchesRn
, -- ...
}
-- NEW
type CompleteMatchesRn = [CompleteMatchRn]
data CompleteMatchRn = CompleteMatchRn
{ cmConLikesRn :: NameSet -- ^ The set of `ConLike` values
}
⭲
Let’s see then how to thread through the correct information; we need
to ensure that we first rename pattern synonym COMPLETE signatures and
add that information to the TcGblEnv
environment before we
call isIrrefutableHsPat
.
⭲
monadFailOp
We want to find what to change to pass on the correct
COMPLETE
information.
Let’s use HLS to find the call hierarchy of
GHC.Rename.Expr.monadFailOp
.
monadFailOp
rnStmt
rnStmtsWithFreeVars
rnExpr
rnGRHS'
rnGHRS
rnGHRSs
rnBind
rnLBind
rnValBindsRHS
⭲
rnSrcDecls
rnValBindsRHS
rnValBindsRHS
is interesting, as this is where we also
rename signatures:
rnValBindsRHS :: HsSigCtxt
-> HsValBindsLR GhcRn GhcPs
-> RnM (HsValBinds GhcRn, DefUses)
rnValBindsRHS ctxt (ValBinds _ mbinds sigs)
= do { (sigs', sig_fvs) <- renameSigs ctxt sigs
; binds_w_dus <- mapBagM (rnLBind (mkScopedTvFn sigs')) mbinds
-- ...
}
Here we clearly see that we rename Sig
s first (which
includes COMPLETE
pragmas), and then rename value bindings.
So we can simply extend the TcGblEnv
, e.g.:
Now, this does the trick in the renamer. However, we also need to
make sure to persist the information in
tcg_complete_matches
. So we should return the
CompleteMatchesRn
at the end of rnValBindsRHS
,
and make sure to add them to the final typechecker environment at the
end of rnSrcDecls
.
⭲
tcg_complete_matches
We then need to look this information up in
isIrrefutableHsPat
.
isIrrefutableHsPat :: IsPass p => LPat (GhcPass p) -> TcM Bool
isIrrefutableHsPat pat
= do { strict <- xoptM LangExt.Strict
; comps <- get_complete_matches_rn
; return $ is_irrefutable_pat comps strict pat }
However, we need to make sure we include all complete matches, not just those from the local module.
Let’s find how GHC already does that, for
tcg_complete_matches
.
⭲
Inspired by mkDsEnvsFromTcGbl
and
initDsWithModGuts
in GHC.HsToCore.Monad
:
get_complete_matches_rn :: TcM CompleteMatchesRn
get_complete_matches_rn
= do { hsc_env <- getTopEnv
; tcg_env <- getGblEnv
; eps <- liftIO $ hscEPS hsc_env
; return $
completeMatchesRn
( hptCompleteSigs hsc_env -- from the home package
++ eps_complete_matches eps ) -- from imports
++ tcg_complete_matches_rn tcg_env -- from the current module
}
completeMatchesRn
throws away the COMPLETE
pragmas with a result TyCon
, as in the renamer we can’t
know whether they apply or not.
⭲
isIrrefutableHsPat
In the worker function is_irrefutable_pat
we can then
make use of this information, in the ConPat
case for a
PatSyn
. We want to check whether the PatSyn
Name
is the single member of one of the
COMPLETE
sets.
is_irrefutable_hs_pat complete_matches strict_enabled = \case
-- ...
ConPat
{ pat_con_ext = ext
, pat_con = con
, pat_args = details } ->
con_irref && all go (hsConPatArgs details)
where
con_irref = case ghcPass @p of
GhcPs -> False -- Conservative
GhcRn -> case ext of
ConIsData { conIsSingleDataCon = irref } -> irref
ConIsPatSyn -> any (single_match con) complete_matches
GhcTc -> -- ...
⭲
Now that everything is in place, we try some tests programs.
Debug this by working backwards: search for “missing fail op”.
⭲
dsHandleMonadicFailure
dsHandleMonadicFailure ctx pat match m_fail_op =
case shareFailureHandler match of
MR_Infallible body -> body
MR_Fallible body ->
case m_fail_op of
-- Note that (non-monadic) list comprehension, pattern guards, etc could
-- have fallible bindings without an explicit failure op, but this is
-- handled elsewhere. See Note [Failing pattern matches in Stmts] the
-- breakdown of regular and special binds.
Nothing -> pprPanic "missing fail op" $
text "Pattern match:" <+> ppr pat <+>
text "is failable, and fail_expr was left unset"
Just fail_op -> -- ...
MR_Fallible
?
⭲
mkCoSynCaseMatchResult
mkCoSynCaseMatchResult
called in
matchPatSyn
looks relevant!
We should accept the previous program, but crash at runtime, using a non-monadic failure operator which raises a pattern match error:
dsHandleMonadicFailure ctx pat match m_fail_op body_ty =
case shareFailureHandler match of
MR_Infallible body -> body
MR_Fallible body -> do
case m_fail_op of
Nothing ->
do error_expr <- mkErrorAppDs pAT_ERROR_ID body_ty (ppr pat)
body error_expr
Just fail_op -> -- ...
do
expression to know the return type of the error; so we modify
dsHandleMonadicFailure
.
⭲
Let’s make sure we properly handle the problematic program:
pattern Bogus :: Int
pattern Bogus = 3
{-# COMPLETE Bogus #-}
bogus :: Identity Bool
bogus = do
Bogus <- return 4
return False
The program compiles successfully, and running it gives:
pAT_ERROR_ID
.
⭲
⭲