GHC Contributors’ Workshop
Sam Derbyshire, Well-Typed
June 7th, 2023
{-# COMPLETE U #-}
pattern U :: ()
pattern U = ()
foo :: Identity Char
foo = do
U <- return ()
return 'c'error:
* No instance for `MonadFail Identity'
arising from a do statement
with the failable pattern `U'
* In a stmt of a 'do' block: U <- return ()⭲
This has been reported many times: #15681 #16470 #22004 #23458 (related: #16618).
Fundamental problem:
do-notation,However, in this situation, we don’t need types to be able to see that the pattern match succeeds (this is different from situation which involve GADT pattern matches).
⭲
From the error message, we can see that we emit a
MonadFail Identity Wanted constraint when typechecking
foo.
Where? We can search for MonadFail to find out!
⭲
GHC.Rename.Expr:
rnStmt ctxt rnBody (L loc (BindStmt _ pat (L lb body))) thing_inside
= do { -- ...
; (fail_op, fvs2) <- monadFailOp pat ctxt
; -- ...
}monadFailOp :: LPat GhcPs
-> HsStmtContext GhcRn
-> RnM (FailOperator GhcRn, FreeVars)
monadFailOp pat ctxt = do
dflags <- getDynFlags
if | isIrrefutableHsPat dflags pat -> return (Nothing, emptyFVs)
| not (isMonadStmtContext ctxt) -> return (Nothing, emptyFVs)
| otherwise -> getMonadFailOp ctxt⭲
GHC.Tc.Gen.Match (Gen = constraint
generator):
tcDoStmt ctxt (BindStmt xbsrn pat rhs) res_ty thing_inside
= do { -- ...
; fail_op' <- fmap join . forM (xbsrn_failOp xbsrn) $ \fail ->
tcMonadFailOp (DoPatOrigin pat) pat' fail new_res_ty
; -- ...
; return (BindStmt xbstc pat' rhs', thing) }tcMonadFailOp orig pat fail_op res_ty = do
dflags <- getDynFlags
if isIrrefutableHsPat dflags pat
then return Nothing
else Just . snd <$> (tcSyntaxOp orig fail_op [synKnownType stringTy]
(mkCheckExpType res_ty) $ \_ _ -> return ())tcSyntaxOp is going to instantiate
fail :: MonadFail m => String -> m a, which will emit
a MonadFail Identity Wanted constraint.
⭲
isIrrefutableHsPat :: Bool -- ^ Is @-XStrict@ enabled?
-> Pat (GhcPass p) -> Bool
isIrrefutableHsPat is_strict = \case
WildPat {} -> True
VarPat {} -> True
-- ...
ConPat
{ pat_con = con
, pat_args = details } ->
case ghcPass @p of
GhcPs -> False -- Conservative
GhcRn -> False -- Conservative
GhcTc -> case con of
L _ (PatSynCon _pat) -> False -- Conservative
L _ (RealDataCon con) ->
isJust (tyConSingleDataCon_maybe (dataConTyCon con))
&& all goL (hsConPatArgs details)⭲
Let’s fix isIrrefutableHsPat for ConPat at
GhcRn and GhcTc stages.
We need two things:
DataCon, whether it is irrefutable
(tyConSingleDataCon_maybe);PatSyn, whether it is the unique member of a
COMPLETE set.For PatSyns, we have the choice of storing whether the
PatSyn is irrefutable in the PatSyn itself, or
of threading through the COMPLETE pragmas to
isIrrefutableHsPat.
Idea: let’s store whether the constructor pattern is a
DataCon or a PatSyn, and if it’s a
DataCon whether it’s irrefutable, in
XConPat GhcRn.
(Alternatively we could change ConLikeP GhcRn to
ConLikeName, but that wouldn’t handle the
tyConSingleDataCon situation.)
COMPLETE
pragmas to isIrrefutableHsPat.
⭲
We want to pass COMPLETE sets to
isIrrefutableHsPat.
So let’s start off by figuring out how these are represented in the
compiler.
Now let’s look into passing COMPLETE sets to
isIrrefutableHsPat.
Let’s audit its calls; the HLS call hierarchy functionality is very
useful for that.
stmtTreeToStmts .. (StmtTreeApplicative ..)
monadFailOp in rnStmt .. (BindStmt ..) and
rn_rec_stmt .. (BindStmt ..)
PatSyn in tcPatSynMatcher
(from tcPatSynDecl).
tcMonadFailOp in tcApplicativeStmts,
tcDoStmt, tcMcStmt.
COMPLETE pattern
information in the renamer, and before we have actually typechecked
PatSyns in the typechecker.
⭲
Let’s add a field to the typechecker environment (used for both the typechecker and the renamer):
data TcGblEnv
= TcGblEnv { -- ...
, tcg_complete_matches :: !CompleteMatches
{- NEW -} , tcg_complete_matches_rn :: !CompleteMatchesRn
, -- ...
}
-- NEW
type CompleteMatchesRn = [CompleteMatchRn]
data CompleteMatchRn = CompleteMatchRn
{ cmConLikesRn :: NameSet -- ^ The set of `ConLike` values
}⭲
Let’s see then how to thread through the correct information; we need
to ensure that we first rename pattern synonym COMPLETE signatures and
add that information to the TcGblEnv environment before we
call isIrrefutableHsPat.
⭲
monadFailOpWe want to find what to change to pass on the correct
COMPLETE information.
Let’s use HLS to find the call hierarchy of
GHC.Rename.Expr.monadFailOp.
monadFailOprnStmtrnStmtsWithFreeVarsrnExprrnGRHS'rnGHRSrnGHRSsrnBindrnLBindrnValBindsRHS
⭲
rnSrcDeclsrnValBindsRHSrnValBindsRHS is interesting, as this is where we also
rename signatures:
rnValBindsRHS :: HsSigCtxt
-> HsValBindsLR GhcRn GhcPs
-> RnM (HsValBinds GhcRn, DefUses)
rnValBindsRHS ctxt (ValBinds _ mbinds sigs)
= do { (sigs', sig_fvs) <- renameSigs ctxt sigs
; binds_w_dus <- mapBagM (rnLBind (mkScopedTvFn sigs')) mbinds
-- ...
}Here we clearly see that we rename Sigs first (which
includes COMPLETE pragmas), and then rename value bindings.
So we can simply extend the TcGblEnv, e.g.:
Now, this does the trick in the renamer. However, we also need to
make sure to persist the information in
tcg_complete_matches. So we should return the
CompleteMatchesRn at the end of rnValBindsRHS,
and make sure to add them to the final typechecker environment at the
end of rnSrcDecls.
⭲
tcg_complete_matchesWe then need to look this information up in
isIrrefutableHsPat.
isIrrefutableHsPat :: IsPass p => LPat (GhcPass p) -> TcM Bool
isIrrefutableHsPat pat
= do { strict <- xoptM LangExt.Strict
; comps <- get_complete_matches_rn
; return $ is_irrefutable_pat comps strict pat }However, we need to make sure we include all complete matches, not just those from the local module.
Let’s find how GHC already does that, for
tcg_complete_matches.
⭲
Inspired by mkDsEnvsFromTcGbl and
initDsWithModGuts in GHC.HsToCore.Monad:
get_complete_matches_rn :: TcM CompleteMatchesRn
get_complete_matches_rn
= do { hsc_env <- getTopEnv
; tcg_env <- getGblEnv
; eps <- liftIO $ hscEPS hsc_env
; return $
completeMatchesRn
( hptCompleteSigs hsc_env -- from the home package
++ eps_complete_matches eps ) -- from imports
++ tcg_complete_matches_rn tcg_env -- from the current module
}completeMatchesRn throws away the COMPLETE
pragmas with a result TyCon, as in the renamer we can’t
know whether they apply or not.
⭲
isIrrefutableHsPatIn the worker function is_irrefutable_pat we can then
make use of this information, in the ConPat case for a
PatSyn. We want to check whether the PatSyn
Name is the single member of one of the
COMPLETE sets.
is_irrefutable_hs_pat complete_matches strict_enabled = \case
-- ...
ConPat
{ pat_con_ext = ext
, pat_con = con
, pat_args = details } ->
con_irref && all go (hsConPatArgs details)
where
con_irref = case ghcPass @p of
GhcPs -> False -- Conservative
GhcRn -> case ext of
ConIsData { conIsSingleDataCon = irref } -> irref
ConIsPatSyn -> any (single_match con) complete_matches
GhcTc -> -- ...⭲
Now that everything is in place, we try some tests programs.
Debug this by working backwards: search for “missing fail op”.
⭲
dsHandleMonadicFailuredsHandleMonadicFailure ctx pat match m_fail_op =
case shareFailureHandler match of
MR_Infallible body -> body
MR_Fallible body ->
case m_fail_op of
-- Note that (non-monadic) list comprehension, pattern guards, etc could
-- have fallible bindings without an explicit failure op, but this is
-- handled elsewhere. See Note [Failing pattern matches in Stmts] the
-- breakdown of regular and special binds.
Nothing -> pprPanic "missing fail op" $
text "Pattern match:" <+> ppr pat <+>
text "is failable, and fail_expr was left unset"
Just fail_op -> -- ...MR_Fallible?
⭲
mkCoSynCaseMatchResultmkCoSynCaseMatchResult called in
matchPatSyn looks relevant!
We should accept the previous program, but crash at runtime, using a non-monadic failure operator which raises a pattern match error:
dsHandleMonadicFailure ctx pat match m_fail_op body_ty =
case shareFailureHandler match of
MR_Infallible body -> body
MR_Fallible body -> do
case m_fail_op of
Nothing ->
do error_expr <- mkErrorAppDs pAT_ERROR_ID body_ty (ppr pat)
body error_expr
Just fail_op -> -- ...do
expression to know the return type of the error; so we modify
dsHandleMonadicFailure.
⭲
Let’s make sure we properly handle the problematic program:
pattern Bogus :: Int
pattern Bogus = 3
{-# COMPLETE Bogus #-}
bogus :: Identity Bool
bogus = do
Bogus <- return 4
return FalseThe program compiles successfully, and running it gives:
pAT_ERROR_ID.
⭲
⭲