Type inference is a quite broad topic, and here we focus only on the type inference for calls, i.e., on the algorithm trying to guess which type arguments would fit properly for a given call if they're not specified explicitly.
For example, in the case of a call listOf("")
, it would be quite nice to understand that the user expects it to work the same way as listOf<String>("")
.
All other cases of type inference, e.g., property type inference like val x = ""
is beside the scope of the document.
TODO: For the sake of simplicity, we don't cover callable references here, shall be done later.
Let‘s assume that we’ve got some independent call on a statement level inside the body block:
fun main() { listOf("") }
In general, the situation might be a bit more complicated.
The independent call may have dependent call-arguments (see Call tree):
fun <K> materialize(): K = TODO() fun main() { listOf("", materialize()) }
Or it might have lambdas and/or expected type
fun main() { val x: List<Any> = listOf("", run { 1 }) }
TODO: Cover how we infer the function kind of lambda (regular/suspend/compose).
fun main() { val x: List<Any> = listOf<Any>("", run<Int>(fun(): Int = 1)) }
Or, if there are some type contradictions, the error should be reported.
The idea of this part is to give some basic idea of how the algorithm works:
Tv <: String
in listOf("")
).In general, the call resolution algorithm works in a very natural recursive manner:
Calls resolution modes might be separated into two kinds:
fun main() { listOf("") // Independent val x: List<Any> = listOf("") // Independent with expected type listOf( // Independent listOf("") // Dependent ) listOf( // Independent run { // Dependent listOf("") // Independent listOf("") // Dependent } ) listOf( // Independent "" ).get(0) // Independent }
NB: The calls that are used as receivers are being completed as Independent
and it‘s crucial because our inference algorithm doesn’t allow type-constraints to flow back from the call to its receiver (like mutableMapOf().put("", "")
is red code).
See org.jetbrains.kotlin.fir.resolve.ResolutionMode
.
One of the most important use cases for the resolution mode is defining whether a call should be fully completed.
At the first step of any candidate call resolution, we create a constraint system (CS), which has:
Tv <: SomeType
(UPPER), SomeType <: Tv
(LOWER) or Tv := SomeType
(EQUALITY).Tv <: Closeable
for fun <T : Closeable> T.use(...)
.NB: in the case of complex call-trees (parentCall(argumentCall())
) the parent calls accumulate CS of all the arguments and add their own type variables.
NB: CS is being created for each candidate, even for ones that later might be declared as inapplicable. Even the candidates with no type parameters should have their CS (the empty one).
See org.jetbrains.kotlin.fir.resolve.calls.stages.CreateFreshTypeVariableSubstitutorStage
.
The substitutor from type parameter types to fresh type variables is stored at Candidate::substitutor
.
For each candidate we perform a bunch of resolution stages (see org.jetbrains.kotlin.fir.resolve.calls.candidate.CallKind
), each of them either works with its CS by introducing new initial constraints, or in some other ways checks if the candidate is applicable.
The most illustrative example of the resolution stage is CheckArguments
which effectively adds argument-related initial constraints in the form of ValueArgumentType <: ValueParameterType
.
NB: During candidate resolution it's not allowed to go into lambda analysis because the signature of the anonymous function might depend on the candidate and there may be more than one candidate. And resolution of the same lambda body more than once would make the whole call resolution algorithm exponential.
After that, if the Constraint System contains contradiction, we discard the candidate and go to the next one (or report an error for the last one).
Otherwise, if we've got a single perfect candidate, we choose it and go to the Completion Stage.
Initial constraint is an arbitrary type restriction obtained from code semantics.
The most basic example of such a constraint is that a value argument type must be a subtype of a parameter type. Or it might be an expected type in case like val x: List<String> = listOf()
(initial constraint would be List<Tv> <: List<String>
).
Without losing generality, we might say that all the initial constraints have the form of ArbitraryType1 <: ArbitraryType2
.
And their main purpose is deriving type variable constraints (Tv <: SomeType
, SomeType <: Tv
, or Tv := SomeType
).
The process of deriving constraints works through the regular subtyping algorithm with a special callback catching attempt to check if some type is a subtype of a type variable or vice versa.
fun <T> foo(x: Collection<T>) {} fun bar(x: MutableList<String>) { foo(x) }
For example, in the case above:
MutableList<String> <: Collection<Tv>
Collection<String>
Collection<String> <: Collection<Tv>
.String <: Tv
and got a new lower constraint on Tv
.For more details, see org.jetbrains.kotlin.types.AbstractTypeChecker.isSubtypeOf
and org.jetbrains.kotlin.resolve.calls.inference.components.TypeCheckerStateForConstraintSystem.addSubtypeConstraint
as an example of the overridden callback for catching a type variable comparison case.
One of the potential results of the subtyping algorithm might be a failure if some intermediate constraint ArbitraryType1 <: ArbitraryType2
cannot be satisfied.
For example, MutableMap<String, Int> <: Map<Tv, String>
would be derived to Int <: String
constraint which cannot be satisfied.
In this situation, we say that CS has a contradiction.
Once there's a new type variable constraint, it might make sense to derive new ones from it using constraints derived in previous steps and relying on subtyping transitivity.
If a new incorporated variable constraint has been added, we repeat the process recursively.
There are two kinds of incorporation.
If the new constraint has a form of Tv <: SomeType
, for each existing lower (opposite) constraint OtherType <: Tv
, we may add a new initial constraint OtherType <: SomeType
.
fun <I> id(x: I): I = x fun <T> listOf(): List<T> = TODO() fun main() { val x: List<String> = id(listOf()) }
In the above example we would have:
List<Tv> <: Iv
(which immediately adds a lower constraint on Iv
)Iv <: List<String>
from expected type (which immediately adds an upper constraint on Iv
)List<Tv> <: Iv
and Iv <: List<String>
we might deduce a new initial constraint List<Tv> <: List<String>
Tv <: String
If the new constraint has a form of Tv <: SomeType
, for each other type variable Fv
which contains Tv
inside its constraints, for each such constraint Fv <: OtherType<..Tv..>
, we make a tricky substitution of Tv
inside OtherType<..>
with some approximation of SomeType
which would give a new correct initial constraint.
Fv <: Tv
, we would add Fv <: SomeType
.Fv <: Inv<Tv>
, it would be Fv <: Inv<out SomeType>
.For more details, see org.jetbrains.kotlin.resolve.calls.inference.components.ConstraintIncorporator.insideOtherConstraint
At this stage, we've got:
Independent
calls, there might be an expected type (which initially should be contributed into a relevant initial constraint)Completion is a process that for given information tries to infer the resulting type for each type variable and to analyze all the given lambdas.
Completion mode (ConstraintSystemCompletionMode
) defines how actively/forcefully the given call is being completed
PARTIAL
— used for calls that are other calls argumentsval x: Int = 1 fun main() { x.plus(run { x }) }
FULL
— used for top statement-level calls and receiversUNTIL_FIRST_LAMBDA
— used for OverloadByLambdaReturnTypePCLA_POSTPONED_CALL
— see pcla.md.Roughly, the framework algorithm might be represented in the form of the following pseudocode, some parts of which will be described in the following sections
while (true) { if (analyzeSomeLambdaWithProperInputTypes()) continue if (findAndFixReadyTypeVariable()) continue if (completionMode == FULL) if (analyzeSomeLambdaWithPCLA()) continue if (completionMode == FULL) reportNotInferredTypeVariable() }
After the loop is completed in FULL
mode, the completion results are being applied to the FIR call representations via FirCallCompletionResultsWriterTransformer
.
For more details see FirCallCompleter
and org.jetbrains.kotlin.fir.resolve.inference.ConstraintSystemCompleter.runCompletion
for the actual representation of the Main While Loop.
NB: From the definition of the loop, it shouldn't be very hard to show that it converges in polynomial time (if all the operations are polynomial).
If for some lambda all its input types are proper, it's possible to start its regular analysis.
All its statements beside return ones (including the last expression) might be resolved as usual/independently.
If the return type of the lambda is proper, all return expressions (the subjects of return statements) should be analyzed and completed independently (with return type as an ExpectedType).
Otherwise, they are being analyzed in Dependent
mode and for each of them:
return@lambda listOf()
, the type argument for the call should be inferred via the root completion phaseReturnedExpressionType <: LambdaReturnType
.With given CS data input and the given framework, there might be more than one “consistent” result.
For example, for the top-level independent call listOf("")
might-be inferred as listOf<Any>("")
in a sense that this result doesn't lead to any type contradiction, but that might surprise a user who has a property val myList = listOf("")
which type would be implicitly inferred to List<Any>
.
The Completion Framework defines a number of degrees of freedom which might be tuned to give different results.
And the following sections describe them and the state chosen for them in the attempt to have the most reasonable results.
NB: Those degrees of freedom do not just define the resulting type arguments but potentially affect if the inference process converges successfully. For example, too early type variable fixation (e.g., before analysis of a lambda with a return type containing the variable) might lead to contradictions which wouldn't be there if we analyzed the lambda beforehand.
During the completion some of the type variables become fixed, i.e., the final type for them is chosen, and all occurrences of the variable inside other constraints are replaced with that type. Also, this type is used as a type argument when completion ends.
Thus, such a fixed type variable stops being an active part of the Constraint System.
Some of the variables that are not yet fixed might be forbidden to fix at this point.
The most obvious reason is the lack of any proper constraint, thus there's not just enough information yet to choose the type.
Another reason might be relevant for dependent calls: a type variable is forbidden to fix if it is related to the return type of the call. The explanation for this rule is that this particular call is an argument of another call for which we don‘t know yet what candidate would be chosen. Thus don’t know a proper expected type that might in fact contradict the chosen result type.
fun foo(x: MutableList<Any>, y: Int) {} fun foo(x: MutableList<String>, y: String) {} fun main() { // Resolved to the first overload foo( // String <: Tv // Potentially, we might've fixed Tv to String when resolving `mutableListOf("")`. // But that would lead to `MutableList<String> <: MutableList<Int>` contradiction mutableListOf(""), 1 ) }
All other variables (that might be fixed) are being sorted according to a set of heuristics, and then the first one is chosen to be fixed.
NB: if two variables are equal from heuristics' point-of-view, the one that has a shorter path in a call-tree is chosen (see org.jetbrains.kotlin.fir.resolve.inference.ConstraintSystemCompleter.getOrderedAllTypeVariables
for details).
When it comes to heuristics, we consider the following factors:
PostponedArgumentInputTypesResolver.fixNextReadyVariableForParameterTypeIfNeeded
)But for the exact ordering, it's better to look at the freshest version of VariableFixationFinder.getTypeVariableReadiness
as it made in a quite descriptive form.
Once the variable is chosen, we should decide to which result type it should be fixed.
Input:
At first, we look if there are some EQUAL
constraints and choose first of them which doesn't contain integer literal types (NB: anyway all of them should not contradict each other, so in some broad sense all such constraints are equal).
Otherwise, we choose a representative subtype and supertype if applicable.
For subtype, we select all the lower constraints and compute a common supertype of them. A common supertype is expected to be the lowest upper bound on the type lattice, i.e., it should be a supertype of all the lower constraints, but it should be as “precise” as possible (see NewCommonSuperTypeCalculator
).
In case of non-proper lower constraints, we replace type variables in them with special stub types which behave like they're equal to anything.
For example, here:
fun <F> select(f1: F, f2: F): F = TODO() fun main() { select( // MutableList<String> <: Fv mutableListOf<String>(), // List<Tv> <: Fv emptyList() ) }
We don't yet have simple constraints for fixing Tv
, so going to fix Fv
first:
List<Tv>
(List<StubType>
after stub-type substitution) and MutableList<String>
.List<String>
(with the help of specially defined equality on stub-types).Fv := List<String>
and then Tv := String
.We don‘t cover that part extensively here, but it’s guaranteed that the resulting type would not contain nor a type variable or a stub type.
For more details, look into ResultTypeResolver.findSubType
For supertypes, it's just an intersection of all proper types.
After that, Captured Type Approximation is applied to both of the candidate types (if the approximated version does not lead to contradictions).
And in most cases if both candidate types exist, we choose the subtype (if it's not Nothing?
), or return the single candidate type (see ResultTypeResolver.resultType
for details).
For FULL
completion mode, we've got the situation when no more type information might be brought from somewhere outside, so we have to use only what we already got.
And if there are still some not-yet-fixed type variables and some lambdas not-yet-analyzed (because they contain type variables among their input types), the last resort would be to try analyzing them in some limited form.
The problem with that is that usually we don't allow using type variables inside the independent statements, while here we had to do that.
The most common example when it becomes necessary is builder functions with signature like fun <E> buildList(builderAction: MutableList<E>.() -> Unit): List<E>
, so in the case of independent call like buildList { add("") }
the only possible way to infer the type argument is to analyze lambda, but the lambda input type contains the single type variable.
For more information, see pcla.md.
Type variable is a special kind of type/type constructor being created for each type parameter in the call-tree.
To avoid confusion with type parameter types, we mostly use Tv
for referencing type variable based on a type parameter named T
.
Note that in the same call-tree, the same type parameter might be met twice (listOf(listOf(""))
), but we would have two different type variables (for sake of simplicity, let's call them Tv1
and Tv2
) for each instance of the call.
Another kind of type variables is synthetic ones created for lambdas unknown parameter/return types.
Xi <: SomeType
or SomeType <: Xi
or Xi = SomeType
org.jetbrains.kotlin.resolve.calls.inference.model.NewConstraintSystemImpl
See TypeVariableDependencyInformationProvider
for details.
Two variables Xv
and Yv
are shallowly related if there is some chain of a constraints Xv ~ a_1 ~ .. ~ a_n ~ Yv
where (~
is either <:
or >:
)
Two variables Xv
and Yv
are deeply related if there is some chain of a constraints a_1 ~ .. ~ a_n
where (~
is either <:
or >:
) and a_1
contains Xv
while a_n
contains Yv
.
A tree of calls, in which constraint systems are joined and solved(completed) together
A type/constraint that doesn't reference any (not-yet-fixed) type variables
A set of callbacks related to inference that being called during function body transformations. See FirInferenceSession
.