| # Type inference for calls |
| |
| ## Intro |
| |
| Type inference is a quite broad topic, and here we focus only on the type inference for calls, i.e., on the algorithm trying to guess which |
| type arguments would fit properly for a given call if they're not specified explicitly. |
| |
| For example, in the case of a call `listOf("")`, it would be quite nice to understand that the user expects it to work the same way as |
| `listOf<String>("")`. |
| |
| All other cases of type inference, e.g., property type inference like `val x = ""` is beside the scope of the document. |
| |
| **TODO**: For the sake of simplicity, we don't cover callable references here, shall be done later. |
| |
| ### Given input |
| |
| Let's assume that we've got some independent call on a statement level inside the body block: |
| |
| ```kotlin |
| fun main() { |
| listOf("") |
| } |
| ``` |
| |
| - And some candidate with the appropriate value arguments number has already been chosen. |
| - But the candidate function has its type parameters, for which type arguments aren't specified. |
| |
| In general, the situation might be a bit more complicated. |
| |
| The independent call may have dependent call-arguments (see [Call tree](#call-tree)): |
| |
| ```kotlin |
| fun <K> materialize(): K = TODO() |
| |
| fun main() { |
| listOf("", materialize()) |
| } |
| ``` |
| |
| Or it might have lambdas and/or expected type |
| |
| ```kotlin |
| fun main() { |
| val x: List<Any> = listOf("", run { 1 }) |
| } |
| ``` |
| |
| ### Expected output |
| |
| - For all calls in the call-tree, for each type parameter, there should be a type argument chosen. |
| - For all lambdas, all receiver/parameter/return types should be computed. |
| |
| **TODO:** Cover how we infer the function kind of lambda (regular/suspend/compose). |
| |
| ```kotlin |
| fun main() { |
| val x: List<Any> = listOf<Any>("", run<Int>(fun(): Int = 1)) |
| } |
| ``` |
| |
| Or, if there are some type contradictions, the error should be reported. |
| |
| ### Naive algorithm description |
| |
| The idea of this part is to give some basic idea of how the algorithm works: |
| |
| - At first step, we create a constraint system ([CS](#cs--constraint-system)) with type variables associated with all type parameters in the |
| call-tree. |
| - Gather type variables constraints from the call tree (e.g. `Tv <: String` in `listOf("")`). |
| - Analyze lambdas' bodies after choosing types of their parameters (once it's possible). |
| - For each type variable, from the gathered constraints, choose the most reasonable resulting type. |
| |
| ## Detailed description |
| |
| In general, the call resolution algorithm works in a very natural recursive manner: |
| |
| - During body resolution, when we recursively traverse the AST and meet a call, before doing something, we would recursively resolve |
| its receiver and arguments. |
| - But the arguments are being resolved in a special way, so the algorithm is aware that it's ok if some information (from an expected type) |
| is missing, so it leaves them partially resolved until the resolution of the top-level call begins. |
| - So, when we start the actual Resolution & Inference process, the argument-calls have their own chosen candidates and Constraint Systems. |
| |
| ### Context dependency |
| |
| Calls resolution modes might be separated into two kinds: |
| |
| - Independent (potentially with an expected type): regular top-level statements, receivers, assignments RHS, etc. |
| - Dependent: arguments of other calls or return expressions of lambdas |
| |
| ```kotlin |
| |
| fun main() { |
| listOf("") // Independent |
| val x: List<Any> = listOf("") // Independent with expected type |
| |
| listOf( // Independent |
| listOf("") // Dependent |
| ) |
| |
| listOf( // Independent |
| run { // Dependent |
| listOf("") // Independent |
| listOf("") // Dependent |
| } |
| ) |
| |
| listOf( // Independent |
| "" |
| ).get(0) // Independent |
| } |
| ``` |
| |
| **NB:** The calls that are used as receivers are being completed as `Independent` and it's crucial because our inference algorithm |
| doesn't allow type-constraints to flow back from the call to its receiver (like `mutableMapOf().put("", "")` is red code). |
| |
| See `org.jetbrains.kotlin.fir.resolve.ResolutionMode`. |
| |
| One of the most important use cases for the resolution mode is defining whether a call should be **fully** [completed](#completion-mode). |
| |
| ### Constraint System Creation |
| |
| At the first step of any candidate call resolution, we create a constraint system ([CS](#cs--constraint-system)), which has: |
| |
| - A specific dedicated type variable ([TV](#tv--type-variable-fresh-type-variable)) for each type parameter in the call-tree. |
| - A set of constraints for each type variable in a form of `Tv <: SomeType` (UPPER), `SomeType <: Tv` (LOWER) |
| or `Tv := SomeType` (EQUALITY). |
| - Basic constraints, obtained from the relevant type parameters upper bounds, e.g. `Tv <: Closeable` for |
| `fun <T : Closeable> T.use(...)`. |
| |
| **NB**: in the case of complex call-trees (`parentCall(argumentCall())`) the parent calls accumulate CS of all the arguments and add |
| their own type variables. |
| |
| **NB**: CS is being created for each candidate, even for ones that later might be declared as inapplicable. |
| Even the candidates with no type parameters should have their CS (the empty one). |
| |
| See `org.jetbrains.kotlin.fir.resolve.calls.stages.CreateFreshTypeVariableSubstitutorStage`. |
| |
| The substitutor from type parameter types to fresh type variables is stored at `Candidate::substitutor`. |
| |
| ### Candidate Resolution Phase |
| |
| For each candidate we perform a bunch of resolution stages (see `org.jetbrains.kotlin.fir.resolve.calls.candidate.CallKind`), each of them |
| either works with its CS by introducing new [initial constraints](#initial-constraints), |
| or in some other ways checks if the candidate is applicable. |
| |
| The most illustrative example of the resolution stage is `CheckArguments` which effectively adds argument-related initial constraints |
| in the form of `ValueArgumentType <: ValueParameterType`. |
| |
| **NB:** During candidate resolution it's not allowed to go into lambda analysis because the signature of the anonymous function might |
| depend on the candidate and there may be more than one candidate. |
| And resolution of the same lambda body more than once would make the whole call resolution algorithm exponential. |
| |
| After that, if the Constraint System contains contradiction, we discard the candidate and go to the next one |
| (or report an error for the last one). |
| |
| Otherwise, if we've got a single perfect candidate, we choose it and go to the *Completion Stage*. |
| |
| ### Initial Constraints |
| |
| *Initial constraint* is an arbitrary type restriction obtained from code semantics. |
| |
| The most basic example of such a constraint is that a value argument type must be a subtype of a parameter type. |
| Or it might be an expected type in case like `val x: List<String> = listOf()` (initial constraint would be `List<Tv> <: List<String>`). |
| |
| Without losing generality, we might say that all the initial constraints have the form of `ArbitraryType1 <: ArbitraryType2`. |
| |
| And their main purpose is deriving *type variable constraints* (`Tv <: SomeType`, `SomeType <: Tv`, or `Tv := SomeType`). |
| |
| ### Deriving Constraints |
| |
| The process of deriving constraints works through the regular subtyping algorithm with a special callback catching attempt to check |
| if some type is a subtype of a type variable or vice versa. |
| |
| ```kotlin |
| |
| fun <T> foo(x: Collection<T>) {} |
| |
| fun bar(x: MutableList<String>) { |
| foo(x) |
| } |
| ``` |
| |
| For example, in the case above: |
| |
| - We add an initial constraint `MutableList<String> <: Collection<Tv>` |
| - When checking, the subtyping algorithm looks for a matching type with a matching type constructor for subtyping, i.e.,`Collection<String>` |
| - Thus, the subtyping check is reduced to `Collection<String> <: Collection<Tv>`. |
| - For such cases, we check each type argument and as we've got the single covariant parameter |
| - Thus, we deduce that `String <: Tv` and got a new lower constraint on `Tv`. |
| |
| For more details, see `org.jetbrains.kotlin.types.AbstractTypeChecker.isSubtypeOf` and |
| `org.jetbrains.kotlin.resolve.calls.inference.components.TypeCheckerStateForConstraintSystem.addSubtypeConstraint` as an example of the |
| overridden callback for catching a type variable comparison case. |
| |
| One of the potential results of the subtyping algorithm might be a failure |
| if some intermediate constraint `ArbitraryType1 <: ArbitraryType2` cannot be satisfied. |
| |
| For example, `MutableMap<String, Int> <: Map<Tv, String>` would be derived to `Int <: String` constraint |
| which cannot be satisfied. |
| |
| In this situation, we say that CS has a *contradiction*. |
| |
| ### Constraint Incorporation |
| |
| Once there's a new *type variable constraint*, it might make sense to derive new ones from it using constraints derived in previous steps |
| and relying on subtyping transitivity. |
| |
| If a new *incorporated variable constraint* has been added, we repeat the process recursively. |
| |
| There are two kinds of incorporation. |
| |
| #### Direct With The Same Variable |
| |
| If the new constraint has a form of `Tv <: SomeType`, for each existing lower (opposite) constraint `OtherType <: Tv`, we may add |
| a new *initial constraint* `OtherType <: SomeType`. |
| |
| ```kotlin |
| fun <I> id(x: I): I = x |
| |
| fun <T> listOf(): List<T> = TODO() |
| |
| fun main() { |
| val x: List<String> = id(listOf()) |
| } |
| ``` |
| |
| In the above example we would have: |
| |
| - Initial constraint `List<Tv> <: Iv` (which immediately adds a lower constraint on `Iv`) |
| - `Iv <: List<String>` from expected type (which immediately adds an upper constraint on `Iv`) |
| - Thus, from the transitive closure of `List<Tv> <: Iv` and `Iv <: List<String>` we might deduce |
| a new initial constraint `List<Tv> <: List<String>` |
| - From which we finally may derive an upper constraint `Tv <: String` |
| |
| #### Inside Other Constraints |
| |
| If the new constraint has a form of `Tv <: SomeType`, for each other type variable `Fv` which contains `Tv` inside its constraints, |
| for each such constraint `Fv <: OtherType<..Tv..>`, we make a tricky substitution of `Tv` inside `OtherType<..>` with some approximation of |
| `SomeType` which would give a new **correct** initial constraint. |
| |
| - In a simple case of `Fv <: Tv`, we would add `Fv <: SomeType`. |
| - In more complex `Fv <: Inv<Tv>`, it would be `Fv <: Inv<out SomeType>`. |
| |
| For more details, see `org.jetbrains.kotlin.resolve.calls.inference.components.ConstraintIncorporator.insideOtherConstraint` |
| |
| ### Completion Stage |
| |
| At this stage, we've got: |
| |
| - Chosen candidates for the call and for all its call-sub-arguments |
| - Single Constraint System which contains type variables from the whole call-tree with their constraints and no-yet found contradictions |
| - No lambda has been analyzed yet |
| - For `Independent` calls, there might be an expected type (which initially should be contributed into a relevant initial constraint) |
| |
| **Completion** is a process that for given information tries to infer the resulting type for each type variable and to analyze all the given |
| lambdas. |
| |
| #### Completion Mode |
| |
| Completion mode (`ConstraintSystemCompletionMode`) defines how actively/forcefully the given call is being completed |
| |
| * `PARTIAL` — used for calls that are other calls arguments |
| * It prevents from fixing TV that are related somehow to return types because their actual result depends on the chosen outer call |
| * Otherwise, it tries to infer as much as possible, but doesn’t lead to reporting not-inferred-type-parameter or running PCLA lambdas |
| * On one hand, we might’ve not run that kind of completion at all, and just accumulate nested argument calls into an outer as is, |
| * but inferring something allows disambiguating outer candidates |
| * ``` |
| val x: Int = 1 |
| fun main() { |
| x.plus(run { x }) |
| } |
| ``` |
| * `FULL` — used for top statement-level calls and receivers |
| * It tries to fix/analyze everything we have |
| * And start PCLA analysis if necessary |
| * Reports errors if for some TV there’s no enough information to infer them |
| * `UNTIL_FIRST_LAMBDA` — used for OverloadByLambdaReturnType |
| * Similar to FULL, but stops after first lambda analyzed |
| * `PCLA_POSTPONED_CALL` — see [pcla.md](pcla.md). |
| |
| #### Completion Framework |
| |
| Roughly, the framework algorithm might be represented in the form of the following pseudocode, some parts of which will be described |
| in the following sections |
| |
| ```kotlin |
| while (true) { |
| if (analyzeSomeLambdaWithProperInputTypes()) continue |
| |
| if (findAndFixReadyTypeVariable()) continue |
| |
| if (completionMode == FULL) if (analyzeSomeLambdaWithPCLA()) continue |
| |
| if (completionMode == FULL) reportNotInferredTypeVariable() |
| } |
| ``` |
| |
| After the loop is completed in `FULL` mode, the completion results are being applied to the FIR call representations via |
| `FirCallCompletionResultsWriterTransformer`. |
| |
| For more details see `FirCallCompleter` and `org.jetbrains.kotlin.fir.resolve.inference.ConstraintSystemCompleter.runCompletion` for the |
| actual representation of the Main While Loop. |
| |
| **NB:** From the definition of the loop, it shouldn't be very hard to show that it converges in polynomial time |
| (if all the operations are polynomial). |
| |
| ### Basic Lambda analysis |
| |
| If for some lambda all its [input types](#input-types-of-a-lambda) are proper, it's possible to start its regular analysis. |
| |
| All its statements beside return ones (including the last expression) might be resolved as usual/independently. |
| |
| If the return type of the lambda is [proper](#proper-typeconstraint), all return expressions (the subjects of return statements) should |
| be analyzed and completed independently (with return type as an ExpectedType). |
| |
| Otherwise, they are being analyzed in `Dependent` mode and for each of them: |
| |
| - It becomes a part of a root call-tree. |
| For example, if we've got `return@lambda listOf()`, the type argument for the call should be inferred via the root completion phase |
| - We add another initial constraint `ReturnedExpressionType <: LambdaReturnType`. |
| |
| ### Inference Heuristics |
| |
| With given CS data input and the given framework, there might be more than one "consistent" result. |
| |
| For example, for the top-level [independent](#context-dependency) call `listOf("")` might-be inferred as `listOf<Any>("")` in a sense |
| that this result doesn't lead to any type contradiction, but that might surprise a user who |
| has a property `val myList = listOf("")` which type would be implicitly inferred to `List<Any>`. |
| |
| The [Completion Framework](#completion-framework) defines a number of degrees of freedom which might be tuned to give different results. |
| |
| And the following sections describe them and the state chosen for them in the attempt to have the most reasonable results. |
| |
| **NB:** Those degrees of freedom do not just define the resulting type arguments but potentially affect if the inference process converges |
| successfully. |
| For example, too early type variable fixation (e.g., before analysis of a lambda with a return type containing the variable) might lead |
| to contradictions which wouldn't be there if we analyzed the lambda beforehand. |
| |
| ### Variable Selection |
| |
| During the completion some of the type variables become *fixed*, i.e., the final type for |
| them is chosen, and all occurrences of the variable inside other constraints are replaced with that type. |
| Also, this type is used as a type argument when completion ends. |
| |
| Thus, such a fixed type variable stops being an active part of the Constraint System. |
| |
| #### Variables that cannot be fixed |
| |
| Some of the variables that are not yet fixed might be forbidden to fix at this point. |
| |
| The most obvious reason is the lack of any [proper constraint](#proper-constraint), thus there's not just enough information yet to choose |
| the type. |
| |
| Another reason might be relevant for [dependent](#context-dependency) calls: a type variable is forbidden to fix if it is related to |
| the return type of the call. |
| The explanation for this rule is that this particular call is an argument of another call for which we don't know yet what candidate would |
| be chosen. |
| Thus don't know a proper expected type that might in fact contradict the chosen result type. |
| |
| ```kotlin |
| fun foo(x: MutableList<Any>, y: Int) {} |
| fun foo(x: MutableList<String>, y: String) {} |
| |
| fun main() { |
| // Resolved to the first overload |
| foo( |
| // String <: Tv |
| // Potentially, we might've fixed Tv to String when resolving `mutableListOf("")`. |
| // But that would lead to `MutableList<String> <: MutableList<Int>` contradiction |
| mutableListOf(""), |
| 1 |
| ) |
| } |
| ``` |
| |
| #### Variables order |
| |
| All other variables (that might be fixed) are being sorted according to a set of heuristics, and then the first one is chosen to be fixed. |
| |
| **NB:** if two variables are equal from heuristics' point-of-view, the one that has a shorter path in a call-tree is chosen |
| (see `org.jetbrains.kotlin.fir.resolve.inference.ConstraintSystemCompleter.getOrderedAllTypeVariables` for details). |
| |
| When it comes to heuristics, we consider the following factors: |
| |
| - if a variable is contained in an input type of lambda (see |
| `PostponedArgumentInputTypesResolver.fixNextReadyVariableForParameterTypeIfNeeded`) |
| - If a variable has dependency (constraints) related to another not-fixed variable |
| - If a variable is used in any output types (return types of lambdas) |
| - Or even if it's made upon a reified type parameter |
| |
| But for the exact ordering, it's better to look at the freshest version of `VariableFixationFinder.getTypeVariableReadiness` as it made |
| in a quite descriptive form. |
| |
| ### Result Type For Fixation |
| |
| Once the variable is chosen, we should decide to which result type it should be fixed. |
| |
| **Input:** |
| |
| - Non-contradictory state of Constraint System |
| - The list of constraints for the type variable (at least one of them must be proper) |
| - Constraints of other type variables that potentially might be related |
| |
| At first, we look if there are some `EQUAL` constraints and choose first of them which doesn't contain integer literal types |
| (**NB:** anyway all of them should not contradict each other, so in some broad sense all such constraints are equal). |
| |
| Otherwise, we choose a representative **subtype** and **supertype** if applicable. |
| |
| For **subtype**, we select all the lower constraints and compute a *common supertype of them*. |
| A common supertype is expected to be the lowest upper bound on the type lattice, i.e., it should be a supertype |
| of all the lower constraints, but it should be as "precise" as possible (see `NewCommonSuperTypeCalculator`). |
| |
| In case of non-proper lower constraints, we replace type variables in them with special *stub types* which behave like they're equal to |
| anything. |
| |
| For example, here: |
| |
| ```kotlin |
| fun <F> select(f1: F, f2: F): F = TODO() |
| |
| fun main() { |
| select( |
| // MutableList<String> <: Fv |
| mutableListOf<String>(), |
| // List<Tv> <: Fv |
| emptyList() |
| ) |
| } |
| ``` |
| |
| We don't yet have simple constraints for fixing `Tv`, so going to fix `Fv` first: |
| |
| - It has two lower constraints: `List<Tv>` (`List<StubType>` after stub-type substitution) and `MutableList<String>`. |
| - Their common supertype is computed to `List<String>` (with the help of specially defined equality on stub-types). |
| - Thus, we fix `Fv := List<String>` and then `Tv := String`. |
| |
| We don't cover that part extensively here, but it's guaranteed that the resulting type would not contain nor a type variable or a stub type. |
| |
| For more details, look into `ResultTypeResolver.findSubType` |
| |
| For **supertypes**, it's just an intersection of all *proper* types. |
| |
| After that, *Captured Type Approximation* is applied to both of the candidate types |
| (if the approximated version does not lead to contradictions). |
| |
| And in most cases if both candidate types exist, we choose the subtype (if it's not `Nothing?`), or return the single candidate type |
| (see `ResultTypeResolver.resultType` for details). |
| |
| ### Partially Constrained Lambda Analysis |
| |
| For `FULL` completion mode, we've got the situation when no more type information might be brought from somewhere outside, so we have |
| to use only what we already got. |
| |
| And if there are still some not-yet-fixed type variables _and_ some lambdas not-yet-analyzed |
| (because they contain type variables among their input types), the last resort would be to try analyzing them in some limited form. |
| |
| The problem with that is that usually we don't allow using type variables inside the independent statements, while here we had to do that. |
| |
| The most common example when it becomes necessary is builder functions with signature like |
| `fun <E> buildList(builderAction: MutableList<E>.() -> Unit): List<E>`, so in the case of independent call |
| like `buildList { add("") }` the only possible way to infer the type argument is to analyze lambda, but the lambda input type contains |
| the single type variable. |
| |
| For more information, see [pcla.md](pcla.md). |
| |
| ## Glossary |
| |
| ### TV = type variable (fresh type variable) |
| |
| Type variable is a special kind of type/type constructor being created for each type parameter in the call-tree. |
| |
| To avoid confusion with type parameter types, we mostly use `Tv` for referencing type variable based on a type parameter named `T`. |
| |
| Note that in the same call-tree, the same type parameter might be met twice (`listOf(listOf(""))`), but we would have two different |
| type variables (for sake of simplicity, let's call them `Tv1` and `Tv2`) for each instance of the call. |
| |
| Another kind of type variables is synthetic ones created for lambdas unknown parameter/return types. |
| |
| ### CS = Constraint system |
| |
| - Mostly, it’s just a collection of TVs and the constraints for them in a form of `Xi <: SomeType` or `SomeType <: Xi` or `Xi = SomeType` |
| - Represented as an instance of `org.jetbrains.kotlin.resolve.calls.inference.model.NewConstraintSystemImpl` |
| |
| ### Related TVs |
| |
| See `TypeVariableDependencyInformationProvider` for details. |
| |
| #### Shallow relation |
| |
| Two variables `Xv` and `Yv` are shallowly related if there is some chain of a constraints |
| `Xv ~ a_1 ~ .. ~ a_n ~ Yv` where (`~` is either `<:` or `>:`) |
| |
| #### Deep relation |
| |
| Two variables `Xv` and `Yv` are deeply related if there is some chain of a constraints |
| `a_1 ~ .. ~ a_n` where (`~` is either `<:` or `>:`) **and** `a_1` contains `Xv` while `a_n` contains `Yv`. |
| |
| ### Call-tree |
| |
| A tree of calls, in which constraint systems are joined and solved(completed) together |
| |
| ### Proper type/constraint |
| |
| A type/constraint that doesn't reference any (not-yet-fixed) type variables |
| |
| ### Input types of a lambda |
| |
| - Receiver type |
| - Value parameter types |
| |
| ### Inference session |
| |
| A set of callbacks related to inference that being called during function body transformations. |
| See `FirInferenceSession`. |