Infer type predicates from function bodies using control flow analysis

Fixes #16069
Fixes #38390
Fixes #10734
Fixes #50734
Fixes #12798

This PR uses TypeScript's existing control flow analysis to infer type predicates for boolean-returning functions where appropriate. For example:

    function isString(x: string | number) {
      return typeof x === 'string';
    }

This currently has an inferred return type of boolean, but with this PR it becomes the type predicate x is string.

I filed #16069 seven years ago (!) and thought it would be interesting to try and fix it. It turned out to be cleaner and simpler than I thought: only ~65 LOC in one new function. I think it's a nice win!

How this works

A function is a candidate for an inferred type guard if:

- It does not have an explicit return type or type predicate.
- Its inferred return type is boolean.
- It has a single return statement and no implicit returns (this could potentially be relaxed later).
- It does not mutate its parameter.

If so, then the function looks something like this:

    function f(p: T, p2: T2, ...) {
      // ...
      return expr;
    }

For each parameter, this PR determines what its flow type would be in each branch if the function looked like this instead:

    function f(p: T, p2: T2, ...) {
      // ...
      if (expr) {
        p;  // trueType
      }
    }

If trueType != T then we have a candidate for a type predicate.

We still need to check what a false return value means because of the semantics of type predicates. If we have:

    declare function isString(x: string | number): x is string;

then x is a string if this function returns true. But if it returns false then x must be a number. In other words, in order to be a type predicate, a function should return true if and only if the predicate holds.

We can test this directly by plugging in trueType as the parameter type in the synthesized if statement and seeing what's left in the else branch:

    function f(p: trueType, p2: T2, ...) {
      // ...
      if (expr) {
        p;  // trueType
      } else {
        p;  // never?
      }
    }

If it's never then we've demonstrated the "only if" condition.
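The two checks can be reproduced by hand for isString by writing out the synthesized if / else explicitly. This is only an illustrative sketch: the names trueTypeCheck and onlyIfCheck are hypothetical, and the compiler performs the equivalent analysis internally rather than generating functions like these.

```typescript
// Check 1: with the declared parameter type, what does the true branch narrow to?
function trueTypeCheck(x: string | number): string {
  if (typeof x === 'string') {
    // x is narrowed to string here, which differs from string | number,
    // so "x is string" is a candidate predicate.
    const narrowed: string = x;
    return `true branch: ${narrowed}`;
  }
  return 'false branch';
}

// Check 2: repeat the test with the parameter typed as trueType (string).
function onlyIfCheck(x: string): string {
  if (typeof x === 'string') {
    return `true branch: ${x}`;
  } else {
    // This assignment type-checks only because x has been narrowed to never,
    // which is the "only if" condition described above.
    const leftover: never = x;
    return `else branch: ${String(leftover)}`;
  }
}
```

Both functions type-check with explicit annotations only, so the sketch does not depend on the inference added by this PR.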
Note that previous versions of this PR did slightly different checks. The original version had problems with subtypes, and my initial fix made more calls to getFlowTypeOfReference than was necessary. This version directly tests the condition that we want.

Wins

TypeScript is now able to infer type guards in many places where it's convenient, e.g. calls to filter:

    const nums = [12, "foo", 23, "bar"].filter(x => typeof x === 'number');
    //    ^? const nums: number[]

Since this piggybacks off of the existing flow type code, all forms of narrowing that TypeScript understands will work.

    const foos = [new Foo(), new Bar(), new Foo(), new Bar()].filter(x => x instanceof Foo);
    //    ^? const foos: Foo[]

There are a few other non-obvious wins:

Type guards now flow

    const isString = (o: string | undefined): o is string => !!o;

    // - (o: string | undefined) => boolean
    // + (o: string | undefined) => o is string
    const secondIsString = (o: string | undefined) => isString(o);

They also compose in new ways:

    // const isFooBar: (x: unknown) => x is Foo | Bar
    const isFooBar = (x: unknown) => isFoo(x) || isBar(x);

This is a gentle nudge away from truthiness footguns

We don't infer a type guard here if you check for truthiness, only if you check for non-nullishness:

    const numsTruthy = [0, 1, 2, null, 3].filter(x => !!x);
    //    ^? const numsTruthy: (number | null)[]
    const numsNonNull = [0, 1, 2, null, 3].filter(x => x !== null);
    //    ^? const numsNonNull: number[]

This is because of the false case: if the truthiness test returns false, then x could be 0. Until TypeScript can represent "numbers other than 0" or it has a way to return distinct type predicates for the true and false cases, there's nothing that can be inferred from the truthiness test here.

If you're working with object types, on the other hand, there is no footgun and a truthiness test will infer a predicate:

    const datesTruthy = [new Date(), null, new Date(), null].filter(d => !!d);
    //    ^? const datesTruthy: Date[]

This provides a tangible incentive to do non-null checks instead of truthiness checks in the cases where you should be doing that anyway, so I call this a win. Notably, the example in the original issue tests for truthiness rather than non-null.

Type guards are more discoverable

Type predicates are an incredibly useful feature, but you'd never learn about them without reading the documentation or seeing one in a declaration file. Now you can discover them by inspecting symbols in your own code:

    const isString = (x: unknown) => typeof x === 'string';
    //    ^? const isString: (x: unknown) => x is string

This makes them feel like they're more a part of the language.

Inferred type guards in interfaces are checked

While this PR defers to explicit type predicates, it will check an inferred predicate in this case:

    interface NumberInferrer {
      isNumber(x: number | string): x is number;
    }
    class Inferrer implements NumberInferrer {
      isNumber(x: number | string) {  // this is checked!!!
        return typeof x === 'number';
      }
    }

Interesting cases

The identity function on booleans is, in theory, a type guard:

    // boolId: (b: boolean): b is true?
    const boolId = (b: boolean) => b;

This seems correct but not very useful: why not just test the boolean? I've specifically prohibited inferring type predicates on boolean parameters in this PR. If nothing else, this significantly reduces the number of diffs in the baselines.

Here's another interesting case:

    function flakyIsString(x: string | number) {
      return typeof x === 'string' && Math.random() > 0.5;
    }

If this returns true then x is a string. But if it returns false then x could still be a string. So it would not be valid to infer a type predicate in this case. This is why we can't just check the trueType. In general, combining conditions like this will prevent inference of a type predicate.
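Writing out the "only if" check by hand for flakyIsString shows why it is rejected. This is a sketch only: flakyOnlyIfCheck is a hypothetical name, and the coin parameter is an assumption that stands in for Math.random() so the result is deterministic.

```typescript
// The parameter is typed as the candidate trueType (string).
function flakyOnlyIfCheck(x: string, coin: number): string {
  if (typeof x === 'string' && coin > 0.5) {
    return `true branch: ${x}`;
  } else {
    // x is still string here, not never, so the "only if" condition fails.
    // Writing `const leftover: never = x` on this line would be a compile error.
    const stillString: string = x;
    return `else branch: ${stillString}`;
  }
}
```

A string can reach the else branch whenever the second condition is false, which is exactly the case a type predicate would misreport.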
It would be nice if there were a way for a function to return distinct type predicates for the true and false cases (https://github.com/microsoft/TypeScript/issues/15048). This would make inference much more powerful. But that would be a bigger change.

I remember RyanC saying once that a function's return type shouldn't be a restatement of its implementation in the type system. In some cases this can feel like a move in that direction:

    // function hasAB(x: unknown): x is object & Record<"a", unknown> & Record<"b", unknown>
    function hasAB(x: unknown) {
      return x !== null && typeof x === 'object' && 'a' in x && 'b' in x;
    }

Like any function, a type guard can be called with any subtype of its declared parameter types. We need to consider this when inferring a type guard:

    function isShortString(x: unknown) {
      return typeof x === 'string' && x.length < 10;
    }

The issue here is that if you inline this check in an if / else, x narrows to string in the if branch and stays unknown in the else branch, and Exclude<unknown, string> = unknown. But we can't infer a type predicate here because isShortString could be called with a string and still return false. This broke the test originally used in this PR, see this comment.

A function could potentially narrow the parameter type before it gets to the return, say via an assertion:

    function assertAndPredicate(x: string | number | Date) {
      if (x instanceof Date) {
        throw new Error();
      }
      return typeof x === 'string';
    }

It's debatable what we should do in this case. Inferring x is string isn't wrong but will produce imprecise types in the else branch (which will include Date). This PR plays it safe by not inferring a type predicate in this case, at the cost of an extra call to getFlowTypeOfReference.
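To make the imprecision concrete, here is a sketch of what callers would see if x is string were attached to this function. The predicate is written by hand here (this PR does not infer it), and both assertAndPredicateExplicit and describeValue are hypothetical names.

```typescript
function assertAndPredicateExplicit(x: string | number | Date): x is string {
  if (x instanceof Date) {
    throw new Error('Dates are not supported');
  }
  return typeof x === 'string';
}

function describeValue(x: string | number | Date): string {
  if (assertAndPredicateExplicit(x)) {
    return `string: ${x}`;
  }
  // The checker sees number | Date here, even though a Date
  // would already have thrown inside the guard.
  if (typeof x === 'number') {
    return `number: ${x}`;
  }
  // Type-level only: x is Date here, but this line is unreachable at runtime.
  return `date: ${x.toISOString()}`;
}
```

The caller has to handle a Date case that can never occur, which is the imprecision described above.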
Breaks to Existing Code

Most new errors in existing code are more or less elaborate variations on https://github.com/microsoft/TypeScript/issues/38390#issuecomment-626019466

    const a = [1, "foo", 2, "bar"].filter(x => typeof x === "string");
    a.push(10);  // ok now, error with my PR

In other words, this PR allows TS to infer a narrower type for an array or other variable, and then you do something that required the broader type: pushing, reassigning, calling indexOf. See this comment for a full run-down of the changes that @typescript-bot and I have found.

Performance

We're doing additional work to infer type predicates, so performance is certainly a concern. @typescript-bot found the most significant slowdown with Compiler-Unions, a +1.25% increase in Check time.

Some fraction of the slowdown is from additional work being done in getTypePredicateFromBody (my addition), but some is also because TypeScript is inferring more precise types in more places thanks to the newly-detected type guards.

If there are performance concerns with the current implementation, there are a few options for reducing its scope:

- Only run this on arrow functions
- Only run this on contextually-typed arrow functions
- Only run this on contextually-typed arrow functions where the context would use the type predicate (e.g. Array.prototype.filter).

Possible extensions

There are a few possible extensions of this that I've chosen to keep out of scope for this PR:

- Check explicit type predicates. This PR defers to explicit type predicates, but you could imagine checking them instead. This would bring some type safety to user-defined type guards, which are currently no safer than type assertions.
- Infer assertion functions. If you throw instead of returning a boolean, it should be possible to perform an analogous form of inference.
- If #15048 were implemented, we could infer type predicates in many, many more situations.
- Infer type predicates on this. It's possible for a boolean-returning method to be a this predicate. This should be a straightforward extension of this PR.
- Infer type predicates on functions that return something other than boolean. If dates is (Date|null)[], then dates.filter(d => d) is a fine way to filter the nulls. This PR makes you write dates.filter(d => !!d). I believe this is a limitation of type predicates in general, not of this PR.
- Handle multiple returns. It should be possible to infer a type guard for this function, for example, but it would require more bookkeeping:

      function isString(x: string | number) {
        if (typeof x === 'string') {
          return true;
        }
        return false;
      }
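Until multiple returns are handled, a function like this one can be brought within the criteria listed under "How this works" by collapsing it to a single return expression, or it can keep its shape and declare the predicate explicitly. A sketch of both options (the function and variable names are hypothetical):

```typescript
// Option 1: a single return statement and no explicit return type,
// so the function meets the candidate criteria for inference.
function isStringSingleReturn(x: string | number) {
  return typeof x === 'string';
}

// Option 2: keep the multiple returns and state the predicate by hand.
// As noted above, explicit predicates are not checked against the body.
function isStringExplicit(x: string | number): x is string {
  if (typeof x === 'string') {
    return true;
  }
  return false;
}

const mixed: (string | number)[] = [12, 'foo', 23, 'bar'];
// The explicit predicate selects the type-guard overload of filter.
const strings: string[] = mixed.filter(isStringExplicit);
```

Option 2 compiles regardless of whether inference is available, at the cost of relying on an unchecked annotation.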
