248 lines
6.8 KiB
Markdown
248 lines
6.8 KiB
Markdown
---
|
|
title: Intro to Types
|
|
subtitle: What they are and what they're for
|
|
author: Levi Pearson
|
|
...
|
|
|
|
# Overview
|
|
|
|
## Goals
|
|
|
|
+ Enough terminology to understand discussions
|
|
+ Understand what types provide, and at what cost
|
|
+ How common language features interact with type systems
|
|
|
|
## Not-Goals
|
|
|
|
+ Debate about whether typed languages are "good" or "bad"
|
|
+ Deep understanding of Type Theory
|
|
|
|
# Definitions
|
|
|
|
## Caveats
|
|
|
|
+ "Types" and "Type Systems" are overloaded terms
|
|
|
|
+ Many usages are correct--in the right context!
|
|
|
|
+ Be prepared to find common ground *before* arguing
|
|
|
|
## Type System
|
|
|
|
> A type system is a tractable syntactic method for proving
|
|
> the absence of certain program behaviors by classifying
|
|
> phrases according to the kinds of values they compute.
|
|
>
|
|
> -- Benjamin Pierce, "Types and Programming Languages"
|
|
|
|
## Types About Programs
|
|
|
|
+ Types and type systems were invented before computers
|
|
+ Used in logic, math, and philosophy
|
|
+ Also used in CS for things other than practical programs
|
|
+ They are related, but we won't speak further about them!
|
|
|
|
## Syntactic Method
|
|
|
|
+ "Syntactic" means "based on the form of the text", or "static"
|
|
|
|
+ This excludes run-time or "dynamic" information from the proof
|
|
|
|
## Classifying
|
|
|
|
+ Types *classify* textual phrases ("terms")
|
|
|
|
+ Types *describe* the values the terms of that type can evaluate to
|
|
|
|
+ They provide an "upper bound" on the range of values
|
|
|
|
+ E.g. the expression `3 + 2` is a term that may have the type `Int`
|
|
|
|
## Absence of Behaviors
|
|
|
|
+ Proofs are *about* run-time behavior, not syntactic issues
|
|
|
|
+ Example behaviors to exclude:
|
|
- Nonsensical operations, e.g. `3 + false`
|
|
- Calling missing methods
|
|
- Violation of abstraction boundaries
|
|
|
|
+ Each system chooses what behaviors it excludes
|
|
|
|
## Tractable
|
|
|
|
+ An algorithm (the "type checker") exists
|
|
+ It can prove that the types assigned to terms are consistent *or*
|
|
+ It can show where any inconsistencies are in the program
|
|
+ The algorithm must complete in a reasonable amount of time
|
|
+ It is desirable to know it will *always* complete
|
|
|
|
# Errors and Safety
|
|
|
|
## Going Wrong
|
|
|
|
> ... well-typed programs cannot "go wrong" ...
|
|
>
|
|
> -- Robin Milner, "A Theory of Type Polymorphism in Programming"
|
|
|
|
+ This paper introduced the type checker for ML
|
|
|
|
+ What does he mean by "go wrong"?
|
|
|
|
## Formal Semantics
|
|
|
|
+ To formally prove, you must first formally specify
|
|
|
|
+ A formal semantics gives unambiguous meaning to all programs
|
|
|
|
+ This is as difficult as it sounds; rarely done for full languages
|
|
|
|
+ Milner gave the meaning "wrong" to certain program fragments
|
|
|
|
## Well-Typed
|
|
|
|
+ To be "well-typed" means types can be found for all terms
|
|
|
|
+ Types are the "upper bound" on possible evaluations
|
|
|
|
+ The meaning/evaluation "wrong" has no type
|
|
|
|
+ If a program is well-typed, it can't go (i.e. evaluate to) "wrong"
|
|
|
|
## What is "wrong"?
|
|
|
|
+ Not all errors can be easily ruled out this way
|
|
|
|
+ Types are an approximation of run-time values
|
|
|
|
+ Detecting some errors requires very fine approximation
|
|
|
|
+ "Fancy" types can approximate better, but add complexity
|
|
|
|
## Safety
|
|
|
|
+ What errors to prioritize? Abstraction-breaking ones.
|
|
- Out-of bounds manipulation of memory
|
|
- Viewing an object as a value outside its type
|
|
- Viewing uninitialized memory
|
|
- Executing code at an illegal address
|
|
- Corruption of run-time or system data
|
|
|
|
+ If a language rules out these behaviors in its programs, it is Safe.
|
|
|
|
+ Safe langauges may still have plenty of errors in their programs
|
|
|
|
+ Errors in safe languages are properly attributed to their causes!
|
|
|
|
## Safety and Types
|
|
|
|
+ Types can rule out many, but usually not all, safety errors.
|
|
- Array bounds for dynamically-sized arrays
|
|
- Use-after-free errors
|
|
|
|
+ Run-time checks can catch them all!
|
|
|
|
+ So why use types?
|
|
|
|
## Advantages to Type Safety
|
|
|
|
+ Errors are ruled out *before* execution and for all runs
|
|
|
|
+ Expensive run-time checks can be elided
|
|
|
|
+ Sometimes more efficient data representation can be used
|
|
|
|
+ Types provide useful documentation, especially when precise
|
|
|
|
## Costs to Type Safety
|
|
|
|
+ Some programs are ruled out that would run correctly if dyanmically checked
|
|
|
|
+ Extra time and formality is needed in the language design
|
|
|
|
+ Type checking algorithms may take non-trivial compile time
|
|
|
|
+ Some run-time checks and other infrastructure still needed
|
|
|
|
## Type Systems Are Hard
|
|
|
|
> It turns out that a fair amount of careful analysis is required to
|
|
> avoid false and embarrasing claims of type soundness for programming
|
|
> languages. As a consequence, the classification, description, and
|
|
> study of type systems has emerged as a formal discipline.
|
|
>
|
|
> -- Luca Cardelli
|
|
|
|
# Classifying Type Systems
|
|
|
|
## Surface Dimensions
|
|
|
|
+ Safe vs. Unsafe
|
|
+ Typed vs. Untyped
|
|
+ Explicit vs. Implicit
|
|
+ Simple vs... not?
|
|
|
|
## When is a language typed?
|
|
|
|
+ If it has a type checker, it is typed
|
|
+ It may *also* store information for run-time checking
|
|
+ The type checker might not be of much help sometimes
|
|
+ Things get fuzzy sometimes
|
|
|
|
## Implicit types and inference
|
|
|
|
+ Many useful ML programs can be written without naming a type in their text
|
|
+ "Algorithm W", a.k.a. "Hindley-Milner type inference" figures them all out
|
|
+ Tractability of inference algorithms is very sensitive to type system features!
|
|
+ Interacts particularly poorly with subtyping
|
|
+ "inference" vs. "deduction"
|
|
|
|
## Simple Types
|
|
|
|
+ These types classify values and functions
|
|
+ Can include higher-order functions
|
|
+ Base types, plus various compound types
|
|
+ tuples, records, variants, enumerations, arrays, etc.
|
|
+ Languages with simple type systems begin with Fortran in 1950s.
|
|
|
|
## Polymorphic Types
|
|
|
|
+ A polymorphic type is parameterized by one or more type variables
|
|
+ For each choice of type for the type variable, a new concrete type is selected
|
|
+ A polymorphic type is like a function from types to types
|
|
+ This is called "parametric polymorphism"
|
|
+ A simple type system is a first-order system; with polymorphism, higher-order.
|
|
|
|
## Polymorphic Lists
|
|
|
|
+ `List<T>` is a type that maps a type `T` to the type of lists of `T`
|
|
+ It works for *all* types `T`
|
|
+ `List` itself is sometimes called a "type constructor"
|
|
+ Code written in terms of `List<T>` can not depend on what `T` is chosen
|
|
|
|
## Subtyping
|
|
|
|
+ A type 'A' is a subtype of type 'B' if a term of type 'A'
|
|
can be used anywhere a 'B' is expected
|
|
|
|
+ Sometimes known as "subtype polymorphism"
|
|
|
|
## Existential Types
|
|
|
|
> Type structure is a syntactic discipline for enforcing levels of abstraction.
|
|
>
|
|
> -- John Reynolds
|
|
|
|
## Forall vs. Exists
|
|
|
|
+ Parametric type variable is *universally quantified*
|
|
+ `List<T>` means "For all types T, we can construct List of T"
|
|
+ From logic, we also have *existential quantification*
|
|
+ "There exists a type T such that..."
|
|
|
|
## How does that work?
|
|
|
|
+ We can bundle an existential type with operations that are aware of its real nature
|
|
+ `exists T. { T.to_string() -> String; T.combine(other: T) -> T }`
|
|
+ These can be used in contexts where the exact type is known; i.e. it is abstract
|