--- title: Intro to Types subtitle: What they are and what they're for author: Levi Pearson ... # Overview ## Goals + Enough terminology to understand discussions + Understand what types provide, and at what cost + How common language features interact with type systems ## Not-Goals + Debate about whether typed languages are "good" or "bad" + Deep understanding of Type Theory # Definitions ## Caveats + "Types" and "Type Systems" are overloaded terms + Many usages are correct--in the right context! + Be prepared to find common ground *before* arguing ## Type System > A type system is a tractable syntactic method for proving > the absence of certain program behaviors by classifying > phrases according to the kinds of values they compute. > > -- Benjamin Pierce, "Types and Programming Languages" ## Types About Programs + Types and type systems were invented before computers + Used in logic, math, and philosophy + Also used in CS for things other than practical programs + They are related, but we won't speak further about them! ## Syntactic Method + "Syntactic" means "based on the form of the text", or "static" + This excludes run-time or "dynamic" information from the proof ## Classifying + Types *classify* textual phrases ("terms") + Types *describe* the values the terms of that type can evaluate to + They provide an "upper bound" on the range of values + E.g. the expression `3 + 2` is a term that may have the type `Int` ## Absence of Behaviors + Proofs are *about* run-time behavior, not syntactic issues + Example behaviors to exclude: - Nonsensical operations, e.g. `3 + false` - Calling missing methods - Violation of abstraction boundaries + Each system chooses what behaviors it excludes ## Tractable + An algorithm (the "type checker") exists + It can prove that the types assigned to terms are consistent *or* + It can show where any inconsistencies are in the program + The algorithm must complete in a reasonable amount of time + It is desirable to know it will *always* complete # Errors and Safety ## Going Wrong > ... well-typed programs cannot "go wrong" ... > > -- Robin Milner, "A Theory of Type Polymorphism in Programming" + This paper introduced the type checker for ML + What does he mean by "go wrong"? ## Formal Semantics + To formally prove, you must first formally specify + A formal semantics gives unambiguous meaning to all programs + This is as difficult as it sounds; rarely done for full languages + Milner gave the meaning "wrong" to certain program fragments ## Well-Typed + To be "well-typed" means types can be found for all terms + Types are the "upper bound" on possible evaluations + The meaning/evaluation "wrong" has no type + If a program is well-typed, it can't go (i.e. evaluate to) "wrong" ## What is "wrong"? + Not all errors can be easily ruled out this way + Types are an approximation of run-time values + Detecting some errors requires very fine approximation + "Fancy" types can approximate better, but add complexity ## Safety + What errors to prioritize? Abstraction-breaking ones. - Out-of bounds manipulation of memory - Viewing an object as a value outside its type - Viewing uninitialized memory - Executing code at an illegal address - Corruption of run-time or system data + If a language rules out these behaviors in its programs, it is Safe. + Safe langauges may still have plenty of errors in their programs + Errors in safe languages are properly attributed to their causes! ## Safety and Types + Types can rule out many, but usually not all, safety errors. - Array bounds for dynamically-sized arrays - Use-after-free errors + Run-time checks can catch them all! + So why use types? ## Advantages to Type Safety + Errors are ruled out *before* execution and for all runs + Expensive run-time checks can be elided + Sometimes more efficient data representation can be used + Types provide useful documentation, especially when precise ## Costs to Type Safety + Some programs are ruled out that would run correctly if dyanmically checked + Extra time and formality is needed in the language design + Type checking algorithms may take non-trivial compile time + Some run-time checks and other infrastructure still needed ## Type Systems Are Hard > It turns out that a fair amount of careful analysis is required to > avoid false and embarrasing claims of type soundness for programming > languages. As a consequence, the classification, description, and > study of type systems has emerged as a formal discipline. > > -- Luca Cardelli # Classifying Type Systems ## Surface Dimensions + Safe vs. Unsafe + Typed vs. Untyped + Explicit vs. Implicit + Simple vs... not? ## When is a language typed? + If it has a type checker, it is typed + It may *also* store information for run-time checking + The type checker might not be of much help sometimes + Things get fuzzy sometimes ## Implicit types and inference + Many useful ML programs can be written without naming a type in their text + "Algorithm W", a.k.a. "Hindley-Milner type inference" figures them all out + Tractability of inference algorithms is very sensitive to type system features! + Interacts particularly poorly with subtyping + "inference" vs. "deduction" ## Simple Types + These types classify values and functions + Can include higher-order functions + Base types, plus various compound types + tuples, records, variants, enumerations, arrays, etc. + Languages with simple type systems begin with Fortran in 1950s. ## Polymorphic Types + A polymorphic type is parameterized by one or more type variables + For each choice of type for the type variable, a new concrete type is selected + A polymorphic type is like a function from types to types + This is called "parametric polymorphism" + A simple type system is a first-order system; with polymorphism, higher-order. ## Polymorphic Lists + `List` is a type that maps a type `T` to the type of lists of `T` + It works for *all* types `T` + `List` itself is sometimes called a "type constructor" + Code written in terms of `List` can not depend on what `T` is chosen ## Subtyping + A type 'A' is a subtype of type 'B' if a term of type 'A' can be used anywhere a 'B' is expected + Sometimes known as "subtype polymorphism" ## Existential Types > Type structure is a syntactic discipline for enforcing levels of abstraction. > > -- John Reynolds ## Forall vs. Exists + Parametric type variable is *universally quantified* + `List` means "For all types T, we can construct List of T" + From logic, we also have *existential quantification* + "There exists a type T such that..." ## How does that work? + We can bundle an existential type with operations that are aware of its real nature + `exists T. { T.to_string() -> String; T.combine(other: T) -> T }` + These can be used in contexts where the exact type is known; i.e. it is abstract