
Modularity #39

Open
shelby3 opened this issue Feb 10, 2018 · 300 comments


shelby3 commented Feb 10, 2018

Prior discussion:

#14 (comment)

#33 (comment)

#8 (comment)

Please kindly move modularity discussion here rather than polluting issue threads that are specialized on their thread titles (speaking to myself as well, not to any specific person), so that we can have the modularity discussion organized in a thread of its own.


sighoya commented Feb 12, 2018

@shelby3
Associated types are fine and they constrain types at the type level, not at the runtime level, right? (example: type families in typeclasses, the Pokemon example)

@keean
I will try to recapture your problem with a code example (though I'm not really familiar with OCaml); you can correct me if I'm wrong:

signature EQ = sig
  type t
  val eq : t * t -> bool
end

structure EqInt = struct
  type t = int
  val eq = Int.eq
end

fun f (X : EQ) = struct
  type t = X.t
  val eq = X.eq
end

Function f should have a signature of the form EQ where type t -> EQ where type t,
which can be erased even if the type t from X is not known explicitly.
Why?
Every time you construct a signature instance EQ with a certain type t, the compiler will/could create a monomorphized function for it. When a module of type EQ is applied to f, the compiler should be able to pick the right function at compile time, which should be possible because of type inference.
Even if you change X.t from int to float (I don't know if that is possible), you state the type change of X.t explicitly in your code, which should be recognized by the compiler.
The only problem arises if you eval a type out of a string which is created from "outside":

string<-readline...
X.t=eval_to_type(string)

Because of this, most statically typed languages disallow such things.


keean commented Feb 12, 2018

@sighoya it would help if you referenced the problem you are questioning; I am not sure what you are trying to show or disprove.


sighoya commented Feb 12, 2018

I want to prove that dependent types are not needed for modules.

Edit:
reference


keean commented Feb 12, 2018

Your example has no runtime polymorphism. You need to introduce an existential type, so that you run Eq over a list of existential values.


sighoya commented Feb 12, 2018

You mean something like this:

signature EQ = forall x sig
  type t = [x]
  val eq : t * t -> bool
end

or

signature EQ = sig
  type t = [forall x]
  val eq : t * t -> bool
end

I don't know if the latter is valid syntax, but I mean a heterogeneous list, i.e. one containing values of different types.
In the case of the latter example, fun f with:

fun f (X : EQ) = struct
  type t = X.t
  val eq = X.eq
end

should have a signature of EQ where t=[forall x] -> EQ where t=[forall x],

meaning that t is a list where the elements are of type Variant Type (runtime Union Type), for which no type erasure and no dependent type is needed.


keean commented Feb 12, 2018

@sighoya

meaning that t is a list where the elements are of type: Variant Type(runtime Union Type) for which no type erasure and no dependent type is needed.

That is different because with a union type you are limited in which types you can put in. Given a "String|Int" you can only put a string or integer into the heterogeneous collection.

With an existential type we can specify the interface, say forall a . Eq a => a which allows any type that implements Eq to be put in the collection.
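
For example, in Haskell (a minimal sketch; SomeEq is an illustrative name, not from the thread):

{-# LANGUAGE ExistentialQuantification #-}

-- Any type with an Eq instance can be wrapped; the concrete type is hidden.
data SomeEq = forall a. Eq a => SomeEq a

-- A heterogeneous list: Int, Bool, and String all implement Eq.
someEqs :: [SomeEq]
someEqs = [SomeEq (1 :: Int), SomeEq True, SomeEq "hello"]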

I can actually demonstrate the unsoundness without using an existential type, just think about polymorphic recursion. You cannot monomorphise a recursive polymorphic function.
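
A standard textbook illustration of that in Haskell (my example, a nested datatype):

-- Each Cons shifts the element type from a to [a] (a nested datatype).
data Nested a = Nil | Cons a (Nested [a])

-- len calls itself at type Nested [a], then Nested [[a]], and so on,
-- so there is no finite set of monomorphic instances to generate.
len :: Nested a -> Int
len Nil         = 0
len (Cons _ xs) = 1 + len xs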


keean commented Feb 12, 2018

@sighoya

should have a signature of EQ where t=[forall x] -> EQ where t=[forall x],

This type is impossible because the type escapes the existential.


sighoya commented Feb 12, 2018

@keean wrote

That is different because with a union type you are limited in which types you can put in. Given a "String|Int" you can only put a string or integer into the heterogeneous collection.

That's right; therefore Variant Type != Union Type.
@keean wrote

With an existential type we can specify the interface, say forall a . Eq a => a which allows any type that implements Eq to be put in the collection.

Aha, here an existential takes the role of a trait/typeclass object.
@keean wrote

I can actually demonstrate the unsoundness without using an existential type, just think about polymorphic recursion. You cannot monomorphise a recursive polymorphic function.

Corner cases/undecidable issues.
Do you think the following is sound?
(+) :: Int -> Int -> Int
Int.MAX + Int.MAX = Error?

@keean wrote

@sighoya

should have a signature of EQ where t=[forall x] -> EQ where t=[forall x],

This type is impossible because the type escapes the existential.

Yes, it breaks the rules of existentials.
Maybe f :: EQ where t=Variant -> EQ where t=Variant is a better fit,
or leave out the associated type with f :: EQ -> EQ.
Why is it so important to include the associated type in the type signature?

Further, I don't see the need for dependent types for module purposes, but maybe you mean runtime polymorphism is inevitable; then yes.
Runtime polymorphism is indeed very useful.


shelby3 commented Feb 13, 2018

@keean

If a module has an associated type, and a module is first class (can be passed to a function as a value) then the associated type must depend on the value passed to the function. This is clearly the definition of a dependent type.

But if the instance of the module’s type is statically known at the time when the value is applied at the call-site then it’s no different than applying typeclasses at the call-site. So are typeclass bounds dependent typing?

I guess it’s because we erase the instance’s type from the static knowledge when we assign it to a module signature (but have to reify it with runtime dispatch), thus we don’t track it statically? Isn’t that analogous to assigning an object to an existential type, thus erasing it from static tracking?

The key difference is when the type of the value can’t be known statically, then we need to track the type of a value at runtime with some form of runtime dispatch. Which is analogous to existential types with typeclasses.

What am I missing? Trying to understand why you see a problem with this form of dependent typing that you don’t see with typeclasses?

I can actually demonstrate the unsoundness without using an existential type, just think about polymorphic recursion. You cannot monomorphise a recursive polymorphic function.

Ditto: we cannot monomorphise typeclass bounds with polymorphic recursion, and have to use runtime dispatch. We had a discussion about that in the past on these issue threads.


@sighoya wrote:

Further, I don't see the need for dependent types for module purposes, but maybe you mean runtime polymorphism is inevitable; then yes.

Well I think he may be defining dependent type correctly, yet I agree that seems like it’s just runtime polymorphism. And I am also wondering why he dislikes it given it seems to occur because of the inevitable need for runtime polymorphism.

I thought dependent typing was when we can check the values of a variable statically?? With runtime polymorphism we’re not enforcing the type of the value statically. There’s no monomorphisation going on.


keean commented Feb 13, 2018

Ditto typeclass bounds with polymorphic recursion.

No, typeclasses only depend on the type of the thing, not the value of the thing.

Now you would have the same problem if you had type-classes with associated types, but in Haskell they get around it by saying the associated types must be globally coherent with the type parameters of the type-class instances. This makes the type-class type-parameters and the associated types of that type-class form a type-family.
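
In Haskell that looks something like this (a minimal sketch):

{-# LANGUAGE TypeFamilies #-}

-- The associated type is a function of the class parameter alone, so it
-- is globally coherent: Elem [a] means the same type everywhere.
class Collection c where
  type Elem c
  insert :: Elem c -> c -> c

instance Collection [a] where
  type Elem [a] = a
  insert = (:)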


keean commented Feb 13, 2018

@sighoya a variant type is also limited in the types you can put into it (at least it is in OCaml). Without existentials (runtime polymorphism), polymorphic recursion can still cause problems with modules that require dependent types to support.

Now you can have modules without dependent types, if you limit the language to having everything monomorphisable. This means no runtime polymorphism, no polymorphic recursion etc.


sighoya commented Feb 14, 2018

@keean wrote

@sighoya a variant type is also limited in the types you can put into it (at least it is in OCaml).

My thoughts were about D's Variant type.
You can also further restrict D's Variant type to allow assignment of values of specific types only (they call that an Algebraic type, in the same URL I presented).
A variant type is an Any type without the baggage of inheritance.

Edit:
A typeclass/trait object can also be seen as a Variant Type which is bounded to the types implementing the trait/typeclass. In some situations a trait object can also be a Union Type (compile-time Variant Type), in which case monomorphization can apply.

@shelby3

I thought dependent typing was when we can check the values of a value statically?? With runtime polymorphism we’re not enforcing the type of the value statically. There’s no monomorphisation going on.

I thought dependent typing was when we cannot check the type of a variable at compile time (statically), because the variable's type depends on the variable's value and maybe on other variables' values.
However, we can pattern match over it and follow different branches in which the compiler knows the type for each branch (when done right), though it can become tedious.

Dependent types are indeed very useful. Imagine a type called DeterminantNotZeroMatrix, which is defined as follows:

DeterminantNotZeroMatrix<Type eltype,Int n,Int m>={Matrix<eltype,m,n> mat |  det(mat)!=0}

Edit: ambiguous use of m, changed it to mat.

Once we have constructed this kind of dependent type, we can invert the matrix without checking every time whether the matrix is invertible.
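
A rough Haskell sketch of the same idea with a smart constructor (Matrix and det are assumed placeholders); this checks the refinement once at construction instead of tracking it in the type:

-- Placeholder matrix type and determinant, assumed for illustration.
newtype Matrix = Matrix [[Double]]

det :: Matrix -> Double
det _ = 1  -- stand-in for a real determinant

-- Invariant: det m /= 0. Keep the constructor unexported so the only
-- way to obtain an InvertibleMatrix is through mkInvertible.
newtype InvertibleMatrix = InvertibleMatrix Matrix

mkInvertible :: Matrix -> Maybe InvertibleMatrix
mkInvertible m
  | det m /= 0 = Just (InvertibleMatrix m)
  | otherwise  = Nothing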

But @keean, I think you need a runtime polymorphic type rather than dependent typing.


sighoya commented Feb 14, 2018

There is a proposal to add type classes to C#. The funny thing is that the proposed implementation of typeclasses looks a lot like a light form of Modular Type Classes.


keean commented Feb 14, 2018

@sighoya

My thoughts were about D's Variant type.
You can also further restrict D's Variant type to allow assignment of values of specific types only (they call that an Algebraic type, in the same URL I presented). A variant type is an Any type without the baggage of inheritance.

This kind of variant is not modular, as there is no way to represent a universal type tag. This kind of variant requires Run Time Type Information, which is a big problem for me, as you lose type erasure (and theorems for free) as well. I can give more details why this is bad (or perhaps why I don't like it if I am being generous).

But @keean, I think you need a runtime polymorphic type rather than dependent typing

Runtime type information is worse (see above); with dependent typing we can still resolve issues at compile time. For example if we try to print a value of an associated type, we can add the "Print" constraint to that module. This can be propagated up to the interface specification for the current module such that we say the argument for this function must be an iterator where the ValueType of the iterator is printable. These are dependent types (because the type passed to print depends on the value of the module passed to the function) but we solve by placing constraints on the types which we propagate at compile time. The latest we would need to check these constraints is when linking a dynamically loaded module; we still retain type erasure and don't have to find a way to store types in memory at runtime.
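
A sketch of that constraint propagation in Haskell notation (Iterator, Value, and printAll are hypothetical names):

{-# LANGUAGE TypeFamilies, FlexibleContexts #-}

class Iterator i where
  type Value i
  next :: i -> Maybe (Value i, i)

-- The "Print" (Show) constraint on the associated type is stated in the
-- signature and checked at compile time; no type is stored at runtime.
printAll :: (Iterator i, Show (Value i)) => i -> IO ()
printAll it = case next it of
  Nothing        -> pure ()
  Just (v, rest) -> print v >> printAll rest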

I think it's bad to store types in memory at runtime due to the Russell paradox. If every memory location has a type, and a memory location contains a type, then the type of a type is a type and we have created the conditions for a paradox. Everything in memory at runtime is a value; now we can create a mapping from values to types, but we cannot do this globally, because with separate compilation we don't know what values other compilers assign to which types. So we have to locally map values to types, which is what an OCaml variant does (a disjoint sum type).


sighoya commented Feb 14, 2018

@keean wrote

Runtime type information is worse (see above); with dependent typing we can still resolve issues at compile time. For example if we try to print a value of an associated type, we can add the "Print" constraint to that module.

It looks to me like you need a trait/typeclass object.

@keean wrote

I think it's bad to store types in memory at runtime due to the Russell paradox. If every memory location has a type, and a memory location contains a type, then the type of a type is a type and we have created the conditions for a paradox.

The type Type is a meta type which is a non-well-founded set including itself.


keean commented Feb 14, 2018

@sighoya it seems you agree with my reasoning, but trait objects are Rust's name for existential types, and you are back to needing dependent types if you use modules.

It all comes down to the difference between a type class and a module. In a type class the implementation is chosen only using the type information of the type class parameters. In the lambda cube this makes an associated type a type that depends on another type. Because a module is first-class the associated type depends on the actual value of the module passed, not just its type, hence an associated type is a type that depends on a value (otherwise known as a dependent type).


sighoya commented Feb 14, 2018

@keean
Let me try to reflect.
You describe a case where a functor takes a module and returns a module, where the associated type of the returned module depends on the input module.
You don't want to solve it with Variant Types because you don't like them, but you want to solve it with existentials which have a constraint that depends, however, on the value of the input module.
Your solution is to constrain the input module by a dependent type, i.e. a dependent signature in which only the desired modules are accepted by the functor.

In my eyes, it is one solution. But you can also force the functor to return a Maybe structure in which its contained/wrapped structure depends on the value condition.


keean commented Feb 14, 2018

@sighoya not quite. With first-class modules (and functors) you can pass modules/functors as arguments to functions. Because associated types with modules depend on the value of a module and not just its type, you need dependent types even without variants, existentials, etc. You just need first-class modules and associated types to have the problem.


keean commented Feb 14, 2018

@sighoya Let me have another go, as I don't think I have done a very good job of explaining, and may have got a bit side-tracked by other issues. The type system only tracks the type of modules or functors. It is possible for two modules which conform to the same signature (type) to have different associated types. The associated types are not part of the signature.

signature EQ = sig
  type t
  val eq : t * t -> bool
end

structure EqInt = struct
  type t = int
  val eq = Int.eq
end

structure EqFloat = struct
  type t = float
  val eq = Float.eq
end

fun f (X : EQ) = struct
  type t = X.t
  val eq = X.eq
end

f(EqInt).eq(1, 2)
f(EqFloat).eq(1.2, 1.1)

Because EqInt and EqFloat are values of the type EQ, the type system does not track the difference between them. All the type system sees is the type EQ; in order to see what value is passed to f we have to use partial evaluation or abstract interpretation. However, neither of these methods can solve general computational problems, and you very quickly end up running the whole program. This is why compile-time evaluation is restricted to constant expressions in C++ and other languages. In other words we could resolve this for limited cases, but most compilers don't even try.
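
The same point in a Haskell sketch, modelling a first-class "EQ module" as an existential value (illustrative names):

{-# LANGUAGE ExistentialQuantification #-}

-- The carrier type t is hidden; eqIntM and eqFloatM have the same type.
data AnEq = forall t. AnEq (t -> t -> Bool)

eqIntM, eqFloatM :: AnEq
eqIntM   = AnEq ((==) :: Int -> Int -> Bool)
eqFloatM = AnEq ((==) :: Float -> Float -> Bool)

-- g (AnEq eq) = eq 1 2   -- rejected: the checker only sees AnEq and
-- cannot recover whether the hidden t is Int or Float at the use site.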


sighoya commented Feb 14, 2018

@keean wrote:

The type system only tracks the type of modules or functors. It is possible for two modules which conform to the same signature (type) to have different associated types. The associated types are not part of the signature.

Right. This is the intention of associated types: to not include them in the type signature; otherwise they would be parameters.

@keean wrote:

Because EqInt and EqFloat are values of the type EQ, the type system does not track the difference between them. All the type system sees is the type EQ; in order to see what value is passed to f we have to use partial evaluation or abstract interpretation.

Ah, I see your problem. The problem is that the signature of eq changes depending on the input module. And this is unmodular, right?
You can pattern match over it (= partial evaluation?), but cannot cover all possible values with it (at least not in all cases).
Edit:
Maybe the type system does not know the difference between EqInt and EqFloat, but the compiler knows it, because you hard-coded the type int to EqInt and the type float to EqFloat, and the compiler can therefore typecheck the call of eq at compile time. (For hidden/already compiled modules, there must be a way for the compiler to access the associated type; at least I would implement it to do so.)

I don't think the problem is that modules are values. You even have the problem if the modules are defined statically but the definitions of the modules are hidden from the user.

For example, you specialize a function f with some unknown static module EqA and call f with values depending on EqA.

f<EqA>(1,2) or f<EqA>("1","2") ?

I think it is not a problem that modules are first-class values; the problem is that associated types are unmodular.


sighoya commented Feb 14, 2018

Ah... I must correct myself, you are right:

a = io.read().asInt()
module m::EQ
if a=2 m = EqInt else m = EqFloat
f(m)(1,2) # allowable?


sighoya commented Feb 14, 2018

I could imagine disallowing:

a = io.read().asInt()
module m::EQ
if a=2 m = EqInt else m = EqFloat
f(m)(1,2) # allowable?

by the compiler, and pattern matching over all possible outcomes, but then the user cannot see why the above code is not valid (because it is not inferable from the type signature), damnit!


keean commented Feb 15, 2018

@sighoya We can only reason about code using the types; otherwise we are executing the code. If we execute the code we have an interpreter, not a compiler. The whole point of a compiler is to translate code from one language (the source language) to another (the target language). This gets fuzzy when we can partially evaluate code where the inputs are static, but as soon as the input depends on IO we cannot even do that. So when you write short code examples it's easy to say "well the compiler knows this value", but in real code it is exceptionally rare, except for constant values like PI. Even with constant values you have the catch-22 of not wanting to execute code until it has been type checked to make sure it does not crash the compiler, but your type checking depends on the values, which requires you to run the code. So in reality partial evaluation does not work for this kind of problem. That leaves abstract interpretation as the only method, and this is not a well-known technique even amongst compiler writers. Dependent types are by far the best solution, or avoid the problem by forcing global coherence on module instances by making the associated types a type-family over the module type parameters (what Haskell does).


sighoya commented Feb 15, 2018

Some Questions:

1.) How does OCaml handle these things? Does it crash?

2.) What if we force the user to pattern match over the associated type in a module of type EQ:

module m::EQ
m = EqInt if a=2 else EqFloat  # now the compiler knows that m.t could be different
f(m).eq(2,2)  # here the compiler knows eq must be of type Int->Int->Bool, so the user has to check whether m.t is Int

The compiler will throw an error and force the user to rewrite the code above to:

module m::EQ
m = EqInt if a=2 else EqFloat
match m.t =
  Int -> f(m).eq(2,2)
  _   -> Error/Nothing

What am I missing?

3.) What is your plan for this problem?


shelby3 commented Feb 15, 2018

@keean,

Ditto typeclass bounds with polymorphic recursion.

No, typeclasses only depend on the type of the thing, not the value of the thing.

We discussed this already in our past discussion of rank-N HRT. When we store callbacks that have a typeclass bound as values, we lose the definition of the function body. Thus there’s no way to monomorphise the body, because we don’t know the body of the function that is the callback: more than one function definition can match the same set of typeclass bounds, or alternatively the caller of the callback may be supplying existential types instead of statically known instances. We have to resort to runtime dispatch. So indeed typeclasses do depend on the value of the thing when we have to throw away static knowledge of which values correspond to which types in our imperative soup (also for example existential types aka trait objects). If we can eliminate the need for imperative soup…
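
A small Haskell illustration of such a stored callback (a rank-2 type; illustrative names):

{-# LANGUAGE RankNTypes #-}

-- The stored callback must work for every Show a, so each call goes
-- through the dictionary supplied at runtime; there is no single body
-- the compiler can monomorphise once the callback is just a value.
newtype Callback = Callback (forall a. Show a => a -> String)

runBoth :: Callback -> (String, String)
runBoth (Callback f) = (f (42 :: Int), f True)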

Now you can have modules without dependent types, if you limit the language to having everything monomorphisable. This means no runtime polymorphism, no polymorphic recursion etc.

…if our modules are applicative/pure and all imperativity occurs top-down outside any modules, then in theory we can have polymorphic recursion without requiring runtime polymorphism in the modules, although at the junction between the imperative layer and the modular layer there must be runtime dispatch to monomorphised branches of the modular code.

Btw, the Pokemon example employing a type family fails to solve the Expression Problem. New Pokemon types can’t be added independently. The code could be reorganized to not use associated types if runtime polymorphism is allowed, and then it would also be more modular and extensible. So again I ask @keean what the objection to dependent typing is, if all it means is to enable runtime polymorphism? Runtime polymorphism is necessary for modularity, for solving the Expression Problem, and because static typing is limited in expression?


@sighoya wrote:

Associated types are fine and they constrain types at the type level, not at the runtime level, right? (example: type families in typeclasses, the Pokemon example)

Well even when simulated with modular abstract types, I don’t see why they couldn’t be monomorphized in the cases where there’s no runtime polymorphism?

I thought dependent typing was when we can check the values of a variable statically?? With runtime polymorphism we’re not enforcing the type of the value statically. There’s no monomorphisation going on.

I thought dependent typing was when we cannot check the type of a variable at compile time (statically), because the variable's type depends on the variable's value and maybe on other variables' values.

So we agree that dependent typing is the problem where we need to know the type of values. I’m saying that I thought dependent typing was the ability to type those values statically with total-order knowledge about all possible states of the program. You’re saying that actually it’s the ability to declare some invariants independent of the values, but isn’t that what a type is? An int is an invariant independent of the values it takes.

Dependent types are indeed very useful. Imagine a type called DeterminantNotZeroMatrix, which is defined at follows:

So what you’re saying is the user can create new types and enforce semantics on those types. But isn’t that what typing is for?

It seems to me that runtime polymorphism is somehow being conflated with dependent typing in our discussion?


@keean wrote:

This kind of variant requires Run Time Type Information, which is a big problem for me, as you lose type erasure (and theorems for free) as well. I can give more details why this is bad (or perhaps why I don't like it if I am being generous).

RTTI is bad because the programmer can access it and use it as a hack for polymorphism (c.f. the mess that is the Go libraries). Whereas what we really need is static/dynamic polymorphism, with runtime dispatch only where it must be.

But are you conflating runtime polymorphism with RTTI? Existential types (aka Rust’s trait objects) don’t erase the runtime information about types, because it’s required for runtime dispatch. Yet the programmer has no access to the RTTI.

But @keean, I think you need a runtime polymorphic type rather than dependent typing

Runtime type information is worse (see above); with dependent typing we can still resolve issues at compile time. For example if we try to print a value of an associated type, we can add the "Print" constraint to that module. This can be propagated up to the interface specification for the current module such that we say the argument for this function must be an iterator where the ValueType of the iterator is printable. These are dependent types (because the type passed to print depends on the value of the module passed to the function) but we solve by placing constraints on the types which we propagate at compile time. The latest we would need to check these constraints is when linking a dynamically loaded module; we still retain type erasure and don't have to find a way to store types in memory at runtime.

That presumes that modules aren’t first-class values:

You just need first-class modules and associated types to have the problem.

If they are values then you can’t know until runtime what the types of the abstract types are. So then you must use runtime dispatch.

I think it's bad to store types in memory at runtime due to the Russell paradox. If every memory location has a type, and a memory location contains a type, then the type of a type is a type and we have created the conditions for a paradox. Everything in memory at runtime is a value; now we can create a mapping from values to types, but we cannot do this globally, because with separate compilation we don't know what values other compilers assign to which types. So we have to locally map values to types, which is what an OCaml variant does (a disjoint sum type).

We discussed this in the past. Use a cryptographic hash to enable decentralized creation of a universal tag.

Again I think you’re conflating access to RTTI as evil with type erasure as good. There’s a middle ground, which is RTTI for runtime dispatch is not evil (when the programmer can’t access that RTTI).

As for Russell’s paradox, it’s omnipresent in many different facets of programming and there’s absolutely nothing you can do to solve that. If you could solve it, then our universe would be provably bounded. Life is necessarily imperfect else we couldn’t exist.


@sighoya wrote:

The type Type is a meta type which is a non-well-founded set including itself

This is related to why we can’t have unbounded/open ADTs and use them to select implementation of typeclass instances, because then we can’t prove anything is ever disjoint. Monomorphism and the Expression Problem are in tension with each other, abstractly because of Russell’s paradox.


@keean wrote:

It is possible for two modules which conform to the same signature (type) to have different associated[abstract] types. The associated[instances of the abstract] types are not part of the signature.

ftfy. Associated types are a typeclass concept. Abstract types are a module concept. Afaik, the paper Modular Type Classes employs abstract types to emulate the associated types of typeclasses.

Because EqInt and EqFloat are values of the type EQ, the type system does not track the difference between them. All the type system sees is the type EQ; in order to see what value is passed to f we have to use partial evaluation or abstract interpretation.

It seems you argued above that the type system can track the differences and monomorphise (in some cases) and I agreed except I pointed out that when the modules are first-class values that get passed around, then the instances of the signature can’t be known statically.

However, neither of these methods can solve general computational problems, and you very quickly end up running the whole program.

What does that sentence mean? Do you mean that not everything can be monomorphised?

This is why compile-time evaluation is restricted to constant expressions in C++ and other languages. In other words we could resolve this for limited cases, but most compilers don't even try.

Typeclass bounds at call sites also throw away the information about the types at the call site, but the difference is the static information is thrown away at the call site and not at the construction of a module instance with a functor. In the case of typeclass bounds of callback functions, the body of the function is statically unknowable.

Since we know we can emulate typeclasses with modules per Modular Type Classes, we know that it is not the case that all uses of modules can't be monomorphised. The key is really referential transparency. When we/the compiler know the code isn’t overwriting (generic references to) values with different types (which causes the requirement for the value restriction!), then the compiler can statically trace the type of the instance of a module to its use in a function and thus monomorphise. That is why C++ and other non-pure languages can’t do it.

All of us discussing here know that PL design has different axes that are highly interrelated. We must analyse holistically over all axes simultaneously in order to explore the design space.

We can only reason about code using the types; otherwise we are executing the code. If we execute the code we have an interpreter, not a compiler. The whole point of a compiler is to translate code from one language (the source language) to another (the target language). This gets fuzzy when we can partially evaluate code where the inputs are static, but as soon as the input depends on IO we cannot even do that. So when you write short code examples it's easy to say "well the compiler knows this value", but in real code it is exceptionally rare, except for constant values like PI.

The key point of your comment is that when the code is no longer pure, then neither the compiler nor the human can reason about it.

Dependent types are by far the best solution, or avoid the problem by forcing global coherence on module instances by making the associated types a type-family over the module type parameters (what Haskell does).

I wish we’d write abstract types instead of dependent types. Dependent types seems to be a broader term. Yeah, modules need abstract types as opposed to type parameters (because per prior discussion they otherwise lose encapsulation and can’t be composed non-explicitly/without boilerplate in partial orders, i.e. must always know all the instance types when composing), and if you don’t have referential transparency, then you probably cannot monomorphise them.

But referential transparency really is about top-down dependency injection. So in some cases we can top-down inject the types and monomorphise but in other cases the top-down imperative layer will lose static tracking of the types and thus we must punt to runtime polymorphism.

Per the link above, what I’m not clear on now is whether we really need both modules/typeclasses and algebraic effects? Seems like maybe all modular genericity can be modeled as bind points on the free monad?

I would appreciate you guys checking my logic on this.


@sighoya wrote:

The problem is that the signature of eq changes depending on the input module. And this is unmodular, right?

No that is modularity. Multiple implementations of the same signature. (Also multiple signatures for the same implementation).


sighoya commented Feb 16, 2018

@shelby3 wrote

You’re saying that actually it’s the ability to declare some invariants independent of the values, but isn’t that what a type is? An int is an invariant independent of the values it takes.

@shelby3 wrote

So what you’re saying is the user can create new types and enforce semantics on those types. But isn’t that what typing is for?

It seems to me that runtime polymorphism is somehow being conflated with dependent typing in our discussion?

Valid point, and I think you are right. What I presented was a refinement type, which correlates with dependent types but is not the same.

After further investigation, languages which support Variant Types already partially implement dependent types, because
dependent types are function types where the return type depends on the values of the function parameters, i.e. the return type is a Variant Type in the case of nondeterminism and a Union Type in the case of determinism.
The dependent type described above is a dependent product type, i.e. the free parameter value tuple determines the return type.
However, you can also "chain" the dependent type generation, meaning the parameter value tuple of a function is not free in itself, i.e. some values depend on other values in the parameter value tuple. Such a function is called a dependent sum type, e.g. function (): a -> a+1 -> (a, a+1).
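
In standard type-theory notation the two forms just mentioned are usually written as follows (a sketch):

% Dependent product (Pi): a function whose result type varies with the argument value
\Pi_{x : A}\, B(x)
% Dependent sum (Sigma): a pair whose second component's type depends on the first
\Sigma_{x : A}\, B(x)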

The question is, why not simply use the Variant Type/Union Type or a Sum Type (Maybe, Either) to represent varieties?
With dependent types you can encode value semantics in the type signature, e.g.:

concat :: Vec a n -> Vec a m -> Vec a (m+n)

This is so expressive that such constructs are permitted:

fun::Vec a (x**2-y**4+z**4-z)-> (Vec a x, Vec a y, Vec a z)

Because the length of a Vector is integral, you can construct diophantine problems which are undecidable (though for a computer nothing is undecidable, because it can only see a finite subset of infinities). Edit: (Corrected)
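
For reference, the concat example above can be written in GHC today with Peano-indexed vectors (a standard encoding, sketched here):

{-# LANGUAGE DataKinds, GADTs, TypeFamilies, TypeOperators #-}

data Nat = Z | S Nat

-- Type-level addition of Peano naturals.
type family (n :: Nat) + (m :: Nat) :: Nat where
  'Z   + m = m
  'S n + m = 'S (n + m)

-- Length-indexed vectors: the length is part of the type.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- The signature carries the length arithmetic, as in concat above.
vappend :: Vec n a -> Vec m a -> Vec (n + m) a
vappend VNil         ys = ys
vappend (VCons x xs) ys = VCons x (vappend xs ys)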

@keean Again, I do not clearly see why you need dependent types instead of simple runtime polymorphism (Sum Type, Variant Type).

@shelby3 wrote

Btw, the Pokemon example employing a type family fails to solve the Expression Problem. New Pokemon types can’t be added independently.

The problem with the Pokemon example is that Haskell is missing a Union Type with the mechanism to typecheck the value and monomorphize over it. They do this manually in this example, which is bad practice, especially since the associated type Winner is of kind *, meaning I could instantiate it not only with the pokemon or foe type but also with any other type.

@shelby3 wrote

So again I ask @keean what the objection to dependent typing is, if all it means is to enable runtime polymorphism? Runtime polymorphism is necessary for modularity and solving the Expression Problem?

Totally Agree.

@sighoya wrote:

The problem is that the signature of eq changes depending on the input module. And this is unmodular, right?

@shelby3 wrote

No that is modularity. Multiple implementations of the same signature. (Also multiple signatures for the same implementation).

Hmm... Yes. I've formulated it badly. What I meant is that the user cannot see how the signature changes at design time and compile time (at least not always). Therefore I presented a possible solution in my previous post which should overcome this problem, at least in my mind.


keean commented Feb 16, 2018

So again I ask @keean what the objection to dependent typing is, if all it means is to enable runtime polymorphism? Runtime polymorphism is necessary for modularity and solving the Expression Problem?

Dependent typing breaks some desirable type system properties, like decidability. If the type system becomes a fully usable Turing-complete programming language, then why have two different Turing-complete programming languages? However, it might be less problematic than the alternatives.


keean commented Feb 16, 2018

@sighoya

Again, I do not clearly see why you need dependent types instead of simple runtime polymorphism (Sum Type, Variant Type)

This is a complex question to answer, but it is along the lines of "I cannot clearly see why everyone cannot program in Lisp, as it is Turing complete and you can express any program in it."


keean commented Jul 28, 2018

@sighoya The barber paradox was used by Russell as a concrete example of his paradox. If we now take these to be sets, the set of people who do not shave themselves is paradoxical because of the barber. If types are names for sets of values then we can see how this applies. Further, we can have kinds whose values are types. In type theory we would call these "Universes", and Universe zero would be types that contain values. The universe of a type is the highest universe of any member of the set plus one. So kinds have a universe number of one. The ordering of universes prevents the Russell paradox because no type can be determined in terms of a type in the same universe.


sighoya commented Jul 28, 2018

@keean wrote:

The ordering of universes prevents the Russell paradox because no type can be determined in terms of a type in the same universe.

But is it not the same case as before? Why not simply abort the construction of a to-be-shaved type set with an exception?
Imagine that you ask the SMT solver of your type system to construct a set "to-be-shaved" from your problem stated in predicate logic; the SMT solver puts the barber into your constructed set because it needs to do so, but then checks that this decision is also wrong.
Because there is no way around it, the SMT solver throws an exception, a result of type bottom, which is sound because any request which returns something can include bottom because of subtyping.

This is the advantage of decision procedures in a constructive world, where we don't have to state that it is or is not, and instead favor a third solution.

What is wrong with constructors?

I don't like constructors defining types, as a type should be defined by itself, not by a helper: explicitness vs implicitness.

I see similar examples very often:

class A
{
    int i;
    A(int i) throws Exception
    {
        if (i < 0)
            throw new Exception();  // reject negative values
        this.i = i;
    }
}

What is the type of A as a set?
Especially, does it include a record instance {i:int} where i<0? From the record perspective yes (and you don't see the constructor implementation from the signature), but from the constructor perspective no.

What happens here is that the constructor mutates the record type A to include the constructor input as well:
A = {(0, {i:Int=0}), (1, {i:Int=1}), ...}
so constructors, which are values, determine types.

If I implement f(Int,Int):D, does D = x | g(Int) need to be changed?


shelby3 commented Jul 29, 2018

I have a slightly different way of visualizing and explaining it that may help some others get a different perspective on the same issue. AFAICT, @keean and I are in agreement.

@keean wrote:

@shelby3 even containing yourself is not a paradox. Unboundedness is not paradoxical. You need the specific combination of recursion (self-reference) and negation to cause a paradox.

Note I also mentioned negation:

I don’t conceptualize Russell’s paradox as being a paradox. It simply means that sets which contain themselves aren’t bounded and thus can’t be restricted by a rule which says they don’t contain themselves.

Unboundedness by itself isn’t ever paradoxical, although it could be uncountable if the domain is not finite.

The problem occurs only when we have both, and the problem is instability, that is the barber is in the set if he does not shave himself, which means he is not in the set, which means he is in the set.

Right.

I still don’t think it is a paradox. Negation (as Socrates pointed out) assumes a total order (or a finite domain). Unboundedness presumes a total order doesn’t exist¹. Thus Russell’s paradox is really just an inconsistent set of requirements. It’s asking for something and its antithesis simultaneously.

I like this explanation of Russell’s paradox:

Russell's paradox is a counterexample to naive set theory, which defines a set as any definable collection. The paradox defines the set R of all sets that are not members of themselves, and notes that

  • if R contains itself, then R must be a set that is not a member of itself by the definition of R, which is contradictory;
  • if R does not contain itself, then R is one of the sets that is not a member of itself, and is thus contained in by definition--also a contradiction.

An example of a set which contains itself is the set of all things which include themselves in any way. For example, a recursive function (i.e. it includes a call to itself). So the set of things which include themselves in any way also includes itself.

Another example is the set of those which are or contain those that shave others. The set would then contain itself. We can’t argue that the set itself doesn’t shave others, because we have changed the inclusion rule from @keean’s “those which shave others” to “which are or contain those that shave others”. If we then add that it can only include sets that are not members of themselves, we have the first contradiction (as quoted above), because the set contains itself by the first rule but is not allowed to by the second rule. So we have specified conflicting rules. It seems sort of obvious that if we say can contain and can’t contain in the same criteria then we have a conflict and not a paradox.

The paradox arises if we write the first rule so as to not contain, such as @keean’s “those which shave others”, so we can reason that the set itself doesn’t shave others and thus it is not a set which contains itself. But by the second rule (and, as quoted above, a contradiction) it does contain itself because it doesn’t contain itself. IOW, the second rule is always a paradox, because those sets which are not members of themselves are thus members of the rule and members of themselves. The second rule is contradictory with itself.

So what this demonstrates is that universal negation is a total order and can’t exist. Total orders do not exist.

A related discussion about Gödel, total orders not existing, and the limitations of math is worth injecting here. Make sure you click the “Load more…” shown on the thread to display the entire discussion.

The outcome of this is that we can’t conclude anything about the totality of the universe because it is asymptotic. If we could have total omniscience then we wouldn’t exist because all uncertainty would be lost. And thus entropy couldn’t progress inexorably to maximize uncertainty. Time would be reversible and indistinguishable. Everything would be static. So the perception of a dynamic universe requires the lack of perception of omniscience. Does that mean we have free will? Well yes in a partially ordered perception we have free will, but universally our free will is just noise and not ordered.


Time is not a factor because the set oscillates even without time (infinitely fast if you will). There is no moment in time we can pause that the set would actually have an enumerable value.

Of course I mean spacetime. Unboundedness is an informational dimension of our universe. And again the point is that negation presumes a total order. How do you prove there will never be a neon colored swan?

So the problem of the conflicting rules is that sets that include themselves are open to unbounded inclusion (they never terminate unless we only consider a finite subset of our existence as the domain), but negation presumes bounded exclusion.

Russell’s “paradox” is resolved without restricting to finite domain in the analogous way that FCP solves the unsoundness of HRT by creating a hierarchy of types wrapping the unboundedness in an explicitly enumerated domain of nominal constructors which makes it impossible for inclusion to recurse unbounded over the same domain as itself.

The ordering of [kinds] universes prevents the Russell paradox because no type can be determined in terms of a type in the same universe.

Do you mean to convey that negation of a kind that can’t contain itself (by definition, since if it does contain itself then it is its kind + 1, i.e. we add a nominal wrapper when we recurse) can never conflict, because this is analogous to the resolution of Russell’s “paradox” mentioned above?

¹ This is why, for example, due to the FLP theorem, blockchains and Byzantine fault tolerant consensus don’t work (i.e. don’t have a total ordering for their domain) without bounded network synchrony. That said, total ordering is partial ordering in the unbounded network synchrony context, which is why BFT is not unassailable. Analogously, as we already discussed, this is why type systems aren’t unassailable.


@sighoya wrote:

What is wrong with constructors?

I don't like constructors defining types, as a type should be defined by itself, not by a helper: explicitness vs implicitness.

I see similar examples very often:

class A
{
    int i;
    A(int i) throws Exception
    {
        if (i < 0)
            throw new Exception();  // reject negative values
        this.i = i;
    }
}

What is the type of A as a set?

Especially, does it include a record instance {i:int} where i<0? From the record perspective yes (and you don't see the constructor implementation from the signature), but from the constructor perspective no.

As I wrote in my prior post, I don’t like FCP-like nominal constructors as a means to avoid unsound recursion, because they cause a lot of additional boilerplate. But sometimes that’s the only way to get a sound type system.

But you’re pointing out something different about constructors. You don’t like when the types of the program don’t accurately represent the dependent types (i.e. the allowed values that populate the declared types). IOW, you don’t like runtime values that escape from the invariants that the type checker enforces.

Unfortunately the only way to get what you want is fully dependent typing with something like Coq or Epigram. And those are not suitable for general purpose programming. Very tedious and inflexible.

Type systems really can never be complete. Gödel’s incompleteness theorem tells us that a set of computable axioms can never be both consistent and total. And it’s really conceptually just the same analogous concept I explained above in this post.


sighoya commented Jul 31, 2018

@keean
Yet I know why you are against runtime reflection (RR). I'm not of the opinion that every type needs a runtime tag, only those for which it is worthwhile (unions, random structures).

Instead, I would usually prefer more compile-time reflection (CR), which D offers in a limited form via Traits, which can for instance be used to read out additional record fields for a record-polymorphic type.

I also don't like adding or removing types at evaluation time (runtime, compile time); this should be prohibited in any case.


keean commented Jul 31, 2018

@sighoya seems we are in agreement about this :-)

I am currently thinking we can have a special type-constructor for boxing. Not sure what a good name for it is, though. Box[T] requires Addable[T].
Go can get away with Addable[] because it only allows single-parameter interfaces. Box[T] requires Collection[T, U], Printable[U] would not work in that format, and I think having two syntaxes is more complex than just having one, even if that one is more complex than the simplest syntax.


shelby3 commented Aug 14, 2018

I wrote:

I wrote:

Reminder about my up-thread post on strong modularity (c.f. also). So as I proposed there are modules with type parameters and no translucent sealing of abstract data types. And then unify that post with my recent proposal to forsake global canonicity of typeclass instances.

[…] (discussion of proposed syntax for the proposal to forsake global canonicity)

So strong modules are modules parametrised on interface signatures. This means we can write code that operates on specific data types, functions, and typeclasses contained in a signature without needing to know the implementations of those signatures. Then some other code can import that signature-parametrised module and select a specific implementation for that signature, thus creating a concrete instance of the module.

This has some overlap with the functionality of typeclasses (Named Instances for Haskell Type Classes, pg. 4). Typeclasses enable signature-parametrised functions which operate on the typeclasses in the signature. Typeclasses are parametrised by data types and contain functions. Essentially both can accomplish signature-parametrised modularity if we remove the global canonicity of typeclasses, but typeclasses are more granular, at the level of a function rather than a module of functions, so each function definition's implicit transitive interface bounds don't all have to be explicitly lifted up to module parameters (thus more degrees of freedom but noisier function declarations), which is claimed, at least in Haskell, to sometimes make type inference more problematic.

http://blog.ezyang.com/2014/08/whats-a-module-system-good-for-anyway/
http://blog.ezyang.com/2017/08/backpack-for-deep-learning/

In an up-thread post, I had also linked to a Rust forum discussion of claimed advantages for parametrised modules:

I do think it would be nice, in addition to typeclasses, to also have parametrised module scopes for sharing the parametrisation of a group of functions, so they can be modularized as a group scope instead of at every call site. And I think I would like this to be (or at least have the option to be) applicative so that order of imports doesn’t matter. But I’m thinking it would be sufficient to parametrise them from the top down with input parameters, instead of abstract types.

That Rust OP admits the parametrised modularity can be achieved with typeclasses:

This is a fine way to solve the problem, but there are some quirks. First, we always need a value to call methods on; there is no way to say once that some functions are in scope and then use them thereafter because there is state that is carried around. But that state is constant, so we lose the benefit of referring to globals implicity compared to having “free” constants not tied to a structure. Furthermore, we have changed the shape of our code considerably. When we were programming with a module, we wrote free functions that lived in some space and imported them as needed. Now, we are writing functions that live in an inherent impl, we have to construct values to hold our configuration, and we must refer to a “self.0” throughout the methods anywhere we try to use the configuration. This is a loss of tersity.

My opinion is we should move forward first with only typeclasses and my variant of the proposal to forsake global canonicity. And see if we really encounter use cases that require signature parametrised mixin modules. If we can avoid adding an overlapping paradigm for modularity, that might be best.

Either way, we’ll still need non-parametrised (aka “weak”) modules for encapsulation and reuse.

@jdegoes argued against fine grained modularity.

EDIT: I have reverted my typeclass modularity proposal back to the “subtypeclasses” concept but with module name prefix on instance selection.


sighoya commented Aug 15, 2018

@shelby3 wrote:

So strong modules are modules parametrised on interface signatures.

This has some overlap with the functionality of typeclasses.

As you don't like multiple ways to solve a problem, why do you need signatures on top of typeclasses? Typeclasses should also be able to constrain module parameters, so why not use them?


keean commented Aug 15, 2018

@sighoya Type-classes provide specialisation (as distinct from generalisation, which is provided by parametric types). Modules generally provide separate compilation, data-hiding and specialisation. For this language it would seem best to limit modules to separate compilation and data-hiding. I guess the issue here is that Haskell already has these kinds of modules, and there are calls for it to implement something more full-featured like ML modules. The question is whether these calls are justified, or whether they are from people who don't understand how to implement what they want with typeclasses?


sighoya commented Aug 15, 2018

The question is whether these calls are justified, or whether they are from people that don't understand how to implement what they want with typeclasses?

This is a good question, but it was not my intention to question whether a module system is worthwhile for you. The point is that @shelby3 wrote that module parametrization should be constrained by signatures.
Why couldn't this be accomplished by typeclasses?


NodixBlockchain commented Aug 15, 2018

One thing I don't understand with typeclasses is the context of highly interactive applications, where the function to be executed on an object depends on global state exterior to the object.

For example, in Photoshop the effect of a click depends on which tool and option is selected.

With the OO paradigm it's easy to have a global class and instantiate the right class/module when global state changes, and runtime polymorphism can achieve this.

With typeclasses as they are done in Haskell, I have a harder time seeing how to achieve the same thing.

I can see how typeclasses can be useful for programs that are designed like linear algebra, always having the same function applied to an input, and designed rather as a monolithic thing.

Or typeclasses would need to be sort of 'first class', with a way to select the right set of functions to use depending on runtime state. I don't see how that would be possible if typeclass functions are monomorphized at compile time. With the dictionary approach it would seem more possible to load and use specific typeclass functions depending on runtime state.


shelby3 commented Aug 15, 2018

@sighoya wrote:

The point is that @shelby3 wrote that module parametrization should be constrained by signatures.
Why couldn't this be accomplished by typeclasses?

@keean he means typeclass bounds instead of signature bounds on modules. We would still retain typeclass bounds on functions also.

That’s a reasonable question, because my conjecture is that parametrised modules provide a less fine-grained grouping of parametric generalization, which provides some benefits such as less noisy repetition of typeclass bounds on function declarations. With monomorphisation, typeclass bounds on modules are equivalent to signature instantiation of modules.

I think you are correct. Application of typeclass bounds to instances doesn’t require function application.

Unless @keean objects, I will incorporate this into the proposed Zer0 grammar I’m close to completing.

Note @keean had mentioned using typeclasses as modules before.

Why didn’t Backpack think of this?

Ah, probably because of Haskell’s canonicity, which I have proposed to forsake. Canonicity is why I think they were forced to separate the module system into a different concept of signatures. Remember again that I recently identified “strong modularity” combined with existential quantification as the source of anti-modularity unsoundness, so my proposal to forsake canonicity has to avoid that combination. Haskell doesn’t have the non-disjoint, structural unions which I have proposed; unions may suffice and be superior for most of the use cases where we would otherwise use existential quantification (and the unions are bound to typeclass bounds at application to a function call site, not in the union object itself as is the case for existential quantification, so they side-step the unsoundness with strong modularity).

@shelby3
Author

shelby3 commented Aug 15, 2018

@NodixBlockchain typeclasses don’t force immutability. If you want to change the function pointer used for a callback, AFAICT there’s nothing in typeclasses preventing that.

Once again you are raising an issue in the wrong thread. This is the Modularity thread. You should have asked that question in, for example, the Why Typeclasses issue thread.

Please try not to thread jack and take threads off topic. I want to be able to find relevant discussion. I would be more than happy to discuss this topic with you if you raise it in a thread that is more related to the topic you want to discuss.

@shelby3
Author

shelby3 commented Aug 16, 2018

I have reverted my typeclass modularity proposal back to the “subtypeclasses” concept but with module name prefix on instance selection.

That linked post explains that our modularity ideas for typeclasses have an issue with existential quantification, i.e. runtime polymorphism other than bounded-union dynamic polymorphism.

@shelby3
Author

shelby3 commented Aug 29, 2018

The reasons why associated (aka abstract) types of typeclasses can’t be replaced by HKT and multi-parameter typeclasses:

  1. Associated types are output types of instances, thus putting instances in control instead of the environment of the function/procedure application/call site. HKT and multi-parameter typeclasses are inputs set by the environment. Thus associated types are generally necessary to invert the control over the types. Someone wrote:

    Note: I am aware that one can currently express type synonyms in much the same way that one does in Java (or C#, or Eiffel; see Garcia et al paper): by making them a type-parameter of the trait itself. Rust's type-inference system does make that workaround more palatable, but it seems natural to enable the developer to express what they mean more directly: These types are obligations of the implementation, much like the functions declared in the trait, and thus it makes sense to express them the same way.

  2. A motivating example (c.f. also) of multidimensional slices.

  3. The typeclass may be parametric over a type which a data type implementing it has hard-coded to a specific type, e.g. the element type of a fold or sort interface that the data type implements (see the sketch after this list).

    For example, implementing the map interface on a String requires knowing that the callback function takes a Char, but String has no type parameter Char. Compare how clumsy this is to implement in an OOP paradigm without typeclasses (and thus without associated types).

  4. The original research paper for associated data types noted in §2.1 Self-optimising finite maps on pg. 2 that they provide a way for the instance to dictate an optimized data structure:

    Such maps change their representation in dependence on the structure of the key type k used to index the map. We express this idea by defining a type class MapKey, parameterised by the key type, with an associated type Map as one of its components:

    class MapKey k where
       data Map k v
       empty :: Map k v
       lookup :: k -> Map k v -> v
    
    instance MapKey Int where
       data Map Int v = MapInt (Patricia.Dict v)
       empty = MapInt Patricia.emptyDict
       lookup k (MapInt d) = Patricia.lookupDict k d
  5. And a research paper noted on pg. 54:

    Explicitly specifying type arguments that represent associated types also leads to the introduction of unnecessary implementation dependencies. As explained above, four of the type arguments in the call to the breadth-first search algorithm in Figure 12 are associated types of the first type argument, the graph type. These types may represent internal implementation details of the graph type. At each call to the algorithm, however, these types must be explicitly specified. Consequently, changes that should be just implementation details in a data structure require changes to user code at the call sites of generic algorithms that pass that data structure as an argument.

  6. Associated types are necessary when hidden (helper) parameters are needed.
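
As a sketch of point 3 above (Mappable, Elem and cmap are hypothetical names, roughly what libraries such as mono-traversable provide), an associated type lets String expose its hard-coded element type:

{-# LANGUAGE TypeFamilies, FlexibleInstances #-}

class Mappable c where
  type Elem c
  cmap :: (Elem c -> Elem c) -> c -> c

-- String has no element type parameter, but the instance can still
-- declare that its element type is Char.
instance Mappable String where
  type Elem String = Char
  cmap = map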

@shelby3
Author

shelby3 commented Aug 30, 2018

Important post about default typeclass instances, overlapping problems, and alternative solutions that apply to modules.

@shelby3
Author

shelby3 commented Jan 13, 2019

  1. The typeclass may be parametric over a type which a data type implementing it has hard-coded to a specific type, e.g. the element type of a fold or sort interface that the data type implements.

For example, implementing the map interface on a String requires knowing that the callback function takes a Char, but String has no type parameter Char. Compare how clumsy this is to implement in an OOP paradigm without typeclasses (and thus without associated types).

On further thought, that clumsiness is due to the incorrect design of map in that it returns the same type as the input collection.

Scala’s (arguably Martin Odersky’s) corner-case laden “solution” was affectionately referred to as The Longest Suicide Note Ever Written.

A better solution seems to be having map always output a sequence, or an iterator for a sequence, and offering a fromSequence type-class for types that can be mapped. For performance reasons, I don’t agree with making map input a sequence. Instead make it a typeclass and provide an implementation for each input collection type.
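
A rough Haskell sketch of that design (ToSequence, FromSequence and mapVia are all hypothetical names): map round-trips through a plain list, and fromSequence rebuilds the concrete collection. Note this deliberately avoids associated types, using a multi-parameter class instead.

{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

class ToSequence c e where
  toSequence :: c -> [e]

class FromSequence c e where
  fromSequence :: [e] -> c

-- Map by converting to a sequence and rebuilding the target collection.
mapVia :: (ToSequence a e, FromSequence b f) => (e -> f) -> a -> b
mapVia g = fromSequence . map g . toSequence

instance ToSequence String Char where
  toSequence = id

instance FromSequence String Char where
  fromSequence = id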

So then we really don’t necessarily require associated types for handling this case. We could define strings to be class StringT<T extends Char> and type String = StringT<Char>. That is more fugly though than employing an associated type. Probably want to define Char to be wide enough to encode all Unicode glyphs.

@shelby3
Author

shelby3 commented Jan 13, 2019

On further thought (and I realize Microsoft broke my ability to edit my prior comment post, or @keean removed my privilege to do so?), we don’t have to incur the inefficiency of forcing map to output a sequence type when we really intend to invoke fromSequence anyway. Simply declare the map type-class as a two-parameter type-class that also inputs the type of its output. For unsupported output type combinations the type-class instance will not be provided, and thus the compiler will generate an error.
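
A minimal sketch of that two-parameter formulation (MapTo and mapTo are stand-in names):

{-# LANGUAGE MultiParamTypeClasses #-}

-- The output collection g is a class parameter, so input/output pairs
-- without an instance are rejected at compile time.
class MapTo f g where
  mapTo :: (a -> b) -> f a -> g b

instance MapTo [] [] where
  mapTo = map

-- There is no instance MapTo [] Maybe, so e.g.
--   mapTo (+ 1) [1, 2] :: Maybe Int
-- fails to compile, as intended.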

@keean
Owner

keean commented Jan 13, 2019

@shelby3 no changes in permissions from me, but maybe the defaults have changed?

@shelby3
Author

shelby3 commented Jan 13, 2019

I will try accessing from the laptop with newer versions of all browsers including IE, and report if I have the same problem.

@keean
Owner

keean commented Jan 13, 2019

@shelby3 IE is not a good browser, even Microsoft realised this and tried to fix it in Edge. Edge is better if you have Windows 10. I used to really like Chrome, but these days I tend to use Firefox. Why not install Firefox?

@shelby3
Author

shelby3 commented Jan 13, 2019

I meant Edge (not IE) on my 2017 model ASUS laptop running Windows 10. :shudder:

I hate Microsoft so much I do not even bother to remember the names of any of their new sh8t.

@shelby3
Author

shelby3 commented Jun 10, 2020

Late Binding of Interface vs. Modularity?

I wrote:

I have reverted my typeclass modularity proposal back to the “subtypeclasses” concept but with module name prefix on instance selection.

That linked post explains that our modularity ideas for typeclasses have an issue with existential quantification, i.e. runtime polymorphism other than bounded-union dynamic polymorphism.

I wrote:

Important post about default typeclass instances, overlapping problems, and alternative solutions that apply to modules.

I’ll re-summarize the issues in one place to make this easier to digest and come back to.

  1. Type classes: confluence, coherence and global uniqueness

    Global uniqueness of type class instances is required to ensure globally consistent semantics, aka an abstract algebra. Non-eager (i.e. not wild-goose-chase exhaustiveness checking) intra-module confluence (aka consistent type checking) and intra-module coherence (consistent intra-module semantics) don’t guarantee global (i.e. total-order, not just inter-module partial-order) semantic coherence.

  2. Open type families are not modular

    Open type families or associated types allow instances to create types, i.e. output types. Thus incoherent inter-module instances with open type families or associated types can segfault (see the first sketch after this list). Some heuristics are proposed to prevent inter-module incoherence (and enforce the uniqueness of instances required for the necessary inter-module partial-order coherence) without an eager guarantee of global, total-order uniqueness.

    The point is also made that global uniqueness and coherence sacrifice modularity. The suggestion is made that perhaps the culture of Haskell needs to change and forsake a globally consistent abstract algebra, by implication viewing type classes only as some spaghetti code rather than as enforcing the invariants of an API.

  3. http://blog.ezyang.com/2014/09/open-type-families-are-not-modular/#comment-20942

    Discussion of how the ML programming language differs (c.f. also the Type Inference Issues section below). AFAICS ML essentially delays the static binding of interface until construction of an object instance, as contrasted with OOP, which statically binds interface to the data type where (in the case of nominal typing) the data type is declared and its supertype is not replaceable after declaration (or, in the case of structural typing, where the interface type is declared, c.f. also). I had also shared my thoughts about what I believe to be the undesirable tradeoffs of the 1ML research.

    So the ML approach has the advantage of being able to operate on data in more than one way (e.g. different Ord sorting partial orders) by multiplexing partial orders defined by the interfaces attached to objects at construction time, analogous to the partial orders of OOP but with later, static binding of interface. (Note each consistent set of interfaces that extends a common ML signature is an abstract algebra defining one such partial order.) Yet ML has the disadvantage that an object can only be operated on in one way after the binding of interface upon construction of said object instance. Whereas (without some dubious extensions) Haskell can only ever operate on objects in one consistent way, with a presumed (but not intractably compiler-enforced) globally consistent coherence aka global abstract algebra, Haskell has the arguably pyrrhic “advantage” of delaying the static binding of an object instance’s interface until the function call site. We can construct objects, then elsewhere in our program add them to collections or unions, and bind the interface invariant to the (collection’s or union’s) generic type parameter at function call sites. Whereas AFAICS ML could only plausibly (hypothetically) allow adding more interfaces to an already constructed object by changing its type, which would require an exclusive reference (a feature ML doesn’t have) in order to consume the prior type.

    Similar to OOP, Haskell also offers early, static binding of interface, known as existential quantification aka Rust’s trait objects. And like OOP, these existentially quantified type class interface invariants are set statically when the existentially quantified data type is declared. But unlike OOP and more like ML, the implementations of these interfaces are delayed until the instance of the existentially quantified object is constructed; so when combined with generic type parameters, the analogous delay as in ML is achieved, except that the specificity of the type is erased by the generic invariant. ML is much more explicit about types, which some people find to be a disadvantage while others perhaps argue the resulting benefits are an advantage.

  4. I have attempted to devise proposals for merging late, static binding within partial orders of type class abstract algebras, to match ML’s capability while not binding to object instances at their construction as ML does. But any such proposal will not be fully interoperable with unbounded existentially quantified types, c.f. also. Unbounded existentially quantified types are also incompatible with multiparameter type classes (except where only 1 of the parameters is unbounded generic), so maybe I prefer to forsake them.

    Given that type class invariants declared on function definitions are bound at function call sites (as contrasted with ML which binds interface to object types upon object construction), then partial orderings for type classes should be based on partial orders of bindings at function call sites.

    To enforce partial orders of abstract algebras we need inheritance of signature in the type class context, analogous to ML, so I had proposed “subtypeclasses” (see the second sketch after this list). IOW, to form a consistent abstract algebra for ReverseOrd (synonymous with Backwards(OrdInt) in the aforelinked ML example), we need to name it as a partial order of Ord so that each data type can be implemented only once, and thus coherently, for each “subtypeclass” such as ReverseOrd, i.e. no overlapping instances for, and enforced global uniqueness of, the instances of each partial-order “subtypeclass”.

    Function definitions would require an Ord type class interface invariant, and function call sites could inject the additional invariant of ReverseOrd, so that modularity of specialization is achieved along with the latest binding of interface to objects at function call sites. This is not equivalent to placing the additional “subtypeclass” invariant on the type class interface requirements of the function definition that contains the said function call site(s), because the body (i.e. the implementation) of a function might contain a plurality of said function call sites, each with different specializations, e.g. ReverseOrd, ForwardOrd, AdditiveMonoid, MultiplicativeMonoid, etc.

    The challenge is the balance between being overly explicit and losing modularity by being implicit with “subtypeclasses”.

    1. Type classes with global canonicity (i.e. that have no “subtypeclasses”) don’t require lifting to explicit module parameters for noiseless, highly granular modularity and refactoring.

    2. Whereas, “subtypeclasses” should be explicit module parameters for maximum composability. Alternatively, some implicit-or-default model with explicit overrides would trade off some modularity specificity for less of an explosion in explicit module parameters. For example, import a module or a function from a module and declare that all Ord at function call sites in that module or specific function are to be ReverseOrd where not otherwise specialized at call sites. Obviously either way, minimizing “subtypeclass” use will minimize the explosion in explicit specificity.
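
As a sketch of point 2 above (the Key family and lookupByKey are hypothetical names):

{-# LANGUAGE TypeFamilies #-}

-- An open type family lets any module add equations.
type family Key c
type instance Key [v] = Int

-- If another module could also declare
--   type instance Key [v] = String
-- then code compiled against the two equations would disagree about the
-- representation behind Key [v]; that disagreement is the segfault risk,
-- and is why overlapping equations must be rejected globally.
lookupByKey :: Key [v] -> [v] -> v
lookupByKey i xs = xs !! i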

Note AFAICT Scala 3 is ostensibly not enabling said partial orders correctly.
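
And as a sketch of point 4: today’s closest Haskell analogue of a ReverseOrd “subtypeclass” is a newtype that flips the instance, as Data.Ord.Down does; the proposal would instead let the call site name the partial order directly.

import Data.List (sortOn)
import Data.Ord (Down(..))

-- Simulating ReverseOrd with a newtype today: Down reverses Ord.
descending :: Ord a => [a] -> [a]
descending = sortOn Down

-- Hypothetical "subtypeclass" selection at the call site instead:
--   sort @ReverseOrd xs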

Type Inference Issues

Yet another blog post, What’s a module system good for anyway?, also written by the co-developer of the Haskell modularity project Backpack, discusses the issues with abstracting over, for example, the numerous string types in Haskell (e.g. String, Text and ByteString) and claims that ML signatures with functors are superior to type classes:

Using type classes (3) to regain modularity may seem like an attractive approach. But this approach has both practical and theoretical problems. First and foremost, how do you choose which methods go into the type class? Ideally, you'd pick a minimal set, from which all other operations could be derived. However, many operations are most efficient when directly implemented, which leads to a bloated type class, and a rough time for other people who have their own string types and need to write their own instances. Second, type classes make your type signatures more ugly (String -> String to StringLike s => s -> s) and can make type inference more difficult (for example, by introducing ambiguity). Finally, the type class StringLike has a very different character from the type class Monad, which has a minimal set of operations and laws governing their operation. It is difficult (or impossible) to characterize what the laws of an interface like this should be. All-in-all, it's much less pleasant to program against type classes than concrete implementations.

Wouldn't it be nice if I could import String, giving me the type String and operations on it, but then later decide which concrete implementation I want to instantiate it with? This is something a [ML-like] module system [e.g. Backpack] can do for you!

[…]

If designing an effective and principled type class to capture what String, ByteString, and Text “have in common” is difficult, then why would designing a module signature to do the same thing be any easier? I think these are the same problem […]

Great question. I think if you wanted to design “one” module signature to capture string representations, you would certainly still have the same problem. However, in an Backpack or ML-like module system, it is a lot less expensive to have multiple module signatures than it is to have multiple type classes: to implement a new type class, you have to explicitly write an instance; but to implement a module signature, all you need is for the module to be set up in the right way.

In my mind they aren’t really the same problem, for a rather peculiar reason. Even though many of the functions in e.g. Data.Text can be easily written in terms of smaller components, nobody really thinks that users should have to do so themselves. Rather, users should be able to import Data.Text and have the full API available to them. However, nobody wants to use large monolithic typeclasses (like ListLike), and really nobody is interested in writing new instances of monolithic typeclasses. So while people are willing to write all the functions for the module, they aren’t willing to list them all again in a class interface. So hopefully there will be less resistance to using a module system like Backpack […]

[…]

In the end, I see ListLike and TextualMonoid as both essentially implementation details of the same idea, presenting slightly different interfaces. Both are typeclass-based, though, with all the baggage that brings. If Haskell had ML-style modules, most likely neither would have come into being.

https://www.reddit.com/r/haskell/comments/28v6c9/backpack_an_mllike_module_system_for_haskell/cierxc1

Among other things, it's very useful for library code where you don't want to decide on a concrete implementation for the user. For example, the functional graph library does this by having a Graph typeclass. This is important to support different usecases of the library.

Unfortunately, the typeclass-based solution to this problem is very awkward in practice, and messes up type inference consistently. Using it felt like I was fighting the system more than taking advantage of it. For things like this, a real module system would be a significant improvement.

What issues did you have with it? What made you feel like you were fighting it?

Mostly dealing with type inference, especially if I wanted my code to be generic over the graph implementation too. Ultimately, I just ended up using a concrete type in my code, which was unfortunate. I think there were some other minor annoyances, but it was a while ago so I don't remember exactly.

I discussed this with @keean and he seemed to concur with my point that Haskell’s type classes don’t have to be monolithic. Employ class inheritance to extend only those finely grained type classes required by each separate use case and automatically inherit the extant instance implementations, e.g.:

{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, FlexibleContexts, UndecidableInstances #-}

class Concat a b c where
  concat :: a -> b -> c

-- The empty subclass plus the catch-all instance select just the
-- finely grained capabilities this use case needs, inheriting every
-- extant Concat and Ord instance automatically.
class    (Concat a a a, Ord a) => ConcatOrd a
instance (Concat a a a, Ord a) => ConcatOrd a

Thus it appears that the claimed advantage for ML-like signatures and functors distills to some perceived advantage w.r.t. type inference.

Type inference ambiguity is the result of underconstrained type parametrisation. That is why in my Haskell example code above I set Concat a b c to (Concat a a a…, so that the compiler isn’t underconstrained as to which type to choose for c when the input arguments a and b are set to the same type parameter at the call site of concat a a. Yet if the programmer attempts to write code that is unconstrained by invoking/applying (aka “calling”) concat a b, then my definition of ConcatOrd a tells the compiler that a and b must be the same. Or in some cases the return type will be constrained at the call site to prevent ambiguities.
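
A runnable sketch of that point (Concat3 and cat are stand-in names, avoiding a clash with the Prelude’s concat):

{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, FlexibleContexts #-}

class Concat3 a b c where
  cat :: a -> b -> c

instance Concat3 [a] [a] [a] where
  cat = (++)

-- Underconstrained: c appears only in the result, so every call site
-- must pin it down with an annotation or the call is ambiguous.
loose :: Concat3 a a c => a -> c
loose x = cat x x

-- Constraining all three parameters to one variable removes that burden.
tight :: Concat3 a a a => a -> a
tight x = cat x x

main :: IO ()
main = do
  putStrLn (loose "ab" :: String)  -- annotation required
  putStrLn (tight "ab")            -- result type inferred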

So afaics (which was @keean’s point) what this distills to is that if a programmer wants to (or unwittingly does) write underconstrained polymorphism then the capability exists in these PLs to do so. I really don’t understand how those I quoted above can claim that a ML-like module system won’t also suffer type inference ambiguities if the programmer attempts to write underconstrained polymorphic code? I understand that ML functors enable selective choice of interface(s) from signatures, but afaics as I demonstrated by example code above, so does type class inheritance. Those ML signatures and resultant ML interfaces can still be underconstrained polymorphic if the programmer so chooses. I understand (as I explained in the previous section of this comment post) that ML binds those interfaces to the object at time of construction unlike type classes which delay binding interface to objects until the call site of a function which requires such interfaces.

Thus I conclude that the cited blog is a red herring. Note that experience often trumps theory in the PL realm, so perhaps I am missing something which hasn’t been clarified sufficiently for me (and @keean) in the cited blog — which is why I discussed it with @keean given he has much more experience with Haskell and ML-like languages than my nearly nil experience with those PLs.
