Fast track transpiling to TypeScript? #13

Open · shelby3 opened this issue Sep 24, 2016 · 142 comments

shelby3 commented Sep 24, 2016

Building off the reasoning and justification in my self-rejected issue Fast track transpiling to PureScript?, I have a new idea of how we might attain some of the main features proposed for ZenScript without building a type checker: by transpiling to TypeScript.

If, for the time being while using this hack, we presume that no modules will employ differing implementations of a specific data type for any specific typeclass, i.e. that all implementations for each data type are the same globally for each typeclass implemented (which we can check at run-time, throwing an exception otherwise), then each module can at load/import ensure that all the implementations it employs are set on the prototype chain of the respective classes' constructor functions. In other words, my original point was that JavaScript has global interface injection (a form of monkey patching) via the constructor function's prototype chain, and @svieira pointed out the potential for global naming (implementation) conflicts.

So the rest of the hack I have in mind is that in the emitted TypeScript we declare typeclasses as interfaces, and in each module we declare the implemented data types as classes with all of the implemented interfaces in the hierarchy. These classes then have the proper type wherever they are stated nominally in the module. We compile the modules separately in TypeScript, so each module can have differing declarations of the same class (because there is no type-checking linker); every module will type check independently, and the global prototype chain is assured to contain the interfaces that the TypeScript type system checks.

So each function argument that has a typeclass bound in our syntax will have the corresponding interface type in the emitted TypeScript code. Likewise, typeclass objects will simply have an interface type.

This appears to be a clever way of hacking through the type system to get the type checking we want along with the ability to have modules add implementations to existing data types with our typeclass syntax. And this hack requires no type checking in ZenScript. We need only a simple transformation for emitting from the AST.
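As a rough illustration of the shape of the emitted TypeScript described here (the names Show and Point are made up for this sketch, and the prototype-injection half of the hack is shown later in the thread):

// Typeclass `Show` emitted as an interface:
interface Show {
  show(): string
}

// Data type `Point`, declared in this module as implementing Show:
class Point implements Show {
  constructor(public x: number, public y: number) {}
  show(): string { return `(${this.x}, ${this.y})` }
}

// A function whose argument has a `Show` typeclass bound takes the interface type:
function print(s: Show): void {
  console.log(s.show())
}

print(new Point(1, 2)) // "(1, 2)"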

As for the first-class inferred structural unions, TypeScript already has them, so there is no type checking we need to do there.

It can't support my complex solution to the Expression Problem, but that is okay for a starting-point hack.

I think this is a way we can have a working language in a matter of weeks, if we can agree?

That will give us valuable experimentation feedback, while we can work on our own type checker and fully functional compiler.

TypeScript's bivariance unsoundness should be avoided, since ZenScript's semantics do not allow implicit subsumption of typeclass bounds, but this won't be checked, so it is possible that bivariance unsoundness could creep in if we allow typeclasses to extend other typeclasses. It seems those bivariance cases don't impact my hack negatively though:

When comparing the types of function parameters, assignment succeeds if either the source parameter is assignable to the target parameter, or vice versa.

Not a problem, because of course an interface argument type can never be assigned to any subclass.

When comparing functions for compatibility, optional and required parameters are interchangeable. Extra optional parameters of the source type are not an error, and optional parameters of the target type without corresponding parameters in the source type are not an error.

When a function has a rest parameter, it is treated as if it were an infinite series of optional parameters.

We should be able to design around this.

Also a compile flag can turn off TypeScript's unsound treatment of the any type.

TypeScript is structural but there is a hack to force it to emulate nominal typing. We could consider instead transpiling to N4JS which has nominal typing and soundness, but it is not as mature in other areas.


keean commented Sep 24, 2016

I had actually planned to start with no optimisation or type checking :-) Simply parse the source, discard the typing information and output the JavaScript. You can then already start experimenting with the syntax and writing simple programs, and you will simply get runtime errors (or bugs) in the JavaScript.

Type-classes are not that simple however, because they don't obviously associate with any given argument to the function. Consider a Cast type class that converts a value of type A to type B, and a function that uses it:

fn f<A, B>(x : A, y : B) where Cast<A, B> =>
    let x_as_B : B = cast(x)
    ...

(Aside: we need to decide on where the type-class requirements list is going to go in function definitions.)

Cast does not provide methods on A or B, but provides independent functions (as all type classes do).

What we actually want to do is turn the type-class into a Record data type, turn the implementation into a value of the type of that Record, and then add an extra parameter to the function. The JavaScript would be something like:

var cast_A_B = {
    cast : function(x) { return x; } // whatever casting function
 }

function f(x, y, cast_class) {
    var x_as_B = cast_class.cast(x)
    ...
}

Note there is no representation of the type-class itself in the JavaScript, it just has the implementations.

In the future we can monomorphise and inline the typeclass function and remove the extra argument where appropriate, but that can be viewed as an optimisation.
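For comparison, a hedged TypeScript rendering of the same dictionary-passing idea (the types and names here are illustrative, not a fixed design):

// The typeclass becomes a record type:
interface Cast<A, B> {
  cast(x: A): B
}

// An implementation becomes a value of that record type:
const cast_number_string: Cast<number, string> = {
  cast: (x) => x.toString()
}

// The `where Cast<A, B>` bound becomes an extra dictionary parameter:
function f<A, B>(x: A, y: B, cast_class: Cast<A, B>): void {
  const x_as_B: B = cast_class.cast(x)
  console.log(x_as_B, y)
}

f(42, "hello", cast_number_string)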


shelby3 commented Sep 24, 2016

@keean wrote:

Simply parse the source, discard the typing information and output the JavaScript.

If that were feasible, I would have created ZenScript a long time ago. Your plan is not feasible.

The compiler has to select the dictionary for the implementation of the data type which is passed in the source code, but there is no way for ZenScript to do that without also doing full type checking.

That is why I devised this hack described in the OP.

My hack transforms typeclassing into subclassing within the module, so that a subclassing type checker can check everything. And it requires no type checking in ZenScript. And the only downside I see is that it requires that modules don't provide conflicting implementations (which is damn easy for us to do now at this experimental stage).

So again I ask: do you want to accept my proposal that we first prioritize transpiling to TypeScript? I am probably going to do it if you won't. Afaics, there is no other fast-track way.

Type-classes are not that simple however, because they don't obviously associate with any given argument to the function.

fn f<A, B>(x : A, y : B) where Cast<A, B> =>

Obviously we can't do that in my hack, but we can do higher-kinds if our transpile target supports them:

fn f<A, B>(x : A<B>) where Cast<A<B>> =>

P.S. would you please stop writing this:

fn f<A, B>(x : A, y : B)

Since (for the 5th time I will state that) I presume we've decided (since I've seen no objections) on the following.

let f(x : A, y : B)

Or:

let f = (x : A, y : B)

We unified around let rather than have an unnecessary fn along with let. And we made type parameters ALLCAPS.


shelby3 commented Sep 24, 2016

@shelby wrote:

  1. Sum, Recursive, Record, and Product parametrized data type with optional Record member names and optional Record (e.g. Cons) and Product (e.g. MetaData) names:

    data List<T> = MetaData(Nil | Cons{head: T, tail: List<T>}, {meta: Meta<T>})

Thinking about how to emit that in a language with classes and tuples. We also need to get the declaration of covariance or contravariance for type parameters correct. Also, where is mutability declared above?

sealed interface NilOrCons<T>   // or `abstract class`, note sealed may not be available but that is ok
object Nil implements NilOrCons<Bottom> // or `class` and instantiate a singleton instance
sealed class Cons<T> implements NilOrCons<T> {
  var head, tail
  constructor(head: T, tail: NilOrCons<T>) {
    this.head = head
    this.tail = tail
  }
}
sealed interface List<T>
sealed interface Meta<T>
sealed interface Recordmeta<T> {
  var meta
  constructor(meta: Meta<T>) {
    this.meta = meta
  }
}
sealed class MetaData<T> extends Tuple2<NilOrCons<T>, Recordmeta<T>> implements List<T>

Makes one realize the unified data syntax is much more succinct! 😎


keean commented Sep 24, 2016

Nil should have the same type as Cons. The reason is we do not know how long the list is at runtime, so how do we type check:

let f<A>(list : List<A>) requires Show<A> =>
    while list /= Nil :
        show(list.head)
        list = list.tail


keean commented Sep 24, 2016

Please no all caps, it's like shouting in my text editor :-(

ML prefixes type variables with a character (happens to be ' but could be anything)

Also I note you want to get rid of the type parameter list; you do realise sometimes you need to explicitly give the parameters if they cannot be inferred, like:

let x = f<Int>(some_list)

This also makes me realise another ambiguity in our syntax: it is hard to tell the difference between assigning the result of a function and assigning the function itself as a value. In the above you might have to parse to the end of the line to see if there is a '=>' at the end. This requires unlimited backtracking, which you said you wanted to avoid.


shelby3 commented Sep 24, 2016

@keean's prior comment has a response. Where reasonable to do so, let's try to keep the syntax discussions in their own issue thread. I started the tangent in this case. 😯


shelby3 commented Sep 24, 2016

@keean wrote:

Nil should have the same type as Cons. The reason is we do not know how long the list is at runtime, so how do we type check:

let f(list : List<A>) requires Show<A> =>
    while (list != Nil)
        show(list.head)
        list = list.tail

Disagree. And nearly certain you are incorrect.

Assuming:

data List<T> = Nil | Cons{head: T, tail: List<T>}

The list != Nil is a guard that ensures list is a Cons in the guarded code block, because otherwise list.head and list.tail would be a compiler error on a polymorphic type List<A>. This is yet another example where you are misapplying your Haskell mindset to a subtyping language. TypeScript already has these guards in its type checker.

But the guard doesn't change the type of list (we can't change the type of a reference after it is constructed); it only specializes it where necessary. Thus the type of list remains List<A>.
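For reference, a minimal TypeScript sketch of that kind of guard, written here with a discriminated union rather than a != Nil comparison (the encoding is illustrative, not a proposed emission):

interface Nil { tag: "nil" }
interface Cons<A> { tag: "cons"; head: A; tail: List<A> }
type List<A> = Nil | Cons<A>
const Nil: Nil = { tag: "nil" }

function showAll<A>(list: List<A>, show: (x: A) => string): void {
  while (list.tag === "cons") {  // guard: list is narrowed to Cons<A> in the body
    console.log(show(list.head)) // head/tail are only legal under the guard
    list = list.tail             // the declared type of `list` is still List<A>
  }
}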

I don't think you should attempt the type checker as our first step. We should gain experience with TypeScript's type checker first, before attempting our own. After that, everything will be clearer to both of us, so we don't commit major blunders on the type checker and lose precious time.


keean commented Sep 24, 2016

So what you are saying is that List<Bottom> <: List<A> where A is any type other than bottom. However Nil cannot have a different type without dependent types as we do not know where the end of the list is until runtime.


shelby3 commented Sep 24, 2016

@keean wrote:

So what you are saying is that List<Bottom> <: List<A>

Yes if by <: you mean subtypeof.

where A is any type other than bottom.

Disagree. List<Bottom> is a subtype of itself. Bottom being a subtype of itself doesn't mean nor necessitate that it can be instantiated (quantified).

However Nil cannot have a different type without dependent types as we do not know where the end of the list is until runtime.

Different type from what? And what is the problem you see? I don't see any problem.

The list != Nil is a run-time test. The types match because Nil is subsumable to a List<A>, and the != is defined for List<A> (actually List<_>, unless we want deep comparison of all elements), not List<Bottom>. You forgot the trait bound for the != operation, although maybe we can make that an implicit requires?

 let f(list : List<A>) requires Show<A>, Eq<List<_>> =>
     while (list != Nil)
         show(list.head)
         list = list.tail

Where _ means not used.

Generally though, I am thinking we will need higher-kinds.


keean commented Sep 24, 2016

In the case of the above datatype Eq<List<A>> would be the correct trait bound.

In the above Nil would not be typed List<Bottom> as there is enough typing information to determine that Nil has the type List<A>, we can tell that because the arguments of /= must have the same type.

In other words we assign Nil the type List<B = Bottom> where B is a fresh unused type variable; then we know list has the type List<A> where A is the type in the function type. We then know the types of the arguments of /= must be the same, so we have to promote Nil to type List<B = A>.


shelby3 commented Sep 24, 2016

What the heck is /= in this context? I thought it was a typo and changed it to !=.


keean commented Sep 24, 2016

Do you prefer !=? I have programmed in lots of languages, so I know a few: !=, /=, =/=, <>, etc.


shelby3 commented Sep 24, 2016

For me and everybody I know who uses a mainstream programming language, /= means divide-equals.

We are targeting JavaScript. The / means divide.


shelby3 commented Sep 24, 2016

@keean wrote:

In the case of the above datatype Eq<List<A>> would be the correct trait bound.

In the above Nil would not be typed List<Bottom> as there is enough typing information to determine that Nil has the type List<A>, we can tell that because the arguments of /= must have the same type.

In other words we assign Nil the type List<B = Bottom> where B is a fresh unused type variable, then we know list has the type List<A> where A is the type in the function type. We then know the types of /= must be the same so we have to promote Nil to type List<B = A>

This is hopeless. You can stop thinking in terms of Haskell's lack of subtyping and subsumption.

Mythical Man Month effect is taking over.

Nil remains List<Bottom>. Its type can't change. It is subsumed as necessary to type check separately in each expression where it appears.


keean commented Sep 24, 2016

Ada uses /= :-) It comes from the Pascal family of languages.


keean commented Sep 24, 2016

@shelby3 wrote:

This is hopeless. You can stop thinking in terms of Haskell's lack of subtyping and subsumption.

I think we should avoid subtyping and subsumption, except specifically for union types where we have set-based subsumption like Int | String :> Int

Edit: Remember we agree on how bottom and top should behave, so its probably just a communication problem :-)


shelby3 commented Sep 24, 2016

@keean I need to go jogging and buy prepaid load for my mobile phone. I suggest we focus on the syntax for now, and that you experiment with output to TypeScript (or Ceylon) to get a feel for how these type systems work with subtyping as we do this transform of our simplest typeclasses.

It is too much verbiage to discuss the typing now.

We can't target Scala as it doesn't have the unions.


keean commented Sep 24, 2016

There's a reason I don't like TypeScript or Ceylon :-)

Anyway above I was thinking about bidirectional type inference, which was a mistake, as I actually want things to be compositional, which would align better with what you want.

My point about not losing track of type variables, and needing more than one bottom in something like Pair<A, B> still stands though.


shelby3 commented Sep 24, 2016

Finished jog. Very fast. Headed to buy load.

Please let us put typing strategies aside for now and focus on the syntax and transpiling to TypeScript. This is what I am going to do. I need a working language yesterday. I don't have time to wait for the perfect language. Sorry.

Let's focus on what we can do quickly.

The perfect language will come incrementally.


keean commented Sep 24, 2016

Then we are stuck with TypeScript's unsound type system?


shelby3 commented Sep 24, 2016

@keean wrote:

Then we are stuck with TypeScript's unsound type system?

I documented upthread why I thought we could avoid the aspects that are unsound. If not, we could try N4JS which the authors claim is sound.

The transformation to subclassing will lose some of the degrees of freedom that we get with, for example, multi-parameter typeclasses, but we can start with single-parameter ones, so at least I can start experimenting with the syntax and the typeclass concept.

You may not agree with this direction because you may think it is not correct to model with a subtyping type checker. I think we will end up with a subtyping type checker eventually anyway, and that you will eventually realize this. If I am wrong, then I am wrong. You think you know, but I doubt either of us can know for sure yet.

But what are your options? You can't transpile to PureScript because it doesn't allow impurity. So your only option is to build the entire type checker. But that will take months to get to a working, stable, usable state. And there is a non-zero risk that you would realize along the way that your design was incorrect and full of corner-case issues. Then you'd start over again.

So what alternative do you propose?

We both agree that we need an imperative, typeclass language. It seems we mostly agree on the syntax. Our major conflict if any is on implementation and which strategy to prioritize.


shelby3 commented Sep 24, 2016

Then we are stuck with TypeScript's unsound type system?

http://www.brandonbloom.name/blog/2014/01/08/unsound-and-incomplete/


shelby3 commented Sep 25, 2016

Thinking about whether we really need higher kinds, because TypeScript apparently doesn't have the feature and Ceylon's may or may not be adequate. Both of them offer first-class unions, but TypeScript's guards may require less boilerplate (Edit: Ceylon has flow typing, aka typeof guards), we can access the prototype in native code, and presumably with high certainty TypeScript has more bijective JS emission since it is a minimal superset. Scala offers higher kinds (and simulation of traits with implicits), plus compilation to Scala.js, but no first-class unions.

A Monoid is an example of an interface that needs both the type of the datatype being implemented and the element type of said datatype.

interface Monoid<A, SELF extends Monoid<A, SELF>> {
    identity(): SELF
    append(x: A): SELF
}

abstract class List<A> implements Monoid<A, List<A>> {
    identity(): List<A> { return Nil }
    append(x:A): List<A> { return new Cons(x, this) }
}
class _Nil extends List<never> {}
const Nil = new _Nil
class Cons<A> extends List<A> {
    head: A
    tail: List<A> 
    constructor(head: A, tail: List<A>) {
        super()
        this.head = head
        this.tail = tail
    } 
}

I thought the above code was higher-kinded, but I guess not, because nowhere did I write SELF<A>, and thus the compiler isn't enforcing that A is a type parameter of SELF's type constructor. And it compiles and assumes that an implementation of Monoid for Nil and Cons<A> will always be the same as the implementation for List<A>, which is why I added run-time exceptions to assert the invariants the compiler can't check. It even compiles if we elide the types we couldn't know, in order to transpile without any type checking.

Whereas the following attempt didn't compile, because it doesn't make that assumption.

interface Monoid<A> {
    identity(): this
    append(x: A): this
}

abstract class List<A> implements Monoid<A> {
    identity(): this { return <this>Nil }
    append(x:A): this { return <this>(new Cons(x, this)) } // Error: Type 'Cons<A>' cannot be converted to type 'this'.\nthis: this
}
class _Nil extends List<never> {}
const Nil = new _Nil
class Cons<A> extends List<A> {
    head: A
    tail: List<A> 
    constructor(head: A, tail: List<A>) {
        super()
        this.head = head
        this.tail = tail
    } 
}

And the following attempt also didn't compile, because it didn't make that assumption.

interface Monoid<A> {
    identity(): this
    append(x: A): this
}

abstract class List<A> implements Monoid<A> { // Error: Class 'List<A>' incorrectly implements interface 'Monoid<A>'.\n  Types of property 'identity' are incompatible.\n    Type '() => List<A>' is not assignable to type '() => this'.\n      Type 'List<A>' is not assignable to type 'this'.
    identity(): List<A> { return Nil }
    append(x:A): List<A> { return new Cons(x, this) }
}
class _Nil extends List<never> {}
const Nil = new _Nil
class Cons<A> extends List<A> {
    head: A
    tail: List<A> 
    constructor(head: A, tail: List<A>) {
        super()
        this.head = head
        this.tail = tail
    } 
}


keean commented Sep 25, 2016

Well, both of those are having to bend a monoid to fit their object-only systems (see my post on objects or no objects).

A monoid is simply this:

pluggable Monoid<A> :
    mempty : A
    mappend(A, A) : A

Note that in a non-object system it has no self-reference and no need for higher kinds; mempty and mappend are both free functions, like string literals and string concatenation. So using that example:

String implements Monoid :
    mempty = ""
    mappend = +

And we can use it like this:

let x = mappend(mempty, mempty)

Note mappend and mempty are just functions, not methods, just like "" and + are not methods. In fact my proposal for "no objects" would mean no methods at all: you simply create a data type and implement the type-classes you want on it. Namespacing is handled by modules.
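A hedged TypeScript sketch of that dictionary-of-free-functions view (the names are illustrative and not from the original comment):

// The type-class as a record of free functions:
interface Monoid<A> {
  mempty: A
  mappend(x: A, y: A): A
}

// `String implements Monoid` becomes a dictionary value:
const stringMonoid: Monoid<string> = {
  mempty: "",
  mappend: (x, y) => x + y
}

// `let x = mappend(mempty, mempty)` becomes dictionary-passing:
function mconcat<A>(m: Monoid<A>, xs: A[]): A {
  return xs.reduce(m.mappend, m.mempty)
}

console.log(mconcat(stringMonoid, ["a", "b", "c"])) // "abc"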

Regarding why your implementation did not work, try:

interface Monoid {
    identity(): this
    append(x: this): this
}


shelby3 commented Sep 25, 2016

Regarding why your implementation did not work, try:

interface Monoid {
    identity(): this
    append(x: this): this

I tried but it won't type check and Monoid has incompatible semantics with a list.

Readers see also the related discussion in the Higher-kinds thread.


keean commented May 24, 2017

That is my solution to the Expression Problem which I explained for the first time on the Rust thread and then clarified more in your Issues threads. So thus my idea had prior art.

Indeed, I met Jeremy at a workshop on "Datatype Generic Programming" at Oxford University back around 2004 around when the HList paper was published. The HList paper was all about type-indexed-types, and I could see how this enabled some really elegant programming, however trying to implement this in Haskell was fighting an uphill battle, because it was not the focus of the language nor its design committee.


shelby3 commented May 24, 2017

@keean I want to work with you. But I am afraid we are going to repeat the same problem in our language to each other. We always do. It is difficult for you or me to change our personality or mannerisms. Maybe I could try a new tactic? Whenever you write something that offends me, I could quote it and write "could you please rephrase that for me?". But if you reply with some obstinate insistence that "wrong is wrong", then we would not have made any progress on resolving it. If I instead try to just ignore it, it does not work, because it accumulates. I really think a person's personality is take it or leave it. You are very knowledgeable. It is a shame if our personalities clash.

I do not have enough precision in my work on this to possibly satisfy your precision. I gain precision over time. When you talk with those other experts, they have specialized in this field with PhDs, and they speak your language with high precision. We are perhaps mismatched.


shelby3 commented May 24, 2017

I could see how this enabled some really elegant programming, however trying to implement this in Haskell was fighting an uphill battle, because it was not the focus of the language nor its design committee.

If I do complete a simplified language, it will not hit every aspect of what you were thinking of doing, at least not at the outset. But over time it is likely to trend towards something closer to what you want. In any case, it will at least have some things you want (e.g. typeclasses), so it would be a first step towards experimentation to see what works well and what does not work well.

As you know my priority right now is on compatibility with transpiling to TypeScript and simplifying as much as possible.

I want to start the LL(k) grammar tomorrow. If I can't make this happen quickly, then I really need to abandon and put it on the back burner. So my available time for talking about design decisions has expired. If there are any more points, they need to be made immediately.


shelby3 commented May 24, 2017

It is the disjointedness that prevents flattening.

The nesting of Either in Left and Right can not be flattened to the union of the wrapped types because of the wrapping.

I do not yet clearly see how to reconcile that with your statement, but we may just have different conceptualizations of the terminology, so I am not sure if what I am thinking about is the same as what you are thinking about. Afaics, the disjointedness seems unrelated to flattening, i.e. (A & B) | (B & C) does not prevent any logical flattening. Whereas Either[Either[A,B], Either[B,C]] cannot be flattened (from 2 options with 2 options for each parent option) to the 3-option choice A | B | C.

@shelby3 wrote:

Now I get another hint as to why first-class intersection types can do strange things to a type system.

I do not see how to implement typeclasses for intersection types without having to implement every permutation manually (which is ridiculous and unacceptable). Thus I think I am pretty much decided not to allow intersection types (the programmer can use a nominal product type instead). For unions we can dispatch on the tagged option dynamically, i.e. all possibilities are covered by implementations of the typeclass for each possible option. Optionally the programmer can choose to implement a set of options, e.g. for Branch | Leaf, and the compiler will of course choose the best-fit implementation.

For Sum types with options as values, those values cannot be implemented for typeclasses individually, because they are not types. So instead we implement the entire Sum type for the typeclass, and then we must always use type-case logic, versus the aforementioned optional design pattern for tags-as-types. Thus options as types (tags-as-types, aka type-indexed types) seem to interoperate better with typeclasses.

The type-indexed-types seem to have more flexibility than the tags-as-values (aka Sum types) form of co-products. We lose the origin on the Lambda Cube but we gain the ability to not double-tag JavaScript types and the other advantages enumerated.

I am still wondering if there are any tradeoffs other than the loss of global and local inferencing already mentioned.


keean commented May 25, 2017

Have we come to agreement on nomenclature? It seems you are looking at 'tagged unions' as opposed to 'disjoint tagged unions' (which are also known as sum types). This suggests 'non-disjoint sums' as a name that is consistent with the body of computer science publishing.

Regarding precision and working, I think it is important to focus on what is wrong. Your whole idea of 'non-disjoint sums' is not wrong; it was the naming. I suggest you focus on what specific part of what you said is wrong. Likewise I will try to be more precise about what is wrong. However, I only have a small brain, and if I am half focused on not saying "you're wrong", I won't have enough left to solve the problems, but I will do my best to avoid it.

I would appreciate if you could cut down on the ad-hominem attacks in return, which don't help further the discussion, and probably undermine your credibility with other readers.


keean commented May 25, 2017

For Sum types with options as values, those values cannot be implemented for typeclasses individually because they are not types. So instead we implement the entire Sum type for the typeclass, and then we must use type-case logic. Thus the options as types seem to interoperate better with typeclasses.

We don't need to use type-classes, because the 'case' statement and normal pattern matching work.

In the cases where we do, Haskell cannot do this, but datatype-generic programming allows type-classes to be declared using the structure of the types.

Using type-indexed coproducts might be a better solution if it were built into the language; this is what we were investigating in the HList paper. What I don't yet know is whether this mechanism is sufficient to write elegant, boilerplate-free programs on its own, or whether traditional sum types are needed as well. Really the only way to determine this is to create an experimental language with just the experimental feature (as design is more about what you leave out) and see what it is like to program with, without the other features.


keean commented May 25, 2017

I do not see how to implement typeclasses for intersection types without having to implement every permutation manually (which is ridiculous and unacceptable). For unions we can dispatch on the tagged option dynamically, i.e. all possibilities are covered by implementation of typeclasses for each possible option.

An intersection type represents a function that is a combination of other functions, like Int -> Int /\ Float -> Float; as such it is an alternative representation of a type class, or a module, but as a first-class value. So given:

f(x) => (x(3), x('a'))

we can easily infer the type: (Int -> A /\ String -> B) -> (A \/ B)

What if we passed the function as a dictionary (record):

f(dict) => (dict.x(3), dict.x('a'))

Now we can make dict an implicit argument:

f(implicit dict) => (dict.x(3), dict.x('a'))

So a type-class is kind of an implicit module, and a module is a nominally typed intersection type.
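A minimal TypeScript sketch of that inference (the dictionary shape and names are made up for illustration, and TypeScript needs the type arguments spelled out at the call site):

// The argument `x` has an intersection of function types: it must accept
// both a number and a string, mirroring (Int -> A /\ String -> B).
type XDict<A, B> = {
  x: ((n: number) => A) & ((s: string) => B)
}

// The record is the explicit ("module") form of the implicit dictionary.
function f<A, B>(dict: XDict<A, B>): [A, B] {
  return [dict.x(3), dict.x("a")]
}

// An overloaded function satisfies the intersection:
function impl(n: number): string
function impl(s: string): number
function impl(v: number | string): string | number {
  return typeof v === "number" ? String(v) : v.length
}

console.log(f<string, number>({ x: impl })) // ["3", 1]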

What does it mean to have a type-class of type classes? It seems like nonsense, but in that case if our type system allows type-classes of types, and permits intersection types, does that mean it admits nonsense?

One argument to avoid intersection types is that if a 'typeclass' is something other than a 'type' then you cannot create nonsense, because typeclasses only range over types, and not over themselves.

This seems vaguely similar to set theory, and the Russell paradox. If we use intersection types, we are collapsing everything into one level, like set theory, and that probably means there are problems (in fact we know it is incomplete and some unifications never terminate). Type classes have a stratification that prevents this, and also limits us to one meta-level (that is, programs that create programs, not an infinite stack of programs creating programs).


keean commented May 25, 2017

I think to work out what we want regarding intersection type, you have to consider the bottom level, what is the data and what is the memory layout.

Simple values like 'Int' and 'Float' are easy. Structs make sense.

Union types do not make sense, because we cannot interpret a word from memory if we do not know whether it is a Float or an Int. We need some 'program' to interpret the data based on some 'tag' that is stored elsewhere in memory. In other words 'unions' are not a primitive type to the CPU.

Intersection types do not make sense either. You do not have machine code functions that can cope with Ints or Floats. You can have generic machine code, for example 'memcpy' can copy an object of any type as long as we know its length. In other words pointers are better modelled as universally quantified types than intersection types.

Structs (objects with properties) each of which can be typed make sense.

Type-classes make sense, where we know the type of something we can select the correct method to use.

All this assumes an 'unboxed' world like that of the CPU.

Another way to think about this is that the computer memory is filled with bits like

11001010101011001100101101000011

To interpret these bits we need to know how they are encoded. Static types represent the implicit knowledge of how they are encoded based on their location (the static refers to lack of motion, hence fixed location).

To decode dynamic types, we need to know how to interpret the encoding of the type, so we run into the question: what is the type of a type? To avoid the Russell paradox we cannot answer "the type of a type is type", so we need something else. My answer to this is that 'tags' encode runtime type information, and the 'sum' datatype is the type of the tag + its data.
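A short TypeScript sketch of that "tag + its data" encoding (the particular tags are invented for illustration):

// The runtime representation of the sum is a tag field plus its payload:
type Encoded =
  | { tag: "int";   value: number }
  | { tag: "float"; value: number }
  | { tag: "text";  value: string }

function decode(e: Encoded): string {
  // The tag tells us how to interpret the data that follows it.
  switch (e.tag) {
    case "int":   return `int ${e.value}`
    case "float": return `float ${e.value}`
    case "text":  return `text "${e.value}"`
  }
}

console.log(decode({ tag: "text", value: "hi" })) // text "hi"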


keean commented May 25, 2017

To make union types make sense we would need a standard encoding for all types. This means we must have at least partial structural typing.

The primitive types we need to start with would be:

  • unsigned int 8, 16, 32, 64
  • signed int 8, 16, 32, 64
  • float 32, 64

Maybe some bigger ones for future-proofing, and then some vector types for SIMD:

  • unsigned 2x64, 4x32, 8x16, 16x8 (possibly others)
  • signed 2x64, 4x32, 8x16, 16x8 (possibly others)
  • float 2x64, 4x32

That would do for basic types. Unlike static typing, where we can represent these in the compiler, we need these to actually be written to memory for use at runtime.

We then need an encoding for objects (which are tagged product types) and unions.

For your solution to work you will need to work all these codings out and include them in the language runtime.

My solution above only allows static types based on location, and the implementation of unions would be layered on top of this, implemented as a DSL library in the new language.


shelby3 commented May 25, 2017

@keean wrote:

This suggests 'non-disjoint sums' as a name that is consistent with the body of computer science publishing.

Well, that suggests my original understanding that in math a 'sum' is just a co-product (as I cited, Filinski in 1989 wrote about sums versus products in the field of computer science). That is why I just assumed that the popular term Sum type applied as a general name for any co-product type. And I was thinking that Haskell's algebraic types are disjoint tagged unions, as one possible variant of a sum type. The mainstream sources do not seem to clarify this unambiguously. You cited some literature which was also somewhat vague on the historic delineation of the terminology.

That is why I have stated to you in private that I think "this is wrong" is not very productive. It is more productive to try to develop a very well-thought-out statement which recognizes what your colleague has correct and builds upon it, in a way that is not just some quip but rather imparts information to your colleague and readers. In that way, your colleague will not view it as an unproductive ego contest.

I would like to cite my recent post as an example of such a statement which does not bother with bluntly saying “you are wrong”:

@shelby3 wrote:

It is the disjointedness that prevents flattening.

The nesting of Either in Left and Right can not be flattened to the union of the wrapped types because of the wrapping.

I do not yet clearly see how to reconcile that with your statement, but we may just have different conceptualizations of the terminology, so I am not sure if what I am thinking about is the same as what you are thinking about. Afaics, the disjointedness seems unrelated to flattening, i.e. (A & B) | (B & C) does not prevent any logical flattening. Whereas Either[Either[A,B], Either[B,C]] cannot be flattened (from 2 options with 2 options for each parent option) to the 3-option choice A | B | C.


Regarding precision and working, I think it is important to focus on what is wrong.

Disagree. As I have stated in private, I think it is important to focus on stating what is correct. It is more than just a subtle difference. I am not saying to ignore correcting what is wrong. I am saying that it is very lazy to quip "that is wrong". To articulate well what is correct, and also recognize what your colleague has stated correctly, requires production and effort. I prefer to see production rather than cutting each other down. I get depressed when I see negative activity and I end up lashing out in return, because I sort of adjust to the way the people around me are. If people want to be negative, then after giving them every chance, if they insist, I let them have a boatload of negativity. But I have decided that in the future I will just put such people on Ignore. I realize that it is a flaw in my personality to argue with people who are negative.

I am a Cancer zodiac sign. This means the mood and ambiance of the workplace is very important for me to be productive. I like uplifting, positive people (but not in lack of substance way, i.e. I am not an air zodiac sign). I am an earth sign, meaning I need warmth of relations. I do not co-exist well with cold people. People do have different personalities and we just have to accept that not all personalities can mesh.

but I will do my best to avoid it.

I also have committed negative communication at times. Most recently I have been trying to apply the effort to be more careful about what I write in public. I will quote what I wrote to you in private:

@shelby3 wrote in private:

IMO, whereas if we explain something very well and very fairly explicitly recognizing the areas where the other is correct, and that other individual insists with their wrong claims and refusing to recognize or clearly refute with a better explanation, then that is the time we can put that troll on Ignore.


@keean wrote:

I would appreciate if you could cut down on the ad-hominem attacks in return, which don't help further the discussion, and probably undermine your credibility with other readers.

Yeah I agree not to let you or anyone else bait me into calling out and then lashing out when my pleas fail. As I explain below, I will learn to just walk away from situations that do not mesh.

I do not know if I can ignore it if you continue bluntly pointing out wrongs without developing well-reasoned, fair statements of what is correct, and thus we would slide back into the same flame wars.

As I wrote above as quoted from private, I decided recently that I just have to learn to Ignore such people entirely, meaning ending all communication with them. It does not mean that they are incapable of having a conversation with others. Different people have different levels of tolerance for communication styles.

Having said that, if I felt someone had a treasure map (i.e. could offer me extremely valuable information), I would probably bite my tongue and be exceedingly nice while they said anything they wanted to say about me or my ideas.

tl;dr: I do not think we solved the problem. So I will drastically reduce my interaction and try to be very judicious about the topics I discuss with you in public going forward (private communication is okay; you can say whatever you want there, lol). But I want to make it clear that in no way is this a statement of judgement about you. I am not accusing you of being wrong for your style of communication. Diversity of personalities makes the world a more fertile soil.


keean commented May 25, 2017

I do not know if I can ignore it if you continue bluntly pointing out wrongs without developing well-reasoned, fair statements of what is correct, and thus we would slide back into the same flame wars.

If I say an idea you post is wrong, there is no other useful response apart from trying to convince me you are right. I have not insulted you; I am merely stating an opinion about the correctness of an idea. You have no right to object to my opinion about your idea, or to be 'offended' by it. It is, after all, just my opinion. In fact I should be flattered you are offended by it, because that means you hold me in such high esteem that my opinion seems like a fact to you.

If you then escalate into ad-hominem attacks, it looks like you have no convincing arguments that the idea is in fact correct, and are trying to derail the discussion.


shelby3 commented May 25, 2017

@keean your reply seems to indicate to me (IMO) why the incongruence in our personalities and communication styles will never be resolved. I do not think you understand (or you disagree with) my point about how to be positive versus negative. But that is okay. We can move on. I will be very, very judicious about the topics I participate in from here forward.

Thank you for taking the time to respond on all issues including the meta ones.


shelby3 commented May 25, 2017

Really the only way to determine this is to create an experimental language with just the experimental feature (as design is more about what you leave out) and see what it is like to program with, without the other features.

Good, so if I proceed with that feature, then it will help you also by being the test bed. That is encouraging and inspiring, because I know that if you were inspired to write libraries, you have a lot of knowledge about Alexander Stepanov's Elements of Programming models.


shelby3 commented May 25, 2017

My solution above, only allows static types based on location, and the implementation of unions would be layered on top of this, implemented as DSL library in the new language.

I visualize some potential tradeoffs of your suggested design choice:

  1. The DSL will add boilerplate. Perhaps you can hide most of it in libraries, I dunno.
  2. Unless your language exposes low-level details, your DSL will not be able to make some optimizations such as employing the free space of tagged pointers for the tags. Analogously, for targeting JavaScript, unless your language expose instanceof and typeof, then your DSL can’t avail of the built-in tags.
  3. To make certain optimizations, your DSL may be output target specific thus not portable. This indicates that the DSL logic might be in the wrong abstraction layer.

There might be some advantages as well, i.e. the analysis above may not be complete.

For your solution to work you will need to work all these codings out and include them in the language runtime.

For targeting JavaScript, the reified type tags are always there anyway with instanceof and typeof, unless one is creating anonymous (i.e. non-nominal, structurally typed) objects.

Or even as I had proposed before upthread for JavaScript, since typeclasses should only be implemented one way for each type, these typeclass methods could be placed on the Object.prototype. When modules load, they have to add their typeclass implementations on the relevant prototypes.
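A hedged sketch of that load-time injection with the run-time conflict check mentioned in the OP (installMethod is a hypothetical helper, not part of any agreed design):

// Attach a typeclass method to a constructor's prototype at module load,
// throwing if a different implementation was already installed.
function installMethod(ctor: Function, name: string, impl: Function): void {
  const proto = ctor.prototype as any
  if (proto[name] !== undefined && proto[name] !== impl) {
    // Conflicting implementations across modules are rejected at load time.
    throw new Error(`Conflicting implementation of ${name} on ${ctor.name}`)
  }
  proto[name] = impl
}

// Example: a module installing `show` for the built-in Number type.
installMethod(Number, "show", function (this: number) { return this.toString() })

console.log((42 as any).show()) // "42"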


keean commented May 29, 2017

The DSL will add boilerplate. Perhaps you can hide most of it in libraries, I dunno.

If it does add excessive boilerplate, I will have failed. The compiler should do the work, although there might be a keyword involved to control when this happens.

The way I see this happening is that accessing the contents of a container should be a type-class-based operation that allows user-defined containers to override the normal behaviour, so a user-defined container is just as short to use as a built-in type like an array.
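A speculative TypeScript sketch of what "container access as a type-class operation" could look like (the Indexable interface and both instances are invented here, not part of the proposal):

interface Indexable<C, E> {
  get(c: C, i: number): E
  length(c: C): number
}

// The built-in array gets an instance:
const arrayIndexable: Indexable<number[], number> = {
  get: (c, i) => c[i],
  length: (c) => c.length
}

// A user-defined container supplies its own instance and is used the same way:
interface Range { start: number; end: number }
const rangeIndexable: Indexable<Range, number> = {
  get: (c, i) => c.start + i,
  length: (c) => c.end - c.start
}

function sum<C>(ix: Indexable<C, number>, c: C): number {
  let total = 0
  for (let i = 0; i < ix.length(c); i++) total += ix.get(c, i)
  return total
}

console.log(sum(arrayIndexable, [1, 2, 3]))            // 6
console.log(sum(rangeIndexable, { start: 1, end: 4 })) // 6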

Unless your language exposes low-level details, your DSL will not be able to make some optimizations such as employing the free space of tagged pointers for the tags. Analogously, for targeting JavaScript, unless your language expose instanceof and typeof, then your DSL can’t avail of the built-in tags.

Again I see type-classes facilitating this. Dereferencing would be overridable, so you can write a container where the type is encoded in the pointer if that is what you really want.

To make certain optimizations, your DSL may be output target specific thus not portable. This indicates that the DSL logic might be in the wrong abstraction layer.

If you put the encoding inside the compiler, it will require patching the compiler itself to support a new target. This seems worse than just having to edit a library.

It does make me wonder if WebAssembly (asm.js) would be a better target, because it would make it more like native targets and less dependent on the strangeness of JavaScript.

Maybe you are right and it would be better to include native boxing in the language. I want to keep the core language as small as possible, hence why I was thinking this should be in a library, but maybe it's important enough to be in the core.


shelby3 commented May 30, 2017

@keean wrote:

Dereferencing would be overridable, so you can write a container where the type is encoded in the pointer if that is what you really want.

But as I wrote before, that forces you to expose low-level details in the language, e.g. pointers. I do not want any pointers in my language: the K.I.S.S. principle, and also encapsulation of details which, for example, would otherwise be sandbox security holes, among many other reasons.

If you put the encoding inside the compiler, it will require patching the compiler itself to support a new target. This seems worse than just having to edit a library.

The compiler could be modular, so that post-processing of the AST to different targets can be swapped. Essentially it is akin to the compiler having its own libraries. If you put the libraries above the language instead of below it, that seems to me to be the incorrect abstraction layer.

It does make me wonder if WebAssembly (asm.js) would be a better target, because it would make it more like native targets and less dependent on the strangeness of JavaScript.

I presume you know that WASM is not the same as asm.js. Afaik, WASM does not yet run everywhere JavaScript does, and it is not a mature technology yet. asm.js is low-level and afaik a nightmare to debug in the browser; although perhaps source maps could help with that somewhat, it still would not be as intuitive as debugging high-level JavaScript. Also we need JavaScript's GC.

Every output target has some strangeness.

Maybe you are right and it would be better to include native boxing in the language. I want to keep the core language as small as possible, hence why I was thinking this should be in a library, but maybe its important enough to be in the core.

I agree of course that separation of concerns and modularity are excellent design concepts. Yet exposing the low-level details necessary to optimize boxing in a library above the language seems to be conflating abstraction layers and thus not achieving optimal abstraction. I could also, for example, envision issues about compiler-selected, optimized binary unboxed data structures versus boxed members of data structures. I am planning to have a language feature to map between them, because otherwise we need to expose low-level JavaScript details such as ArrayBuffer and TypedArray above our language.

Just because we can put/hide details in a library above the language does not prevent the low-level details exposed above the language from allowing complexity to seep into userland code. Users will take advantage of any primitives exposed in the language, as flies to honey. So attaining simplicity is not just about what is left out of the compiler, but also about what is left out of the language. Libraries (aka modularity) can be above or below the language. It is all about the abstraction layer.


keean commented May 30, 2017

The other point of view is that you don't want to lock the language into some type encoding that will prevent future extensibility.

Using bits in pointers to encode things limits porting to platforms that have different alignment requirements, and is probably a bad idea; there is also not enough space to encode all the types, so there would be some types that just don't benefit. Remember structural types are of unlimited length (as they have to encode the structure), whereas nominal types can be encoded as an integer.

So my approach would probably not use fancy pointer encodings, and from my experience with optimization, I don't think it will cost much performance, as CPUs are optimised for integer word performance. It's going to mess up pre-fetch and caching too.

My approach would be to use static typing wherever possible, so that the use of dynamic types is restricted to where it is really needed.


shelby3 commented Jun 1, 2017

The other point of view is that you don't want to lock the language into some type encoding that will prevent future extensibility.

Who proposed that? I certainly did not. I proposed there is no specification of these low-level details “above the language” and thus the language compiler is free to optimize for each target.

My point is: do not expose complex, low-level details above the language, so the compiler can optimize for each output target and so programmers are not given access to complexity that can make the code like Scala or C++, with its 1500-page manual and abundance of corner cases.

Using bits in pointers to encode things limits porting

That criticism thus does not apply.

So my approach would probably not use fancy pointer encodings, and from my experience with optimization, I don't think it will cost much performance, as CPUs are optimised for integer word performance. It's going to mess up pre-fetch and caching too.

The compiler will be free to optimize whatever is tested to be most optimal. Btw, I am not 100% sure that using the unused bits of 64-bit pointers affects pre-fetch and caching.

My approach would be to use static typing wherever possible, so that the use of dynamic types is restricted to where it is really needed.

I do not know what caused you to mention this (?), as it seems so broad and not specifically related to the discussion we were having.

A union or sum-like type is not statically dispatched, although its bounds are statically typed.

Of course we statically type what we can that makes sense in terms of the priorities, but when you need union then you need it.

I hope we do not go on and on just for the sake of seeing who can be the last one to reply. What is your cogent point overall?


keean commented Jun 1, 2017

Who proposed that? I certainly did not. I proposed there is no specification of these low-level details “above the language” and thus the language compiler is free to optimize for each target.

But you would lose binary compatibility, and the ability to transfer data between machines (as they may have different versions of the compiler, or different CPUs, and therefore the type representations differ).


keean commented Jun 1, 2017

Of course we statically type what we can that makes sense in terms of the priorities, but when you need union then you need it

Not necessarily; it is common in languages to use dynamic types everywhere, even when not needed. For example, Java does this (every method is automatically virtual).

JavaScript also does this, and then tries to optimise it all away in the JIT compiler. The problem is that it is all too easy to prevent the JIT compiler from being able to optimise, by using a dynamic feature (like changing the type of a property) when you do not need to; you could, for example, use a separate property instead.

The C++ "virtual" keyword may seem like boilerplate, but it serves a useful purpose, that is it makes it clear when you are paying the cost. So a non-virtual method always dispatches at the fastest speed, and cannot be slowed down or break the optimiser, but it also cannot be dynamically dispatched. By making it virtual there is still the possibility it could be optimised, but you are allowing it to use the slower mechanism because you need dynamic dispatch.

This follows the principle of only paying for what you use, and making the cost visible in the source code. I think this is an important principle that languages I like follow.


shelby3 commented Jun 1, 2017

But you would lose binary compatability, and the ability to transfer data between machines

Such communication must be serialized to a standard protocol/format.

Not necessarily, it is common in languages to use dynamics types everywhere, even when not needed

What specifically does this have to do with the discussion we were having?

I agree that one of the design priorities can be to minimize accidental dynamic typing. Do you see a specific instance that applies to our discussion? Afaics, as I already wrote in my prior comment, we were not discussing whether union types can be statically dispatched (because they cannot in any design other than dependent typing), but rather the other aspects of the design of a "sum-like" type.


shelby3 commented Aug 26, 2017

@keean wrote:

I have found a simple way to write single parameter typeclasses in TypeScript, which you can do by hand, or as a transpilation target. You can extend the interface of a class like this:

// declare the datatype:
class X {
  someproperty : number
}

// declare the typeclass:
interface Showable {
  show() : string
}

// create an instance:
interface X extends Showable {}
X.prototype.show = function() {
  return this.someproperty.toString();
}

This works because it merges the interface declaration for X with the class declaration for X; we then need to provide the implementation on the prototype for X so it is inherited by all instances of X. It only works for single-parameter typeclasses, but it sort of solves the expression problem, as we can declare a new instance to add a new type to an existing typeclass, and we can create a new typeclass for an existing type. Because the underlying JavaScript is dynamically typed, we don't need to worry about heterogeneous collections, but of course there is no static guarantee that all objects in a collection implement any given typeclass, so it's easy to get a runtime crash when you insert an 'un-Showable' object into a collection that gets 'shown' at some point.

That is similar to the idea I had proposed originally for using the prototype chain to simulate typeclasses. You’re showing it is possible to get TypeScript’s typing system to somewhat integrate with my idea. Thanks!
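A short usage sketch following the snippet above (the values are hypothetical), showing both the merged interface in action and the run-time risk mentioned:

const x = new X()
x.someproperty = 42
console.log(x.show()) // "42" — `show` is visible on X via the merged interface

// Heterogeneous collections are unchecked if anything untyped sneaks in:
const things: Showable[] = [x, { note: "not showable" } as any]
for (const t of things) console.log(t.show()) // crashes at runtime on the second entry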


shelby3 commented Jun 3, 2018

TypeScript may be close to getting HKT typing:

microsoft/TypeScript#1213


shelby3 commented Jul 5, 2018

I wrote:

Btw, when a zero-argument method returns a value and we are not discarding the result, do we want to allow, as Scala does:

f(x: ILength) => x.length

Instead of:

f(x: ILength) => x.length()

@keean replied:

I think leaving off the () on zero-argument functions is not a good idea, because those functions execute. How do you pass a zero-argument function as an argument to another function if you do not indicate when it is to be executed? You effectively lose control over the order in which side-effects happen.

I replied:

Based on the type it is being assigned to, although there is a corner case where the result type is the same as the type being assigned to.

I think Scala allows the length(_) in that corner case. We could require this always for passing the function instead of its result value.

On the positive side, it does help to get rid of parentheses more times than it will cause confusion.

@keean replied:

That's worse :-( I think function application needs to be consistent, not some backwards special case; we seem to be introducing corner cases into the grammar. Scala is not a good model to follow.

Revisiting this, I realized that instead of length(_), Zer0 could instead not allow discarding the () for function application when the return (aka output) type of the function is the same as the type of the function itself.

Also, application of non-pure functions will not allow discarding the ().

The advantage of discarding the () for pure functions is that it reduces the symbol soup of parentheses, and it's more sensible because a pure function has no side-effects and thus is analogous to reading a value. I propose it should be required that the () be discarded for pure functions, so these are accessed consistently, indicating they're values.

Consider:

if (array.isEmpty)

Instead of:

if (array.isEmpty())

if (isEmpty(array))

Even if we have block indenting:

if array.isEmpty
   block
if array.isEmpty()
   block
if isEmpty(array)
   block

The symbol soup reduction still applies when applying functions:

foo(array.isEmpty)

foo(array.isEmpty())

foo(isEmpty(array))
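One hedged way this could be emitted to TypeScript (the container class here is invented for illustration): a pure zero-argument function becomes a property getter, so the call site reads like a value while still executing code.

class Container<A> {
  constructor(private items: A[]) {}
  get isEmpty(): boolean {        // pure, zero-argument: exposed as a getter
    return this.items.length === 0
  }
}

const xs = new Container<number>([])
if (xs.isEmpty) {                 // no () needed, matching the proposal
  console.log("empty")
}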
