Improve "welcome" message #6

Open
wants to merge 3 commits into
base: main

piegamesde
Member

  1. It promotes a list of known relay servers to use
  2. It gives a proof of work challenge to solve

The list of known relay servers allows us to not always use the same one,
spreading cost and improving availability. It also allows individual relay
servers to be rebooted for maintenance without bringing the whole service down.
Each client should pick one or two; this results in up to four relays being
pinged during relay initialization (remember, there are two clients who both
pick at random). Four connection attempts should be enough to give a
sufficiently high probability of finding one that is up.
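
A minimal sketch of the selection described above, not part of the proposed spec. The relay URLs and the 90% per-relay uptime figure are made-up assumptions, and the probability estimate assumes that relay outages are independent and that the two clients happen to pick four distinct relays:

```python
import random

def pick_relays(known_relays, count=2, rng=random):
    """Pick up to `count` distinct relays at random from the advertised list."""
    return rng.sample(known_relays, min(count, len(known_relays)))

def p_at_least_one_up(attempts, p_up):
    """Chance that at least one of `attempts` independently failing relays is up."""
    return 1 - (1 - p_up) ** attempts

# Hypothetical relay list, as it might be advertised in the welcome message.
relays = [f"tcp://relay{i}.example.org:4001" for i in range(1, 6)]

ours = pick_relays(relays)      # this client's one or two picks
theirs = pick_relays(relays)    # the other side picks independently
candidates = set(ours) | set(theirs)  # 2 to 4 distinct relays, since picks may overlap

# With an assumed 90% per-relay uptime, four distinct picks give about 99.99%.
estimate = p_at_least_one_up(4, 0.9)
```

Because both sides pick independently, their choices can overlap, so four is an upper bound on the number of relays contacted rather than a guarantee.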

The proof of work challenge is added in an attempt to protect the rendezvous
server from a DoS overload. It is a single point of failure, and we want to
be prepared if some dumb-ass spams it with connection attempts for some reason.
The challenge is designed so that its difficulty is configurable by the server,
which can thus adapt it dynamically to the load situation. It is also designed
to be as light on the server's resources as possible (also in order to not
accidentally introduce a new DoS vector):

  • The server can use the first 8 bytes of the challenge as a running counter; only
    that needs to be stored. A binary heap gives
  • All other required information is stored and forwarded by the client; a MAC
    protects against clients that make up their own challenges.
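
To make the idea concrete, here is a rough sketch of such a challenge. It is an illustration under assumptions, not the wire format proposed in this PR: the layout (8-byte counter, 1-byte difficulty, HMAC-SHA256 tag), the "leading zero bits of SHA-256" task, and all function names are hypothetical. Tracking which counters have already been redeemed (replay protection) is deliberately left out.

```python
import hashlib
import hmac
import itertools
import os
import struct

SERVER_KEY = os.urandom(32)      # server-side secret, never sent to clients
_counter = itertools.count(1)    # running counter: the only state kept per challenge

def issue_challenge(difficulty_bits):
    """Server: build counter (8 bytes) || difficulty (1 byte) || MAC."""
    head = struct.pack(">QB", next(_counter), difficulty_bits)
    tag = hmac.new(SERVER_KEY, head, hashlib.sha256).digest()
    return head + tag

def leading_zero_bits(digest):
    """Number of zero bits at the start of a hash digest."""
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()

def solve(challenge):
    """Client: brute-force a nonce; the difficulty is read from the challenge."""
    difficulty = challenge[8]
    for n in itertools.count():
        nonce = struct.pack(">Q", n)
        digest = hashlib.sha256(challenge + nonce).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce

def verify(challenge, nonce):
    """Server: check the MAC first, then the work. Needs no stored challenge."""
    head, tag = challenge[:9], challenge[9:]
    expected = hmac.new(SERVER_KEY, head, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False  # the client made up or altered the challenge
    digest = hashlib.sha256(challenge + nonce).digest()
    return leading_zero_bits(digest) >= head[8]
```

The client carries the whole challenge back to the server, so verification costs one HMAC and one hash, and a client that lowers the difficulty byte fails the MAC check.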

@meejah
Member

meejah commented Apr 29, 2021

Note that there is already an attempt + implementation at a "nameplate allocation now requires a token" (in the python mailbox server) .. I'd have to (further) remind myself where exactly that is at, but the idea was an open-ended way to demand more from clients. A "proof of work" (I'm assuming "like hashcash" here) token would of course fit in there.

@meejah
Member

meejah commented Apr 29, 2021

Ahh, I think it was magic-wormhole/magic-wormhole#126 ... I thought I had a branch that implemented most of that too but I can't immediately find it.

@piegamesde
Member Author

piegamesde commented Apr 29, 2021

That sounds interesting, if you can dig out some details about it I'd be interested!

My current proposal for proof of work is some primitive brute-force task on a hash function. There are a lot of variations on the concept, this one is mostly optimized to minimize the server load. I'll try to investigate Verifiable Delay Functions. They obviously provide a lot more than what we need, but it'd be cool to find a one-way function that doesn't trivially parallelize. Maybe time lock puzzles or "proof of sequential work"?

Another thing that comes to my mind right now is that "nameplate allocation requires a token" is not enough to protect us – claiming a mailbox directly is a pattern that will become more common when we have seeds. Never mind, forget that, it should work fine.

@meejah
Member

meejah commented Apr 29, 2021

magic-wormhole/magic-wormhole#126 (comment) has some notes from a #magic-wormhole discussion between (at least) myself and warner.

@piegamesde
Member Author

I have seen that comment, but lacking the surrounding discussion context I don't fully understand the motivation for some of its aspects.

Most importantly, why have a separate abilities/permission round-trip if we can simply add fields onto the existing welcome/bind messages in a backwards-compatible way? Are there any issues with my proposal that I haven't thought of?

If we can't find the code, this will have to be re-implemented. But it shouldn't be too hard, it's a rather simple feature.


Also, any opinions on the relay discovery feature? (Maybe it was a bad idea to have both in one commit). I don't fully know which kinds of attributes one may want to advertise next to the URL (I could only think of "server location" for now). Also, the harder part – how the rendezvous server knows about 3rd-party relays – is not really part of the spec because clients can't be bothered. It however is something that needs to be figured out nevertheless.

@meejah
Member

meejah commented Apr 30, 2021

I think the protocol in that comment anticipated a more open-ended way to do things -- that is, it's not just for one style of proof-of-work. For example it could be used to do ZKAP "payments" or logins / proof-of-account or have different styles of proof-of-work.

So IIRC, it was a "change the overall protocol once" so that individual PoW etc schemes can be added to the "abilities" and "permission" messages more easily .. I definitely saw that code semi-recently. I will have some time this evening to dig around and find it for real.

@piegamesde
Member Author

Hm, what is the intended semantic of it, when multiple concurrent styles are supported? The client sends all those that it supports as abilities, and then the server picks one? If we let the client pick one that it supports I can make the scheme work with only one round-trip.

Furthermore, I'd like to declare the "proof of account"/"proof of human" family out of scope, as it requires additional human interaction.

@piegamesde piegamesde closed this May 1, 2021
@piegamesde piegamesde reopened this May 1, 2021
@meejah
Member

meejah commented May 1, 2021

"proof-of-account" doesn't necessarily require additional human interaction. It certainly could require more action if it was e.g. "username + password"-based but a scheme could employ a keypair instead (for example). You're right that could be considered feature-creep on DoS .. but I do think it's worth considering (especially if it could fit in as a further, later enhancement to a DoS scheme).

After all, "in general" what we're talking about here is enhancing the protocol so that the server can ask for "something else" / more interaction from clients. Roughly speaking, an account system could be viewed as DoS / misuse prevention (e.g. for private deployments where any use outside your organization is unwanted).

As to relay-discovery I like the general idea .. but it's probably best expressed as its own enhancement, I think.

@piegamesde
Member Author

I see. I could make it so that the server sends the challenge data for all of the possible types of PoW/Captcha/Auth that it supports. The client then picks one and submits the answer. This does not add any new message types, and only half a round-trip is added compared to previously.

The client can freely choose which challenge to do (or none at all, because backwards compatibility). The server can control which one it prefers by making the other ones more expensive (or not providing them).
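
A minimal client-side sketch of this variant, assuming the server's offers arrive as a dict keyed by challenge type. The message shape, the field names, and the `pick_challenge` helper are hypothetical and only illustrate the flow:

```python
def pick_challenge(offered, supported_in_preference_order):
    """Return (name, challenge_data) for the first offered type we support.

    Returns None if nothing matches or nothing was offered, in which case an
    old client simply proceeds without answering any challenge.
    """
    for name in supported_in_preference_order:
        if name in offered:
            return name, offered[name]
    return None

# Hypothetical offers as a server might send them; "bits" is an assumed
# difficulty knob that the server could raise to discourage a given type.
offered = {
    "hashcash": {"bits": 24},
    "captcha": {"url": "https://example.org/captcha/123"},
}

choice = pick_challenge(offered, ["hashcash", "captcha"])
```

Here the client's own preference order decides; the server only influences the outcome through what it offers and how expensive each offer is.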

@meejah
Member

meejah commented May 7, 2021

I think I greatly prefer the "abilities" based interaction, for several reasons:

  • it accommodates old and new clients (at the same time)
  • it is open-ended, easily allowing future innovation around "permission to use this rendezvous"
  • several mitigation strategies can be supported / used at once
  • it is easier to change or add new methods

Apart from that, I think it would be better to start with a more well-known PoW like "hashcash" .. the scheme proposed here looks very similar to that. Perhaps "hashcash" isn't the right one to choose, but something with existing libraries / spec is what I'm thinking here :)

Thinking generally about the protocol and the "abilities"-based interaction, I think the biggest point is the "easier to add new ones" and "several supported at once". In general I'm thinking of this as "permission to use the server right now". Certainly one use-case is DoS mitigation but there are others, especially for non-public/free-to-use deployments.

Here's the idea:

  • clients tell the server what they support (including "nothing" by sending BIND before ABILITIES)
  • the server chooses the permission strategy (if any) it wishes to use for this client
  • the server tells the client which one it chose in WELCOME (that is, WELCOME would always have 0 or 1 permission strategies)
  • the client responds to the challenge (if any) in PERMISSION
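
Server-side, the steps above could look roughly like this sketch. The function name, the `known_keys` set, the preference list, and the hashcash `bits` parameter are all assumptions made for illustration; none of them is specified in this thread:

```python
import secrets

def choose_permission(client_abilities, server_preferences, known_keys):
    """Return the "permission" part of WELCOME: zero or one strategy.

    `client_abilities` is the dict from the ABILITIES message, or {} for a
    client that sent BIND without ABILITIES.
    """
    for name in server_preferences:
        offer = client_abilities.get(name)
        if offer is None:
            continue  # the client does not support this strategy
        if name == "account":
            if offer.get("public_key") not in known_keys:
                continue  # no such account, try the next strategy
            # The client must sign this nonce to prove it holds the private key.
            return {"account": {"type": offer["type"], "nonce": secrets.token_hex(16)}}
        if name == "hashcash":
            return {"hashcash": {"bits": 20}}  # hypothetical difficulty parameter
    return {}

abilities = {
    "hashcash": {},
    "account": {"type": "cryptosign", "public_key": "a" * 32},
}
welcome_permission = choose_permission(abilities, ["account", "hashcash"], {"a" * 32})
```

What the server does when it wants a challenge but the client supports none (for example, rejecting the connection) is a policy decision this sketch leaves open.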

Let's consider a case where the server supports three permission models: hashcash, proving existence of an account, or spending ZKAPs.

A client that supports just "account" and hashcash connects and sends ABILITIES:

{
    "hashcash": {},
    "account": {
        "type": "cryptosign",
        "public_key": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
    }
}

The server chooses to use the "account" strategy, because that account does exist. It could instead have chosen "hashcash" but can't choose ZKAPs (because the client doesn't support that). It sends a "challenge" that the client must sign to prove it controls the corresponding private key. (Other similar methods could even use SCRAM or other password-methods that include human UX interaction). So it sends back WELCOME, like:

{
    ...
    "permission": {
        "account": {
            "type": "cryptosign",
            "nonce": "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"
        }
    }
    ...
}

The client would then use its private key to sign the nonce with the NaCl cryptosign signature scheme and send it up in the PERMISSION message. I've used somewhat generic names here; maybe calling the scheme account-cryptosign or similar would make more sense.

This gives server operators lots of flexibility. For example, maybe they are happy to offer a cost-free service for most users but have a "premium" offering for account-holders. Under high load or DoS or similar, they could demand hashcash from the free users but continue to allow account-holders immediate access. Or perhaps they decide to offer zero free service while under DoS.

Another scheme I could imagine .. especially for Web-based clients .. could be reCAPTCHAs or similar.

Even more esoteric, an operator may decide to accept anonymous payment for service with a system like ZKAPs (https://leastauthority.com/product-development/zkaps/) or something else.

I think allowing multiple schemes wrapped inside fairly generic protocol messages like this makes it a LOT easier for implementations to experiment with different permissions strategies. Since "DoS mitigation" is essentially an arms-race I think this is very important for this use-case. By "arms race" here I mean that a motivated "bad actor" can usually get around various mitigation strategies (depending on their motivation and funding). As a nice addition it also captures use-cases that aren't strictly "DoS related" but generally relate to "permission to use this service right now" -- I kind of see DoS as a particular special case of that.

Perhaps if @warner has time / motivation he would be interested in chiming in ..?

@meejah
Member

meejah commented May 7, 2021

Note also that the proposed "abilities and then welcome" dance could be used to note support for Dilation or Seeds and other future protocol enhancements.

@piegamesde
Member Author

> Note also that the proposed "abilities and then welcome" dance could be used to note support for Dilation or Seeds and other future protocol enhancements.

Uhm, I think you are confusing things here. The abilities negotiation with the server and with the other client are two distinct ones, for two different protocols with different features.

Regarding your other comment, I must admit that I'm not a huge fan of your use cases*, but I'll have a more in-depth look some time later.

* The problem with hosting custom rendezvous servers is that both sides need to agree on a rendezvous server in order to find each other. And I still haven't found a solution with sufficiently good UX that self-hosting one would be worth it.

@meejah
Member

meejah commented May 7, 2021

Yes, you're right we already have a way to do seeds etc stuff. So, ignore that :)

"Self-hosting", maybe not?

But I'm getting at larger deployments or commercial offerings etc. I'm not necessarily strongly committed to any of those particular use-cases, but I do think that if there's a need to do DoS-mitigation then there's going to be a need to change the DoS mitigation strategies as the people doing DoS change tactics.

Basically anything that currently has its own AppID could instead use a whole separate deployment. Obviously, such a deployment would need to "burn in" or otherwise communicate the URL of the rendezvous service -- like is already done with the wormhole CLI.

There are certain advantages to having "one" such server .. but also disadvantages (such as "what if warner gets bored of maintaining it").

@piegamesde
Member Author

Thanks, this is convincing.

@meejah
Member

meejah commented Jul 23, 2021

#12 covers the proof-of-work parts of this proposal .. but I think the "list transit relays" piece is still interesting and useful; perhaps this could be trimmed down to just that?

@piegamesde
Member Author

Yes, I have not forgotten about this. My plan is to wait for #12 to be merged, and then rebase on top of that with the PoW changes taken out.

Notably:

- Improved wording, added type information
- Deprecate the `current_cli_version` key
@piegamesde
Member Author

@meejah I've rebased and adapted to the latest changes. There are still a few open questions to resolve, but please have a look at it first.

@piegamesde
Member Author

piegamesde commented Jul 30, 2021

One other question that just came up: What's the purpose of the error field? Where is it actually used, and what for? I think its main purpose got superseded by the permission-required field. Thus, I propose to deprecate it, and let clients ignore it. If the server wants to signal an error, it should use the error inbound message instead.

What do you think?

piegamesde added a commit to magic-wormhole/magic-wormhole.rs that referenced this pull request Jul 31, 2021
The actual log-in process is mostly untested
Also add an experimental implementation for <magic-wormhole/magic-wormhole-protocols#6>.
@meejah
Member

meejah commented Aug 4, 2021

I believe the error field is for things like "This server is under maintenance, please try again". But it also has some "speculation" about CAPTCHAs etc in the text, so at least that part is superseded by `permission-required` .. so I think it still has a purpose ("client should exit after displaying the message") but more narrow than previously anticipated..?

@piegamesde
Member Author

In order to not derail this thread too much, I opened #15 about error handling instead.

Clients should make a preselection of viable relay servers (which may include entries from other
sources as well), and randomly select one or two (together with the other side's, this
makes up to four, which should be enough to have a high probability of at least one being
reachable).
Member

How do the two sides determine which relay to use?

Not necessarily to derail this further, but it might not be best to encode the protocol into the URL; the current transit-relay implementation does support both TCP and WebSockets (for example) and can inter-operate. So a client with only websocket support can contact a client with only tcp support so long as they use the same relay.

Perhaps this implies something like this:

    {
        "host": "example.com",
        "port": 4321,
        "transports": ["tcp", "ws", "wss", "tls"]
    }

Member

Maybe the "transports" wants to be a list of two-tuples, like [ ["tcp", 4321], ["wss", 443] ] and get rid of the "port"?

Member Author

> How do the two sides determine which relay to use?

This is specified in the application layer protocol. For transfer, the sender side decides; for symmetric applications, the side with the higher side value chooses.

I think clients need not know which relay endpoints are connected together beforehand, they will find out anyways during the transit connection setup. Thus, simply adding one server entry for each supported protocol should do. But if you want, I can make the url field a list of strings instead, so at least they are grouped.

Member

@meejah meejah Aug 10, 2021

Here's a concrete example: if there's a Web based client and a Python client, the Web one will only ever choose a "wss://" relay. The Python one will only ever choose a "tcp://" one. But if they're using the same relay (i.e. same host) that's fine, and they can communicate (one via WebSockets and one via TCP).

So if they indicated their choice via URL they will never inter-op. But if they say "use the server at relay.wormhole.io" and it happens to support wss and tcp, then they can talk...

Member Author

The problem is that there is not "the server at example.com". WebSockets endpoints are arbitrary HTTP paths, only TCP is the exception there (it really only has a port). I don't remember how I thought this ought to work with my proposal, but by your example it clearly doesn't.

We could indeed make urls a list or dict of connected endpoints of that server. An alternative would be to craft a new custom URL scheme that encodes all relevant information for all supported schemes of the server.

Member

Hmm, yeah. Maybe what we really want is some meta-information about the server (arbitrary name, maybe more), and a transports: [ ... ] list consisting of dicts describing the transport. Something like:

{
    "type": "tcp",
    "host": "example.com",
    "port": 1234
}

or

{
    "type": "websocket",
    "url": "wss://example.com:4321/"
}

Thus, clients select "a transit relay" which has some collection of "transports" / ways to connect (at least 1). This allows clients with distinct support for transports to still connect (given that the transit relay supports that). This could also support multi-homed hosts that e.g. have multiple public IP addresses (or hostnames) that can be contacted. It would also nicely allow Tor and/or I2P support (from one or both sides). (Of course, one "pro" of tor/i2p is that you don't need a transit-relay in those cases, but only if both sides support and choose Tor .. which they might not).
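
As a rough sketch of how a client might consume such an entry: the outer "name" field and both helper functions are hypothetical, and the interop check assumes the relay bridges all of its transports, as described above for TCP and WebSockets:

```python
# One advertised relay with two ways to reach it, using the dict shapes above.
relay = {
    "name": "example relay",  # assumed meta-information field
    "transports": [
        {"type": "tcp", "host": "example.com", "port": 1234},
        {"type": "websocket", "url": "wss://example.com:4321/"},
    ],
}

def usable_transports(relay, supported_types):
    """Transports of this relay that a client with the given support can use."""
    return [t for t in relay["transports"] if t["type"] in supported_types]

def can_interoperate(relay, a_types, b_types):
    """True if both clients can reach the relay, each via some transport."""
    return bool(usable_transports(relay, a_types)) and bool(
        usable_transports(relay, b_types)
    )

web_client = {"websocket"}   # e.g. a browser-based client
python_client = {"tcp"}      # e.g. a TCP-only client
```

With this shape the two sides agree on a relay rather than on a URL, so `can_interoperate(relay, web_client, python_client)` holds even though the clients share no transport type.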

(Also: I'm offline until Friday starting shortly so will probably be silent on this until then ...)

piegamesde added a commit to magic-wormhole/magic-wormhole.rs that referenced this pull request Aug 12, 2021
The actual log-in process is mostly untested
Also add an experimental implementation for <magic-wormhole/magic-wormhole-protocols#6>.
piegamesde added a commit to magic-wormhole/magic-wormhole.rs that referenced this pull request Oct 10, 2021
The actual log-in process is mostly untested
Also add an experimental implementation for <magic-wormhole/magic-wormhole-protocols#6>.
@piegamesde piegamesde changed the title Upgrade welcome message Improve "welcome" message Mar 14, 2023
2 participants