#19 Pushing into repositories you don't have account on (cross-federation collab)

Open
opened 1 year ago by Houkime · 9 comments
Houkime commented 1 year ago

Imagine that you as a repository owner want to grant a collaborator status to a person who doesn't have an account on your instance.

Ideally, you would just add awesomecollaborator@otherinstance.com to your list of collaborators, and they could then push commits, create branches, help review PRs and so on.

For such pushes to happen you might employ the following scheme:

  • you add your collaborator by name but server converts it to an actual user ID on another instance and adds it to collab list for a repo. [potential part of ForgeFed C2S]

  • your collaborator clones your repository. [pure git can do it]

  • in this repository "remote" config your instance is listed [pure git]

  • collaborator does changes and commits them locally

  • if collaborator had an account on your instance he could push right away with pure git BUT

  • instead he uses a hypothetical git wrapper ffgit, which includes some federation functions (ffgit push):

    • it has collaborator's current "home" instance and username in settings.
    • when asked to push it looks at the "origin" remote repository
    • it undergoes auth at his home instance
    • it sends collab commit in the text form to home instance
    • home instance relays it to a target "origin" instance with confirmation that it thinks that it comes from its user with some ID.
      • home instance DOESN'T need to store the repo itself. It is merely a relay in this scenario.
    • target instance checks ID (and maybe the commit is signed, then a public fingerprint of user can be stored at the home instance)
    • target instance decides whether to accept the push
    • if target accepts the push, because your "origin" still links to it, vanilla git syncing functionality will decide what commits of yours are already pushed and ffgit won't attempt to push same commits one more time.
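
The target-side part of the scheme above can be sketched roughly as follows. This is only an illustration under stated assumptions: `RelayedPush`, `accept_push` and the actor URL layout are hypothetical names, and checking the home instance's own signature on the relayed message is left out.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class RelayedPush:
    """What the home instance forwards to the target (hypothetical shape)."""
    actor_id: str       # user ID the home instance vouches for
    home_instance: str  # base URL of the instance that relayed the push
    repository: str     # name of the repository on the target instance
    patch: str          # the commits in text form


def accept_push(push: RelayedPush, collaborators: Dict[str, Set[str]]) -> bool:
    """Accept only if the home instance vouches for one of its own users
    and that user is on the repository's collaborator list."""
    own_prefix = push.home_instance.rstrip("/") + "/"
    if not push.actor_id.startswith(own_prefix):
        # an instance may only vouch for actors that live on it
        return False
    return push.actor_id in collaborators.get(push.repository, set())


collaborators = {
    "awesomerepo": {"https://otherinstance.com/users/awesomecollaborator"},
}
push = RelayedPush(
    actor_id="https://otherinstance.com/users/awesomecollaborator",
    home_instance="https://otherinstance.com",
    repository="awesomerepo",
    patch="(commit in text form)",
)
print(accept_push(push, collaborators))
```

The home instance never needs the repository itself here; it only attaches an identity claim to the patch it relays.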
Houkime commented 1 year ago
Poster

For signed commits one can dumb this down even more and limit home instance's involvement to providing fingerprints of its users.

This way, when the target repository decides whether to accept a signed commit from a random dude, it just checks whether the commit's fingerprint matches the fingerprint of someone on the collaborators' list.

However there might be a tricky situation if collaborator has a public fork from which a random dude manages to extract a signed commit. While it is an authentic commit from a needed collaborator it might not be meant for pushing yet.
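
One possible guard against that replay case, sketched here as an assumption rather than something settled in this thread, is to require that the key authenticating the push is the same key that signed the commit. The function name and the plain-string fingerprints are hypothetical.

```python
from typing import Set


def may_accept_signed_commit(pusher_fp: str, signer_fp: str,
                             collaborator_fps: Set[str]) -> bool:
    """Accept a signed commit only when a collaborator both signed it
    and is the one pushing it."""
    signed_by_collaborator = signer_fp in collaborator_fps
    pushed_by_signer = pusher_fp == signer_fp
    return signed_by_collaborator and pushed_by_signer


collaborator_fps = {"fp-of-collaborator"}

# the collaborator pushes their own signed commit
print(may_accept_signed_commit("fp-of-collaborator", "fp-of-collaborator",
                               collaborator_fps))
# a random dude pushes a signed commit taken from the public fork
print(may_accept_signed_commit("fp-of-random-dude", "fp-of-collaborator",
                               collaborator_fps))
```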

fr33domlover commented 1 year ago
Collaborator

Hey @houkime! Thank you for the detailed proposal :)

We've indeed considered and discussed how to do git-push to non-homeserver locations. There are many ways to do that. The last thing I came up with is the following idea. There were more ideas but they were dropped. I'll write what I came up with, and then we can compare with what you proposed above.

The general idea is to use something simple, secure, non-dirty-hack, reusing the established authorization mechanism.

My idea is to do something more-or-less like this:

User user@A wants to push a commit to a repo repo@B. The user has an authorization token, which they use for all authorized access to the project, such as closing issues and accepting merge requests. However, most of the time, the server (server A) hosts that auth token on behalf of the user. The part I'm confident about is that somehow, when the git push happens, directly from the user to server B,

  • The server needs to authenticate the SSH key
  • The server needs to get the authorization token and verify that it gives push access

That raises 2 questions:

  1. How does server B know that the SSH key indeed belongs to user@A?
  2. How does server B obtain the authorization token?

There's no clear final decision on this yet, because I haven't started implementing this yet, but here's one of the options, I think this one is my favorite:

  • user@A can download their auth token from server A, and have it stored in their .git/config or something like that, in their local copy of repo@B
  • When doing the git push, either use a helper/wrapper command, or just include the auth token in the SSH URI, such that server B can obtain the token and verify that it provides user@A with push access
  • SSH public keys are specified in Actor documents, just like HTTP Signature keys, so server B can have a copy of the SSH key(s) of user@A (and if not, it can HTTP GET the key(s) when needed) and verify ownership of the SSH key by user@A
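
The two checks server B would perform can be sketched like this. It is a minimal sketch, assuming the token is a simple record and that B keeps a local copy of each actor's published SSH keys; `Token`, `authorize_push` and the fake key strings are hypothetical, and the real token format is not decided in this thread.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set


@dataclass(frozen=True)
class Token:
    """Hypothetical authorization token issued by the repository's server."""
    holder: str                # actor the token was issued to
    repository: str            # repository it applies to
    actions: FrozenSet[str]    # what the holder may do


def authorize_push(actor: str, ssh_key: str, token: Token, repository: str,
                   actor_keys: Dict[str, Set[str]],
                   issued_tokens: Set[Token]) -> bool:
    # 1. authentication: the SSH key must be one the actor publishes
    if ssh_key not in actor_keys.get(actor, set()):
        return False
    # 2. authorization: the token must be genuine and grant push to this actor
    return (token in issued_tokens
            and token.holder == actor
            and token.repository == repository
            and "push" in token.actions)


token = Token("user@A", "repo@B", frozenset({"push"}))
actor_keys = {"user@A": {"ssh-ed25519 AAAA-fake-key"}}
print(authorize_push("user@A", "ssh-ed25519 AAAA-fake-key", token,
                     "repo@B", actor_keys, {token}))
```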
fr33domlover commented 1 year ago
Collaborator

(One thing I like about that last idea I wrote, is that it doesn't involve any relaying, and SSH key updates can be done push-based using Update activities, so when server B receives the git push, it already has the SSH key, and the auth token it gets right there from the user, so it can right away handle the commits without needing negotiation with server A etc.)

Houkime commented 1 year ago
Poster

I am not sure there is mathematically a way to ensure that someone is user@A without communicating with A at least once in that collaborator's lifetime (i.e. to retrieve their public key for example).

Compared to your scheme, my idea (in the last dumbed down form and also upgraded based on your comment) is basically to use public keys themselves as authorization. Very similar to how SSH pushing works: you just mention your public key in your profile, and then when there is an incoming encrypted connection (or signed commit) using this key, it goes through as yours.

So repo owner can effectively add a public key as collaborator (either directly or looking up a key for user@A).

And there is NO additional authorisation token which repo owner sends away to collaborators, it is just repo owner writes into settings of the repo which keys are allowed to push.
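
A minimal sketch of such a per-repository key list, assuming keys are stored by their OpenSSH-style SHA256 fingerprint. `RepoSettings` and the sample key line are hypothetical; the key blob is fake data, not a real key.

```python
import base64
import hashlib


def ssh_fingerprint(public_key_line: str) -> str:
    """SHA256 fingerprint of an OpenSSH public key line ("type base64 comment")."""
    blob = base64.b64decode(public_key_line.split()[1])
    digest = hashlib.sha256(blob).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


class RepoSettings:
    """Which keys are allowed to push, as written down by the repo owner."""

    def __init__(self):
        self.allowed = {}  # fingerprint -> label such as "user@A"

    def add_key(self, label: str, public_key_line: str) -> None:
        self.allowed[ssh_fingerprint(public_key_line)] = label

    def may_push(self, public_key_line: str) -> bool:
        return ssh_fingerprint(public_key_line) in self.allowed


# fake key material, only shaped like a public key line
fake_blob = base64.b64encode(b"not a real key").decode("ascii")
key_line = "ssh-ed25519 " + fake_blob + " user@A"

settings = RepoSettings()
settings.add_key("user@A", key_line)
print(settings.may_push(key_line))
```

No token changes hands: the owner's settings alone decide which keys go through.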

Houkime commented 1 year ago
Poster

As far as I remember from using it, Arch Linux's AUR uses a similar scheme a lot.

https://wiki.archlinux.org/index.php/AUR_submission_guidelines

Notice a special user aur@aur.archlinux.org which you are supposed to SSH-connect to with your key to clone and then push.
And after you made a package, you can set up additional collaborators by name (server knows their keys) and when they connect using their keys they will be able to push.

All of this is done with no extra authorisation token except for the initial session when you connect to server normally as a user to tell it what your public key is.

Houkime commented 1 year ago
Poster

It may be a more robust scheme than with auth tokens being held by server A because this way even if server A is malicious it still can't push anything on its own to server B.
At least if the key was added directly to B or before A was compromised.

This is because server A doesn't have a private key of collaborator - only a public one.

The only thing it can do is to try and persuade server B that user@A changed his key, but if there was a federation alarm concerning server A, or B is not configured to be easily persuadable, or there is a duplicate key on a backup keyserver that wasn't revoked, server B can just ignore such attempts.

One can also configure repo@B to update keys only in manual mode when repo owner himself decides when and which keys to update.
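
The key-update policy described above could look something like this. A sketch only: the policy names and the two boolean inputs are assumptions made for illustration.

```python
from enum import Enum


class KeyUpdatePolicy(Enum):
    AUTOMATIC = "automatic"  # trust key changes announced by the home server
    MANUAL = "manual"        # only the repo owner decides


def accept_key_update(policy: KeyUpdatePolicy, instance_flagged: bool,
                      owner_approved: bool) -> bool:
    """Decide whether repo@B replaces the stored key of user@A."""
    if policy is KeyUpdatePolicy.MANUAL:
        return owner_approved
    # automatic mode: ignore updates from an instance under a federation alarm
    return not instance_flagged


print(accept_key_update(KeyUpdatePolicy.AUTOMATIC, instance_flagged=True,
                        owner_approved=False))
```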

fr33domlover commented 1 year ago
Collaborator

@houkime,

What I propose is that SSH keys are used for authentication, the plain regular way. Servers can't maliciously push because they don't have your SSH private key :)

That scheme where there's no extra authorization token is a traditional one, that uses ACLs. When you try to make some operation, your identity is the only piece of information passed, that is used for deciding whether or not to give you access.

Another way, more secure and flexible, is to use authorization tokens. These tokens don't replace the authentication mechanism, they come on top of it, to do authorization. The benefits of object capabilities over ACLs deserve their own discussion (and we should probably link to some for reference), but the bottom line is, it's probably coming to the fediverse one way or another, and if not, it probably should, and our solution here should take that into account.

When you add a collaborator,

  1. You indeed just add a name, not necessarily tied to a single specific key (although we could do that if we encounter security issues)
  2. You likely give them an authorization token

How do you authenticate people though, to be sure names aren't maliciously used in identity theft?

When processing ActivityPub activities, you verify the HTTP Signature. Actors publish public keys (e.g. RSA or Ed25519) and you grab those and verify the activity.

When processing a git-push, you verify that the SSH key is one of the SSH keys listed by the actors. Actors similarly publish SSH public keys (which, too, can use things like RSA and Ed25519).

I want to reuse, if reasonably possible, these existing authentication mechanisms: SSH keys and HTTP Signature keys. The only stuff added on top is:

  1. Propagating SSH public keys between servers
  2. Sending authorization tokens, to support object capabilities

I agree that (2) is a bit weird in the sense that plain regular git-push traditionally doesn't use any such tokens, only SSH keys. And if servers want to do it that way, that's okay :) I just wonder how to pass them if some server does want to use them.

Soon I'll properly describe that proposal, and then we can check it against all the points raised here, and change it if there's stuff to improve :)

Houkime commented 1 year ago
Poster

In my fork of mcfi which is here: https://notabug.org/Houkime/mcfi
Accompanied by equally modified clif: https://notabug.org/Houkime/clif

I managed to add a foreign collaborator by his name via a simple SSH key lookup.
A strange demo video (and a cartoon!) for that is here:
https://peertube.social/videos/watch/e65e0c70-d165-4c64-8518-3578d3d392d9

Basically how it works is that you have a Gitosis server which manages Git and SSH connections to git and pushing rights.

And as an admin for this Gitosis server you install a special admin-daemon which is a modified MCFI.
This daemon runs as a separate unprivileged user without a login shell called in my code "trurl".
Trurl's SSH key is registered as "admin-daemon" key, doesn't have a passphrase and at the gitosis init gitosis makes this key its special admin key.

Then a daemon as system user "trurl" can git clone from ssh://gitosis@127.0.0.1/gitosis-admin to make changes in the config and push them back as needed.

The rest is fairly trivial. Modified mcfi can register a user using C2S, and when it does, it registers the user's public key not only for Gitosis but also saves it in its database. This key is then publicly available via a basic API which describes actors (and by extension, users):

@get('/users/<username>')
def describe_user(username):
    ''' Return profile of a user '''

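    # three placeholders, so that the username actually ends up in the id
    id_ = '{}/{}/{}'.format(settings.mcfi_url, 'users', username)
    user = database.get_user_local(username)

    if not user:
        # unknown user: "not found" rather than "bad request"
        return HTTPResponse(status=404, body=None)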
    
    return {
            '@context':  [
                "https://www.w3.org/ns/activitystreams",
                "https://forgefed.peers.community/ns"
            ],
            'type':      'Person',
            'id':        id_,
            'name':      '{}'.format(username),
            'preferredUsername': '{}'.format(username),
            'summary':   '',
            'inbox':     '{}/inbox'.format(id_),
            'outbox':    '{}/outbox'.format(id_),
            'followers': '{}/followers'.format(id_),
            'following': '{}/following'.format(id_),
            'publicKey': user["public_key"]
        }

And then, when a user on another server decides to add this user as a collaborator to his repo, the endpoint for adding collaborators on his server fetches the foreign (for it) key like this, by simply looking at the foreign actor:

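def import_foreign_key(address):
    ''' Fetch and store the public key of a foreign actor given as user@host '''

    # split on the last "@" so that a malformed address fails cleanly
    user, sep, host = address.rpartition("@")
    if not sep or not user or not host:
        return False

    foreigner = PROTOCOL+host+"/users/"+user #IMPORTANT: don't forget to change if routing changes
    try:
        # a GET has no body, so ask for JSON via Accept rather than Content-Type
        response = requests.get(foreigner, headers={ 'Accept': 'application/json' }, timeout=10)
    except requests.RequestException:
        return False
    if response.status_code != 200:
        return False
    res_body = response.json()
    if "publicKey" not in res_body:
        return False

    add_key(address, res_body["publicKey"]) # we add a key without adding a user to a db

    return True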


And also adds a Collaboration to its database like this


def add_collaboration(repository, collaborator):
    with db:
        db.execute (
            """
            INSERT OR IGNORE INTO collaborations (repository, collaborator)
            VALUES (?, ?)
            """,
            [repository, collaborator]
        )
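
Note that `INSERT OR IGNORE` only skips duplicates when the table has a uniqueness constraint to conflict with. The following self-contained sketch shows this with an in-memory SQLite database; the table definition is an assumption, since the real schema is not shown in this thread.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    """
    CREATE TABLE collaborations (
        repository   TEXT NOT NULL,
        collaborator TEXT NOT NULL,
        UNIQUE (repository, collaborator)  -- what OR IGNORE relies on
    )
    """
)


def add_collaboration(repository, collaborator):
    # "with db" commits on success and rolls back on an exception
    with db:
        db.execute(
            """
            INSERT OR IGNORE INTO collaborations (repository, collaborator)
            VALUES (?, ?)
            """,
            [repository, collaborator],
        )


# adding the same collaborator twice leaves a single row
add_collaboration("awesomerepo", "maruc@192.168.56.102:9090")
add_collaboration("awesomerepo", "maruc@192.168.56.102:9090")
rows = db.execute("SELECT repository, collaborator FROM collaborations").fetchall()
print(rows)
```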

And then regenerates and pushes the Gitosis config so that a foreigner key is listed as a collaborator.
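
Regenerating the config could look roughly like the sketch below. It assumes one Gitosis group per repository and that member names match the key file names in `keydir`; `render_gitosis_conf` is a hypothetical helper, not the function used in the fork.

```python
def render_gitosis_conf(collaborations):
    """Build gitosis.conf text from (repository, collaborator) rows."""
    members = {}
    for repository, collaborator in collaborations:
        members.setdefault(repository, []).append(collaborator)

    lines = ["[gitosis]", ""]
    for repository in sorted(members):
        lines += [
            "[group {}]".format(repository),
            "members = " + " ".join(sorted(members[repository])),
            "writable = {}".format(repository),
            "",
        ]
    return "\n".join(lines)


conf = render_gitosis_conf([
    ("awesomerepo", "bukkij"),
    ("awesomerepo", "maruc@192.168.56.102:9090"),
])
print(conf)
```

The daemon would then write this text into its clone of gitosis-admin, commit and push it.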

The whole process in my test was orchestrated by a modified clif client, and here's what was in the bash script for a basic foreign collaborator addition (I used two independent test VMs on a virtual network, one ends with .101 and one ends with .102):

#!/usr/bin/bash

./clipauc.py register 192.168.56.101:9090 bukkij bobobo123 bukkij@example.com keys/bukkij.pub 
sleep 3
./clipauc.py register 192.168.56.102:9090 maruc bobobo123 maruc@example.com keys/maruc.pub 
sleep 3
./clipauc.py newrepo 192.168.56.101:9090 bukkij awesomerepo
sleep 3
./clipauc.py newcollab 192.168.56.101:9090 bukkij awesomerepo maruc@192.168.56.102:9090

And the output is seen in this post or in the aforementioned video (after cartoon): https://mastodon.technology/@houkimenator/102372757386157121

fr33domlover commented 1 year ago
Collaborator

@houkime, thank you for the detailed description of what you did! I'll comment here when there are new developments in this topic.
