#85 Discussion on patch

Open
opened 1 month ago by fr33domlover · 5 comments

After you submit a patch, you may send updated versions. In the patch-via-email traditional scenario, you send your patch, get review via replies, then send a new email with a 2nd version, get review on it too, and so on, until your patch is ready to apply. In other words, each version of the patch has its own set of code review replies.

Should the comments on different versions be separate? How do code review comments refer to a patch, to a patch version, to a specific file or diff or code line(s)? Should there be a set of comments not associated with any specific patch version?

TODO: Look at how GitLab CE does it. And how NAB and Gitea do it. Look at some Merge Request with multiple versions and code reviews on them. Examine the UI.

Possible MRs to examine:

After you submit a patch, you may send updated versions. In the patch-via-email traditional scenario, you send your patch, get review via replies, then send a new email with a 2nd version, get review on it too, and so on, until your patch is ready to apply. In other words, each version of the patch has its own set of code review replies. Should the comments on different versions be separate? How do code review comments refer to a patch, to a patch version, to a specific file or diff or code line(s)? Should there be a set of comments not associated with any specific patch version? TODO: Look at how GitLab CE does it. And how NAB and Gitea do it. Look at some Merge Request with multiple versions and code reviews on them. Examine the UI. Possible MRs to examine: - <https://gitlab.com/inkscape/inkscape/merge_requests/1304> - <https://framagit.org/fr33domlover/test/merge_requests/1>
fr33domlover commented 1 month ago
Collaborator

It seems that in GitLab, discussion threads are linear. A thread has a topic and a linear list of replies. There are two kinds of topics in MRs:

  1. A comment made under the MR itself
  2. A specific line in a specific file in a diff between specific two versions of the MR

For this to be possible, a starting point is to have IDs for the different versions of a patch. And yes, comments on different versions get to be separate. How UI displays them is up to UI, the point is that UI would be able to choose to display them separately. And it would be clear to all the participating servers, which object is being commented on.

How to represent patch versions?

I suppose patch versions don't need names or numbers, it's enough to have a list of them. A number can be deduced from the position in the list.

Where should the patch versions be listed, and what's the @type of those objects?

One option is that the patch attached to the ticket has a fixed ID URI, and under the patch there's a versions property mapping to an OrderedCollection. But then, what is the @type of each such version? Also it seems a bit redundant, the patch object is kind of useless.

Another option is that the versions are of type Patch and are listed under the ticket. The attachment would be an OrderedCollection of Patch objects, starting with 1 item at the beginning. One downside is that maybe having an OrderedCollection as an attachment is weird. Software would need to dig into the items to figure out that it's a collection of Patch objects. Maybe we could use a custom property here. Another downside is that the patch versions aren't linked by a top-level patch object. But they are linked together by being items of the same OrderedCollection, and that collection can have an ID. So, the ability to have them grouped under 1 top-level object isn't lost.

How to represent the context when commenting on a line of code?

If patches are sent in a VCS specific format, there would be no general way for ForgeFed implementations to understand and display context that is merely a reference and doesn't quote the actual piece of code. So it may be a good idea to quote the code in the Create Note that you send. An alternative is for a Patch object to specify not only the VCS specific data, but also a generic AP representation of the changes. I believe there's a good chance a general representation of a diff/hunk would be easy to define (and it's enough at least in GitLab, where the only thing you can comment on is a specific line of code), and then a Create Note can refer without quoting. Even then, it could be nice to include the quote.

How to represent quotes in ActivityPub? Quoting some else's message (or a part of it) in your own message. Or quoting lines of code on which you're writing a review comment.

I'm leaving this question open because it's not critical right now; people can just quote in plain text without some smart AP representation. But relevant vocabs for quoting/resource selection are:

It seems that in GitLab, discussion threads are linear. A thread has a topic and a linear list of replies. There are two kinds of topics in MRs: 1. A comment made under the MR itself 2. A specific line in a specific file in a diff between specific two versions of the MR For this to be possible, a starting point is to have IDs for the different versions of a patch. And yes, comments on different versions get to be separate. How UI displays them is up to UI, the point is that UI would be able to choose to display them separately. And it would be clear to all the participating servers, which object is being commented on. How to represent patch versions? I suppose patch versions don't need names or numbers, it's enough to have a list of them. A number can be deduced from the position in the list. Where should the patch versions be listed, and what's the `@type` of those objects? One option is that the patch attached to the ticket has a fixed ID URI, and under the patch there's a `versions` property mapping to an `OrderedCollection`. But then, what is the `@type` of each such version? Also it seems a bit redundant, the patch object is kind of useless. Another option is that the versions are of type `Patch` and are listed under the ticket. The `attachment` would be an `OrderedCollection` of `Patch` objects, starting with 1 item at the beginning. One downside is that maybe having an `OrderedCollection` as an `attachment` is weird. Software would need to dig into the items to figure out that it's a collection of `Patch` objects. Maybe we could use a custom property here. Another downside is that the patch versions aren't linked by a top-level patch object. But they *are* linked together by being items of the same `OrderedCollection`, and that collection can have an ID. So, the ability to have them grouped under 1 top-level object isn't lost. How to represent the context when commenting on a line of code? If patches are sent in a VCS specific format, there would be no general way for ForgeFed implementations to understand and display context that is merely a reference and doesn't quote the actual piece of code. So it may be a good idea to quote the code in the `Create Note` that you send. An alternative is for a `Patch` object to specify not only the VCS specific data, but also a generic AP representation of the changes. I believe there's a good chance a general representation of a diff/hunk would be easy to define (and it's enough at least in GitLab, where the only thing you can comment on is a specific line of code), and then a `Create Note` can refer without quoting. Even then, it could be nice to include the quote. How to represent quotes in ActivityPub? Quoting some else's message (or a part of it) in your own message. Or quoting lines of code on which you're writing a review comment. I'm leaving this question open because it's not critical right now; people can just quote in plain text without some smart AP representation. But relevant vocabs for quoting/resource selection are: - <https://www.w3.org/TR/annotation-vocab/#textquoteselector> - <https://www.w3.org/TR/selectors-states>
bill-auger commented 1 month ago
Collaborator
<zPlus> fr33domlover: I think I'm getting confused with the terms here.
What is a MR for you? So far we have an actor that creates a Ticket
with a Patch attachment and other actors can reply to the Ticket.
It seems enough for me that an actor can reply to the Ticket with
a new message containing a fixed Patch.

<fr33domlover> zPlus, we can do that but we lose the patch version
ordering and versions end up being attached to random deeply nested
replies, that's fine but I'd like to have the versions also be
collected in a list

<fr33domlover> I guess a patch and an MR are the same except a MR
specifies a source repo/branch

that looks confusing to me too - a patch is not analogous to a MR - an MR is a ticket - that is analogous to an email thread; but every one i have seen is linear - nesting and ordering should not be a concern - a patch is just a text attachment to a text post, just lke an individual email in the thread

on a git forge, the patch needs to be converted into a git commit before the forge would have any way of representing it - so a patch is more ananlogous a git commit, although it entered the system attached to a comment - that caveat is almost irrelevant, because the forge probably has no way of storing or representing such attachments - if the forge creates a git commit, to represent the patch, that will be dated - the ticket posts with which all patches would have entered the forge as attachments to, are also necessarily ordered in the forge database - so stable ordering and correlation of the patches and their associated comments should be no problem

as for orphaned patches, thats probably no a big concern either - a MR ticket on git forges can relate to multiple commits; but not arbitrarily - it is always a linear "fast-forward" mergable chain of commits - when force pushes happen, the previous commit, and any related comments become irrelevant are essentially orphaned

on other ticketing systems, patches may be disjoint, much like email; but still, newer patches are generally considered to replace previous revisions; and the previous revisions are irrelevant

the source of the patch is completely irrelevant - the only important things to specify are on which checkout point the patch is to be applied and which name and email to attribute as the author

the concern of how to represent that, would be easier to address if there were some resonable expectations of how these patches will actually be created and enter the system - no forge has that ability - no email bridge is planned - so where would these patches originate? - local CLI clients (that dont extst)? - maston toots (i doubt that)?

a properly formatted git send-email specifies the base upon which the patch applies, and the author IIRC - same with DARCS, right? - so it could be specified that implementors must provide that information in data fields - otherwise, if that information is missing, the result would be undefined - forge would need to fallback to some default target branch for non-git patches, and attribute the commit to unknown-email@localhost, or probably best reject the submission as incomplete

``` <zPlus> fr33domlover: I think I'm getting confused with the terms here. What is a MR for you? So far we have an actor that creates a Ticket with a Patch attachment and other actors can reply to the Ticket. It seems enough for me that an actor can reply to the Ticket with a new message containing a fixed Patch. <fr33domlover> zPlus, we can do that but we lose the patch version ordering and versions end up being attached to random deeply nested replies, that's fine but I'd like to have the versions also be collected in a list <fr33domlover> I guess a patch and an MR are the same except a MR specifies a source repo/branch ``` that looks confusing to me too - a patch is not analogous to a MR - an MR is a ticket - that is analogous to an email thread; but every one i have seen is linear - nesting and ordering should not be a concern - a patch is just a text attachment to a text post, just lke an individual email in the thread on a git forge, the patch needs to be converted into a git commit before the forge would have any way of representing it - so a patch is more ananlogous a git commit, although it entered the system attached to a comment - that caveat is almost irrelevant, because the forge probably has no way of storing or representing such attachments - if the forge creates a git commit, to represent the patch, that will be dated - the ticket posts with which all patches would have entered the forge as attachments to, are also necessarily ordered in the forge database - so stable ordering and correlation of the patches and their associated comments should be no problem as for orphaned patches, thats probably no a big concern either - a MR ticket on git forges can relate to multiple commits; but not arbitrarily - it is always a linear "fast-forward" mergable chain of commits - when force pushes happen, the previous commit, and any related comments become irrelevant are essentially orphaned on other ticketing systems, patches may be disjoint, much like email; but still, newer patches are generally considered to replace previous revisions; and the previous revisions are irrelevant the source of the patch is completely irrelevant - the only important things to specify are on which checkout point the patch is to be applied and which name and email to attribute as the author the concern of how to represent that, would be easier to address if there were some resonable expectations of how these patches will actually be created and enter the system - no forge has that ability - no email bridge is planned - so where would these patches originate? - local CLI clients (that dont extst)? - maston toots (i doubt that)? a properly formatted git send-email specifies the base upon which the patch applies, and the author IIRC - same with DARCS, right? - so it could be specified that implementors must provide that information in data fields - otherwise, if that information is missing, the result would be undefined - forge would need to fallback to some default target branch for non-git patches, and attribute the commit to unknown-email@localhost, or probably best reject the submission as incomplete
bill-auger commented 1 month ago
Collaborator

BTW please keep discussions on the discussion forum - the volume of text surrouding forge-fed is massive - it is really hard to follow even if only the feneas forum activity - there are 87 tickets on this issue tracker, i dont think i have read even one of them - plus, as you know, notabug is not accessible via email - that alone makes it a terrible discussion forum IMHO

BTW please keep discussions on the discussion forum - the volume of text surrouding forge-fed is massive - it is really hard to follow even if only the feneas forum activity - there are 87 tickets on this issue tracker, i dont think i have read even one of them - plus, as you know, notabug is not accessible via email - that alone makes it a terrible discussion forum IMHO
fr33domlover commented 1 month ago
Collaborator

that looks confusing to me too - a patch is not analogous to a MR - an MR is a ticket - that is analogous to an email thread; but every one i have seen is linear - nesting and ordering should not be a concern - a patch is just a text attachment to a text post, just lke an individual email in the thread

Email is just a specific protocol that happens to be used for sending patches. And ActivityPub would be another. I'd like to think about patches and MRs on a conceptual level, independent of communication protocol.

A patch, more precisely a series of patches, is a set of named changes proposed against a codebase. Such a patch set is offered to a project, and possibly has some discussion and rounds of review and updates, producing multiple versions of the patch set, until it passes review and gets applied.

That's exactly how a MR works too. The difference is that a MR is a work item that has all kinds of properties and discussion thread(s), and is associated with one or more versions of a patch set, each version having its own discussion and code review. And the patch is specified as a VCS repo branch with commits, rather than patch diff text.

I think it would be really nice to have a model that can represent both patches and MRs. Even if in the UI you "open a MR", the federation protocol may communicate your MR as a patch diff. Or it may just state the repo/branch and let the other server pull the branch. It doesn't matter; both a patch and a MR represent changes proposed for a codebase.

So my proposal is to think of a patch/MR as a work item with a proposal attached, to apply some changes to a codebase. That proposal may specify the changes as a source repo/branch and/or a versioned patch set. That model is general enough both for MRs in complicated forges like GitLab and for more traditional patches sent by email. Thanks to the flexibility and extensibility of AP and AS2, it's very easy to represent this model in AP, with mostly standard existing vocabulary.

on a git forge, the patch needs to be converted into a git commit before the forge would have any way of representing it - so a patch is more ananlogous a git commit, although it entered the system attached to a comment

That's VCS specific. For example, in Darcs, patches are first class objects. A patch bundle sent by email has the same format as the Darcs repo. Darcs repos are patch sets, rather than file tree snapshots like in Git.

I agree a patch is sort of like a commit, kind of like a-commit-not-yet-committed-into-the-repo. But it doesn't have to be converted into a commit to be represented; that depends on the VCS.

the ticket posts with which all patches would have entered the forge as attachments to, are also necessarily ordered in the forge database - so stable ordering and correlation of the patches and their associated comments should be no problem

I agree they'd be ordered in the DB, but I'd like to have a way for the forge to communicate that ordering in the AS2 JSON object. So that other ActivityPub and ForgeFed implementations can see it (instead of having to dig it out of the comments, which is an unnecessary waste and complication, considering the ordering is known and we are free to use an expressive vocabulary and not just the limited send-text-with-attachment model of mailing lists).

the concern of how to represent that, would be easier to address if there were some resonable expectations of how these patches will actually be created and enter the system - no forge has that ability - no email bridge is planned - so where would these patches originate? - local CLI clients (that dont extst)? - maston toots (i doubt that)?

a properly formatted git send-email specifies the base upon which the patch applies, and the author IIRC - same with DARCS, right? - so it could be specified that implementors must provide that information in data fields - otherwise, if that information is missing, the result would be undefined - forge would need to fallback to some default target branch for non-git patches, and attribute the commit to unknown-email@localhost, or probably best reject the submission as incomplete

Good points. I have an updated proposal for representing patches, that's for a separate issue though. The point for this issue is that each patch version gets its own code review, its own set of comments. Both in email patches and in web MRs.

BTW please keep discussions on the discussion forum - the volume of text surrouding forge-fed is massive - it is really hard to follow even if only the feneas forum activity - there are 87 tickets on this issue tracker, i dont think i have read even one of them - plus, as you know, notabug is not accessible via email - that alone makes it a terrible discussion forum IMHO

Bill, we all carefully picked Gogs together for NotABug.org, remember? I know it's not perfect, but what can I do about it now? We may migrate away from here but for now this is where we are.

I often open forum threads for issues here. You can follow this repo, and if you see an issue without a forum thread and you want to comment, feel free to open a forum thread. You can even do that via email without browsing Discourse via web (check out the pinned ForgeFed topics in forum, they specify how to open threads via email).

> that looks confusing to me too - a patch is not analogous to a MR - an MR is a ticket - that is analogous to an email thread; but every one i have seen is linear - nesting and ordering should not be a concern - a patch is just a text attachment to a text post, just lke an individual email in the thread Email is just a specific protocol that happens to be used for sending patches. And ActivityPub would be another. I'd like to think about patches and MRs on a conceptual level, independent of communication protocol. A patch, more precisely a series of patches, is a set of named changes proposed against a codebase. Such a patch set is offered to a project, and possibly has some discussion and rounds of review and updates, producing multiple versions of the patch set, until it passes review and gets applied. That's exactly how a MR works too. The difference is that a MR is a work item that has all kinds of properties and discussion thread(s), and is associated with one or more versions of a patch set, each version having its own discussion and code review. And the patch is specified as a VCS repo branch with commits, rather than patch diff text. I think it would be really nice to have a model that can represent both patches and MRs. Even if in the UI you "open a MR", the federation protocol may communicate your MR as a patch diff. Or it may just state the repo/branch and let the other server pull the branch. It doesn't matter; both a patch and a MR represent changes proposed for a codebase. So my proposal is to think of a patch/MR as a work item with a proposal attached, to apply some changes to a codebase. That proposal may specify the changes as a source repo/branch and/or a versioned patch set. That model is general enough both for MRs in complicated forges like GitLab and for more traditional patches sent by email. Thanks to the flexibility and extensibility of AP and AS2, it's very easy to represent this model in AP, with mostly standard existing vocabulary. > on a git forge, the patch needs to be converted into a git commit before the forge would have any way of representing it - so a patch is more ananlogous a git commit, although it entered the system attached to a comment That's VCS specific. For example, in Darcs, patches are first class objects. A patch bundle sent by email has the same format as the Darcs repo. Darcs repos are patch sets, rather than file tree snapshots like in Git. I agree a patch is sort of like a commit, kind of like a-commit-not-yet-committed-into-the-repo. But it doesn't have to be converted into a commit to be represented; that depends on the VCS. > the ticket posts with which all patches would have entered the forge as attachments to, are also necessarily ordered in the forge database - so stable ordering and correlation of the patches and their associated comments should be no problem I agree they'd be ordered in the DB, but I'd like to have a way for the forge to *communicate* that ordering in the AS2 JSON object. So that other ActivityPub and ForgeFed implementations can see it (instead of having to dig it out of the comments, which is an unnecessary waste and complication, considering the ordering is known and we are free to use an expressive vocabulary and not just the limited send-text-with-attachment model of mailing lists). > the concern of how to represent that, would be easier to address if there were some resonable expectations of how these patches will actually be created and enter the system - no forge has that ability - no email bridge is planned - so where would these patches originate? - local CLI clients (that dont extst)? - maston toots (i doubt that)? > a properly formatted git send-email specifies the base upon which the patch applies, and the author IIRC - same with DARCS, right? - so it could be specified that implementors must provide that information in data fields - otherwise, if that information is missing, the result would be undefined - forge would need to fallback to some default target branch for non-git patches, and attribute the commit to unknown-email@localhost, or probably best reject the submission as incomplete Good points. I have an updated proposal for representing patches, that's for a separate issue though. The point for this issue is that each patch version gets its own code review, its own set of comments. Both in email patches and in web MRs. > BTW please keep discussions on the discussion forum - the volume of text surrouding forge-fed is massive - it is really hard to follow even if only the feneas forum activity - there are 87 tickets on this issue tracker, i dont think i have read even one of them - plus, as you know, notabug is not accessible via email - that alone makes it a terrible discussion forum IMHO Bill, we all carefully picked Gogs together for NotABug.org, remember? I know it's not perfect, but what can I do about it now? We may migrate away from here but for now this is where we are. I often open forum threads for issues here. You can follow this repo, and if you see an issue without a forum thread and you want to comment, feel free to open a forum thread. You can even do that via email without browsing Discourse via web (check out the pinned ForgeFed topics in forum, they specify how to open threads via email).
fr33domlover commented 4 weeks ago
Collaborator

This is waiting for #88, see there. Once that issue is closed, proceed here.

This is waiting for #88, see there. Once that issue is closed, proceed here.
Sign in to join this conversation.
Loading...
Cancel
Save
There is no content yet.