[RFC pre-GLEP] Gentoo Git Workflow

[RFC pre-GLEP] Gentoo Git Workflow

Michał Górny-5
Hi, everyone.

There have been multiple attempts at grasping this but none so far
resulted in something official and indisputable. At the same time, we
end up having to point our users at semi-official guides which change
in unpredictable ways.

Here's the current draft:
https://wiki.gentoo.org/wiki/User:MGorny/GLEP:Git

The basic idea is that the GLEP provides basic guidelines for using git,
and then we write a proper manual on top of it (right now, all the pages
about it end up as a mix of requirements and a partial git manual).

What do you think about it? Is there anything else that needs to be
covered?

Copy of the markup for inline comments follows.

---

{{GLEP
|Number=xx
|Title=Gentoo Git Workflow
|Type=Standards Track
|Status=Draft
|Author=Michał Górny <[hidden email]>
}}

==Abstract==
This GLEP specifies basic standards and recommendations for using git
with the Gentoo ebuild repository. It covers only Gentoo-specific
policies, and is not meant to be a complete guide.

==Motivation==
Although the main Gentoo repository has been using git for two years
already, developers still lack official documentation on how to use git
consistently. Most developers learn the unwritten standards from others
and follow them. This eventually brings consistency to some extent but
is suboptimal. Furthermore, it results in users having to learn things
the hard way instead of having proper documentation to follow.

There have been a few attempts to standardize git use over time. The
most noteworthy are the [[Gentoo git workflow]] and [[Gentoo GitHub]]
articles. However, they are not official standards of any kind, and
their focus is too broad for them to become one. There was also an
initial GLEP attempt but it never even reached the draft stage.

This GLEP aims to finally provide basic standardization for the use of
git in the Gentoo repository. It focuses purely on Gentoo-specific
standards and not on git usage in general. It is not meant to be a
complete guide but a formal basis on top of which official guides could
be created.

==Specification==
===Branching model===
The main branch of the Gentoo repository is the <kbd>master</kbd>
branch. All Gentoo developers push their work straight to the master
branch, provided that the commits meet the minimal quality standards.
The master branch is also used directly for continuous user repository
deployment.

Since multiple developers work on master concurrently, they may be
required to rebase multiple times before being able to push. Developers
are requested not to use workflows that could prevent others from
pushing, e.g. pushing single commits frequently instead of staging them
and using a single push.

Developers can use additional branches to facilitate review and testing
of long-term projects of larger scale. However, since git fetches all
branches by default, they should be used sparingly. For smaller projects,
local branches or repository forks are preferred.

Unless stated otherwise, the rules set by this specification apply to
the master branch only. The development branches can use relaxed rules.

Rewriting history (i.e. force pushes) of the master branch is forbidden.

===Merge commits===
The use of merge commits in the Gentoo repository is strongly
discouraged. Usually it is preferable to rebase instead. However, the
developers are allowed to use merge commits in justified cases. Merge
commits can only be used to merge additional branches; the use of
implicit <kbd>git pull</kbd> merges is entirely forbidden.

In a merge commit that is committed straight to the Gentoo repository,
the first parent is expected to reference an actual Gentoo commit
preceding the merge, while the remaining parents can be used to
reference external repositories. The commits following the first parent
are required to conform to this specification just like regular Gentoo
commits. The additional commits following other parents can use relaxed
rules.
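
The first-parent rule can be illustrated with a minimal sketch. It assumes a toy history stored as a plain dictionary (the commit ids and the helper names are hypothetical), and it reads "the commits following the first parent" as the first-parent chain starting at the merge, which is one possible interpretation of the text above. On a real repository, <kbd>git log --first-parent</kbd> walks the same chain.

```python
from typing import Dict, List, Set

# Hypothetical toy history: commit id -> parent ids, first parent listed first.
HISTORY: Dict[str, List[str]] = {
    "g1": [],            # Gentoo commit
    "g2": ["g1"],        # Gentoo commit
    "u1": ["g1"],        # external contribution branching off g1
    "u2": ["u1"],        # external contribution
    "m": ["g2", "u2"],   # merge commit: first parent is Gentoo history
}


def first_parent_chain(history: Dict[str, List[str]], tip: str) -> List[str]:
    """Return the commits reached by always following the first parent."""
    chain = []
    current = tip
    while current is not None:
        chain.append(current)
        parents = history[current]
        current = parents[0] if parents else None
    return chain


def relaxed_commits(history: Dict[str, List[str]], tip: str) -> Set[str]:
    """Return commits reachable from tip that are not on the first-parent chain."""
    strict = set(first_parent_chain(history, tip))
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(history[commit])
    return seen - strict


if __name__ == "__main__":
    print("full rules:", first_parent_chain(HISTORY, "m"))
    print("relaxed rules:", sorted(relaxed_commits(HISTORY, "m")))
```

In this toy history the merge and the two Gentoo commits fall under the full rules, while the two external commits would be candidates for the relaxed ones.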

===OpenPGP signatures===
Each commit in the Gentoo repository must be signed using the
committer's OpenPGP key. Furthermore, each push to the repository must
be signed using the key belonging to the developer performing the push
(matched via the SSH key).

The requirements for OpenPGP keys are covered by [[GLEP:63|GLEP 63]].

===Splitting commits===
Git commits are lightweight, and the developers are encouraged to split
their commits to improve readability and the ability to revert
specific sub-changes. When choosing how to split the commits, the
developers should consider the following three rules:
# Use atomic commits — one commit per logical change.
# Split commits at logical unit (package, eclass, profile…) boundaries.
# Avoid creating commits that are 'broken' — e.g. are incomplete, have
uninstallable packages.

It is technically impossible to always respect all of the three rules,
so developers have to balance between them at their own discretion. Side
changes that are implied by another change (e.g. revbump due to some
change) should be included in the first commit requiring them. Commits
should be ordered to avoid breakage, and follow logical ordering
whenever possible.

Examples:
* When doing a version bump, it is usually not reasonable to split every
necessary logical change into a separate commit since the interim commits
would correspond to a broken package. However, if the package has a live
ebuild, it ''might'' be reasonable to perform the split logical changes on
the live ebuild, then create a release as another logical step.
* When doing one or more changes that require a revision bump, bump the
revision in the commit including the first change. Split the changes
into multiple logical commits without further revision bumps — since
they are going to be pushed in a single push, the user will not be
exposed to the interim state.
* When adding a new version of a package that should be masked, you can
include the {{Path|package.mask}} edit in the commit adding it.
Alternatively, you can add the mask in a split commit ''preceding'' the
bump.
* When doing a minor change to a large number of packages, it is
reasonable to do so in a single commit. However, when doing a major
change (e.g. a version bump), it is better to split commits on package
boundaries.

===Commit messages===
A standard git commit message consists of three parts, in order: a
summary line, an optional body and an optional set of tags. The parts
are separated by a single empty line.

The summary line is included in the short logs
(<kbd>git log --oneline</kbd>, gitweb, GitHub, mail subject) and therefore should
provide a short yet accurate description of the change. The summary line
starts with a logical unit name, followed by a colon, a space and a
short description of the most important changes. If a bug is associated
with a change, then it should be included in the summary line as
<kbd>#nnnnnn</kbd> or likewise. The summary line must not exceed 69
characters, and must not be wrapped.

The suggested logical unit name formats are:
* for a package, <kbd>category/package: …</kbd>;
* for an eclass, <kbd>name.eclass: …</kbd>;
* for other directories or files, their path or filename (as long as a
developer reading the commit messages is able to figure out what it is)
— e.g. <kbd>licenses/foo: …</kbd>, <kbd>package.mask: …</kbd>.
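
The summary line rules above can be sketched as a small check. This is an illustration only: the function name and the regular expression are assumptions, not an existing tool, and the pattern treats any colon-free, whitespace-free prefix as a logical unit name.

```python
import re
from typing import List

MAX_SUMMARY_LENGTH = 69  # limit stated in this specification

# Assumed pattern: a unit name without whitespace or colons,
# followed by a colon, a space and the start of the description.
UNIT_PREFIX = re.compile(r"^[^\s:]+: \S")


def check_summary(summary: str) -> List[str]:
    """Return a list of problems found in a commit summary line."""
    problems = []
    if "\n" in summary:
        problems.append("summary must be a single, unwrapped line")
    if len(summary) > MAX_SUMMARY_LENGTH:
        problems.append(
            f"summary is {len(summary)} characters, "
            f"limit is {MAX_SUMMARY_LENGTH}"
        )
    if not UNIT_PREFIX.match(summary):
        problems.append("summary should start with 'unit: description'")
    return problems


if __name__ == "__main__":
    for line in ("dev-libs/foo: bump to 1.2.3, #123456", "fix stuff"):
        print(repr(line), "->", check_summary(line) or "ok")
```

An empty list means that none of the three checks fired; it does not say anything about whether the description itself is accurate.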

The body is included in the full commit log (<kbd>git log</kbd>,
detailed commit info on gitweb/GitHub, mail body). It is optional, and
it can be used to describe the commit in more detail if the summary line
is not sufficient. It is generally a good idea to repeat the information
contained in the summary (except for the logical unit) since the summary
is frequently formatted as a title. The body should be wrapped at 72
characters. It can contain multiple paragraphs, separated by empty
lines.

The tag part is included in the full commit log as an extension to the
body. It consists of one or more lines, each consisting of a key, followed
by a colon and a space, followed by a value. Git does not enforce any
standardization of the keys, and the tag format is ''not'' meant for
machine processing.

A few tags of common use are:
* user-related tags:
** <kbd>Acked-by: Full Name <[hidden email]></kbd> — commit approved
by another person (usually without detailed review),
** <kbd>Reported-by: Full Name <[hidden email]></kbd>,
** <kbd>Reviewed-by: Full Name <[hidden email]></kbd> — usually
indicates full review,
** <kbd>Signed-off-by: Full Name <[hidden email]></kbd> — DCO
approval (not used in Gentoo right now),
** <kbd>Suggested-by: Full Name <[hidden email]></kbd>, 
** <kbd>Tested-by: Full Name <[hidden email]></kbd>.
* commit-related tags:
** <kbd>Fixes: commit-id (commit message)</kbd> — to indicate fixing a
previous commit,
** <kbd>Reverts: commit-id (commit message)</kbd> — to indicate
reverting a previous commit,
* bug tracker-related tags:
** <kbd>Bug: <nowiki>https://bugs.gentoo.org/NNNNNN</nowiki></kbd> — to
reference a bug,
** <kbd>Closes: <nowiki>https://github.com/gentoo/gentoo/pull/NNNN</nowiki></kbd> —
to automatically close a GitHub pull request,
** <kbd>Fixes: <nowiki>https://bugs.gentoo.org/NNNNNN</nowiki></kbd> —
to indicate a fixed bug,
* package manager tags:
** <kbd>Package-Manager: …</kbd> — used by repoman to indicate Portage
version,
** <kbd>RepoMan-Options: …</kbd> — used by repoman to indicate repoman
options.

The bug tracker-related tags can be used to extend the body message.
However, they should be skipped if the bug number is already provided in
the summary and there is no explicit body.
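
The three-part layout described in this section (summary, optional body, optional tags, separated by empty lines) can be sketched as a small parser. This is a heuristic under stated assumptions: the names are hypothetical, and the last paragraph is treated as the tag part only if every one of its lines looks like <kbd>Key: value</kbd>, so a body paragraph made up solely of such lines would be misclassified.

```python
import re
from typing import List, NamedTuple, Tuple

# Assumed tag syntax: key, colon, space, value (as described above).
TAG_LINE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*): (.+)$")


class CommitMessage(NamedTuple):
    summary: str
    body: str
    tags: List[Tuple[str, str]]


def parse_message(message: str) -> CommitMessage:
    """Split a commit message into summary, body and tags."""
    blocks = message.strip("\n").split("\n\n")
    summary, rest = blocks[0], blocks[1:]
    tags: List[Tuple[str, str]] = []
    if rest:
        last_lines = rest[-1].split("\n")
        matches = [TAG_LINE.match(line) for line in last_lines]
        if all(matches):
            tags = [(m.group(1), m.group(2)) for m in matches]
            rest = rest[:-1]
    return CommitMessage(summary, "\n\n".join(rest), tags)


def overlong_body_lines(body: str, width: int = 72) -> List[str]:
    """Return body lines longer than the wrapping width."""
    return [line for line in body.split("\n") if len(line) > width]


if __name__ == "__main__":
    example = (
        "dev-libs/foo: fix build with GCC 7\n"
        "\n"
        "Fix the build failure with GCC 7 by adding a missing include.\n"
        "\n"
        "Bug: https://bugs.gentoo.org/123456\n"
        "Reviewed-by: Full Name <dev@example.org>\n"
    )
    print(parse_message(example))
```

The package name, bug number and e-mail address in the example are placeholders.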

==Rationale==
===Branching model===
The model of multiple developers pushing concurrently to the repository
containing all packages is preserved from CVS. The developers have
discussed the possibility of using other models, in particular of using
multiple branches for developers that are afterwards automatically
merged into the master branch. However, it was determined that there is
no need to use a more complex model at the moment and the potential
problems with them outweighed the benefits.

The necessity of rebasing is a natural consequence of concurrent work,
along with the ban on reverse merge commits. Since rebasing a number of
commits can take a few seconds or even more, another developer sometimes
pushes during that time, forcing another rebase.

In the past, there were cases of developers using automated scripts
which created single commits, ran repoman and pushed them straight to
the repository. This resulted in pushes from a single developer every
10-15 seconds which made it impossible for other developers to rebase
larger commit batches. This kind of workflow is therefore strongly
discouraged.

Creating multiple short-lived branches is discouraged as it implies
additional transfer for users cloning the repository and additional
maintenance burden. Since the git migration, the developers have created
a few branches on the repository, and did not maintain them. The Infra
had to query the developers about the state of the branches and clean
them up. Keeping branches local or hosting them outside Gentoo Infra
(e.g. on GitHub) reduces the burden on our users, even if the developers
do not clean up after themselves.

===Merge commits===
Merge commits have been debated multiple times in various media, in
particular IRC. They have very vocal opponents whose main argument is
that they make history unreadable. At the same time, it has been
frequently pointed out that merge commits have valid use cases. To
satisfy both groups, this specification strongly discourages merge
commits but allows their use in justified cases.

Most importantly, the implicit merge commits created by <kbd>git
pull</kbd> are forbidden. Those merges have no real value or justified
use case, and since they are created implicitly by default there have
been historical cases where developers pushed them unintentionally. They
are banned explicitly to emphasize to developers the necessity of
adjusting their git configuration.

When processing merge commits, it is important to explicitly distinguish
the parent that represents 'real' Gentoo history from the one(s) that
represent external branches. The former can either be an existing Gentoo
commit or a commit that the developer has prepared (on top of existing
Gentoo history) before merging the branch. For this reason, it is
important to enforce the full set of Gentoo policies on this parent and
the commits preceding it. On the other hand, the external branches can
be treated similarly to development branches. Relaxing the rules for
external branches also makes it possible to merge user contributions
with original user OpenPGP signatures, while adding a final developer
signature on top of the merge commit.

When using <kbd>git merge ''foo''</kbd>, the first parent represents the
current <kbd>HEAD</kbd> and the second one the merged branch. This is
the model used by the specification.

===OpenPGP signatures===
The signature requirements strictly correspond to the git setup deployed
by the Infrastructure team.

The commit signatures provide an ability to verify the authenticity of
all commits throughout the Gentoo repository history (to the point of
git conversion). The push signatures mostly serve the purpose of
additional authentication for the developer pushing a specific set of
commits.

===Splitting commits===
The goal of the commit splitting rules is to make the best use of git
while avoiding enforcing too much overhead on the developer and
optimizing to avoid interim broken commits.

Splitting commits by logical changes improves the readability and makes
it easier to revert a specific change while preserving the remaining
(irrelevant) changes. The changes done by a developer are easier to
comprehend when the reviewer can follow them in the specific order done
by the author, rather than combined with other changes.

Splitting commits on logical unit boundaries has been done since CVS times.
Mostly it improves readability via making it possible to include the
unit (package, eclass…) name in the commit message — so that developers
perceive what specific packages are affected by the change without
having to look into diffstat.

Requiring commits to be non-'broken' is meant to preserve a good quality
git history of the repository. This means that the users can check out an
interim commit without risking a major problem such as a missing
dependency that is being added by the commit following it. It also makes
it safer to revert the most recent changes with reduced risk of exposing
a breakage.

Those rules partially overlap, and if that is the case, the developers
are expected to use common sense to determine the course of action that
gives the best result. Furthermore, requiring the strict following of
the rules would mean a lot of additional work for developers and a lot
of additional commits for no real benefit.

The examples are provided to make it possible for the developers to get
a 'feeling' for how to work with the rules.

===Commit messages===
The basic commit message format is similar to the one used by other
projects, and provides for reasonably predictable display of results.

The summary line is meant to provide a good concise summary of the
changes. It is included in the short logs, and should include all the
information needed to help a developer determine whether they are
interested in looking into the commit details. Including the logical
unit name accounts for the fact that most of the Gentoo commits are
specific to those units (e.g. packages). The length limit is meant to
avoid wrapping the shortlog, which could result in unreadable
<kbd>git log --oneline</kbd> output or an ugly mid-word ellipsis on GitHub.

The body is meant to provide the detailed information for a commit. It
is usually displayed verbatim, and the use of paragraphs along with line
wrapping is meant to improve readability. The body should include the
information contained in the summary since the two are sometimes really
disjoint, and expecting the user to read the body as a continuation of the
summary is confusing. For example, in <kbd>git send-email</kbd>, the
summary line is used to construct the mail's summary and is therefore
disjoint from the body.

The tag section is a traditional way of expressing quasi-machine-
readable data. However, the commit messages are not really suited for
machine use and only a few tags are actually processed by scripts. The
specification tries to provide a concise set of potentially useful tags
collected from various projects (the Linux kernel, X.org). Those tags
can be used interchangeably with plaintext explanation in the body.

The only tag defined by git itself is the <kbd>Signed-off-by</kbd> line,
that is created by <kbd>git commit -s</kbd>. However, Gentoo does not
currently enforce a DCO consistently, and therefore it is meaningless.

The only tag subject to machine processing is the <kbd>Closes</kbd> line
that is used by GitHub to automatically close pull requests (and issues
— however, Gentoo does not use GitHub's issue tracker).

All the remaining tags serve purely as a user convenience.

Historically, Gentoo has been using a few tags starting with
<kbd>X-</kbd>. However, this practice was abandoned once it was pointed
out that git does not enforce any standard set of tags, and therefore
indicating non-standard tags is meaningless.

Gentoo developers are still frequently using the <kbd>Gentoo-Bug</kbd> tag,
sometimes followed by <kbd>Gentoo-Bug-URL</kbd>. Using both
simultaneously is meaningless (they are redundant), and using the former
has no advantages over using the classic <kbd>#nnnnnn</kbd> form in the
summary or the body.

==Backwards Compatibility==
Most of the new policy will apply to the commits following its approval.
Backwards compatibility is not relevant there.

One particular point that affects commits retroactively is the OpenPGP
signing. However, it has been an obligatory requirement enforced by the
infrastructure since the git switch. Therefore, all the git history
conforms to that.

==Reference implementation==
All of the elements requiring explicit implementation on the git
infrastructure are implemented already. In particular this includes:
* blocking force pushes on the <kbd>master</kbd> branch,
* requiring signed commits on the <kbd>master</kbd> branch,
* requiring signed pushes to the repository.

The remaining elements are either non-obligatory or non-enforceable at
infrastructure level.

RepoMan has suggested starting the commit message with the package name
since commit
[https://gitweb.gentoo.org/proj/portage.git/commit/?id=46dafadff58da0220511f20480b73ad09f913430 46dafadff58da0220511f20480b73ad09f913430].

==Acknowledgements==
Most of the foundations for this specification were laid out by
[[User:Hasufell|Julian Ospald (hasufell)]] in his initial version of the
[[Gentoo git workflow]] article.

==Copyright==

This work is licensed under the Creative Commons Attribution-ShareAlike
3.0 Unported License. To view a copy of this license, visit
http://creativecommons.org/licenses/by-sa/3.0/.

--
Best regards,
Michał Górny


Re: [RFC pre-GLEP] Gentoo Git Workflow

Nicolas Bock-2
On Tue, Jul 25, 2017 at 10:05:06AM +0200, Michał Górny wrote:

>Hi, everyone.
>
>There have been multiple attempts at grasping this but none so far
>resulted in something official and indisputable. At the same time, we
>end having to point our users at semi-official guides which change
>in unpredictable ways.
>
>Here's the current draft:
>https://wiki.gentoo.org/wiki/User:MGorny/GLEP:Git
>
>The basic idea is that the GLEP provides basic guidelines for using git,
>and then we write a proper manual on top of it (right now, all the pages
>about it end up as a mix of requirements and a partial git manual).
>
>What do you think about it? Is there anything else that needs being
>covered?
I like it. +1

>Copy of the markup for inline comments follows.


--
Nicolas Bock <[hidden email]>


Re: [RFC pre-GLEP] Gentoo Git Workflow

Dirkjan Ochtman-3
In reply to this post by Michał Górny-5
On Tue, Jul 25, 2017 at 10:05 AM, Michał Górny <[hidden email]> wrote:
> What do you think about it? Is there anything else that needs being
> covered?
 
Looks good to me. Thanks for writing it up!

Cheers,

Dirkjan

Re: [RFC pre-GLEP] Gentoo Git Workflow

Tobias Klausmann-4
In reply to this post by Michał Górny-5
Hi!

On Tue, 25 Jul 2017, Michał Górny wrote:
> The summary line is included in the short logs (<kbd>git log --
> oneline</kbd>, gitweb, GitHub, mail subject) and therefore should
> provide a short yet accurate description of the change. The summary line
> starts with a logical unit name, followed by a colon, a space and a
> short description of the most important changes. If a bug is associated
> with a change, then it should be included in the summary line as
> <kbd>#nnnnnn</kbd> or likewise. The summary line must not exceed 69
> characters, and must not be wrapped.

This limit can be a problem if there's a nontrivial change to one of the
more than 80 packages in the tree that have more than forty characters in
cat/pkg[0]. Is the only option there to do word-smithing or
to make the commit summary less useful?

Or do we have a "violate if necessary" agreement regarding that?


Regards,
Tobias

[0]
$ cd /usr/portage
$ ls -d *-*/*|awk '{if (length>=40) {print length, $0}}'|sort -n
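
A minimal sketch of the arithmetic involved, assuming the 69-character limit from the draft, the "unit: description" layout, and a bug reference appended after a single space (the function name and the placeholder package name are hypothetical):

```python
from typing import Optional

MAX_SUMMARY_LENGTH = 69  # limit quoted from the draft


def description_budget(unit: str, bug: Optional[str] = None) -> int:
    """Return how many characters remain for the description."""
    used = len(unit) + len(": ")
    if bug is not None:
        used += len(" ") + len(bug)  # assumed separator before the bug reference
    return MAX_SUMMARY_LENGTH - used


if __name__ == "__main__":
    unit = "x" * 40  # stands in for a 40-character cat/pkg
    print(description_budget(unit))             # 27
    print(description_budget(unit, "#123456"))  # 19
```

So a 40-character cat/pkg leaves 27 characters for the description, or 19 once a six-digit bug number is included.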


--
Sent from aboard the Culture ship
        GSV Of Course I Still Love You


Re: [RFC pre-GLEP] Gentoo Git Workflow

Michael Orlitzky
In reply to this post by Michał Górny-5
On 07/25/2017 04:05 AM, Michał Górny wrote:
>
> Here's the current draft:
> https://wiki.gentoo.org/wiki/User:MGorny/GLEP:Git
>

It's mostly fine, but there are two changes I disagree with:

> When doing one or more changes that require a revision bump, bump the
> revision in the commit including the first change. Split the changes
> into multiple logical commits without further revision bumps — since
> they are going to be pushed in a single push, the user will not be
> exposed to interim state.

We shouldn't play games in the repo and hope that everything works out
if we wait to push until just the right time. We're not going to run out
of numbers -- it's simpler and more correct to do a new revision with
each commit.


> Gentoo developers are still frequently using Gentoo-Bug tag,
> sometimes followed by Gentoo-Bug-URL. Using both simultaneously is
> meaningless (they are redundant), and using the former has no
> advantages over using the classic #nnnnnn form in the summary or the
> body.

There are two main advantages over having the bug number in the summary.
Space is at a premium in the summary, as Tobias pointed out, and the

  Gentoo-Bug: whatever

format is trivially machine-readable, whereas sticking it somewhere else
is less so.

And just a reminder -- Gokturk worked to get a lot of this stuff into
the devmanual, e.g.

  https://devmanual.gentoo.org/ebuild-maintenance/index.html

Some of that is important, like the warning not to use "bug #x" in the
body of the commit message.



Re: [RFC pre-GLEP] Gentoo Git Workflow

Michał Górny-5
In reply to this post by Tobias Klausmann-4
On 25 July 2017 12:59:21 CEST, Tobias Klausmann <[hidden email]> wrote:

>Hi!
>
>On Tue, 25 Jul 2017, Michał Górny wrote:
>> The summary line is included in the short logs (<kbd>git log --
>> oneline</kbd>, gitweb, GitHub, mail subject) and therefore should
>> provide a short yet accurate description of the change. The summary
>line
>> starts with a logical unit name, followed by a colon, a space and a
>> short description of the most important changes. If a bug is
>associated
>> with a change, then it should be included in the summary line as
>> <kbd>#nnnnnn</kbd> or likewise. The summary line must not exceed 69
>> characters, and must not be wrapped.
>
>This limit can be a problem if there's a nontrivial change to the
>more than 80 packages in the tree that have more than forty characters
>in
>cat/pkg[0]. Is the only option there to do word-smithing or
>making the commit summary less usefu?
>
>Or do we have a "violate if necessary" agreement regarding that?

Yeah, I meant to apply the "must not" to wrapping but "should not" to length. Though I suggest ellipsizing the package name, if it is unambiguous enough.

The problem is that if you exceed the length, the summary will usually be cut one way or another anyway.

>
>
>Regards,
>Tobias
>
>[0]
>$ cd /usr/portage
>$ ls -d *-*/*|awk '{if (length>=40) {print length, $0}}'|sort -n


--
Best regards,
Michał Górny (by phone)


Re: [RFC pre-GLEP] Gentoo Git Workflow

Jonas Stein
In reply to this post by Michał Górny-5
Hi everyone,

> Here's the current draft:
> https://wiki.gentoo.org/wiki/User:MGorny/GLEP:Git
>
> The basic idea is that the GLEP provides basic guidelines for using git,
> and then we write a proper manual on top of it (right now, all the pages
> about it end up as a mix of requirements and a partial git manual).
>
> What do you think about it? Is there anything else that needs being
> covered?

Thank you, Michał, for preparing an official guide from all the scattered
information. I think it is important for Gentoo to have such a GLEP.

I think we should not tie GLEPs to companies, but keep them more
abstract. The GLEP should still be valid if we no longer use GitHub.
Many large repositories have shut down in recent years. Even
after years we have not fixed all ebuilds [1]. We must be prepared to
lose GitHub very suddenly and should not hope that it will end with an
announcement years in advance.


Hence, I suggest writing the GLEP without naming "github" a single time.

[1] https://wiki.gentoo.org/wiki/Upstream_repository_shutdowns

--
Best,
Jonas


Re: [RFC pre-GLEP] Gentoo Git Workflow

Michał Górny-5
In reply to this post by Michael Orlitzky
On 25 July 2017 13:25:38 CEST, Michael Orlitzky <[hidden email]> wrote:

>On 07/25/2017 04:05 AM, Michał Górny wrote:
>>
>> Here's the current draft:
>> https://wiki.gentoo.org/wiki/User:MGorny/GLEP:Git
>>
>
>It's mostly fine, but there are two changes I disagree with:
>
>> When doing one or more changes that require a revision bump, bump the
>> revision in the commit including the first change. Split the changes
>> into multiple logical commits without further revision bumps — since
>> they are going to be pushed in a single push, the user will not be
>> exposed to interim state.
>
>We shouldn't play games in the repo and hope that everything works out
>if we wait to push until just the right time. We're not going to run
>out
>of numbers -- it's simpler and more correct to do a new revision with
>each commit.

I have no clue what you mean. I'm just saying that if you push 10 changes in 10 commits, you don't have to go straight to -r10 in a single push.

>
>
>> Gentoo developers are still frequently using Gentoo-Bug tag,
>> sometimes followed by Gentoo-Bug-URL. Using both simultaneously is
>> meaningless (they are redundant), and using the former has no
>> advantages over using the classic #nnnnnn form in the summary or the
>> body.
>
>There are two main advantages over having the bug number in the
>summary.
>Space is at a premium in the summary, as Tobias pointed out, and the
>
>  Gentoo-Bug: whatever
>
>format is trivially machine-readable, whereas sticking it somewhere
>else
>is less so.

Except that there are no machines using it. In all contexts, using a full URL for machine readability is better as it works with all software out of the box.

>
>And just a reminder -- Gokturk worked to get a lot of this stuff into
>the devmanual, e.g.
>
>  https://devmanual.gentoo.org/ebuild-maintenance/index.html
>
>Some of that is important, like the warning not to use "bug #x" in the
>body of the commit message.


--
Best regards,
Michał Górny (by phone)


Re: [RFC pre-GLEP] Gentoo Git Workflow

Michael Orlitzky
On 07/25/2017 07:52 AM, Michał Górny wrote:
>
> I have no clue what you mean. I'm just saying that if you push 10
> changes in 10 commits, you don't have to go straight to -r10 in a
> single push.
>

Exactly. Do that instead of hoping that no one checks out your
intermediate commits. There's no limit to the number of revisions we can
have, and trying to keep track of when it's safe to push in your head is
asking for trouble.


Re: [RFC pre-GLEP] Gentoo Git Workflow

Michael Palimaka
In reply to this post by Michał Górny-5
On 07/25/2017 06:05 PM, Michał Górny wrote:
> Hi, everyone.
>
> There have been multiple attempts at grasping this but none so far
> resulted in something official and indisputable. At the same time, we
> end having to point our users at semi-official guides which change
> in unpredictable ways.
>
> Here's the current draft:
> https://wiki.gentoo.org/wiki/User:MGorny/GLEP:Git

This looks really nice, thanks for working on it.

> * When doing a minor change to a large number of packages, it is
> reasonable to do so in a single commit. However, when doing a major
> change (e.g. a version bump), it is better to split commits on package
> boundaries.

In some cases we do prefer to make major changes on a set of related
packages all in one commit. For example, we always bump the 240+ KDE
Applications collection together because that's how it's released.

> ===Commit messages===
> A standard git commit message consists of three parts, in order: a
> summary line, an optional body and an optional set of tags. The parts
> are separated by a single empty line.
>
> The summary line is included in the short logs (<kbd>git log --
> oneline</kbd>, gitweb, GitHub, mail subject) and therefore should
> provide a short yet accurate description of the change. The summary line
> starts with a logical unit name, followed by a colon, a space and a
> short description of the most important changes. If a bug is associated
> with a change, then it should be included in the summary line as
> <kbd>#nnnnnn</kbd> or likewise. The summary line must not exceed 69
> characters, and must not be wrapped.

Does a bug # really need to always be in the summary line? It can eat
valuable characters, and tags, which are pretty popular, are equally valid IMO.

> ** <kbd>Bug: <nowiki>https://bugs.gentoo.org/NNNNNN</nowiki></kbd>; — to
> reference a bug,
> ** <kbd>Closes: <nowiki>https://github.com/gentoo/gentoo/pull/NNNN</nowi
> ki></kbd>; — to automatically close a GitHub pull request,
> ** <kbd>Fixes: <nowiki>https://bugs.gentoo.org/NNNNNN</nowiki></kbd>; —
> to indicate a fixed bug,

grepping the git log shows that 'Gentoo-bug' is much more common than
plain 'Bug'. 'Fixes' is hardly used at all, and I think it's a bit
confusing to use this for bugs as well as commits.

> a few branches on the repository, and did not maintain them. The Infra
> had to query the developers about the state of the branches and clean
> them up.

Should 'The Infra' be 'The Infra team' or just 'Infra'?

> Gentoo developers are still frequently using <kbd>Gentoo-Bug</kbd> tag,
> sometimes followed by <kbd>Gentoo-Bug-URL</kbd>. Using both
> simultaneously is meaningless (they are redundant), and using the former
> has no advantages over using the classic <kbd>#nnnnnn</kbd> form in the
> summary or the body.

I agree that using both is redundant, but I don't agree with
discouraging or banning the use of 'Gentoo-bug'. If someone prefers to
use it so it sits nicely with the other tags why stop them?

Re: [RFC pre-GLEP] Gentoo Git Workflow

Joshua Kinard-2
In reply to this post by Michał Górny-5
On 07/25/2017 04:05, Michał Górny wrote:

> Hi, everyone.
>
> There have been multiple attempts at grasping this but none so far
> resulted in something official and indisputable. At the same time, we
> end up having to point our users at semi-official guides which change
> in unpredictable ways.
>
> Here's the current draft:
> https://wiki.gentoo.org/wiki/User:MGorny/GLEP:Git
>
> The basic idea is that the GLEP provides basic guidelines for using git,
> and then we write a proper manual on top of it (right now, all the pages
> about it end up as a mix of requirements and a partial git manual).
>
> What do you think about it? Is there anything else that needs to be
> covered?
>
> Copy of the markup for inline comments follows.

I haven't seen it mentioned yet, but will this GLEP update or replace this
existing Wiki article on using git w/ Gentoo?:

https://wiki.gentoo.org/wiki/Gentoo_git_workflow

Some of the step-by-step bits in the above Wiki page look like good candidates
to be integrated into the GLEP.  It also contains guidelines on writing commit
messages, such as limiting the first line to ~50 characters, an optional body
wrapped at 75 chars/line, and including the usual git tags for sign-off and
such.  Though, I like the explicitness of the GLEP's text on a few things more.

--
Joshua Kinard
Gentoo/MIPS
[hidden email]
6144R/F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic

Re: [RFC pre-GLEP] Gentoo Git Workflow

Michał Górny-5
In reply to this post by Michael Orlitzky
On wto, 2017-07-25 at 08:26 -0400, Michael Orlitzky wrote:

> On 07/25/2017 07:52 AM, Michał Górny wrote:
> >
> > I have no clue what you mean. I'm just saying that if you push 10
> > changes in 10 commits, you don't have to go straight to -r10 in a
> > single push.
> >
>
> Exactly. Do that instead of hoping that no one checks out your
> intermediate commits. There's no limit to the number of revisions we can
> have, and trying to keep track of when it's safe to push in your head is
> asking for trouble.
>
How is that relevant? Revision bumps are merely a tool to encourage
'automatic' rebuilds of packages during @world upgrade. I can't think of
a single use case where somebody would actually think it sane to
check out one commit after another, and run @world upgrade in the middle
of it.

--
Best regards,
Michał Górny

Re: [RFC pre-GLEP] Gentoo Git Workflow

Rich Freeman
In reply to this post by Michał Górny-5
On Tue, Jul 25, 2017 at 7:52 AM, Michał Górny <[hidden email]> wrote:
>
> Except that there are no machines using it. In all contexts, using a full URL for machine readability is better as it works with all software out of the box.
>

Until the domain name of the bugzilla server changes/etc.  Even if we
migrated all the old bugs the URLs would break.  That might be an
argument for not having a full URL.

There would also be less variation.  Bug: 123456 is pretty unambiguous
as a reference.  When you start having http vs https and maybe a few
different ways of creating a URL to a bug it could get messier.

That said, I really don't have a strong opinion on this.

--
Rich

Re: [RFC pre-GLEP] Gentoo Git Workflow

Michał Górny-5
In reply to this post by Michael Palimaka
On wto, 2017-07-25 at 22:28 +1000, Michael Palimaka wrote:

> On 07/25/2017 06:05 PM, Michał Górny wrote:
> > Hi, everyone.
> >
> > There have been multiple attempts at grasping this but none so far
> > resulted in something official and indisputable. At the same time, we
> > end up having to point our users at semi-official guides which change
> > in unpredictable ways.
> >
> > Here's the current draft:
> > https://wiki.gentoo.org/wiki/User:MGorny/GLEP:Git
>
> This looks really nice, thanks for working on it.
>
> > * When doing a minor change to a large number of packages, it is
> > reasonable to do so in a single commit. However, when doing a major
> > change (e.g. a version bump), it is better to split commits on package
> > boundaries.
>
> In some cases we do prefer to make major changes on a set of related
> package all in one commit. For example, we always bump the 240+ KDE
> Applications collection together because that's how it's released.

It's merely a recommendation. I don't want to cover every single use
case because that would be insane. I'm already worried I've covered too
many cases for people to read it all.

> > ===Commit messages===
> > A standard git commit message consists of three parts, in order: a
> > summary line, an optional body and an optional set of tags. The parts
> > are separated by a single empty line.
> >
> > The summary line is included in the short logs
> > (<kbd>git log --oneline</kbd>, gitweb, GitHub, mail subject) and therefore should
> > provide a short yet accurate description of the change. The summary line
> > starts with a logical unit name, followed by a colon, a space and a
> > short description of the most important changes. If a bug is associated
> > with a change, then it should be included in the summary line as
> > <kbd>#nnnnnn</kbd> or likewise. The summary line must not exceed 69
> > characters, and must not be wrapped.
>
> Does a bug # really need to always be in the summary line? It can eat
> valuable characters, and tags, which are pretty popular, are equally valid IMO.

Tags don't appear in 'git log --oneline' or the cgit/gitweb shortlog. If you
are going through multiple bugs, it is more convenient if you can find
the bug number straight away.

> > ** <kbd>Bug: <nowiki>https://bugs.gentoo.org/NNNNNN</nowiki></kbd> — to
> > reference a bug,
> > ** <kbd>Closes: <nowiki>https://github.com/gentoo/gentoo/pull/NNNN</nowiki></kbd>
> > — to automatically close a GitHub pull request,
> > ** <kbd>Fixes: <nowiki>https://bugs.gentoo.org/NNNNNN</nowiki></kbd> —
> > to indicate a fixed bug,
>
> Grepping the git log shows that 'Gentoo-bug' is much more common than
> plain 'Bug'. 'Fixes' is hardly used at all, and I think it's a bit
> confusing to use this for bugs as well as commits.

'Fixes' is the original tag used by other projects. 'Bug' is shorter
than 'Gentoo-bug' and avoids repeating the obvious. We do not have
'Gentoo-signed-off-by', 'Gentoo-thanks-to' and so on, and having
'Gentoo-bug' is equally silly.

Furthermore, full URLs should be used with tags. If you are already
using tags (i.e. the long form), don't do it half-way and put useless
digits there. Put in a URL that will be interpreted by practically every
visual git tool ever written.
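
To illustrate the point about tags carrying full URLs, here is a rough
Python sketch that pulls the Bug/Closes/Fixes tags out of a commit message
and flags values that are not URLs. The tag names come from the draft; the
function names, the "last paragraph is the tag block" rule and the example
bug and pull request numbers are assumptions made for this sketch only, not
part of the GLEP or of any existing Gentoo tool.

```python
import re

# Tag names taken from the draft; matching only these three is an
# assumption of this sketch.
TAG_RE = re.compile(r"^(?P<tag>Bug|Closes|Fixes): (?P<value>\S+)$")


def parse_tags(message):
    """Return (tag, value) pairs found in the last paragraph of a message."""
    paragraphs = message.strip().split("\n\n")
    if len(paragraphs) < 2:
        return []  # summary line only, so there is no tag block
    tags = []
    for line in paragraphs[-1].splitlines():
        match = TAG_RE.match(line)
        if match:
            tags.append((match.group("tag"), match.group("value")))
    return tags


def non_url_tags(tags):
    """Return the tags whose value is not an http(s) URL."""
    return [(tag, value) for tag, value in tags
            if not value.startswith(("http://", "https://"))]


if __name__ == "__main__":
    # Made-up package name, bug number and pull request number.
    example = (
        "dev-libs/foo: Fix build with GCC 7\n"
        "\n"
        "Bug: https://bugs.gentoo.org/123456\n"
        "Closes: https://github.com/gentoo/gentoo/pull/1234\n"
    )
    print(parse_tags(example))
    print(non_url_tags([("Bug", "123456")]))
```

A bare number such as "Bug: 123456" would be reported by non_url_tags(),
which is the half-way form argued against above.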

> > a few branches on the repository, and did not maintain them. The Infra
> > had to query the developers about the state of the branches and clean
> > them up.
>
> Should 'The Infra' be 'The Infra team' or just 'Infra'?

Yes, thanks.

>
> > Gentoo developers are still frequently using <kbd>Gentoo-Bug</kbd> tag,
> > sometimes followed by <kbd>Gentoo-Bug-URL</kbd>. Using both
> > simultaneously is meaningless (they are redundant), and using the former
> > has no advantages over using the classic <kbd>#nnnnnn</kbd> form in the
> > summary or the body.
>
> I agree that using both is redundant, but I don't agree with
> discouraging or banning the use of 'Gentoo-bug'. If someone prefers to
> use it so it sits nicely with the other tags, why stop them?

I'm not stopping anyone. This is merely a suggestion. Encouraging two
different tags for the same thing would be confusing to users.

--
Best regards,
Michał Górny

Re: [RFC pre-GLEP] Gentoo Git Workflow

Michał Górny-5
In reply to this post by Joshua Kinard-2
On wto, 2017-07-25 at 08:54 -0400, Joshua Kinard wrote:

> On 07/25/2017 04:05, Michał Górny wrote:
> > Hi, everyone.
> >
> > There have been multiple attempts at grasping this but none so far
> > resulted in something official and indisputable. At the same time, we
> > end up having to point our users at semi-official guides which change
> > in unpredictable ways.
> >
> > Here's the current draft:
> > https://wiki.gentoo.org/wiki/User:MGorny/GLEP:Git
> >
> > The basic idea is that the GLEP provides basic guidelines for using git,
> > and then we write a proper manual on top of it (right now, all the pages
> > about it end up as a mix of requirements and a partial git manual).
> >
> > What do you think about it? Is there anything else that needs to be
> > covered?
> >
> > Copy of the markup for inline comments follows.
>
> I haven't seen it mentioned yet, but will this GLEP update or replace this
> existing Wiki article on using git w/ Gentoo?:
>
> https://wiki.gentoo.org/wiki/Gentoo_git_workflow

We will probably remove it in favor of a proper devmanual section.
Proxy-maint already stopped using it because there's too much noise
there.

> Some of the step-by-step bits in the above Wiki page look like good candidates
> to be integrated into the GLEP.

Could you be more specific?

>   It also contains guidelines on writing commit
> messages, such as limiting the first line to ~50 characters, an optional body
> wrapped at 75 chars/line, and including the usual git tags for sign-off and
> such.  Though, I like the explicitness of the GLEP's text on a few things more.

There is a large section on commit messages in the GLEP, though it uses
69 as the technical limit for the summary line, since ~50 is realistically
hard to achieve for Gentoo.
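
As a rough illustration of the summary line rules quoted in this thread
(logical unit name, colon, space, short description, at most 69 characters,
not wrapped), a check could be sketched in Python as below. The function
name, the regular expression and the example summary are assumptions of
this sketch, not something the GLEP or repoman defines.

```python
import re

MAX_SUMMARY_LEN = 69  # limit stated in the GLEP draft

# Hypothetical pattern for "<logical unit>: <description>".
SUMMARY_RE = re.compile(r"^(?P<unit>[^\s:][^:]*): (?P<desc>\S.*)$")


def check_summary(line):
    """Return a list of problems found in a commit summary line."""
    problems = []
    if "\n" in line:
        problems.append("summary must not be wrapped")
    if len(line) > MAX_SUMMARY_LEN:
        problems.append(
            f"summary is {len(line)} characters, limit is {MAX_SUMMARY_LEN}"
        )
    if not SUMMARY_RE.match(line):
        problems.append("summary should look like '<unit>: <description>'")
    return problems


if __name__ == "__main__":
    # Made-up package name and bug number.
    print(check_summary("dev-libs/foo: Bump to 1.2.3, #123456"))
    print(check_summary("Bump foo"))
```

An empty list means the line passed all three checks; the length is counted
in characters rather than bytes.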

--
Best regards,
Michał Górny

Re: [RFC pre-GLEP] Gentoo Git Workflow

Michał Górny-5
In reply to this post by Rich Freeman
On wto, 2017-07-25 at 09:26 -0400, Rich Freeman wrote:
> On Tue, Jul 25, 2017 at 7:52 AM, Michał Górny <[hidden email]> wrote:
> >
> > Except that there are no machines using it. In all contexts, using a full URL for machine readability is better as it works with all software out of the box.
> >
>
> Until the domain name of the bugzilla server changes/etc.  Even if we
> migrated all the old bugs the URLs would break.  That might be an
> argument for not having a full URL.

This is a very stupid argument. If we ever break bug URLs, commit
messages are the *least* of our concerns.

> There would also be less variation.  Bug: 123456 is pretty unambiguous
> as a reference.  When you start having http vs https and maybe a few
> different ways of creating a URL to a bug it could get messier.

Except that 123456 could refer to any bugtracker anywhere. No reasonable
tool will do anything with that number since it's ambiguous by
definition.

And if I were to use stupid arguments, I should point out that if we
ever have a review platform, the numbers would suddenly become
ambiguous -- is it Bugzilla or the review platform?
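
The ambiguity can be made concrete with a small Python sketch that splits
a bug reference into a tracker host and a bug number. It assumes only two
URL shapes, the short https://host/NNNNNN form used in the draft and the
Bugzilla show_bug.cgi?id=NNNNNN form; the function name and the example
number are made up for illustration.

```python
from urllib.parse import parse_qs, urlparse


def bug_reference(value):
    """Split a bug reference into (tracker_host, bug_id).

    The host is None when the reference is a bare number, because a bare
    number does not say which tracker it belongs to.
    """
    if value.isdigit():
        return (None, value)
    url = urlparse(value)
    if url.scheme not in ("http", "https") or not url.hostname:
        raise ValueError(f"not a bug reference: {value!r}")
    # Bugzilla long form: /show_bug.cgi?id=NNNNNN
    ids = parse_qs(url.query).get("id")
    if ids:
        bug_id = ids[0]
    else:
        # Short form: the bug number is the last path component.
        bug_id = url.path.rstrip("/").rsplit("/", 1)[-1]
    if not bug_id.isdigit():
        raise ValueError(f"no bug number in: {value!r}")
    return (url.hostname, bug_id)


if __name__ == "__main__":
    print(bug_reference("123456"))
    print(bug_reference("http://bugs.gentoo.org/123456"))
    print(bug_reference("https://bugs.gentoo.org/show_bug.cgi?id=123456"))
```

The http and https variants collapse to the same (host, number) pair, while
the bare number comes back without a host, so a tool has nothing to link to.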

--
Best regards,
Michał Górny

Re: [RFC pre-GLEP] Gentoo Git Workflow

Michael Orlitzky
In reply to this post by Michał Górny-5
On 07/25/2017 09:23 AM, Michał Górny wrote:
>
> How is that relevant? Revision bumps are merely a tool to encourage
> 'automatic' rebuilds of packages during @world upgrade. I can't think of
> a single use case where somebody would actually think it sane to
> check out one commit after another, and run @world upgrade in the middle
> of it.
>

Revisions are to indicate that one incarnation of a package differs from
another in a way that the user or package manager might care about. And
on principle, it's no business of yours what people want to do with
their tree. If someone wants to check out successive commits and emerge
@world, he's within his rights to do so.

This is relevant because your proposed policy,

  * presumes to know how people will use the tree, and places arbitrary
    restrictions on them

  * can cause problems if those assumptions don't hold

  * requires developers to think about when it's safe to push (Did I
    push those changes last night? Do I need another revision?)

  * and is more complicated than the safe solution, anyway

Here's my proposal regarding revisions:

  If you make a commit that requires a revision, make a revision.

If you wind up with an -r15 in the tree, who cares? It's simpler, safer,
and less to think about.

Re: [RFC pre-GLEP] Gentoo Git Workflow

Mike Gilbert-2
On Tue, Jul 25, 2017 at 12:12 PM, Michael Orlitzky <[hidden email]> wrote:

> On 07/25/2017 09:23 AM, Michał Górny wrote:
>>
>> How is that relevant? Revision bumps are merely a tool to encourage
>> 'automatic' rebuilds of packages during @world upgrade. I can't think of
>> a single use case where somebody would actually think it sane to
>> check out one commit after another, and run @world upgrade in the middle
>> of it.
>>
>
> Revisions are to indicate that one incarnation of a package differs from
> another in a way that the user or package manager might care about. And
> on principle, it's no business of yours what people want to do with
> their tree. If someone wants to check out successive commits and emerge
> @world, he's within his rights to do so.

I don't feel I should be obligated by policy to support this use case.
One revbump per push seems sufficiently safe for 99.9% of users.

If you want to do more revbumps, you are free to do so.

Re: [RFC pre-GLEP] Gentoo Git Workflow

Michael Orlitzky
On 07/25/2017 04:29 PM, Mike Gilbert wrote:
>
> I don't feel I should be obligated by policy to support this use case.
> One revbump per push seems sufficiently safe for 99.9% of users.
>
> If you want to do more revbumps, you are free to do so.
>

Can I also delete packages and break the tree so long as I put
everything back before I push?


Re: [RFC pre-GLEP] Gentoo Git Workflow

Michał Górny-5
On wto, 2017-07-25 at 16:31 -0400, Michael Orlitzky wrote:

> On 07/25/2017 04:29 PM, Mike Gilbert wrote:
> >
> > I don't feel I should be obligated by policy to support this use case.
> > One revbump per push seems sufficiently safe for 99.9% of users.
> >
> > If you want to do more revbumps, you are free to do so.
> >
>
> Can I also delete packages and break the tree so long as I put
> everything back before I push?

That is not the same, and you know that. Plus, there's a major
difference between not doing unnecessary work and purposely doing
something awful just to prove a point.

--
Best regards,
Michał Górny
