Stabilisation procedure

Michael Palimaka
I've prepared a document that tries to consolidate the various bits of
information regarding the stabilisation procedure into one document /
policy. Please review.

The live version currently lives at
https://wiki.gentoo.org/wiki/User:Kensington/Stabilisation_procedure and
I've included the raw text below for convenience.




This article describes the procedure for moving an ebuild from testing
to stable.

== Responsibility ==

The primary purpose of the stabilisation process is to integrate a
testing ebuild into the stable tree. This can involve maintaining the
consistency of the dependency graph, basic compatibility checks with
other packages, and smoke testing of the package itself.

Stabilisation is not intended to relieve a package maintainer of their
responsibility to ship a quality package - the primary responsibility of
ensuring that a package is a good stable candidate remains with the
person approving the stabilisation request. The stabilisation process
does not include more than basic functionality checks unless explicitly
requested.

== Configuration ==

It is preferred that testing take place on a real system, inside a
chroot, or within another type of non-virtualised container.
Virtualisation may be acceptable in situations where it is not possible
or practical to test on real hardware.

The testing system must only have stable packages installed, with no
testing packages keyworded or unmasked. It should be up-to-date, and it
is recommended to have as few packages installed as possible.
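
One way to sanity-check the "stable only" requirement is to look at the keywords the system accepts. The following is a minimal sketch, not an official tool: the helper name <code>has_testing_keywords</code> is made up for illustration, and it only inspects the global keyword list, not per-package entries under {{path|/etc/portage/package.accept_keywords}} or {{path|/etc/portage/package.unmask}}, which would still need checking by hand.

```shell
# has_testing_keywords KEYWORD...
# Hypothetical helper: succeeds (exit 0) if any given keyword starts
# with "~", i.e. is a testing keyword such as ~amd64.
has_testing_keywords() {
    for kw in "$@"; do
        case "$kw" in
            "~"*) return 0 ;;
        esac
    done
    return 1
}

# Possible usage on a Gentoo system (assumes portageq is available):
#   if has_testing_keywords $(portageq envvar ACCEPT_KEYWORDS); then
#       echo "testing keywords are accepted globally" >&2
#   fi
```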

{{path|make.conf}} should have settings similar to the following:
{{FileBox|filename=/etc/portage/make.conf|lang=bash|1=
# CFLAGS should be reasonable
CFLAGS="-march=native -O2 -pipe -frecord-gcc-switches"
CXXFLAGS="${CFLAGS}"

# Defining all four *FLAGS variables enables a portage QA check reporting
# when these variables are not respected
FFLAGS="${CFLAGS}"
FCFLAGS="${CFLAGS}"

# Enables a portage QA check to report when LDFLAGS is not respected
LDFLAGS="${LDFLAGS} -Wl,--hash-style=gnu"

# collision-protect - prevent a package from overwriting files it does
#   not own
# ipc-sandbox - prevent host IPC access (requires Linux and namespace
#   support in kernel)
# network-sandbox - prevent network access during merge (requires Linux
#   and network namespace support in kernel)
# sandbox - ensure package does not write directly to live system
# split-log - store build logs in category subdirectories
# split-elog - store logs created by PORTAGE_ELOG_SYSTEM="save" in
#   category subdirectories
# strict - have portage react strongly to conditions that have the
#   potential to be dangerous
# test - run package tests, or alternatively test-fail-continue
# userfetch - drop privileges during fetching
# userpriv - drop privileges during merge
# usersandbox - enable sandbox when userpriv is enabled
FEATURES="collision-protect ipc-sandbox network-sandbox sandbox
split-log split-elog strict test userfetch userpriv usersandbox"

# Display selected types of messages again when emerge exits, and save
# them to disk
PORTAGE_ELOG_CLASSES="log warn error qa"
PORTAGE_ELOG_SYSTEM="echo save"
}}
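
To confirm that a test system actually has these features active, something like the following could be used. It is a rough sketch under stated assumptions: <code>WANTED_FEATURES</code> and <code>missing_features</code> are names invented here, the list is copied from the example above, and the check does not know that <code>test-fail-continue</code> is an accepted alternative to <code>test</code>.

```shell
# Features named in the make.conf example above.
WANTED_FEATURES="collision-protect ipc-sandbox network-sandbox sandbox split-log split-elog strict test userfetch userpriv usersandbox"

# missing_features "ACTIVE FEATURES"
# Hypothetical helper: prints each wanted feature that does not appear
# in the space-separated string passed as the first argument.
missing_features() {
    for f in $WANTED_FEATURES; do
        case " $1 " in
            *" $f "*) ;;                 # feature is active
            *) printf '%s\n' "$f" ;;     # feature is missing
        esac
    done
}

# Possible usage on a Gentoo system (assumes portageq is available):
#   missing_features "$(portageq envvar FEATURES)"
```

An empty result would suggest that every feature from the example is enabled.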

== Testing ==

Each package in Gentoo is different and therefore requires a slightly
different approach to stabilisation. Consider the following guidelines
for each class of package, and use common sense when in doubt.

=== General ===

==== USE flags ====

While it is preferable to test every USE flag combination, this is not
always possible or appropriate. The package may have a large number of
USE flags, a long compile time, or the stabilisation in question may
just not call for it.

In cases where not every USE flag combination is being tested, it is
still recommended to test:
* with all USE flags enabled
* with all USE flags disabled
* the default USE flag settings
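
The first two configurations can be derived from a package's IUSE. The sketch below is only an illustration: the function names are invented, the package atom in the usage comment is a placeholder, and it deliberately ignores REQUIRED_USE constraints (such as mutually exclusive flags) as well as masked or forced flags, so the "all enabled" string may need manual adjustment.

```shell
# strip_defaults IUSE_ENTRY...
# Removes the leading "+" or "-" default marker from each IUSE entry
# and prints one bare flag name per line.
strip_defaults() {
    for flag in "$@"; do
        flag=${flag#+}
        flag=${flag#-}
        printf '%s\n' "$flag"
    done
}

# use_all_enabled IUSE_ENTRY...  -> "a b c"
use_all_enabled() {
    strip_defaults "$@" | tr '\n' ' ' | sed 's/ $//'
}

# use_all_disabled IUSE_ENTRY... -> "-a -b -c"
use_all_disabled() {
    strip_defaults "$@" | sed 's/^/-/' | tr '\n' ' ' | sed 's/ $//'
}

# Possible usage (placeholder atom; assumes portageq is available):
#   iuse=$(portageq metadata / ebuild category/package-1.2.3 IUSE)
#   USE="$(use_all_enabled $iuse)"  emerge --oneshot =category/package-1.2.3
#   USE="$(use_all_disabled $iuse)" emerge --oneshot =category/package-1.2.3
#   emerge --oneshot =category/package-1.2.3    # default USE settings
```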

==== Runtime testing ====

Consider the level of runtime testing that is required for the target
package. Remember, the focus of stabilisation is to integrate a testing
ebuild into the stable tree and not to identify routine bugs or
regressions - that is the purpose of the package's waiting time in ~arch.

The level of runtime testing required will vary wildly based on a
variety of factors. Consider the following examples:

* Multiple days of "normal use" testing may be appropriate for a new
version of {{package|sys-libs/glibc}}
* Basic functionality testing, such as browsing some web pages, may make
sense for a new version of {{package|www-client/firefox}}
* Passing tests might be enough for {{package|dev-python/yenc}}
* A leaf package such as {{package|kde-apps/kcalc}} may not require any
runtime testing at all

=== Libraries ===

A new library version may introduce incompatibilities with reverse
dependencies. Where there's a risk of such breakage, each stable reverse
dependency must be rebuilt. Beware of reverse dependencies that only use
the library conditionally (e.g. <code>USE="png"</code>).
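
As a rough sketch of the rebuild step, the helper below turns a list of reverse dependencies into a single rebuild command. The name <code>rebuild_command</code> is hypothetical, and how the list is produced is left open: a tool such as <code>equery depends</code> from gentoolkit can report reverse dependencies, but its output would need trimming to plain atoms first, and conditional dependencies only show their effect when the relevant USE flag is enabled.

```shell
# rebuild_command
# Hypothetical helper: reads package atoms on stdin (one per line),
# drops blank lines and duplicates, and prints one emerge command that
# would rebuild them. Nothing is executed.
rebuild_command() {
    atoms=$(grep -v '^[[:space:]]*$' | sort -u | tr '\n' ' ' | sed 's/ $//')
    if [ -n "$atoms" ]; then
        printf 'emerge --oneshot %s\n' "$atoms"
    fi
}

# Possible usage, with revdeps.txt being a hand-prepared list of atoms:
#   rebuild_command < revdeps.txt
```

Printing the command rather than running it leaves room to review the list before committing to a long rebuild.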

=== Kernel ===

Kernel packages referenced in the handbook have certain exemptions from
the usual stabilisation policy, so stabilisation requests are normally
only filed for the first version in a long term stable branch
(subsequent versions can be stabilised at the discretion of the maintainer).

First, test all available kernel options:

{{Cmd|cd /usr/src/example-sources-1.2.3
|make allyesconfig
|make # add '-j' as appropriate for your hardware}}

If that succeeds, build with your normal configuration:

{{Cmd|make distclean
|make menuconfig
|make
|make modules_install # if you use modules}}

After reboot, check <code>dmesg</code> for anything strange and use the
system as normal, trying to get a bit of uptime.
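
What counts as "strange" is a judgment call, but a simple filter can help narrow down a long log. This is a minimal sketch: the helper name and the pattern list are assumptions chosen for illustration, not an exhaustive or official set of markers.

```shell
# suspicious_kernel_lines
# Hypothetical helper: reads kernel log text on stdin and keeps lines
# containing markers that often indicate trouble. Always exits 0, even
# when nothing matches.
suspicious_kernel_lines() {
    grep -Ei 'oops|BUG:|call trace|segfault|WARNING:|kernel panic' || true
}

# Possible usage after rebooting into the new kernel:
#   dmesg | suspicious_kernel_lines
```

An empty result does not prove the kernel is healthy; it only means none of the listed markers appeared.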

If stabilising a special feature variant such as
{{package|sys-kernel/hardened-sources}}, try to test those features.

=== Toolchain ===

New versions of toolchain packages can often introduce major changes and
widespread breakage into the tree. The purpose of a stabilisation
request for a toolchain package is to test the package itself on each
architecture - not to detect build failures in miscellaneous packages.
It is expected that such failures are managed and resolved by the
maintainer (normally through tracker bugs and tinderboxing) prior to
filing a stabilisation request.

Once the normal testing is successful, rebuild <code>@system</code> (or
<code>@world</code> if the hardware permits) and once successful,
observe the system in normal operation for abnormalities.

== QA violations ==

Most of these violations will be detected automatically using the
testing tool, but are also described here for completeness.

* Does not respect CC
* Does not respect CFLAGS
* Does not respect LDFLAGS
* Bundled symbols
* Insecure symbols
* Installs documentation outside of /usr/share/doc/${PF}
* ELF files found in /usr/share
* ...
* ...

== Architecture-specific notes ==

A number of items described in earlier sections, such as checking of
reverse dependencies and miscellaneous QA checks, are
architecture-neutral. At a stabilisation level, the primary
responsibility for carrying out these checks rests on the first
architecture to stabilise an ebuild. Subsequent architectures may assume
that these checks have been completed and skip them if they wish.

=== amd64 ===
* Any developer may perform {{keyword|amd64}} stabilisations - it is not
necessary to be on the arch team
* <code>multilib-strict</code> must be added to <code>FEATURES</code>
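
A possible way to do this without rewriting the existing list is to append to it in {{path|make.conf}}. The fragment below is a sketch that relies on <code>FEATURES</code> being an incremental variable in Portage; it is meant to sit after the main <code>FEATURES</code> assignment.

```shell
# /etc/portage/make.conf (fragment)
# Append multilib-strict to whatever FEATURES already contains.
FEATURES="${FEATURES} multilib-strict"
```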

=== arm ===
The [[Project:ARM|ARM project]] supports four {{keyword|arm}} variants -
armv4, armv5, armv6, and armv7. In addition to regular testing, the
package must be build tested on each variant. If access to each physical
variant is not possible, <code>CFLAGS="-march=$arch"</code> is acceptable.
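
The per-variant build test could be scripted roughly as follows. This sketch takes the variant names literally from the paragraph above and reuses <code>-O2 -pipe</code> from the earlier make.conf example; the function name is invented, and the exact <code>-march</code> spellings a given GCC version accepts may differ (for instance, more specific names such as armv5te or armv7-a), so treat the values as placeholders to verify against the compiler in use.

```shell
# arm_variant_cflags
# Hypothetical helper: prints one CFLAGS assignment per ARM variant
# listed in the text above. Nothing is built.
arm_variant_cflags() {
    for arch in armv4 armv5 armv6 armv7; do
        printf 'CFLAGS="-march=%s -O2 -pipe"\n' "$arch"
    done
}

# Possible usage: run one build of the target package per printed line,
# with that CFLAGS value set in the environment or in make.conf.
```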

=== x86 ===
* Any developer may perform {{keyword|x86}} stabilisations - it is not
necessary to be on the arch team
* It is acceptable to stabilise in an {{keyword|x86}}
[[Project:AMD64/32-bit Chroot Guide|chroot]] on {{keyword|amd64}}
* It is generally acceptable to stabilise a package with only a build
test on {{keyword|x86}} if it is already stable on {{keyword|amd64}}

== Acknowledgements ==
Most of this guide was shamelessly stolen from many sources, including but
not limited to:
* Agostino Sarubbo
* Various arch teams

Re: Stabilisation procedure

Rich Freeman
On Thu, Nov 17, 2016 at 2:16 AM, Michael Palimaka <[hidden email]> wrote:
>
> In cases where all USE flags combinations are not being tested, it is
> still recommended to test:
> * with all USE flags enabled
> * with all USE flags disabled
> * the default USE flag settings
>

I imagine that in practice only the last of these really tends to get tested.

> * A leaf package such as {{package|kde-apps/kcalc}} may not require any
> runtime testing at all

I'm not really a big fan of this, but if we REALLY didn't want to do
any runtime testing on a package then we should have some way to tag
the package as such, and then have some way to do automated
build-test-only stabilization.  If you aren't doing runtime testing
then manual stabilization adds zero value.

Overall though the writeup was good and maybe it will trigger some
discussion.  I tend to think that if we want to do things like testing
permutations and such then automated build-only tools might be the way
to address this.  Manual effort should be focused on things like
runtime testing where it adds the most value.  This also strikes me as
the sort of thing that could probably be assigned out to volunteers
who do not have commit access.

It really seems like the sort of thing that could be managed by
something other than bugzilla.  Some tool finds out about packages
that ought to be stabilized (probably via multiple methods), then it
triggers the automated build tests/etc that do a lot of the low-level
QA, and if the package looks good it gets queued for runtime testing.
Then volunteers report in on status and when whatever criteria we
establish are met then the tool stabilizes the package, probably in a
dependency-aware fashion.  Obviously this would require some care for
coordinated packages like xorg/DEs/etc, and it might not be the
preferred approach for many system packages.

--
Rich

Re: Stabilisation procedure

Michael Palimaka
On 17/11/16 20:16, Rich Freeman wrote:

> On Thu, Nov 17, 2016 at 2:16 AM, Michael Palimaka <[hidden email]> wrote:
>>
>> In cases where all USE flags combinations are not being tested, it is
>> still recommended to test:
>> * with all USE flags enabled
>> * with all USE flags disabled
>> * the default USE flag settings
>>
>
> I imagine that in practice only the last of these really tends to get tested.
>
>> * A leaf package such as {{package|kde-apps/kcalc}} may not require any
>> runtime testing at all
>
> I'm not really a big fan of this, but if we REALLY didn't want to do
> any runtime testing on a package then we should have some way to tag
> the package as such, and then have some way to do automated
> build-test-only stabilization.  If you aren't doing runtime testing
> then manual stabilization adds zero value.

How much value do you think we gain from runtime testing a package like
kcalc as part of the stabilisation process, considering that it already
sat in ~arch for at least 30 days?

Also, based on conversations with various arch team members, my
understanding is that a lot of stabilisations that happen right now are
already build-only.

>
> Overall though the writeup was good and maybe it will trigger some
> discussion.  I tend to think that if we want to do things like testing
> permutations and such then automated build-only tools might be the way
> to address this.  Manual effort should be focused on things like
> runtime testing where it adds the most value.  This also strikes me as
> the sort of thing that could probably be assigned out to volunteers
> who do not have commit access.

There are a few tools for this out there already. I've personally been
working to update app-portage/tatt for git - see
https://asciinema.org/a/cqsy983t9jimszvypcjr3zg5m for a demo.

>
> It really seems like the sort of thing that could be managed by
> something other than bugzilla.  Some tool finds out about packages
> that ought to be stabilized (probably via multiple methods), then it
> triggers the automated build tests/etc that do a lot of the low-level
> QA, and if the package looks good it gets queued for runtime testing.
> Then volunteers report in on status and when whatever criteria we
> establish is met then the tool stabilizes the package, probably in a
> dependency-aware fashion.  Obviously this would require some care for
> coordinated packages like xorg/DEs/etc, and it might not be the
> preferred approach for many system packages.
>

We're working on a tool like this in #gentoo-grumpy - new contributors
welcome!


Re: Stabilisation procedure

Rich Freeman
On Thu, Nov 17, 2016 at 4:37 AM, Michael Palimaka <[hidden email]> wrote:

> On 17/11/16 20:16, Rich Freeman wrote:
>> On Thu, Nov 17, 2016 at 2:16 AM, Michael Palimaka <[hidden email]> wrote:
>>> * A leaf package such as {{package|kde-apps/kcalc}} may not require any
>>> runtime testing at all
>>
>> I'm not really a big fan of this, but if we REALLY didn't want to do
>> any runtime testing on a package then we should have some way to tag
>> the package as such, and then have some way to do automated
>> build-test-only stabilization.  If you aren't doing runtime testing
>> then manual stabilization adds zero value.
>
> How much value do you think we gain from runtime testing a package like
> kcalc as part of the stabilisation process, considering that it already
> sat in ~arch for at least 30 days?

We ensure that it actually runs at all with non-~arch dependencies?

The 30 days spent in ~arch tells you very little about whether the
package works with stable dependencies, since only those running mixed
keywords would be testing that.

>
> Also, based on conversations with various arch team members, my
> understanding is that a lot of stabilisations that happen right now are
> already build-only.
>

Certainly this isn't the documented process and it is the first I've
heard of this.

I think one of two things makes sense:

1.  Manual runtime testing followed by stabilization.
2.  Automated build testing followed by stabilization, with no human involved.

What doesn't make sense is manual build testing.  The person is adding
zero value.

>>
>> Overall though the writeup was good and maybe it will trigger some
>> discussion.  I tend to think that if we want to do things like testing
>> permutations and such then automated build-only tools might be the way
>> to address this.  Manual effort should be focused on things like
>> runtime testing where it adds the most value.  This also strikes me as
>> the sort of thing that could probably be assigned out to volunteers
>> who do not have commit access.
>
> There's a few tools for this out there already. I've personally been
> working to update app-portage/tatt for git - see
> https://asciinema.org/a/cqsy983t9jimszvypcjr3zg5m for a demo.
>

Assuming we decide we don't care about runtime testing (which I'm not
sure I'm a fan of), it sounds like the only things this is missing are:
1.  Running as a service on Gentoo infra without any person at the keyboard.
2.  Automatically monitoring the bug queue for anything that can be
stabilized, taking into account blockers/dependencies/etc.
3.  Posting failures to the bug, and then removing that bug from the
queue until somebody marks it as ready to go back in.
4.  On success, updating the bug (including closing it if this is the
last arch) and doing the commit using an infra account.
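
The decision in steps 3 and 4 could be sketched like this. It is purely illustrative: the function name, its arguments and the action labels are all made up, and no real bug tracker or repository is touched.

```shell
# next_action RESULT REMAINING_ARCHES
# Hypothetical sketch. RESULT is "pass" or "fail"; REMAINING_ARCHES is
# the number of arches still pending after this one. Prints the steps
# such a service would take for the bug.
next_action() {
    if [ "$1" != "pass" ]; then
        echo "post-failure dequeue"     # step 3
    elif [ "$2" -eq 0 ]; then
        echo "commit close-bug"         # step 4, last arch
    else
        echo "commit update-bug"        # step 4, other arches remain
    fi
}
```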

However, I'd be interested in metrics on failures discovered in
runtime testing and so on, or missed with a lack of runtime testing.
I'll admit that a lot of runtime testing tends to be fairly shallow,
but I do think there is something to be said for doing some kind of
runtime testing.

I think we need to think about why we actually have a stable branch.
Does it offer any value if all we do is build testing, when I'm sure
the maintainers at least build their packages in ~arch before they
commit them?

--
Rich

Re: Stabilisation procedure

Michael Orlitzky
In reply to this post by Michael Palimaka
On 11/17/2016 02:16 AM, Michael Palimaka wrote:
>
> # strict - have portage react strongly to conditions that have the
> potential to be dangerous
> ...
> FEATURES="collision-protect ipc-sandbox network-sandbox sandbox
> split-log split-elog strict test userfetch userpriv usersandbox"

Maybe "stricter" too if it's not absent on purpose.


> * A leaf package such as {{package|kde-apps/kcalc}} may not require any
> runtime testing at all

For packages that have a small number of users, this might miss the fact
that a package e.g. segfaults immediately. Maintainers can't always test
on every architecture, and if there's only three ~ppc (say) users, then
they may miss the 30 day window.


> === amd64 ===
> * Any developer may perform {{keyword|amd64}} stabilisations - it is not
> necessary to be on the arch team
>
> === x86 ===
> * Any developer may perform {{keyword|x86}} stabilisations - it is not
> necessary to be on the arch team

The arch teams are OK with this now? If so I can go close some STABLEREQs...


Re: Stabilisation procedure

Michael Palimaka
In reply to this post by Rich Freeman
On 17/11/16 22:56, Rich Freeman wrote:

> On Thu, Nov 17, 2016 at 4:37 AM, Michael Palimaka <[hidden email]> wrote:
>> On 17/11/16 20:16, Rich Freeman wrote:
>>> On Thu, Nov 17, 2016 at 2:16 AM, Michael Palimaka <[hidden email]> wrote:
>>>> * A leaf package such as {{package|kde-apps/kcalc}} may not require any
>>>> runtime testing at all
>>>
>>> I'm not really a big fan of this, but if we REALLY didn't want to do
>>> any runtime testing on a package then we should have some way to tag
>>> the package as such, and then have some way to do automated
>>> build-test-only stabilization.  If you aren't doing runtime testing
>>> then manual stabilization adds zero value.
>>
>> How much value do you think we gain from runtime testing a package like
>> kcalc as part of the stabilisation process, considering that it already
>> sat in ~arch for at least 30 days?
>
> We ensure that it actually runs at all with non-~arch dependencies?
>
> The 30 days spent in ~arch tells you very little about whether the
> package works with stable dependencies, since only those running mixed
> keywords would be testing that.

What is the *real* risk that kde-apps/kcalc builds against stable
dev-libs/gmp but then starts producing funny numbers at runtime?

Let's put it another way - assume we're stabilising a new version of
dev-libs/gmp instead. Should we test already-stable kde-apps/kcalc
first? What about the other hundred reverse dependencies?

Just to be clear, I'm not advocating banning runtime testing. I just
think that, considering the state of the stable tree, we should consider
very carefully in which situations we actually gain value from it. That's
for another thread, however. I deliberately worded the document so that
the final decision on the exact level of testing required (runtime or
otherwise) is between the person filing the stabilisation request and
the person actioning it.

>> Also, based on conversations with various arch team members, my
>> understanding is that a lot of stabilisations that happen right now are
>> already build-only.
>>
>
> Certainly this isn't the documented process and it is the first I've
> heard of this.

Indeed it is not documented, but then again not a lot about
stabilisation is well documented.

Consider bug #584468, a typical bulk stabilisation request from the
GNOME team consisting of 188 packages of varying types. I very much
doubt each arch team member sat down and tested file-roller to make sure
it extracts archives with stable libarchive and looks OK with stable
GTK, then makes sure gnucash-docs still looks pretty with stable
docbook-xsl-stylesheets, and so on 186 more times.

> However, I'd be interested in metrics on failures discovered in
> runtime testing and so on, or missed with a lack of runtime testing.
> I'll admit that a lot of runtime testing tends to be fairly shallow,
> but I do think there is something to be said for doing some kind of
> runtime testing.

I'd be interested in metrics like this too, but I'm not sure how we'd
collect such information.

Runtime testing is important, and in an ideal world every package would
receive detailed runtime testing before being moved to stable. Where we
are now, I think forcing runtime testing in every case costs more than
we gain.

Building the next stable glibc and running it for as long as possible has
a low cost compared to the pain if it broke. Manually
running 300 kde-apps/* packages every three months is a guaranteed waste
of time. Most packages probably sit somewhere in between.

> I think we need to think about why we actually have a stable branch.
> Does it offer any value if all we do is build testing, when I'm sure
> the maintainers at least build their packages in ~arch before they
> commit them?

The whole point of the 30 day waiting period is that issues can and do
appear that the maintainer did not encounter when they bumped.


Re: Stabilisation procedure

Kristian Fiskerstrand-2
In reply to this post by Michael Orlitzky
On 11/17/2016 01:49 PM, Michael Orlitzky wrote:
> On 11/17/2016 02:16 AM, Michael Palimaka wrote:
>>
...

>
>> === amd64 ===
>> * Any developer may perform {{keyword|amd64}} stabilisations - it is not
>> necessary to be on the arch team
>>
>> === x86 ===
>> * Any developer may perform {{keyword|x86}} stabilisations - it is not
>> necessary to be on the arch team
>
> The arch teams are OK with this now? If so I can go close some STABLEREQs...
>
>
Strictly speaking GLEP 40 still forbids it, although some arch teams
have made announcements approving it, see e.g. [1,2]. I wouldn't be
surprised if one of the results of the stable WG is an updated GLEP 40
(a new GLEP replacing the existing one) that allows for MAINTAINER
self-stabilization. Personally I don't like the "any developer may
perform" part of it. The maintainer is responsible, and should at least
ack the stabilization before anything is stabilized (for arch-team
stabilization as well), and consequently also for individual
stabilizations by developers, whether by the maintainer or by someone
with applicable hardware.

References:
[1]
https://archives.gentoo.org/gentoo-dev/message/355f4b4272c0049cffcdec88d815e267
[2]
https://archives.gentoo.org/gentoo-dev/message/1246dd8fabe44e7e7ecf59ecf029af3e

--
Kristian Fiskerstrand
OpenPGP keyblock reachable at hkp://pool.sks-keyservers.net
fpr:94CB AFDD 3034 5109 5618 35AA 0B7F 8B60 E3ED FAE3


Re: Stabilisation procedure

Michael Palimaka
In reply to this post by Michael Orlitzky
On 17/11/16 23:49, Michael Orlitzky wrote:
> On 11/17/2016 02:16 AM, Michael Palimaka wrote:
>>
>> # strict - have portage react strongly to conditions that have the
>> potential to be dangerous
>> ...
>> FEATURES="collision-protect ipc-sandbox network-sandbox sandbox
>> split-log split-elog strict test userfetch userpriv usersandbox"
>
> Maybe "stricter" too if it's not absent on purpose.

I did consider that, but I'm unsure if the issues "stricter" covers are
bad enough to abort the build, or even if they normally block stabilisation.

>> * A leaf package such as {{package|kde-apps/kcalc}} may not require any
>> runtime testing at all
>
> For packages that have a small number of users, this might miss the fact
> that a package e.g. segfaults immediately. Maintainers can't always test
> on every architecture, and if there's only three ~ppc (say) users, then
> they may miss the 30 day window.

If a package cannot get sufficient testing in ~ppc, I don't think it
should be stabilised on that architecture.

In any case, this is only an example that was admittedly written with
amd64 in mind. The person performing the stabilisation is still free to
perform whatever runtime testing they feel necessary for the package /
architecture.

Re: Stabilisation procedure

Michael Palimaka
In reply to this post by Kristian Fiskerstrand-2
On 18/11/16 00:26, Kristian Fiskerstrand wrote:

> Strictly speaking GLEP 40 forbids it still, although some arch teams
> have made announcements to approve it, see e.g [1,2]. I wouldn't be
> surprised if one of the results of the stable WG is an updated GLEP 40
> that (new GLEP replacing existing) that allows for MAINTAINER
> self-stabilization. Personally I don't like "any developer may perform"
> part of it. The maintainer is responsible, and should at least ack
> stabilization at all before anything is stabilized (for arch-team
> stabilization as well), and consequently individual stabilizations by
> developers, either the maintainer itself or someone with applicable
> hardware.

Isn't it implied that any stabilisation is approved by the maintainer?
Has it ever been acceptable to go around stabilising random packages?

Re: Stabilisation procedure

Kristian Fiskerstrand-2
On 11/17/2016 02:47 PM, Michael Palimaka wrote:

> On 18/11/16 00:26, Kristian Fiskerstrand wrote:
>> Strictly speaking GLEP 40 forbids it still, although some arch teams
>> have made announcements to approve it, see e.g [1,2]. I wouldn't be
>> surprised if one of the results of the stable WG is an updated GLEP 40
>> that (new GLEP replacing existing) that allows for MAINTAINER
>> self-stabilization. Personally I don't like "any developer may perform"
>> part of it. The maintainer is responsible, and should at least ack
>> stabilization at all before anything is stabilized (for arch-team
>> stabilization as well), and consequently individual stabilizations by
>> developers, either the maintainer itself or someone with applicable
>> hardware.
>
> Isn't it implied that any stabilisation is approved by the maintainer?
> Has it ever been acceptable to go around stabilising random packages?
>
Explicit > Implicit when we're updating things anyways.

There are scenarios where e.g. Security is calling for stabilization;
I'll add some info to the draft security GLEP with some requirements for
when this can happen without maintainer involvement as well.

Ultimately the maintainer is responsible for the state of the stable tree
for the packages they maintain and should be taking proactive steps for
this, including for security bugs, but it doesn't "always" happen like that.

--
Kristian Fiskerstrand
OpenPGP keyblock reachable at hkp://pool.sks-keyservers.net
fpr:94CB AFDD 3034 5109 5618 35AA 0B7F 8B60 E3ED FAE3


Re: Stabilisation procedure

William Hubbs
In reply to this post by Michael Palimaka
On Thu, Nov 17, 2016 at 06:16:27PM +1100, Michael Palimaka wrote:

> ==== USE flags ====
>
> While it is preferable to test every USE flag combination, this is not
> always possible or appropriate. The package may have a large number of
> USE flags, a long compile time, or the stabilisation in question may
> just not call for it.
>
> In cases where all USE flags combinations are not being tested, it is
> still recommended to test:
> * with all USE flags enabled
> * with all USE flags disabled
Does this mean we are changing our policy to support users running
USE="-*"? I'm asking for clarification because in the past we have
always told users that if they do that they are on their own.

William


Re: Stabilisation procedure

Michael Palimaka
On 18/11/16 01:58, William Hubbs wrote:

> On Thu, Nov 17, 2016 at 06:16:27PM +1100, Michael Palimaka wrote:
>> ==== USE flags ====
>>
>> While it is preferable to test every USE flag combination, this is not
>> always possible or appropriate. The package may have a large number of
>> USE flags, a long compile time, or the stabilisation in question may
>> just not call for it.
>>
>> In cases where all USE flags combinations are not being tested, it is
>> still recommended to test:
>> * with all USE flags enabled
>> * with all USE flags disabled
>
> Does this mean we are changing our policy to support users running
> USE="-*"? I'm asking for clarification because in the past we have
> always told users that if they do that they are on their own.

Testing with all USE flags disabled is more about catching build
failures than guaranteeing the package will necessarily do something useful.

Re: Stabilisation procedure

Rich Freeman
In reply to this post by Michael Palimaka
On Thu, Nov 17, 2016 at 8:13 AM, Michael Palimaka <[hidden email]> wrote:
>
> Just to be clear, I'm not advocating banning runtime testing. I just
> think that, considering the state of the stable tree, we should consider
> very careful in which situations we actually gain value from it. That's
> for another thread, however. I deliberately worded the document so that
> the final decision on the exact level of testing required (runtime or
> otherwise) is between the person filing the stabilisation request and
> the person actioning it.
>

++

--
Rich

Re: Stabilisation procedure

Duncan-42
In reply to this post by Michael Palimaka
Michael Palimaka posted on Fri, 18 Nov 2016 02:35:26 +1100 as excerpted:

> On 18/11/16 01:58, William Hubbs wrote:
>> On Thu, Nov 17, 2016 at 06:16:27PM +1100, Michael Palimaka wrote:
>>> ==== USE flags ====
>>>
>>> While it is preferable to test every USE flag combination, this is not
>>> always possible or appropriate. The package may have a large number of
>>> USE flags, a long compile time, or the stabilisation in question may
>>> just not call for it.
>>>
>>> In cases where all USE flags combinations are not being tested, it is
>>> still recommended to test:
>>> * with all USE flags enabled * with all USE flags disabled
>>
>> Does this mean we are changing our policy to support users running
>> USE="-*"? I'm asking for clarification because in the past we have
>> always told users that if they do that they are on their own.
>
> Testing with all USE flags disabled is more about catching build
> failures than guaranteeing the package will necessarily do something
> useful.

Along the same line but with all flags enabled, how does that apply to
exclusive-or flags such as the qt4/qt5 thing that has been quite common?

Sure common sense suggests "all" doesn't really mean "all" in that case,
but given the opportunity presented by the update, if a guideline for the
case can be made explicit...

--
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


Re: Stabilisation procedure

Robin H. Johnson-2
In reply to this post by Kristian Fiskerstrand-2
On Thu, Nov 17, 2016 at 03:05:41PM +0100, Kristian Fiskerstrand wrote:

> > Isn't it implied that any stabilisation is approved by the maintainer?
> > Has it ever been acceptable to go around stabilising random packages?
> >
>
> Explicit > Implicit when we're updating things anyways.
>
> There are scenarios where e.g Security is calling for stabilization ,
> I'll add some info to the draft security GLEP with some requirements for
> when this can happen without maintainer involvement as well..
>
> Ultimately maintainer is responsible for the state of the stable tree
> for the packages they maintain and should be taking proactive steps for
> this also for security bugs, it doesn't "always" happen like that.....

The interaction of this proposal and the prior discussion of allowing
maintainers to document the maintenance policy of given packages is
where it would really come into play.

Using two packages for examples:
app-admin/diradm: I am the upstream author as well as the package
maintainer. I care about it being marked stable. I'd prefer the normal
policy of other people asking me (with timeout) before touching it.

app-admin/cancd: It's a very obscure package that I put in the tree
because I needed it, but I haven't personally used it in many years.
I fix the packaging if it's broken only.
I'm inclined to mark it with 'anybody-may-bump/fix/stabilize'.

--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
E-Mail   : [hidden email]
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136


Re: Stabilisation procedure

Kent Fredric-2
In reply to this post by Michael Palimaka
On Fri, 18 Nov 2016 00:13:35 +1100
Michael Palimaka <[hidden email]> wrote:

> What is the *real* risk that kde-apps/kcalc builds against stable
> dev-libs/gmp but then starts producing funny numbers at runtime?
>
> Let's put it another way - assume we're stabilising a new version of
> dev-libs/gmp instead. Should we test already-stable kde-apps/kcalc
> first? What about the other hundred reverse dependencies?
>
> Just to be clear, I'm not advocating banning runtime testing. I just
> think that, considering the state of the stable tree, we should consider
> very carefully in which situations we actually gain value from it. That's
> for another thread, however. I deliberately worded the document so that
> the final decision on the exact level of testing required (runtime or
> otherwise) is between the person filing the stabilisation request and
> the person actioning it.

This, IME, rather depends on the nature of the package in question, and the
general nature of its dependencies.

Usually you can assume that only packages that depend on each other
directly can affect each other through changes.

But spooky-action-at-a-distance can also happen, where in

   a -> b -> c

c changes
b is unaffected
a is broken

Though it makes more sense not to have a blanket "recursively check dependents"
policy, and perhaps to have either a "Test these things when I change" list
or an inverse "Test me when X changes" list.

The latter of these is not entirely unlike the need to add new := slot deps
for things that didn't need to depend on Perl, but needed to be rebuilt when Perl is.

(Except s/rebuilt/retest/ )
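
To make the difference between the two approaches concrete, here is a minimal sketch in Python. It assumes the dependency graph is available as a plain dict; the names `depends_on`, `retest_when_changed` and `transitive_dependents` are invented for illustration and do not correspond to any existing Portage tooling.

```python
from collections import deque

def transitive_dependents(depends_on, changed):
    """Return every package that reaches `changed` through the dependency graph."""
    # Invert the edges: package -> packages that depend on it directly.
    dependents = {}
    for pkg, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(pkg)
    # Breadth-first walk up the inverted graph.
    seen, queue = set(), deque([changed])
    while queue:
        for parent in dependents.get(queue.popleft(), ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

# The a -> b -> c chain from above.
depends_on = {"a": {"b"}, "b": {"c"}, "c": set()}

# Hypothetical "Test me when X changes" data, keyed by the changed package:
# a has asked to be retested whenever c changes.
retest_when_changed = {"c": {"a"}}

blanket = transitive_dependents(depends_on, "c")   # recursive policy
targeted = retest_when_changed.get("c", set())     # opt-in list
```

With the blanket policy both a and b get retested when c changes; with the opt-in list only a does, which is the package that actually breaks in the scenario above.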



Re: Stabilisation procedure

Harald Alfred Weiner
In reply to this post by Duncan-42
Dear Duncan,

maybe you already know the project at http://orca.varstack.com/
Otherwise, I would like to point you to the following link, which
addresses the question of how to test different USE flag
combinations:
https://github.com/pallavagarwal07/SummerOfCode16/blob/997078ebbf1aa86ba17fa53e400e4c99d7d640b7/Documents/SAT-Solver.md

Actually, the student who worked on this GSoC project and wrote the article
used a SAT solver to find all legal USE flag combinations.
So maybe this solution can prevent someone from re-inventing the wheel ;-).
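
For a package with only a handful of flags, the same idea can be illustrated without a SAT solver. The sketch below is a plain brute-force enumeration in Python, not the approach from the linked article. The flag names and the `at_most_one_of` / `exactly_one_of` parameters are assumptions made for the example, loosely modelled on REQUIRED_USE-style constraints.

```python
from itertools import product

def legal_use_combinations(flags, exactly_one_of=(), at_most_one_of=()):
    """Yield every set of enabled flags that satisfies the given constraints."""
    for bits in product((False, True), repeat=len(flags)):
        enabled = {f for f, on in zip(flags, bits) if on}
        # Each exactly-one-of group must contribute exactly one enabled flag.
        if any(len(enabled & set(g)) != 1 for g in exactly_one_of):
            continue
        # Each at-most-one-of group may contribute zero or one enabled flag.
        if any(len(enabled & set(g)) > 1 for g in at_most_one_of):
            continue
        yield frozenset(enabled)

# Hypothetical package: qt4 and qt5 are mutually exclusive, the rest are free.
flags = ["qt4", "qt5", "doc", "test"]
combos = list(legal_use_combinations(flags, at_most_one_of=[("qt4", "qt5")]))

# One possible reading of "all USE flags enabled" for such a package:
# the legal combinations with the largest number of enabled flags.
largest = max(len(c) for c in combos)
maximal = sorted(sorted(c) for c in combos if len(c) == largest)
```

Here `maximal` holds two sets, one with qt4 and one with qt5, so "all flags enabled" would turn into two test runs. Brute force doubles with every extra flag, which is where a SAT solver becomes the more sensible tool.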


Best wishes,


Harald Weiner.


Re: Stabilisation procedure

Michał Górny-5
In reply to this post by Michael Palimaka
On Thu, 17 Nov 2016 18:16:27 +1100
Michael Palimaka <[hidden email]> wrote:

> ==== Runtime testing ====
>
> Consider the level of runtime testing that is required for the target
> package. Remember, the focus of stabilisation is to integrate a testing
> ebuild into the stable tree and not to identify routine bugs or
> regressions - that is the purpose of the package's waiting time in ~arch.
>
> The level of runtime testing required will vary wildly based on a
> variety of factors. Consider the following examples:
>
> * Multiple days of "normal use" testing may be appropriate for a new
> version of {{package|sys-libs/glibc}}
> * Basic functionality testing, such as browsing some web pages, may make
> sense for a new version of {{package|www-client/firefox}}
> * Passing tests might be enough for {{package|dev-python/yenc}}
> * A leaf package such as {{package|kde-apps/kcalc}} may not require any
> runtime testing at all

Could we maybe include some place (metadata.xml?) to state the best way
to test a package? I'm thinking it could include things
like:

- whether the tests of the package are reliable,

- whether runtime testing is required, and of what kind,

- how likely it is that revdeps need to be checked.

For example, in LLVM I would like to ask arch testers to always check
a few common clang calls.
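
To give an idea of what this could look like, here is a rough sketch in Python. The `<stabilisation-hints>` element and everything inside it are purely hypothetical: nothing like this exists in the metadata.xml format today, and the names are invented for illustration only.

```python
import xml.etree.ElementTree as ET

# HYPOTHETICAL: <stabilisation-hints> and its children are invented for this
# example; they are not part of Gentoo's metadata.xml format.
SAMPLE = """\
<pkgmetadata>
  <stabilisation-hints>
    <tests reliable="yes"/>
    <runtime required="yes">a few common clang calls</runtime>
    <revdeps check="likely"/>
  </stabilisation-hints>
</pkgmetadata>
"""

def read_hints(xml_text):
    """Extract the hypothetical testing hints; return {} if none are present."""
    hints = ET.fromstring(xml_text).find("stabilisation-hints")
    if hints is None:
        return {}
    # This sketch assumes all three child elements are present.
    runtime = hints.find("runtime")
    return {
        "tests_reliable": hints.find("tests").get("reliable") == "yes",
        "runtime_required": runtime.get("required") == "yes",
        "runtime_notes": (runtime.text or "").strip(),
        "revdeps": hints.find("revdeps").get("check"),
    }

hints = read_hints(SAMPLE)
```

An arch tester's tooling could then show `runtime_notes` next to the stabilisation request, while a package without the element simply yields an empty dict.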

--
Best regards,
Michał Górny
<http://dev.gentoo.org/~mgorny/>


Re: Stabilisation procedure

James Le Cuirot
On Sat, 19 Nov 2016 09:04:01 +0100
Michał Górny <[hidden email]> wrote:

> Could we maybe include some place (metadata.xml?) to state what is
> the best way to test a package? I'm thinking it could include things
> like:
>
> - whether the tests of the package are reliable,

Shouldn't you set RESTRICT="test" if they're not?

> - whether runtime testing is required, and of what kind,
>
> - how likely it is that revdeps need to be checked.

These two sound good.

--
James Le Cuirot (chewi)
Gentoo Linux Developer


Re: Stabilisation procedure

Daniel Campbell (zlg)
In reply to this post by Robin H. Johnson-2
On 11/17/2016 01:07 PM, Robin H. Johnson wrote:

> On Thu, Nov 17, 2016 at 03:05:41PM +0100, Kristian Fiskerstrand wrote:
>>> Isn't it implied that any stabilisation is approved by the maintainer?
>>> Has it ever been acceptable to go around stabilising random packages?
>>>
>>
>> Explicit > Implicit when we're updating things anyways.
>>
>> There are scenarios where e.g Security is calling for stabilization ,
>> I'll add some info to the draft security GLEP with some requirements for
>> when this can happen without maintainer involvement as well..
>>
>> Ultimately maintainer is responsible for the state of the stable tree
>> for the packages they maintain and should be taking proactive steps for
>> this also for security bugs, it doesn't "always" happen like that.....
>
> The interaction of this proposal and the prior discussion of allowing
> maintainers to document the maintenance policy of given packages is
> where it would really come into play.
>
> Using two packages for examples:
> app-admin/diradm: I am the upstream author as well as the package
> maintainer. I care about it being marked stable. I'd prefer the normal
> policy of other people asking me (with timeout) before touching it.
>
> app-admin/cancd: It's a very obscure package that I put in the tree
> because I needed it, but I haven't personally used it in many years.
> I fix the packaging if it's broken only.
> I'm inclined to mark it with 'anybody-may-bump/fix/stabilize'.
>
Agreed. For most of my packages, I really don't mind since we're all
working on Gentoo together, but it'd be super helpful if I was simply
notified in the event that a package I maintain has gotten a security
bump, patch, or stabilization. Sure, 'git log' and 'git blame' can
explain a few things, but if I was going to edit a package, I have the
maintainer's e-mail available right there in metadata.xml. To me it's a
courtesy that should be a requirement by default, while devs that don't
care can use whatever means we agree upon to indicate that they don't care.

This creates a "contact first" practice, which it seems we want to
encourage. If someone isn't responsive and/or away, that complicates
things, but if it's a security concern or the last blocker in a big
stabilization effort (looking at you, tcl 8.6...), then it makes sense
to just go ahead and make the bumps necessary.

--
Daniel Campbell - Gentoo Developer
OpenPGP Key: 0x1EA055D6 @ hkp://keys.gnupg.net
fpr: AE03 9064 AE00 053C 270C  1DE4 6F7A 9091 1EA0 55D6

