[PATCH] Enable FEATURES=parallel-install by default (bug 715110)


Zac Medico-2
The feature enables finer grained locks for install operations, and
everyone agrees that it's safe to enable by default.

Bug: https://bugs.gentoo.org/715110
Signed-off-by: Zac Medico <[hidden email]>
---
 cnf/make.globals | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cnf/make.globals b/cnf/make.globals
index 4a59dbe3c..5ba1ac6fa 100644
--- a/cnf/make.globals
+++ b/cnf/make.globals
@@ -55,7 +55,7 @@ FETCHCOMMAND_SFTP="bash -c \"x=\\\${2#sftp://} ; host=\\\${x%%/*} ; port=\\\${ho
 FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs
           config-protect-if-modified distlocks ebuild-locks
           fixlafiles ipc-sandbox merge-sync multilib-strict
-          network-sandbox news parallel-fetch pid-sandbox
+          network-sandbox news parallel-fetch parallel-install pid-sandbox
           preserve-libs protect-owned qa-unresolved-soname-deps
           sandbox sfperms strict
           unknown-features-warn unmerge-logs unmerge-orphans userfetch
--
2.25.3
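
Since FEATURES is an incremental variable, a user who does not want the new default can still opt out with a "-parallel-install" entry in make.conf. The following is a minimal Python sketch of that stacking behaviour, not Portage's own code: the function name `stack_features` and the abbreviated default string are assumptions made for illustration.

```python
def stack_features(*layers):
    """Fold FEATURES strings together, lowest priority first.

    A plain token enables a feature; a token prefixed with "-" disables it.
    This mirrors the documented incremental behaviour only in outline.
    """
    enabled = set()
    for layer in layers:
        for token in layer.split():
            if token.startswith("-"):
                enabled.discard(token[1:])
            else:
                enabled.add(token)
    return enabled


# Abbreviated stand-in for the patched default in cnf/make.globals.
GLOBAL_DEFAULTS = "news parallel-fetch parallel-install pid-sandbox"
# Hypothetical user override, as it might appear in make.conf.
USER_OVERRIDE = "-parallel-install"

with_default = stack_features(GLOBAL_DEFAULTS)
opted_out = stack_features(GLOBAL_DEFAULTS, USER_OVERRIDE)
```

With the patch applied, `with_default` contains parallel-install, while `opted_out` keeps every other default and drops only that one feature.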


Re: [PATCH] Enable FEATURES=parallel-install by default (bug 715110)

Brian Dolbec-3
On Sun, 10 May 2020 21:32:25 -0700
Zac Medico <[hidden email]> wrote:

> The feature enables finer grained locks for install operations, and
> everyone agrees that it's safe to enable by default.
>
> Bug: https://bugs.gentoo.org/715110
> Signed-off-by: Zac Medico <[hidden email]>
> ---
>  cnf/make.globals | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/cnf/make.globals b/cnf/make.globals
> index 4a59dbe3c..5ba1ac6fa 100644
> --- a/cnf/make.globals
> +++ b/cnf/make.globals
> @@ -55,7 +55,7 @@ FETCHCOMMAND_SFTP="bash -c \"x=\\\${2#sftp://} ; host=\\\${x%%/*} ; port=\\\${ho
>  FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs
>            config-protect-if-modified distlocks ebuild-locks
>            fixlafiles ipc-sandbox merge-sync multilib-strict
> -          network-sandbox news parallel-fetch pid-sandbox
> +          network-sandbox news parallel-fetch parallel-install pid-sandbox
>            preserve-libs protect-owned qa-unresolved-soname-deps
>            sandbox sfperms strict
>            unknown-features-warn unmerge-logs unmerge-orphans userfetch

works for me :)

Re: [PATCH] Enable FEATURES=parallel-install by default (bug 715110)

Michał Górny-5
In reply to this post by Zac Medico-2
On Sun, 2020-05-10 at 21:32 -0700, Zac Medico wrote:
> The feature enables finer grained locks for install operations, and
> everyone agrees that it's safe to enable by default.

Who's 'everyone' and where's their analysis of the problem?
The manpage doesn't really help understand what this does, exactly.

--
Best regards,
Michał Górny


Re: [PATCH] Enable FEATURES=parallel-install by default (bug 715110)

Zac Medico-2
On 5/11/20 10:54 PM, Michał Górny wrote:
> On Sun, 2020-05-10 at 21:32 -0700, Zac Medico wrote:
>> The feature enables finer grained locks for install operations, and
>> everyone agrees that it's safe to enable by default.
>
> Who's 'everyone' and where's their analysis of the problem?
> The manpage doesn't really help understand what this does, exactly.

Before parallel-install there was just one big lock, so only one package
slot could enter the merge/unmerge state at a given time.

With parallel install, there are a few finer-grained locks that come
into play. These are the really important ones:

* When merging/unmerging files, an exclusive lock must be held on
vardbapi._fs_lock.

* When executing an unsandboxed ebuild phase, an exclusive lock must be
held on the FEATURES=ebuild-locks lock.
--
Thanks,
Zac
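
The two locks listed above can be pictured with a small Python sketch. It is an illustration only: `threading.Lock` objects stand in for vardbapi._fs_lock and the FEATURES=ebuild-locks lock (Portage coordinates separate processes, not threads), and the `install` and `record` helpers are hypothetical names.

```python
import threading

fs_lock = threading.Lock()      # stand-in for vardbapi._fs_lock
ebuild_lock = threading.Lock()  # stand-in for the FEATURES=ebuild-locks lock

events = []                     # (package, step) in the order steps ran
events_guard = threading.Lock()


def record(pkg, step):
    with events_guard:
        events.append((pkg, step))


def install(pkg):
    # Unsandboxed phases hold the ebuild lock exclusively.
    with ebuild_lock:
        record(pkg, "pkg_preinst")
    # File merging holds the file-system lock exclusively.
    with fs_lock:
        record(pkg, "merge files")
    with ebuild_lock:
        record(pkg, "pkg_postinst")


threads = [threading.Thread(target=install, args=(name,)) for name in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each step takes only the lock it needs, one package's pkg_preinst may run while another package holds the file-system lock, yet two file merges never overlap and two unsandboxed phases never overlap.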


Re: [PATCH] Enable FEATURES=parallel-install by default (bug 715110)

Michał Górny-5
On Tue, 2020-05-12 at 01:40 -0700, Zac Medico wrote:

> On 5/11/20 10:54 PM, Michał Górny wrote:
> > On Sun, 2020-05-10 at 21:32 -0700, Zac Medico wrote:
> > > The feature enables finer grained locks for install operations,
> > > and
> > > everyone agrees that it's safe to enable by default.
> >
> > Who's 'everyone' and where's their analysis of the problem?
> > The manpage doesn't really help understand what this does, exactly.
>
> Before parallel-install there was just one big lock, so only one
> package
> slot could enter the merge/unmerge state at a given time.
>
> With parallel install, there are a few finer-grained locks that come
> into play. [...]

I'm sorry, but I was asking about the higher-level implications.

I presume this means that files of more than one package can be merged
simultaneously.  However:

1. Are collisions handled correctly then?  I.e., if you start installing
A, and then B, and the two packages collide, will Portage fail before
starting to install any file from B?

2. Are preinst/postinst phases called simultaneously or serialized?

--
Best regards,
Michał Górny



Re: [PATCH] Enable FEATURES=parallel-install by default (bug 715110)

Zac Medico-2
On 5/12/20 3:39 AM, Michał Górny wrote:

> On Tue, 2020-05-12 at 01:40 -0700, Zac Medico wrote:
>> On 5/11/20 10:54 PM, Michał Górny wrote:
>>> On Sun, 2020-05-10 at 21:32 -0700, Zac Medico wrote:
>>>> The feature enables finer grained locks for install operations,
>>>> and
>>>> everyone agrees that it's safe to enable by default.
>>>
>>> Who's 'everyone' and where's their analysis of the problem?
>>> The manpage doesn't really help understand what this does, exactly.
>>
>> Before parallel-install there was just one big lock, so only one
>> package
>> slot could enter the merge/unmerge state at a given time.
>>
>> With parallel install, there are a few finer-grained locks that come
>> into play. [...]
>
> I'm sorry but I was asking of a more high-level implications.
>
> I presume that this means that more than files of more than one package
> can be merged simultaneously.  However:
No, an exclusive lock must be held on vardbapi._fs_lock for this. This
is currently required at least to guarantee that access to the config
memory file is serialized (config memory is the thing that emerge
--noconfmem disables, but --noconfmem does not disable this lock).

We assume that it's probably not worthwhile to try to merge files for
more than one package at a time, since that would cause them to compete
for IO bandwidth.

> 1. Are collisions handled correctly then?  i.e. if you start installing
> A, and then B, and the two packages collide will portage fail before
> starting to install any file from B?

There are no guarantees here. However, the risk is minimal, since it's
unlikely that a file collision of this sort would occur. File collisions
are a QA problem that is generally detected and corrected long before we
would encounter a collision of this sort.

> 2. Are preinst/postinst phases called simultaneously or serialized?

They're serialized.
--
Thanks,
Zac
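
To make the "no guarantees" answer concrete, here is a toy Python model of the window in which a collision can slip through. It assumes the collision check runs before the files are merged and is not repeated; the package names, file paths, and the `installed` dictionary are hypothetical and do not reflect Portage's data structures.

```python
installed = {}  # path -> owning package (toy stand-in for the installed-files DB)


def collisions(pkg, files):
    """Return files already owned by a different package."""
    return [f for f in files if f in installed and installed[f] != pkg]


def merge(pkg, files):
    for f in files:
        installed[f] = pkg


A = ("cat/a-1", ["/usr/bin/tool", "/usr/share/a/data"])
B = ("cat/b-1", ["/usr/bin/tool", "/usr/share/b/data"])

# Interleaved case: both checks finish before either package merges.
racy_check_a = collisions(*A)
racy_check_b = collisions(*B)
merge(*A)
merge(*B)
racy_owner = installed["/usr/bin/tool"]

# Serialized case: B is checked only after A has been merged.
installed.clear()
serial_check_a = collisions(*A)
merge(*A)
serial_check_b = collisions(*B)
```

In the interleaved case neither check reports a problem and B silently takes over the shared file; in the serialized case the check for B reports the collision before any of B's files are merged.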


Re: [PATCH] Enable FEATURES=parallel-install by default (bug 715110)

Michał Górny-5
On Tue, 2020-05-12 at 10:05 -0700, Zac Medico wrote:

> On 5/12/20 3:39 AM, Michał Górny wrote:
> > I'm sorry but I was asking of a more high-level implications.
> >
> > I presume that this means that more than files of more than one
> > package
> > can be merged simultaneously.  However:
>
> No, an exclusive lock must be held on vardbapi._fs_lock for this.
> This
> is currently required at least to guarantee that access to the config
> memory file is serialized (config memory is the thing that emerge
> --noconfmem disables, but --noconfmem does not disable this lock).
>
> We assume that it's probably not worthwhile to try to merge files for
> more than one package at a time, since that would cause them to
> compete
> for IO bandwidth.
>
> > 1. Are collisions handled correctly then?  i.e. if you start
> > installing
> > A, and then B, and the two packages collide will portage fail
> > before
> > starting to install any file from B?
>
> There are no guarantees here. However, the risk is minimal, since
> it's
> unlikely that a file collision of this sort would occur. file
> collisions
> are a QA problem that is generally detected and corrected log before
> we
> would encounter a collision of this sort.
>
> > 2. Are preinst/postinst phases called simultaneously or serialized?
>
> They're serialized.

Now I'm lost here.  Could you try to explain to me, without getting
into the deep technicalities, how parallel-install achieves better
speed, or what it is that makes non-parallel-install so slow?

--
Best regards,
Michał Górny



Re: [PATCH] Enable FEATURES=parallel-install by default (bug 715110)

Zac Medico-2
On 5/12/20 12:28 PM, Michał Górny wrote:

> On Tue, 2020-05-12 at 10:05 -0700, Zac Medico wrote:
>> On 5/12/20 3:39 AM, Michał Górny wrote:
>>> I'm sorry but I was asking of a more high-level implications.
>>>
>>> I presume that this means that more than files of more than one
>>> package
>>> can be merged simultaneously.  However:
>>
>> No, an exclusive lock must be held on vardbapi._fs_lock for this.
>> This
>> is currently required at least to guarantee that access to the config
>> memory file is serialized (config memory is the thing that emerge
>> --noconfmem disables, but --noconfmem does not disable this lock).
>>
>> We assume that it's probably not worthwhile to try to merge files for
>> more than one package at a time, since that would cause them to
>> compete
>> for IO bandwidth.
>>
>>> 1. Are collisions handled correctly then?  i.e. if you start
>>> installing
>>> A, and then B, and the two packages collide will portage fail
>>> before
>>> starting to install any file from B?
>>
>> There are no guarantees here. However, the risk is minimal, since
>> it's
>> unlikely that a file collision of this sort would occur. file
>> collisions
>> are a QA problem that is generally detected and corrected log before
>> we
>> would encounter a collision of this sort.
>>
>>> 2. Are preinst/postinst phases called simultaneously or serialized?
>>
>> They're serialized.
>
> Now I'm lost here.  Could you try to explain to me, without getting
> into the deep technicalities, how parallel-install achieves better
> speed or at doing what is non-parallel-install so slow?
It allows preinst/postinst/prerm/postrm phases to run for one package
while files are concurrently being merged or unmerged for another
package. This makes it possible to approach saturation of IO bandwidth.
--
Thanks,
Zac
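
The overlap described above can be illustrated with a toy timing model in Python. Everything here is an assumption made for the sketch: the durations are invented, each package is reduced to three steps (pkg_preinst, file merge, pkg_postinst), and the greedy scheduler is not Portage's scheduler.

```python
def big_lock_time(pkgs):
    """One global lock: every step of every package runs back to back."""
    return sum(pre + merge + post for pre, merge, post in pkgs)


def fine_grained_time(pkgs):
    """Two locks: phases share an ebuild lock, file merges share a fs lock."""
    ebuild_free = 0           # time at which the ebuild lock becomes free
    fs_free = 0               # time at which the file-system lock becomes free
    ready = [0] * len(pkgs)   # time at which each package finished its last step
    step = [0] * len(pkgs)    # next step per package: 0=preinst, 1=merge, 2=postinst
    while any(s < 3 for s in step):
        best = None
        # Pick the pending step that can start earliest (ties: first package).
        for i, s in enumerate(step):
            if s == 3:
                continue
            lock_free = fs_free if s == 1 else ebuild_free
            start = max(ready[i], lock_free)
            if best is None or start < best[0]:
                best = (start, i)
        start, i = best
        end = start + pkgs[i][step[i]]
        if step[i] == 1:
            fs_free = end
        else:
            ebuild_free = end
        ready[i] = end
        step[i] += 1
    return max(ready)


# Hypothetical durations: (pkg_preinst, file merge, pkg_postinst) per package.
packages = [(1, 4, 1), (1, 4, 1)]
serial_total = big_lock_time(packages)
overlapped_total = fine_grained_time(packages)
```

In this model the second package's pkg_preinst runs while the first package's files are being merged, so the file-system lock sits idle for less of the run. The saving is bounded by the time spent in phases, which matches the point that file merges themselves are never run in parallel.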


Re: [PATCH] Enable FEATURES=parallel-install by default (bug 715110)

Michał Górny-5
On Tue, 2020-05-12 at 13:18 -0700, Zac Medico wrote:

> On 5/12/20 12:28 PM, Michał Górny wrote:
> > On Tue, 2020-05-12 at 10:05 -0700, Zac Medico wrote:
> > > On 5/12/20 3:39 AM, Michał Górny wrote:
> > > > I'm sorry but I was asking of a more high-level implications.
> > > >
> > > > I presume that this means that more than files of more than one
> > > > package
> > > > can be merged simultaneously.  However:
> > >
> > > No, an exclusive lock must be held on vardbapi._fs_lock for this.
> > > This
> > > is currently required at least to guarantee that access to the config
> > > memory file is serialized (config memory is the thing that emerge
> > > --noconfmem disables, but --noconfmem does not disable this lock).
> > >
> > > We assume that it's probably not worthwhile to try to merge files for
> > > more than one package at a time, since that would cause them to
> > > compete
> > > for IO bandwidth.
> > >
> > > > 1. Are collisions handled correctly then?  i.e. if you start
> > > > installing
> > > > A, and then B, and the two packages collide will portage fail
> > > > before
> > > > starting to install any file from B?
> > >
> > > There are no guarantees here. However, the risk is minimal, since
> > > it's
> > > unlikely that a file collision of this sort would occur. file
> > > collisions
> > > are a QA problem that is generally detected and corrected log before
> > > we
> > > would encounter a collision of this sort.
> > >
> > > > 2. Are preinst/postinst phases called simultaneously or serialized?
> > >
> > > They're serialized.
> >
> > Now I'm lost here.  Could you try to explain to me, without getting
> > into the deep technicalities, how parallel-install achieves better
> > speed or at doing what is non-parallel-install so slow?
>
> It allows preinst/postinst/prerm/postrm phases to run for one package
> while files are concurrently being merged or unmerged for another
> package. This makes it possible to approach saturation of IO bandwidth.
Doesn't this imply that programs run in these phases could fail if
Portage is simultaneously replacing their files?

--
Best regards,
Michał Górny


Re: [PATCH] Enable FEATURES=parallel-install by default (bug 715110)

Zac Medico-2
On 5/12/20 11:22 PM, Michał Górny wrote:

> On Tue, 2020-05-12 at 13:18 -0700, Zac Medico wrote:
>> On 5/12/20 12:28 PM, Michał Górny wrote:
>>> On Tue, 2020-05-12 at 10:05 -0700, Zac Medico wrote:
>>>> On 5/12/20 3:39 AM, Michał Górny wrote:
>>>>> I'm sorry but I was asking of a more high-level implications.
>>>>>
>>>>> I presume that this means that more than files of more than one
>>>>> package
>>>>> can be merged simultaneously.  However:
>>>>
>>>> No, an exclusive lock must be held on vardbapi._fs_lock for this.
>>>> This
>>>> is currently required at least to guarantee that access to the config
>>>> memory file is serialized (config memory is the thing that emerge
>>>> --noconfmem disables, but --noconfmem does not disable this lock).
>>>>
>>>> We assume that it's probably not worthwhile to try to merge files for
>>>> more than one package at a time, since that would cause them to
>>>> compete
>>>> for IO bandwidth.
>>>>
>>>>> 1. Are collisions handled correctly then?  i.e. if you start
>>>>> installing
>>>>> A, and then B, and the two packages collide will portage fail
>>>>> before
>>>>> starting to install any file from B?
>>>>
>>>> There are no guarantees here. However, the risk is minimal, since
>>>> it's
>>>> unlikely that a file collision of this sort would occur. file
>>>> collisions
>>>> are a QA problem that is generally detected and corrected log before
>>>> we
>>>> would encounter a collision of this sort.
>>>>
>>>>> 2. Are preinst/postinst phases called simultaneously or serialized?
>>>>
>>>> They're serialized.
>>>
>>> Now I'm lost here.  Could you try to explain to me, without getting
>>> into the deep technicalities, how parallel-install achieves better
>>> speed or at doing what is non-parallel-install so slow?
>>
>> It allows preinst/postinst/prerm/postrm phases to run for one package
>> while files are concurrently being merged or unmerged for another
>> package. This makes it possible to approach saturation of IO bandwidth.
>
> Doesn't this imply that programs run in these phases could fail if
> Portage is simultaneously replacing their files?
We've got the same potential issue with emerge --jobs, but we use the
dependency graph to schedule merges such that our dependencies do not
mutate underneath us.
--
Thanks,
Zac
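
The idea of using the dependency graph so that dependencies do not change underneath a running package can be sketched as follows. This is a deliberately simplified Python illustration, not the algorithm emerge uses: the graph, the package names, and the `schedule` helper are all hypothetical.

```python
# Hypothetical dependency graph: package -> packages it depends on.
deps = {
    "app": {"libfoo"},
    "libfoo": set(),
    "tool": set(),
}


def schedule(order):
    """Group packages into batches whose members may be processed concurrently.

    A package joins a batch only when all of its dependencies were completed
    in an earlier batch, so nothing it relies on is modified while it runs.
    """
    done = set()
    batches = []
    pending = list(order)
    while pending:
        batch = [p for p in pending if all(d in done for d in deps[p])]
        if not batch:
            raise ValueError("dependency cycle among: %s" % ", ".join(pending))
        batches.append(batch)
        done.update(batch)
        pending = [p for p in pending if p not in done]
    return batches


batches = schedule(["app", "libfoo", "tool"])
```

Here libfoo and tool land in the same batch because neither depends on the other, while app waits until libfoo has been fully merged.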


[PATCH v2] Enable FEATURES=parallel-install by default (bug 715110)

Zac Medico-2
In reply to this post by Zac Medico-2
Enable FEATURES=parallel-install in order to increase IO throughput by
allowing files to be merged or unmerged for one package while merge or
unmerge ebuild phases execute for a different package.

This feature introduces a small risk of file collisions going
undetected for packages that are merged at about the same time, but
the risk is considered to be practically negligible since those file
collisions would typically be detected and prevented long before such
an event would have an opportunity to occur.

Bug: https://bugs.gentoo.org/715110
Signed-off-by: Zac Medico <[hidden email]>
---
[PATCH v2] documents the mechanism of increased IO throughput, and
adds a warning about the risk of undetected file collisions.

 cnf/make.globals |  2 +-
 man/make.conf.5  | 20 ++++++++++++++++----
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/cnf/make.globals b/cnf/make.globals
index 4a59dbe3c..5ba1ac6fa 100644
--- a/cnf/make.globals
+++ b/cnf/make.globals
@@ -55,7 +55,7 @@ FETCHCOMMAND_SFTP="bash -c \"x=\\\${2#sftp://} ; host=\\\${x%%/*} ; port=\\\${ho
 FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs
           config-protect-if-modified distlocks ebuild-locks
           fixlafiles ipc-sandbox merge-sync multilib-strict
-          network-sandbox news parallel-fetch pid-sandbox
+          network-sandbox news parallel-fetch parallel-install pid-sandbox
           preserve-libs protect-owned qa-unresolved-soname-deps
           sandbox sfperms strict
           unknown-features-warn unmerge-logs unmerge-orphans userfetch
diff --git a/man/make.conf.5 b/man/make.conf.5
index f82fed65a..2380fe3ce 100644
--- a/man/make.conf.5
+++ b/man/make.conf.5
@@ -1,4 +1,4 @@
-.TH "MAKE.CONF" "5" "Nov 2019" "Portage VERSION" "Portage"
+.TH "MAKE.CONF" "5" "May 2020" "Portage VERSION" "Portage"
 .SH "NAME"
 make.conf \- custom settings for Portage
 .SH "SYNOPSIS"
@@ -550,9 +550,21 @@ Fetch in the background while compiling. Run
 terminal to view parallel-fetch progress.
 .TP
 .B parallel\-install
-Use finer\-grained locks when installing packages, allowing for greater
-parallelization. For additional parallelization, disable
-\fIebuild\-locks\fR.
+Use finer\-grained locks when installing packages, in order to increase
+IO throughput by allowing files to be merged or unmerged for one package
+while merge or unmerge ebuild phases execute for a different package. For
+additional parallelization, disable \fIebuild\-locks\fR.
+
+\fB***warning***\fR
+.br
+This feature introduces a small risk of file collisions going
+undetected (by collision\-protect or protect\-owned features) for
+packages that are merged at about the same time, but the risk is
+considered to be practically negligible since those file
+collisions would typically be detected and prevented long before such
+an event would have an opportunity to occur. The risk may increase when
+ACCEPT_KEYWORDS is used to accept packages which have not yet been
+deemed 'stable', or when using uncommon USE flag configurations.
 .TP
 .B pid\-sandbox
 Isolate the process space for the ebuild processes. This makes it
--
2.25.3


Re: [PATCH v2] Enable FEATURES=parallel-install by default (bug 715110)

Zac Medico-2
On 5/17/20 1:29 PM, Zac Medico wrote:
> Enable FEATURES=parallel-install in order to increase IO throughput by
> allowing files to be merged or unmerged for one package while merge or
> unmerge ebuild phases execute for a different package.
>
> This feature introduces a small risk of file collisions going
> undetected for packages that are merged at about the same time, but
> the risk is considered to be practically negligible since those file
> collisions would typically be detected and prevented long before such
> an event would have an opportunity to occur.

NOTE: We could close this collision-protect hole by re-running the
collision-protect routine if anything has been merged since its run
prior to pkg_preinst.
--
Thanks,
Zac

