SAT-based dependency solver: request for test cases

classic Classic list List threaded Threaded
19 messages Options
Reply | Threaded
Open this post in threaded view
|

SAT-based dependency solver: request for test cases

Michael Lienhardt
Dear all,

With the help of some friends and colleagues, I am working on an SAT-based dependency solver for portage.
We need your help to test it and possibly improve in the long run the already great portage toolset.

To help, you can send us the tar generated by this bash script: https://raw.githubusercontent.com/HyVar/gentoo_to_mspl/master/benchmarks/get_installation.sh
This bash script extracts your world file, the USE flags and keywords configuration of your system and the list of installed packages you have (it should not take more than few seconds).
With this, we will see if our solver is able to recreate your system and how much time it takes.

You can send everything to my professional email: [hidden email]


The goal of this alternative solver is to overcome some of the limitations of emerge.
Thanks to some users of the forum and the gentoo-user mailing list, I already tested the solver on 8 systems, and the results for now are:
  - emerge is not able to recreate any of these systems (i.e., 'cat world_of_test_configuration | xargs emerge -vp' on a gentoo osboxes VM does not succeed, even with the right /etc/portage/package.* files)
  - our solver takes 2 minutes in average (with little variation), and gives either a yes answer (with what to install, which USE flags to set, which packages to keyword) or a no answer (with the set of conflicting constraints) for every systems
  - we solved one bug in our solver (a behavior of emerge that seems documented only in the PMS document, not in devmanual nor in the wiki, and that is visible only in 15 packages), but at least one seems to still be around.

I started discussing this on the gentoo-portage-dev and the gentoo-user mailing lists (sorry for the duplicates), and on the forum with three posts:
  - https://forums.gentoo.org/viewtopic-t-1074170.html about the documentation I collected to implement the solver
  - https://forums.gentoo.org/viewtopic-t-1074202.html about the solver and its motivations
  - https://forums.gentoo.org/viewtopic-t-1075286.html about the tests I'm doing


About me:
I'm a researcher in computer science, about formal methods, concurrency and software engineering.
Friday, I will present my first paper on portage, showing that its packages and their dependencies constitute what is called a "Multi-Software Product Line" in software engineering. If you skip the algebraic definition, it should be readable ^^: http://www.di.unito.it/~mlienhar/vamos.pdf
In 2013, I designed and contributed to the implementation of a dependency solver for dynamic cloud architecture (currently maintained by Jacopo Mauro https://bitbucket.org/jacopomauro/zephyrus2/src )
I'm a gentoo user since 2007, and I'm very happy by this opportunity to contribute back :).


Thank you!
Michael Lienhardt

Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Roy Bamford-2
On 2018.02.06 10:52, Michael Lienhardt wrote:

> Dear all,
>
> With the help of some friends and colleagues, I am working on an
> SAT-based dependency solver for portage.
> We need your help to test it and possibly improve in the long run the
> already great portage toolset.
>
> To help, you can send us the tar generated by this bash script:
> https://raw.githubusercontent.com/HyVar/gentoo_to_mspl/master/benchmarks/get_installation.sh
> This bash script extracts your world file, the USE flags and keywords
> configuration of your system and the list of installed packages you
> have (it should not take more than few seconds).
> With this, we will see if our solver is able to recreate your system
> and how much time it takes.
>
> You can send everything to my professional email: [hidden email]
>
>
> The goal of this alternative solver is to overcome some of the
> limitations of emerge.
> Thanks to some users of the forum and the gentoo-user mailing list, I
> already tested the solver on 8 systems, and the results for now are:
>   - emerge is not able to recreate any of these systems (i.e., 'cat
> world_of_test_configuration | xargs emerge -vp' on a gentoo osboxes VM
> does not succeed, even with the right /etc/portage/package.* files)
>   - our solver takes 2 minutes in average (with little variation), and
> gives either a yes answer (with what to install, which USE flags to
> set, which packages to keyword) or a no answer (with the set of
> conflicting constraints) for every systems
>   - we solved one bug in our solver (a behavior of emerge that seems
> documented only in the PMS document, not in devmanual nor in the wiki,
> and that is visible only in 15 packages), but at least one seems to
> still be around.
>
> I started discussing this on the gentoo-portage-dev and the
> gentoo-user mailing lists (sorry for the duplicates), and on the forum
> with three posts:
>   - https://forums.gentoo.org/viewtopic-t-1074170.html about the
> documentation I collected to implement the solver
>   - https://forums.gentoo.org/viewtopic-t-1074202.html about the
> solver and its motivations
>   - https://forums.gentoo.org/viewtopic-t-1075286.html about the tests
> I'm doing
>
>
> About me:
> I'm a researcher in computer science, about formal methods,
> concurrency and software engineering.
> Friday, I will present my first paper on portage, showing that its
> packages and their dependencies constitute what is called a
> "Multi-Software Product Line" in software engineering. If you skip the
> algebraic definition, it should be readable ^^:
> http://www.di.unito.it/~mlienhar/vamos.pdf
> In 2013, I designed and contributed to the implementation of a
> dependency solver for dynamic cloud architecture (currently maintained
> by Jacopo Mauro https://bitbucket.org/jacopomauro/zephyrus2/src )
> I'm a gentoo user since 2007, and I'm very happy by this opportunity
> to contribute back :).
>
>
> Thank you!
> Michael Lienhardt
>
>
>
Michael,

Posting here to alert other potential helpers.
Your script uses

PACKAGE_KEYWORDS="/etc/portage/package.accept_keywords"

but thats a recent name change.

PACKAGE_KEYWORDS="/etc/portage/package.keywords"

is the old name and my older systems still use that.
You probably need to harvest both and process both as portage does.

--
Regards,

Roy Bamford
(Neddyseagoon) a member of
elections
gentoo-ops
forum-mods

attachment0 (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Matthew Thode (prometheanfire)
In reply to this post by Michael Lienhardt
On 18-02-06 11:52:10, Michael Lienhardt wrote:

> Dear all,
>
> With the help of some friends and colleagues, I am working on an SAT-based dependency solver for portage.
> We need your help to test it and possibly improve in the long run the already great portage toolset.
>
> To help, you can send us the tar generated by this bash script: https://raw.githubusercontent.com/HyVar/gentoo_to_mspl/master/benchmarks/get_installation.sh
> This bash script extracts your world file, the USE flags and keywords configuration of your system and the list of installed packages you have (it should not take more than few seconds).
> With this, we will see if our solver is able to recreate your system and how much time it takes.
>
> You can send everything to my professional email: [hidden email]
>
>
> The goal of this alternative solver is to overcome some of the limitations of emerge.
> Thanks to some users of the forum and the gentoo-user mailing list, I already tested the solver on 8 systems, and the results for now are:
>  - emerge is not able to recreate any of these systems (i.e., 'cat world_of_test_configuration | xargs emerge -vp' on a gentoo osboxes VM does not succeed, even with the right /etc/portage/package.* files)
>  - our solver takes 2 minutes in average (with little variation), and gives either a yes answer (with what to install, which USE flags to set, which packages to keyword) or a no answer (with the set of conflicting constraints) for every systems
>  - we solved one bug in our solver (a behavior of emerge that seems documented only in the PMS document, not in devmanual nor in the wiki, and that is visible only in 15 packages), but at least one seems to still be around.
>
> I started discussing this on the gentoo-portage-dev and the gentoo-user mailing lists (sorry for the duplicates), and on the forum with three posts:
>  - https://forums.gentoo.org/viewtopic-t-1074170.html about the documentation I collected to implement the solver
>  - https://forums.gentoo.org/viewtopic-t-1074202.html about the solver and its motivations
>  - https://forums.gentoo.org/viewtopic-t-1075286.html about the tests I'm doing
>
>
> About me:
> I'm a researcher in computer science, about formal methods, concurrency and software engineering.
> Friday, I will present my first paper on portage, showing that its packages and their dependencies constitute what is called a "Multi-Software Product Line" in software engineering. If you skip the algebraic definition, it should be readable ^^: http://www.di.unito.it/~mlienhar/vamos.pdf
> In 2013, I designed and contributed to the implementation of a dependency solver for dynamic cloud architecture (currently maintained by Jacopo Mauro https://bitbucket.org/jacopomauro/zephyrus2/src )
> I'm a gentoo user since 2007, and I'm very happy by this opportunity to contribute back :).
>
>
This sounds intresting, I wonder how it'd handle things like
sys-cluster/openstack-meta which can sometimes require masking a package
(gentoo stablizes a package ahead of what openstack has tested support
for).

--
Matthew Thode (prometheanfire)

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Michael Lienhardt
In reply to this post by Roy Bamford-2


Il 06/02/2018 15:00, Roy Bamford ha scritto:

> Posting here to alert other potential helpers.
> Your script uses
>
> PACKAGE_KEYWORDS="/etc/portage/package.accept_keywords"
>
> but thats a recent name change.
>
> PACKAGE_KEYWORDS="/etc/portage/package.keywords"
>
> is the old name and my older systems still use that.
> You probably need to harvest both and process both as portage does.

You are right, I was lazy (and hoped that everyone already made the switch because, as I understand it, package.keywords and package.accept_keywords do not have the same semantics).
I added the package.keywords file/folder in the script.

Thank you,
Michael Lienhardt

Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Duncan-42
Michael Lienhardt posted on Tue, 06 Feb 2018 23:53:05 +0100 as excerpted:

> Il 06/02/2018 15:00, Roy Bamford ha scritto:
>> Posting here to alert other potential helpers.
>> Your script uses
>>
>> PACKAGE_KEYWORDS="/etc/portage/package.accept_keywords"
>>
>> but thats a recent name change.
>>
>> PACKAGE_KEYWORDS="/etc/portage/package.keywords"
>>
>> is the old name and my older systems still use that.
>> You probably need to harvest both and process both as portage does.
>
> You are right, I was lazy (and hoped that everyone already made the
> switch because, as I understand it, package.keywords and
> package.accept_keywords do not have the same semantics).
> I added the package.keywords file/folder in the script.

AFAIK, (plain) etc-portage semantics are the same for both files -- that
is, /etc/portage/package.keywords and the newer package.accept_keywords
are identical.

The reason for the new name and deprecation of the old one was that
package.keywords also exists in the /profile/, where the semantics are
different, which created confusion for devs and others attempting to edit
the profile version as well as the more commonly user-edited (plain)
/etc/portage version.

(I add the parenthesized "(plain)" because there's also the deeper
/etc/portage/profile path, which takes profile changes and therefore the
profile format.  Actually, I suspect it was someone editing that using
the wrong format and then filing a bug when things didn't work as
expected that likely prompted the new name and deprecation of the old
one.)


Meanwhile...

I've a rather unusual portage config here:

* /etc/portage/profile/packages contains a -* entry, negating the entire
normal @system set.  Many normally @system packages I really need are
dependencies of something or other I already have in @world anyway, and
I've added @world entries where necessary to keep the few exceptions
installed.

* My USE starts with a -* entry as well, negating profile and package USE
defaults so I have direct control of all USE flag settings and don't have
to worry about USE flag changes on profile updates or tracking down why a
flag is changing when I didn't change anything, both previous problems I
had to deal with until I set that initial -*, so the only flags set are
the ones I deliberately chose, myself.

* My world file (/var/lib/portage/world) is empty.  I've categorized
everything into individual sets found in /etc/portage/sets, with those in
turn listed in the world_sets file (/var/lib/portage/world_sets).

* Overlays... (Less unusual, but still not mainline...) I run nearly all
the kde I have installed (frameworks/plasma/apps), as well as a few other
packages, from the live-git *-9999 packages found in the gentoo/kde
overlay (and others).  Package keywording/masking is adjusted
accordingly.  (Everything else is mainline ~amd64.)

I expect  one or more of these you won't have anticipated so they'll
present a challenge for your current script, but because it /is/ a rather
unusual setup, I'm not sure it's worth your trouble to bother with.

OTOH, if you want unusual corner-cases to test...

So bother sending the results in (you're ready for it already), or you
want them, but wait until you've adjusted the script to deal with it, or
don't bother, you're not going to try supporting anything that unusual
anyway?

--
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Michael Lienhardt


Il 07/02/2018 05:11, Duncan ha scritto:

> AFAIK, (plain) etc-portage semantics are the same for both files -- that
> is, /etc/portage/package.keywords and the newer package.accept_keywords
> are identical.
>
> The reason for the new name and deprecation of the old one was that
> package.keywords also exists in the /profile/, where the semantics are
> different, which created confusion for devs and others attempting to edit
> the profile version as well as the more commonly user-edited (plain)
> /etc/portage version.
>
> (I add the parenthesized "(plain)" because there's also the deeper
> /etc/portage/profile path, which takes profile changes and therefore the
> profile format.  Actually, I suspect it was someone editing that using
> the wrong format and then filing a bug when things didn't work as
> expected that likely prompted the new name and deprecation of the old
> one.)

Ok, thank you for the information, it's very interesting.
I may need to update my solver then :) (I thought that *.keywords always manipulated the KEYWORDS variable, while *.accept_keywords manipulated the ACCEPT_KEYWORDS).

> I've a rather unusual portage config here:
>
> * /etc/portage/profile/packages contains a -* entry, negating the entire
> normal @system set.  Many normally @system packages I really need are
> dependencies of something or other I already have in @world anyway, and
> I've added @world entries where necessary to keep the few exceptions
> installed.
>
> * My USE starts with a -* entry as well, negating profile and package USE
> defaults so I have direct control of all USE flag settings and don't have
> to worry about USE flag changes on profile updates or tracking down why a
> flag is changing when I didn't change anything, both previous problems I
> had to deal with until I set that initial -*, so the only flags set are
> the ones I deliberately chose, myself.

Does the -* also remove profile USE defaults for USE flags in PROFILE_ONLY_VARIABLES?
My understanding is that only files in a profile (either /etc/portage/make.profile or /etc/portage/profile) can configure these USE flags.

> * My world file (/var/lib/portage/world) is empty.  I've categorized
> everything into individual sets found in /etc/portage/sets, with those in
> turn listed in the world_sets file (/var/lib/portage/world_sets).

Interesting, I didn't know about the /var/lib/portage/world_sets file (it escaped me somehow).
I currently include in all the packages listed in the /etc/portage/sets/* files.
It can be fixed easily.

> * Overlays... (Less unusual, but still not mainline...) I run nearly all
> the kde I have installed (frameworks/plasma/apps), as well as a few other
> packages, from the live-git *-9999 packages found in the gentoo/kde
> overlay (and others).  Package keywording/masking is adjusted
> accordingly.  (Everything else is mainline ~amd64.)
>
> I expect  one or more of these you won't have anticipated so they'll
> present a challenge for your current script, but because it /is/ a rather
> unusual setup, I'm not sure it's worth your trouble to bother with.

Out of all your points, only the overlays are not managed by our solver.
As your setup is unusual, apar from the overlay, it could be a good test case for our tool to check the configuration loading.
Right now in my tests, I deal with overlay by adding manually all the installed packages that is missing from the base portage tree...
If there is not too much packages to add (I guess 50 is still ok, 100 is a bit too much).

> OTOH, if you want unusual corner-cases to test...

I'm from formal methods, focused on correction and completeness...
So yes, corner cases are important and interesting, and are useful to detect early bug or design errors.

> So bother sending the results in (you're ready for it already), or you
> want them, but wait until you've adjusted the script to deal with it, or
> don't bother, you're not going to try supporting anything that unusual
> anyway?

Send it in :).
If I have problems with it, I may not fix the bugs right away, but at least I would know about it.

Thanks,
Michael Lienhardt


Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Toralf Förster-2
In reply to this post by Michael Lienhardt
On 02/06/2018 11:52 AM, Michael Lienhardt wrote:

To help, you can send us the tar generated by this bash script: https://raw.githubusercontent.com/HyVar/gentoo_to_mspl/master/benchmarks/get_installation.sh
This bash script extracts your world file, the USE flags and keywords configuration of your system and the list of installed packages you have (it should not take more than few seconds).
With this, we will see if our solver is able to recreate your system and how much time it takes.

You can send everything to my professional email: [hidden email]

Just send an email to that with an uunencoded tar.xz file from one of the tinderbox images [1] I do run.

I can adapt the scripts to send the result file of each of the currently 7 running images daily.


[1] https://zwiebeltoralf.de/tinderbox.html

-- 
Toralf
PGP 23217DA7 9B888F45

signature.asc (499 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Michał Górny-5
In reply to this post by Michael Lienhardt
W dniu wto, 06.02.2018 o godzinie 11∶52 +0100, użytkownik Michael
Lienhardt napisał:
> Dear all,
>
> With the help of some friends and colleagues, I am working on an SAT-based dependency solver for portage.
> We need your help to test it and possibly improve in the long run the already great portage toolset.
>
> To help, you can send us the tar generated by this bash script: https://raw.githubusercontent.com/HyVar/gentoo_to_mspl/master/benchmarks/get_installation.sh
> This bash script extracts your world file, the USE flags and keywords configuration of your system and the list of installed packages you have (it should not take more than few seconds).
> With this, we will see if our solver is able to recreate your system and how much time it takes.
>

To be honest, I don't think this is the right approach to the problem.
Truth is, dependencies in Gentoo are seriously broken, and most of
the developers aren't even aware of that because of layers upon layers
of hacks in Portage that make emerge somewhat go on.

If you are really able to build something on top of the input you
receive, it's probably going to be even worse than what's already
in Portage.

Example: many packages have impossible circular dependencies. However,
Portage conditionally pretends they don't exist, preferring some random
install-time breakage over fixing the packages in question.

--
Best regards,
Michał Górny


Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Ulrich Mueller-2
>>>>> On Sat, 10 Feb 2018, Michał Górny wrote:

> Example: many packages have impossible circular dependencies.
> However, Portage conditionally pretends they don't exist, preferring
> some random install-time breakage over fixing the packages in
> question.

Isn't that what the PMS allows, though? RDEPEND must be fulfilled,
"unless the particular dependency results in a circular dependency,
in which case it may be installed later".

Ulrich

attachment0 (501 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Michał Górny-5
W dniu sob, 10.02.2018 o godzinie 11∶20 +0100, użytkownik Ulrich Mueller
napisał:

> > > > > > On Sat, 10 Feb 2018, Michał Górny wrote:
> > Example: many packages have impossible circular dependencies.
> > However, Portage conditionally pretends they don't exist, preferring
> > some random install-time breakage over fixing the packages in
> > question.
>
> Isn't that what the PMS allows, though? RDEPEND must be fulfilled,
> "unless the particular dependency results in a circular dependency,
> in which case it may be installed later".
>

Yes, and I regret ever adjusting this to match the horribly misjudged
Portage behavior.

--
Best regards,
Michał Górny


Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Paweł Hajdan, Jr.
In reply to this post by Michał Górny-5
On 10/02/2018 09:20, Michał Górny wrote:
> To be honest, I don't think this is the right approach to the problem.

Feel free to suggest a better one.

> Truth is, dependencies in Gentoo are seriously broken, and most of
> the developers aren't even aware of that because of layers upon layers
> of hacks in Portage that make emerge somewhat go on.

Indeed, I may not be aware of many such problems.

Is there a place where the known ones can be documented?

Paweł


signature.asc (844 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Michael Lienhardt
In reply to this post by Toralf Förster-2
Sorry for the late reply.

This is a very interesting offer :)
However, I don't have the capacity to manage such quantity of data yet.
Up until now, I performed my tests in a VM on my laptop without anything else running...
Few days ago, my department lent me a dedicated laptop to perform the tests, which I still need to set up...

I'll definitively get back to you soonish.
Thanks!
Michael


Il 08/02/2018 20:57, Toralf Förster ha scritto:

> On 02/06/2018 11:52 AM, Michael Lienhardt wrote:
>>
>> To help, you can send us the tar generated by this bash script: https://raw.githubusercontent.com/HyVar/gentoo_to_mspl/master/benchmarks/get_installation.sh
>> This bash script extracts your world file, the USE flags and keywords configuration of your system and the list of installed packages you have (it should not take more than few seconds).
>> With this, we will see if our solver is able to recreate your system and how much time it takes.
>>
>> You can send everything to my professional email: [hidden email]
>>
> Just send an email to that with an uunencoded tar.xz file from one of the tinderbox images [1] I do run.
>
> I can adapt the scripts to send the result file of each of the currently 7 running images daily.
>
>
> [1] https://zwiebeltoralf.de/tinderbox.html
>
> --
> Toralf
> PGP 23217DA7 9B888F45
>

Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Michael Lienhardt
In reply to this post by Michał Górny-5
Dear Michał,

I understand your concerns, and for some time I shared them too.
Portage is a very complex piece of software that evolved for almost 20 years to perform a very difficult task quite quickly with good results, so it is bound to contain many ad-hoc fixes and tweaks that can hardly be encoded into SAT constraints.
However, I think there are two arguments strongly in favor of my approach.
1. it seems to me that there is a real effort from the gentoo community and management to have a precise description of how Portage should behave, with the PMS and all the documentation on the wiki and devmanual. My work is based on this documentation and goes in the same direction by giving a simple and unambiguous description (i.e., a formal semantics) of Portage's dependencies as described in
Section 8.2 of the PMS.
2. My work currently *only* focuses on the package dependencies. I don't deal with eclasses (I use egencache files), I don't deal with installation order (I just compute which packages must be installed in the end: in doing so, I avoid the circular dependency problem you were talking about), I don't deal with actual package compilation, installation, etc. I focus on a very precise part of Portage,
dependency solving, subject on which I have some expertise.

So I agree with you that it is unlikely that my prototype's outputs will correspond one to one to emerge's.
However, I think it can be positive for Portage.
As my prototype uses an off-the-shelf solver as backend, in principle it just consists of a translation of Section 8.2 of the PMS into SAT constraints: consequently, it is a rather small piece of code compared to Portage, and once I finish reading the PMS and clean up my implementation, it would likely be a correct implementation of the PMS, or at least simple enough to be easily checked and
corrected.
Then, testing my prototype to check for bugs is also a good opportunity to check for bugs into Portage itself, by taking the PMS as reference. And if you are right and my prototype --a direct translation of the PMS-- ends up worse than Portage, then maybe this means that the PMS should be revised.
Also, in the long run, having an alternative implementation of one of Portage's functionality instead of a complete alternative to Portage like paludis or pkgcore, could help in the discussion about a modular implementation of Portage (I've heard some people talking about it), or about a next version of the PMS.

I'll finish my prototype as soon as possible and see what the tests will tell.

Best regards,
Michael Lienhardt


Il 10/02/2018 09:20, Michał Górny ha scritto:

> W dniu wto, 06.02.2018 o godzinie 11∶52 +0100, użytkownik Michael
> Lienhardt napisał:
>> Dear all,
>>
>> With the help of some friends and colleagues, I am working on an SAT-based dependency solver for portage.
>> We need your help to test it and possibly improve in the long run the already great portage toolset.
>>
>> To help, you can send us the tar generated by this bash script: https://raw.githubusercontent.com/HyVar/gentoo_to_mspl/master/benchmarks/get_installation.sh
>> This bash script extracts your world file, the USE flags and keywords configuration of your system and the list of installed packages you have (it should not take more than few seconds).
>> With this, we will see if our solver is able to recreate your system and how much time it takes.
>>
>
> To be honest, I don't think this is the right approach to the problem.
> Truth is, dependencies in Gentoo are seriously broken, and most of
> the developers aren't even aware of that because of layers upon layers
> of hacks in Portage that make emerge somewhat go on.
>
> If you are really able to build something on top of the input you
> receive, it's probably going to be even worse than what's already
> in Portage.
>
> Example: many packages have impossible circular dependencies. However,
> Portage conditionally pretends they don't exist, preferring some random
> install-time breakage over fixing the packages in question.
>

Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Martin Vaeth-2
Michael Lienhardt <[hidden email]> wrote:
>
> ad-hoc fixes and tweaks that can hardly be encoded into SAT constraints.

The main difficulty which I see is that one does not want only _some_
solution, but among all solutions one which optimizes certain quantities.
So it seems to me that a discrete optimization under constraints
is required instead of a pure SAT solver (although formally, of course,
such an optimization problem can be reduced to SAT, I do not know whether
you went the road):

a. The number of packages which change their status (from installed to
uninstalled or vice versa) should be minimal.

b. Similarly, the number of USE-flag changes necessary to achieve this
aim should be minimized.
(You didn't tell whether your solver already supports such changes,
but when it is finished, it definitely should.)

Hopefully in near future, there will be a second class of USE-flags
whose change is "cheap" which should be excluded from this minimization
restriction:
https://bugs.gentoo.org/show_bug.cgi?id=424283
I think the main reason why nobody dared to implement them yet
(although almost everybody wants them) are exactly these implications
on the current solver in portage which nobody dares to change anymore
seriously.

c. A solution with dependency loops should be avoided if possible
(which is why currently the PDEPEND hacks do exist: To tell the solver
which loops are not a problem.)

d. In || ( ... ) clauses the left-most packages should be preserved.
There are similar (and more difficult) rules for REQUIRED_USE.

e. Last but not least: The preferences are ordered a > b > c > d
(A dependency loop of uninstalled packages should probably have even
higher priority than a).
Additionally the change installed -> uninstalled should be less
"expensive" than the change uninstalled -> installed.
The correct weighting of the quantities should probably be a matter
of discussion (or preferrably even user-customizable), and preferrably
should not depend only on the number of packages but also on other
customizable quantities (e.g. the package size).

Perhaps - if one can rely on some solver - in a future EAPI instead
of the size heuristic, one could give a hint for how "expensive" it
is to recompile a certain package, so that dependency results can
be better optimized for that.

IIRC, this is already built in rudimentarily in portage's current
solver by defining the recompilation costs of virtual packages as 0.

> I don't deal with installation order [...] circular dependency problem

++
Once the packages are known, the installation order and breaking of
unavoidable circles is a matter of a single graph traversal in the
resulting subgraph which needs neglectable linear time.
However, if there is a solution without a circle this should have
been found instead in the first place (cf. c).


Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Michał Górny-5
W dniu wto, 13.02.2018 o godzinie 07∶49 +0000, użytkownik Martin Vaeth
napisał:

> Michael Lienhardt <[hidden email]> wrote:
> >
> > ad-hoc fixes and tweaks that can hardly be encoded into SAT constraints.
>
> The main difficulty which I see is that one does not want only _some_
> solution, but among all solutions one which optimizes certain quantities.
> So it seems to me that a discrete optimization under constraints
> is required instead of a pure SAT solver (although formally, of course,
> such an optimization problem can be reduced to SAT, I do not know whether
> you went the road):
>
> a. The number of packages which change their status (from installed to
> uninstalled or vice versa) should be minimal.
>
> b. Similarly, the number of USE-flag changes necessary to achieve this
> aim should be minimized.
> (You didn't tell whether your solver already supports such changes,
> but when it is finished, it definitely should.)
>
> Hopefully in near future, there will be a second class of USE-flags
> whose change is "cheap" which should be excluded from this minimization
> restriction:
> https://bugs.gentoo.org/show_bug.cgi?id=424283
> I think the main reason why nobody dared to implement them yet
> (although almost everybody wants them) are exactly these implications
> on the current solver in portage which nobody dares to change anymore
> seriously.
>
> c. A solution with dependency loops should be avoided if possible
> (which is why currently the PDEPEND hacks do exist: To tell the solver
> which loops are not a problem.)
>
> d. In || ( ... ) clauses the left-most packages should be preserved.
> There are similar (and more difficult) rules for REQUIRED_USE.
>
> e. Last but not least: The preferences are ordered a > b > c > d
> (A dependency loop of uninstalled packages should probably have even
> higher priority than a).
> Additionally the change installed -> uninstalled should be less
> "expensive" than the change uninstalled -> installed.
> The correct weighting of the quantities should probably be a matter
> of discussion (or preferrably even user-customizable), and preferrably
> should not depend only on the number of packages but also on other
> customizable quantities (e.g. the package size).
>

Thank you for listing this. However, I think you've missed the most
important point: we want to prefer the newest version, whenever possible
;-).

--
Best regards,
Michał Górny


Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Martin Vaeth-2
Michał Górny <[hidden email]> wrote:
>>
>> d. In || ( ... ) clauses the left-most packages should be preserved.

s/preserved/preferred/

> you've missed the most important point: we want to prefer
> the newest version, whenever possible
> ;-).

Yes, you are right: I had thought only about packages, not about versions.
Of course, a version upgrade within the same slot should have a
negative penalty. I am already not sure how upgrades with slot
changes should be treated. And then there are subslots...
The list is probably still not exhaustive and - as already mentioned -
the penalties are certainly a topic of discussion (and even more of
trial and error: which works best in practice).

Anyway, I think that it would be a huge improvement over the current
(portage's) solver if one could formulate such a list explicitly in
some specified format, and then one abstract algorithm takes care to
find the best solution according to the specified penalties:
If I understand it correctly, all these rules are currently hard-coded
into the algorithm by using various hacks in backtracking steps,
finding of a solution is convoluted with determining the order, etc.
One would obtain a solver which not only is "provably" correct,
but also much easier to maintain and to tweak (e.g. if one realizes
that certain penalties were not cleverly chosen).

This would also provide the possibility for much richer user configuration.
An example which immediately come to mind is "weak-masking" of packages
or versions ("use it or upgrade only if no alternative is available").
Theoretically, one could then even endow the entries of package.mask
with a penalty number.


Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Michael Lienhardt
In reply to this post by Michał Górny-5
Thanks a lot for this list!

You are totally right, simply translating the dependencies into SAT constraints and feeding them to a solver returns in most cases a very bad, totally useless solution.
However, nowadays many solvers support solution optimization, i.e., you can specify an ordered list of criteria you want to maximize/minimize, exactly like you did.
I already implemented the minimal status difference and the minimal installation size criteria, and the solutions I get are already acceptable/good.

I wasn't aware of the criteria list you gave (maybe it's in the PMS), so thank you very much, this is a big help :)
I have few questions about these criteria:
  - in criteria a., there could be an ambiguity between what I call a package (e.g., 'app-editors/nano-2.9.3') and a package group (e.g., 'app-editors/nano'): is the criteria about package group (i.e., are version changes allowed)?
  - similarly in criteria b.: is this criteria valid across versions (i.e., when changing version, the USE-flag change should be minimal), across slots (i.e., two installed versions of the same package group should have a minimal USE flag difference)?
    If yes, what if package.use specifies very different USE flags for two versions of the same package group?
  - does the "prefer new version" criteria go between a. and b.?

>> b. Similarly, the number of USE-flag changes necessary to achieve this
>> aim should be minimized.
>> (You didn't tell whether your solver already supports such changes,
>> but when it is finished, it definitely should.)

Yes, my solver supports USE-flag, keyword and mask changes (it is currently oblivious of licenses, but supporting them should just be a technical/time consuming effort).
Due to keywords and mask changes, the "prefer new version" criteria needs to be after criteria "less keyword and mask change".
  - just to be sure, the "less keyword and mask change" criteria is at the top of the list?

Best regards,
Michael Lienhardt

Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Martin Vaeth-2
Michael Lienhardt <[hidden email]> wrote:
> the criteria list you gave (maybe it's in the PMS)

I doubt that it is in PMS, and IMHO it also does not belong there:
As long as the result configuration is valid (no collisions
or unresolvable loops) all should be equally fine from the
viewpoint of PMS: It is in the responsibility of the package
manager and its configuration to determine correctly what the
_user_ actually wants.

>   - in criteria a., there could be an ambiguity between what I call a package
> (e.g., 'app-editors/nano-2.9.3') and a package group (e.g.,
> 'app-editors/nano')

This is what already Michał has oberved: In my list, I had completely
forgotten to refer to versions ('app-editors/nano-2.9.3') and had only
packages ('app-editors/nano') in mind.
(I think "version" vs. "package" is the usual terminology in gentoo;
there are also "slots" which in a sense might be considered as
different packages, although perhaps with a different "penalty")

> is the criteria about package group (i.e., are version changes allowed)?

Version upgrades should even be preferred over remaining with the version.
OTOH, version downgrades are perhaps even worse than using a different package
(opinions might differ for the latter).

>   - similarly in criteria b.: is this criteria valid across versions
> (i.e., when changing version, the USE-flag change should be minimal)

Cf. the next point: The USE-flag change compared to user configuration
(package.use etc.) should be minimal.

>   - just to be sure, the "less keyword and mask change" criteria is at the
> top of the list?

Yes. Only "illegal" configurations (colliding packages or unresolvable
dependency loops of uninstalled packages) should weight stronger:
If possible at all, the user's choice should always be preferred,
and changes to the configuration should only be suggested if no other
solution exists (and probably the number of suggested changes should
be minimal in a sense).


Reply | Threaded
Open this post in threaded view
|

Re: SAT-based dependency solver: request for test cases

Benda Xu
In reply to this post by Michael Lienhardt
Hi Michael,

I haven't fully understood SAT yet and I haven't completely follow the
discussion.  But I think this is a logical direction to improve
dependency solving in Gentoo.  Keep on the good work, I am interested in
knowing how well it performs.

Yours,
Benda