New distfile mirror layout

Re: New distfile mirror layout

Michał Górny-5
On Wed, 2019-10-23 at 17:04 -0500, William Hubbs wrote:

> On Wed, Oct 23, 2019 at 01:18:02AM -0400, Joshua Kinard wrote:
> > On 10/21/2019 19:36, Matt Turner wrote:
> > > On Mon, Oct 21, 2019 at 9:42 AM Richard Yao <[hidden email]> wrote:
> > > > Also, another idea is to use a cheap hash function (e.g. fletcher) and just have the mirrors do the hashing behind the scenes. Then we would have the best of both worlds.
> > >
> > > It probably would have been better to make these suggestions when the
> > > GLEP was discussed close to two years ago.
> > >
> > > I'm glad that we have ideas for improvements but I worry that we're
> > > just backseat driving at this point given that the GLEP's now
> > > implemented.
> >
> > Agreed, although, I don't even remember this coming up two years ago.  But,
> > I was tied up with a lot of work-related stress and tasks, so probably just
> > my memory storage backend not having enough cycles to commit it to...neurons.
>  
>  After looking at this further, I found that the glep was presented to
>  us in Jan 2018 on the dev ml [1].
>
> I checked all council meeting logs and discovered that this was never
> brought to us formally for approval.
>
> It looks like the developers decided to do this as an
> infrastructure/portage project and because of that they felt like they
> didn't need a glep.
>
...or simply forgotten whether it was approved or not after waiting
almost two years for the Portage team to provide a reference implementation.

--
Best regards,
Michał Górny



Re: New distfile mirror layout

Chí-Thanh Christopher Nguyễn
In reply to this post by Michał Górny-5
Hi!

> Today you get chastised for using /space/distfiles-local and not
> following policy changes.  The devmanual states that it's deprecated
> since at least 2011, and talks of using d.g.o [1].
> [1] https://devmanual.gentoo.org/general-concepts/mirrors/index.html#suitable-download-hosts

Sorry I'm late to the party, but I would like to enquire about what happens
if a file with existing filename but different b2sum gets uploaded to
/space/distfiles-local now?

Doing so and updating the Manifest used to be another (not necessarily
preferred) method to address upstream remaking release packages.


Best regards,
Chí-Thanh Christopher Nguyễn


Re: New distfile mirror layout

Michał Górny-5
On Tue, 2019-10-29 at 00:24 +0100, Chí-Thanh Christopher Nguyễn wrote:

> Hi!
>
> > Today you get chastised for using /space/distfiles-local and not
> > following policy changes.  The devmanual states that it's deprecated
> > since at least 2011, and talks of using d.g.o [1].
> > [1] https://devmanual.gentoo.org/general-concepts/mirrors/index.html#suitable-download-hosts
>
> Sorry I'm late to the party, but I would like to enquire about what happens
> if a file with existing filename but different b2sum gets uploaded to
> /space/distfiles-local now?
The same as before.  It gets put in the top-level distfiles directory.
Hashes are calculated from filenames, so this wouldn't affect it.  That
is, if it put those files in subdirectories in the first place, which
it doesn't.
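
To make the point concrete: under a filename-hash layout the subdirectory
depends only on the file's name, so re-uploading different contents under
the same name cannot move it.  A minimal sketch, assuming the GLEP 75
filename-hash structure with BLAKE2B and an 8-bit cutoff (two hex digits);
the cutoff value and the function name mirror_subpath are assumptions for
illustration, not taken from this thread:

```python
import hashlib


def mirror_subpath(filename: str, cutoff_bits: int = 8) -> str:
    """Return '<hash-prefix>/<filename>' for a filename-hash layout.

    Assumption: the prefix is the leading hex digits of the BLAKE2B
    digest of the *filename* (not of the file contents).
    """
    if cutoff_bits <= 0 or cutoff_bits % 4:
        raise ValueError("cutoff must be a positive multiple of 4 bits")
    # Hash the name itself; the file's contents never enter the calculation.
    digest = hashlib.blake2b(filename.encode("utf-8")).hexdigest()
    # Each hex digit carries 4 bits, so 8 bits means two characters.
    return f"{digest[:cutoff_bits // 4]}/{filename}"


if __name__ == "__main__":
    # Hypothetical distfile name, used only to show the call.
    print(mirror_subpath("foo-1.0.tar.gz"))
```

Calling mirror_subpath() twice with the same name always yields the same
path, whatever the b2sum of the uploaded file is.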

> Doing so and updating the Manifest used to be another (not necessarily
> preferred) method to address upstream remaking release packages.
>

It's no longer valid.

--
Best regards,
Michał Górny



Re: New distfile mirror layout

Fabian Groffen-2
On 29-10-2019 05:27:37 +0100, Michał Górny wrote:

> On Tue, 2019-10-29 at 00:24 +0100, Chí-Thanh Christopher Nguyễn wrote:
> > Hi!
> >
> > > Today you get chastised for using /space/distfiles-local and not
> > > following policy changes.  The devmanual states that it's deprecated
> > > since at least 2011, and talks of using d.g.o [1].
> > > [1] https://devmanual.gentoo.org/general-concepts/mirrors/index.html#suitable-download-hosts
> >
> > Sorry I'm late to the party, but I would like to enquire about what happens
> > if a file with existing filename but different b2sum gets uploaded to
> > /space/distfiles-local now?
>
> The same as before.  It gets put in top-level distfiles directory.
> Hashes are calculated from filenames, so this wouldn't affect it.  That
> is, if it put those files in subdirectories in the first place because
> it doesn't.
/space/distfiles-local is no longer copied to the mirrors? or just not
copied in the subdir-hierarchy?

> > Doing so and updating the Manifest used to be another (not necessarily
> > preferred) method to address upstream remaking release packages.
> >
>
> It's no longer valid.

Just wondering.  Do you mean it isn't valid that some upstreams do this
(yes horror)?  We surely need a way to work around that ...

Thanks,
Fabian


--
Fabian Groffen
Gentoo on a different level


Re: New distfile mirror layout

Michał Górny-5
On October 29, 2019 9:34:01 AM UTC, Fabian Groffen <[hidden email]> wrote:

>On 29-10-2019 05:27:37 +0100, Michał Górny wrote:
>> On Tue, 2019-10-29 at 00:24 +0100, Chí-Thanh Christopher Nguyễn wrote:
>> > Hi!
>> >
>> > > Today you get chastised for using /space/distfiles-local and not
>> > > following policy changes.  The devmanual states that it's deprecated
>> > > since at least 2011, and talks of using d.g.o [1].
>> > > [1] https://devmanual.gentoo.org/general-concepts/mirrors/index.html#suitable-download-hosts
>> >
>> > Sorry I'm late to the party, but I would like to enquire about what happens
>> > if a file with existing filename but different b2sum gets uploaded to
>> > /space/distfiles-local now?
>>
>> The same as before.  It gets put in top-level distfiles directory.
>> Hashes are calculated from filenames, so this wouldn't affect it.  That
>> is, if it put those files in subdirectories in the first place because
>> it doesn't.
>
>/space/distfiles-local is no longer copied to the mirrors? or just not
>copied in the subdir-hierarchy?

The latter.

>
>> > Doing so and updating the Manifest used to be another (not necessarily
>> > preferred) method to address upstream remaking release packages.
>> >
>>
>> It's no longer valid.
>
>Just wondering.  Do you mean it isn't valid that some upstreams do this
>(yes horror)?  We surely need a way to work around that ...

I mean the method of using the same filename and expecting distfiles-local to overwrite it. It is preferable to just rename the file.

>
>Thanks,
>Fabian


--
Best regards,
Michał Górny


Re: New distfile mirror layout

Ulrich Mueller-2
>>>>> On Tue, 29 Oct 2019, Michał Górny wrote:

> On October 29, 2019 9:34:01 AM UTC, Fabian Groffen <[hidden email]> wrote:
>> /space/distfiles-local is no longer copied to the mirrors? or just
>> not copied in the subdir-hierarchy?

> The latter.

So, what has to be done to have it appear in the proper place? Should
the file be placed in a subdir of /space/distfiles-local/? That seems to
be error prone, and certainly could be automated?

>> Just wondering. Do you mean it isn't valid that some upstreams do
>> this (yes horror)? We surely need a way to work around that ...

> I mean the method using same filename and expecting distfiles-local to
> overwrite it. It is preferable to just rename it.

Looks like this will break backwards compatibility. IIUC, backwards
compatibility is also broken on the receiving side, that is,
mirror://gentoo/ in SRC_URI will no longer work as expected?

Shouldn't GLEP 75 have mentioned this? It's certainly something that
needs to be discussed before the GLEP is implemented.

Ulrich


Re: New distfile mirror layout

Michał Górny-5
On Tue, 2019-10-29 at 13:23 +0100, Ulrich Mueller wrote:
> > > > > > On Tue, 29 Oct 2019, Michał Górny wrote:
> > On October 29, 2019 9:34:01 AM UTC, Fabian Groffen <[hidden email]> wrote:
> > > /space/distfiles-local is no longer copied to the mirrors? or just
> > > not copied in the subdir-hierarchy?
> > The latter.
>
> So, what has to be done to have it appear in the proper place? Should
> the file be placed in a subdir of /space/distfiles-local/? That seems to
> be error prone, and certainly could be automated?

The file should be placed in SRC_URI, and emirrordist will take care of
fetching it.

>
> > > Just wondering. Do you mean it isn't valid that some upstreams do
> > > this (yes horror)? We surely need a way to work around that ...
> > I mean the method using same filename and expecting distfiles-local to
> > overwrite it. It is preferable to just rename it.
>
> Looks like this will break backwards compatibility. IIUC, backwards
> compatibility is also broken on the receiving side, that is,
> mirror://gentoo/ in SRC_URI will no longer work as expected?

Yes, this was noted in the top mail.

> Shouldn't GLEP 75 have mentioned this? It's certainly something that
> needs to be discussed before the GLEP is implemented.

GLEP only covers how regular distfile fetching works.  Third-party
mirrors are out of scope, and all the people working on it and reviewing
it have missed the problem.  That said, this can't be fixed within
bounds defined by PMS.

Given that mirror://gentoo is discouraged since at least 2011, I don't
see a big deal here.  One day it'll stop working; we should stop using
it before then.

--
Best regards,
Michał Górny



Re: New distfile mirror layout

Ulrich Mueller-2
>>>>> On Tue, 29 Oct 2019, Michał Górny wrote:

> On Tue, 2019-10-29 at 13:23 +0100, Ulrich Mueller wrote:
>> So, what has to be done to have it appear in the proper place?
>> Should the file be placed in a subdir of /space/distfiles-local/?
>> That seems to be error prone, and certainly could be automated?

> The file should be placed in SRC_URI, and emirrordist will take care
> of fetching it.

What if the file is hosted at a non-standard tcp port upstream (like
http://example.org:8080/)? The devmanual says that it _must_ be manually
uploaded to /space/distfiles-local/ in such cases.

Ulrich


Re: New distfile mirror layout

Ulrich Mueller-2
>>>>> On Tue, 29 Oct 2019, Ulrich Mueller wrote:

>>>>> On Tue, 29 Oct 2019, Michał Górny wrote:
>> The file should be placed in SRC_URI, and emirrordist will take care
>> of fetching it.

> What if the file is hosted at a non-standard tcp port upstream (like
> http://example.org:8080/)? The devmanual says that it _must_ be manually
> uploaded to /space/distfiles-local/ in such cases.

Or another example, app-emacs/vhdl-mode-3.38.1, where (incompetent,
or nasty?) upstream blocks wget for some reason, but other methods
(e.g., curl, firefox) work? How would I get the file onto the mirrors
there?

Ulrich


Re: New distfile mirror layout

Michał Górny-5
In reply to this post by Ulrich Mueller-2
On Tue, 2019-10-29 at 14:03 +0100, Ulrich Mueller wrote:

> > > > > > On Tue, 29 Oct 2019, Michał Górny wrote:
> > On Tue, 2019-10-29 at 13:23 +0100, Ulrich Mueller wrote:
> > > So, what has to be done to have it appear in the proper place?
> > > Should the file be placed in a subdir of /space/distfiles-local/?
> > > That seems to be error prone, and certainly could be automated?
> > The file should be placed in SRC_URI, and emirrordist will take care
> > of fetching it.
>
> What if the file is hosted at a non-standard tcp port upstream (like
> http://example.org:8080/)? The devmanual says that it _must_ be manually
> uploaded to /space/distfiles-local/ in such cases.
>
I can't really see why this wouldn't work.  I just did an experiment
using app-benchmarks/forkbomb, and emirrordist fetched it just fine.

--
Best regards,
Michał Górny



Re: New distfile mirror layout

Michał Górny-5
In reply to this post by Ulrich Mueller-2
On Tue, 2019-10-29 at 14:09 +0100, Ulrich Mueller wrote:

> > > > > > On Tue, 29 Oct 2019, Ulrich Mueller wrote:
> > > > > > On Tue, 29 Oct 2019, Michał Górny wrote:
> > > The file should be placed in SRC_URI, and emirrordist will take care
> > > of fetching it.
> > What if the file is hosted at a non-standard tcp port upstream (like
> > http://example.org:8080/)? The devmanual says that it _must_ be manually
> > uploaded to /space/distfiles-local/ in such cases.
>
> Or another example, app-emacs/vhdl-mode-3.38.1, where (incompetent,
> or nasty?) upstream blocks wget for some reason, but other methods
> (e.g., curl, firefox) work? How would I get the file onto the mirrors
> there?
>
If I were you, I would've explicitly mirrored the file anyway.
If upstream blocks wget, then users who do not use GENTOO_MIRRORS will
also suffer due to it.

--
Best regards,
Michał Górny



Re: New distfile mirror layout

Ulrich Mueller-2
>>>>> On Tue, 29 Oct 2019, Michał Górny wrote:

> On Tue, 2019-10-29 at 14:09 +0100, Ulrich Mueller wrote:
>> > What if the file is hosted at a non-standard tcp port upstream
>> > (like http://example.org:8080/)? The devmanual says that it _must_
>> > be manually uploaded to /space/distfiles-local/ in such cases.

>> Or another example, app-emacs/vhdl-mode-3.38.1, where (incompetent,
>> or nasty?) upstream blocks wget for some reason, but other methods
>> (e.g., curl, firefox) work? How would I get the file onto the mirrors
>> there?

> If I were you, I would've explicitly mirrored the file anyway.
> If upstream blocks wget, then users who do not use GENTOO_MIRRORS will
> also suffer due to it.

All I'm saying is that there can be unusual circumstances where
manual uploading of a file is useful. So please don't take that
possibility away.

Ulrich


Re: New distfile mirror layout

Fabian Groffen-2
On 29-10-2019 15:17:38 +0100, Ulrich Mueller wrote:

> >>>>> On Tue, 29 Oct 2019, Michał Górny wrote:
>
> > On Tue, 2019-10-29 at 14:09 +0100, Ulrich Mueller wrote:
> >> > What if the file is hosted at a non-standard tcp port upstream
> >> > (like http://example.org:8080/)? The devmanual says that it _must_
> >> > be manually uploaded to /space/distfiles-local/ in such cases.
>
> >> Or another example, app-emacs/vhdl-mode-3.38.1, where (incompetent,
> >> or nasty?) upstream blocks wget for some reason, but other methods
> >> (e.g., curl, firefox) work? How would I get the file onto the mirrors
> >> there?
>
> > If I were you, I would've explicitly mirrored the file anyway.
> > If upstream blocks wget, then users who do not use GENTOO_MIRRORS will
> > also suffer due to it.
>
> All what I'm saying is that there can be unusual circumstances where
> manual uploading of a file is useful. So please don't take that
> possibility away.
In addition, there are currently files there that aren't referenced from
ebuilds.  Prefix uses these files during bootstrap; local mirrors are
often much faster than dev.g.o.

If the files don't get mirrored anymore, I guess I can create a dummy
ebuild that has the files in SRC_URI.

If the files get mirrored, but put in a subdir based on the filename
hash, the original query endpoint on distfiles.g.o changes, much like
the SRC_URI approach.

Now I can use distfiles.prefix.b.n, which redirects to the distfiles.g.o
URL with subdir, for the most part I think, but it's sub-optimal from my
point of view.  Calculating the hash is not always feasible due to the
lack of b2sum or other means.  Hence my earlier request to have such an
official translation service on Gentoo hardware.

(I just wrote a small wsgi script that calculates the hash and generates
the redirect from Python, served via uwsgi/nginx, but there should be
many ways to achieve the same goals, if and only if a blake2b
implementation were available for it.)
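
For illustration, a redirect service of that kind could look roughly like
the sketch below.  This is not the script mentioned above, only a minimal
WSGI application under stated assumptions: the mirror uses a two-hex-digit
BLAKE2B filename-hash prefix, and MIRROR_BASE is a placeholder URL rather
than a real endpoint.

```python
import hashlib
from urllib.parse import quote

# Assumption: placeholder for the base URL of the hashed mirror tree.
MIRROR_BASE = "https://distfiles.example.org/distfiles"


def application(environ, start_response):
    """Redirect /<filename> to <MIRROR_BASE>/<hash-prefix>/<filename>."""
    # Use the last path component of the request as the distfile name.
    filename = environ.get("PATH_INFO", "").rsplit("/", 1)[-1]
    if not filename:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"no filename given\n"]
    # WSGI hands over the path as a latin-1 decoded string; re-encoding
    # with latin-1 recovers the bytes the client actually sent.
    raw = filename.encode("latin-1")
    # Assumed layout: first two hex digits of BLAKE2B over the filename.
    prefix = hashlib.blake2b(raw).hexdigest()[:2]
    location = f"{MIRROR_BASE}/{prefix}/{quote(raw)}"
    start_response("302 Found", [("Location", location),
                                 ("Content-Type", "text/plain")])
    return [b""]
```

Any WSGI server can host the application callable, for example uwsgi
behind nginx as described above.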

Thanks,
Fabian

--
Fabian Groffen
Gentoo on a different level


Re: New distfile mirror layout

Michał Górny-5
On Tue, 2019-10-29 at 15:33 +0100, Fabian Groffen wrote:

> On 29-10-2019 15:17:38 +0100, Ulrich Mueller wrote:
> > > > > > > On Tue, 29 Oct 2019, Michał Górny wrote:
> > > On Tue, 2019-10-29 at 14:09 +0100, Ulrich Mueller wrote:
> > > > > What if the file is hosted at a non-standard tcp port upstream
> > > > > (like http://example.org:8080/)? The devmanual says that it _must_
> > > > > be manually uploaded to /space/distfiles-local/ in such cases.
> > > > Or another example, app-emacs/vhdl-mode-3.38.1, where (incompetent,
> > > > or nasty?) upstream blocks wget for some reason, but other methods
> > > > (e.g., curl, firefox) work? How would I get the file onto the mirrors
> > > > there?
> > > If I were you, I would've explicitly mirrored the file anyway.
> > > If upstream blocks wget, then users who do not use GENTOO_MIRRORS will
> > > also suffer due to it.
> >
> > All what I'm saying is that there can be unusual circumstances where
> > manual uploading of a file is useful. So please don't take that
> > possibility away.
>
> In addition, there are currently files there that aren't referenced from
> ebuilds.  Prefix uses these files during bootstrap, local mirrors are
> often much faster than dev.g.o.
>
> If the files don't get mirrored anymore, I guess I can create a dummy
> ebuild that has the files in SRC_URI.
Ok, this is something I wasn't aware of.  I agree that a dummy ebuild
should not be necessary here.  However, I'm also not sure if
distfiles-local is really the proper way either, especially since I
don't see such files on woodpecker right now.

I don't think the matter is urgent right now, so let's ponder on it
a bit.  In particular, I think we should have a clear indication of who
added which files, when, what for and where they came from.  Those are
precisely the things that the current distfiles-local approach misses.

> If the files get mirrored, but put in a subdir based on the filename
> hash, the original query endpoint on distfiles.g.o changes, much like
> the SRC_URI approach.
>
> Now I can use distfiles.prefix.b.n which redirects to the distfiles.g.o
> URL with subdir for most part I think, but it's sub-optimal from my
> point of view.  Calculating the hash is not always feasible due to the
> lack of b2sum or other means.  Hence my earlier request to have such
> official translation service on Gentoo hardware.
>
> (I just wrote a small wsgi script that calculates the hash and generates
> the redirect from Python, served via uwsgi/nginx, but there should be
> many ways to achieve the same goals, if and only if a blake2b
> implementation were available for it.)
>
This is also something that needs thinking.  I personally don't mind
having one but it would be nice if it was able to account for geodns
and such.

--
Best regards,
Michał Górny



Re: New distfile mirror layout

Fabian Groffen-2
On 29-10-2019 15:45:34 +0100, Michał Górny wrote:

> On Tue, 2019-10-29 at 15:33 +0100, Fabian Groffen wrote:
> > In addition, there are currently files there that aren't referenced from
> > ebuilds.  Prefix uses these files during bootstrap, local mirrors are
> > often much faster than dev.g.o.
> >
> > If the files don't get mirrored anymore, I guess I can create a dummy
> > ebuild that has the files in SRC_URI.
>
> Ok, this is something I wasn't aware of.  I agree that dummy ebuild
> should not be necessary here.  However, I'm also not sure if distfiles-
> local is really the proper way either, especially that I don't see such
> files on woodpecker right now.
There should be /space/distfiles-local and
/space/distfiles-whitelist/prefix with a list of files to retain on the
mirror.

Thanks,
Fabian

> I don't think the matter is urgent right now, so let's ponder on it
> a bit.  In particular, I think we should have a clear indication of who
> added which files, when, what for and where they came from.  Those are
> precisely the things that the current distfiles-local approach misses.
>
> > If the files get mirrored, but put in a subdir based on the filename
> > hash, the original query endpoint on distfiles.g.o changes, much like
> > the SRC_URI approach.
> >
> > Now I can use distfiles.prefix.b.n which redirects to the distfiles.g.o
> > URL with subdir for most part I think, but it's sub-optimal from my
> > point of view.  Calculating the hash is not always feasible due to the
> > lack of b2sum or other means.  Hence my earlier request to have such
> > official translation service on Gentoo hardware.
> >
> > (I just wrote a small wsgi script that calculates the hash and generates
> > the redirect from Python, served via uwsgi/nginx, but there should be
> > many ways to achieve the same goals, if and only if a blake2b
> > implementation were available for it.)
>
> This is also something that needs thinking.  I personally don't mind
> having one but it would be nice if it was able to account for geodns
> and such.
>
> --
> Best regards,
> Michał Górny
>


--
Fabian Groffen
Gentoo on a different level


Re: New distfile mirror layout

Kent Fredric-2
In reply to this post by Joshua Kinard-2
On Wed, 23 Oct 2019 01:16:51 -0400
Joshua Kinard <[hidden email]> wrote:

> And for Perl or Python, I think we should be making an effort to leverage
> their respective mirroring systems first before putting their distfiles onto
> our mirrors.  Perl's got CPAN, and Python has pypi.  For things that don't
> exist on those systems, then we use our mirrors.

We still have to mirror them, because upstream has a tendency to nuke
things so that they can't be fetched any more from these primary
sources.

So whether end users fetch from the distfiles mirror for the first hit,
or as a fallback, the cost is still there.

The packages aren't broken, upstream hasn't stopped shipping it, just
some upstreams have a fetish for nuking everything but the
latest-and-greatest, and at a pace that is absolutely ridiculous and
that we can't be expected to keep up with, given all the stabilization
rigmarole.

Yes, backpan does exist, but it's neither perfect nor fast.

And the faster upstream nukes things, the more likely it is it won't
even be mirrored on backpan!

( I wish I was imagining this circumstance, but it's happened far too
often )

And we're not doing our users any service by burdening them with this
madness.
