Viability of other SCM/version control systems for big repo's

Viability of other SCM/version control systems for big repo's

Donnie Berkholz
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,

I know some of you have done research on how gentoo-x86 converts over to
other systems besides CVS such as SVN, arch, etc. But I can't find the
info anywhere in my archives.

Could whoever's got it, post it?

I'm particularly interested in hearing about CVS, SVN, mercurial,
bazaar, darcs.

Thanks,
Donnie
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFDpw2YXVaO67S1rtsRAo3aAJ99o9SxpAsgGow3zSGcHu5hXZ13rwCgsXKl
DD25pAKELMogICmdH5dSvhY=
=bWsH
-----END PGP SIGNATURE-----
--
[hidden email] mailing list

Re: Viability of other SCM/version control systems for big repo's

Ciaran McCreesh
On Mon, 19 Dec 2005 11:44:24 -0800 Donnie Berkholz
<[hidden email]> wrote:
| I know some of you have done research on how gentoo-x86 converts over
| to other systems besides CVS such as SVN, arch, etc. But I can't find
| the info anywhere in my archives.
|
| Could whoever's got it, post it?

The SVN stuff is over a year out of date, and SVN has supposedly gotten
a lot better wrt scalability since then... I suspect someone will have
to redo the tests...

As for Arch, I managed to find three different "FATAL ERROR!" bugs in
tla within the first five minutes of using it. Two of them were
reported and known, with no fix forthcoming. Plus, we don't use a
distributed development model so Arch doesn't really suit us...

--
Ciaran McCreesh : Gentoo Developer (I can kill you with my brain)
Mail            : ciaranm at gentoo.org
Web             : http://dev.gentoo.org/~ciaranm


Re: Viability of other SCM/version control systems for big repo's

Mike Frysinger
On Mon, Dec 19, 2005 at 08:04:19PM +0000, Ciaran McCreesh wrote:

> On Mon, 19 Dec 2005 11:44:24 -0800 Donnie Berkholz
> <[hidden email]> wrote:
> | I know some of you have done research on how gentoo-x86 converts over
> | to other systems besides CVS such as SVN, arch, etc. But I can't find
> | the info anywhere in my archives.
>
> As for Arch, I managed to find three different "FATAL ERROR!" bugs in
> tla within the first five minutes of using it. Two of them were
> reported and known, with no fix forthcoming. Plus, we don't use a
> distributed development model so Arch doesn't really suit us...

along those same lines, i've used monotone with a project or two and
found it to be highly unstable and very incompatible across minor
releases
-mike

Re: Viability of other SCM/version control systems for big repo's

Patrick Lauer
In reply to this post by Donnie Berkholz
On Mon, 2005-12-19 at 11:44 -0800, Donnie Berkholz wrote:

> Hi all,
>
> I know some of you have done research on how gentoo-x86 converts over to
> other systems besides CVS such as SVN, arch, etc. But I can't find the
> info anywhere in my archives.
>
> Could whoever's got it, post it?
>
> I'm particularly interested in hearing about CVS, SVN, mercurial,
> bazaar, darcs.
I've only tried svn with the cvs2svn script.
Importing with history took ~8h on a 500 MHz box (which surprised me
because I had heard "it takes days"). Doing checkouts caused about the
same load as cvs, but I have no data points on multi-user behaviour.

http://www.keltia.net/EuroBSDCon/slides.pdf has some performance data on
mercurial for FreeBSD, roughly the same size as the Gentoo cvs
repositories.

Patrick
--
Stand still, and let the rest of the universe move

Re: Viability of other SCM/version control systems for big repo's

Ciaran McCreesh
On Mon, 19 Dec 2005 22:17:56 +0100 Patrick Lauer <[hidden email]>
wrote:
| I've only tried svn with the cvs2svn script.
| Importing with history took ~8h on a 500 MHz box (which surprised me
| because I had heard "it takes days"). Doing checkouts caused about the
| same load as cvs, but I have no data points on multi-user behaviour.

The interesting part isn't really how long it takes to convert things
or how long it takes to do a checkout, since they're in effect one time
costs. I'm guessing we have at least a hundred full tree updates and a
thousand commits for every full checkout...
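
One way to read this estimate: once per-operation costs have been measured, they can be weighted by how often each operation actually happens. A minimal sketch, assuming the 1 : 100 : 1000 ratio guessed above; the timings in `measured` are placeholders invented for illustration, not benchmark results.

```python
# Relative frequency of operations per full checkout, taken from the guess
# above (1 checkout : 100 full tree updates : 1000 commits).
FREQUENCY = {"checkout": 1, "update": 100, "commit": 1000}


def weighted_cost(seconds_per_op):
    """Return total seconds per 'checkout cycle' and each operation's share."""
    totals = {op: FREQUENCY[op] * seconds_per_op[op] for op in FREQUENCY}
    grand_total = sum(totals.values())
    shares = {op: totals[op] / grand_total for op in totals}
    return grand_total, shares


# Placeholder timings in seconds (hypothetical); replace with real measurements.
measured = {"checkout": 3600.0, "update": 60.0, "commit": 5.0}
total, shares = weighted_cost(measured)
for op, share in shares.items():
    print(f"{op:8s} {share:6.1%} of {total:.0f} s")
```

With placeholder numbers like these, even an hour-long checkout ends up as the smallest share of the total, which is the point being made about one-time costs.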

--
Ciaran McCreesh : Gentoo Developer (I can kill you with my brain)
Mail            : ciaranm at gentoo.org
Web             : http://dev.gentoo.org/~ciaranm


Re: Viability of other SCM/version control systems for big repo's

Fernando J. Pereda-2
In reply to this post by Patrick Lauer
On Mon, Dec 19, 2005 at 10:17:56PM +0100, Patrick Lauer wrote:
| http://www.keltia.net/EuroBSDCon/slides.pdf has some performance data on
| mercurial for FreeBSD, roughly the same size as the Gentoo cvs
| repositories.

It's not the size of the repo that matters... it is the workflow. I
don't know how they work... but I definitely don't think ours suits a
distributed SCM, as Ciaran pointed out.

Cheers,
Ferdy

--
Fernando J. Pereda Garcimartín
Gentoo Developer (Alpha,net-mail,mutt,git)
20BB BDC3 761A 4781 E6ED  ED0B 0A48 5B0C 60BD 28D4

Re: Viability of other SCM/version control systems for big repo's

Patrick Lauer
In reply to this post by Ciaran McCreesh
On Mon, 2005-12-19 at 21:23 +0000, Ciaran McCreesh wrote:

> On Mon, 19 Dec 2005 22:17:56 +0100 Patrick Lauer <[hidden email]>
> wrote:
> | I've only tried svn with the cvs2svn script.
> | Importing with history took ~8h on a 500 MHz box (which surprised me
> | because I had heard "it takes days"). Doing checkouts caused about the
> | same load as cvs, but I have no data points on multi-user behaviour.
>
> The interesting part isn't really how long it takes to convert things
> or how long it takes to do a checkout, since they're in effect one time
> costs.
Yes, but generating a "realistic" workload isn't trivial. If we had cvs
logs to replay we might get some good data.
>  I'm guessing we have at least a hundred full tree updates and a
> thousand commits for every full checkout...
Provide us with a script to generate partial updates/commits and I think
many people will just run it for fun ...

Maybe the nice Infra dudes could provide cvs snapshots for testing?
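
A script of that kind could start from something as small as the following. This is only a sketch under stated assumptions: the commit/update weights reuse the rough 1000 : 100 guess from earlier in the thread, the package names are illustrative, and nothing here talks to a real repository; it only produces a reproducible list of operations that a driver could then run against whichever system is being tested.

```python
import random


def generate_workload(packages, n_ops, seed=0, commit_weight=1000, update_weight=100):
    """Yield (operation, target) pairs for a synthetic partial-tree workload.

    `packages` is a list of category/package directories. A commit targets
    one package, a partial update targets one category. The weights are
    assumptions, not measured ratios.
    """
    rng = random.Random(seed)  # seeded so a run can be reproduced exactly
    ops = ["commit", "update"]
    weights = [commit_weight, update_weight]
    for _ in range(n_ops):
        op = rng.choices(ops, weights=weights, k=1)[0]
        pkg = rng.choice(packages)
        target = pkg if op == "commit" else pkg.split("/")[0]
        yield op, target


# Illustrative package list; a real run would walk the tree instead.
packages = ["app-editors/vim", "net-mail/mutt", "sys-apps/portage"]
for op, target in generate_workload(packages, n_ops=5):
    print(op, target)
```

Replaying real cvs logs would give a more faithful workload; the seeded generator is only a stand-in until such logs are available.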


Patrick
--
Stand still, and let the rest of the universe move

Re: Viability of other SCM/version control systems for big repo's

Peter Johanson-2
In reply to this post by Fernando J. Pereda-2
On Mon, Dec 19, 2005 at 10:38:10PM +0100, Fernando J. Pereda wrote:
>
> It's not the size of the repo that matters... it is the workflow. I
> don't know how they work... but I definitely don't think ours suits a
> distributed SCM, as Ciaran pointed out.

I'm not sure about that. Having portage in arch/bazaar would let 'gentoo
derivatives' easily pull selective changes from the 'mainline', would
potentially allow us to pull back from them, etc. It might make it very easy
for the stable project to branch portage and snipe individual patches
for updates, etc. The distributed aspect of it doesn't
necessarily have to immediately impact core gentoo devs committing ebuilds.

Or maybe not, I dunno. The point being I don't think we should immediately write off
any of the distributed SCMs without pondering how they might make a difference or be usable.

-pete

--
Peter Johanson
<[hidden email]>

Re: Viability of other SCM/version control systems for big repo's

Chris Bainbridge
On 19/12/05, Peter Johanson <[hidden email]> wrote:
>
> Or maybe not, I dunno. The point being I don't think we should immediately write off
> any of the distributed SCMs without pondering how they might make a difference or be usable.

It would be very useful for people who aren't devs, but only if they
have access to the repository. It would also be useful for devs to
have a standard way of publishing their testing/development portage
overlays. On the first point, would any of the alternative SCMs prove
to be better than CVS resource-wise for providing anonymous access to
users? It might also be useful to facilitate non-devs contributing
patches to the tree - rather than posting files into bugzilla they
could point towards wherever they publish their current tree (or
changes), and developers can then work with their changes directly
instead of the bugzilla upload/download dance we do now.

Re: Viability of other SCM/version control systems for big repo's

Kalin KOZHUHAROV
Chris Bainbridge wrote:

> On 19/12/05, Peter Johanson <[hidden email]> wrote:
>
>>Or maybe not, I dunno. The point being I don't think we should immediately write off
>>any of the distributed SCMs without pondering how they might make a difference or be usable.
>
>
> It would be very useful for people who aren't devs, but only if they
> have access to the repository. It would also be useful for devs to
> have a standard way of publishing their testing/development portage
> overlays. On the first point, would any of the alternative SCMs prove
> to be better than CVS resource-wise for providing anonymous access to
> users? It might also be useful to facilitate non-devs contributing
> patches to the tree - rather than posting files into bugzilla they
> could point towards wherever they publish their current tree (or
> changes), and developers can then work with their changes directly
> instead of the bugzilla upload/download dance we do now.

I have been using Subversion for a year now for work, personal data, system administration (~/, /etc/
on most machines) and Gentoo development (my overlay).
I migrated from CVS, which I had used only for some code repositories.
It felt like trading a Trabant for a Subaru (substitute your fav. rally car)!
Because of the ease of use and flexibility of access (ssh, https) I started using it everywhere (see the
good article "My life in subversion").

As far as speed is concerned, it is comparable with CVS.
Storage-space-wise, a working copy takes about twice the space because a pristine copy of every file is held
locally (this allows diffs, reverts, etc. to be done from the local copy, so the server is not
contacted).
Branching/merging is logical, and svn:externals is very useful for importing other repositories in place.
It currently lacks storage of owner:group and permissions, but that can be implemented as a wrapper.

Compared to CVS, it is a clear winner in my opinion. And the learning curve is steep.

Just my 2 yen.

Kalin.

--
|[ ~~~~~~~~~~~~~~~~~~~~~~ ]|
+-> http://ThinRope.net/ <-+
|[ ______________________ ]|

Re: Viability of other SCM/version control systems for big repo's

Ciaran McCreesh
On Tue, 20 Dec 2005 09:17:56 +0900 Kalin KOZHUHAROV
<[hidden email]> wrote:
 | As far as speed is concerned, it is comparable with CVS.

Be more specific please. We're looking for benchmarks showing how well
it performs in terms of speed, bandwidth and memory usage for actions
such as commit and update on a repository with 100k+ small files.

--
Ciaran McCreesh : Gentoo Developer (I can kill you with my brain)
Mail            : ciaranm at gentoo.org
Web             : http://dev.gentoo.org/~ciaranm


Re: Viability of other SCM/version control systems for big repo's

Chandler Carruth
Ciaran McCreesh wrote:

>On Tue, 20 Dec 2005 09:17:56 +0900 Kalin KOZHUHAROV
><[hidden email]> wrote:
> | As far as speed is concerned, it is comparable with CVS.
>
>Be more specific please. We're looking for benchmarks showing how well
>it performs in terms of speed, bandwidth and memory usage for actions
>such as commit and update on a repository with 100k+ small files.
>
>  
>
I have hardware on which I would be more than willing to perform this
type of benchmark. Can you provide/point to a repository of files to
benchmark, and a set of operations to perform? The obvious choice is the
portage tree itself, with some/all of its history (however much is
necessary for the benchmarks to be meaningful), but that would still require
a set of activities to generate a relevant benchmark.

For reference, I have a server that is not yet in production, but
readying for production in the next few months, running Gentoo, on a
raid-5 array of SCSI hard drives. I don't remember the precise
specifications off hand, but I could provide them along with the results.

Would this be useful? Would more/other hardware be necessary or useful? (I
have access to multiple workstations on which I could run simultaneous
tests, causing transactions to become relevant and important, etc etc,
and further hardware might be available here.) Hope this can be of some
use to you in trying to make this evaluation.

-Chandler Carruth

Re: Viability of other SCM/version control systems for big repo's

Donnie Berkholz
In reply to this post by Donnie Berkholz
Donnie Berkholz wrote:
> I know some of you have done research on how gentoo-x86 converts over to
> other systems besides CVS such as SVN, arch, etc. But I can't find the
> info anywhere in my archives.
>
> Could whoever's got it, post it?
>
> I'm particularly interested in hearing about CVS, SVN, mercurial,
> bazaar, darcs.

I've downloaded a copy of the gentoo-x86 repo and will run tests myself.
Please advise me as to exactly which tests you would like to see, beyond
whatever I feel like doing.

Thanks,
Donnie

Re: Viability of other SCM/version control systems for big repo's

Bret Towe
On 12/21/05, Donnie Berkholz <[hidden email]> wrote:

> Donnie Berkholz wrote:
> > I know some of you have done research on how gentoo-x86 converts over to
> > other systems besides CVS such as SVN, arch, etc. But I can't find the
> > info anywhere in my archives.
> >
> > Could whoever's got it, post it?
> >
> > I'm particularly interested in hearing about CVS, SVN, mercurial,
> > bazaar, darcs.
>
> I've downloaded a copy of the gentoo-x86 repo and will run tests myself.
> Please advise me as to exactly which tests you would like to see, beyond
> whatever I feel like doing.

might i also suggest testing out git along with the above listed?
since i dont know git well enough or what exactly the requirements of
a gentoo dev are, ill just point to some documents
tutorial can be found here:
http://www.kernel.org/pub/software/scm/git/docs/tutorial.html
and documentation here:
http://www.kernel.org/pub/software/scm/git/docs/

Re: Viability of other SCM/version control systems for big repo's

Ryan Phillips-5
* Bret Towe <[hidden email]> [2005-12-21 23:16]:

> On 12/21/05, Donnie Berkholz <[hidden email]> wrote:
> > Donnie Berkholz wrote:
> > > I know some of you have done research on how gentoo-x86 converts over to
> > > other systems besides CVS such as SVN, arch, etc. But I can't find the
> > > info anywhere in my archives.
> > >
> > > Could whoever's got it, post it?
> > >
> > > I'm particularly interested in hearing about CVS, SVN, mercurial,
> > > bazaar, darcs.
> >
> > I've downloaded a copy of the gentoo-x86 repo and will run tests myself.
> > Please advise me as to exactly which tests you would like to see, beyond
> > whatever I feel like doing.
>
> might i also suggest testing out git along with the above listed?
> since i dont know git well enough or what exactly the requirements of
> a gentoo dev are, ill just point to some documents
> tutorial can be found here:
> http://www.kernel.org/pub/software/scm/git/docs/tutorial.html
> and documentation here:
> http://www.kernel.org/pub/software/scm/git/docs/
I know that some people have big reservations about distributed SCMs, but
why not switch to a distributed format, and cherry-pick from other
developers' (and users') repositories?

Git allows for pushing to a centralized server, so it still 'works' in
a similar sense of committing changes to CVS.

cg-branch-add gentoo-main git+ssh://user@someplace/
cg-push gentoo-main

-Ryan

Re: Viability of other SCM/version control systems for big repo's

Paul de Vrieze-2
In reply to this post by Chandler Carruth
On Tuesday 20 December 2005 04:07, Chandler Carruth wrote:

> Ciaran McCreesh wrote:
> >On Tue, 20 Dec 2005 09:17:56 +0900 Kalin KOZHUHAROV
> >
> ><[hidden email]> wrote:
> > | As far as speed is concerned, it is comparable with CVS.
> >
> >Be more specific please. We're looking for benchmarks showing how well
> >it performs in terms of speed, bandwidth and memory usage for actions
> >such as commit and update on a repository with 100k+ small files.
>
> I have hardware on which I would be more than willing to perform this
> type of benchmark. Can you provide/point to a repository of files to
> benchmark, and a set of operations to perform? The obvious being the
> portage tree itself, with some/all of its history (however much is
> necessary for the benchmarks to be meaningful), but would require a set
> of activities to generate a relevant benchmark.
>
> For reference, I have a server that is not yet in production, but
> readying for production in the next few months, running Gentoo, on a
> raid-5 array of SCSI hard drives. I don't remember the precise
> specifications off hand, but I could provide them along with the results.
>
> Would this be useful? Would more/other hardware be necessary or useful? (I
> have access to multiple workstations on which I could run simultaneous
> tests, causing transactions to become relevant and important, etc etc,
> and further hardware might be available here.) Hope this can be of some
> use to you in trying to make this evaluation.
In this respect we want to know things like:

- Checkout time of a full new tree (no load, and with load)

- Update time (without load, and with load)

- Concurrency performance (how do multiple simultaneous commits and updates
  perform)

- Is there a difference between checking out and committing parts of the tree
  instead of the full tree.
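
The measurements listed above could be collected with a small timing harness along these lines. It is a sketch only: `time_command` and `summarize` are hypothetical helper names, and the placeholder command just starts a Python interpreter so the example runs anywhere. The real checkout, update or commit command line of the system under test would be substituted for it.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, repeats=3, cwd=None):
    """Run `argv` `repeats` times and return wall-clock timings in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(argv, cwd=cwd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return (min, median, max) of a list of timings."""
    return min(timings), statistics.median(timings), max(timings)


# Placeholder command so the sketch runs anywhere; substitute the real
# checkout/update/commit command line of the system under test.
placeholder = [sys.executable, "-c", "pass"]
print(summarize(time_command(placeholder)))
```

The "with load" and concurrency cases would need several such runs started in parallel from different machines or working copies, which this sketch does not attempt. Comparing partial and full tree operations only requires changing `cwd` or the command arguments.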

Paul

--
Paul de Vrieze
Gentoo Developer
Mail: [hidden email]
Homepage: http://www.devrieze.net

Re: Viability of other SCM/version control systems for big repo's

Paul de Vrieze-2
In reply to this post by Bret Towe
On Thursday 22 December 2005 08:13, Bret Towe wrote:

> On 12/21/05, Donnie Berkholz <[hidden email]> wrote:
> > Donnie Berkholz wrote:
> > > I know some of you have done research on how gentoo-x86 converts over
> > > to other systems besides CVS such as SVN, arch, etc. But I can't find
> > > the info anywhere in my archives.
> > >
> > > Could whoever's got it, post it?
> > >
> > > I'm particularly interested in hearing about CVS, SVN, mercurial,
> > > bazaar, darcs.
> >
> > I've downloaded a copy of the gentoo-x86 repo and will run tests myself.
> > Please advise me as to exactly which tests you would like to see, beyond
> > whatever I feel like doing.
>
> might i also suggest testing out git along with the above listed?
> since i dont know git well enough or what exactly the requirements of
> a gentoo dev are, ill just point to some documents
> tutorial can be found here:
> http://www.kernel.org/pub/software/scm/git/docs/tutorial.html
> and documentation here:
> http://www.kernel.org/pub/software/scm/git/docs/
Also look at usability of the system. From my perspective, arch/tla is not
that easy to use. Cvs and subversion are better.

Paul

--
Paul de Vrieze
Gentoo Developer
Mail: [hidden email]
Homepage: http://www.devrieze.net

Re: Viability of other SCM/version control systems for big repo's

Spider (D.m.D. Lj.)
On Fri, 2005-12-23 at 20:34 +0100, Paul de Vrieze wrote:

> On Thursday 22 December 2005 08:13, Bret Towe wrote:
> > On 12/21/05, Donnie Berkholz <[hidden email]> wrote:
> > > Donnie Berkholz wrote:
> > > > I know some of you have done research on how gentoo-x86 converts over
> > > > to other systems besides CVS such as SVN, arch, etc. But I can't find
> > > > the info anywhere in my archives.
> > > >
> > > > Could whoever's got it, post it?
> > > >
> > > > I'm particularly interested in hearing about CVS, SVN, mercurial,
> > > > bazaar, darcs.
> > >
> > > I've downloaded a copy of the gentoo-x86 repo and will run tests myself.
> > > Please advise me as to exactly which tests you would like to see, beyond
> > > whatever I feel like doing.
> >
>
> Also look at usability of the system. From my perspective, arch/tla is not
> that easy to use. Cvs and subversion are better.
Add to this that tla is constantly misreporting and has a tendency to
mess up repositories.

For example, "I screwed up, rm file, checkout" doesn't work with
arch... You get a friendly "your repository is pristine"...

Right.


After screwing around with tla and tlx, their hideously annoying tag and
branch names (sheesh), their overabundance of {} and the braindeadness
of being unable to verify that my tree is really exactly the same as what any
other person is seeing, I cannot speak strongly enough against this.

Git seems useful, but a bit hard to track (I really dislike having to
fiddle around with long random character strings just to check out a
certain version. I can deal, but still...)

Mercurial, last I checked, was still rather fragile. Fast, decent, but
fragile. :(

svn I haven't tried, actually. Although in current terms, it seems to be
a good replacement given how we work and what we do with things.

//Spider

--
begin  .signature
Tortured users / Laughing in pain
See Microsoft KB Article Q265230 for more information.
end


Re: Viability of other SCM/version control systems for big repo's

Ciaran McCreesh
In reply to this post by Paul de Vrieze-2
On Fri, 23 Dec 2005 20:33:13 +0100 Paul de Vrieze <[hidden email]>
wrote:
| - Checkout time of a full new tree (no load, and with load)

Do we really care about this? SVN will do really really badly here, but
does it matter?

| - Concurrency performance (how do multiple simultaneous commits and
| updates perform)

With this one, you've got to bear in mind that SVN will correctly
handle transaction commits, whereas CVS will quite happily let you crap
all over half of someone else's transaction.
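
The difference can be shown with a toy model. This is an illustration only, not how either tool is implemented: the "repository" is a plain dict, the file names are made up, and the interleaving order is forced by hand to mimic two developers committing the same two files at the same moment.

```python
def commit_per_file(repo, changes_a, changes_b):
    """Interleave two multi-file commits one file at a time (no transaction)."""
    files = sorted(set(changes_a) | set(changes_b))
    for i, name in enumerate(files):
        # Alternate which writer lands last on each file to mimic a race.
        first, second = (changes_a, changes_b) if i % 2 == 0 else (changes_b, changes_a)
        for change in (first, second):
            if name in change:
                repo[name] = change[name]
    return repo


def commit_atomic(repo, changes_a, changes_b):
    """Apply each change set as one indivisible unit, one after the other."""
    repo.update(changes_a)
    repo.update(changes_b)
    return repo


# Two developers touching the same two files (hypothetical contents).
a = {"foo.ebuild": "A", "Manifest": "A"}
b = {"foo.ebuild": "B", "Manifest": "B"}
print(commit_per_file({}, a, b))  # mixed: one file from each writer
print(commit_atomic({}, a, b))    # consistent: both files from one writer
```

In the per-file case the tree ends up with an ebuild from one developer and a Manifest from the other, a state neither of them ever had locally.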

Performance comparisons are only one part of it...

--
Ciaran McCreesh : Gentoo Developer (I can kill you with my brain)
Mail            : ciaranm at gentoo.org
Web             : http://dev.gentoo.org/~ciaranm


Re: Viability of other SCM/version control systems for big repo's

Paul de Vrieze-2
On Friday 23 December 2005 22:36, Ciaran McCreesh wrote:
> On Fri, 23 Dec 2005 20:33:13 +0100 Paul de Vrieze <[hidden email]>
>
> wrote:
> | - Checkout time of a full new tree (no load, and with load)
>
> Do we really care about this? SVN will do really really badly here, but
> does it matter?

Depends on how long it takes. More than half an hour on a fast connection
would certainly be quite long. If it gets into 4 hours or more, it becomes a
real annoyance.

>
> | - Concurrency performance (how do multiple simultaneous commits and
> | updates perform)
>
> With this one, you've got to bear in mind that SVN will correctly
> handle transaction commits, whereas CVS will quite happily let you crap
> all over half of someone else's transaction.
>
> Performance comparisons are only one part of it...

I know, I should probably have mentioned it. But the proper concurrency
support comes at a price. To make a proper decision, we need to know how big
the price is. A theoretically perfect solution may very well be practically
impossible. At that point it shows that the theory overlooked certain issues
that users care about.

Paul

--
Paul de Vrieze
Gentoo Developer
Mail: [hidden email]
Homepage: http://www.devrieze.net
