Puppy Linux Discussion Forum
Puppy home page: puppylinux.com
Joined: 27 May 2006
Location: Southampton, UK
Posted: Sun Apr 06, 2008 9:47 pm Post subject:
Design Of Distributed Repository Backend
Subject description: request for comments
I've scoped out an initial design for an improved repository system here:
http://puppylinux.org/wikka/DistributedPackageRepository
It should be straightforward to copy stuff from the old system to the new one.
Following the discussion at Puppy Website: Package Lists, I thought it
would be good to get the discussion going in a way that lets everyone
who wants to comment get their ideas written down in one place.
The design is concerned with the backend of a distributed repository
system. While it needs to provide a few user API functions, it is not
intended to offer a full end-user interface. That function could be
provided either by a website such as prit1's proof of concept or by client software similar to PSI.
Ideally the repository system could also provide a search mechanism to help client systems work faster.
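As a concrete illustration of what such a server-side search might look like, here is a rough sketch over a CSV-style package list. The column names and sample packages below are my own invention for illustration, not part of the actual design:

```python
import csv
import io

# Hypothetical package list in the old CSV style; the columns are
# invented for illustration and are not the real repository schema.
PACKAGE_LIST = """\
name,version,category,description
abiword,2.4.6,office,Lightweight word processor
gnumeric,1.6.3,office,Spreadsheet application
geany,0.13,devel,Small IDE and text editor
"""

def search_packages(query):
    """Return rows whose name or description contains the query string."""
    reader = csv.DictReader(io.StringIO(PACKAGE_LIST))
    q = query.lower()
    return [row for row in reader
            if q in row["name"].lower() or q in row["description"].lower()]

hits = search_packages("editor")
print([row["name"] for row in hits])
```

A real backend would of course index the list rather than scan it on every request, but the API surface a client sees could be this small.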
It would be great if those interested could edit the RFC on the wiki.
Joined: 12 Jan 2007
Location: Bristol, UK
Posted: Wed Apr 09, 2008 7:49 am Post subject:
Hi there HairyWill,
I've been meaning to reply for a while, I've just needed to let it all
sink in! You really have been thinking about this, which really helps
to highlight the scope and the details of what is involved here.
I know I've expressed reluctance regarding such a project, but
that's not to say that I wouldn't love to see this come about nor that
I wouldn't be willing to enthusiastically offer whatever help I can.
It's just that I don't think I could single-handedly take
responsibility for it.
I've read your WIKI entry and have obviously been following the
Package Lists thread and there really is a lot to consider -- I still
don't feel I've thought about it enough but I just wanted to say a few
things so that my silence wasn't misinterpreted. Firstly, I have no
experience of this kind of thing, other than general server
administration and PHP programming, so I may be ignorant of practiced
standards in the area. My own take on the solution differs slightly
from yours and so I thought I'd try and explain it here first before I
edited the WIKI page.
Rather than each server in effect being capable of independence, my
feeling is that there should only be one master server. This master
would be the only point of access for uploading and would therefore
also be the only point at which the package lists could be updated and
maintained. The package lists would be stored in a single file (maybe
in XML rather than CSV) simulating a database. Perhaps even a
fully-fledged SQL database could be used and XML files parsed from it
automatically whenever PSI (or whatever other client) requested it.
Mirrors would then be exact replicas of the master filesystem and would reflect its contents through rsync or FTP.
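To make the "SQL database with XML parsed from it automatically" idea concrete, here is a minimal sketch. The table layout, field names, and sample data are all invented for illustration; the real schema would be whatever the design settles on:

```python
import sqlite3
import xml.etree.ElementTree as ET

# Sketch of the master-server idea: a fully-fledged SQL database holds
# the package list, and an XML file is generated from it on request.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE packages (name TEXT, version TEXT, md5 TEXT)")
db.executemany("INSERT INTO packages VALUES (?, ?, ?)", [
    ("abiword", "2.4.6", "d41d8cd98f00b204e9800998ecf8427e"),
    ("geany", "0.13", "900150983cd24fb0d6963f7d28e17f72"),
])

def package_list_xml():
    """Serialise the packages table as the XML a client like PSI might fetch."""
    root = ET.Element("packages")
    for name, version, md5 in db.execute(
            "SELECT name, version, md5 FROM packages ORDER BY name"):
        pkg = ET.SubElement(root, "package", name=name, version=version)
        ET.SubElement(pkg, "md5").text = md5
    return ET.tostring(root, encoding="unicode")

print(package_list_xml())
```

The nice property is that the XML becomes a cheap, cacheable export of the database rather than a second source of truth.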
As far as I can see the only shortcoming that this approach has over
your current suggestion is that mirror-admins could have no choice over
the packages they served (though there are ways around this). However,
there are a number of benefits, namely to do with managing the
interchange and communication between numerous servers. The XML
file/database would essentially be the workhorse of the whole system --
it would be used to manage all the meta-data, categories, deletion
requests and user browsing, and so all the files could potentially
exist in a single directory, without any detriment to the end-user's
browsing experience. It would also only exist on, and be accessible
from, the master server -- daily backups would of course be taken! I
cannot immediately see any need for cron jobs other than for the slave
servers regular execution of rsync or FTP. If the system is properly
implemented the XML file/database would always exactly reflect the
contents of the filesystem and so there should never be a need to have
one update the other.
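The "XML file/database always exactly reflects the filesystem" invariant could at least be audited cheaply. A sketch of such a check, with a hypothetical XML layout of my own invention:

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical XML package list; the layout is invented for illustration.
LIST_XML = """<packages>
  <package file="abiword-2.4.6.pet"/>
  <package file="geany-0.13.pet"/>
</packages>"""

def check_consistency(xml_text, directory):
    """Compare listed files against files actually present.

    Returns (missing, unlisted) so drift in either direction is visible.
    """
    listed = {p.get("file") for p in ET.fromstring(xml_text)}
    present = set(os.listdir(directory))
    return sorted(listed - present), sorted(present - listed)

# Simulate a repository directory that has drifted from the list.
with tempfile.TemporaryDirectory() as d:
    for fname in ("abiword-2.4.6.pet", "stray-0.1.pet"):
        open(os.path.join(d, fname), "w").close()
    missing, unlisted = check_consistency(LIST_XML, d)
    print(missing, unlisted)
```

Even if the system is implemented so the two can never diverge, a nightly audit like this would catch bugs before users do.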
As for the organisation of the metadata I'm in complete agreement.
I'm not entirely sure how digital signing works, but I guess that the
private key can be automatically applied at upload time if the author
has already signed in to the website using their unique cookie session
password? That would save an extra step during compilation of the
package. So there would be minimal fields to fill in and as much
meta-data would be automatically generated as possible.
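For what it's worth, a rough sketch of the sign-at-upload / verify-at-download flow. A real system would use public-key signatures (e.g. GPG) so the server never holds the author's private key; the HMAC below is only a stdlib stand-in to show the shape of the flow, and the key handling is entirely invented:

```python
import hashlib
import hmac

# Stand-in for the author's private key material, which in this scheme
# the server could apply automatically once the author's website session
# is authenticated.  A real system would use GPG/RSA, not a shared secret.
AUTHOR_KEY = b"author-private-key-material"

def sign_package(package_bytes):
    """Sign a package at upload time; returns a hex signature string."""
    return hmac.new(AUTHOR_KEY, package_bytes, hashlib.sha256).hexdigest()

def verify_package(package_bytes, signature):
    """Client-side check that the download matches what was signed."""
    return hmac.compare_digest(sign_package(package_bytes), signature)

pkg = b"pretend these are the bytes of a .pet package"
sig = sign_package(pkg)
print(verify_package(pkg, sig))         # unmodified package -> True
print(verify_package(pkg + b"x", sig))  # tampered package -> False
```

Either way, the point stands that signing can happen server-side at upload, saving the author a manual step during package compilation.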
Does that make any sense? Had you already considered this approach but found too many caveats that I haven't noticed yet?
Joined: 05 May 2005
Location: Floor Six, U.S.A.
Posted: Thu Apr 10, 2008 5:03 pm Post subject:
I like the idea of being able to have only a
portion of the packages on a server. That way people who own smaller
(and cheaper) servers could still host parts of the repository. As the
number of packages increases, this would be a bigger issue.
Another benefit of partial mirrors is that if a package is illegal
in some countries but not others, the mirrors can take appropriate
measures. Otherwise, a single package that was illegal in the US would
make all US-based servers illegal. That would put us in the situation
of either dropping US mirrors or not hosting that package (thus hurting
the rest of the globe).
Yes, I think deletions should be approved first. However, maybe
allow the package's maintainer to flag it as unstable immediately, in
case he finds a bug. Also have a way for the admin-types to override
that in case a package maintainer goes loco.
And the client should bring up a big red warning before downloading
anything flagged as unstable (complete with a "yes for all" button for
the hardcore testers who feel inclined to download fifteen packages of
questionable stability in one fell swoop).
Maybe give it "magnitudes" of instability: "works", "mostly works",
"almost works", "unusable", and "will implode your computer after
killing your dog and dyeing your clothes pink".
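Those magnitudes could map onto something as simple as an ordered flag, with the client warning for anything below a threshold. A sketch (names, ordering, and the "yes for all" switch are just for illustration):

```python
from enum import IntEnum

class Stability(IntEnum):
    """Ordered stability magnitudes, worst first."""
    IMPLODES = 0      # "will implode your computer..."
    UNUSABLE = 1
    ALMOST_WORKS = 2
    MOSTLY_WORKS = 3
    WORKS = 4

def needs_big_red_warning(stability, yes_for_all=False):
    """Warn before download unless the hardcore tester opted into
    'yes for all' for the whole session."""
    return stability < Stability.WORKS and not yes_for_all

print(needs_big_red_warning(Stability.MOSTLY_WORKS))                # True
print(needs_big_red_warning(Stability.WORKS))                       # False
print(needs_big_red_warning(Stability.UNUSABLE, yes_for_all=True))  # False
```

An ordered value also gives admins a natural override: bumping a package down a level is just a field update, whatever the maintainer set.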
Edited: clarified what I meant about small servers
"I have a tendency to wear my mind on my sleeve / I have a history of losing my shirt"--Barenaked Ladies
Last edited by Pizzasgood on Fri Apr 11, 2008 1:18 pm; edited 1 time in total
Joined: 27 May 2006
Location: Southampton, UK
Posted: Thu Apr 10, 2008 8:19 pm Post subject:
It is good to be talking about this.
I'm off camping for a few days.
Joined: 15 Aug 2006
Location: Adelaide, South Australia
Posted: Yesterday, at 9:30 pm Post subject:
tombh wrote:
Rather than each server in effect being capable of independence, my feeling is
that there should only be one master server. This master would be the
only point of access for uploading and would therefore also be the only
point at which the package lists could be updated and maintained. The
package lists would be stored in a single file (maybe in XML rather
than CSV) simulating a database. Perhaps even a fully-fledged SQL
database could be used and XML files parsed from it automatically
whenever PSI (or whatever other client) requested it. Mirrors would
then be exact replicas of the master filesystem and would reflect its contents through rsync or FTP.
Some readers would be aware of my association with what was called
Lindows in 2002, and which changed its name to Linspire and obtained a
22 Million US$ cash injection in mid 2004 (from memory) courtesy Mr
William of Gates fame.
If you care to read the forums (now at Freespire) you will see that
many of us got burned a few weeks ago when they pulled the plug on
their CNR repository in favour of a beta non-working "new CNR" that was
designed to not work with Linspire v 5 and 5.1 and Freespire v 1 and
I believe it would be highly dangerous to restrict the repository to just one location.
While dotpups and dotpets don't work the same way as CNR does/did,
and while PC-BSD dotpbis don't either, it might be wise to ensure that
server-side stuff NEVER becomes something users depend on, given that
the Linspire community has almost entirely bailed out for that very
reason. I refer you to the
introduction of Peter van der Linden's "Guide to Linux" in which in
2005 he spent some six months working with the Linspire insider team
and other users.
Words like "legacy" and "necessary
hardware upgrades" send chills down my spine as a result of what just
happened, so please take my caution very, very, seriously, if you don't
want to frighten users away who got burned by Mr. Gates first, and by
Mr. Robertson second.
I'm very, very, serious.
Puppy is an excellent product. Like Linspire, there's heaps of
hardware it doesn't work successfully upon. So let's keep private
repositories working, eh?
I host all the downloads I've tried out on-line, because I know they
work. I don't choose to be a mirror. But I need to be able to download
either across my LAN or from someone else's location.
Richard in Adelaide
Capital of South Australia
one-time tester for a range of OS's.
Have you noticed editing is always needed for the inevitable typos that weren't there when you hit the "post" button?