Puppy Linux Discussion Forum
Puppy home page: puppylinux.com
 
The time now is Sun Apr 13, 2008 10:28 pm
All times are UTC + 10.5 (DST in action)
Design Of Distributed Repository Backend
Moderators: Flash, JohnMurga
HairyWill

Joined: 27 May 2006
Posts: 1723
Location: Southampton, UK

Posted: Sun Apr 06, 2008 9:47 pm    Post subject: Design Of Distributed Repository Backend
Subject description: request for comments

I've scoped out an initial design for an improved repository system here: http://puppylinux.org/wikka/DistributedPackageRepository
It should be straightforward to copy stuff from the old system to the new one.
Following the discussion at "Puppy Website: Package Lists", I thought it would be good to get the discussion going in a way that lets everyone who wants to comment write their ideas down in the same place.

The design is concerned with the backend of a distributed repository system. Whilst it needs to provide a few user API functions, it is not intended to offer a full end-user interface. That could be provided either by a website such as prit1's proof of concept or by client software similar to PSI.
Ideally the repository system could provide a search mechanism to help client systems work faster.

It would be great if those interested could edit the RFC on the wiki.

_________________
Will
tombh


Joined: 12 Jan 2007
Posts: 186
Location: Bristol, UK

Posted: Wed Apr 09, 2008 7:49 am

Hi there HairyWill,

I've been meaning to reply for a while, I've just needed to let it all sink in! You really have been thinking about this, which really helps to highlight the scope and the details of what is involved here.

I know I've expressed reluctance regarding such a project, but that's not to say that I wouldn't love to see this come about nor that I wouldn't be willing to enthusiastically offer whatever help I can. It's just that I don't think I could single-handedly take responsibility for it.

I've read your WIKI entry and have obviously been following the Package Lists thread and there really is a lot to consider -- I still don't feel I've thought about it enough but I just wanted to say a few things so that my silence wasn't misinterpreted. Firstly, I have no experience of this kind of thing, other than general server administration and PHP programming, so I may be ignorant of practiced standards in the area. My own take on the solution differs slightly from yours and so I thought I'd try and explain it here first before I edited the WIKI page.

Rather than each server in effect being capable of independence, my feeling is that there should only be one master server. This master would be the only point of access for uploading and would therefore also be the only point at which the package lists could be updated and maintained. The package lists would be stored in a single file (maybe in XML rather than CSV) simulating a database. Perhaps even a fully-fledged SQL database could be used and XML files parsed from it automatically whenever PSI (or whatever other client) requested it. Mirrors would then be exact replicas of the master filesystem and would reflect its contents through rsync or FTP.
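A minimal sketch of the "SQL database with XML generated on request" idea, using only Python's standard library. The table layout, the column names and the two packages are made up for illustration; nothing here is taken from the RFC on the wiki.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_package_list_xml(conn):
    """Return the package list as an XML string built from the database."""
    root = ET.Element("packages")
    rows = conn.execute(
        "SELECT name, version, category FROM packages ORDER BY name, version"
    )
    for name, version, category in rows:
        pkg = ET.SubElement(root, "package", name=name, version=version)
        ET.SubElement(pkg, "category").text = category
    return ET.tostring(root, encoding="unicode")


def demo_database():
    """Create an in-memory database holding two made-up packages."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE packages (name TEXT, version TEXT, category TEXT)")
    conn.executemany(
        "INSERT INTO packages VALUES (?, ?, ?)",
        [("examplepad", "1.0", "editors"), ("examplechat", "0.3", "network")],
    )
    return conn


if __name__ == "__main__":
    print(build_package_list_xml(demo_database()))
```

On the master, the XML would be regenerated whenever the database changes (or on each client request), and mirrors would only ever copy the result.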

As far as I can see, the only shortcoming of this approach compared with your current suggestion is that mirror-admins would have no choice over the packages they served (though there are ways around this). However, there are a number of benefits, mainly to do with managing the interchange and communication between numerous servers. The XML file/database would essentially be the workhorse of the whole system -- it would be used to manage all the meta-data, categories, deletion requests and user browsing, and so all the files could potentially exist in a single directory, without any detriment to the end-user's browsing experience. It would also only exist on, and be accessible from, the master server -- daily backups would of course be taken! I cannot immediately see any need for cron jobs other than for the slave servers' regular execution of rsync or FTP. If the system is properly implemented, the XML file/database would always exactly reflect the contents of the filesystem, and so there should never be a need to have one update the other.
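The invariant described here (the index always matches the files on disk) can be written down as a small check. This is only a sketch under the assumption that every package is a single file in one directory; the function name and the `.pet` filenames in the usage note are hypothetical.

```python
from pathlib import Path


def index_mismatches(indexed_names, package_dir):
    """Compare the index with the directory contents.

    Returns (missing, unlisted): files the index names but the directory
    lacks, and files in the directory that the index does not mention.
    """
    on_disk = {p.name for p in Path(package_dir).iterdir() if p.is_file()}
    indexed = set(indexed_names)
    return sorted(indexed - on_disk), sorted(on_disk - indexed)
```

If uploads and deletions only ever go through the master, both lists should always come back empty; calling `index_mismatches(["a-1.0.pet"], "/path/to/packages")` after a backup restore would be one way to confirm that.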

As for the organisation of the metadata I'm in complete agreement. I'm not entirely sure how digital signing works, but I guess that the private key can be automatically applied at upload time if the author has already signed in to the website using their unique cookie session password? That would save an extra step during compilation of the package. So there would be minimal fields to fill in and as much meta-data would be automatically generated as possible.
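As a rough illustration of generating meta-data automatically at upload time, the sketch below derives the filename, size and a SHA-256 checksum from the uploaded file. The field names are assumptions, and a checksum is not a digital signature: real signing would need a key pair and is not shown here.

```python
import hashlib
from pathlib import Path


def generate_metadata(package_path):
    """Derive filename, size and SHA-256 checksum from an uploaded file."""
    path = Path(package_path)
    data = path.read_bytes()
    return {
        "filename": path.name,
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }
```

The uploader would then only need to supply the fields that cannot be computed, such as a description and category.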

Does that make any sense? Had you already considered this approach but found too many caveats that I haven't noticed yet?

Tally-Ho :)
Pizzasgood


Joined: 05 May 2005
Posts: 3878
Location: Floor Six, U.S.A.

Posted: Thu Apr 10, 2008 5:03 pm

I like the idea of being able to have only a portion of the packages on a server. That way people who own smaller (and cheaper) servers could still host parts of the repository. As the number of packages increases, this would be a bigger issue.

Another benefit of partial mirrors is that if a package is illegal in some countries but not others, the mirrors can take appropriate measures. Otherwise, a single package that was illegal in the US would make all US-based servers illegal. That would put us in the situation of either dropping US mirrors or not hosting that package (thus hurting the rest of the globe).
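Partial mirrors could be modelled roughly like this: each mirror publishes the set of packages it carries, and a client asks which mirrors have the one it wants. All names here are hypothetical, and how a mirror would actually publish its list is left open.

```python
def partial_manifest(master_list, excluded):
    """Package set for a mirror that omits some packages from the master list."""
    return set(master_list) - set(excluded)


def mirrors_for(package, mirrors):
    """Return the names of mirrors that carry the given package."""
    return [name for name, carried in mirrors.items() if package in carried]


if __name__ == "__main__":
    master = ["pkg-a", "pkg-b", "pkg-c"]
    mirrors = {
        "full-mirror": set(master),
        # A small or legally restricted mirror simply leaves pkg-c out.
        "small-mirror": partial_manifest(master, ["pkg-c"]),
    }
    print(mirrors_for("pkg-c", mirrors))
```

A mirror with little disk space, or one in a country where a package is a problem, would only differ in its exclusion list.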


Yes, I think deletions should be approved first. However, maybe allow the package's maintainer to flag it as unstable immediately, in case he finds a bug. Also have a way for the admin-types to override that in case a package maintainer goes loco.

And the client should bring up a big red warning before downloading anything flagged as unstable (complete with a "yes for all" button for the hardcore testers who feel inclined to download fifteen packages of questionable stability in one fell swoop).

Maybe give it "magnitudes" of instability: "works", "mostly works", "almost works", "unusable", and "will implode your computer after killing your dog and dyeing your clothes pink".
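One way the warning with a "yes for all" option might look on the client side, as a sketch only. The `Stability` scale uses the first four magnitudes above (the fifth is left out), and the `ask` callback stands in for whatever dialog a real client would show; both are assumptions.

```python
from enum import IntEnum


class Stability(IntEnum):
    """Hypothetical instability scale; lower is better."""
    WORKS = 0
    MOSTLY_WORKS = 1
    ALMOST_WORKS = 2
    UNUSABLE = 3


def confirm_downloads(packages, ask):
    """Return the package names the user agreed to download.

    `packages` is a list of (name, Stability) pairs. `ask(name, level)` must
    return "yes", "no" or "all"; "all" accepts every remaining flagged package.
    """
    approved = []
    yes_for_all = False
    for name, level in packages:
        if level == Stability.WORKS or yes_for_all:
            approved.append(name)
            continue
        answer = ask(name, level)
        if answer == "all":
            yes_for_all = True
        if answer in ("yes", "all"):
            approved.append(name)
    return approved
```

Packages marked as working never trigger the prompt, and once the tester answers "all" the remaining flagged packages go through without further questions.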


Edited: clarified what I meant about small servers

_________________
"I have a tendency to wear my mind on my sleeve / I have a history of losing my shirt"--Barenaked Ladies
Pizzapup 3.0.1


Last edited by Pizzasgood on Fri Apr 11, 2008 1:18 pm; edited 1 time in total
HairyWill

Joined: 27 May 2006
Posts: 1723
Location: Southampton, UK

Posted: Thu Apr 10, 2008 8:19 pm

It is good to be talking about this.
I'm off camping for a few days.

_________________
Will
richard.a


Joined: 15 Aug 2006
Posts: 438
Location: Adelaide, South Australia

Posted: Sat Apr 12, 2008 9:30 pm

tombh wrote:
Rather than each server in effect being capable of independence, my feeling is that there should only be one master server. This master would be the only point of access for uploading and would therefore also be the only point at which the package lists could be updated and maintained. The package lists would be stored in a single file (maybe in XML rather than CSV) simulating a database. Perhaps even a fully-fledged SQL database could be used and XML files parsed from it automatically whenever PSI (or whatever other client) requested it. Mirrors would then be exact replicas of the master filesystem and would reflect its contents through rsync or FTP.

Some readers would be aware of my association with what was called Lindows in 2002, and which changed its name to Linspire and obtained a 22 Million US$ cash injection in mid 2004 (from memory) courtesy Mr William of Gates fame.

If you care to read the forums (now at Freespire) you will see that many of us got burned a few weeks ago when they pulled the plug on their CNR repository in favour of a beta, non-working "new CNR" that was designed to not work with Linspire v 5 and 5.1 and Freespire v 1 and 1.1.

I believe it would be highly dangerous to restrict just one location to be a repository.

While dotpups and dotpets don't work the same way as CNR does/did, and while PC-BSD dotpbis don't either, it might be wise to ensure that server-side stuff NEVER gets used, for the reason that the Linspire community has almost entirely bailed out. I refer you to the introduction of Peter van der Linden's "Guide to Linux", for which he spent some six months in 2005 working with the Linspire insider team and other users.

Words like "legacy" and "necessary hardware upgrades" send chills down my spine as a result of what just happened, so please take my caution very, very, seriously, if you don't want to frighten users away who got burned by Mr. Gates first, and by Mr. Robertson second.

I'm very, very, serious.

Puppy is an excellent product. Like Linspire, there's heaps of hardware it doesn't work successfully upon. So let's keep private repositories working, eh?


I host all the downloads I've tried out on-line, because I know they work. I don't choose to be a mirror. But I need to be able to download either across my LAN or from someone else's location.

Richard in Adelaide
Capital of South Australia
one-time tester for a range of OS's.

_________________
Have you noticed editing is always needed for the inevitable typos that weren't there when you hit the "post" button?

