Team Ninja Bulletin Board  


View Poll Results: Add RNA to the DC-Vault.com
Yes 10 83.33%
No 2 16.67%
Voters: 12

  #1  
Old 2nd March 2010, 06:29 PM
Saenger Saenger is offline
SETI.Germany
 
Join Date: Mar 2007
Location: Bremen, Germany
Posts: 16
RNA World @ Home

New project:
RNA World (beta):
Quote:
Originally Posted by RKN-Wiki
RNA World project description
RNA World is a distributed supercomputer that uses Internet-connected computers to advance RNA research. This system is dedicated to identifying, analyzing, structurally predicting and designing RNA molecules on the basis of established bioinformatics software in a high-performance, high-throughput fashion.

In contrast to classical bioinformatic approaches, RNA World does not rely on individual desktop computers, web servers or supercomputers. Instead, it represents a continuously evolving cluster of world-wide distributed machines of any type. As such, RNA World is very heterogeneous and, depending on the sub-project, currently addresses Internet-connected computers running Linux, Windows and OSX operating systems - your computer could be an important part of it. Because hardware and electricity costs are shared among the volunteer contributors, it becomes possible to perform interesting analyses that would often not be affordable on economic grounds. In return, RNA World is not for profit, exclusively uses open source code and will make its results available to the public.

In its present form, RNA World runs a fully automated high-throughput analysis version of Infernal, a program suite originally developed in Sean Eddy's laboratory for the systematic identification of non-coding RNAs. The goal of this RNA World sub-project is to systematically identify all known RNA family members in all organisms known to date and make the results available to the public in a timely fashion. With your help, we also aim at supplying established bioinformatic databases such as Rfam with our results to help reduce their future maintenance costs.

In contrast to other distributed and grid computing projects, the RNA World developers are currently designing generalized user interfaces that, in parallel to the projects our own research team is following up, allow non-associated individual scientists to submit their own projects in a manner similar to using a web server interface - of course, free of cost.
Stats are exported but not yet counted here (they are on the other stats sites).
__________________
Gruesse vom Saenger
  #2  
Old 4th March 2010, 01:32 AM
Rusty Rusty is offline
Owner
 
Join Date: Jan 2004
Location: Tasmania, Australia
Posts: 11,210
Looks good.. anyone running it?
__________________
RUSTY


Team Ninja Forever : Once a Ninja, always a Ninja - Team Ninja

"I'm a SAS NINJA"

Drafted to the SAS
  #3  
Old 4th March 2010, 03:06 AM
cswchan cswchan is offline
Crunchers Inc
 
Join Date: Mar 2006
Location: Mississauga, Ontario, Canada
Posts: 190
Yep... started running today... seems to play nicely with the rest of the Boinc crowd on my quad...
__________________
Crunchers Inc
  #4  
Old 4th March 2010, 04:35 AM
fractal fractal is offline
ARS Technica
 
Join Date: Sep 2007
Location: earth
Posts: 75
I ran it for a week on both Linux and Windows machines. Work units vary from small to huge, some taking 1.5-2.5 GB of RAM. Many finish in minutes, but some take 2 weeks with a 1 week deadline. Work lately has finished in seconds but requires minutes to upload and download. The web site is annoyingly slow but the admin is responsive.

I don't read German, but it uses the same forums as Yoyo and team Rechenkraft. Someone who does read German might want to figure out the relation between the three.
__________________
  #5  
Old 4th March 2010, 04:58 AM
Saenger Saenger is offline
SETI.Germany
 
Join Date: Mar 2007
Location: Bremen, Germany
Posts: 16
Quote:
Originally Posted by fractal View Post
I don't read German, but it uses the same forums as Yoyo and team Rechenkraft. Someone who does read German might want to figure out the relation between the three.
Rechenkraft is a) a team in DC and b) a German non-profit association for the propagation and promotion of DC.

I'm a member of b), but my team is SETI.Germany. Yoyo is a member of both, as is the scientific head of RNA World, Michael Weber. The association has existed since 2004.

Yoyo started yoyo@home to bring non-BOINC projects into the BOINC world, usually not the biggest ones, especially Evolution@Home, as some members of Rechenkraft knew its admin from the past. He refuses to incorporate Folding, as the user base of Folding is already big enough.

Michael Weber, or in this case better Dr. Michael Weber, always wanted to start a DC project in his field of science. He especially wanted to do bioinformatics with a real-world validation process for the results. As chairman of the association he "used" it as his stepping stone, and as a good place to find co-operators from his and other teams, mainly from Germany.

Once the project went public it was soon clear that the old yoyo@home server was not sufficient for both projects, so now they are still hosted together, but on a brand-new server. The forum and Wiki are on another server, so they stay online even if the project server is down.
  #6  
Old 4th March 2010, 07:39 AM
Xaverius Xaverius is offline
Dutch Power Runner! Uhm, Cow
DPC - crew member
 
Join Date: Aug 2008
Location: Arnhem, the Netherlands
Posts: 98
The website is also available in English: http://www.rechenkraft.net/wiki/inde...ft.net_e.V./en
I always had in mind that Rechenkraft was a mathematical (or something like that) institute in Germany; I guess I was a bit wrong.
  #7  
Old 20th March 2010, 06:07 AM
Rusty Rusty is offline
Owner
 
Join Date: Jan 2004
Location: Tasmania, Australia
Posts: 11,210
So what are we looking at here? Project in or out? Is it up or down?

What's the deal, banana peel?
  #8  
Old 20th March 2010, 03:44 PM
Ungelovende Ungelovende is offline
Team Norway
 
Join Date: Oct 2007
Location: Norway
Posts: 30
Quote:
Originally Posted by Rusty View Post
What's the deal, banana peel?
I need more RAM!
i7 running on 4 cores with 9GB RAM -> swapping to HD. It's my problem that I only have crappy hardware - thumbs up from me!
  #9  
Old 5th April 2010, 05:58 PM
fractal fractal is offline
ARS Technica
 
Join Date: Sep 2007
Location: earth
Posts: 75
The project admin is traveling and thus not overly responsive at the moment. The project was still issuing work units that require 3-4 GB of RAM each, with BOINC unable to limit a project to a single active unit. The result is as Ungelovende said: it brings your machines to their knees, and they swap until long after the work unit deadline, when you finally get around to aborting the units.

I do not believe this project will be ready until the admins can get a handle on how much memory each work unit needs.
  #10  
Old 26th September 2010, 07:45 PM
Rusty Rusty is offline
Owner
 
Join Date: Jan 2004
Location: Tasmania, Australia
Posts: 11,210
Bumpity-Bump
  #11  
Old 26th September 2010, 08:27 PM
Michael H.W. We Michael H.W. We is offline
Neophyte
 
Join Date: Aug 2007
Location: Germany
Posts: 10
Ok, so maybe a brief update. A lot of progress has been made since the last postings shown above which I just saw today.

RNA World has been active and open for public sign-up since around January/February this year. It has its own server now (which will be upgraded further soon), so the responsiveness issues described above have been resolved, as it is no longer shared with the Yoyo@home project (instead it is now a 12 GB Intel i7 920 machine with 2x 1.5 TB RAID and huge bandwidth).

The project has been presented at several conferences (BOINC Workshop@Barcelona 2009, RNA2010@Seattle, GDNÄ@Dresden, 2010) and in seminar talks (M.I.T/Cambridge, 2010).

The project is currently updating its binaries and will soon also offer OSX support in addition to the current Linux and Windows clients. It should be said that this project still has somewhat higher demands than other DC projects in terms of runtimes, RAM requirements and data transfer. Our FAQ gives details on all of these aspects. If something is missing, please drop by our forum and let us know, so that we can improve it further.

Because I have supported DC projects for around a decade now, I am very communicative in terms of project support and invest quite some of my time in roaming external forums to answer questions and help resolve problems (as you can see here, as well).

Michael.
  #12  
Old 27th September 2010, 03:49 PM
DigiK-oz DigiK-oz is offline
Dutch Power Cows
 
Join Date: Oct 2008
Location: netherlands
Posts: 107
Well, I appreciate the update. However, there are some points in the FAQ you mention that make me hesitate to join your project.

Memory usage up to 2.5 GB (or 1GB): this will bring most quad+ machines to their knees should all cores get such a unit.

No checkpointing: that's fine with small workunits, but the FAQ mentions "up to several days". This means a WU like that will simply never finish on a machine that's not online 24/7. Sure, sleeping it instead of shutting down will alleviate this, but crashes (of your app, BOINC or the OS) or occasional reboots (patches etc.) will potentially send 10 days of CPU time down the drain.

In my opinion, memory usage should be around 500MB max (or the project should limit the number of workunits a system can get at any one time), and checkpointing is preferred for ANY workunit, but mandatory for anything running longer than half an hour or so.
__________________
DC-Vault statistics
  #13  
Old 28th September 2010, 01:12 AM
Michael H.W. We Michael H.W. We is offline
Neophyte
 
Join Date: Aug 2007
Location: Germany
Posts: 10
Quote:
Originally Posted by DigiK-oz View Post
Memory usage up to 2.5 GB (or 1GB): this will bring most quad+ machines to their knees should all cores get such a unit.
Well, there is a memory management system on both the server and the client side to counteract that situation - although it is not absolutely perfect. More importantly, however, most of our WUs have memory demands far below 500 MB as stated in the FAQ. What you cited is the maximum.

Quote:
Originally Posted by DigiK-oz View Post
No checkpointing: that's fine with small workunits, but the FAQ mentions "up to several days". This means a WU like that will simply never finish on a machine that's not online 24/7. Sure, sleeping it instead of shutting down will alleviate this, but crashes (of your app, BOINC or the OS) or occasional reboots (patches etc.) will potentially send 10 days of CPU time down the drain.
Yes, that is why we have the FAQ, so that users can check in advance whether or not their machines meet the project's system requirements. By the way: For Linux-x32 there is checkpointing available provided memory randomization is disabled in your kernel settings.

Quote:
Originally Posted by DigiK-oz View Post
In my opinion, memory usage should be around 500MB max (or the project should limit the number of workunits a system can get at any one time), and checkpointing is preferred for ANY workunit, but mandatory for anything running longer than half an hour or so.
Well, there is no such thing as maximum hardware demands in computing. We have server-side management that checks your machine's hardware capabilities and determines which subset of our available WUs is suitable for your machine. In this way, we can properly assign which machine gets what and also get the demanding work done without your machine being overloaded.
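
As a rough illustration of the server-side matching described here, the following Python sketch filters a list of work units by a host's reported RAM. The `WorkUnit` and `Host` classes, their field names and the 80% headroom factor are assumptions made for this example; they are not taken from the RNA World or BOINC code.

```python
from dataclasses import dataclass


@dataclass
class WorkUnit:
    name: str
    ram_mb: int  # estimated peak memory of this work unit (hypothetical field)


@dataclass
class Host:
    ram_mb: int  # physical RAM reported by the client (hypothetical field)


def suitable_units(host, units, usable_fraction=0.8):
    """Return the work units whose estimated peak RAM fits the host.

    usable_fraction is an assumed headroom factor, not a BOINC setting.
    """
    budget = host.ram_mb * usable_fraction
    return [wu for wu in units if wu.ram_mb <= budget]


units = [WorkUnit("small", 200), WorkUnit("medium", 1000), WorkUnit("large", 2500)]
print([wu.name for wu in suitable_units(Host(ram_mb=2048), units)])
# prints ['small', 'medium']
```

A filter like this only knows what the host reports and what the work unit is estimated to need; it cannot see other programs running on the machine, which is the objection raised later in the thread.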

Michael.

P.S.: Maybe a brief word on checkpointing. I have participated in DC for more than 10 years now and I know how much of a "pain" it is when checkpointing is not available. However, checkpointing is available ONLY when an application has been designed from scratch to offer that feature. In scientific software, writing out checkpoints is almost never seen, since from the developer's viewpoint it is mostly superfluous overhead (consuming time, electricity and compute power). That might sound strange at first, but it is a fact and makes sense if you think about it for a moment.

Consequently, almost no scientific software has checkpointing, and this means that a project such as RNA World, which uses open source scientific software, could only offer checkpointing if the code were entirely rewritten. Unfortunately, we cannot do that for technical and manpower/funding reasons. It also makes no sense in general, since the code would have to be reorganized again for each new software version.

As a consequence, we are looking for a universal checkpointing system that is in some way integrated into BOINC. One solution is a virtual machine approach. Another is to write the relevant portion of RAM to disk at certain time intervals - similar to what you know as sleep mode. We currently employ a flavor of the latter for Linux-x32 systems.
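
For readers unfamiliar with the term, here is a minimal Python sketch of application-level checkpointing, i.e. the kind that has to be designed into a program. It is not the memory-image approach mentioned for Linux-x32, and nothing in it comes from Infernal or BOINC; the function name, the state layout and the toy computation are all made up for illustration.

```python
import os
import pickle
import tempfile


def run_with_checkpoints(total_steps, checkpoint_path, every=1000):
    """Sum the integers below total_steps, saving progress periodically."""
    # Resume from the last checkpoint if one exists.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, "rb") as fh:
            state = pickle.load(fh)
    else:
        state = {"step": 0, "total": 0}

    while state["step"] < total_steps:
        state["total"] += state["step"]  # stand-in for the real computation
        state["step"] += 1
        if state["step"] % every == 0:
            # Write to a temporary file first, then rename, so a crash
            # during the write cannot corrupt the previous checkpoint.
            tmp_path = checkpoint_path + ".tmp"
            with open(tmp_path, "wb") as fh:
                pickle.dump(state, fh)
            os.replace(tmp_path, checkpoint_path)
    return state["total"]


path = os.path.join(tempfile.mkdtemp(), "checkpoint.pkl")
print(run_with_checkpoints(10_000, path))  # prints 49995000
```

The sketch also shows why this is hard to retrofit: the program must keep all of its progress in a state it can serialize, and the main loop has to be written around saving and restoring that state.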

Last edited by Michael H.W. We; 28th September 2010 at 04:08 AM.
  #14  
Old 28th September 2010, 04:52 PM
DigiK-oz DigiK-oz is offline
Dutch Power Cows
 
Join Date: Oct 2008
Location: netherlands
Posts: 107
Quote:
Originally Posted by Michael H.W. We View Post
Well, there is a memory management system on both the server and the client side to counteract that situation - although it is not absolutely perfect. More importantly, however, most of our WUs have memory demands far below 500 MB as stated in the FAQ. What you cited is the maximum.


Yes, that is why we have the FAQ, so that users can check in advance whether or not their machines meet the project's system requirements. By the way: For Linux-x32 there is checkpointing available provided memory randomization is disabled in your kernel settings.


Well, there is no such thing as maximum hardware demands in computing. We have server-side management that checks your machine's hardware capabilities and determines which subset of our available WUs is suitable for your machine. In this way, we can properly assign which machine gets what and also get the demanding work done without your machine being overloaded.

Michael.
1: I know I cited the max. But any machine COULD get several of these "max" units simultaneously.

2: So you're on the right track. But I think a lot of users here are Windows users, hence no checkpointing.

3: I don't care about overloading. My main machine is an i7 920 with 12 GB. However, even THAT would basically die with 8 units IF they all happened to require 2.5 GB. Your server-side checks will not know what the hell I am running alongside your project.
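
The worst case in point 3 can be worked through with a few lines of Python. The 8 tasks, 2.5 GB per task and 12 GB of RAM are the figures from this post; the function names and the 4 GB reserved for other software are assumptions added for illustration.

```python
def worst_case_demand_gb(tasks, per_task_gb):
    """Total RAM needed if every running task hits the given size."""
    return tasks * per_task_gb


def max_concurrent_tasks(total_ram_gb, per_task_gb, reserved_gb=0.0):
    """How many tasks of the given size fit in RAM at once."""
    available = max(total_ram_gb - reserved_gb, 0.0)
    return int(available // per_task_gb)


print(worst_case_demand_gb(8, 2.5))        # prints 20.0 (GB), against 12 GB installed
print(max_concurrent_tasks(12, 2.5))       # prints 4
print(max_concurrent_tasks(12, 2.5, 4.0))  # prints 3, with 4 GB kept for other software
```

So with eight maximum-size units the machine would be about 8 GB short, and only three or four such units could run side by side without swapping.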

As for the checkpointing, I hear what you are saying. But it is all a technical rundown of why things are currently the way they are. Believe me, I know what I am doing: I have been running DC for ages, including BOINC, have my own (test) BOINC server running and have written (as a hobby project) my own BOINC project executables, including Nvidia CUDA, ATI Stream and OpenCL implementations. I can agree with your technical explanation, but I still think your project as it is is unsuitable to be "released into the wild" on everyone, since it has too many IFs (it has checkpointing IF Linux, IF randomization is off, IF....).

I'm not trying to be negative here; I simply see a lot of issues if an unsuspecting user attaches to your project and expects things to just work. I, myself, just might attach shortly just to get some points on the board.

Last edited by DigiK-oz; 28th September 2010 at 04:56 PM.
  #15  
Old 28th September 2010, 06:17 PM
Michael H.W. We Michael H.W. We is offline
Neophyte
 
Join Date: Aug 2007
Location: Germany
Posts: 10
Quote:
Originally Posted by DigiK-oz View Post
1: I know I cited the max. But any machine COULD get several of these "max" units simultaneously.
Getting them simultaneously does not mean that they are also computed simultaneously. Before starting a task from the queue, BOINC checks how much RAM is free. Still, you are correct that the memory management in BOINC could be improved. But that is nothing we can deal with, and you may also know that other projects have the same issue.
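
To make the idea of a check before task start concrete, here is a simplified Python sketch of such an admission test. It is not BOINC's actual scheduling logic; the function, the queue layout and the memory figures are hypothetical, and it assumes each task's memory need is known before the task starts.

```python
def start_tasks(queue, free_ram_mb):
    """Start queued tasks in order, skipping any that do not fit in free RAM.

    queue is a list of (name, ram_mb) pairs. Returns (started, waiting).
    """
    started, waiting = [], []
    for name, ram_mb in queue:
        if ram_mb <= free_ram_mb:
            started.append(name)
            free_ram_mb -= ram_mb  # this memory is now reserved for the task
        else:
            waiting.append(name)
    return started, waiting


queue = [("wu1", 2500), ("wu2", 2500), ("wu3", 2500), ("wu4", 400)]
print(start_tasks(queue, free_ram_mb=6000))
# prints (['wu1', 'wu2', 'wu4'], ['wu3'])
```

A check of this kind is only as good as the per-task memory estimate it is given, which ties back to the earlier complaint that the memory need of each work unit was hard to predict.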

Quote:
Originally Posted by DigiK-oz View Post
2: So you're on the right track. But I think a lot of users here are Windows users, hence no checkpointing.
...except if you use a Linux VM, which works well with our project.

Quote:
Originally Posted by DigiK-oz View Post
3: I don't care about overloading. My main machine is an i7 920 with 12 GB. However, even THAT would basically die with 8 units IF they all happened to require 2.5 GB.
Well, see above and remember you have swap space, too.

Quote:
Originally Posted by DigiK-oz View Post
Your server-side checks will not know what the hell I am running alongside your project.
...but the memory will be checked for space prior to task starting (see above).

Quote:
Originally Posted by DigiK-oz View Post
As for the checkpointing, I hear what you are saying. But it is all a technical rundown of why things are currently the way they are. Believe me, I know what I am doing: I have been running DC for ages, including BOINC, have my own (test) BOINC server running and have written (as a hobby project) my own BOINC project executables, including Nvidia CUDA, ATI Stream and OpenCL implementations. I can agree with your technical explanation, but I still think your project as it is is unsuitable to be "released into the wild" on everyone, since it has too many IFs (it has checkpointing IF Linux, IF randomization is off, IF....).

I'm not trying to be negative here; I simply see a lot of issues if an unsuspecting user attaches to your project and expects things to just work. I, myself, just might attach shortly just to get some points on the board.
I understand, but as a project leader, I expect that people inform themselves BEFORE deciding to install software on their machines. I consider this the one and only minimum requirement in DC. And indeed, if people think there are too many IFs, then they should please not participate. In the meantime, we are generating interesting results with those who do, and we will continue to improve our project to increasingly meet "failsafe expectations".

Michael.

P.S.: If you have that much experience with BOINC, you might consider helping us with the VM approach?