[GAH] Frequently Asked Questions
23rd February 2003, 12:29 PM
INTRODUCTION TO THE GENOME@HOME PROJECT
Q. What is Genome@Home all about?
The Human Genome Project is nearing completion, and scientists are working hard to develop the understanding needed to use this wealth of genetic information in ways that will be significant to medicine and humankind. One of the most important ways to do this is to study other genomes and individual gene sequences that are already available to us. By understanding how these genomes work, we will be able to put the huge amounts of data (over 50,000 genes and 3 billion nucleotide base pairs) from the Human Genome Project into biological and medical context, giving it real meaning.
Q: How do I Join Genome@Home?
You can include your computer in this worldwide project by heading to the main site at Stanford. All you need to do is download the software and, at the prompts, add your details and the TeamNinja_Genome team number (1853278109), and you're away. The program does everything else for you. Register at Stanford HERE (http://gah.stanford.edu/download.html). After you register you will be able to download the latest version. We suggest the no-frills client version, though there is a graphical client for those interested.
As you can probably guess by now, designing just one new gene sequence is already computationally demanding. To design hundreds of new sequences for hundreds of proteins, literally thousands of computers are needed.
(See Scientific background (http://gah.stanford.edu/science.html) for more details about genomes, proteins, and how proteins and genes are related.)
Q: How can I help?
To design these large numbers of protein sequences, we need lots of computers. By running the Genome@home protein sequence design client, you can lend us your computer while you're not using it, for as long or as little as you like. It simply runs alongside your other programs and does its calculations in the unused cpu time while you're away from your desk, or even while you're working on your computer. You won't notice a loss of speed, and your computer will work as usual. All you see is a small window that shows you the protein sequences you're designing. If you don't want to look at it, just minimize the window and move it to a corner of your desktop. A day or two's worth of running G@h2 is enough to design new protein sequences that the world has never seen before. All the sequences get added to the Genome@home database, so every little bit helps.
Q. What is the goal of the Genome@Home project?
The goal of Genome@home is to design new genes that can form working proteins in the cell. Genome@home uses a computer algorithm (SPA), based on the physical and biochemical rules by which genes and proteins behave, to design new proteins (and hence new genes) that have not been found in nature. By comparing these "virtual genomes" to those found in nature, we can gain a much better understanding of how natural genomes have evolved and how natural genes and proteins work. Some important applications of the Genome@home virtual genome protein design database:
* engineering new proteins for medical therapy
* designing new pharmaceuticals
* assigning functions to the dozens of new genes being sequenced every day
* understanding protein evolution
Q. How long has the Genome@Home project been running?
The project started in early 2001.
Q. How long will the Genome@Home project run?
Stanford plan to continue with this project.
Q. Who owns the results?
Unlike other distributed computing projects, Genome@home is run by an academic institution (specifically the Pande Group, at Stanford University's Chemistry Department), which is a non-profit institution dedicated to science research and education.
* The results from Genome@home will be made available on several levels. First, the statistics and information about the protein sequences being designed are available on the web for everyone to see. These are updated daily, and include information about which users contributed which sequences.
* Second, analysis of the sequences will be submitted to scientific journals for publication, and these journal articles will be posted on the web page after publication.
* Third, after publication of these scientific articles, which analyse the data, the raw data will be available for everyone, including other researchers, at http://gah.stanford.edu
Q. Is there an official FAQ?
The official Stanford FAQ can be found here: http://gah.stanford.edu/faq.html
23rd February 2003, 12:30 PM
ABOUT THE TEAM
Q. Who is TeamNinja_Genome?
TeamNinja_Genome is a distributed computing team, dedicated to the Genome@Home project run by Stanford University. We support the goals of that project and provide a group of like minded individuals that will soon become your friends.
Q. Where are the TeamNinja_Genome members located?
The members are from all over the globe and from all walks of life. Members are situated in Australia, Canada, the UK and USA.
Q. How did TeamNinja_Genome get started?
TeamNinja_Genome was formed when our good friends at Team Picard became an independent entity, with many of the NinjaMicros members staying here to form a brand new team.
Q. Can I join TeamNinja_Genome?
Yes you can! Anyone and everyone is eligible and welcome to join.
Q. What does it cost to join?
Q. What operating systems are supported?
The (console) client supports Windows 95, 98, 2000, NT, ME, XP and Linux.
Q. What processor families are supported?
x86 processors and greater.
Q. Intel or AMD?
The AMD Duron/Athlon/XP processors have proven to be faster (per clock cycle) than the Intel Pentium processors. AMD definitely provides better "bang for buck" with Genome@Home.
Q. Does Genome@home run on dual processor machines?
Yes. Genome@home supports dual processor machines. You just need to run two copies of Genome@home, each installed into its
Q. Will running the client program slow down my computer or affect other programs?
No. The client runs at the lowest execution priority, so you should not notice any slowness at all if sufficient memory is in place. In general, the specific memory requirement depends on the operating system that is being run.
Q. What is the current recommended client software version?
The current version (3.24) is now a joint application with F@H3.
Q. I have a modem, can I use Genome@home?
Yes, the Genome@home client will work with most modem setups. An always-on connection is preferred, though caching will be implemented soon. The UD Monitor application can be used to cache units at this stage.
Q. How do the results get back to Stanford?
Your computer will automatically upload the results to the Genome@home server each time it finishes a gene design, and download a new job at the same time.
Q. What about security issues?
The Genome@home client software is available for download only from the Stanford web site: http://gah.stanford.edu
This software will upload and download data only from the Stanford University Genome@Home project data server. The server does not download any executable code to your computer.
23rd February 2003, 12:31 PM
Q. How do I participate?
First you must download the client software, available only from Stanford: http://gah.stanford.edu/download.html
Q. How do I configure the client software?
Run the Genome@home client by choosing Genome@home from the Start menu. When first run, the client will ask you to enter your user name; enter whatever name you'd like. To join the TeamNinja_Genome team, enter team number 1853278109. Note: The user name is case sensitive, so "Fred" and "fred" are different members! Use only letters, numbers and the underscore character; do not use a space!
Q. Are there any characters I should avoid in a user name?
You can use anything except whitespace (space, tab, etc.). If you want a space in your user name, use an underscore "_" instead.
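As a rough illustration of the naming rules above, here is a minimal Python sketch. It is not part of the Genome@home client, and the function names are made up for this example. It assumes the stricter of the two rules given (letters, numbers and underscores only), which also satisfies the no-whitespace rule.

```python
import re

# Assumed rule, taken from the FAQ text: letters, digits and underscores only.
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_username(name: str) -> bool:
    """Return True if the name uses only letters, digits and underscores."""
    return _VALID_NAME.fullmatch(name) is not None


def same_member(first: str, second: str) -> bool:
    """User names are case sensitive, so they must match exactly."""
    return first == second
```

With this sketch, "Fred_Smith" passes the check and "Fred Smith" does not, while "Fred" and "fred" count as two different members.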
Q. Can I re-configure the client at any time?
Yes. Stop the client and re-run it with "ghclient -config".
RUNNING THE SOFTWARE
In Linux, open a terminal window (or go to a console login), change to the installation directory and enter "./ghclient.x".
The Genome@home client will start receiving and processing data automatically, providing you are connected to the Internet.
Q. I'm behind a firewall, can I run G@h2?
If you are behind a firewall, please answer yes at the "firewall" dialogue box, and then give the client some info about your firewall.
Q. What port number does the client use to connect to the Internet?
Q. My Internet connection is really slow, does this use up a lot of my bandwidth?
It uses very little bandwidth. Units only take seconds to download and upload.
Q. Can I run the client without sending my work back to Stanford?
Only by caching the units with UD Monitor - caching to a limited extent will occur later via the client.
Q. I'm running multiple machines. Can they all have the same user name?
Yes. (And remember about case sensitive names.) The G@h2 server will assign each machine a unique cpu id, but you don't need to worry about that. You should enter the same user name and group number in the configuration dialog box on every machine onto which you install Genome@home.
Q. Can I use more than one computer with my username/team?
Yes, ensure you use the EXACT same name (including capitalisation) and TEAM NUMBER. You can copy the ghclient.cfg file between computers if you like.
Q. How do I only run G@h units?
When setting up the client, you have to specify g@h in the advanced options to get only genome units. By entering the G@h team number (1853278109) you should also only be able to download G@h units. In rare cases when the G@h servers are unavailable you may get a folding unit, which will not go towards the team's total. Team Ninja also has a F@h3 team (here (http://folding.stanford.edu/cgi-bin/teampage?q=31379)) with the team number 31379. The F@h3 stats accept both folding and genome units.
23rd February 2003, 12:32 PM
NO INTERNET CONNECTION?
Q. What if my machine is not permanently connected to the Internet?
The client software will try to upload completed work for a 5 minute period. If this fails, the work remains on your machine until it is successfully uploaded, and the client will not do any further work until it can get a new unit.
Q. How will the client process more work if it can't connect to the Internet?
The only way to do this is with UD Monitor at this point in time.
Q. What if I have computers that don't have access to the Internet?
Unsure at the moment...
PROTEINS, GENES AND STATISTICS
Q. What does the output on my screen mean?
Genome@home tells you how it's progressing through the design of the gene. It starts off with a huge variety of possibly good sequences, and iteratively searches through and refines these sequences, until a well-designed sequence is found. The core of the design algorithm repeats itself thirty times, each time producing one "best" sequence. After thirty iterations, the Genome@home client will attempt to send the data back to the server and get more work.
Q. How many sequences are there in a gene?
Q. How long does it take to do a WU?
It varies depending on how many amino acids (aa) the gene has and the computer you are using. A 72aa WU takes a 1.3 GHz machine about 7 hours.
Q. How do I tell how many amino acids are in my current work unit?
For Windows, download and run Electron Microscope (EM3) (download here (http://www.em-dc.com)).
For Linux users, look in the file "input.inp"; the first number in the first line is the number of amino acids for the work unit. Also, when you start to process a gene, you will be advised of how many amino acids it contains.
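For anyone who wants to script the Linux check, a minimal Python sketch follows. It relies only on what is stated above, that the first number on the first line of "input.inp" is the amino acid count. That the number is a whitespace-separated integer is an assumption, and the function names are invented for this example rather than anything documented by Stanford.

```python
from pathlib import Path


def amino_acid_count(first_line: str) -> int:
    """Return the first whitespace-separated number on the line as an int."""
    return int(first_line.split()[0])


def amino_acids_in_work_unit(path: str = "input.inp") -> int:
    """Read the first line of the work unit input file and parse the count."""
    with Path(path).open() as handle:
        return amino_acid_count(handle.readline())
```

Run it from the client's installation directory, for example `print(amino_acids_in_work_unit())`.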
Q. When does the client checkpoint work?
It will checkpoint its status after each sequence has been completed (100%). If you interrupt the client before it has finished 100% of a sequence, it will need to reprocess the entire sequence. If you interrupt the client before it has done the first sequence, it will download a new work unit and start over; if the client has generated a gene, it can be stopped and restarted without needing to download a new work unit.
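The checkpoint rules above can be summed up in a few lines of Python. This is only a sketch of the behaviour as described in this FAQ, not the client's actual code; the function name is hypothetical and the figure of thirty sequences per work unit comes from the answer on screen output earlier in this thread.

```python
TOTAL_SEQUENCES = 30  # sequences per work unit, as described in this FAQ


def after_restart(completed: int):
    """Return (needs_new_work_unit, next_sequence) after an interruption.

    completed is the number of sequences that reached 100% before the stop.
    """
    if not 0 <= completed < TOTAL_SEQUENCES:
        raise ValueError("completed must be between 0 and 29")
    if completed == 0:
        # Nothing was checkpointed, so a new work unit is downloaded.
        return (True, 1)
    # The partly finished sequence is lost and reprocessed from the start.
    return (False, completed + 1)
```

So stopping part way through sequence 13 (12 completed) means sequence 13 is redone from scratch, while stopping during the very first sequence means starting over with a new work unit.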
Q. How do I find out how many genes/WUs I've done?
Stanford provide user and team statistics calculated every three hours. Other statistic vendors provide value added statistics based on the Stanford stats.
Q. What size genes can I expect, i.e. number of amino acids?
To date, the genes have ranged from 5aa to almost 200aa.
Q. What are the chances that my work will result in a duplicate of someone else's work?
Very slim. A large number of different work units are sent out, each with a random 16-bit seed. There are many, many protein sequences to be designed. The current chance of duplicate work being reported is almost non-existent, as all genes issued by Stanford have a unique, individual identity reference.
Q. How do I avoid returning duplicate work?
It is important to have a "fresh" download as the source for every CPU on every machine. Incorrect caching techniques may also be at fault.
Q. How can I process more work units?
Upgrade your computer, overclock your computer, get more computers, recruit your friends and colleagues. But remember, make sure you have permission to install the client if you don't own the computer.
Q. I uploaded my work 3 hours ago, why doesn't the work show up on the Stanford Statistics?
It usually takes 2-3 statistic periods (6-9 hours) for your work to be credited.
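To work out roughly when an upload should appear, here is a small Python sketch of the arithmetic. It assumes the three-hour statistics period and the 2-3 period delay quoted in this FAQ; the function name is made up for this example and the real timing may differ.

```python
from datetime import datetime, timedelta

STATS_PERIOD = timedelta(hours=3)  # Stanford stats interval quoted in this FAQ


def expected_credit_window(uploaded_at: datetime):
    """Return (earliest, latest) times the work is likely to be credited."""
    earliest = uploaded_at + 2 * STATS_PERIOD  # 6 hours after upload
    latest = uploaded_at + 3 * STATS_PERIOD    # 9 hours after upload
    return earliest, latest
```

For an upload at 12:00, this gives a window of 18:00 to 21:00 the same day.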
Q. Do my units move with me if I change teams?
All participants have two sets of statistics: their team statistics and their individual statistics. The team statistics always remain with the team they were crunched with, but the individual statistics remain with the individual, irrespective of which team they are currently part of, provided that they retain their unique user identity (username).
23rd February 2003, 12:43 PM
All data above has been collected from various sources, including TGC (http://www.thegenomecollective.com), TPR (http://www.teamphoenixrising.net), TP (http://www.team-picard.com) and G@H Stanford (http://gah.stanford.edu/), plus personal experience and of course my home - http://www.ninjamicros.com :).
23rd February 2003, 02:07 PM
Originally posted by data
MarsJupiter Members Position Table (http://www.marsjupiter.com/seti/generic_member_table.php?project=genome_pica)
MarsJupiter Real Time Stats (http://www.marsjupiter.com/seti/generic_realtime.php?project=genome_pica)
Stanford Team Stats (http://genomeathome.stanford.edu/teamstats.html) courtesy of Monkeymia
Stanford Member Stats (http://genomeathome.stanford.edu/userstats.html)
Stanford TeamNinja_Genome Stats (http://gah.stanford.edu/teams/TeamNinja_Genome.html) courtesy of Monkeymia
Genome Latest News Bulletin (http://gah.stanford.edu/new.html)
old but still used YAHOO Genome Community Group (http://groups.yahoo.com/group/genomeathome/)
Genome New Community Group Forum also covers Folding (http://forum.folding-community.org/)
Statsman 1000 Team Stats (http://www.statsman.org/genomestats/index.html)
Statsman Top 1000 User Stats (http://www.statsman.org/genomestats/users.html)
Statsman TeamNinja_Genome Breakdown (http://www.statsman.org/genomestats/4876.html)
SETIATWORK virtual GENOME STATS for TeamNinja_Genome (http://www.setiatwork.com/cgi-bin/gahstats.cgi?days=7&sort=Rank&team=TeamNinja_Genome&stat=teams) courtesy of Monkeymia
link to many things DC related (http://homepage.ntlworld.com/squeaky/DClinks.htm) courtesy of Squeaky
GENOME SPY (http://www.v8.franken.de/GS/index_eng.htm) courtesy of Stargazer
27th February 2003, 06:41 PM
LINKS TO MONITORING TOOLS AND G@H RELATED DOWNLOADS
Electron Microscope III (http://www.em-dc.com/) - Neat little customisable monitoring tool.
Borg Utilities (http://www.marsjupiter.com/seti/borg.php?project=genome_pica) - Don't miss these excellent applications, you won't know how you coped without them.
6th March 2003, 12:38 AM
Sunday (for Saturday) ................Monkeymia
Monday (for Sunday) ..................Jules
Tuesday (for Monday) .................Ragnarog
Wednesday (for Tuesday) ............Meady
Thursday (for Wednesday) ...........Ragnarog
Friday (for Thursday) ..................Meady
Saturday (for Friday) ..................Farley
12th April 2003, 01:23 PM
Team Ninja - The Future (http://www.ninjamicros.com/vbulletin/showthread.php?s=&threadid=20496)
14th April 2003, 11:27 AM
And this thread (http://ninjamicros.com/vbulletin/showthread.php?s=&threadid=22217) will help dispel any fears members/anyone might have about the new client ;)
21st April 2003, 12:23 AM
UD Monitor (which works effectively as a caching tool for G@H2, and is approved by PandeGroup) can be downloaded from here (http://distributed.org.ru/?udmon)
9th June 2003, 09:38 AM
Info for Folding@home3 can be found here (http://www.ninjamicros.com/vbulletin/showthread.php?postid=210819#post210819)
14th June 2003, 03:29 PM
To set up and run the new TEXT CONSOLE CLIENT for F@H so that it works on the G@H project, follow these instructions:
a/ create a folder on your HDD (e.g. G@H v2.06)
b/ download the core file from here - http://genomeathome.stanford.edu/download.html
c/ to be able to get access to the core file you will need to have at hand the following information
* your email address
* your user name
* your password
d/ once you complete c/ above you will be transferred to a link page which reads
"Thank you for your continued support of Genome@home.
Follow this link to download the Genome@home client."
Click on the download link to download the Windows 95/98/ME/NT/2000/XP text-only console (third download from the top of the selection list). Clicking this link will open a download window advising that you are about to download the file FAH3Console.exe. This is OK, do not be alarmed, you have the correct file (the G@H project uses the F@H core client to do the work). Save the file to the FOLDER you set up at a/ above. The file is a self-extracting .exe of 244 KB in size.
e/ once the download is complete you are now ready to set up and run your G@H client
f/ to set up the client and obtain work follow these instructions
g/ using Explorer, find the FAH3Console.exe file you downloaded to your G@H v2.06 folder
h/ for ease of use I have put a shortcut to the client on my desktop; to do this, right click on the client and click on send to desktop
i/ now double click on the file and it will open up; then follow the 11 instructions below to successfully set up/install and run the G@H Client
1/ User name = [add your user name here]
The user name must be exactly as you entered it when you logged on at the Genome@home site to get the new G@H 2.06 client software
2/ team number = 1853278109 for doing G@H work only and having your work credited to the TeamNinja_Genome team or
team number = 31379 for doing G@H and F@H work for the TeamNinja_Folding team
3/ ask before fetching/sending work [no] <yes/no>? = no
This needs to be completed according to the type of connection to the Internet you have at your disposal
4/ Use Internet Explorer Settings [no] <yes/no>? = no
Enter NO here. If you have problems reaching G@H automatically, change this to yes.
5/ change advanced options [no] <yes/no>? = yes
This enables you to select which project you want to work on full time. Selecting no will almost certainly mean you will work on the Folding@Home project full time; selecting yes will allow you to make a further selection, which lets you work on either project, or on F@H or G@H full time.
6/ Core Type [no pref] <no-pref/fah/gah>? = gah
As above, this is where you select which project you prefer to work on. If you choose no-pref or fah you will almost certainly be working on Folding@Home full time, as this is the preferred project for the Pande Group.
7/ Core Priority [idle] <idle/low>? = idle
The selection you make here depends on how you use your computer. If it's going to crunch only, select idle; only select low if you are going to be using your computer for other things as well as crunching (I have found that, irrespective of whether I am doing other work or not, the best setting is idle).
8/ CPU usage requested <5-100>? = 100
Like 7 above, the choice you make here depends on what you are going to be using your computer for. If crunching only, enter 100; if it is being used for other work, select anything between 75 and 90. This is a trial and error entry, and too high a value is supposed to affect how quickly your computer responds to instructions. On my main computer, however, I set 7 above to idle and 8 to 100. In effect this should tie up my computer completely and make access difficult, but this has not been the case, so I would recommend this setting (not sure if it will work this way with gaming).
(items 9 and 10 below are associated with F@H more than G@H and should be set at default as advised)
9/ Disable highly optimized assembly code [no] <yes/no>? = no
This item needs to be set as NO
10/ Ignore deadline information <mainly useful if system clock frequently has errors>? [no] <no/yes> = no
This item needs to be set as NO
11/ Machine ID <1-4> ? = 1
this item records which CPU on the system this copy of the client is for
1 cpu set as 1
2 cpus set up first as 1 second as 2
3 cpus set up first as 1 second as 2 third as 3
4 cpus set up first as 1 second as 2 third as 3 and finally fourth as 4
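The Machine ID table above boils down to numbering the client copies from 1 upwards, one per CPU. A tiny Python sketch of that rule follows; the function name is invented for this example, and the limit of 4 is taken from the <1-4> range shown in the prompt.

```python
MAX_MACHINE_ID = 4  # upper bound shown in the "Machine ID <1-4>" prompt


def machine_ids(cpu_count: int) -> list:
    """Return the Machine ID to enter for each client copy, one per CPU."""
    if not 1 <= cpu_count <= MAX_MACHINE_ID:
        raise ValueError("cpu_count must be between 1 and 4")
    return list(range(1, cpu_count + 1))
```

For a dual processor machine this gives IDs 1 and 2, one for each copy of the client.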
Once you answer 11 above, the client will automatically get under way. It takes approx 3 1/2 minutes from start to finish, after which the client will be working on the 1st iteration of designing a new protein.
If you have followed the above, you should now have successfully downloaded, set up and installed the client, and it should be working on the 1st iteration of designing a new protein.
now minimise the console box to your task bar and forget
to open up the console box in the task bar just click on the F@h symbol
to stop the client - wait till it has completed 100% of an iteration, because if you stop before this you will lose the iteration it is currently working on. Then use Control-C to close it down.
to start the client just click on the desktop icon you set up earlier refer h/ above
8th February 2004, 12:14 AM
The Idiot's Guide to G@H using the FOLDING Client needs updating - a few new additions since client V4 was released.
8th February 2004, 09:11 AM
Hi Tom, nice to see you around again :)
This has been noted by myself and TGC but as yet we have not updated this due to lack of time...