25 November 2006

Marshall's IT Plan for Janelia Farm -- Bio IT World

Might be useful to think about in terms of future high-performance computing purchases. See extracted snippets below.

Oct. 2006 
Marshall's IT Plan for Janelia Farm
By Kevin Davies
Oct. 16, 2006 |  Driving north from Washington Dulles Airport towards the Potomac River, it's easy to miss Janelia Farm. The only road sign faces the opposite direction, belatedly guiding lost taxi drivers retracing their route in search of the campus. Outside a makeshift hut in the middle of a construction site, the security guard waves a visitor's taxi down a long, winding dirt road appropriately named Helix Drive. Around a corner, however, the scene changes dramatically.....
The data center is completely fiber and boasts a multi 10-Gb network. "That's a constant question," says Peterson. "Am I going to get the data to my desktop fast? If I can't, then I'm going to start having people buying their own supercomputers and sliding it under their desk. I don't want that - it's not cost effective, and you can't manage it." He adds: "We're going to have very high-resolution graphics, and people are going to see it very fast. Just one set of microscopes will be generating 500 GB data/day. 24x7x365."....
With some 1,200 64-bit Intel Xeon processors in all, cooling was a major concern. Peterson explains: "We ended up going with Dell and Xeons, which are hot, but we did a calculation: given the price we got with them and given the increased power requirements, it still came in price effective. Having said that, we're very interested in the new generation of Intels and obviously AMD." ...
Everything in the data center is designed to be ripped out and replaced if needed. "The idea is to design infrastructure that is cost effective and easy to replace. We try to be open source - everything is Linux-based, low stress. It helps hugely with the maintenance."....
Peterson selected three tiers and 150 TB of spinning disk storage from EMC. "We started small... seriously!" Peterson smiles. Tier 1 is 30 TB of SAN. Tier 2 is 70 TB of NAS. Tier 3 - the archive - consists of more NAS on disk plus tape. Peterson wants to expand tier 3. "We have capability of over 1 PB of tape," says Peterson. "I can grow to multi petabytes without adding another cabinet." He opens one of a long row of EMC cabinets to show rows of vacant racks....
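For thinking about purchases, here is a rough back-of-the-envelope sketch of what the quoted figures imply. It only uses numbers from the snippets above (500 GB/day from one set of microscopes, 150 TB of disk, over 1 PB of tape). The decimal units, the constant data rate, and the idea that a single set of microscopes is the only data source are my assumptions, not claims from the article, and the function names are just illustrative.

```python
# Back-of-the-envelope capacity arithmetic using figures quoted in the article.
# Assumptions (not from the article): decimal units (1 TB = 1000 GB,
# 1 PB = 1000 TB), a constant data rate, and one set of microscopes
# as the only data source.

GB_PER_TB = 1000
TB_PER_PB = 1000
SECONDS_PER_DAY = 24 * 60 * 60

daily_output_gb = 500   # "500 GB data/day" from one set of microscopes
disk_capacity_tb = 150  # "150 TB of spinning disk storage"
tape_capacity_pb = 1    # "over 1 PB of tape", taken as a lower bound


def days_to_fill(capacity_tb: float, rate_gb_per_day: float) -> float:
    """Days until a store of capacity_tb is full at a constant daily rate."""
    return capacity_tb * GB_PER_TB / rate_gb_per_day


def sustained_mbit_per_s(rate_gb_per_day: float) -> float:
    """Average network rate (Mbit/s) needed to move the daily output."""
    bits_per_day = rate_gb_per_day * 1e9 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


yearly_output_tb = daily_output_gb * 365 / GB_PER_TB
disk_days = days_to_fill(disk_capacity_tb, daily_output_gb)
tape_days = days_to_fill(tape_capacity_pb * TB_PER_PB, daily_output_gb)

print(f"Yearly output:        {yearly_output_tb:.1f} TB")
print(f"Days to fill disk:    {disk_days:.0f}")
print(f"Days to fill 1 PB:    {tape_days:.0f}")
print(f"Sustained rate:       {sustained_mbit_per_s(daily_output_gb):.1f} Mbit/s")
```

Under these assumptions, one set of microscopes alone produces about 182.5 TB a year, would fill the full 150 TB of disk in 300 days, and needs only about 46 Mbit/s on average, so the storage tiers look like the tighter constraint than the 10-Gb network.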

The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements -- Nature Biotech

Might be good to use as benchmark to judge tiling arrays and protein chips

Nat Biotechnol. 2006 Sep;24(9):1151-61.
The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements.

Great Example of Spring Minimization


original URL

How to get on the Farmington Canal Heritage Trail in Hamden

Official Info.

Hamden – Farmington Canal Heritage Trail
Cheshire – Farmington Canal Heritage Trail

Useful maps
Looking at map on
Looks like approximate entry point is at:

Some Distances
Farmington Canal Greenway:
Simsbury-Avon section: 8 mi
Avon-Farmington section: 2.3 mi
Hamden-Cheshire section: 8 mi

24 November 2006

Global variation in copy number in the human genome -- Nature

Assume everyone has seen this. Might be nice to start using this data.

Nature 444, 444-454 (23 November 2006)
Global variation in copy number in the human genome
Richard Redon, Shumpei Ishikawa, Karen R. Fitch, Lars Feuk, George H. Perry, T. Daniel Andrews, Heike Fiegler, Michael H. Shapero, Andrew R. Carson, Wenwei Chen, Eun Kyung Cho, Stephanie Dallaire, Jennifer L. Freeman, Juan R. González, Mònica Gratacòs, Jing Huang, Dimitrios Kalaitzopoulos, Daisuke Komura, Jeffrey R. MacDonald, Christian R. Marshall, Rui Mei, Lyndal Montgomery, Kunihiro Nishimura, Kohji Okamura, Fan Shen, Martin J. Somerville, Joelle Tchinda, Armand Valsesia, Cara Woodwark, Fengtang Yang, Junjun Zhang, Tatiana Zerjal, Jane Zhang, Lluis Armengol, Donald F. Conrad, Xavier Estivill, Chris Tyler-Smith, Nigel P. Carter, Hiroyuki Aburatani, Charles Lee, Keith W. Jones, Stephen W. Scherer and Matthew E. Hurles

13 November 2006

Machine Over Man: Stock Pickers' Woes -- WSJ

Appears indexing may be back in fashion...

Machine Over Man: Stock Pickers' Woes
November 6, 2006; Page R1
Stock pickers' recently healed egos are about to be battered anew....

ooo[clip]ooo ooo[general]ooo ooo[finance]ooo

The Word on Warranties: Don’t Bother -- NY Times

What I always thought...

After the Sale
The Word on Warranties: Don’t Bother
Published: November 1, 2006
It may be tempting to buy extended warranties with all those high-tech gadgets on your holiday list, but the experts say they are almost always a waste of money.

ooo[clip]ooo ooo[computers]ooo ooo[purchases]ooo

07 November 2006

Gaming the Search Engine, in a Political Season -- NY Times

Interesting article pointing to the future of misinformation

Gaming the Search Engine, in a Political Season
Published: November 6, 2006
A GOOGLE bomb — which some Web gurus have suggested is perhaps better called a link bomb, in that it affects most search engines — has typically been thought of as something between a prank and a form of protest. The idea is to select a certain search term or phrase (“borrowed time,” for example), and then try to force a certain Web site (say, the Pentagon’s official Donald H. Rumsfeld profile) to appear at or near the top of a search engine’s results whenever that term is queried.....

ooo[clip]ooo ooo[computers]ooo ooo[search]ooo

01 November 2006

A loss-of-function RNA interference screen for molecular targets in cancer -- Nature

Interesting datasets related to cancer and phenotypes

Nature. 2006 May 4;441(7089):106-10. Epub 2006 Mar 29.
A loss-of-function RNA interference screen for molecular targets in cancer.
Ngo VN, Davis RE, Lamy L, Yu X, Zhao H, Lenz G, Lam LT, Dave S, Yang L, Powell J, Staudt LM.

ooo[clip]ooo ooo[bioinfo]ooo ooo[phenotypes]ooo

The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease -- Science

Interesting dataset related to cancer and phenotypes

Science. 2006 Sep 29;313(5795):1929-35.
The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease.
Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, Lerner J, Brunet JP,
Subramanian A, Ross KN, Reich M, Hieronymus H, Wei G, Armstrong SA, Haggarty SJ,
Clemons PA, Wei R, Carr SA, Lander ES, Golub TR.

ooo[clip]ooo ooo[bioinfo]ooo ooo[phenotypes]ooo

Letter ("Programming Pedigrees") responding to "The Semicolon Wars" -- American Scientist

Resurrected one of my favorite letters, which might be good to follow up on. Viz.:
I read with great interest Brian Hayes's recent column "The Semicolon Wars" on the genealogy of computer languages (Computing Science, July-August). I was struck by the first figure that shows how many well-known languages are related to each other in a tree-like structure. Mr. Hayes carefully compared the development of computer languages to that of spoken languages. This is, of course, quite appropriate. However, another illuminating comparison is to the development and evolution of genomes in biology.
The genome has often been compared to an organism's operating system, and in this sense, its underlying genetic coding is the ultimate computer language. The triplet codons in the literal genetic code have not changed much over time. However, the specific features they encode in different genomes have changed dramatically since life first appeared.
One of the nice things about biology, moreover, is that the main mechanisms underlying this evolution can be studied experimentally. It is believed that the genome evolves through a variety of processes duplicating and copying chunks of DNA, and then further variation happens to these copies. One also sees, most often in bacteria, whole genetic elements horizontally transferred from one organism to another.
The parallels to computer languages in the operation of these mechanisms are quite strong: In a specific lineage of languages (such as that for Algol60, C and Java) one sees the duplication and variation of basic control structures for such items as loops and subroutines, and horizontal transfer of new structures (such as the object-oriented constructions from Simula67).

Citation of Letter:
Programming Pedigrees
Volume 94, Number 6 (November-December 2006)
Mark Gerstein
Yale University
New Haven, CT

Article Commented on:
The Semicolon Wars
Every programmer knows there is one true programming language. A new one every week
Brian Hayes
American Scientist, July-August 2006
If you want to be a thorough-going world traveler, you need to learn 6,912 ways to say "Where is the toilet, please?" That's the number of languages known to be spoken by the peoples of planet Earth, according to Ethnologue.com. If you want to be the complete polyglot programmer, you also have quite a challenge ahead of you, learning all the ways to say: printf("hello, world\n") ;.....