24 December 2007

let. to ed. re. "For a Tunnel to Go 16 Miles, No Light Yet" -- NY Times

Here's a letter to the Times that wasn't published:
I was very intrigued by your recent column about a proposed tunnel
connecting Long Island to Westchester. As the piece noted, this
tunnel follows a route long ago advocated by Robert Moses. If such a
tunnel were ever to be built, it would validate one of Moses' original
designs for the road network encircling New York City; thus, in a way,
this proposal revitalizes his original dream. However, it is worth
pointing out that Moses' original plan was for a bridge, not a tunnel.
Moses, in fact, strongly disliked and actively opposed tunnels,
whereas he felt bridges made a stronger and grander statement. Thus,
the interesting twist in the column is how it simultaneously revives
Moses' original dream just as it implicitly criticizes his
stubbornness.


Letter in response to:
http://www.nytimes.com/2007/11/29/nyregion/29towns.html
November 29, 2007
Our Towns
For a Tunnel to Go 16 Miles, No Light Yet
By PETER APPLEBOME
GARDEN CITY, N.Y.
A reasonably sane person contemplating the modest proposal of the
developer Vincent Polimeni to build a $10 billion, privately financed,
16-mile tunnel linking Long Island and Westchester -- the longest
autos-only tunnel in the world and the first to be privately built in
the United States -- might start with two thoughts.
The first: This might be a brilliant idea or a nutty one, but in an
era of shrunken ambitions, give the guy credit for a big idea that
goes back to Robert Moses, who in the 1960s championed a bridge over
pretty much the same route.....

03 December 2007

Cycling in DC (TripBethesda)

Cycling from Bethesda into DC and back

Route taken on Google Maps
Summary: ~40 miles cycling
Geotagged images, served from [P]icasa or [F]lickr (also look at overall flickr map)

02 December 2007

Potential alternate terms to RFBRs for ChIP-chip "hits"

The ENCODE paper coined the term RFBR for "hits" in ChIP-chip experiments, viz: "We refer to regions with enriched binding of regulatory factors as RFBRs. RFBRs were identified on the basis of ChIP-chip data in two ways..." Some other terms were considered instead of RFBR. Here's a list of some of them:

CHIRP -- chip hit of regulatory potential
EIGR -- experimentally identified genomic region (like mountain)
GRAF -- Genomic region associated with function
GRAB -- Genomic region associated with binding
MONOD
RELIC -- Regulatory element in living cells
GELC -- Genomic element in living cells
FELC -- Functional event in living cells
MOJO
EDGE -- Experimentally determined genomic element
GEMMS -- Genomic element defined by multiple methods
LORE
REIVE -- Regulatory element in vivo
GERP -- Genomic element of regulatory potential
TRE -- transcriptional regulatory element
GIVE -- Genomic in-vivo element
LRE -- long-range element
RFBS -- regulatory factor binding site

11 November 2007

Bad MILK. How does milk know when to expire? - Bad Astronomy and Universe Today Forum

Some useful quotes on what makes milk go bad:
"Generally, if kept constantly cold, bacteria in milk will grow at a constant slow rate. The bacteria generally feeds on the lactose, turning it into lactic acid. As it accumulates, it makes the milk taste more and more sour. But our taste buds will generally ignore the small bit of sourness. The milk is considered bad when the sourness becomes noticeable. The proteins in the milk will start to stick to each other in an acid environment. However, for this to happen on a large scale requires the milk to be fairly acid (pH around to 5.5). Our taste buds register sour at a much higher pH (~6)."

30 October 2007

Cycling and Walking around Vienna, with Pictures (TripVienna)

Attempted to carefully integrate image and geo data from last summer's vacation to Vienna

Photos
(Geotagged images, served from [P]icasa or [F]lickr)

  • All of the photos [P][F] (also look at overall flickr map)
  • Various Subsets (sublime to ridiculous):
  1. Interesting Architecture [P][F]
  2. Street Art [P][F]
  3. Generic, non-geotagged shots [P][F]
  4. Graffiti [P][F] (alternate Picasa with subset map)
  • Shots of People [P] (need auth. key)
Cycling and Walking Routes
Useful Links

28 October 2007

Cycling near New Haven (BikeCT + BikeOrchard)

BikeCT
Cycling along CT coast near Guilford and Branford, with a lunch in Branford.
Route taken on Google Maps
(kmz colored by course heading)
Associated useful links.
Summary: ~52 miles cycling (7 hrs. of biking time)

BikeOrchard
Cycling to Lyman Orchard, with a lunch there.
Route taken on Google Maps
(kml colored by elevation -- note hills.)
Associated useful links.
Summary: ~41 miles cycling (7.5 hrs. of total time)

28 September 2007

Some examples of Cockney rhyming slang

Cockney rhyming slang - Wikipedia, the free encyclopedia:
'plates' means 'feet' ('plates of meat')
'brown' means dead ('brown bread')
'titfer' means hat ('tit for tat')

23 September 2007

Some Useful Tidbits on Asthma

The Harvard Medical School Guide To Taking Control Of Asthma

by Christopher H. Fanta (Author), Lynda M. Cristiano (Author), Kenan Haver (Author)
http://www.amazon.com/Harvard-Medical-School-Taking-Control/dp/0743224787/ref=ed_oe_p/104-6142984-5779139?ie=UTF8&qid=1190536146&sr=1-6
http://books.google.com/books?id=ohIbOrVcRBwC

Leukotriene Inhibitors

[From link above:]
"Molecular biology promises to provide drugs that will control our immunologic and biochemical reactions. Several pharmaceutical companies are now producing leukotriene receptor blockers - the first of the new line of agents being developed to control the inflammatory response in asthma. The first agents coming to market in Canada are zafirlukast (Accolate) and montelukast (Singulair). Both act by blocking the most potent of the leukotrienes, L4. Asthma is felt to be due to allergens triggering an inflammatory response. .... Leukotrienes were found to be the active factor in the old SRSA (slow releasing substance of anaphylaxis) first described in 1938. They are 1000x more potent in inducing bronchospasm than histamine. They are part of the lipoxygenase pathway that is also involved in prostaglandin production. These pathways are also used in allergic rhinitis. At the moment, leukotrienes are hoped to be the appendix of the immune system - of no good use, but known to cause problems....."

Beta Blockers & Asthma

[From link above:]
"What is a beta-blocker? A beta-blocker is a medicine used to treat high blood pressure and heart problems. Some beta-blockers are atenolol (brand name: Tenormin), metoprolol (brand names: Lopressor, Toprol XL) and propranolol (brand name: Inderal). A beta-blocker blocks the harmful effects of stress hormones on your heart. This medicine also slows your heart rate. Beta-blockers can also be used to prevent migraine headaches in people who get them frequently. Can I take a beta-blocker if I have asthma or chronic lung disease? Beta-blockers are generally not used in people with asthma. A beta-blocker can cause asthma attacks.... "

Beta-agonists

[From link above:]
"Beta-agonists are bronchodilator medicines that open airways by relaxing the muscles around the airways that tighten during an asthma attack....
Beta-agonists come in many different forms. Some common beta-agonist medicines are: albuterol, Alupent, Brethine, metaproterenol, Metaprel, Proventil, Salbutamol, terbutaline, Ventolin."

The concept of "Pre-treatment" with beta-agonist

Preventing Respiratory Viruses

[From Harvard Med. Guide above:]
"Most viruses are spread from oral or nasal secretions onto surfaces, are picked up by hand contact, and are then spread from your hands to your nose and mouth. So wash your hands frequently..."

http://en.wikipedia.org/wiki/Purell

Eosinophils: mischief-makers in asthma

[From link above:]
"Whatever are eosinophils?

Eosinophils are a type of white blood cell (corpuscle) and take up the red dye eosin when blood is examined under a microscope by the commonest method.

They accumulate wherever allergic reactions like those in asthma take place. Their natural role is to defend us against parasites. In fact allergies such as asthma are probably a malfunction of our protective mechanism against parasites...."

21 September 2007

AOI

Interesting acronym: AOI = achievements, objectives, issues

10 September 2007

Triathlons I've done

2007 MADISON SPRINT TRIATHLON
Swim .5 miles/Bike 13 miles/Run 3 miles
Saturday Sept 8, 2007 | Surf Club - Madison, CT
Part of running route
http://www.madisonjc.com/trimaps.shtml
http://www.madisonjc.com/triathlon.shtml
Cache: C:\...\pers\misc\exercise\Sept07-triathlon\madtri07.txt + related

JONES BEACH TRIATHLON
SEPTEMBER 26, 2004
FINISH LINE ROAD RACE TECHNICIANS - www.FLRRT.com
1/2 Mile Swim, 14 Mile Bike, 3.1 Mile Run
http://metrotri.com/document/45330
Cache: D:\...\y10.unzip\pers-pics\pers\clip\jones-beach-triathalon

VYTRA TOBAY TRIATHLON
TOWN OF OYSTER BAY, OYSTER BAY, NEW YORK
AUGUST 19, 2001
FINISH LINE ROAD RACE TECHNICIANS, www.FLRRT.com
1K Swim - 15K Bike - 5K Run
http://bioinfo.mbb.yale.edu/~mbg/dom/fun3/tobay-triathalon
Cache: file:///D:/.../y6-021014-closed/y6/bikemaps/tobay-times/tri01m.txt

TRIATHLON IN MODESTO
Modesto, CA ; 13 October 1996
Pool swim, Bike, Run
Can't find any more details

07 September 2007

let. to ed. re. "Mom's Genes or Dad's? Map Can Tell" -- Washington Post

Here's a letter to the Washington Post that was published:
Regarding the Sept. 4 front-page article "Mom's Genes or Dad's? Map Can Tell," about the unraveling of the Venter "diploid" genome: The article noted that sequencing an individual's DNA provides a wealth of information not only about that person but also about his or her relations. But it did not mention that sequencing also provides information about all of an individual's unborn descendants. Thus, when an individual's genome sequence is publicly released, consent implicitly is being given for these unborn descendants without their approval.
Fifty years from now, our understanding of genomic information will undoubtedly be more sophisticated than it is today. In the future, from a bit of sequence, it might be possible to glean a tremendous amount about such things as the diseases or behavioral anomalies that might befall someone. What might these unborn descendants have to say about the release of such highly personal information?
It is worth underscoring that when information is publicly released, it gets widely distributed (via the Internet and other means); any such decision made today will have far-reaching and irreversible consequences.


Citation of the Letter
http://www.washingtonpost.com/wp-dyn/content/article/2007/09/06/AR2007090602362.html
DNA Rights and Wrongs
Friday, September 7, 2007; A20
MARK GERSTEIN
New Haven, Conn.

Citation of Article Letter Responds to
http://www.washingtonpost.com/wp-dyn/content/article/2007/09/03/AR2007090301106.html
Mom's Genes or Dad's? Map Can Tell.
One Man's DNA Shows We're Less Alike Than We Thought
By Rick Weiss
Tuesday, September 4, 2007; Page A01
Scientists have for the first time determined the order of virtually every letter of DNA code in an individual, offering an unprecedented readout of the separate genetic contributions made by that person's mother and father....

Other Articles that this Letter Potentially Responds to
http://www.nytimes.com/2007/06/12/opinion/12tue4.html
June 12, 2007
Editorial
The Discoverer's DNA
When scientists talk about sequencing the human genome, they have been talking
so far about creating a composite picture drawn from the gene sequences of many
people. That has now changed for good. Recently, the director of the Human
Genome Sequencing Center at the Baylor College of Medicine gave James D. Watson
— who with Francis Crick discovered the structure of the DNA molecule — two DVDs
that contained the complete sequence of Mr. Watson's DNA.....

http://www.nytimes.com/2007/06/03/weekinreview/03harm.htm
June 3, 2007
6 Billion Bits of Data About Me, Me, Me!
By AMY HARMON
JAMES D. WATSON, who helped crack the DNA code half a century ago, last week
became the first person handed the full text of his own DNA on a small computer
disk. But he won't be the last.
Soon enough, scientists say, we will all be able to decipher our own genomes —
the six billion letters of genetic code containing the complete inventory of the
traits we inherited from our parents — for as little as $1,000.
Just what we will do with the essence of who we are once we bottle it, however,
is likely to be as much a social experiment as a scientific one....

http://www.nytimes.com/2007/06/01/science/01gene.html
June 1, 2007
Genome of DNA Discoverer Is Deciphered
By NICHOLAS WADE
The full genome of James D. Watson, who jointly discovered the structure of DNA
in 1953, has been deciphered, marking what some scientists believe is the
gateway to an impending era of personalized genomic medicine.
A copy of his genome, recorded on two DVDs, was presented to Dr. Watson
yesterday in a ceremony in Houston by Richard A. Gibbs, director of the Human
Genome Sequencing Center at the Baylor College of Medicine, and by Jonathan M.
Rothberg, founder of the company 454 Life Sciences.
"I am thrilled to see my genome," Dr. Watson said....

05 September 2007

let. to ed. re. "Logged In and Sharing Gossip, er, Intelligence" -- NY Times

To the Editor:

I was very impressed by the recent article in the week in review about
how the intelligence community could use collective knowledge in the
form of wikis and blogs to help combat potential threats. While I
think this idea is great, I was surprised that the article did not
mention the public episode a few years ago where it was suggested that
the Defense Advanced Research Projects Agency (DARPA) establish a
Policy Analysis Market to help predict terrorist threats. In the
framework of the article, the DARPA proposal appears to be quite
prescient. Given the clear incentive of profit, efficient markets are
an even better idea for harnessing collective intelligence than wikis
and blogs. However, this proposal was strongly criticized in the
press, which led to the resignation of DARPA head John Poindexter.

Mark Gerstein


Above is an unpublished letter in response to:
September 2, 2007
Logged In and Sharing Gossip, er, Intelligence
By SCOTT SHANE
http://www.nytimes.com/2007/09/02/weekinreview/02shane.html
Week in Review

AMERICA’S spies, like America’s teenagers, are secretive, talk in code and get
in trouble if they’re not watched closely. It’s hard to imagine spies logging on
and exchanging “whuddups” with strangers, though. They’re just not wired that
way. If networking is lifeblood to the teenager, it’s viewed with deep suspicion
by the spy.....

04 September 2007

Cycling along the Hudson

BikeNyack
Cycling through Palisades Park to Nyack, with a long lunch at famous Runcible Spoon.
Route taken on Google Maps
(kmz colored by elevation, note tough climb at end of park when one joins highway)
Associated useful links.
Summary: ~51.2 miles cycling (7.3 beyond that shown in KML)

BikeStormKing
Walking in StormKing sculpture park then cycling from there through Newburgh, over Hudson on I84 bridge, and into Beacon and then back.
Route taken on Google Maps
(kmz colored by speed)
Associated useful links.
Summary: ~23.1 miles cycling and 3 miles walking

BikeCroton
Cycling along Old Croton Aqueduct Trail, which is dirt. Trip aborted by rain in northern portion.
Route taken on Google Maps
Associated useful links.
Summary: ~18.5 miles

14 August 2007

Cycling in the Bronx (BikeBx)

Into Bronx over Broadway Bridge and then to City Island
Route taken on Google Maps (kmz Color Ramped)
Associated useful links.
Summary: apx 10:30 AM-4PM, 26 miles

29 July 2007

Jogging in Cambridge, UK (CamUK4Jogs)

Here are four slow jogs projected onto Google Maps in Cambridge radiating outwards from Trinity in four directions (in chronological order): CamWest (53:57 - 5.1km), CamNorth (34:30 - 3.8km), CamSouth (59:32 - 6.2km), CamShort (33:35 - 3.7km)

24 July 2007

Cycling in Queens (BikeQueens)

Into Queens over Queensboro, then Roosevelt Island and Triboro to Randall's Island
Route taken on Google Maps (kml)
Summary: apx 11AM-5:30PM, 31 miles

Total Time (h:m:s) 6:40:21 12:53 pace
Moving Time (h:m:s) 4:06:06 7:55 pace
Distance (mi ) 31.07
Moving Speed (mph) 7.6 avg. 231.3 max.
Temperature (°F) 53.6°F avg. 55.4°F high
Wind Speed ( mph) SSE 3.8 avg. SSE 6.9 max.
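
As a check on the figures above, the pace and speed columns can be reproduced from the time and distance rows. Here is a minimal sketch in Python, assuming "pace" means minutes:seconds per mile (rounded to the nearest second) and "moving speed" means miles per hour of moving time. The function names are made up for illustration and are not from the GPS software that produced the table.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'h:m:s' string such as '6:40:21' to total seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def pace(hms: str, miles: float) -> str:
    """Return pace as 'm:ss' per mile, rounded to the nearest second."""
    secs_per_mile = round(to_seconds(hms) / miles)
    return f"{secs_per_mile // 60}:{secs_per_mile % 60:02d}"


def speed_mph(hms: str, miles: float) -> float:
    """Return average speed in miles per hour over the given time."""
    return miles / (to_seconds(hms) / 3600)


if __name__ == "__main__":
    distance = 31.07  # miles, from the table above
    print("Total pace: ", pace("6:40:21", distance))
    print("Moving pace:", pace("4:06:06", distance))
    print("Moving speed (mph):", round(speed_mph("4:06:06", distance), 1))
```

The same three functions can be applied to the summary tables of the other rides by swapping in their times and distances.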

22 July 2007

let. to ed. re. "Sowing Seeds Of Cures" -- C&EN

Here's a letter that wasn't published:
I read with great interest the recent article on venture philanthropy. I think
this is an interesting trend allowing philanthropic contributions to energize
the commercial process towards a good end. However, an important aspect that was
not emphasized is the significant potential for conflicts of interest to arise.
Non-profits, such as medical charities, are given special status in the United
States by the tax code. However, in the scenario described in the article where
a philanthropist contributes money to making a biotech investment opportunity
more favorable for a venture capital fund, he is essentially using charitable,
untaxed money towards a profitable end. This raises obvious conflicts: One could
imagine a person contributing money sheltered from taxes to a charity and then
having the charity redirect the funds to a commercial endeavor from which he
would directly benefit. Clearly, safeguards need to be developed to prevent this.


Letter in response to:
http://pubs.acs.org/cen/coverstory/85/8519cover.html
May 7, 2007
Volume 85, Number 19
pp. 19-26
Sowing Seeds Of Cures
As venture capitalists' priorities shift, venture philanthropists fill the gap
in funding of drug discovery by biotechs
Chemical & Engineering News (C&EN)
Lisa M. Jarvis
IN 1989, when scientists found the defective gene that causes cystic fibrosis,
it seemed that a cure, or at least an array of better treatment options, was
just around the corner. Research efforts, largely funded by the Cystic Fibrosis
Foundation (CFF), gained momentum, and by the mid-1990s, scientists had pieced
together much of the complex biology behind this debilitating and eventually
deadly disease.....

let. to ed. re. "Biology's Big Bang" -- Economist

Here's a letter to the Economist that wasn't published:
I read with great interest the recent cover article describing biology's big
bang. I wholeheartedly agree with the thrust of this piece. The article
makes the point that there is a revolution in biology akin to that in early
20th-century physics. It also compares the genome to a computer operating
system. One can take these comparisons even further. The revolution in biology
is fundamentally about how a discipline once preoccupied with descriptions of
anatomy and taxonomy is now increasingly concerned with digital information
processing. We are, in fact, witnessing the fusion of parts of biology and
computer science. The new roles found for RNA are so important because of its
central place in cellular information processing.


Letter in response to:
http://www.economist.com/opinion/displaystory.cfm?story_id=9339752
The RNA revolution
Biology's Big Bang
Jun 14th 2007
From The Economist print edition
What physics was to the 20th century, biology will be to the 21st—and RNA will
be a vital part of it.
NATURE is full of surprises. When atoms were first proved to exist (and that was
a mere century ago), they were thought to be made only of electrons and protons.
That explained a lot, but it did not quite square with other observations. Then,
in 1932, James Chadwick discovered the neutron. Suddenly everything made
sense—so much sense that it took only another 13 years to build an atomic bomb....

21 July 2007

let. to ed. re. "A Challenge to Gene Theory, a Tougher Look at Biotech" -- NY Times

Here's a letter to the Times (in response to their one mention of ENCODE) that wasn't published:
As a participating scientist in the consortium cited in the July 1st Sunday
edition, I was excited that the Times covered some of the findings of our
project (ENCODE). The article discussed how the consortium's work is changing
the definition of a gene, and it was fascinating to see how scientific findings
ripple over into commercial and legal contexts. One of the interesting things
about genes is how plastic their definition has been over time. The current
definition, which is being recast by the ENCODE project's findings, derives from
the cracking of the genetic code in the 1960s. However, before that, a gene had
a more abstract definition as a unit of heredity, divorced from the physical
molecules actually encoding it. One of the amazing things about successive
redefinitions of a gene is that they have all been "backwards compatible" in a
scientific sense, still allowing old findings to apply to the current
definitions, with a bit of mental gymnastics. However, maybe we will find that
this backwards compatibility only applies in a scientific sphere and that a
redefinition of the gene will require substantial changes outside of it, in our
notions of commercially viable entities.
(Also, you might note that this subject is quite related to some recent publications, viz:
http://papers.gersteinlab.org/papers/grgenerev/
http://papers.gersteinlab.org/papers/whatisgene )


Letter in response to:
http://www.nytimes.com/2007/07/01/business/yourmoney/01frame.html
A Challenge to Gene Theory, a Tougher Look at Biotech - New York Times
July 1, 2007
Re:framing
A Challenge to Gene Theory, a Tougher Look at Biotech
By DENISE CARUSO
THE $73.5 billion global biotech business may soon have to grapple with a
discovery that calls into question the scientific principles on which it was
founded. Last month, a consortium of scientists published findings that
challenge the traditional view of how genes function. The exhaustive four-year
effort was organized by the United States National Human Genome Research
Institute and carried out by 35 groups from 80 organizations around the world.
To their surprise, researchers found that the human genome might not be a "tidy
collection of independent genes" after all, with each sequence of DNA linked to
a single function, such as a predisposition to diabetes or heart disease.
Instead, genes appear to operate in a complex network, and interact and overlap
with one another and with other components in ways not yet fully understood.
According to the institute, these findings will challenge scientists "to rethink
some long-held views about what genes are and what they do."...

20 July 2007

Cycling in Brooklyn (Aborted5boro07 + BikeBeltPkwy)

Aborted 5 Boro, following course route to Brooklyn Bridge and then improvising to Verrazano after a long lunch
Route taken on Google Maps (kml)
Summary: apx 6:30AM-5PM, 46 miles

Total Time (h:m:s) 10:08:40 13:07 pace
Moving Time (h:m:s) 6:39:03 8:35 pace
Distance (mi ) 46.4
Moving Speed (mph) 7.0 avg. 67.8 max.
Temperature (°F) 55.6°F avg. 62.6°F high
Wind Speed ( mph) NE 8.5 avg. NE 11.5 max.

Central Park to Prospect Park and then onto Belt Parkway Bikeway
Route taken on Google Maps (kml)
Summary: apx 8:30AM-5PM, 49 miles, with stops, including lunch
Total Time (h:m:s) 9:47:41 12:05 pace
Moving Time (h:m:s) 6:29:44 8:01 pace
Distance (mi ) 48.55
Moving Speed (mph) 7.5 avg. 51.0 max.
Temperature (°F) 67.3°F avg. 71.6°F high
Wind Speed ( mph) SE 5.3 avg. SE 12.6 max.

17 July 2007

Cycling on Northern Part of Farmington Valley Greenway (BikingCTGreenwayN)

Actuals
Route taken on Google Maps (kml)
Summary Data (apx. 11:30AM-6:30PM, with stops, 46 miles)

Total Time (h:m:s) 7:05:31 9:19 pace
Moving Time (h:m:s) 5:08:33 6:45 pace
Distance (mi ) 45.67
Moving Speed (mph) 8.9 avg. 59.8 max.
Temperature (°F) 82.1°F avg. 84.2°F high
Wind Speed ( mph) W 9.2 avg. W 11.5 max.

Planning
Useful links: http://del.icio.us/mbgmbg/FunBikingCTGreenwayN

01 July 2007

let. to ed. re. "A Smarter Web" -- Tech Review

Here's a letter to Technology Review that was published:
We read with interest John Borland's piece on the Semantic Web ("A Smarter Web," March/April 2007). We agree that this is an exciting time in the Semantic Web's development, yet we want to point out that its great degree of structure has drawbacks. As the article noted, Semantic Web users must learn complex ontology languages and structure their information and data using them. This difficulty inhibits the growth of the Semantic Web. It is thus arguable whether the Semantic Web can approach the scale of the standard Web, where anyone can easily create and publish content.
Ideally, we should combine the strengths of the Semantic Web and the normal Web. Search would be a good place to start. Today, global free-text search is the primary means of querying the whole Web, but it provides only coarse-grained access to documents. In contrast, the Semantic Web allows much more precise queries across multiple information sources (say, querying for a particular attribute, such as "street address"). However, it is on a much smaller scale, involving far fewer documents. We could imagine combining normal and Semantic Web queries--for instance, to search the free text of all real-estate Web pages written by women in Boston during the last week for the word "Jacuzzi." Taking this further, the few structured relationships currently in the Semantic Web could be used to refine the results of mainstream search engines.
Finally, as so much activity in the life sciences is focused on large-scale interoperation on the Web (as found in drug discovery), we feel that biological research could serve as a useful guide and driving force for the development of Web 3.0.


Citation of Letter
http://www.technologyreview.com/Infotech/18851/page2/
The Semantic Web
July/August Issue of Technology Review
Mark Gerstein and Andrew Smith
Computational Biology and Bioinformatics Program
Yale University
New Haven, CT


Letter in response to:
http://www.technologyreview.com/Infotech/18395/
Monday, March 19, 2007
Part I: A Smarter Web
New technologies will make online search more intelligent--and may even lead to a "Web 3.0."
By John Borland
Last year, Eric Miller, an MIT-affiliated computer scientist, stood on a beach in southern France, watching the sun set, studying a document he'd printed earlier that afternoon. A March rain had begun to fall, and the ink was beginning to smear....

Original Letter Text (before edit by magazine)

We read with great interest John Borland's March/April 2007 article "A Smarter
Web." We agree that this is an exciting time in the development of the semantic
web (or Web 3.0), and that it is on the cusp of more widespread acceptance and
use. A problem with the semantic web, however, is that it is not as flexible as
the free-text publishing supported by the standard web. As the article noted,
users must learn the semantic web's ontology languages and structure their
information and data using them. This presents a learning curve to users, acting
to inhibit the growth and spread of semantic web data. It is thus arguable
whether the semantic web can approach the huge size of the standard web where
almost anyone can easily create and publish web pages. The standard web will
likely still be the primary web most users see and use for the foreseeable
future, while the semantic web could remain a niche.

We thus feel that a practical direction is to investigate ways that the semantic
web and standard web can work together and leverage each other in a kind of
symbiosis. Keyword-based web search à la Google is the primary way of mining the
web for information today, but it only provides coarse-grained topical access to
documents and there are many kinds of information requests it cannot handle. For
example, queries that combine general relational information (such as provided
by the semantic web) about pages with keyword-based searches are not supported.
Furthermore, one wants to be able to develop ways of leveraging small amounts of
highly structured information (as in the semantic web) as "training sets" to
better enable querying and clustering of the large bodies of unstructured, free
text information on the web; i.e. the small amount of highly structured
information could be used to bootstrap the automated organization, in support of
better querying, of the much larger unstructured information through data
mining. Since searching is widely perceived to be a crucial web application, the
semantic web's ability to improve it could be of high practical value and an
important driving force to help more fully realize the vision of the semantic
web. An important part of Web 3.0 should thus be to enumerate the kinds of
information requests that could be fruitfully made, and the kinds of information
infrastructure and data mining techniques needed to fulfill them. Finally, there
is much activity and excitement within biological research towards the goal of
truly large-scale integration and interoperation of its vast data, e.g. to aid
in more efficient drug discovery. The life sciences could thus be a useful
guide, test case, and driving force for Web 3.0.

27 June 2007

Cycling in Southern Westchester (BikeSW)

Actuals
Riverdale into Southern Westchester, following Bronx River trailway after Bronxville
Route taken on Google Maps (kml) (Streetview works for part of the route, showing some of the Bronx River Parkway Trailway.)
With elevation information, which is too big for Google Maps but works with Earth (kml, kmz)
Summary Data: 23.7 miles, most of the day with stops

Planning
Rough google map route of lower (aborted) part of trip.
Nearby places on this route created with Google's My Maps: early and late.
Some useful links.

25 June 2007

Cycling in Northern Westchester (BikeNW)

Actuals
Route taken on Google Maps (kml, kmz)
Bottom Southern and Northern Westchester Trails to Yorktown Heights and back with detour through Tarrytown (on old aqueduct trail)
Summary Data (apx. 9:30AM-6:10PM, with stops, 56 miles, uncorrected)

Planning
Link collections relevant to planning the trip: http://del.icio.us/mbgmbg/FunBikingNW

24 June 2007

Hamptons Cycling (BikeHamptonsBM)

Actuals
Route taken on Google Maps (kml)
Bridgehampton to Montauk and back.
Summary Data (apx. 8AM-4PM, with stops, 53 miles)

Total Time (h:m:s) 8:11:17 9:09 pace
Moving Time (h:m:s) 5:22:45 6:01 pace
Distance (mi ) 53.56
Moving Speed (mph) 10.0 avg. 23.2 max.
Temperature (°F) 69.1°F avg. 69.8°F high
Wind Speed ( mph) SSW 12.4 avg. SSW 13.8 max.
(51 miles without correction)

Planning
A useful reference map from ResortMaps.com
A suggested route plan and some places to stop (from Google Map's My Map)

09 June 2007

tennis courts near CU & GWB

For reference:
For tennis courts near GWB, access appears to be only via nearby bridges over the tracks, with the best entry point at 165th & Riverside.
For tennis courts near CU, best entry point is near 120th and Riverside.


http://maps.google.com/maps?f=q&hl=en&q=W+165th+St+%26+Riverside+Dr,+New+York,+NY+10032&ie=UTF8&om=1&z=19&ll=40.847148,-73.945726&spn=0.001781,0.004227&t=h
http://maps.google.com/maps?f=q&hl=en&q=W+120th+Street+%26+Riverside+Dr,+New+York,+NY&ie=UTF8&z=18&ll=40.81132,-73.96394&spn=0.003557,0.00685&t=h&om=1

02 June 2007

The Drive for the $1000 Genome -- BioITWorld

Interesting overview of the high-throughput sequencing technologies, with quotes below from 454 people and on the cost of storing a Solexa run's data, estimated at ~$15K/run.


http://www.bio-itworld.com/issues/2007/may/cover-story
http://www.bio-itworld.com/issues/2007/may/cover-story-sidebar1/
http://www.bio-itworld.com/issues/2007/may/cover-story-sidebar2/
The Drive for the $1000 Genome
By Kevin Davies
May 15, 2007 | J. Craig Venter recently made his Comedy Central debut on The Colbert Report. Asked by host Stephen Colbert "What makes you think you can do a better job with life and genetics than God?" Venter shot back: "We have computers!" rendering Colbert momentarily (and uncharacteristically) speechless.....
SOLiD Storage
A potential downside of the SOLiD setup is the premium it puts on compute power and storage. The complete SOLiD system including compute and workflow pieces, could push the price above $600,000....
"No-one stores the [Sanger] images when they're small, so who's going to store them when they're large? So we want to get you past the images and into analyzing the data," says Rhodes.
By contrast, a typical 454 FLX run produces a paltry 13 GB of raw image data. "After data extraction, namely base calling, we're at a final of just less than 20 GB in total. That's actually quite manageable, especially nowadays with 500-GB hard drives," says Harkins. "We're looking to compress that down so potentially you could burn a DVD for one drive. You could store an instrument run for a few dollars."
Harkins notes that other next-generation sequencing platforms are talking about terabytes per run. "We're talking about pushing the science, but these other companies have a dilemma. It could cost more in computer hardware than reagents for an instrument run," says Harkins.
While Illumina's Smith agrees that, "The really big data is in the images," Illumina offers customers the opportunity to store all of their images, "because there will be people who want to do that. The issue is you get into hundreds of GB or even 1 TB [per run]." And that will only increase in the future. "The customer may decide to store a subset of the images for quality control purposes, or store images for a particularly important run and archive them to a tape backup."
The question for the market, Harkins reckons, is: Do you want to save your raw data? 454 allows users to re-evaluate their raw data. "We had one customer who re-processed his raw data using the updated GS FLX software and is seeing improvements," says Harkins. "When you're talking about 1-2% error down to 0.5%, that leads to tangible improvements for downstream analysis."
But Rhodes dismisses such criticism. With an instrument potentially pumping out 4 Megabases each second over three days, "People don't need the images, they need the data. What you really want is the result," says Rhodes.
During a panel discussion at CHI's Next Generation Sequencing conference, Rhodes said: "Back of the envelope calculations say that if you wanted to store the raw image data, it's 6 TB a week... that could require you to spend $1 million on storage, backup, and stuff. So unless you think you're going to want to go back to every image, it's cheaper to do the experiment again." Rhodes can see certain situations for storing images, say for a precious cDNA clone. "But as a routine workaday measure, no."
"Once you've got to that stage, you still have a large dataset — if you're going to generate 1 billion bases per run, you've got to have quite a lot of bytes as well as bases," says Smith. "But you're no longer in the TB of data, you're back down in the 100 GB or so. So you can reduce the data quantity by not storing the images." Smith says many customers already have the necessary compute infrastructure. ...
But Harkins says the market hasn't come to terms with the dilemma of paying $10,000-20,000 to save a single instrument run's data. "That's going to put the market into a bind," he says. "Throwing raw data away is a paradigm shift I don't think people are ready for yet."

28 May 2007

Tetratops

Tetratops in the New York Times:
PATENTS
Geodesic Spinning Tops; Church Playhouse; Vacuum for Tiny Toys
By TERESA RIORDAN

When Kurt Przybilla was growing up in International Falls, Minn., he and his three sisters looked forward every summer to the day their father would put up the geodesic jungle gym in their back yard....
When one imagines the centers of the spheres as dots connected by lines, a cluster of four balls describes a tetrahedron, a three-dimensional shape with four triangular faces. Przybilla's tops also come in clusters of six (an octahedron, with eight triangular faces when the imaginary dots are connected), 12 (an icosahedron, with 20 faces) and 13 (a cube octahedron, which despite its extra ball has only 14 faces).

16 May 2007

Letter in response to "A Digital Life" -- Sci. Am.

Here's the final text of a letter I wrote in response to the article below (the letter was never published). I felt the letter and the article together give a compelling vision of an information-rich future where data mining will be all-important.

I read with great interest Gordon Bell's and Jim Gemmell's recent article in
Scientific American about the MyLifeBits project. The concept of recording
all the events in a person's life into a digital lifestream is fascinating. The
logical complement of such a lifestream would be the personal genome. Coupling
a person's genome, his molecular blueprint, with the lifecourse he has taken would
potentially enable us to address one of the major questions in genetics: how
genes and the environment interrelate, or put more simply, the relationship
between nature and nurture. As the article points out, privacy is an essential
aspect of this discussion. However, one nuance that wasn't raised is the idea
that revealing personal information -- be it from your genome or your
"lifestream" -- potentially compromises not only your privacy but also that of
your friends and relatives. For example, an individual could consent to posting
his genome on the web, but what about his parents and children? Or what about a
day's worth of your videostream: did all the people that crossed your
path consent? Surely, the law needs to be revised to address these
important concerns.


http://research.microsoft.com/barc/mediapresence/MyLifeBits.aspx
http://www.sciam.com/article.cfm?chanID=sa006&colID=1&articleID=CC50D7BF-E7F2-99DF-34DA5FF0B0A22B50
FEATURE ARTICLES
March 2007 issue
INFORMATION TECHNOLOGY
A Digital Life
New systems may allow people to record everything they see and hear--and even
things they cannot sense--and to store all these data in a personal digital archive
By Gordon Bell and Jim Gemmell

Human memory can be maddeningly elusive. We stumble upon its limitations every
day, when we forget a friend's telephone number, the name of a business contact
or the title of a favorite book. People have developed a variety of strategies
for combating forgetfulness--messages scribbled on Post-it notes, for example,
or electronic address books carried in handheld devices--but important
information continues to slip through the cracks. Recently, however, our team at
Microsoft Research has begun a quest to digitally chronicle every aspect of a
person's life, starting with one of our own lives (Bell's). For the past six
years, we have attempted to record all of Bell's communications with other
people and machines, as well as the images he sees, the sounds he hears and the
Web sites he visits--storing everything in a personal digital archive that is
both searchable and secure....
[L2E]

Letter in response to "Friendster for Proteins" -- Forbes

Here's the final text of a letter that Philip Kim and I wrote in response to the article below (the letter was never published):
We felt that your article "Friendster for Proteins" (Mar 12th) overlooked the
predominant type of systems biology currently practiced in science. While
there is a growing effort in the type of bottom-up modeling described in your
article, the focus thus far has been on top-down analysis of large-scale
networks. At this point in time, our understanding of biological systems is too
tenuous to accurately simulate cellular processes -- and the recent failures at
Airbus suggest that accurate simulation is difficult to achieve even in
engineering. Current research focuses mostly on global properties of networks
and analyzing them on a more abstract level. Many new biological insights have
been gained from this type of analysis and many advances in understanding
protein function, as well as identifying new drug targets and cancer genes, have
been made in this field.


http://members.forbes.com/forbes/2007/0312/072.html
Forbes Mar 12, 2007
Friendster for Proteins - Robert Langreth & Matthew Herper
Understanding how the body's tiny components communicate is opening up vast
territory in drug research. Peter Sorger spent eight years developing new
laboratory gadgets and arcane mathematical theorems to explain....
[L2E]

30 March 2007

Hiking in Sleeping Giant Park

Route on Google Maps
Extra Data on the route: Just Climbing Down [kml], ERockClimbing-HighPoints.jpg, DipInMiddle-After1stHighPt.jpg, Overview.jpg

Stats on the Hike
Spreadsheet
HeartRate-Elev-Profile.jpg


            Absolute-from-start             delta-vs-previous-step
            Time     Dist (mi)  Elev (ft)   Time     Dist (mi)  Elev (ft)   Lat. (dd)  Long. (dd)
Start       0:00:00  0          89                                          41.4216    -72.89847
1st-top     0:33:59  1.44       713         0:33:59  1.44       624         41.42608   -72.89899
Middle-dip  0:57:21  2.05       467         0:23:22  0.61       -246        41.42832   -72.89726
2nd-top     1:21:16  3.03       814         0:23:55  0.98       347         41.43053   -72.89052
Leave-top   1:23:55  3.13       707                                         41.43044   -72.89068
End         1:48:57  4.62       182         0:25:02  1.49       -525        41.42137   -72.89822
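
The delta columns in the stats above are just differences between consecutive rows of the absolute columns, so they can be recomputed with a few lines of Python. This is a minimal sketch rather than the spreadsheet's actual formulas; the names WAYPOINTS and step_deltas are made up for illustration, and the waypoint values are copied from the table. Note that it also fills in the Leave-top step, which the table leaves blank.

```python
# (label, elapsed time h:mm:ss, distance in miles, elevation in feet)
WAYPOINTS = [
    ("Start", "0:00:00", 0.0, 89),
    ("1st-top", "0:33:59", 1.44, 713),
    ("Middle-dip", "0:57:21", 2.05, 467),
    ("2nd-top", "1:21:16", 3.03, 814),
    ("Leave-top", "1:23:55", 3.13, 707),
    ("End", "1:48:57", 4.62, 182),
]


def to_seconds(hms):
    """Convert 'h:mm:ss' into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds):
    """Format a number of seconds as 'h:mm:ss'."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def step_deltas(waypoints):
    """Return (label, time, miles, feet) differences versus the previous row."""
    rows = []
    for previous, current in zip(waypoints, waypoints[1:]):
        elapsed = to_seconds(current[1]) - to_seconds(previous[1])
        miles = round(current[2] - previous[2], 2)
        feet = current[3] - previous[3]
        rows.append((current[0], to_hms(elapsed), miles, feet))
    return rows


if __name__ == "__main__":
    for label, elapsed, miles, feet in step_deltas(WAYPOINTS):
        print(f"{label:<11} {elapsed}  {miles:5.2f} mi  {feet:+5d} ft")
```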

External Links
http://www.sgpa.org [http://www.sgpa.org/colormap.pdf , Map Detail ]
http://www.ct.gov/dep/cwp/view.asp?A=2716&Q=325264

28 March 2007

American Artist Rankings

Type          Name                RS  MG  Link Wiki  Link Y!  ??
artist        Edward Hopper       5   1   =          948000
artist        Fredrick Church     9   2   =          820000
artist        Frank Stella        -   3   =          829000
artist        Sargent             4   4   =          597000
artist        Bierstadt           -   5   =          111000
artist        Pollack             -   6   =          383000
artist        Jasper Johns        6   7   =          567000
artist        Rauschenberg        14  8   =          247000
artist        Inness              -   9   =          54800
artist        Thomas Cole         8   10  =          2890000
artist        Eakins              6   11  =          134000
artist        Homer               10  12  =          1460000
artist        Wyeth               -   13  =          323000
artist        Rothko              7   14  =          364000
artist        O'Keefe             15  15  =          371000
artist        Copley              16  16  =          314000
artist        James Whistler      -   17  =          279000
artist        Cassatt             2   -   =          294000
artist        Hassam              3   -   =          101000
artist        T Robinson          11  -   =          1780000  ?
artist        Milton Avery        12  -   =          265000
artist        C Durand            13  -   =          164000   ?

architect     Frank Lloyd Wright  6   1   =          605000
architect     P Johnson               2   =          1170000
architect     Gehry                   3   =          429000
sculptor      Calder              1   4   =          70500
architect     Venturi                 5   =          89500
photographer  Walker Evans            6   =          293000
photographer  Dorothea Lange          7   =          167000
sculptor      Noguchi                 8   =          47400
architect     Saarinen                9   =          110000
architect     Gordon Bunshaft         10  =          27800
photographer  Leibowitz           2   -   =          36200
photographer  Richard Avedon      3   -   =          701000
photographer  Stieglitz           4   -   =          97500
other         Louis Tiffany       5   -   =          3060000
photographer  Gordon Parks        7   -   =          438000


http://gerstein.info/gps/artist-rankings.xls