<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-12538363</id><updated>2011-04-21T14:03:11.988-07:00</updated><title type='text'>HELYXZION</title><subtitle type='html'>THE LANGUAGE OF DNA, RNA, AMINO ACIDS, PROTEINS AND NANOTECHNOLOGY</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>40</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-12538363.post-114615628795587719</id><published>2006-04-27T09:43:00.000-07:00</published><updated>2006-04-27T09:44:48.000-07:00</updated><title type='text'>IBM DISCOVERS WHAT HELYXZION HAS KNOWN FOR 20 YEARS!</title><content type='html'>04/25/06 -- IBM today announced its researchers have discovered numerous DNA patterns shared by areas of the human genome that were thought to have little or no influence on its function and those areas that do.&lt;br /&gt;As reported today 
in the Proceedings of the National Academy of Sciences (PNAS), regions of the human genome that were assumed to largely contain evolutionary leftovers (called "junk DNA") may actually hold significant clues that can add to scientists' understanding of cellular processes. IBM researchers have discovered that these regions contain numerous, short DNA "motifs," or repeating sequence fragments, which also are present in the parts of the genome that give rise to proteins.&lt;br /&gt;If verified experimentally, the discovery suggests a potential connection between these coding and non-coding parts of the human genome that could have a profound impact on genomic research and provide important insights on the workings of cells.&lt;br /&gt;"Our goal is to apply advanced computational techniques to analyze the workings of processes and systems, in this case the function of the human genome," said Ajay Royyuru, head of the Computational Biology Center at IBM Research. "Using these tools, we've been able to shed new light on parts of the DNA that were traditionally thought of as not having a specific purpose. We believe the innovative application of technology can provide further understanding in the life sciences at large."&lt;br /&gt;The IBM team used a mathematical tool called pattern-discovery, often applied to mine useful information from very large repositories of data in both business and scientific applications, to sift through the approximately six billion letters (oh, sorry again: were we all mistaken about the human genome having 3.1 billion base pairs?) in the non-coding regions of the human genome and look for repeating sequence fragments, or motifs.&lt;br /&gt;Among the millions of discovered motifs, the team identified approximately 128,000 that also occur in the coding region of the genome and are significantly over-represented in genes involved in specific biological processes such as cell communication, regulation of transcription, transport and others. 
In fact, copies of one or more of these motifs can be found in over 90 percent of all known human gene sequences, as well as some genes of other animals where they associate with similar biological processes.&lt;br /&gt;Yes, leave it to IBM: a corporate giant with the largest array of supercomputers in the world DISCOVERS WHAT HELYXZION HAS KNOWN FOR 20 YEARS! IT’S NOT JUNK! What IBM refers to was known 20 years ago, by Helyxzion and a few lone researchers, to be an important part of DNA's form and function. The “myth” of junk DNA was conjured up by proteomics wizards to explain the fact that some 97% of DNA did not code for protein. This was the junk!&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;SORRY, IBM, BUT HELYXZION HAS BEEN DOING THAT FOR 6 YEARS, WITHOUT THE NEED OF SUPERCOMPUTERS AND 3,000 SCIENTISTS, ON A SINGLE DESKTOP COMPUTER AND THE “ANVIL” VIEWER!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-114615628795587719?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/114615628795587719/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=114615628795587719' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114615628795587719'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114615628795587719'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2006/04/ibm-discovers-what-helyxzion-has-known.html' title='IBM DISCOVERS WHAT HELYXZION HAS KNOWN FOR 20 YEARS!'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' 
width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-114434815784160862</id><published>2006-04-06T11:28:00.000-07:00</published><updated>2006-04-06T11:29:17.873-07:00</updated><title type='text'>news from the genetic front</title><content type='html'>Helyxzion, LLC is pleased to join VDDI, Inc., Telomolecular Corp., and other industry partners in this critical effort to convert scientific breakthroughs into practical therapies. “Combining virtual drug discovery with the best genomics platforms, along with improvements in large-molecule drug delivery and custom protein design, opens up new doors of possibility in the treatment of genetic disorders and disease,” said Telomolecular CEO Matthew A. Sarad. “Simplified access to high-performance computing is a key component of the initiative to develop personalized therapies,” said Dr. Charles Stevens, CEO of Helyxzion. “As genomic analysis becomes economically feasible at a personal level, we see opportunities to bring supercomputing performance to every hospital, every clinic, and every doctor's office. The Helyxzion Consortium can help accelerate these opportunities, bringing real benefits to millions of patients.”&lt;br /&gt;&lt;br /&gt;Helyxzion, along with the other Consortium members, is joining forces to improve the speed of drug discovery, gene sequencing and other compute-intensive life-saving therapies to ultimately improve the lives of consumers and give healthcare providers the ability to offer personalized therapies to their patients. “This principle of matching a genomic-analysis-based therapeutic entity to a companion genomically based diagnostic ensures that the right drug is used for the right patient, at the right time, for the right cost,” said Dr. Stephen Porter (VDDI, Inc.), platform strategy advisor to Helyxzion. This is true Theranostics. 
The term itself encompasses many areas such as predictive medicine, personalized medicine, integrated medicine, pharmacodiagnostics and Dx/Rx partnering. Theranostic applications offer a tighter clinical fit between genomic and genetic profiling, medical diagnostics and therapeutic drug treatment. Helyxzion's flagship distributed computing product, ANVIL, is in use at leading biotech companies in North America, Europe and Asia where the foundations of personalized therapies are being developed today.&lt;br /&gt; “The ability to ‘scale out’ processing is increasingly important for our customers who use computational techniques for life science discovery,” said Dr. Stephen Porter, platform strategy advisor to Helyxzion. “We are glad to be a Helyxzion Consortium partner, as a member of the Consortium and with its expertise in using the ANVIL technology environment to enable rapid application development for these scenarios.”&lt;br /&gt;In addition, the Consortium has begun to capture and annotate data to address proof-of-concept demonstrations in the development of an “anti virus”, which will have a broad-spectrum ability to fight many types of virus. As the project progresses, the Consortium will be sharing the code that is developed as well as information on the implementation to help its member companies build applications faster and allow independent biotech companies an affordable distributed computing solution that dramatically improves the speed and performance of demanding real-world biomedical and nanotechnology applications. It is radically easier to buy and use than any other genomic-computing solution. 
Using ANVIL, researchers can easily make discoveries and develop applications that will take advantage of the High Performance Computing power of thousands of computers.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-114434815784160862?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/114434815784160862/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=114434815784160862' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114434815784160862'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114434815784160862'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2006/04/news-from-genetic-front.html' title='news from the genetic front'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-114342114294381553</id><published>2006-03-26T16:57:00.000-08:00</published><updated>2006-08-02T10:22:28.300-07:00</updated><title type='text'>as the future catches you</title><content type='html'>QUOTE BY: Dr. Juan Enriquez&lt;br /&gt;in his book "AS THE FUTURE CATCHES YOU"&lt;br /&gt;&lt;br /&gt;THE DOMINANT LANGUAGE... 
AND ECONOMIC DRIVER.....OF THIS CENTURY....IS GOING TO BE.....GENETICS.&lt;br /&gt;Those who remain illiterate in this language won't understand the force making the single biggest difference in their lives.&lt;br /&gt;&lt;br /&gt;Many countries and companies&lt;br /&gt;just&lt;br /&gt;don't&lt;br /&gt;get it.&lt;br /&gt;&lt;br /&gt;WE ARE BEGINNING TO ACQUIRE THE DIRECT AND DELIBERATE CONTROL OVER THE EVOLUTION OF ALL LIFE FORMS ON THE PLANET.&lt;br /&gt;&lt;br /&gt;YOUR FUTURE,&lt;br /&gt;THAT OF YOUR CHILDREN,&lt;br /&gt;AND THAT OF YOUR COUNTRY DEPEND ON.....&lt;br /&gt;UNDERSTANDING A GLOBAL ECONOMY DRIVEN BY TECHNOLOGY&lt;br /&gt;UNDERSTANDING CODE, PARTICULARLY GENETIC CODE, IS TODAY'S MOST&lt;br /&gt;POWERFUL TECHNOLOGY&lt;br /&gt;&lt;div align="center"&gt;HELYXZION&lt;br /&gt;IS THE MATHEMATICAL, ALGORITHMIC, DIGITAL LANGUAGE OF DNA, RNA, AMINO ACIDS, AND PROTEINS, AND THE PROGRAMMING LANGUAGE OF LIFE AND NANOTECHNOLOGY.&lt;br /&gt;&lt;br /&gt;..ARE ABOUT TO CHANGE...&lt;br /&gt;AGAIN.&lt;br /&gt;THE TWO NUCLEOTIDE BASE PAIRS THAT CODE ALL LIFE….A-T-C-G HAVE ALREADY LED SOME OF THE WORLD’S LARGEST COMPANIES….TO DECLARE THAT THEIR FUTURE LIES IN THE LIFE SCIENCES.&lt;br /&gt;NO COUNTRY OR COMPANY WILL SUCCEED….. 
LET ALONE EXCEL IN THIS FIELD WITHOUT UNDERSTANDING DNA AS A “LANGUAGE”.&lt;br /&gt;HELYXZION IS THE ONLY COMPANY TO DEVELOP A TECHNOLOGY TO DO THIS… IF YOU DON’T BUY IT, LEARN IT AND USE IT, YOU WILL NOT ONLY LOSE IT……YOU'LL NEVER MAKE IT..&lt;/div&gt;&lt;div align="center"&gt;Societies and peoples who understand the genetic alphabet...&lt;br /&gt;ARE LIKELY TO LIVE LONGER....&lt;br /&gt;AND GET RICHER.&lt;br /&gt;But most societies do not understand genetic discovery...&lt;br /&gt;Or the challenges that arise from these discoveries...&lt;br /&gt;And that makes them, for all practical purposes.....&lt;br /&gt;Functionally illiterate......&lt;br /&gt;In the language that codes all life on this planet.&lt;br /&gt;DON'T BE ILLITERATE..&lt;br /&gt;MORE IMPORTANT: DON’T LET YOUR CHILDREN BE ILLITERATE…..&lt;br /&gt;GIVE THEM "HELYXZION THE LANGUAGE OF DNA"&lt;br /&gt;GIVE THEM POWER.... WEALTH.... HEALTH.....&lt;br /&gt;THEIR FUTURE DEPENDS ON UNDERSTANDING IT! &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-114342114294381553?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/114342114294381553/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=114342114294381553' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114342114294381553'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114342114294381553'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2006/03/as-future-catches-you_26.html' title='as the future catches you'/><author><name>Dr.Charles 
Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-114243901460361598</id><published>2006-03-15T08:08:00.000-08:00</published><updated>2006-03-15T08:10:14.646-08:00</updated><title type='text'>MRI_DNA</title><content type='html'>MRI/DNA Helyxzion recognizes the market demand and meets the challenge of MRI/DNA imaging diagnosis. Novel imaging protocols are paired with contrast enhancing probes and MAXfel Laser optical excitation to report gene expression and hopefully sequence whole genomes.&lt;br /&gt; By Charles Stevens&lt;br /&gt;Current MRI signals originate from the water molecules in the body. The observed signals rely on the difference of the water environments in the objectives. When such a difference is not obvious, MRI will fail. In highly demanding areas, such as tumor detection and brain imaging, MRI cannot give any definite information because the differences of water environments are not related to any physiological and pathological conditions. New MRI signaling molecules are urgently needed.&lt;br /&gt;&lt;br /&gt;Helyxzion is developing a new set of magnetic signaling molecules for magnetic resonance imaging (MRI) sequencing diagnostic studies of DNA and a large-molecule delivery system for genetic therapy.&lt;br /&gt;Helyxzion will develop new sets of signaling molecules for MRI and MRI/DNA Magnetic Resonance Sequencing, Diagnostics and Detection. The potential applications of the new signaling molecules include early tumor detection, metabolic imaging, and body imaging. &lt;br /&gt;Helyxzion technology is based on the understanding that disease is a genetic process and that a new model for radiology, called molecular imaging, is required for genetic therapies. 
The methods employed are so sensitive that diseases measured on the genetic level can be detected and corrected before the patient is aware of their symptoms. While conventional medicine of the 20th century treated the effects of disease, genetic medicine in the 21st century, with its complementary DNA sequence imaging techniques, will treat its causes.&lt;br /&gt;DNA imaging (sequencing) answers key clinical questions associated with gene therapies. Through it, clinicians can determine gene defects in vitro and determine in vivo whether gene-altering therapies have reached their cellular targets. It reveals the anatomic region where the introduced genes are expressed as well as the onset, magnitude, and duration of expression. Many new molecular therapies are cytostatic, rather than cytotoxic. This means the therapies inhibit cell growth, but they don't kill cells outright, so radiology's mainstay measures of tumor location, diameter, and volume may no longer provide an accurate reading of a patient's condition. The traditional standards for drug dosing are rendered obsolete because genetically targeted therapies may be largely free of side effects. Instead of monitoring physical symptoms, clinicians will use DNA imaging to identify the genetic defect and monitor the successful insertion of the corrective gene.&lt;br /&gt;MRI/DNA sequencing and 3D imaging would be no more than a fanciful concept without the Helyxzion ANVIL technology to express DNA data graphically, with 3D imaging of the expressed genome as a digital image of the individual sequenced. It also builds on the Human Genome Project, completed in 2003, which catalogued the sequence of the 30,000 genes that make up the human genome. 
In addition to supplying researchers with the basic ingredients for genetic medicine and imaging, the project created methods and instruments that have accelerated the pace of discovery.&lt;br /&gt;Although DNA sequence imaging may seem exotic, the principles guiding it are familiar to anyone who has practiced nuclear medicine. Many molecular imaging protocols use a radioisotope as a tracer, but even techniques that employ optical or MR imaging modalities rely on some pharmacological means to track the pharmacokinetic properties of the molecular therapies with which they are paired.&lt;br /&gt;DNA imaging differs from conventional techniques: its designers must overcome cell membrane barriers to deliver the imaging probes to their DNA molecular target, so minimizing the size and concentration of probe molecules is essential. Typical target concentrations are on pico- or millimolar levels. Because so few molecules are involved, novel strategies must be created to amplify (LASER) the probe's signal to the point that it can be detected.&lt;br /&gt;The best approach to whole-genome imaging is to amplify DNA intracellularly and combine it with green fluorescent protein and ex vivo techniques to validate uptake. Researchers have various options when they examine the tissue culture in a petri dish. Fluorescence indicating gene expression ex vivo can be performed before attempting the more costly and technically demanding in vivo imaging techniques.&lt;br /&gt;One option is a radiolabeled probe that is selectively phosphorylated in much the same way that fluorodeoxyglucose is phosphorylated by hexokinase. From that starting point, strategies can be devised to examine DNA in the most straightforward way.&lt;br /&gt;Monocrystalline iron oxide nanoparticles (MION), an MR contrast agent, are a leading candidate for this role. Each particle comprises about 2000 atoms of iron in a crystalline form wrapped with dextran. 
When the dextran is cross-linked, the particles are called CLIONs (cross-linked iron oxide nanoparticles).&lt;br /&gt;Although this contrast-enhancing behavior can be used to improve the efficacy of MRI for cancer diagnosis, the MGH team plans to report gene expression with the aid of the agent. It has developed vectors that position a therapeutic gene and a transferrin receptor gene side by side. The expression of the transferrin receptor gene product therefore is a surrogate measure for the expression of the therapeutic gene product.&lt;br /&gt;While the MION protocol is an example of a surface receptor encoding, researchers at the California Institute of Technology are developing MRI contrast agents that become activated at an intracellular level. EgadMe is the most advanced agent thus far developed in this class of selectively activated agents, said Dr. Thomas J. Meade, senior research associate in biology. It consists of a chelator that occupies eight of the nine coordination sites on a gadolinium contrast ion. A galactopyranose residue caps off the remaining coordination site on the gadolinium ion. In this water-inaccessible configuration, the contrast agent is "inactive," meaning it does not affect the T1 times of MRI images.&lt;br /&gt;MRI/DNA Imaging Approaches with Laser Optical Enhancement&lt;br /&gt;Laser optical imaging will also become an important DNA imaging modality, in part because of new activatable fluorescent contrast agents. Laser optical imaging uses light waves in a manner similar to x-rays at much higher frequencies. Extremely high spatial resolution is possible with laser optical imaging, but the clinical value of enhanced coherent laser has been limited to transparent objects or opaque tissue thinner than 150 mm. 
A new generation of fluorescent contrast agents is extending its reach into DNA imaging. In certain chemical configurations, these agents have no effect on the viewed fluorescent image. The fluorochromes are unbound after specific enzymatic interaction, however, and these encounters cause them to glow. In some cases, the emission of photons boosts the contrast 1000-fold. This enables target detection with near-infrared fluorescence imaging down to the 10&lt;sup&gt;-8&lt;/sup&gt; molar concentration level.&lt;br /&gt;The Future&lt;br /&gt;What does this all mean for radiology in 2010? New diagnostic procedures and agents will help identify either the genotype or phenotype of abnormalities in vivo; combined with in vitro DNA sequencing, this will make comprehensive cancer diagnosis possible. Breast adenocarcinoma, for example, is thought to be at least two different diseases. It is a safe bet that by 2010, radiologists will be determining optimal therapy by using in vivo LMI and in vitro DNA imaging to identify these unique genetic profiles.&lt;br /&gt;Activatable MR agents will play a major role. Researchers at Telomolecular Corp. have an FDA-approved large-molecule delivery system to penetrate cell membranes and to deal with the high mass levels of probes required to produce sufficient signal for DNA imaging.&lt;br /&gt;Laser-enhanced MRI imaging will become prominent because it provides benefits beyond the capabilities of other imaging technologies. This will be the classic case of early diagnosis with DNA imaging, perhaps before morphologic or clinical phenotypic signs of disease can be seen.&lt;br /&gt;Helyxzion predicts that some of today's mainstream applications will appear quaint 10 years from now. At that time, no one will recommend serial scans separated by three-month intervals to monitor the efficacy of chemotherapy based on the size of a tumor. 
Instead, Helyxzion foresees that a genetic profile of the patient will be generated and an individualized therapeutic plan will be formulated from it.&lt;br /&gt;Serendipity makes accurate prediction difficult, and random events interfere with well-intentioned forecasting. It is clear, however, that the first DNA imaging to obtain FDA approval for clinical use will be with laser-enhanced MRI.&lt;br /&gt;According to Helyxzion, many enabling technologies are contributing to DNA imaging research. Nano-device engineering and improved data processing are making their mark, but the ability to 3D image the sequencing data will prove to be by far the most important.&lt;br /&gt;Helyxzion is seeking business partners to strengthen the company and its manufacturing capability, and to further develop the MRI technology into clinical applications. The expected investment is between two million and five million US dollars. Helyxzion expects to secure intellectual property and patents in the near future.&lt;br /&gt;&lt;br /&gt;Other work being done: Apr 18, 2005&lt;br /&gt;Development of Sequencing Technology&lt;br /&gt;Two grafs on the development of sequencing technology from a recent article in &lt;a href="http://www.biosciencetechnology.com/"&gt;Bioscience Technology&lt;/a&gt;.&lt;br /&gt;1) Sequencing technology is "frozen in time", still searching for a breakthrough:&lt;br /&gt;Progress in gene sequencing has arisen more from improved methods than ground-breaking instrumentation. Glenn Schulman, PharmD, marketing manager at 454 Life Sciences (New Haven, CT) points out that gene sequencing technology has become frozen in time circa 2000. “Things pretty much stopped with capillary electrophoresis-based instrumentation,” he says. 
“There have been incremental improvements, but nothing truly enabling.”&lt;br /&gt;2) Logarithmic Scaling of &lt;a href="http://www.454.com/"&gt;454&lt;/a&gt;'s sequencing-by-synthesis technology (see &lt;a href="http://www.454.com/pages/454%20Technology%20Updated%20Feb_2_2005/Website_update_2_01_05_files/slide0123.htm"&gt;image&lt;/a&gt;)&lt;br /&gt;454’s progress has been phenomenal since it reported its first results, on about 25 base pairs, in late 2001. Since then scale-up has been logarithmic: 33 kbp in 2002, 2.8 Mbp in 2003, and about 20 million bp today (about the size of a bacterial genome) in a 4.5 hour run. Dr. Schulman sees no end in sight to Moore’s Law-type scaling, which could result in sequencing a whole human genome (30 Gbp) in a matter of days or hours.&lt;br /&gt;AGOWA GmbH is expanding its range of technological facilities in the area of high-throughput DNA analysis with the implementation of the ABI PRISM® 3730 xl. This latest innovation in DNA analyzers is distinguished by shorter runtimes, longer read lengths and increased throughput, and also provides excellent data quality. AGOWA has many years of experience and a proven excellent reputation for large-scale sequencing, as demonstrated by its participation in large-scale national and international sequencing projects. The sequencing service provided by AGOWA combines its experience and state-of-the-art technology with the aim of rapidly delivering the best quality to clients at favourable prices. 
AGOWA is a competent outsourcing partner for clients in industry and research and offers a broad range of services ranging from DNA libraries, automatic picking and spotting of clones, custom sequencing, bioinformatics down to complete genome analysis.&lt;br /&gt;&lt;br /&gt;Liquid-State NMR and Scalar Couplings in Microtesla Magnetic Fields&lt;br /&gt;We obtained nuclear magnetic resonance (NMR) spectra of liquids in fields of a few microtesla, using prepolarization in fields of a few millitesla and detection with a dc superconducting quantum interference device (SQUID). Because the sensitivity of the SQUID is frequency independent, we enhanced both signal-to-noise ratio and spectral resolution by detecting the NMR signal in extremely low magnetic fields, where the NMR lines become very narrow even for grossly inhomogeneous measurement fields. In the absence of chemical shifts, proton-phosphorus scalar (J) couplings have been detected, indicating the presence of specific covalent bonds. This observation opens the possibility for "pure J spectroscopy" as a diagnostic tool for the detection of molecules in low magnetic fields.&lt;br /&gt;&lt;a href="http://www.physorg.com/newman/gfx/news/IBM-MRI.jpg"&gt;Full size image&lt;/a&gt;&lt;br /&gt;IBM scientists have achieved a breakthrough in nanoscale magnetic resonance imaging (MRI) by directly detecting the faint magnetic signal from a single electron buried inside a solid sample. This achievement is a major milestone toward creating a microscope that can make three-dimensional images of molecules with atomic resolution. Success in this quest should have major impact on the study of materials -- ranging from proteins and pharmaceuticals to integrated circuits and industrial catalysts -- for which a detailed understanding of the atomic structure is essential. 
Knowing the exact location of specific atoms within tiny nanoelectronic structures, for example, would enhance designers' insight into their manufacture and performance. The ability to directly image the detailed atomic structure of proteins would aid the development of new drugs.&lt;br /&gt;"Throughout history, the ability to see matter more clearly has always enabled important new discoveries and insights," says Daniel Rugar, manager of nanoscale studies at IBM's Almaden Research Center in San Jose, California. "This new capability should ultimately lead to fundamental advancements in nanotechnology and biology." Rugar leads the team of scientists who for more than a decade have been making pioneering advancements in the nanoscale MRI method called magnetic resonance force microscopy (MRFM). His team has improved MRI sensitivity by some 10 million times compared to the medical MRI devices used to visualize organs in the human body. The improved sensitivity extends MRI into the nanometer realm. (A nanometer is a billionth of a meter, the length spanned by about 5-10 atoms.) IBM Research has a distinguished history in developing microscopes for nanoscale imaging and science. Gerd Binnig and Heinrich Rohrer of IBM's Zurich Research Laboratory received the 1986 Nobel Prize in Physics for their invention of the scanning tunneling microscope, which can image individual atoms on electrically conducting surfaces. Binnig later invented the atomic force microscope (AFM), which used the attraction between a cantilever and surface features on non-conducting surfaces. Scientists at IBM and elsewhere modified and extended the AFM design to image surface forces such as magnetism, friction and electrostatic attraction with nanometer resolution. MRFM combines concepts from both AFM and MRI to allow nanometer resolution of features up to 100 nanometers deep inside a sample. 
The IBM team of Rugar, John Mamin, Raffi Budakian and Benjamin Chui published its single-electron results in the July 15 issue of the scientific journal Nature. This research is funded in part by the Defense Advanced Research Projects Agency.&lt;br /&gt;Technical details&lt;br /&gt;The central feature of an MRFM is a microscopic silicon "microcantilever" that looks like a miniature diving board 1,000 times thinner than a human hair. It vibrates at a frequency of about 5,000 times a second, and attached to the cantilever tip is a tiny but powerful magnetic particle. Isolated ("unpaired") electrons and many atomic nuclei behave like tiny bar magnets. These fundamental units of magnetism are often called "spins." Just as two bar magnets can attract or repel each other, the MRFM’s magnetic tip is attracted or repelled by the spins in the sample. By tuning an oscillating high-frequency magnetic field to the natural precession frequency of the spin being imaged, its magnetic orientation flips back and forth as the cantilever vibrates. Although the magnetic force between the magnetic tip and the spin is exceedingly small (less than a millionth of a trillionth of a pound), the cantilever is so sensitive that the flipping of the spin causes a detectable change in the cantilever’s vibration frequency. While medical MRI looks at groups of at least 1 trillion proton spins, the IBM researchers have just detected the much fainter signal of a single electron spin. The researchers also demonstrated rudimentary (one-dimensional) imaging with 25-nanometer resolution, about 40 times better than the best conventional MRI-based microscopes. Rugar's future research is aimed at further improving the sensitivity, resolution and speed of the MRFM technique so it can detect single protons and other nuclei, such as carbon-13, that can be used to reveal molecular structures. (The magnetic signal of a single electron is about 600 times stronger than that of a single proton.) 
Applying MRFM to protein structures would be particularly far-reaching. The biological activity of a large protein molecule is determined by its intricately folded atomic configuration. But since such a structure is currently impossible to determine directly, scientists must use indirect methods such as the scattering of x-rays by crystallized proteins, or computer simulations. Advanced MRFMs may also be able to serve as detectors of quantum information in future spin-based quantum computers.&lt;br /&gt;&lt;br /&gt;Polymerases for Sequencing by Synthesis&lt;br /&gt;&lt;br /&gt;Significant enhancements in gene sequencing may be achieved through implementation of analysis instruments at the same dimensional scale as DNA, i.e., nanometers. Nanotechnology has recently provided the necessary tools to create such nanoinstruments and this proposal seeks to utilize these tools to fabricate a high-speed, low-cost gene sequencer. The gene sequencer is based on the nanopore approach and incorporates tunneling current electrodes to sense the individual nucleotides as they traverse the pore.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;DNA Sequencing Using Nanopores&lt;br /&gt;&lt;br /&gt;This project, as its R21 milestone, will deliver a combination of conical nanopores having read length dimensions slightly less than 1 nm, and nucleobase-modified DNA oligonucleotides, where the passage of the DNA through the nanopore proceeds with a time constant of 10-100 microseconds per nucleotide, and where the ion current through the nanopore, during the time when the DNA is in transit, varies detectably depending on the nucleotide that is in the pore at the time that the current is measured. This nanopore-modified DNA combination will form the core of an extremely inexpensive technology to generate long reads of DNA sequence at the single molecule level. 
The research will exploit a decade of experience in the Martin laboratory preparing nanopores and engineering their chemical context, and an equal experience in the Benner laboratory working with nucleic acid analogs, polymerases that accept them, and practical applications of the combination. As specific aims, we shall: (a) prepare the nanotubes; (b) attach chemical functionality to the nanotubes; (c) prepare nucleoside triphosphates carrying different sized polyether dendrimers attached at the 5-position (for pyrimidines) and the 7-position (for 7-deazapurines); (d) use these triphosphates to synthesize modified DNA molecules. The nanopores will then be physically characterized to determine their ion transport dynamics, and in conjunction with the modified oligonucleotides, to find a combination that meets the R21 milestone specifications. If this milestone is passed, the next period will be used to develop sequence specific and randomly targeted primers that incorporate DNA, PNA, and tags that exploit an artificial genetic alphabet, and to develop improved processes for generating conical nanopores in a form suitable for large scale application. These will then be targeted against specific sequences extracted from mammalian genomes.&lt;br /&gt;&lt;br /&gt;Polymerases for Sequencing by Synthesis&lt;br /&gt;This project, as its R21 milestone, will deliver Taq DNA polymerases that catalyze the template-directed addition of nucleoside triphosphates carrying large fluorescent groups attached to their 3'-ends. The fluorescent groups therefore both terminate transiently the growth of the oligonucleotide chain, and signal the nature of the nucleotide that was last added. These polymerase variants will form the core of a "cheap reagent" approach to the Sequencing by Synthesis (SbS) strategy. Gaining control over polymerase behavior is key for this approach to generate inexpensive genome-quality sequence data. 
The research will exploit a decade of experience in the Benner laboratory with nucleic acid analogs, polymerases that accept them, and practical application of the combination. The tactics assume that site-directed mutagenesis is generally site-directed damage, and therefore must be followed by directed evolution to obtain polymerase-substrate combinations that meet specifications. Here, directed evolution will be used to restore catalytic power and fidelity in polymerases that have been engineered to accept fluorescent tags. We shall: (a) synthesize nucleoside triphosphates that have fluorescent blocking groups; (b) use a directed evolution system in water-in-oil emulsions to select polymerases that accept the triphosphates efficiently and faithfully; (c) obtain polymerases that incorporate these to within 10% of the catalytic activity of native polymerases, and with specificity to better than one part in 10,000. The next phase of the project will be to develop a working prototype for a multiplexed sequencing-by-synthesis device using these polymerases. The Aims of that phase will be to: (d) optimize the fluorescent compound-cleavage chemistry-polymerase combination; (e) use an artificially expanded genetic information system (AEGIS), the artificial alphabet invented in the Benner group, to bin primer-template combinations for parallel sequencing; and (f) exploit 2D gels to develop an architecture for a prototype parallel sequencing instrument based on the technologies developed in Aims a-c.&lt;br /&gt;Bead-based Polony Sequencing&lt;br /&gt;The goals of this project are to develop a robust sequencing by synthesis methodology for de novo and resequencing applications using the bead-based polony technology. Our overall R &amp; D focus is to address key aspects of the technology that need to be refined to enable robust, high quality polony sequencing. 
Our experience in large-scale genome sequencing will serve well to ensure that the key issues involved in optimizing the technology against current industry standards, data processing, management, and analysis are effectively addressed in a time- and cost-efficient manner. The specific aims are to:&lt;br /&gt;Develop effective procedures for production of paired-end PCR libraries with virtual insert sizes (distance between read pairs) in the range of 2 to 50 kilobases.&lt;br /&gt;Develop methods for effective solid-phase template amplification on derivatized microspheres and for enrichment of beads containing amplified templates.&lt;br /&gt;Develop methods for robust array preparation.&lt;br /&gt;Develop procedures for fluorescent in situ sequencing by synthesis.&lt;br /&gt;Develop an integrated data acquisition system including fluorescence microscope, automated stage, flow cell, fluidics system and control software.&lt;br /&gt;Develop data management and assembly software.&lt;br /&gt;Develop functional reversible chain terminators.&lt;br /&gt;Develop modified enzymes capable of efficiently incorporating reversible terminators.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://72.14.203.104/search?q=cache:SLNqxGPANwIJ:www.wiley-vch.de/books/biotech/pdf/v05b_midd.pdf+sequencing+technology&amp;hl=en&amp;amp;gl=us&amp;ct=clnk&amp;amp;cd=32"&gt;http://72.14.203.104/search?q=cache:SLNqxGPANwIJ:www.wiley-vch.de/books/biotech/pdf/v05b_midd.pdf+sequencing+technology&amp;hl=en&amp;amp;gl=us&amp;ct=clnk&amp;amp;cd=32&lt;/a&gt;&lt;br /&gt;&lt;a href="http://www.spaceref.com/news/viewpr.html?pid=14018"&gt;http://www.spaceref.com/news/viewpr.html?pid=14018&lt;/a&gt;&lt;br /&gt;&lt;a href="http://www.genome.gov/12513162"&gt;http://www.genome.gov/12513162&lt;/a&gt;  more info…&lt;br /&gt;&lt;br /&gt;New Technology Will Speed Genome Sequencing&lt;br /&gt;CAMBRIDGE, Mass. -- Almost 150 different genomes have been sequenced to date, including the human genome. 
But sequencing needs are growing faster than ever: In March 2003, the Bush administration announced it will spend $1 billion over five years to increase forensic analysis of DNA, including a backlog of up to 300,000 samples. And the success of the growing field of genomic medicine, which promises to deliver better therapies and diagnostics, depends on faster sequencing technology.&lt;br /&gt;This fall, researchers at Whitehead Institute will test new technology that could aid these and other endeavors. The BioMEMS 768 Sequencer can sequence the entire human genome in only one year, processing up to 7 million DNA letters a day, about seven times faster than its nearest rival. Scientists began working on the project in 1999 with a $7 million National Human Genome Research Institute grant. The technology eventually will help scientists quickly determine the exact genetic sequence of the DNA of many different organisms, and could lead to faster forensic analysis of DNA gathered in criminal cases.&lt;br /&gt;The heart of the new BioMEMS machine is a large glass chip etched with tiny microchannels called "lanes." It tests 384 lanes of DNA at a time, four times more than existing capillary sequencers. Each lane can accommodate longer strands of DNA: about 850 bases (the nucleotide units of DNA, abbreviated by the letters A, C, T or G), compared to the current 550 bases per lane.&lt;br /&gt;It takes about 45 minutes to read the DNA from one of the BioMEMS' 768 lanes. The machine has two chips; one is prepared as the other is sequenced, so that the machine is sequencing at all times. The new sequencer saves not just capital costs, the developers say, but day-to-day expenses as well.&lt;br /&gt;"It's not only the cost of the machine, but the cost of the materials it uses," says Brian McKenna, a senior software engineer at Whitehead Institute. 
The goal, he says, is to use the same amount of consumables -- liquid, chemicals, and other materials used to prepare the DNA -- as existing sequencing machines. BioMEMS also uses a DNA loading process that eventually will need only 1 percent of a typical DNA sample.&lt;br /&gt;While developed at Whitehead, the machine is being commercialized by Network Biosystems, a company in Woburn, Mass., started in 2001 by Whitehead Member Paul Matsudaira, BioMEMS Labs Director Dan Ehrlich and research scientist Lance Koutny. Shimadzu Biotech in Japan will manufacture the sequencer.&lt;br /&gt;DNA sequencing&lt;br /&gt;How to determine the sequence of bases in a DNA molecule.&lt;br /&gt;DNA sequencing is the process of determining the exact order of the bases A, T, C and G in a piece of DNA. In essence, the DNA is used as a template to generate a set of fragments that differ in length from each other by a single base. The fragments are then separated by size, and the bases at the end are identified, recreating the original sequence of the DNA.&lt;br /&gt;The most commonly used method of sequencing DNA - the dideoxy or chain termination method - was developed by Fred Sanger in 1977 (for which he won his second Nobel prize). The key to the method is the use of modified bases called dideoxy bases; when a piece of DNA is being replicated and a dideoxy base is incorporated into the new chain, it stops the replication reaction.&lt;br /&gt;&lt;br /&gt;Key principles:&lt;br /&gt;A DNA molecule carries information in the form of four chemical groups or bases, represented by the letters A, C, G and T. The order of bases on a DNA strand is the DNA sequence.&lt;br /&gt;Most DNA sequencing is carried out using the chain termination method. 
This involves the synthesis of new DNA strands on a single stranded template and the random incorporation of chain-terminating nucleotide analogues.&lt;br /&gt;The chain termination method produces a set of DNA molecules differing in length by one nucleotide. The last base in each molecule can be identified by way of a unique label. Separation of these DNA molecules according to size places them in the correct order to read off the sequence.&lt;br /&gt;&lt;br /&gt;How does it work? The DNA to be sequenced is provided in single-stranded form. This acts as a template upon which a new DNA strand is synthesized. DNA synthesis requires a supply of the four nucleotides (the building blocks of DNA), the enzyme DNA polymerase and a primer (a short sequence annealed to the template which initiates the new DNA strand). The nucleotides added to the growing DNA strand are complementary to those in the template strand.&lt;br /&gt;Sequencing is achieved by including in each reaction a nucleotide analogue that cannot be extended and thus acts as a chain terminator. Four reactions are set up, each containing the same template and primer but a chain terminator specific for A, C, G or T. Because only a small amount of the chain terminator is included, incorporation into the new DNA strand is a random event. Each reaction therefore generates a collection of fragments, but every DNA strand will end at the same type of base (A, C, G or T).&lt;br /&gt;The primers or nucleotides included in each of the four reactions contain different fluorescent labels allowing DNA strands terminating at each of the four bases to be identified. The reaction products are then mixed and separated by gel electrophoresis, which separates DNA molecules according to size even if they differ in length by only a single nucleotide. As the DNA strands pass a specific point, the fluorescent signal is detected and the base identified. 
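&lt;br /&gt;&lt;br /&gt;The chain termination steps described above can be sketched as a small simulation. This is only an illustrative toy, not the software of any real sequencer: the function names, the number of template copies and the termination probability are all assumptions made for the example, and the template is written in the order in which the polymerase reads it.&lt;br /&gt;

```python
import random

# Watson-Crick pairing: the new strand is complementary to the template.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def synthesize_fragments(template, terminator_base, copies, stop_probability, rng):
    """Simulate one of the four reactions (hypothetical helper).

    Every copy of the template is extended base by base. Whenever the base
    being added matches this reaction's chain terminator, synthesis stops
    with probability stop_probability. Returns the set of fragment lengths
    that ended on the terminator.
    """
    lengths = set()
    for _ in range(copies):
        for position, template_base in enumerate(template, start=1):
            new_base = COMPLEMENT[template_base]
            if new_base == terminator_base and stop_probability > rng.random():
                lengths.add(position)  # fragment ends here, labelled with terminator_base
                break
    return lengths


def read_sequence(template, copies=2000, stop_probability=0.2, seed=0):
    """Run the four reactions, pool the fragments and read them off by size."""
    rng = random.Random(seed)
    label_by_length = {}
    for base in "ACGT":
        for length in synthesize_fragments(template, base, copies, stop_probability, rng):
            label_by_length[length] = base
    # Sorting by length plays the role of gel electrophoresis: the shortest
    # fragment passes the detector first, then the next longer one, and so on.
    return "".join(label_by_length[length] for length in sorted(label_by_length))


if __name__ == "__main__":
    print(read_sequence("ATGCCGTA"))
```

Because each terminator is incorporated at random, many copies of the template are needed so that a fragment ending at every position is produced; reading the labels in order of fragment length then gives the newly synthesized strand, which is the complement of the template.&lt;br /&gt;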
The whole process can be extensively automated.&lt;br /&gt;How is it used?&lt;br /&gt;The most obvious application of DNA sequencing technology is the accurate sequencing of genes and genomes. Only about 500-800 bases can be sequenced in one experiment so larger DNA molecules, including whole genomes, must be broken into smaller fragments before sequencing and then reassembled by searching for overlaps. Accuracy is achieved by sequencing each template several times.&lt;br /&gt;Lower-fidelity single-pass sequencing is useful for the rapid accumulation of sequence data at the expense of some accuracy. Another application of DNA sequencing technology is resequencing the same DNA molecule over and over. This is necessary, for example, in the typing of single nucleotide polymorphisms.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-114243901460361598?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/114243901460361598/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=114243901460361598' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114243901460361598'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114243901460361598'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2006/03/mridna.html' title='MRI_DNA'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-114088610355965799</id><published>2006-02-25T08:47:00.001-08:00</published><updated>2006-02-25T08:48:23.560-08:00</updated><title type='text'>the new helyxzion language of dna</title><content type='html'>HELYXZION AND POST-GENOME DISCOVERY&lt;br /&gt;The DNA of a cell contains all the instructions necessary to recreate life. As such, the sequence of a genome's DNA provides a form of information transfer with its own alphabet (i.e., nucleotides), words (i.e., codons), and sentences (i.e., genes). Thus, the effort to decode the meaning of a DNA sequence is an exercise analogous to cryptography, which seeks to derive meaning from a collection of seemingly randomly recurring symbols.&lt;br /&gt;Previous deciphering efforts have been basic and focused on the immediate meaning of a focal sequence. This is akin to the translation of a text on a word-by-word basis. As we advance in this understanding, we start to see higher order meaning through the nuances of gene expression and splice changes. Moreover, the structure and organization of the DNA sequences within and across species provide a clue as to the fundamental rules that governed the creation of life and the understanding of DNA as a true language.&lt;br /&gt;Linguistics is a branch of science that has long sought to define the architecture and laws of language structure. There is ample evidence to indicate that both the dimensions and units of linguistic structure appear genetically embedded in the human species. Therefore, the analysis of the structure of language has provided a window into the make-up of the Homo sapiens mind, and perhaps a set of useful strategies to unearth similar structures.&lt;br /&gt;Experimentally, therefore, both the disciplines of genomics and linguistics seek to uncover order and information from a sea of noise. 
Genomics, by virtue of its origins in physical and biological sciences, has had the benefit of rigorous computational tools and laboratory validation in its investigations. Unlike genomics, however, the intuitive understanding of language in all of us permitted linguists to convincingly reconstruct rules governing the transmission of higher order meaning, while unlike cryptography, genomics can use experimental strategies to uncover the relation between form and meaning.&lt;br /&gt;The "Helyxzion" Language of DNA will explore the investigative strategies used by these diverse fields of genomics and linguistics in identifying meaning from recurrent strings of information in a multidisciplinary manner touching on linguistics, genomics, computation and molecular biology. The goal will be to synthesize novel conceptual approaches to uncovering higher order meaning from DNA sequence information, to develop a deeper understanding of DNA as a language and explore the possibility of forging novel investigative strategies in genomic research.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-114088610355965799?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/114088610355965799/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=114088610355965799' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114088610355965799'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114088610355965799'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2006/02/new-helyxzion-language-of-dna.html' title='the new helyxzion language of dna'/><author><name>Dr.Charles 
Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-114088604818972374</id><published>2006-02-25T08:47:00.000-08:00</published><updated>2006-02-25T08:47:28.196-08:00</updated><title type='text'>the new code of life</title><content type='html'>The code, the text and the language of DNA&lt;br /&gt;Communication between scientists about their work is filled with images. This is inevitable when it comes to explaining complex ideas and concepts that are not directly observable, such as the subatomic particles that comprise a proton or an electron, or the processes inside a cell that lead to the correct formation of a protein. When new discoveries are made, the words to describe them are usually lacking and must be borrowed from the physical world or common speech: lipid rafts, chaperones, molecular markers. When scientists try to explain their findings to the public, or when the media try to make science more palatable to their readers or viewers, these metaphors become even more colorful: cells are factories, proteins carry zip codes, mitochondria are the powerhouse of the cell, and cells of the immune system go to school.&lt;br /&gt;Explaining complex concepts is a creative process and it reveals how scientists think and how ideas about a world too small to grasp are represented in their minds. These metaphors accentuate certain aspects of the subject or process they depict, while neglecting others. Sometimes they even awaken associations that were not intended, as when molecules suddenly acquire a personality of their own or are endowed with human goal-directed behavior: take, for example, a molecule that 'finds' a partner or a cell that makes a developmental 'decision', such as committing 'cell suicide'. 
Using molecular genetics as an example, I will try to follow some of biology's metaphors from their origin in scientific communication into the real world and analyze their impact on the public perception of science.&lt;br /&gt;Common language talks about DNA as 'information' or 'a code'. For a very long time, scientists suspected that something, some kind of plan, resided within the sperm and/or egg, such that a snake developed from a snake egg and humans created human offspring. But it was only in the late 1940s and 1950s, when physicists and mathematicians entered the field of molecular biology, that scientists came to interpret this 'something' as information. The complete pattern of the future development of an organism, and its function when mature, is contained in the chromosomes in the form of a 'code'. The later discovery of the structure of DNA by Francis Crick and James Watson was a milestone in the understanding of DNA as a code of some kind that allowed molecules in cells to carry information. In a paper on the implications of their DNA structure, they wrote that "it therefore seems likely that the precise sequence of the bases is the code that carries the genetic information." From today's perspective this seems rather inevitable once people started to think about the molecular basis of inheritance. Today, it is hard for a geneticist to picture DNA as anything other than a code that transmits information.&lt;br /&gt;Understanding the genome as a coded message, interpreting it as a text, book or language is not so far-fetched. These metaphors convey an important scientific principle: a sequence of a limited assortment of building blocks, like letters in a text, can carry a message. In his book The Language of Life, George Beadle wrote: "... the deciphering of the DNA code has revealed a language... as old as life itself, a language that is the most living language of all". 
More recently, when scientists celebrated the completion of the first draft of the human genome in 2000, the 'book' and 'language' metaphors were revived, not just reinvented by the press in the service of the public understanding of science, but used by high-ranking scientists involved in the genome project to describe their achievement. On 26 June 2000, when Francis Collins, Director of the National Human Genome Research Institute, announced the completion of the first draft in a major media event at the White House, he said "Today, we celebrate the revelation of the first draft of the human book of life" and declared that this breakthrough lets humans for the first time read "our own instruction book."&lt;br /&gt;When H. Gobind Khorana, Marshall W. Nirenberg and other scientists revealed the trinucleotide (now called a codon) correlation between nucleic acids and proteins, this was referred to as 'decoding' or 'deciphering' the code. In fact it only gave science the alphabet and in no way deciphered DNA. This fact becomes very evident when you look at the "state of the art" in genetics today. The scientists were able to "HUNT AND PICK AWAY AND FIND THE PROTEINS" (and the term gene today still generally refers to DNA that codes for a protein), but these comprise only about 3% of the genetic information in DNA! WHAT HAPPENED TO THE OTHER 97%? What does it do? Why is it there? They just don't know because they don't have the KEY to deciphering DNA. For the scientists involved, these references are clear by context, whether the issue is the DNA sequence itself or the relationship between DNA and protein. But news headlines such as "Decoding the book of life", "Cracking the code of life" or "Breaking the code of life", when referring to the sequencing of the human genome, imply that the decoded text can be read like a novel. In fact HELYXZION'S "ANVIL" (ADVANCED NUCLEOTIDE VISUAL INTERPRETIVE LANGUAGE) DOES JUST THAT!&lt;br /&gt;IN SHORT:&lt;br /&gt;WATSON &amp;amp; CRICK GAVE US THE STRUCTURE OF DNA.&lt;br /&gt;H. 
Gobind Khorana and Marshall W. Nirenberg revealed the trinucleotide correlation between nucleic acids and proteins.&lt;br /&gt;THE GENOME SEQUENCING PROJECT GAVE US THE "TEXT BOOKS".&lt;br /&gt;HELYXZION'S ANVIL TECHNOLOGY GIVES US THE ABILITY TO READ ("DECIPHER") THE TEXT BOOK.&lt;br /&gt;No scientist would dispute that this is NOW the current state of the art. Understanding the message hidden in the 3 billion base pairs of the human genome would require a detailed translation of its sequence into physiological function. DNA itself is a "text with context"; genes by themselves barely do anything. Genes just describe how to make proteins, or cease to make them, or regulate their production as directed by other proteins. Not even the basics of protein function at the level of protein folding can be deduced from the genes. It is in the introns that the real information resides: how the intricate protein networks work that constantly survey the environment outside the cell, monitor metabolic processes and integrate this information into physical function. Deciphering the text as laid down in the genome would therefore predict how life works at the cellular and organism level.&lt;br /&gt;Today, Helyxzion is learning and reading the language of DNA. We are also profoundly humbled by the privilege of turning the pages that describe the miracle of human life, written in the mysterious language of all the ages, the language of DNA.&lt;br /&gt;The real implications of "reading" the human genome sequence are just now being realized. This could very well be the turning point in human evolution. From a scientific point of view, it could change the role of science, because it introduces human will and intentions into the scientific exercise, after centuries of attempting to free science and research from the limits imposed by religious leaders. In the best case, it provokes sarcasm; in the worst case, it provokes public fear: the idea of the scientists 'playing God' is not too unbelievable. 
And the public does listen to what the scientists are saying; indeed, public attention to the genome project was unrivalled. In 2000, The New York Times alone published 108 articles related to the Human Genome Project. Was it this outburst in media attention that turned scientists into PR spokesmen and encouraged them to blow their speech out of proportion?&lt;br /&gt;Thinking of genes as 'controlling' or 'programming' development dictates a certain view of these processes. The Helyxzion "ANVIL" describes for us the exact content and structure, not only of each and every gene associated with a species, but also of the intron information that controls a particular gene.&lt;br /&gt;With the Helyxzion technology anyone with a background knowledge of molecular biology will be able to grasp the sense of sequences easily.&lt;br /&gt;DNA is thought of as a language, information (encompassing both content and structure), a code, a text and a chemical structure, all at the same time. The lay reader is overwhelmed with an impression of impact, meaning, prominence, significance and seriousness, but deprived of any means to understand what exactly has been said. Helyxzion allows practitioners to explain their work in simple and easily understandable terms.&lt;br /&gt;Moreover, Helyxzion eliminates confusion and misconceptions in genetics. The potential of genetics is achieved by emphasizing the power of the language and also that of the scientist analyzing it. "Reading, from cover to cover, the first draft of this 'Book of Life'" is exactly what scientists are now capable of doing, rather than trying to infer some meaning from small individual chunks of text.&lt;br /&gt;The powerful idea that the essence of life is a DNA sequence that scientists are about to read "from cover to cover" means that DNA can be analyzed and manipulated by the scientists, who are therefore taking part in human evolution. 
Uses of this technology include prenatal genetic diagnosis, gene patenting, genomic markers that predict predisposition to disease, and the use of DNA to identify individuals. Scientists should not indiscriminately use this technology in an exaggerated way. As our parents used to tell us when we were children: "Watch your language!"&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-114088604818972374?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/114088604818972374/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=114088604818972374' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114088604818972374'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114088604818972374'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2006/02/new-code-of-life.html' title='the new code of life'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-114088600148452761</id><published>2006-02-25T08:46:00.000-08:00</published><updated>2006-02-25T08:46:41.616-08:00</updated><title type='text'>the helyxzion report on the language of dna</title><content type='html'>&lt;a href="http://www.helyxzion.com/"&gt;Helyxzion: The Language of DNA&lt;/a&gt;&lt;br /&gt;Helyxzion is the newest tool for personal identification. 
This technology will take all the mystery out of identification. Helyxzion DNA analysis will convert the human genome into a digitally accurate picture of the person right down to their fingerprints, making identification 100% correct. Biochemistry and molecular biology have, from their origins, found themselves in an unlikely arena, a court of law. There is no question that the fundamental issues are complicated, but it is possible to present the bottom line conclusion in such a way that a Ph.D. is not necessary to understand its implications. The two most misunderstood buzzwords, which are apparently discussed at dinner tables and cocktail parties 'round the world, are statistics and the C-word, contamination. (I have had people come up to me on mountain tops and ask me to tell them about contamination!) By the end of this piece, you should at least be able to make better cocktail-party conversation.&lt;br /&gt;A Word about Terminology: Fingerprints Come From Fingers&lt;br /&gt;Unreasonable expectations, as well as undeserved criticisms, have been visited upon the entire DNA identification technology because of the unfortunate terminology, DNA fingerprinting, applied to the original typing method. In its current state, DNA typing is not directly comparable to fingerprints from fingers (dermatoglyphic fingerprints). In dermatoglyphic fingerprints, it is possible to obtain all of the ridge detail information from all 10 finger pads; thus there are no missing pieces of information. Because only a small portion, perhaps 1 millionth, of the 3 billion units of human DNA is even available for examination by current methods, the result is better compared to a partial fingerprint. Similar to a partial print, however, it may not be necessary to have complete DNA information to be convinced of the individuality of a DNA profile. 
Just as a certain number of points of comparison have been deemed necessary in order to declare that two fingerprints originated from the same finger, it has been suggested that a defined number of highly polymorphic (variable) DNA loci (chromosomal locations) may be sufficient in order to be convinced that two samples have originated from the same source. One more piece of not-so-trivial information: although identical twins have different fingerprints, in the absence of genetic mutation, the DNA profiles of identical twins are, in fact, identical. More about the DNA of related individuals later.&lt;br /&gt;&lt;br /&gt;Another Word about Terminology - Burn the "Match"&lt;br /&gt;Another word that should be banned from the language of DNA typing is the word "match". Along with DNA fingerprinting, it misleads the hapless uninitiated into believing that any test called DNA will unequivocally associate a questioned sample with an exemplar. Until all 3 billion of those genetic units can be easily and reliably analyzed, more appropriate expressions might be "the same pattern as", "concordant with" or "indistinguishable from", depending on the strength of the association. The fact that the English language does not provide an easy descriptor of statistical relationships should not detract from the potential power of DNA typing. When many highly variable DNA regions are analyzed, and even the most conservative statistical estimates indicate that not one other person with the same profile exists in the population of the Earth, "indistinguishable from" becomes one strong statement.&lt;br /&gt;&lt;br /&gt;The C Word: Contamination&lt;br /&gt;Much of the opposition to the reliability of DNA evidence always seems to return to the now infamous catch-all term contamination. Other than its negative connotation, what does it really mean? Does it only refer to inadvertently introduced material, or might it also be applied to a legitimately mixed sample (e.g. blood from two victims)? 
In fact, there is a plethora of different types of contamination, and the final effect on evidence, if any, varies. Among the considerations in determining whether a second DNA type would even be detected is the type of testing involved. For instance, PCR-type testing, where the DNA in the sample is copied millions of times over, is inherently a more sensitive technique than RFLP, which also makes a PCR test more likely to detect traces of a second type, whatever the source. In addition, point of view comes into play - one person's contamination is another's mixed sample; it all depends on what you were expecting and for whom you are advocating.&lt;br /&gt;&lt;br /&gt;Assuming that the criminalist collecting evidence at the scene isn't bleeding from an open wound, the greatest concern at the crime scene itself is from bacterial, not human, contamination. Crime scene samples, by definition, are in a fertile environment, and fluids like blood and semen provide a very acceptable growth medium for microorganisms. The DNA of the microorganisms themselves is really not a problem - it won't show up in tests that are specific for human DNA. The major concern is degradation of the human DNA in the sample that the bugs are literally using as food. Even so, the DNA type will simply go away, as opposed to being magically converted into someone else's type. Partially degraded DNA must be interpreted carefully by a qualified analyst; if the sample is known to be of poor quality and there is a possibility that part of a pattern has been obscured, a conclusion of "inconclusive" may be the safest bet.&lt;br /&gt;&lt;br /&gt;Although great care should be taken as a matter of routine, it is really not that easy to introduce extraneous human material into a sample. Contrary to what some might have us believe, DNA does not float around randomly in the air, and cells that may be sloughed off or ejected out of a person are relatively few in number and may not contain any consequential DNA. 
This is not to suggest that precautions should not be taken, but to put the matter in some perspective.&lt;br /&gt;&lt;br /&gt;Once the sample is dried, refrigerated and in the laboratory, the potential for contamination is mostly from other samples undergoing processing at the same time. This is where the training and qualifications of the analyst and the quality control of the laboratory come into play. Safeguards are set up not only to guard against contamination from other lab samples, but just as importantly, to detect contaminated samples, should they occur. By the way, the criminalist should remember to wear gloves and not spit in his samples.&lt;br /&gt;&lt;br /&gt;The biggest real concern that would actually result in an incorrect DNA type, as opposed to NO type, is a sample switch by the analyst. Until computers can process crime scenes, fully analyze samples and take the witness stand, education, training and good laboratory practice are the best weapons against sample mix-ups.&lt;br /&gt;&lt;br /&gt;My Brother Did It&lt;br /&gt;In some DNA typing techniques (not all) a statistical probability is used to estimate the rareness of any particular type - in other words, the possibility that two samples originating from different sources might show the same pattern by chance alone. This type of calculation is valid only with respect to random individuals in a population; it is not applicable to closely related individuals. No two people share the same DNA type except for identical twins. However, siblings potentially share more genetic material with each other than with anyone else. This is because they inherit their genes from the same two people, Mom and Dad. This idea can be extended to more distant relationships such as children, grandchildren and cousins. In these relationships, some genetic material is shared, but the more distant the relationship, the fewer the genes in common. 
For the highly variable DNA loci that are used in forensic testing, this means that even siblings are unlikely to test the same, especially when many highly variable markers are analyzed. However, until alibis are established all around, your best DNA defense is still "my brother did it..."&lt;br /&gt;&lt;br /&gt;DNA in the Judicial System&lt;br /&gt;The statistical interpretation of DNA typing results, specifically in the context of population genetics, has been the least understood (therefore by definition the most hotly debated) issue of recent admissibility hearings. The perceived incomprehensibility of the subject, fueled by the views of what some feel have been only a few outspoken individuals, has led to a reluctance of the judicial system to accept DNA typing. California, in particular, has become both a hotbed and testing ground for DNA admissibility issues. With some half-dozen conflicting appellate opinions on record, the California Supreme Court has recently moved to review three recent decisions and to come to a consensus as to whether DNA testing is generally accepted in the relevant community and may be routinely admitted in criminal trials.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-114088600148452761?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/114088600148452761/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=114088600148452761' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114088600148452761'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/114088600148452761'/><link rel='alternate' type='text/html' 
href='http://helyxzion.blogspot.com/2006/02/helyxzion-report-on-language-of-dna.html' title='the helyxzion report on the language of dna'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-113113475448081596</id><published>2005-11-04T12:04:00.000-08:00</published><updated>2006-02-25T08:42:05.836-08:00</updated><title type='text'>the language of dna</title><content type='html'>The code, the text and the language of DNA&lt;br /&gt;Communication between scientists about their work is filled with images. This is inevitable when it comes to explaining complex ideas and concepts that are not directly observable, such as the subatomic particles that comprise a proton or an electron, or the processes inside a cell that lead to the correct formation of a protein. When new discoveries are made, the words to describe them are usually lacking and must be borrowed from the physical world or common speech: lipid rafts, chaperones, molecular markers. When scientists try to explain their findings to the public, or when the media try to make science more palatable to their readers or viewers, these metaphors become even more colorful: cells are factories, proteins carry zip codes, mitochondria are the powerhouse of the cell, and cells of the immune system go to school.&lt;br /&gt;Explaining complex concepts is a creative process, and it reveals how scientists think and how ideas about a world too small to grasp are represented in their minds. Metaphors accentuate certain aspects of the subject or process they are depicting, while neglecting others. Sometimes they even evoke associations not intended. 
This happens when molecules suddenly acquire a personality of their own or are endowed with human goal-directed behavior—take, for example, a molecule that 'finds' a partner or a cell that makes a developmental 'decision', such as committing 'cell suicide'. Using molecular genetics as an example, I will try to follow some of biology's metaphors from their origin in scientific communication into the real world and analyze their impact on the public perception of science.&lt;br /&gt;&lt;br /&gt;Common language talks about DNA as 'information' or 'a code'. For a very long time, scientists suspected that something, some kind of plan, resided within the sperm and/or egg, such that a snake developed from a snake egg and humans created human offspring. But it was only in the late 1940s and 1950s, when physicists and mathematicians entered the field of molecular biology, that scientists came to interpret this 'something' as information: the complete pattern of the future development of an organism, and of its function when mature, is contained in the chromosomes in the form of a 'code'. The later discovery of the structure of DNA by Francis Crick and James Watson was a milestone in the understanding of DNA as a code of some kind that allowed molecules in cells to carry information. In a paper on the implications of their DNA structure, they wrote that "it therefore seems likely that the precise sequence of the bases is the code that carries the genetic information." From today's perspective, this reading seems rather inevitable once people started to think about the molecular basis of inheritance. Today, it is hard for a geneticist to picture DNA as anything other than a code that transmits information.&lt;br /&gt;Understanding the genome as a coded message, interpreting it as a text, book or language is not so far-fetched. These metaphors convey an important scientific principle: a sequence of a limited assortment of building blocks, like letters in a text, can carry a message. 
In his book The Language of Life, George Beadle wrote: "... the deciphering of the DNA code has revealed a language... as old as life itself, a language that is the most living language of all". More recently, when scientists celebrated the completion of the first draft of the human genome in 2000, the 'book' and 'language' metaphors were revived—not just reinvented by the press in the service of the public understanding of science, but used by high-ranking scientists involved in the genome project to describe their achievement. On 26 June 2000, when Francis Collins, Director of the National Human Genome Research Institute, announced the completion of the first draft in a major media event at the White House, he said "Today, we celebrate the revelation of the first draft of the human book of life" and declared that this breakthrough lets humans for the first time read "our own instruction book."&lt;br /&gt;&lt;br /&gt;When H. Gobind Khorana, Marshall W. Nirenberg and other scientists revealed the trinucleotide (now called a codon) correlation between nucleic acids and proteins, this was referred to as 'decoding' or 'deciphering' the code. In fact, it only gave science the alphabet and in no way deciphered DNA. This fact becomes very evident when you look at the "state of the art" in genetics today. The scientists were able to "HUNT AND PICK AWAY AND FIND THE PROTEINS" (and the term gene today still generally refers to DNA that codes for a protein), but protein-coding DNA comprises only about 3% of the genetic information in DNA! WHAT HAPPENED TO THE OTHER 97%? What does it do? Why is it there? They just don't know because they don't have the KEY to deciphering DNA. For the scientists involved, these references are clear by context—whether the issue is the DNA sequence itself or the relationship between DNA and protein. 
But news headlines such as "Decoding the book of life", "Cracking the code of life" or "Breaking the code of life", when referring to the sequencing of the human genome, imply that the decoded text can be read like a novel. In fact, HELYXZION'S "ANVIL" (ADVANCED NUCLEOTIDE VISUAL INTERPRETIVE LANGUAGE) DOES JUST THAT!&lt;br /&gt;IN SHORT:&lt;br /&gt;WATSON &amp;amp; CRICK GAVE US THE STRUCTURE OF DNA.&lt;br /&gt;H. Gobind Khorana and Marshall W. Nirenberg revealed the trinucleotide correlation between nucleic acids and proteins.&lt;br /&gt;THE GENOME SEQUENCING PROJECT GAVE US THE "TEXT BOOKS".&lt;br /&gt;HELYXZION'S ANVIL TECHNOLOGY GIVES US THE ABILITY TO READ, OR "DECIPHER", THE TEXT BOOKS.&lt;br /&gt;&lt;br /&gt;No scientist would dispute that this is NOW the current state of the art. Understanding the message hidden in the 3 billion base pairs of the human genome would require a detailed translation of its sequence into physiological function. DNA itself is a "text with context"; genes by themselves barely do anything. Genes just describe how to make proteins, or cease to make them, or regulate their production as directed by other proteins. Not even the basics of protein function at the level of protein folding can be deduced from the genes. It is in the introns that the real information lies: how intricate protein networks work, constantly surveying the environment outside the cell, monitoring metabolic processes and integrating this information into physical function. Deciphering the text as laid down in the genome would therefore predict how life works at the cellular and organism level.&lt;br /&gt;&lt;br /&gt;Today, Helyxzion is learning and reading the language of DNA. We are also profoundly humbled by the privilege of turning the pages that describe the miracle of human life, written in the mysterious language of all the ages, the language of DNA.&lt;br /&gt;&lt;br /&gt;The real implications of "reading" the human genome sequence are just now being realized. 
This could very well be the turning point in human evolution from a scientific point of view. It could change the role of science, because it introduces human will and intentions into the scientific exercise, after centuries of attempting to free science and research from the limits imposed by religious leaders. In the best case, it provokes sarcasm; in the worst case, it provokes public fear—the idea of the scientists 'playing God' is not too unbelievable. And the public does listen to what the scientists are saying—indeed, public attention to the genome project was unrivalled. In 2000, The New York Times alone published 108 articles related to the Human Genome Project. Was it this outburst in media attention that turned scientists into PR spokesmen and encouraged them to blow their speech out of proportion?&lt;br /&gt;Thinking of genes as 'controlling' or 'programming' development dictates a certain view of these processes. The Helyxzion "ANVIL" describes for us the exact content and structure, not only of each and every gene associated with a species, but also of the intron information that controls a particular gene.&lt;br /&gt;With the Helyxzion technology, anyone with a background knowledge of molecular biology will be able to grasp the sense of sequences easily.&lt;br /&gt;When DNA is described as a language, information (encompassing both content and structure), a code, a text and a chemical structure, all at the same time, the lay reader is overwhelmed with an impression of impact, meaning, prominence, significance and seriousness, but deprived of any means to understand what exactly has been said. Helyxzion allows practitioners to explain their work in simple and easily understandable terms.&lt;br /&gt;&lt;br /&gt;Moreover, Helyxzion eliminates confusion and misconceptions in genetics. The potential of genetics is achieved by emphasizing the power of the language and also that of the scientist analyzing it. 
"Reading, from cover to cover, the first draft of this 'Book of Life'" is exactly what scientists are now capable of doing, rather than trying to infer some meaning from small individual chunks of text.&lt;br /&gt;The powerful idea that the essence of life is a DNA sequence that scientists are about to read "from cover to cover" means that DNA can be analyzed and manipulated by the scientists, who are therefore taking part in human evolution. Uses of this technology include prenatal genetic diagnosis, gene patenting, the use of genomic markers to predict predisposition to disease, and the use of DNA to identify individuals. Scientists should not use this technology indiscriminately or in an exaggerated way. As our parents used to tell us when we were children: "Watch your language!"&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-113113475448081596?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/113113475448081596/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=113113475448081596' title='25 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/113113475448081596'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/113113475448081596'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/11/language-of-dna.html' title='the language of dna'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>25</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-113105626712809511</id><published>2005-11-03T14:17:00.000-08:00</published><updated>2006-02-25T08:42:30.506-08:00</updated><title type='text'>hidden language of DNA</title><content type='html'>Hidden Language in DNA&lt;br /&gt;&lt;br /&gt;Comparison between the statistical properties of coding and non-coding DNA sequences has been interpreted as indicating a yet-undiscovered language in non-coding DNA.&lt;br /&gt;&lt;br /&gt;This statement is decades old. There is a language in DNA, and not just in its non-coding regions. THERE IS NO SUCH THING AS “JUNK OR NON-CODING DNA”.&lt;br /&gt;We argue that greater variance among nucleotide frequencies in ALL regions explains this assertion. DNA sequences are long strings composed of four nucleotides (A, C, G, and T). For a statistical analysis, these strings are divided into “words” of fixed length n.&lt;br /&gt;Then the quantity R&lt;sub&gt;n&lt;/sub&gt;, computed from the word frequencies of non-protein-coding DNA, was shown to be non-zero (as in natural languages) and significantly larger than that of protein-coding DNA. However, this simply reflects that nucleotide frequencies are more unequal in non-protein-coding than in protein-coding DNA; R&lt;sub&gt;1&lt;/sub&gt; increases as the variance of the distribution of the nucleotide frequencies p increases. The increase in R&lt;sub&gt;n&lt;/sub&gt; as n increases is the same for all DNA and thus does not distinguish between them. Furthermore, it can be shown that correlations of finite range imply an increasing R&lt;sub&gt;n&lt;/sub&gt; even for n. In short, this argues that a language simply must arise, or DNA could not unfailingly impart any useful information, either from cell to cell or across generations of organisms.&lt;br /&gt;When the words are ranked according to their frequencies, p, from most to least frequent, a power law is visible as a linear region in a double-logarithmic plot. 
The slope for non-protein-coding DNA was found to be larger than that for protein-coding DNA, and close to that of English text. This result, as analyzed by the Helyxzion “ANVIL” method with fixed word length, was taken as further evidence that “all” regions are similar to natural languages.&lt;br /&gt;Helyxzion ANVIL analysis shows that intron coding regions are not random strings of nucleotides, independently drawn according to the observed nucleotide frequencies. For equal frequencies, all n-letter words have equal probability 4&lt;sup&gt;-n&lt;/sup&gt;. However, as the nucleotide frequencies become more uneven, increasingly distinct DNA appears for finite sequence length a random sequence of identical length and nucleotide frequencies. Considering the crudeness of the approximation, these curves are strikingly similar. Secondly, the most probable “DNA words” are not very different from those of natural languages. As in English, where the most common words are “the,” “of,” “and,” etc., in the present DNA example they are combinations of only the most probable letters—TTTTTT, AAAAAA, TTTTTA, etc. That these words occur more often than expected for uncorrelated random sequences can be readily explained by unequal crossing over, which preferentially occurs in regions of short repeats.&lt;br /&gt;Thirdly, the linguistic approach has not been doubted for a long time: even randomly generated “text” (with words of different length) exhibits language behavior with an exponent close to that of natural languages. We have thus shown that all of the observations are simple consequences of nucleotide frequencies. 
Our explanation of the existence of a language in DNA would not be complete without knowing that it is not based on guesswork but well founded in the underlying “MATHEMATICS DISCOVERED IN DNA”.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-113105626712809511?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/113105626712809511/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=113105626712809511' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/113105626712809511'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/113105626712809511'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/11/hidden-language-of-dna.html' title='hidden language of DNA'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-113103935686307718</id><published>2005-11-03T09:35:00.000-08:00</published><updated>2006-02-25T08:42:47.810-08:00</updated><title type='text'>the code</title><content type='html'>The code, the text and the language of DNA&lt;br /&gt;Communication between scientists about their work is filled with images. 
This is inevitable when it comes to explaining complex ideas and concepts that are not directly observable, such as the subatomic particles that comprise a proton or an electron, or the processes inside a cell that lead to the correct formation of a protein. When new discoveries are made, the words to describe them are usually lacking and must be borrowed from the physical world or common speech: lipid rafts, chaperones, molecular markers. When scientists try to explain their findings to the public, or when the media try to make science more palatable to their readers or viewers, these metaphors become even more colorful: cells are factories, proteins carry zip codes, mitochondria are the powerhouse of the cell, and cells of the immune system go to school.&lt;br /&gt;Explaining complex concepts is a creative process, and it reveals how scientists think and how ideas about a world too small to grasp are represented in their minds. Metaphors accentuate certain aspects of the subject or process they are depicting, while neglecting others. Sometimes they even evoke associations not intended, as when molecules suddenly acquire a personality of their own or are endowed with human goal-directed behavior—take, for example, a molecule that 'finds' a partner or a cell that makes a developmental 'decision', such as committing 'cell suicide'. Using molecular genetics as an example, I will try to follow some of biology's metaphors from their origin in scientific communication into the real world and analyze their impact on the public perception of science.&lt;br /&gt;&lt;br /&gt;Common language talks about DNA as 'information' or 'a code'. For a very long time, scientists suspected that something—some kind of plan, specificity or driving force—resided within the sperm and/or egg, such that a snake developed from a snake egg and humans created human offspring. 
But it was only in the late 1940s and 1950s, when cyberneticists, physicists and mathematicians entered the field of molecular biology, that scientists came to interpret this 'something' as information: the complete pattern of the future development of an organism, and of its function when mature, is contained in the chromosomes in the form of a 'code'. The later discovery of the structure of DNA by Francis Crick and James Watson was a milestone in the understanding of DNA as a code of some kind that allowed molecules in cells to carry information. In a paper on the implications of their DNA structure, they wrote that "it therefore seems likely that the precise sequence of the bases is the code that carries the genetic information." From today's perspective, this reading seems rather inevitable once people started to think about the molecular basis of inheritance. Today, it is hard for a geneticist to picture DNA as anything other than a code that transmits information.&lt;br /&gt;&lt;br /&gt;When H. Gobind Khorana, Marshall W. Nirenberg and other scientists revealed the trinucleotide code and the correlation between nucleic acids and proteins, this was referred to as 'decoding' or 'deciphering' the code. These metaphors have gained momentum and are now routinely used to describe the sequencing of the human genome. For the scientists involved, these references are clear by context—whether the issue is the DNA sequence itself or the relationship between DNA and protein. But news headlines such as "Decoding the book of life", "Cracking the code of life" or "Breaking the code of life", when referring to the sequencing of the human genome, imply that the decoded text can be read like a novel. In fact, HELYXZION'S "ANVIL" (ADVANCED NUCLEOTIDE VISUAL INTERPRETIVE LANGUAGE) DOES JUST THAT!&lt;br /&gt;No scientist would dispute that this is NOW the current state of the art. 
Understanding the message hidden in the 3 billion base pairs of the human genome would require a detailed translation of its sequence into physiological function. DNA itself is a "text with context"; genes by themselves barely do anything. Genes just describe how to make proteins, or cease to make them, or regulate their production as directed by other proteins. Not even the basics of protein function at the level of protein folding can be deduced from the genes. It is in the introns that the real information lies: how intricate protein networks work, constantly surveying the environment outside the cell, monitoring metabolic processes and integrating this information into physical function. Deciphering the text as laid down in the genome would therefore predict how life works at the cellular and organism level.&lt;br /&gt;Understanding the genome as a coded message, interpreting it as a text, book or language is not so far-fetched. These metaphors convey an important scientific principle: a sequence of a limited assortment of building blocks, like letters in a text, can carry a message. In his book The Language of Life, George Beadle wrote: "... the deciphering of the DNA code has revealed a language... as old as life itself, a language that is the most living language of all". More recently, when scientists celebrated the completion of the first draft of the human genome in 2000, the 'book' and 'language' metaphors were revived—not just reinvented by the press in the service of the public understanding of science, but used by high-ranking scientists involved in the genome project to describe their achievement. 
On 26 June 2000, when Francis Collins, Director of the National Human Genome Research Institute, announced the completion of the first draft in a major media event at the White House, he said "Today, we celebrate the revelation of the first draft of the human book of life" and declared that this breakthrough lets humans for the first time read "our own instruction book."&lt;br /&gt;&lt;br /&gt;Today, Helyxzion is learning and reading the language of DNA. We are also profoundly humbled by the privilege of turning the pages that describe the miracle of human life, written in the mysterious language of all the ages, the language of DNA.&lt;br /&gt;&lt;br /&gt;The real implications of "reading" the human genome sequence are just now being realized. This could very well be the turning point in human evolution from a scientific point of view. It could change the role of science, because it introduces human will and intentions into the scientific exercise, after centuries of attempting to free science and research from the limits imposed by religious leaders. In the best case, it provokes sarcasm; in the worst case, it provokes public fear—the idea of the scientists 'playing God' is not too unbelievable. And the public does listen to what the scientists are saying—indeed, public attention to the genome project was unrivalled. In 2000, The New York Times alone published 108 articles related to the Human Genome Project. Was it this outburst in media attention that turned scientists into PR spokesmen and encouraged them to blow their speech out of proportion?&lt;br /&gt;Thinking of genes as 'controlling' or 'programming' development dictates a certain view of these processes. 
The Helyxzion "ANVIL" describes for us the exact content and structure, not only of each and every gene associated with a species, but also of the intron information that controls a particular gene.&lt;br /&gt;With the Helyxzion technology, anyone with a background knowledge of molecular biology will be able to grasp the sense of sequences easily.&lt;br /&gt;When DNA is described as a language, information (encompassing both content and structure), a code, a text and a chemical structure, all at the same time, the lay reader is overwhelmed with an impression of impact, meaning, prominence, significance and seriousness, but deprived of any means to understand what exactly has been said. Helyxzion allows practitioners to explain their work in simple and easily understandable terms.&lt;br /&gt;&lt;br /&gt;Moreover, Helyxzion eliminates confusion and misconceptions in genetics. The potential of genetics is achieved by emphasizing the power of the language and also that of the scientist analyzing it. "Reading, from cover to cover, the first draft of this 'Book of Life'" is exactly what scientists are now capable of doing, rather than trying to infer some meaning from small individual chunks of text.&lt;br /&gt;The powerful idea that the essence of life is a DNA sequence that scientists are about to read "from cover to cover" means that DNA can be analyzed and manipulated by the scientists, who are therefore taking part in human evolution. Uses of this technology include prenatal genetic diagnosis, gene patenting, the use of genomic markers to predict predisposition to disease, and the use of DNA to identify individuals. Scientists should not use this technology indiscriminately or in an exaggerated way. 
As our parents used to tell us when we were children: "Watch your language!"&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-113103935686307718?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/113103935686307718/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=113103935686307718' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/113103935686307718'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/113103935686307718'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/11/code.html' title='the code'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-113103345192431248</id><published>2005-11-03T07:57:00.000-08:00</published><updated>2006-02-25T08:43:09.003-08:00</updated><title type='text'>post-genome discovery</title><content type='html'>HELYXZION AND POST-GENOME DISCOVERY&lt;br /&gt;The DNA of a cell contains all the instructions necessary to recreate life. As such, the sequence of a genome's DNA provides a form of information transfer with its own alphabet (i.e., nucleotides), words (i.e., codons), and sentences (i.e., genes). 
Thus, the effort to decode the meaning of a DNA sequence is an exercise analogous to cryptography, which seeks to derive meaning from a collection of seemingly randomly recurring symbols.&lt;br /&gt;Previous deciphering efforts have been basic and focused on the immediate meaning of a focal sequence. This is akin to the translation of a text on a word-by-word basis. As we advance in this understanding, we start to see higher order meaning through the nuances of gene expression and splice changes. Moreover, the structure and organization of the DNA sequences within and across species provide a clue as to the fundamental rules that governed the creation of life and the understanding of DNA as a true language.&lt;br /&gt;Linguistics is a branch of science that has long sought to define the architecture and laws of language structure. There is ample evidence to indicate that both the dimensions and units of linguistic structure appear genetically embedded in the human species. Therefore, the analysis of the structure of language has provided a window into the make-up of the Homo sapiens mind, and perhaps a set of useful strategies to unearth similar structures.&lt;br /&gt;Experimentally, therefore, both the disciplines of genomics and linguistics seek to uncover order and information from a sea of noise. Genomics, by virtue of its origins in physical and biological sciences, has had the benefit of rigorous computational tools and laboratory validation in its investigations. 
Unlike genomics, however, the intuitive understanding of language in all of us permitted linguists to convincingly reconstruct rules governing the transmission of higher order meaning, while unlike cryptography, genomics can use experimental strategies to uncover the relation between form and meaning.&lt;br /&gt;The "Helyxzion" Language of DNA will explore the investigative strategies used by these diverse fields of genomics and linguistics in identifying meaning from recurrent strings of information in a multidisciplinary manner touching on linguistics, genomics, computation and molecular biology. The goal will be to synthesize novel conceptual approaches to uncovering higher order meaning from DNA sequence information, to develop a deeper understanding of DNA as a language and explore the possibility of forging novel investigative strategies in genomic research.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-113103345192431248?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/113103345192431248/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=113103345192431248' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/113103345192431248'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/113103345192431248'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/11/post-genome-discovery.html' title='post-genome discovery'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-113103312950003439</id><published>2005-11-03T07:51:00.000-08:00</published><updated>2005-11-03T07:52:09.520-08:00</updated><title type='text'>potential impact</title><content type='html'>Helyxzion's Potential Impact&lt;br /&gt;Strategic impact&lt;br /&gt;By the early 1970s, a puzzling set of observations had emerged: mRNAs in the cytoplasm were found to be much shorter than their nuclear RNA counterparts. Researchers found the answer in 1977 with the discovery of RNA splicing. Most mRNAs of higher eukaryotes (including humans) are encoded in the genome in a hyphenated form. Blocks of protein-coding sequences, called exons, are separated by intervening sequences, called introns. The primary transcript is a copy of the entire gene, containing introns as well as exons. The introns are then removed in a process called pre-mRNA splicing. More than 96% of the human genome consists of non-coding sequence, much of it introns, but researchers in the US and Europe are now sure that it holds a lot of information; it is a puzzle whose solution depends on understanding splicing and the so-called alternative splicing process. Splicing also offers cells an opportunity to “customize” gene products to meet different specialized needs. Alternative splicing is a mechanism in which the use of different splice donor and/or acceptor sites allows for the production, from a single gene, of multiple mRNAs with partly overlapping genetic information, each encoding a distinct polypeptide. 
It is the basic mechanism by which different tissues produce specialized forms of particular proteins, which makes it so important for understanding diseases like Breast Cancer, Muscular Dystrophy, Friedreich Ataxia and so on.&lt;br /&gt;In addition to these important cell processes, this proposed project offers a strategic scheme to describe two very basic systems (alternative splicing and NMD, nonsense-mediated mRNA decay) at a point of intersection between cell signalling pathways, called apoptosis. This field is called the scientific eye in cellular biology, because it carries at least some parts of the secret of cell structure, cell cycle, and, of course, human diseases.&lt;br /&gt;• Coordinate key researchers of different disciplines in this project network&lt;br /&gt;• Focus research on fundamental biological processes of cells&lt;br /&gt;• The project carries many parts of the secret of cell structure, cell cycle and human diseases&lt;br /&gt;• Use Breast Cancer and Muscular Dystrophy as tissue models for studying fundamental biological processes&lt;br /&gt;• Obtain answers on the functions and molecular mechanisms leading to the disease&lt;br /&gt;• Develop databases for scientists, physicians and patients&lt;br /&gt;The project will help to establish a critical mass of potential excellence by networking the capacities present in different Member States and by attracting to Europe the best researchers from the rest of the world.&lt;br /&gt;The European Research Area (ERA) was proposed by the Commission in January 2000.&lt;br /&gt;It should be focused on specific themes that are strategically important to Europe’s future. The themes have been devised in the light of political debate, expert advice, and public consultation. 
They are not structured from the starting point of traditional research disciplines, but as strategic themes that will be achieved through combinations of scientific disciplines.&lt;br /&gt;Our project is qualified to create a European centre of excellence through the collaboration of leading laboratories and companies in the respective fields, located in Vienna, Cambridge, Leiden, Berlin, Budapest, Jerusalem.&lt;br /&gt;We coordinate key researchers of different disciplines from different geographical areas and countries in this project network. Such high quality networks stimulate creativity and make Europe more attractive for top researchers. This will have an important impact on strengthening European competitiveness.&lt;br /&gt;Technology Impact&lt;br /&gt;The project’s consortium integrates unique technologies and systems to investigate three important functional genetic mechanisms in general cell biological processes and their influence, especially on Breast Cancer and Muscular Dystrophy. The dissemination of these technologies and their further development to young researchers will be achieved by an exchange of tools, know-how and researchers organized by the partners. Partners 4 and 8 are reference centres for genetic research, which have set up quality controls and standardized techniques according to French healthcare legislation. The database represents a unique tool collecting all pathway interactions that we identify within the consortium and that are reported in the literature. Further developments of this database are planned in this project.&lt;br /&gt;The development of crystallization and X-ray structure analyses (UA) of proteins and protein interactions represents a new technology currently not pursued in any other lab worldwide. Furthermore, the proposed development of a posttranslational apoptosis array using lipid-coated surfaces is a highly innovative approach that will promote future studies on different diseases. 
These technologies, uniquely available in our consortium, will promote the development of further disease models for testing therapeutic strategies. Our industrial partner Biochemicon is uniquely positioned to move rapidly and cost-effectively from a described cell process to the identification of drug targets and of drugs effective in reversing a pathological phenotype, through its new bioinformatic tools and throughput pharmacology screening platform. The idea of developing a specific intracellular carrier for studying NMD and alternative splicing in diseases like BC and MD will greatly advance knowledge about these diseases.&lt;br /&gt;Impact on the Regional and European Economy&lt;br /&gt;Specific project outputs (theranostic tools) address large market areas of the world economy, in vitro diagnostics and health care. The total world market for molecular diagnostics (MDx) was approximately 1.5 billion euros in 2000 and is projected to grow to 4.7 billion euros by 2005 and 16.8 billion euros by 2010. The diagnostic assays we intend to develop will feed not only into Molecular Diagnostics (MDx), but also into other major market segments, e.g. Point-Health-of-Care (PHOC) testing, clinical chemistry, and theranostics. &lt;br /&gt;Splicing and alternative splicing also have an impact on key areas of market opportunity (cardiovascular, cancer, neurological and metabolic disease), thus moving into mainstream health care areas. In order to be able to exploit the commercial potential of this development for the EU, it is important for SMEs to be fully integrated in those areas of expertise needed to develop pre-commercial findings into therapeutics. The size of the final market will ultimately be dictated by the extent to which pathways and new pharmaceutical targets are found to have a role in more common multifactorial diseases and the extent to which insights from the rare monogenic diseases are informative therapeutically for the more common polygenic conditions. 
For example, while familial partial lipodystrophy in itself is rare and not a commercially attractive disease to study, the application of pharmacological tools used in normal adipose biology leads into a multibillion-euro market. By integrating the SMEs at ground level as in our project, the EU will be at the forefront of exploiting this area on the world stage in disease areas with current massive unmet needs.&lt;br /&gt;Impact on European Policy&lt;br /&gt;The ‘Lisbon’ economic objective&lt;br /&gt;The Lisbon European Council Summit held in March 2000 set an objective to ‘make the European Union the most competitive and dynamic knowledge-based economy in the world by 2010’. This has since been interpreted as requiring focused community-wide investment in research, and the improvement of innovation and entrepreneurship.&lt;br /&gt;The European Research Area&lt;br /&gt;The European Research Area (ERA) was proposed by the Commission in January 2000. It has since been endorsed by the Heads of State and Government, and is now the major research policy debate in Europe.&lt;br /&gt;Commissioner Philippe Busquin, the architect of the ERA, described it as “The development, at European level, of an area for the coherent and co-ordinated pursuit of research activities and policies, and an area in which researchers and knowledge move freely will encourage the expression of European excellence in several ways: First, by making it possible to establish a 'critical mass' of potential excellence, by networking the capacities present in different Member States, particularly through intensive use of information and communication technologies. Second, by releasing people and teams from the protection of national barriers, thus introducing competition and increasing the general level of excellence. 
Third, by attracting to Europe the best researchers from the rest of the world, in the same way that American campuses are currently attracting researchers.”&lt;br /&gt;Our proposal accords well with points 1, 2 and 3: our project partners represent a multi-member community of Europe, they have excellent experience in their specific fields, and the problem our project addresses is fundamental to many different areas. Finding out more about it will therefore have a huge impact on different research areas and will attract the best researchers from the rest of the world. The European Commission decided on 25 February 2004 on the work plan for the implementation of the programme of Community action in the field of public health (2003 to 2008) (2004/192/EC). The general objectives of this programme are: i) to improve information and knowledge for the development of public health;&lt;br /&gt;ii) to enhance the capability of responding rapidly and in a coordinated fashion to health threats; and iii) to promote health and prevent disease. Our project accords well with points i) and iii) by aiming at identifying molecular disease mechanisms, identifying drug targets and developing therapeutic applications for genetic diseases, as well as offering an information platform for physicians to diagnose such diseases at early disease stages.&lt;br /&gt;Impact on Regional and European Environment&lt;br /&gt;It is particularly important in a new and rapidly developing health care-relevant field to reassure the general public that the technology and therapies to be developed have no negative effects on their safety and well-being and as few negative effects as possible on the environment. In the present project, toxicological and biocompatibility considerations relating to project outputs are of primary concern. 
The developed diagnostic targets will support studies on novel therapeutic drugs, permitting the evaluation of the efficacy/toxicity of a candidate.&lt;br /&gt;Commissioner Philippe Busquin, the architect of the ERA, described one of the most important impacts as “Attracting to Europe the best researchers from the rest of the world, in the same way that American campuses are currently attracting researchers.”&lt;br /&gt;Social Impact (jobs, education, quality of life)&lt;br /&gt;Education: It is our intention to inform and educate society with respect to project outputs (knowledge, therapies, drugs) through communication with all interest groups at all educational levels. This will be accomplished via a variety of routes: hard copy publications (news media, scientific meetings, scientific and popular publications), and the construction and maintenance of a project-related website. The results obtained will also support an internationally competitive high-level academic education in the respective fields.&lt;br /&gt;Quality of Life:&lt;br /&gt;The project’s major and ultimate goal, the identification of drug targets and drugs for therapeutic applications in patients, is clearly directed towards improving the quality of life of patients by reducing some or most of the pathological phenotypes that often pose major risks to life. While the potential treatment strategies for these diseases are not clearly foreseeable at the moment and will depend on the results of our project, pathogenic secondary defects can already be treated with existing drugs that have already been tested in patients. However, our attempts to follow the efficiency of therapies by developing and using tools are expected to allow further adjustments of the therapy protocol for specific patient groups, thus increasing the efficiency of therapies and reducing the risks. This will of course have a major impact on the quality of patients’ lives. 
Furthermore, finding new approaches for the treatment of BC and MD will broaden and improve strategies for treating other diseases. This project will also contribute indirectly to the healthcare of EU citizens through the development of improved and more cost-effective methodologies for the identification of inherited genetic diseases and cellular analysis for treatment.&lt;br /&gt;Finally, the benefit of our project lies not only in the development of improved diagnosis and treatment of diseases, leading to improved well-being of patients, but also in reducing subsequent suffering at the individual or societal level (e.g. loss of income due to sick leave, reduced ability to participate in recreational activities, and poorer general mental health). Employment market: Through the integration of SMEs and the hiring of research personnel, the project will create new jobs. &lt;br /&gt;Exploitation and dissemination of results&lt;br /&gt;The project’s consortium integrates all relevant competencies to address innovation-related aspects, such as technology transfer, intellectual property rights (IPR), clinical trials, etc., ensuring optimal use of the generated knowledge. We have reserved more than 40% of the budget for a group of cutting-edge research SMEs. They will develop new technologies for studying cell function, drug testing and drug target identification with minimal amounts of chemicals, and thus highly cost-effectively, which is particularly relevant for studying rare diseases. New findings will be commercially exploited through patenting and licensing. All tools and reports published by the teams in the consortium will be pre-screened by an IPR advisory team collaborating with the management group of our consortium (Multi Lingual). &lt;br /&gt;Contributions to standards&lt;br /&gt;Standardisation and quality are an important issue in this project. 
Genetic screening of patient material performed at Inserm has to follow standardized techniques according to French healthcare legislation. &lt;br /&gt;Of course, research and strategies in natural science, therapeutic treatment and evaluation of therapy responses have to be carried out in a standardized manner. Therefore, the consortium shall form a Standards Group of individuals from the partner institutions with experience in ‘complex organisation standardisation’ of working practices, manufacturing processes and product quality. Its responsibility shall be twofold.&lt;br /&gt;• By direct monitoring of workpackages and production of written specifications of working practices and evaluation procedures, it will ensure standardisation of processes and outputs from the project and therefore transferability. This will enhance Partner confidence in project outputs in terms of their quality and ultimately the success of the project as a whole.&lt;br /&gt;• It will promote EU standardisation issues by supporting existing standardisation legislation and promoting the introduction of new codes where none currently exist or are being developed.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-113103312950003439?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/113103312950003439/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=113103312950003439' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/113103312950003439'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/113103312950003439'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/11/potential-impact.html' 
title='potential impact'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-113027299028462193</id><published>2005-10-25T13:42:00.000-07:00</published><updated>2005-10-25T13:43:10.293-07:00</updated><title type='text'>Politics of Medicine</title><content type='html'>After the many years of research in genetics and health care, I have come to realize the barriers set forth by vested capitalist interests in the health field. After all, the name of the game in business is to make money, and in order to make money in the health field it is necessary for people to be sick. If people lead healthy, vital lives with minimum sickness, the shareholders in the various health-care industries are not going to be happy — and they are going to take their money elsewhere.&lt;br /&gt;I like to use the example of the plant which grows anywhere, and when eaten, cures all diseases known to man. What happens when the discovery of this plant is made public? The medical establishment flexes its considerable public relations muscle, goes on television, gets into national news magazines and local newspapers and assures the public that this "cockamamie" new plant is a hoax. The public, unaccustomed to examining evidence and thinking for itself, automatically believes the medical establishment, and laws are passed to outlaw and eradicate this plant.&lt;br /&gt;Meanwhile, a few researchers and other people who think for themselves grow this new plant, try it out for themselves, and find it to be as effective as advertised. 
Some of these researchers risk everything to offer this new treatment to the public. Many of them lose their jobs, because they will not back down from the medical establishment.&lt;br /&gt;If you are going to live your life with maximum health and vitality, it is essential that you understand this process and resolve to think for yourself regarding your health. Here is the essential truth: the desire for profits by the medical and pharmaceutical companies distorts the truth. Few people, including physicians, are able to see through these distortions.&lt;br /&gt;In the exquisite process of life, the normal state is total health, but at some point, due to bad diet or exposure to microbes or toxins, you are likely to get sick. You go to the doctor and get a prescription for an appetite suppressant and some incorrect, parent-like advice about diet and exercise. Meanwhile, you may notice that the doctor also is a little round and not looking like the picture of vitality.&lt;br /&gt;In my opinion, allopathic researchers are doing far more harm than good, in the long run. Why do I say that? Let us look at the allopathic model of medicine. The allopathic paradigm is that the body is perfectly normal until symptoms of a distinct disease appear. The cause of that disease is thought to be one thing, and all you have to do is identify that one thing and give the one best antidote to return the patient to perfect health.&lt;br /&gt;So, if there is an infection, the task is to identify the bacteria or other organism causing the infection and kill it with a chemical, usually an antibiotic. Or, if the problem is with the immune system, the cause must be one thing, perhaps a virus. So kill the virus or, if that is not possible, develop a vaccine against the virus and protect people who do not yet have the virus. If the offending agent is one of the patient's own organs (the stomach, colon, or gallbladder, for example), pour a chemical on it. 
If that doesn't work, cut it out, throw it away and pronounce the patient cured! This is not really curing anything; it's just treating the symptom!&lt;br /&gt;It may be that biology is a bit more complex than the allopathic, one-cause, one-cure paradigm would have it. It may be that the causes of today's illnesses, especially the ones which are epidemic (vascular disease, cancer and immune dysfunction), are multi-factorial: from many different insults to your genes.&lt;br /&gt;The average person in America is exposed to 500 foreign chemicals each day. Each day! This is an assault on the human body without precedent in history. I am not amazed that so many people are sick; I am amazed that so many people are well.&lt;br /&gt;Nevertheless, the dominant medical model today is the allopathic model, so even though the supposed single causes of most diseases have not been discovered, there still is the assumption that there is one cause, and there should therefore be one treatment. But what if the cause is genetic?&lt;br /&gt;Let us say that you have been exposed to 500 chemicals each day for years, your immune system is somewhat depressed by having been insulted by molecules foreign to the human body, and you develop bronchitis. Then your doc gives you a prescription for an antibiotic, another foreign chemical. The antibiotic kills the bacteria causing your bronchitis. Success, right? Maybe not. Maybe the antibiotic to which you have just been exposed (in an amount larger than the amount of all the other foreign chemicals to which you have been exposed in the last two months combined), while having killed the bacteria, has also insulted your immune system even more, making it more likely for you to develop another infection, perhaps in a different part of your body. So you solve this problem with another antibiotic, with the same result.&lt;br /&gt;A similar situation exists with the cancer patient. 
Perhaps the cancer is the result of multiple insults to the body, but then the doctor says, "Here, take this stuff; it will kill your cancer, if it doesn't kill you first."&lt;br /&gt;Picture the heart patient with cardiovascular disease from who-knows-what-they-put-in-it, which he or she has been eating for years, and the doctor says, "Take some of this stuff. If you survive the side effects, your heart may work better." What kind of solution is that?&lt;br /&gt;Or, picture the AIDS patient. Diagnosis made. Patient assured of the single cause of AIDS: the HIV-1 virus. Here, take this AZT. Patient looks at the package insert which comes with his AZT and sees that the possible side effects of AZT look like a description of autoimmune deficiency. How much sense does it make to take a medicine which can produce the disease it is supposed to treat?&lt;br /&gt;Contrary to popular opinion, cancer and AIDS survival rates are no better with chemical treatments than they are without. They are different, however: the doctor has something to do and the pharmaceutical industry is getting rich.&lt;br /&gt;The best hounds barking up the wrong tree will never catch the cat. Can it be that we are barking up the wrong tree? What if all diseases have multiple causes and predisposing genetic factors? How wise are we to ignore those factors and take another chemical? Is it possible that we are producing more, rather than less, disease, in the long run?&lt;br /&gt;On the other hand, why wait? Why not do these things now? When you learn you have cancer, or any other degenerative disease, that is the day you will want the address of this web site. It does not have to happen to you either. 
Your health is in your hands.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-113027299028462193?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/113027299028462193/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=113027299028462193' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/113027299028462193'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/113027299028462193'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/10/politics-of-medicine.html' title='Politics of Medicine'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-112566953530880922</id><published>2005-09-02T06:58:00.000-07:00</published><updated>2005-09-02T06:58:55.316-07:00</updated><title type='text'>press release june 26</title><content type='html'>July 26, 2005&lt;br /&gt;Revolutionary new bioinformatics-based viewer, developed by Helyxzion, allows scientists much greater control over scope of DNA analysis. 
With scale no longer an issue, scientists have a tool to dramatically advance genetic analysis.&lt;br /&gt;&lt;br /&gt;Without clear sight, problems remain unsolvable. Until now, looking at DNA has been tantalizing, but ultimately, a tease. Helyxzion's new v.3.0 Pro Anvil Viewer™, revolutionary software produced by Helyxzion and sold by Helyxzion/Biochemicon (Biochemicon is the European Representative of Helyxzion LLC and the NBI nano, bio, info–tech part of this holding company), presents scientists with a tool designed to bring a new level of clarity and precision to DNA studies. The new viewer allows scientists to understand DNA sequences which have heretofore been considered just nonsense. The software and the theory behind it are the work of Charles Stevens, a respected biologist. Walter Battistutti (CEO of Biochemicon), chief of the Nanotech advisory board and vice president of Helyxzion, said, "By making sense of nonsense, scientists hope to see, for the first time, the protein sequences behind maladies such as cancer and innumerable common genetic disorders." Bioinformatics is a branch of biology dedicated to mathematically decrypting the genetic code. The field did not bear fruit until the human genome was successfully and sequentially mapped. After that, real results began to emerge. Building on this monumental breakthrough, scientists developed a Helyxzonic model and were well on the way to producing software capable of letting researchers translate the language contained in all of human DNA. A breathtaking set of clinical trials showed the software could unerringly depict protein structures of genes, reveal dominant and recessive genetic characteristics, compare multiple code strings quickly and describe DNA, RNA and amino acid relationships. 
The viewer is web-based and allows a biologist to upload a string of DNA code and analyze it at differing scales. While conventional theories suggest that only three percent of DNA contributes to the protein basis of human life, scientists using the viewer can begin to see new combinations and how they contribute to human life and to disease. According to Battistutti, researchers can catalogue new combinations within individual genes that may provide insight into the protein basis of many common disorders, such as cancer. With greater insight may come more effective treatments. The first version of the viewer is now available. Based on input from users worldwide, a second version is being developed.&lt;br /&gt;&lt;br /&gt;Helyxzion Software Poised to Unlock the Code Obscuring the Elusive Mysteries of Human Life&lt;br /&gt;July 26, 2005 (PRWEB via &lt;a href="http://www.prwebdirect.com/"&gt;PR Web Direct&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;About Biochemicon:&lt;br /&gt;BCC is the European representative of HELYXZION LLC. Building on the identification of the “Language of DNA”, BCC is one of the European leaders in DNA- and protein-based nanotechnology and bioinformatics. Biochemicon was founded in October 2000 by a highly motivated team of English, American and Austrian scientists to use DNA not only in its conventional strands but also to exploit its enormous variability, which offers ample scope for designing molecules. In the last year, BCC developed a bio-machine for nano-detection path processing in the human body. It is usable for monitoring body conditions in space shuttle projects. This year, we succeeded in developing the first Crohn's disease diagnostic kit on the world market. In 2005, Helyxzion LLC and Biochemicon found many synergies and grew together to form a holding company.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-112566953530880922?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/112566953530880922/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=112566953530880922' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/112566953530880922'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/112566953530880922'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/09/press-release-june-26_02.html' title='press release june 26'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111902107128223603</id><published>2005-06-17T08:10:00.000-07:00</published><updated>2006-02-25T08:43:21.453-08:00</updated><title type='text'>latest helyxzion discovery</title><content type='html'>Helyxzion, the technology of genetic discovery:&lt;br /&gt;Identification of activated intron sequences of Chromosome 13 with Helyxzion, the Language of DNA™&lt;br /&gt;&lt;br /&gt;Dr. Charles Stevens, Helyxzion, Wisc., US; W.B. Battistutti, Ph.D., and M. Missuraca, Ph.D., Biochemicon &amp; Cambridge Univ., UK; Sonja Vogel, MD, General Hospital Vienna, Austria&lt;br /&gt;&lt;br /&gt;However, since all cancers are based on genetic mutations in body cells, whether they are inherited or triggered by aging or environmental factors, studies on cancer genetics can lead to improved diagnosis and treatment.&lt;br /&gt;&lt;br /&gt;While scientists reporting in PNAS have not yet identified a third BRCA gene, they have succeeded in pinpointing its probable location to chromosome 13, in an interval of about five million base pairs. This is the same chromosome that also contains the previously identified BRCA2 gene, discovered in 1995. 
(BRCA1, discovered in 1994, lies on chromosome 17.)&lt;br /&gt;Mutations of BRCA1 and BRCA2 impair the body cells’ production of tumour suppressor proteins.&lt;br /&gt;&lt;br /&gt;We were able to analyse the whole sequence of Chromosome 13, including all intron regions, with the “Helyxzion Viewer” and detect a sequence in the “non-coding part” that is involved in the mutation regulation of BRCA2 and could explain why scientists are looking for BRCA3 and have not found it until now.&lt;br /&gt;&lt;br /&gt;With knowledge of the complete translation of Chromosome 13 (introns included), we developed an array-chip system for monitoring the condition of chromosome 13 and are offering a new system for the detection of gene alterations.&lt;br /&gt;&lt;br /&gt;Correspondence:&lt;br /&gt;&lt;a href="http://www.biochemicon.org/"&gt;http://www.biochemicon.org/&lt;/a&gt; &lt;a href="mailto:w.b.battistutti@biochemicon.org"&gt;w.b.battistutti@biochemicon.org&lt;/a&gt;&lt;br /&gt;&lt;a href="http://www.helyxzion.com/"&gt;http://www.helyxzion.com/&lt;/a&gt; &lt;a href="mailto:chs@helyxzion.com"&gt;chs@helyxzion.com&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111902107128223603?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111902107128223603/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111902107128223603' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111902107128223603'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111902107128223603'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/06/latest-helyxzion-discovery.html' title='latest helyxzion 
discovery'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111549130483028994</id><published>2005-05-07T11:40:00.000-07:00</published><updated>2005-05-07T11:41:44.836-07:00</updated><title type='text'>NANOTECHNOLOGY GOING TO MARS</title><content type='html'>2005 Helyxzion Abstract: Nanotechnology to the Moon &amp; Mars&lt;br /&gt;Helyxzion’s work on Nanotechnology and its potential aerospace applications has taken great strides towards understanding, visualizing, and controlling matter at the atomic scale. In particular, substantial progress has been made towards the construction of molecular computers. Some progress has been made towards understanding biological molecular machines and manipulating these machines for technological purposes. Also, several polymeric molecules, notably proteins, DNA, and RNA, can be automatically synthesized from precise specifications. Helyxzion’s technology is particularly suited to this end. This example of "programmable matter" has been used to produce at least one molecular mechanical device. However, integration of molecular components into larger atomically precise systems has made little progress. Scaling up molecular Nanotechnology to produce macroscopic products of aerospace interest, for example launch vehicles, will require large research and development investments. In particular, self-replication, proposed as a route to macroscopic molecular Nanotechnology products, is one step closer to fruition with the use of “ANVIL” as a DNA blueprinting tool. This paper is a high-level discussion of molecular Nanotechnology and some aerospace applications. 
Applications of importance to aerospace and NASA’s Mars mission include NanoGuardians (which will remove heavy metals and other toxins from the body), computers, materials, and sensors. This review is not exhaustive, and much important and relevant work is yet to be done.&lt;br /&gt;Introduction&lt;br /&gt;Molecular Nanotechnology is the three-dimensional structural control of materials, processes and devices at the atomic scale. "The problems of chemistry and biology can be greatly helped if our ability to see what we are doing, and to do things on an atomic level, is ultimately developed, a development which I think cannot be avoided." Atomically precise control of matter is progressing rapidly. A particularly dramatic example was the use of a scanning tunneling microscope to write the characters "IBM" by manipulating xenon atoms on a nickel surface. While a meaningful achievement, this will not prove to be the way in which Nanotechnology controls the fantastic complexity of atomic-scale matter. That will almost certainly require "programmable matter": DNA at the atomic scale, products that are created and/or controlled by computer programs (Helyxzion). Current examples include protein, RNA, and DNA synthesis from an exact specification of the sequence. Beyond today's state of the art lie molecular machines, although a few biological molecular machines have been studied, synthesized, and used in laboratory settings. These technologies should suffice for the production of microscopic products. Producing macroscopic objects of aerospace interest will require some mechanism to scale products up in size. Biological systems use reproduction to produce large objects, such as whales and redwood trees, starting with single cells or small seeds. The construction of self-replicating programmable machines, while extraordinarily difficult and dangerous, should enable dramatic improvements in aerospace systems. 
Helyxzion’s technology is the only one that offers these capabilities.&lt;br /&gt;Any molecular Nanotechnology must be based on chemistry, and the field has taken a number of directions. Organic chemists have produced a wide variety of small structures, including testable two-junction computer devices. Biotechnology has been used to create a wide variety of systems, including 2D crystal patterns of DNA, modified copies of biological molecular motors, and covalently bonded molecular tubes with precise radius. Fullerene Nanotechnology development has produced transistors and diodes, and a wide variety of theoretical studies have examined the properties of many other potential devices, including Fullerene gears, bearings, and three-junction electrical devices.&lt;br /&gt;Progress in Nanotechnology can be reasonably expected to enable radical improvement in a wide variety of aerospace systems and applications. Computer technology will probably be the first to feel the Nanotechnology revolution, with substantial advantages to the aerospace industry. Theoretical and numerical studies suggest that 10&lt;sup&gt;18&lt;/sup&gt; MIPS computers and 10&lt;sup&gt;15&lt;/sup&gt; bytes/cm&lt;sup&gt;2&lt;/sup&gt; write-once memory are possible. It may also be possible to build safe, affordable vertical take-off and landing aircraft to replace personal automobiles and eliminate the need for most roads.&lt;br /&gt;The development of Nanotechnology is important for the exploration and future settlement of space. Current manufacturing technologies limit the reliability, performance, and affordability of aerospace materials, systems, and avionics. Nanotechnology has enormous potential to improve the reliability and performance of aerospace hardware while lowering manufacturing cost. For example, nanostructured materials that are perhaps 100 times lighter than conventional materials of equivalent strength are possible. 
Embedding nanoscale electromechanical system components into Earth-orbiting satellites, planetary probes, and piloted vehicles could potentially reduce the cost of future space programs. The miniaturized sensing and robotic systems would enhance exploration capabilities at significantly reduced cost. Thousands to millions of such miniaturized devices could help map a planet in a single launch.&lt;br /&gt;Launch costs might be reduced significantly using nanotechnology. In the extreme case, it has been estimated that a four-passenger single-stage-to-orbit launch vehicle weighing only three tons could be built using a mature diamondoid nanotechnology. A more conservative estimate is $153-412 per kilogram launched to low Earth orbit, assuming existing single-stage-to-orbit vehicle designs but using diamondoid rather than conventional materials. Nanotechnology itself, atomic-scale control and imaging, programmable matter, molecular machines, and bio-nanotechnology replication are some of the major challenges and opportunities ahead for Helyxzion.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111549130483028994?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111549130483028994/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111549130483028994' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111549130483028994'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111549130483028994'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/05/nanotechnology-going-to-mars.html' title='NANOTECHNOLOGY GOING TO MARS'/><author><name>Dr.Charles 
Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111533121419008793</id><published>2005-05-05T15:12:00.000-07:00</published><updated>2005-05-05T15:13:34.196-07:00</updated><title type='text'>HELYXZION AND SECURITY</title><content type='html'>HELYXZION AND USER SECURITY:&lt;br /&gt;Most comparisons between the Patriot Act and McCarthyism are NOT overblown rhetoric. American scientists know a lot about the way scientific research was affected by security concerns in the Cold War. If you don’t wish to see the specter of McCarthy again, listen up. New restrictions, and those on the horizon, may pose difficulties for contemporary biology that are far more chilling than those that beset early Cold War physics. The current constraints on foreign students and visitors in the name of national security have already had serious unintended consequences for American science, engineering, and medicine.&lt;br /&gt;Censorship of sensitive unclassified research threatens worse effects by menacing open communication in numerous biomedical areas, including Helyxzion’s study of diseases and the immune system. It could thus threaten Helyxzion’s ability to engineer therapies and cures, and it has placed the very competitiveness of the nation’s biotechnology industry in peril.&lt;br /&gt;Helyxzion’s biotech customers face more difficult obstacles than physicists did in the Cold War. Then, a scientist was made suspect by his or her political affiliations. In contrast, what makes a scientist suspect today is nationality, which is difficult to modify, or ethnicity, which is unchangeable. 
There is no appeal against the denial of access to selected biological agents on the basis of nationality; it applies absolutely without exception. Visa delays and denials have already interfered with or caused the cancellation of important international conferences, disrupted careers, and slowed research projects, such as work on anti-HIV drugs, a vaccine against the West Nile virus, and sensors to detect biowarfare agents.&lt;br /&gt;HELYXZION was incorporated just 90 days after 9/11, and from day one security was a number-one concern in the development of the technology. Today it is one of the only secure bio-nanotech software products on the market, and the next release, PRO ANVIL v4.0, will incorporate many new levels of security for our users, keeping HOMELAND SECURITY home and out of your research lab.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111533121419008793?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111533121419008793/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111533121419008793' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111533121419008793'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111533121419008793'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/05/helyxzion-and-security.html' title='HELYXZION AND SECURITY'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111531429460404633</id><published>2005-05-05T10:29:00.000-07:00</published><updated>2005-05-05T10:31:34.610-07:00</updated><title type='text'>helyxzion discovery and the post-genome knowledge</title><content type='html'>HELYXZION DISCOVERY AND THE POST-GENOME KNOWLEDGE OF DNA (January 2005)&lt;br /&gt;Jointly organized by Helyxzion and its Consortium’s Joint Venture Partners&lt;br /&gt;The DNA of a cell contains all the instructions necessary to recreate life. As such, the sequence of a genome's DNA provides a form of information transfer with its own alphabet (i.e., nucleotides), words (i.e., codons), and sentences (i.e., genes). Thus, the effort to decode the meaning of a DNA sequence is an exercise analogous to cryptography, which seeks to derive meaning from a collection of seemingly randomly recurring symbols.&lt;br /&gt;Previous deciphering efforts have been basic and focused on the immediate meaning of a focal sequence. This is akin to the translation of a text on a word-by-word basis. As we advance in this understanding, we start to see higher-order meaning through the nuances of gene expression and splice changes. Moreover, the structure and organization of the DNA sequences within and across species provide a clue as to the fundamental rules that governed the creation of life. Mathematical algorithms provided an insight into DNA that no other method could. In that search, Helyxzion’s discovery of the basic algorithm of DNA, which led to the full and complete deciphering of the language of DNA, is the greatest achievement since Watson and Crick discovered DNA’s structure.&lt;br /&gt;Linguistics is a branch of science that has long sought to define the architecture and laws of language structure. 
There is ample evidence to indicate that both the dimensions and units of linguistic structure appear genetically embedded in the human species. Therefore, the analysis of the structure of language has provided a window into the make-up of the Homo sapiens mind, and perhaps a set of useful strategies to unearth similar structures.&lt;br /&gt;Helyxzion therefore uses both mathematics and linguistics in seeking to uncover order and information from a sea of genomic noise. Genomics, by virtue of its origins in the physical and biological sciences, has had the benefit of rigorous computational tools and laboratory validation in its investigations. Unlike genomics, however, the intuitive understanding of language in all of us has permitted linguists to convincingly reconstruct rules governing the transmission of higher-order meaning, while, unlike cryptography, genomics can use mathematical strategies to uncover the relation between form and meaning.&lt;br /&gt;Helyxzion’s Language of DNA will explore the investigative strategies used by the diverse fields of math, biotechnology, genomics and linguistics in identifying meaning from recurrent strings of information. A series of short talks will be presented in a multidisciplinary manner, touching on linguistics, genomics, computation and molecular biology. The goal will be to show new mathematical approaches to uncovering higher-order meaning from DNA sequence information. 
These presentations will be followed by extensive discussions with the audience, aimed at deepening mutual understanding and exploring the possibility of forging novel investigative strategies in genomic research.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111531429460404633?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111531429460404633/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111531429460404633' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111531429460404633'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111531429460404633'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/05/helyxzion-discovery-and-post-genome.html' title='helyxzion discovery and the post-genome knowledge'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111530706620177359</id><published>2005-05-05T08:31:00.000-07:00</published><updated>2005-05-05T08:31:06.253-07:00</updated><title type='text'>HELYXZION abstract 3</title><content type='html'>&lt;a href="http://www.helyxzion.com/"&gt;Helyxzion: The Language of DNA&lt;/a&gt;&lt;br /&gt;HELYXZION&lt;br /&gt;Abstract: 3&lt;br /&gt;"HELYXZION Actuarial Modeling of the Cost Implications of Genome Research"&lt;br /&gt;With the HELYXZION mapping of the human genome, individuals may soon have access to 
specific information regarding their susceptibility to genetically influenced disorders. Insurance companies without access to this information risk insuring a disproportionate number of affected policyholders, leading to significant financial losses and a possible destabilization of the insurance industry. In cases where a single gene promotes susceptibility to a disorder, HELYXZION modeling techniques can be used to cure genetic disorders.&lt;br /&gt;&lt;br /&gt;"A HELYXZION Statistical Analysis"&lt;br /&gt;In this talk we will explore trends and characteristics of advertising logos for political campaigns current in April 2005. Examples were randomly found on the Internet using common search engines. A sample of 100+ HELYXZION graphics was selected and analyzed for various graphic and design elements, as well as content and color selection. A statistical analysis of this sample was performed. We will present the results and conclusions of this analysis.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;HELYXZION and the DNA Set&lt;br /&gt;Everyone has seen a HELYXZION. It’s the genetic sequence every living thing has, and people too. Many people have heard the term, but few know what a HELYXZION is.&lt;br /&gt;Although a strict definition is hard to come by, a HELYXZION is not hard to describe. Anything that contains "self-similar" images is a HELYXZION. For example, the human circulatory system is a HELYXZION. If you look at the blood vessels in your hand, they resemble the overall shape that the complete system takes on. And since most occur naturally, we find we are surrounded by HELYXZIONs. This talk will give a brief description of HELYXZION, define the HELYXZION DNA Set, prove that it is a HELYXZION with zero length, and that it is closed, bounded, totally disconnected, and perfect. We will also discuss its HELYXZION dimension. 
&lt;br /&gt;&lt;br /&gt;"HELYXIZON AND THE Biomedical Data INTERFACE Processing: HELYXZIONICGene Expression Profile Clustering" &lt;br /&gt;A study of mathematical approaches for biomedical data processing is presented. The presentation is focused on the clustering methods for the analysis of gene expression profiling data obtained from the micro-array technology, and on showing the importance of mathematical foundations in the clustering techniques. &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;"Classification by Nonlinear HELYXZION" &lt;br /&gt;A new method based on nonlinear integral projections for classification is presented. The contribution rate of each combination of the feature attributes, including each singleton, toward the classification is represented by a HELYXZION measure. The nonadditivity of the HELYXZION measure reflects the interactions HELYXZION HELYXZION among the feature HELYXZION attributes. The weighted integral with respect to the HELYXZION measure serves as an aggregation tool to project the feature space onto a real axis optimally according to some error criterion, and the classifying attribute is properly HEXOnumericalized on the HELXaxis simultaneously that makes the classification simple. To implement the classification, we need to determine the unknown parameters, that is, the values of HELYXZION measure and the weight function. This can be done by running a special adaptive genetic algorithm on the given training data. The new classifier is tested by an artificial training data set as well as several biological and medical data sets. It compares favorably with other existing classifiers on some well-known real-world benchmarks. 
&lt;br /&gt;SETTING THE STANDARD BY WHICH ALL BIOINFORMATION, GENOMIC OR NANOTECHNOLOGY, IS READ AND UNIVERSALLY UNDERSTOOD AND USED&lt;br /&gt;&lt;br /&gt;This talk will report briefly on the classical HELYXZION Theory in arbitrary BIO-spaces, introduce the notion of HELYXZION AS A LANGUAGE, then discuss and illustrate the so-called HELYXZION Sets, i.e. subsets of the real line whose normalized characteristic functions are HELYXZION Transforms of IONS &lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111530706620177359?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111530706620177359/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111530706620177359' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111530706620177359'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111530706620177359'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/05/helyxzion-abstract-3.html' title='HELYXZION abstract 3'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111517636654700199</id><published>2005-05-03T20:08:00.000-07:00</published><updated>2005-05-03T20:12:46.706-07:00</updated><title type='text'>PROOF FOR  HELYXZION ANVIL</title><content type='html'>HELYXZION: A COLLECTION OF SCIENTIFIC PROOF FOR “ANVIL”&lt;br /&gt;BY DR. 
MALCOLM SIMONS AND OTHERS&lt;br /&gt;&lt;br /&gt;Arch Virol. 2004 Jan;149(1):113-35. Epub 2003 Sep 22.&lt;br /&gt;Codon usage bias amongst plant viruses.&lt;br /&gt;Adams MJ, Antoniw JF.&lt;br /&gt;Plant Pathogen Interactions Division, Rothamsted Research, Harpenden, Herts, UK. mike.adams@bbsrc.ac.uk&lt;br /&gt;An internet database (DPVweb) was established containing details of all sequences of viruses, viroids and satellites of plants that are complete or that contain at least one complete gene (n&gt;4600). The start and end positions of each feature (genes, non-translated regions etc) were recorded and checked for accuracy. Client software was written to enable easy selection of sequences and features of a chosen virus and to analyse codon usage bias. Codon usage was analysed for each gene of one example of each fully-sequenced plant virus. There were large differences in codon preferences, related to the nucleotide composition of the genome, particularly the GC content of the third codon position. There was no effect of gene size on codon bias. Genes from the same genome usually had similar coding strategies except where constrained by the overlap of reading frames. Although some synonymous codons were consistently used with low frequency by both plants and viruses, viruses were not generally adapted to use (or avoid) those codons most frequently used by their host plants and there was no obvious association with the type of transmission. Mutational bias, rather than translational selection, appears to account for the majority of the variation detected. 
The software is available at &lt;a href="http://www.dpvweb.net/analysis/codons.php"&gt;http://www.dpvweb.net/analysis/codons.php&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;FEBS Lett. 1996 Dec 9;399(1-2):78-82. PMID: 8980124.&lt;br /&gt;Non-random usage of 'degenerate' codons is related to protein three-dimensional structure.&lt;br /&gt;Adzhubei AA, Adzhubei IA, Krasheninnikov IA, Neidle S.&lt;br /&gt;CRC Biomolecular Structure Unit, The Institute of Cancer Research, Sutton, Surrey, UK.&lt;br /&gt;We report an analysis of a novel sequence-structure database of mammalian proteins incorporating nucleotide sequences of the exon regions of their genes together with protein sequence and structural information. We find that synonymous codon families (i.e. coding the same residue) have non-random codon distribution frequencies between protein secondary structure types. Their structural preferences are related to the third, 'silent' nucleotide position in a codon. We also find that some synonymous codons show very different or even opposite structural preferences at the N- or C-termini of structure fragments, relative to those observed for their amino acid residues.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Theor Biol. 1987 Jan 7;124(1):89-95. PMID: 3657190.&lt;br /&gt;Codon usage in Homo sapiens: evidence for a coding pattern on the non-coding strand and evolutionary implications of dinucleotide discrimination.&lt;br /&gt;Alff-Steinberger C.&lt;br /&gt;
Department of Molecular Biology, University of Geneva, Switzerland.&lt;br /&gt;This study reports the analysis of codon usage in 35 complete Homo sapiens genes. Both codon frequency and inter-codon interference exhibit patterns of evolutionary interest. There is a significant positive correlation between the frequency with which a given codon is used and the frequency with which its complement is used. Since the frequency of appearance of the complementary codon on the coding strand is equal to the frequency of appearance of the original codon on the non-coding strand, in the same phase, the non-coding strand is found to resemble the coding strand in triplet composition. The same effect has been observed in Escherichia coli. This preference for the use of certain complementary triplets as codons suggests that the evolution of the use of the genetic code depended to some extent upon the double-stranded nature of the coding material. In addition, the effect of discrimination against the use of two dinucleotides, CpG and UpA, is observed in codon usage and also in adjacent codon interference. Codons beginning with G, or A, are unlikely to be preceded by codons ending in C, or U, respectively. Consideration of codon assignment in the genetic code together with the observed CpG infrequency suggests that the evolution of the code may have been influenced by conditions in which the use of CpG dinucleotides was unfavorable. The infrequent use of UpA dinucleotides can be explained as the result of frameshift mutation during gene evolution.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Evol. 
1999 Jul;49(1):36-43. PMID: 10368432.&lt;br /&gt;The nonrandom location of synonymous codons suggests that reading frame-independent forces have patterned codon preferences.&lt;br /&gt;Antezana MA, Kreitman M.&lt;br /&gt;Department of Ecology and Evolution, University of Chicago, 1101 East 57th Street, Chicago, IL 60637-1573, USA. marcos@uchicago.edu&lt;br /&gt;Biased codon usage is common in eukaryotic and prokaryotic genes. Evidence from Escherichia, Saccharomyces, and Drosophila indicates that it favors translational efficiency and accuracy. However, to date no functional advantages have been identified in the codon-anticodon interactions involving the most frequently used (preferred) codons. Here we present evidence that forces not related to the individual codon-anticodon interaction may be involved in determining which synonymous codons are preferred or avoided. We show that the "off-frame" trinucleotide motif preferences inferrable from Drosophila coding regions are often in the same direction as Drosophila's "in-frame" codon preferences, i.e., its codon usage. The off-frame preferences were inferred from the nonrandomness of the location of confamilial synonymous codons along coding regions, a pattern often described as a context dependence of nucleotide choice at synonymous positions or as codon-pair bias. We relied on randomizations of the location of confamilial codons that do not alter, and cannot be influenced by, the encoded amino acid sequences, codon usage, or base composition of the genes examined. The statistically significant congruency of in-frame and off-frame trinucleotide preferences suggests that the same kind of reading-frame-independent force(s) may also influence synonymous codon choice. 
These forces may have produced biases in codon usage that then led to the evolution of the translational advantages of these motifs as preferred codons. Under this scenario, tRNA pool size differences between preferred and nonpreferred codons initially were evolved to track the default overrepresentation of codons with preferred motifs. The motif preference hypothesis can explain the structuring of codon preferences and the similarities in the codon usages of distantly related organisms.&lt;br /&gt;&lt;br /&gt;Nucleic Acids Res. 1987 Sep 25;15(18):7581-92. PMID: 3658704.&lt;br /&gt;Periodicities in introns.&lt;br /&gt;Arques DG, Michel CJ.&lt;br /&gt;Friedrich Miescher Institut, Bioinformatic group, Basel, Switzerland.&lt;br /&gt;The sequence information for the splicing process of introns is found in the consensus sequences at the two splice sites. For long introns, of 300 or more nucleotides, the middle regions may provide additional specificity for splicing which can be investigated by defining an adequate quantitative parameter. This methodology permits to retrieve the coding periodicity in the viral and mitochondrial introns and to identify with a statistical significance, a surprising alternating purine-pyrimidine base sequence (i.e. a modulo 2 periodicity) in the eukaryotic introns, and particularly in the vertebrate introns. This alternating structure suggests that the vertebrate introns do not have the genetic information to code for proteins, they carry structural and regulatory functions.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Theor Biol. 
1987 Oct 21;128(4):457-61. PMID: 3446957.&lt;br /&gt;A purine-pyrimidine motif verifying an identical presence in almost all gene taxonomic groups.&lt;br /&gt;Arques DG, Michel CJ.&lt;br /&gt;I.U.T. de Belfort, Universite de Franche-Comte, Belfort, France.&lt;br /&gt;A statistical parameter identifies, with a high degree of significance, a motif which is present in protein-coding sequences of eukaryotes, prokaryotes, chloroplasts, mitochondria, viral introns, ribosomal RNA genes, and transfer RNA genes. The random probability of occurrence of such a situation is 10&lt;sup&gt;-12&lt;/sup&gt;. This motif has the following properties:&lt;br /&gt;(i) its significant presence in almost all present-day genes explains why it can be considered as primitive oligonucleotide,&lt;br /&gt;(ii) its nucleotide order is: YRY(N)6YRY, R being a purine base, Y a pyrimidine one and N any base,&lt;br /&gt;(iii) its length and its terminal trinucleotides YRY suggest a primordial function related to the spatial structure of the DNA sequences.&lt;br /&gt;This motif is found in some viral protein-coding genes, but not in eukaryotic introns.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Bull Math Biol. 1990;52(6):741-72. PMID: 2279193.&lt;br /&gt;A model of DNA sequence evolution.&lt;br /&gt;Arques DG, Michel CJ.&lt;br /&gt;
Universite de Franche-Comte, Laboratoire d'Informatique de Besancon, Unite Associee CNRS No 822, France.&lt;br /&gt;Statistical studies of gene populations on the purine/pyrimidine alphabet have shown that the mean occurrence probability of the i-motif YRY(N)iYRY (R = purine, Y = pyrimidine, N = R or Y) is not uniform by varying i in the range, but presents a maximum at i = 6 in the following populations: protein coding genes of eukaryotes, prokaryotes, chloroplasts and mitochondria, and also viral introns, ribosomal RNA genes and transfer RNA genes (Arques and Michel, 1987b, J. theor. Biol. 128, 457-461). From the "universality" of this observation, we suggested that the oligonucleotide YRY(N)6 is a primitive one and that it has a central function in DNA sequence evolution (Arques and Michel, 1987b, J. theor. Biol. 128, 457-461). Following this idea, we introduce a concept of a model of DNA sequence evolution which will be validated according to a schema presented in three parts.&lt;br /&gt;In the first part, using the last version of the gene database, the YRY(N)6YRY preferential occurrence (maximum at i = 6) is confirmed for the populations mentioned above and is extended to some newly analysed populations: chloroplast introns, chloroplast 5' regions, mitochondrial 5' regions and small nuclear RNA genes. On the other hand, the YRY(N)6YRY preferential occurrence and periodicities are used in order to classify 18 gene populations.&lt;br /&gt;In the second part, we will demonstrate that several statistical features characterizing different gene populations (in particular the YRY(N)6YRY preferential occurrence and the periodicities) can be retrieved from a simple Markov model based on the mixing of the two oligonucleotides YRY(N)6 and YRY(N)3 and based on the percentages of RYR and YRY in the unspecified trinucleotides (N)3 of YRY(N)6 and YRY(N)3. 
Several properties are identified and prove in particular that the oligonucleotide mixing is an independent process and that several different features are functions of a unique parameter.&lt;br /&gt;In the third part, the return of the model to the reality shows a strong correlation between reality and simulation concerning the presence of large alternating purine/pyrimidine stretches and of periodicities. It also contributes to a greater understanding of biological reality, e.g. the presence or the absence of large alternating purine/pyrimidine stretches can be explained as being a simple consequence of the mixing of two particular oligonucleotides. Finally, we believe that such an approach is the first step toward a unified model of DNA sequence evolution allowing the molecular understanding of both the origin of life and the actual biological reality.&lt;br /&gt;&lt;br /&gt;J Theor Biol. 1990 Apr 5;143(3):307-18. PMID: 2385108.&lt;br /&gt;Periodicities in coding and noncoding regions of the genes.&lt;br /&gt;Arques DG, Michel CJ.&lt;br /&gt;Universite de Franche-Comte, Unite Associee CNRS No. 
822, Besancon, France.&lt;br /&gt;Gene population statistical studies of protein coding genes and introns have identified two types of periodicities on the purine/pyrimidine alphabet:&lt;br /&gt;(i) the modulo 3 periodicity or coding periodicity (periodicity P3) in protein coding genes of eukaryotes, prokaryotes, viruses, chloroplasts, mitochondria, plasmids and in introns of viruses and mitochondria, and&lt;br /&gt;(ii) the modulo 2 periodicity (periodicity P2) in the eukaryotic introns.&lt;br /&gt;The periodicity study is herein extended to the 5' and 3' regions of eukaryotes, prokaryotes and viruses and shows: (i) the periodicity P3 in the 5' and 3' regions of eukaryotes. Therefore, these observations suggest a unitary and dynamic concept for the genes as for a given genome, the 5' and 3' regions have the genetic information for protein coding genes and for introns: (1) In the eukaryotic genome, the 5' (P2 and P3) and 3' (P2 and P3) regions have the information for protein coding genes (P3) and for introns (P2). The intensity of P3 is high in 5' regions and weak in 3' regions, while the intensity of P2 is weak in 5' regions and high in 3' regions. (2) In the prokaryotic genome, the 5' (P3) and 3' (P3) regions have the information for protein coding genes (P3). (3) In the viral genome, the 5' (P3) and 3' (P3) regions have the information for protein coding genes (P3) and for introns (P3). The absence of P2 in viral introns (in opposition to eukaryotic introns) may be related to the absence of P2 in 5' and 3' regions of viruses.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Biochimie. 
1993;75(5):399-407. PMID: 8347726.&lt;br /&gt;Identification and simulation of new non-random statistical properties common to different eukaryotic gene subpopulations.&lt;br /&gt;Arques DG, Michel CJ.&lt;br /&gt;Equipe de Biologie Theorique, Universite de Franche-Comte, Laboratoire d'Informatique de Besancon, France.&lt;br /&gt;The nucleotide distribution in protein coding genes, introns and transfer RNA genes of eukaryotic subpopulations (primates, rodent and mammals) is studied by autocorrelation functions. The autocorrelation function analysing the occurrence probability of the i-motif YRY(N)iYRY (YRY-function) in protein coding genes and transfer RNA genes of these three eukaryotic subpopulations retrieves the preferential occurrence of YRY(N)6YRY (R = purine = adenine or guanine, Y = pyrimidine = cytosine or thymine, N = R or Y). The autocorrelation functions analysing the occurrence probability of the i-motifs RRR(N)iRRR (RRR-function) and YYY(N)iYYY (YYY-function) identify new non-random genetic statistical properties in these three eukaryotic subpopulations, mainly: i) in their protein coding genes: local maxima for i ≡ 6 (mod 12) (peaks for i = 6, 18, 30, 42) with the RRR-function and local maxima for i ≡ 8 (mod 10) (peaks for i = 8, 18, 28) with the YYY-function; and ii) in their introns: local maxima for i ≡ 3 (mod 6) (peaks for i = 3, 9, 15) and a short linear decrease followed by a large exponential decrease both with the RRR- and YYY-functions. 
The non-random properties identified in eukaryotic intron subpopulations are modelised with a process of random insertions and deletions of nucleotides simulating the RNA editing.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Theor Biol. 1993 Apr 7;161(3):329-42. PMID: 8331957.&lt;br /&gt;Identification and simulation of new non-random statistical properties common to different populations of eukaryotic non-coding genes.&lt;br /&gt;Arques DG, Michel CJ, Orieux K.&lt;br /&gt;Equipe de Biologie Theorique, Universite de Franche-Comte, Laboratoire d'Informatique de Besancon, France.&lt;br /&gt;The autocorrelation function analysing the occurrence probability of the i-motif YRY(N)iYRY in genes allows the identification of mainly two periodicities modulo 2, 3 and the preferential occurrence of the motif YRY(N)6YRY (R = purine = adenine or guanine, Y = pyrimidine = cytosine or thymine, N = R or Y). These non-random genetic statistical properties can be simulated by an independent mixing of the three oligonucleotides YRYRYR, YRYYRY and YRY(N)6 (Arques &amp; Michel, 1990b). The problem investigated in this study is whether new properties can be identified in genes with other autocorrelation functions and also simulated with an oligonucleotide mixing model. The two autocorrelation functions analysing the occurrence probability of the i-motifs RRR(N)iRRR and YYY(N)iYYY simultaneously identify three new non-random genetic statistical properties: a short linear decrease, local maxima for i ≡ 3 (mod 6) (i = 3, 9, etc.) and a large exponential decrease. Furthermore, these properties are common to three different populations of eukaryotic non-coding genes: 5' regions, introns and 3' regions (see section 2). 
These three non-random properties can also be simulated by an independent mixing of the four oligonucleotides R8, Y8, RRRYRYRRR, YYYRYRYYY and large alternating R/Y series. The short linear decrease is a result of R8 and Y8, the local maxima for i ≡ 3 (mod 6), of RRRYRYRRR and YYYRYRYYY, and the large exponential decrease, of large alternating R/Y series (section 3). The biological meaning of these results and their relation to the previous oligonucleotide mixing model are presented in the Discussion.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Int J Biol Macromol. 1996 Aug;19(2):131-8. PMID: 8842776.&lt;br /&gt;Identification of several types of periodicities in the collagens and their simulation.&lt;br /&gt;Arques DG, Fallot JP, Michel CJ.&lt;br /&gt;Universite de Marne-la-Vallee, Institut Gaspard Monge, Noisy le Grand, France.&lt;br /&gt;The collagens constitute an important population of proteins providing the structural support in vertebrate tissues. A collagen is mainly based on a series of tripeptides of the type GX1X2 (G = Glycine, X1 and X2 being any residues). The nine amino acids occurring with significant frequencies in the X1 and X2 residue sites and G form the reduced protein alphabet Q = [A,D,E,G,K,L,P,Q,R,S] (A = Alanine, D = Aspartic acid, E = Glutamic acid, K = Lysine, L = Leucine, P = Proline, Q = Glutamine, R = Arginine, S = Serine). 
Surprisingly, the method based on the autocorrelation function w(X)iw' analysing the probability that an amino acid w' in Q occurs any i residues X after an amino acid w in Q (called i-motif w(X)iw'), identifies six types of modulo 3 periodicities in collagens: three basic types 0, 1 and 2 modulo 3 and three combined types 0,1, 0,2 and 1,2 modulo 3. Furthermore, the classification of these 100 i-motifs according to the types of periodicities shows several strong relations between four sub-sets of Q: [G], [A,D,P,S], [E,L] and [K,Q,R]. Then, these relations allow the construction of a simple automaton for the generation of model collagen sequences. Indeed, this automaton can simulate the six types of periodicities and it retrieves the types of periodicities for almost all i-motifs. Finally, the autocorrelation function based on the sub-set [K,Q,R] identifies segments of 18 amino acids in collagens which may correspond to the exons (segments of genes of 54 nucleotides) coding for those collagens.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Theor Biol. 1996 Sep 7;182(1):45-58. PMID: 8917736.&lt;br /&gt;A complementary circular code in the protein coding genes.&lt;br /&gt;Arques DG, Michel CJ.&lt;br /&gt;Universite de Marne-la-Vallee, Institut Gaspard Monge, France.&lt;br /&gt;Recently, shifted periodicities 1 modulo 3 and 2 modulo 3 have been identified in protein (coding) genes of both prokaryotes and eukaryotes with autocorrelation functions analysing eight of 64 trinucleotides (Arques et al., 1995). This observation suggests that the trinucleotides are associated with frames in protein genes. 
In order to verify this hypothesis, a distribution of the 64 trinucleotides AAA,..., TTT is studied in both gene populations by using a simple method based on the trinucleotide frequencies per frame. In protein genes, the trinucleotides can be read in three frames: the reading frame 0 established by the ATG start trinucleotide and frame 1 (resp. 2) which is the frame 0 shifted by 1 (resp. 2) nucleotide in the 5'-3' direction. Then, the occurrence frequencies of the 64 trinucleotides are computed in the three frames. By classifying each of the 64 trinucleotides in its preferential occurrence frame, i.e. the frame associated with its highest frequency, three subsets of trinucleotides can be identified in the three frames. This approach is applied in the two gene populations. Unexpectedly, the same three subsets of trinucleotides are identified in these two gene populations: T0 = X0 ∪ {AAA, TTT} with X0 = {AAC, AAT, ACC, ATC, ATT, CAG, CTC, CTG, GAA, GAC, GAG, GAT, GCC, GGC, GGT, GTA, GTC, GTT, TAC, TTC} in frame 0, T1 = X1 ∪ {CCC} in frame 1 and T2 = X2 ∪ {GGG} in frame 2, each subset X0, X1 and X2 having 20 trinucleotides. Surprisingly, these three subsets have five important properties: (i) the property of maximal circular code for X0 (resp. X1, X2) allowing the automatical retrieval of frame 0 (resp. 1, 2) in any region of a protein gene model (formed by a series of trinucleotides of X0) without using a start codon; (ii) the DNA complementarity property C (e.g. C(AAC) = GTT): C(T0) = T0, C(T1) = T2 and C(T2) = T1 allowing the two paired reading frames of a DNA double helix simultaneously to code for amino acids; (iii) the circular permutation property P (e.g. 
P(AAC) = ACA): P(X0) = X1 and P(X1) = X2 implying that the two subsets X1 and X2 can be deduced from X0; (iv) the rarity property with an occurrence probability of X0 equal to 6 x 10&lt;sup&gt;-8&lt;/sup&gt;; and (v) the concatenation property with: a high frequency (27.5%) of misplaced trinucleotides in the shifted frames, a maximum (13 nucleotides) length of the minimal window to automatically retrieve the frame and an occurrence of the four types of nucleotides in the three trinucleotides sites, in favour of an evolutionary code. In the Discussion, the identified subsets T0, T1 and T2 replaced in the three two-letter genetic alphabets purine/pyrimidine, amino/keto and strong/weak interaction, allow us to deduce that the RNY model (R = purine = A or G, Y = pyrimidine = C or T, N = R or Y) (Eigen &amp; Schuster, 1978) is the closest two-letter codon model to the trinucleotides of T0. Then, these three subsets are related to the genetic code. The trinucleotides of T0 code for 13 amino acids: Ala, Asn, Asp, Gln, Glu, Gly, Ile, Leu, Lys, Phe, Thr, Tyr and Val. Finally, a strong correlation between the usage of the trinucleotides of T0 in protein genes and the amino acid frequencies in proteins is observed as six among seven amino acids not coded by T0, have as expected the lowest frequencies in proteins of both prokaryotes and eukaryotes.&lt;br /&gt;&lt;br /&gt;J Theor Biol. 1997 Mar 21;185(2):241-53. PMID: 9135803.&lt;br /&gt;An evolutionary model of a complementary circular code.&lt;br /&gt;Arques DG, Fallot JP, Michel CJ.&lt;br /&gt;
Universite de Marne-la-Vallee, Institut Gaspard Monge, France.&lt;br /&gt;The subset X0 = [sequence: see text] of 20 trinucleotides has a preferential occurrence in frame 0 (a reading frame established by the ATG start trinucleotide) of protein (coding) genes of both prokaryotes and eukaryotes. This subset X0 has the rarity property (6 x 10&lt;sup&gt;-8&lt;/sup&gt;) to be a complementary maximal circular code with two permutated maximal circular codes X1 and X2 in frames 1 and 2 respectively (frame 0 shifted by one and two nucleotides respectively in the 5'-3' direction). X0 is called a C3 code. A quantitative study of these three subsets X0, X1 and X2 in the three frames 0, 1 and 2 of eukaryotic protein genes shows that their occurrence frequencies are constant functions of the trinucleotide positions in the sequences. The frequencies of X0, X1 and X2 in frame 0 of the eukaryotic protein genes are 48.5%, 29% and 22.5% respectively. These properties are not observed in the 5' and 3' regions of eukaryotes where X0, X1 and X2 occur with variable frequencies around the random value (1/3). Several frequency asymmetries unexpectedly observed, e.g. the frequency difference between X1 and X2 in the frame 0, are related to a new property of the C3 code X0 involving substitutions. An evolutionary model at three parameters (p, q, k) based on an independent mixing of the 20 codons (trinucleotides in frame 0) of X0 with equiprobability (1/20) followed by k approximately 5 substitutions per codon in the three codon sites in proportions p approximately 0.1, q approximately 0.1 and r = 1-p-q approximately 0.8 respectively, retrieves the frequencies of X0, X1 and X2 observed in the three frames of protein genes and explains these asymmetries.&lt;br /&gt;&lt;br /&gt;Biosystems. 
1997;44(2):107-34. PMID: 9429747.&lt;br /&gt;A code in the protein coding genes.&lt;br /&gt;Arques DG, Michel CJ.&lt;br /&gt;Universite de Marne la Vallee, Institut Gaspard Monge, Noisy Le Grand, France. arques@univ-mlv.fr&lt;br /&gt;A statistical analysis with 12,288 autocorrelation functions applied in protein (coding) genes of prokaryotes and eukaryotes identifies three subsets of trinucleotides in their three frames: T0 = X0 ∪ {AAA, TTT} with X0 = {AAC, AAT, ACC, ATC, ATT, CAG, CTC, CTG, GAA, GAC, GAG, GAT, GCC, GGC, GGT, GTA, GTC, GTT, TAC, TTC} in frame 0 (the reading frame established by the ATG start trinucleotide), T1 = X1 ∪ {CCC} in frame 1 and T2 = X2 ∪ {GGG} in frame 2 (the frames 1 and 2 being the frame 0 shifted by one and two nucleotides, respectively, to the right). These three subsets are identical in these two gene populations and have five important properties: (i) the property of maximal (20 trinucleotides) circular code for X0 (resp. X1, X2) allowing to retrieve automatically the frame 0 (resp. 1, 2) in any region of the gene without start codon; (ii) the DNA complementarity property C (e.g. C(AAC) = GTT): C(T0) = T0, C(T1) = T2 and C(T2) = T1 allowing the two paired reading frames of a DNA double helix simultaneously to code for amino acids; (iii) the circular permutation property P (e.g. 
P(AAC) = ACA): P(X0) = X1 and P(X1) = X2 implying that the two subsets X1 and X2 can be deduced from X0; (iv) the rarity property with an occurrence probability of X0 = 6 x 10&lt;sup&gt;-8&lt;/sup&gt;; and (v) the concatenation properties in favour of an evolutionary code: a high frequency (27.5%) of misplaced trinucleotides in the shifted frames, a maximum (13 nucleotides) length of the minimal window to retrieve automatically the frame and an occurrence of the four types of nucleotides in the three trinucleotide sites. In Discussion, a simulation based on an independent mixing of the trinucleotides of T0 allows to retrieve the two subsets T1 and T2. Then, the identified subsets T0, T1 and T2 replaced in the 2-letter genetic alphabet {R, Y} (R = purine = A or G, Y = pyrimidine = C or T) allow to retrieve the RNY model (N = R or Y) and to explain previous works in the alphabet {R, Y}. Then, these three subsets are related to the genetic code. The trinucleotides of T0 code for 13 amino acids: Ala, Asn, Asp, Gln, Glu, Gly, Ile, Leu, Lys, Phe, Thr, Tyr and Val. Finally, a strong correlation between the usage of the trinucleotides of T0 in protein genes and the amino acid frequencies in proteins is observed as six among seven amino acids not coded by T0, have as expected the lowest frequencies in proteins of both prokaryotes and eukaryotes.&lt;br /&gt;&lt;br /&gt;Bull Math Biol. 1998 Jan;60(1):163-94. PMID: 9530018.&lt;br /&gt;An evolutionary analytical model of a complementary circular code simulating the protein coding genes, the 5' and 3' regions.&lt;br /&gt;Arques DG, Fallot JP, Michel CJ.&lt;br /&gt;
Equipe de Biologie Theorique, Universite de Marne la Vallee, Institut Gaspard Monge, Noisy Le Grand, France. arques@univ-mlv.fr&lt;br /&gt;The self-complementary subset T0 = X0 ∪ {AAA, TTT} with X0 = {AAC, AAT, ACC, ATC, ATT, CAG, CTC, CTG, GAA, GAC, GAG, GAT, GCC, GGC, GGT, GTA, GTC, GTT, TAC, TTC} of 22 trinucleotides has a preferential occurrence in the frame 0 (reading frame established by the ATG start trinucleotide) of protein (coding) genes of both prokaryotes and eukaryotes. The subsets T1 = X1 ∪ {CCC} and T2 = X2 ∪ {GGG} of 21 trinucleotides have a preferential occurrence in the shifted frames 1 and 2 respectively (frame 0 shifted by one and two nucleotides respectively in the 5'-3' direction). T1 and T2 are complementary to each other. The subset T0 contains the subset X0 which has the rarity property (6 x 10&lt;sup&gt;-8&lt;/sup&gt;) to be a complementary maximal circular code with two permutated maximal circular codes X1 and X2 in the frames 1 and 2 respectively. X0 is called a C3 code. A quantitative study of these three subsets T0, T1, T2 in the three frames 0, 1, 2 of protein genes, and the 5' and 3' regions of eukaryotes, shows that their occurrence frequencies are constant functions of the trinucleotide positions in the sequences. The frequencies of T0, T1, T2 in the frame 0 of protein genes are 49, 28.5 and 22.5% respectively. In contrast, the frequencies of T0, T1, T2 in the 5' and 3' regions of eukaryotes, are independent of the frame. Indeed, the frequency of T0 in the three frames of 5' (respectively 3') regions is equal to 35.5% (respectively 38%) and is greater than the frequencies T1 and T2, both equal to 32.25% (respectively 31%) in the three frames. Several frequency asymmetries unexpectedly observed (e.g. 
the frequency difference between T1 and T2 in the frame 0) are related to a new property of the subset T0 involving substitutions. An evolutionary analytical model with three parameters (p, q, t) based on an independent mixing of the 22 codons (trinucleotides in frame 0) of T0 with equiprobability (1/22) followed by t approximately 4 substitutions per codon according to the proportions p approximately 0.1, q approximately 0.1 and r = 1 - p - q approximately 0.8 in the three codon sites respectively, retrieves the frequencies of T0, T1, T2 observed in the three frames of protein genes and explains these asymmetries. Furthermore, the same model (0.1, 0.1, t) after t approximately 22 substitutions per codon, retrieves the statistical properties observed in the three frames of the 5' and 3' regions. The complex behaviour of these analytical curves is totally unexpected and a priori difficult to imagine.&lt;br /&gt;&lt;br /&gt;Biosystems. 1999 Feb;49(2):83-103.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=10203190&amp;amp;tool=ExternalSearch"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu10203190);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;An evolutionary analytical model of a complementary circular code. Arques DG, Fallot JP, Marsan L, Michel CJ. Equipe de Biologie Theorique, Universite de Marne la Vallee, Institut Gaspard Monge, Noisy le Grand, France. arques@univ-mlv.fr The subset X0 = {AAC, AAT, ACC, ATC, ATT, CAG, CTC, CTG, GAA, GAC, GAG, GAT, GCC, GGC, GGT, GTA, GTC, GTT, TAC, TTC} of 20 trinucleotides has a preferential occurrence in the frame 0 (reading frame established by the ATG start trinucleotide) of protein (coding) genes of both prokaryotes and eukaryotes.
This subset X0 is a complementary maximal circular code with two permutated maximal circular codes X1 and X2 in the frames 1 and 2 respectively (frame 0 shifted by one and two nucleotides respectively in the 5'-3' direction). X0 is called a C3 code (Arques and Michel, 1997, J. Biosyst 44, 107-134). A quantitative study of these three subsets X0, X1 and X2 in the three frames 0, 1 and 2 of eukaryotic protein genes shows that their occurrence frequencies are constant functions of the trinucleotide positions in the sequences. The frequencies of X0, X1 and X2 in the frame 0 of eukaryotic protein genes are 48.5%, 29% and 22.5% respectively. These properties are not observed in the 5' and 3' regions of eukaryotes where X0, X1 and X2 occur with variable frequencies around the random value (1/3). Several frequency asymmetries unexpectedly observed, e.g. the frequency difference between X1 and X2 in the frame 0, are related to a new property of the C3 code X0 involving substitutions. An evolutionary analytical model at three parameters (p, q, t) based on an independent mixing of the 20 codons (trinucleotides in the frame 0) of X0 with equiprobability (1/20) followed by t approximately 4 substitutions per codon according to the proportions p approximately 0.1, q approximately 0.1 and r = 1 - p - q approximately 0.8 in the three codon sites respectively, retrieves the frequencies of X0, X1 and X2 observed in the three frames of protein genes and explains these asymmetries. The complex behaviour of these analytical curves is totally unexpected and a priori difficult to imagine. Finally, the evolutionary analytical method developed could be applied to the phylogenetic tree reconstruction and the DNA sequence alignment.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Biosystems. 
2002 Jun-Jul;66(1-2):73-92.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=12204444&amp;amp;tool=ExternalSearch"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu12204444);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;Identification of protein coding genes in genomes with statistical functions based on the circular code. Arques DG, Lacan J, Michel CJ. Equipe de Biologie Theorique, Institut Gaspard Monge, Universite de Marne la Vallee, 2 rue de la Butte Verte, 93160 Noisy le Grand, France. arques@univ-mlv.fr A new statistical approach using functions based on the circular code classifies correctly more than 93% of bases in protein (coding) genes and non-coding genes of human sequences. Based on this statistical study, a research software called 'Analysis of Coding Genes' (ACG) has been developed for identifying protein genes in the genomes and for determining their frame. Furthermore, the software ACG also allows an evaluation of the length of protein genes, their position in the genome, their relative position between themselves, and the prediction of internal frames in protein genes.&lt;br /&gt;&lt;br /&gt;J Mol Biol. 1987 Oct 5;197(3):379-88.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=3441003"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu3441003);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;Codon distribution in vertebrate genes may be used to predict gene length. Bains W. Department of Biochemistry, University of Bath, Claverton Down, England. I have analysed the coding regions of 96 eukaryotic genes for their use of iso-coding codons.
Specific codons occur more frequently in specific positions in all members of some gene families than would be expected if codon choice was determined solely by the frequency of codon usage. In the absence of evidence a priori for selection for particular codons at particular positions, I term such co-occurring codons "coincident codons". Coincident codons are not confined to particular regions of genes, and their occurrence is not detectably linked with the location of introns in the genomic sequence. Their presence is partly but not completely explained by the exchange of sequence between similar functional genes within a species: homologous genes from different organisms also possess the same codons at some sites with greater than expected frequencies. The relative excess of coincident codons correlates well with the overall length of the genes analysed, but not with the length of mRNA or coding regions, or with qualitative features of gene structure or expression. This, and the unusual sequence environment of coincident codons, suggests that they are a feature of the overall secondary structure of the heterogeneous nuclear RNA. Such considerations suggest approaches for optimizing the expression of exogenous genes in eukaryotic systems, and for predicting the structure of genes for which only partial sequence data is available.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;DNA Seq. 1993;3(5):277-82.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=8400357"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu8400357);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;Codon usage in mammalian genes is biased by sequence slippage mechanisms.Bains W.  
PA Consulting Group, Melbourn, Royston, Herts, UK. The codons for some conserved amino acids are found to be the same between homologous genes from different species when the statistics of codon usage would suggest that they should be different. I examine whether this 'coincidence' of codon usage could be due to genetic mechanisms homogenising the DNA around specific sites. This paper describes the further analysis of the coincident codons in 19 genes (a total of 96 homologues) for slippage. Coincident codons arise in contexts of increased sequence simplicity, and have a high chance of occurring within sequences similar to the recombination-prone minisatellite 'core' sequence. This suggests a role of genetic homogenisation in their generation.&lt;br /&gt;&lt;br /&gt;Bioinformatics. 2001 Mar;17(3):237-48.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=11294789"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu11294789);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;Flexibility of the genetic code with respect to DNA structure. Baisnee PF, Baldi P, Brunak S, Pedersen AG. Department of Information and Computer Science, University of California, Irvine, CA 92697-3425, USA. pfbald@ics.uci.edu MOTIVATION: The primary function of DNA is to carry genetic information through the genetic code. DNA, however, contains a variety of other signals related, for instance, to&lt;br /&gt;reading frame,&lt;br /&gt;codon bias,&lt;br /&gt;pairwise codon bias,&lt;br /&gt;splice sites,&lt;br /&gt;transcription regulation,&lt;br /&gt;nucleosome positioning and&lt;br /&gt;DNA structure.&lt;br /&gt;Here we study the relationship between the genetic code and DNA structure and address two questions.
First, to which degree does the degeneracy of the genetic code and the acceptable amino acid substitution patterns allow for the superimposition of DNA structural signals to protein coding sequences? Second, is the origin or evolution of the genetic code likely to have been constrained by DNA structure? RESULTS: We develop an index for code flexibility with respect to DNA structure. Using five different di- or tri-nucleotide models of sequence-dependent DNA structure, we show that the standard genetic code provides a fair level of flexibility at the level of broad amino acid categories. Thus the code generally allows for the superimposition of any structural signal on any protein-coding sequence, through amino acid substitution. The flexibility observed at the level of single amino acids allows only for the superimposition of punctual and loosely positioned signals to conserved amino acid sequences. The degree of flexibility of the genetic code is low or average with respect to several classes of alternative codes. This result is consistent with the view that DNA structure is not likely to have played a significant role in the origin and evolution of the genetic code.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Biol. 1996 Nov 8;263(4):503-10.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=8918932"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu8918932);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;Naturally occurring nucleosome positioning signals in human exons and introns.Baldi P, Brunak S, Chauvin Y, Krogh A.  Division of Biology, California Institute of Technology, Pasadena 91125, USA.We describe the structural implications of a periodic pattern found in human exons and introns by hidden Markov models. 
We show that exons (besides the reading frame) have a specific sequential structure in the form of a pattern with triplet consensus non-T(A/T)G, and a minimal periodicity of roughly ten nucleotides. The periodic pattern is also present in intron sequences, although the strength per nucleotide is weaker. Using two independent profile methods based on triplet bendability parameters from DNase I experiments and nucleosome positioning data, we show that the pattern in multiple alignments of internal exon and intron sequences corresponds to a periodic "in phase" bending potential towards the major groove of the DNA. The nucleosome positioning data show that the consensus triplets (and their complements) have a preference for locations on a bent double helix where the major groove faces inward and is compressed. The in-phase triplets are located adjacent to GCC/GGC triplets known to have the strongest bias in their positioning on the nucleosome. Analysis of mRNA sequences encoding proteins with known tertiary structure excludes the possibility that the pattern is a consequence of the previously well-known periodicity caused by the encoding of alpha-helices in proteins. Finally, we discuss the relation between the bending potential of coding and non-coding regions and its impact on the translational positioning of nucleosomes and the recognition of genes by the transcriptional machinery.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Theor Biol. 1991 Oct 7;152(3):405-26.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=1749256"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu1749256);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;Frequencies of codons in histones, tubulins and fibrinogen: bias due to interference between transcription signals and protein function. Barrai I, Scapoli C, Gambari R, Brungnoli F.
Department of Evolutionary Biology, University of Ferrara, Italy. The distribution of codons was studied in 65 proteins: 48 histones, 14 tubulins, and three fibrinogens. With the methodology used,&lt;br /&gt;(1) we confirmed that the preterminator state of a codon has no detectable effect on codon bias.&lt;br /&gt;(2) The well-known effect of CG suppression was visible. We also found that&lt;br /&gt;some codons which are very rare are equal to parts of known transcription signals.&lt;br /&gt; Thus, we advanced that to avoid signal interference, the use of these codons is suppressed when a synonymous codon is available. In addition we found that in the whole series of codons, transcription signals are less frequent than in a random sequence of equal composition. Finally we observed&lt;br /&gt; that tryptophan is absent in histones. This absence was related not to the TGG codon itself, but to characteristics of the amino acid. We conclude that the functional constraints of a protein can influence, at least for synonymous codon usage, the evolution of its own coding sequence.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Theor Biol. 1994 Feb 7;166(3):331-7.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=8159018"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu8159018);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;Codon usage and evolutionary rates of proteins. Barrai I, Scapoli C, Nesti C, Poli G, Gambari R, Beretta M. Department of Evolutionary Biology, University of Ferrara, Italy. The 61 codons and the three terminators were counted in the coding sequences of 31 families of proteins of higher vertebrates. The protein families were ordered according to their evolutionary rate. In each family, the ratio between the Observed and Expected frequency of each codon was obtained (O/E ratio).
A strong and significant positive correlation was observed between the O/E ratio of the eight codons AAC, TAT, ATA, GAA, ACA, AAT, ATG and CGA and the evolutionary rate of the protein. A negative and significant correlation was observed for codons AAG and GAG. It was advanced that the functional constraints of proteins can influence the usage of codons, particularly for those trimers which are components of signal sequences. It was also observed that the O/E ratios of the terminators are negatively correlated with the evolutionary rate of the protein they terminate, and the correlation is significant for TAA and TGA, which in vertebrates might be older than TAG.&lt;br /&gt;&lt;br /&gt;BMC Bioinformatics. 2004 Apr 02;5(1):35.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=15059245"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu15059245);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;PASS2: an automated database of protein alignments organised as structural superfamilies. Bhaduri A, Pugalenthi G, Sowdhamini R. National Centre for Biological Sciences, Tata Institute of Fundamental Research, UAS-GKVK campus, Bellary Road, Bangalore, Karnataka 560 065, India. anirban@ncbs.res.in BACKGROUND: The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination.
The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated, structural alignments for evolutionary related, sequentially distant proteins. DESCRIPTION: An automated and updated version of PASS2 is, in direct correspondence with SCOP 1.63, consisting of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalencies have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden markov models of superfamilies based on the alignments and useful links to other databases. Probabilistic models and sensitive position specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members both based on sequence and structural dissimilarities. Clustering of members allows us to understand diversification of the family members. The search engine has been improved for simpler browsing of the database. CONCLUSIONS: The database resolves alignments among the structural domains consisting of evolutionarily diverged set of sequences. Availability of reliable sequence alignments of distantly related proteins despite poor sequence identity and single-member superfamilies permit better sampling of structures in libraries for fold recognition of new sequences and for the understanding of protein structure-function relationships of individual superfamilies. 
PASS2 is accessible at &lt;a href="http://www.ncbs.res.in/~faculty/mini/campass/pass2.html"&gt;http://www.ncbs.res.in/~faculty/mini/campass/pass2.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Mol Biol Evol. 1993 Jan;10(1):205-20.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=8450757"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu8450757);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;Forbidden synonymous substitutions in coding regions.Britten RJ.   Division of Biology, California Institute of Technology.In the evolution of highly conserved genes, a few "synonymous" substitutions at third bases that would not alter the protein sequence are forbidden or very rare, presumably as a result of functional requirements of the gene or the messenger RNA. Another 10% or 20% of codons are significantly less variable by synonymous substitution than are the majority of codons. The changes that occur at the majority of third bases are subject to codon usage restrictions. These usage restrictions control sequence similarities between very distant genes.  For example, 70% of third bases are identical in calmodulin genes of man and trypanosome. Third-base similarities of distant genes for conserved proteins are mathematically predicted, on the basis of the G+C composition of third bases. These observations indicate the need for reexamination of methods used to calculate synonymous substitutions.&lt;br /&gt;&lt;br /&gt;Nature. 
1987 Feb 19-25;325(6106):728-30.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=2434856"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu2434856);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;Coevolution of codon usage and transfer RNA abundance. Bulmer M. The use of synonymous codons is strongly biased in the bacterium Escherichia coli and yeast, comprising both bias between codons recognized by the same transfer RNA and bias between groups of codons recognized by different synonymous tRNAs. A major determinant of the second sort of bias is tRNA content, codons recognized by abundant tRNAs being used more often than those recognized by rare tRNAs, particularly in highly expressed genes, probably owing to selection at the level of translation against codons recognized by rare tRNAs. Conversely, codon usage is likely to exert selection pressure on tRNA abundance. Here I develop a model for the coevolution of codon usage and tRNA abundance which explains why there are unequal abundances of synonymous tRNAs leading to biased usage between groups of codons recognized by them in unicellular organisms.&lt;br /&gt;&lt;br /&gt;Genetics. 1991 Nov;129(3):897-907.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=1752426"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu1752426);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;The selection-mutation-drift theory of synonymous codon usage. Bulmer M.
Department of Statistics, Oxford University, England.It is argued that the bias in synonymous codon usage observed in unicellular organisms is due to a balance between the forces of selection and mutation in a finite population, with greater bias in highly expressed genes reflecting stronger selection for efficiency of translation. A population genetic model is developed taking into account population size and selective differences between synonymous codons. A biochemical model is then developed to predict the magnitude of selective differences between synonymous codons in unicellular organisms in which growth rate (or possibly growth yield) can be equated with fitness. Selection can arise from differences in either the speed or the accuracy of translation. A model for the effect of speed of translation on fitness is considered in detail, a similar model for accuracy more briefly. The model is successful in predicting a difference in the degree of bias at the beginning than in the rest of the gene under some circumstances, as observed in Escherichia coli, but grossly overestimates the amount of bias expected. Possible reasons for this discrepancy are discussed.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Evol. 1995 Mar;40(3):280-92.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=7723055"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu7723055);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;Nonrandom frequency patterns of synonymous substitutions in homologous mammalian genes.Caccio S, Zoubak S, D'Onofrio G, Bernardi G.  
Laboratoire de Genetique Moleculaire, Institut Jacques Monod, Paris, France.All 69 homologous coding sequences that are currently available in four mammalian orders were aligned and the synonymous positions of quartet and duet (fourfold and twofold degenerate) codons were divided into three classes (that will be called conserved, intermediate, and variable) according to whether they show no change, one change, or more than one change, respectively. We observed (1) that the frequencies of conserved, intermediate, and variable positions of quartet and duet codons are different in different genes; (2) that the frequencies of the three classes are significantly different from expectations based on a random substitution process in the majority of genes (especially for GC-rich genes) for quartet codons and in a minority of genes for doublet codons; and (3) that the frequencies of the three classes of positions of quartet codons are correlated with those of duet codons, the conserved positions of quartet and duet codons being, in addition, correlated with the degree of amino acid conservation. Our main conclusions are that synonymous substitution frequencies: (1) are gene-specific; (2) are not simply the result of a stochastic process in which nucleotide substitutions accumulate at random, over time; and (3) are correlated in quartet and duet codons.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Mol Biol Evol. 2004 Jun;21(6):1014-23. Epub 2004 Mar 10.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=15014158"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu15014158);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;Similar rates but different modes of sequence evolution in introns and at exonic silent sites in rodents: evidence for selectively driven codon usage.Chamary JV, Hurst LD.  
Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom. In mammals divergence at fourfold degenerate sites in codons (K(4)) and intronic sequence (K(i)) are both used to estimate the mutation rate, under the supposition that both evolve neutrally. Does it matter which of these we use? Using either class of sequence can be defended because (1) K(4) is the same as K(i) (at least in rodents) and (2) there is no selectively driven codon usage (hence no systematic selection on third sites). Here we re-examine these findings using 560 introns (for 136 genes) in the mouse-rat comparison, aligned by eye and using a new maximum likelihood protocol. We find that the rate of evolution at fourfold sites and at intronic sites is similar in magnitude, but only after eliminating putatively constrained sites from introns (first introns and sites flanking intron-exon junctions). Any approximate congruence between the two rates is not, however, owing to an underlying similarity in the mode of sequence evolution. Some dinucleotides are hypermutable and differently abundant in exons and introns (e.g., CpGs). More importantly, after controlling for relative abundance, all dinucleotides starting with A or T are more prevalent in mismatches in exons than in introns, whereas C-starting dinucleotides (except CG) are more common in introns. Although C content at intronic sites is lower than at flanking fourfold sites, G content is similar, demonstrating that there exists a strong strand-specific preference for C nucleotides that is unique to exons. Transcription-coupled mutational processes and biased gene conversion cannot explain this, as they should affect introns and flanking exons equally. Therefore, by elimination, we propose this to be strong evidence for selectively driven codon usage in mammals.&lt;br /&gt;&lt;br /&gt;Genes Genet Syst.
1999 Dec;74(6):271-86.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=10791023"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu10791023);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;RNA secondary structure and compensatory evolution.Chen Y, Carlini DB, Baines JF, Parsch J, Braverman JM, Tanda S, Stephan W. Department of Biology, University of Rochester, NY 14627, USA.The classic concept of epistatic fitness interactions between genes has been extended to study interactions within gene regions, especially between nucleotides that are important in maintaining pre-mRNA/mRNA secondary structures. It is shown that the majority of linkage disequilibria found within the Drosophila Adh gene are likely to be caused by epistatic selection operating on RNA secondary structures. A recently proposed method of RNA secondary structure prediction based on DNA sequence comparisons is reviewed and applied to several types of RNAs, including tRNA, rRNA, and mRNA. The patterns of covariation in these RNAs are analyzed based on Kimura's compensatory evolution model. The results suggest that this model describes the substitution process in the pairing regions (helices) of RNA secondary structures well when the helices are evolutionarily conserved and thermodynamically stable, but fails in some other cases. Epistatic selection maintaining pre-mRNA/mRNA secondary structures is compared to weak selective forces that determine features such as base composition and synonymous codon usage. The relationships among these forces and their relative strengths are addressed. Finally, our mutagenesis experiments using the Drosophila Adh locus are reviewed. 
These experiments analyze long-range compensatory interactions between the 5' and 3' ends of Adh mRNA, the different constraints on secondary structures in introns and exons, and the possible role of secondary structures in RNA splicing.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Phys Rev E Stat Nonlin Soft Matter Phys. 2002 Jun;65(6 Pt 1):061907. Epub 2002 Jun 20.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=12188759&amp;amp;tool=ExternalSearch"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu12188759);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;Classification of amino acids based on statistical results of known structures and cooperativity of protein folding. Chen H, Zhou X, Ou-Yang ZC. Center for Advanced Study, Tsinghua University, Beijing 100084, People's Republic of China. It has been found that the 20 kinds of amino acids have different frequencies of occurrence in alpha, beta, and coil structures [P. Y. Chou and G. D. Fasman, Biochemistry 13, 211 (1974)]. Based on more known structures of proteins, frequencies for each amino acid in alpha and beta secondary structures are recalculated. Next, under the approximation ignoring the chain connectivity of proteins, energy parameters to form alpha and beta secondary structures for each amino acid are obtained. According to the hydrophobicity and energies in alpha and beta secondary structures, 20 kinds of amino acids are classified. The results suggest that dividing amino acids into five or nine groups is desirable. Finally, a protein model considering both two-body hydrophobic interaction and one-body energy to form secondary structures, hydrophobic-polar alphabeta model, is introduced.
It is shown that the consistency among various energy terms makes the cooperativity of protein folding closer to the experiments.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Proc Natl Acad Sci U S A. 2004 Mar 9;101(10):3480-5. Epub 2004 Feb 27.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=14990797&amp;amp;tool=ExternalSearch"&gt;Related Articles,&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="javascript:PopUpMenu2_Set(Menu14990797);" target="_self"&gt;Links&lt;/a&gt;&lt;br /&gt;Codon usage between genomes is constrained by genome-wide mutational processes. Chen SL, Lee W, Hottes AK, Shapiro L, McAdams HH. Department of Developmental Biology, Stanford University School of Medicine, Beckman Center, B300, Stanford, CA 94304, USA. slchen@stanford.edu Analysis of genome-wide codon bias shows that only two parameters effectively differentiate the genome-wide codon bias of 100 eubacterial and archaeal organisms. The first parameter correlates with genome GC content, and the second parameter correlates with context-dependent nucleotide bias. Both of these parameters may be calculated from intergenic sequences. Therefore, genome-wide codon bias in eubacteria and archaea may be predicted from intergenic sequences that are not translated. When these two parameters are calculated for genes from nonmammalian eukaryotic organisms, genes from the same organism again have similar values, and genome-wide codon bias may also be predicted from intergenic sequences.
In mammals, genes from the same organism are similar only in the second parameter, because GC content varies widely among isochores. Our results suggest that, in general, genome-wide codon bias is determined primarily by mutational processes that act throughout the genome, and only secondarily by selective forces acting on translated sequences.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 1999 Sep 30;238(1):23-31.&lt;br /&gt;Correlations of nucleotide substitution rates and base composition of mammalian coding sequences with protein structure. Chiusano ML, D'Onofrio G, Alvarez-Valin F, Jabbari K, Colonna G, Bernardi G. Laboratorio di Evoluzione Molecolare, Stazione Zoologica Anton Dohrn, Naples, Italy. We investigated the relationships between the nucleotide substitution rates and the predicted secondary structures in the three states representation (alpha-helix, beta-sheet, and coil). The analysis was carried out on 34 alignments, each of which comprised sequences belonging to at least four different mammalian orders. The rates of synonymous substitution were found to be significantly different in regions predicted to be alpha-helix, beta-sheet, or coil. Likewise, the nonsynonymous rates also differ, although expectedly at a lower extent, in the three types of secondary structure, suggesting that different selective constraints associated with the different structures are affecting in a similar way the synonymous and nonsynonymous rates. Moreover, the base composition of the third codon positions is different in coding sequence regions corresponding to different secondary structures of proteins.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 
2000 Dec 30;261(1):63-9.&lt;br /&gt;Second codon positions of genes and the secondary structures of proteins. Relationships and implications for the origin of the genetic code. Chiusano ML, Alvarez-Valin F, Di Giulio M, D'Onofrio G, Ammirato G, Colonna G, Bernardi G. Laboratorio di Evoluzione Molecolare, Stazione Zoologica Anton Dohrn, Villa Comunale, I-80121, Naples, Italy. The nucleotide frequencies in the second codon positions of genes are remarkably different for the coding regions that correspond to different secondary structures in the encoded proteins, namely, helix, beta-strand and aperiodic structures. Indeed, hydrophobic and hydrophilic amino acids are encoded by codons having U or A, respectively, in their second position. Moreover, the beta-strand structure is strongly hydrophobic, while aperiodic structures contain more hydrophilic amino acids. The relationship between nucleotide frequencies and protein secondary structures is associated not only with the physico-chemical properties of these structures but also with the organisation of the genetic code. In fact, this organisation seems to have evolved so as to preserve the secondary structures of proteins by preventing deleterious amino acid substitutions that could modify the physico-chemical properties required for an optimal structure.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Theor Biol. 
1993 Mar 21;161(2):251-62.&lt;br /&gt;A joint prediction of the folding types of 1490 human proteins from their genetic codons. Chou JJ, Zhang CT. Department of Physics, University of Michigan, Ann Arbor 48109. The codon usages for 1490 human proteins have been published by Wada et al. (1990). Based on these data, the frequencies of occurrence of 20 amino acids for each of the 1490 proteins have been calculated according to the genetic codes. Proteins are generally classified into five folding types, i.e. the alpha, beta, alpha + beta, alpha/beta and zeta (irregular) types. The folding type of a protein is correlated to its amino acid composition. By means of three methods established by different investigators, the folding type for each of the 1490 human proteins has been predicted. It has been demonstrated that the accuracy of prediction for the 1490 human proteins is at least 80% by examining the predicted results of some structurally known proteins with these methods. There are only six proteins for which there is uncertainty about their folding types as completely inconsistent results were obtained when predicted with the three different methods. For the remaining 1484 human proteins the numbers of alpha, beta, alpha + beta, alpha/beta, and zeta folding type proteins were found to be 128, 235, 169, 933 and 19, respectively, suggesting that the alpha/beta type proteins would predominate in this set of human proteins. The occurrence frequencies of bases in the first, second and third codon position for each folding type of protein have been calculated. 
It is shown that the folding type of a protein is strongly dependent on the ratio of frequency of base G in the first codon position with that in the second codon position. The biological implication of the results has been discussed.&lt;br /&gt;&lt;br /&gt;J Mol Evol. 1998 Sep;47(3):268-74.&lt;br /&gt;An evaluation of measures of synonymous codon usage bias. Comeron JM, Aguade M. Departament de Genetica, Facultat de Biologia, Universitat de Barcelona, Av. Diagonal 645, 08071 Barcelona, Spain. jcomeron@midway Synonymous codons are not generally used at equal frequencies, and this trend is observed for most genes and organisms. Several methods have been proposed and used to estimate the degree of the nonrandom use of the different synonymous codons. The estimates obtained by these methods, however, show different levels of both precision and dispersion when coding regions of a finite number of codons are under analysis. Here, we present a study, based on computer simulation, of how the different methods proposed to evaluate the nonrandom use of synonymous codons are affected by the length of the coding region analyzed. The results show that some of these methods are heavily influenced by the number of codons and that the comparison of codon usage bias between coding regions of different lengths shows a methodological bias under different conditions of nonrandom use of synonymous codons. The study of the dispersion of the estimates obtained by the different methods gives, on the other hand, an indication of the methods to be applied to compare values of codon usage bias among coding regions of equivalent length.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Int J Pept Protein Res. 
1989 Sep;34(3):184-95.&lt;br /&gt;Regularities in the primary structure of proteins. Cserzo M, Simon I. Institute of Enzymology, Hungarian Academy of Sciences, Budapest. In this paper the latest protein database consisting of more than a million amino acids is analyzed to characterize the short range regularities in the primary structure. The amino acid distributions along the polypeptide chain and among the proteins have been studied first. Their influence on the amino acid pair statistics was taken into account. We are primarily interested in the distances of the covalent structure, where the amino acid pair frequencies show non-random characters. The amino acid pairs separated by at least 20 residues in the covalent structure exhibit an exact Gaussian distribution. We found that there is a range of non-random pairing in the covalent structure. We conclude that the pair preference characters are different for each of the 20 x 20 amino acid pairs. The range of the non-random pairing varies from pair to pair, and in most cases it does not extend beyond the 9th neighbour. The preferences of a certain pair in a certain position can not be derived from the character of that pair in another position. The preference values of 400 amino acid pairs are listed for up to the pairs in 9th neighbour position. Some fields of potential application of these data have also been discussed.&lt;br /&gt;&lt;br /&gt;Mol Biol Evol. 2005 Mar;22(3):496-500. 
Epub 2004 Nov 3.&lt;br /&gt;The Comparative Method Rules! Codon Volatility Cannot Detect Positive Darwinian Selection Using a Single Genome Sequence. Dagan T, Graur D. Department of Zoology, George S. Wise Faculty of Life Science, Tel Aviv University, Ramat Aviv, Israel; and Department of Biology and Biochemistry, University of Houston, Houston, Texas. All established methods for detecting positive selection at the molecular level rely on comparisons between nucleotide sequences. An exceptional method that purports to detect selection on the basis of a single genomic sequence has recently been proposed. This method uses a measure called "codon volatility," defined for each codon as the ratio between the number of nonsynonymous codons that differ from the codon under study at a single nucleotide position and the number of sense codons that differ from the codon under study at a single nucleotide position. Here, we examine various properties of codon volatility and its derivatives and use simulation of evolutionary processes to determine whether they can be used to detect selective pressures. Codons for only four amino acids (glycine, leucine, arginine, and serine) show any variation in codon volatility. Thus, codon volatility is mainly a proxy for amino acid usage, rather than for codon usage, with 65% of all synonymous changes and 27% of all nonsynonymous changes being undetectable by this measure. Genes identified by the volatility method as being subject to positive selection tend to have idiosyncratic amino acid compositions (e.g., they are glycine rich or arginine poor). 
An additional property of codon volatility is the near zero variance of its mean expectation, which translates into overestimated statistical significance estimates, especially in the absence of corrections for multiple comparisons. A comparison with measures of selection inferred through comparative methodology reveals no relationship between the results of the two methods. Finally, we show that codon volatility can increase in the absence of positive Darwinian selection; that is, increased codon volatility is not indicative of positive selection.&lt;br /&gt;&lt;br /&gt;Curr Genet. 1991 Nov;20(5):353-8.&lt;br /&gt;Codon usage is imposed by the gene location in the transcription unit. Delorme MO, Henaut A. Centre de Genetique Moleculaire, Laboratoire propre du CNRS associe a l'Universite Pierre et Marie Curie, Paris VI, Gif-sur-Yvette, France. A characteristic profile of the fluctuations of codon usage is observed in bacteriophages and mitochondria. By following the DNA in the direction of transcription, one moves slowly from a region where selective pressure favours codons ending with C to a region where the bias is in favour of codons ending with T; then, abruptly, one again enters a region of codons ending in C. The transcription end point takes place in the area of abrupt change in codon usage. By comparing Drosophila yakuba and mouse mitochondrial genomes, it is possible to show that the strategy of codon usage for a given gene depends on its location along the transcription unit and not on the encoded protein. 
The choice of codons ending in T or C allows large scale variations of DNA stability which could regulate the speed of propagation of the RNA polymerase.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Evol. 1991 Jun;32(6):504-10.&lt;br /&gt;Correlations between the compositional properties of human genes, codon usage, and amino acid composition of proteins. D'Onofrio G, Mouchiroud D, Aissani B, Gautier C, Bernardi G. Laboratoire de Genetique Moleculaire, Institut Jacques Monod, Paris, France. We have analyzed the correlation that exists between the GC levels of third and first or second codon position for about 1400 human coding sequences. The linear relationship that was found indicates that the large differences in GC level of third codon positions of human genes are paralleled by smaller differences in GC levels of first and second codon positions. Whereas third codon position differences correspond to very large differences in codon usage within the human genome, the first and second codon position differences correspond to smaller, yet very remarkable, differences in the amino acid composition of encoded proteins. Because GC levels of codon positions are linearly correlated with the GC levels of the isochores harboring the corresponding genes, both codon usage and amino acid composition are different for proteins encoded by genes located in isochores of different GC levels. Furthermore, we have also shown that a linear relationship with a unit slope and a correlation coefficient of 0.77 exists between GC levels of introns and exons from the 238 human genes currently available for this analysis. 
Introns are, however, about 5% lower in GC, on average, than exons from the same genes.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 1992 Jan 2;110(1):81-8.&lt;br /&gt;A universal compositional correlation among codon positions. D'Onofrio G, Bernardi G. Laboratoire de Genetique Moleculaire, Institut Jacques Monod, Paris, France. We have investigated the compositional distributions of third codon positions of genes from the 16 prokaryotes and seven eukaryotes for which the largest numbers of coding sequences are available in data banks. In prokaryotes, both narrow and broad distributions were found. In eukaryotes, distributions were very broad (except for Saccharomyces cerevisiae) and remarkably different for different genomes. In low-GC genomes, third codon positions were lower in GC than first + second codon positions and trailed towards high GC; the opposite situation was found for high-GC genomes. In all genomes, first codon positions were higher in GC than second codon positions. We then investigated the compositional correlations between third and first + second codon positions in prokaryotic genomes (the 16 mentioned above plus 87 additional ones) and in genome compartments of eukaryotes. A general, common relationship was found, which also holds within the same (heterogeneous) genomes. This universal correlation is due to the fact that the relative effects of compositional constraints on different codon positions are the same, on the average, whatever the genome under consideration.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 
2002 Oct 30;300(1-2):179-87.&lt;br /&gt;The base composition of the genes is correlated with the secondary structures of the encoded proteins. D'Onofrio G, Ghosh TC, Bernardi G. Laboratorio di Evoluzione Molecolare, Stazione Zoologica A. Dohrn, Naples, Italy. donofrio@sunev.szn.it The analysis of a non-redundant set of human proteins, for which both the crystallographic structures and the corresponding gene sequences are available, show that bases at third codon position are non-uniformly distributed along the coding sequences. Significant compositional differences are found by comparing the gene regions corresponding to the different secondary structures of the proteins. Inter- and intra-structure differences were most pronounced in the GC-richest genes. These results are not compatible with any proposed hypotheses based on a neutral process of formation/maintenance of the high GC(3) levels of the genes localized in the GC-richest isochores of the human genome.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Immunol. 1997 Mar 15;158(6):2779-89.&lt;br /&gt;Analysis of the frequency and pattern of somatic mutations within nonproductively rearranged human variable heavy chain genes. Dorner T, Brezinschek HP, Brezinschek RI, Foster SJ, Domiati-Saad R, Lipsky PE. Department of Internal Medicine, Harold C. 
Simmons Arthritis Research Center, University of Texas, Dallas 75235, USA. Somatic hypermutation plays an essential role in avidity maturation of Ab. To characterize the effects of hypermutation without the imposed bias of Ag-mediated selection, the mutational pattern of 37 nonproductively rearranged VH genes amplified from individual human B cells was analyzed. A high frequency of mutations as well as frequent replacement mutations were observed in the complementarity-determining regions (CDR) and in the framework regions of nonproductive VHDJH rearrangements. Comparison with 57 productive VH rearrangements indicated that replacement mutations, especially those occurring in the framework regions, were less frequent in productively rearranged VH genes, suggesting that they were deleted from the expressed repertoire. A number of factors contributed to the nonrandom localization of mutations, including: the targeting of specific motifs, such as AGY, GCY, GTA, TAY, and RGYW; an increased frequency of some commonly mutated motifs in the CDRs; and an apparent increased likelihood of mutations of CDR nucleotides. Each of these appeared to bias the mutational machinery, resulting in an increased frequency of replacement mutations in the CDRs of nonproductive VH rearrangements.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Nucleic Acids Res. 2004 Sep 24;32(17):5036-44. Print 2004.&lt;br /&gt;Solving the riddle of codon usage preferences: a test for translational selection. dos Reis M, Savva R, Wernisch L. School of Crystallography, Birkbeck College, University of London, Malet Street, London WC1E 7HX, UK. 
m.reis@mail.cryst.bbk.ac.uk Translational selection is responsible for the unequal usage of synonymous codons in protein coding genes in a wide variety of organisms. It is one of the most subtle and pervasive forces of molecular evolution, yet, establishing the underlying causes for its idiosyncratic behaviour across living kingdoms has proven elusive to researchers over the past 20 years. In this study, a statistical model for measuring translational selection in any given genome is developed, and the test is applied to 126 fully sequenced genomes, ranging from archaea to eukaryotes. It is shown that tRNA gene redundancy and genome size are interacting forces that ultimately determine the action of translational selection, and that an optimal genome size exists for which this kind of selection is maximal. Accordingly, genome size also presents upper and lower boundaries beyond which selection on codon usage is not possible. We propose a model where the coevolution of genome size and tRNA genes explains the observed patterns in translational selection in all living organisms. This model finally unifies our understanding of codon usage across prokaryotes and eukaryotes. Helicobacter pylori, Saccharomyces cerevisiae and Homo sapiens are codon usage paradigms that can be better understood under the proposed model.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Evol. 2003 Dec;57(6):694-701.&lt;br /&gt;Mammalian mutation pressure, synonymous codon choice, and mRNA degradation. Duan J, Antezana MA. 
Department of Psychiatry, The University of Chicago, 924 East 57th Street, R-004, Chicago, IL 60637, USA. The usage of synonymous codons (SCs) in mammalian genes is highly correlated with local base composition and is therefore thought to be determined by mutation pressure. The usage is nonetheless structured. For instance, mammals share with Saccharomyces and Drosophila most preferences for the C-ending over the G-ending codon (or vice versa) within each fourfold-degenerate SC family and the fact that their SCs are placed along coding regions in ways that minimize the number of T|A and C|G dinucleotides ("|" being the codon boundary). TA and CG underrepresentations are observed everywhere in the mammalian genome affecting the SC usage, the amino acid composition of proteins, and the primary structure of introns and noncoding DNA. While the rarity of CG is ascribed to the high mutability of this dinucleotide, the rarity of TA in coding regions is considered adaptive because UA dinucleotides are cleaved by endoribonucleases. Here we present in vivo experimental evidence indicating that the number of TA and/or CG dinucleotides of a human gene can affect strongly the expression level and degradation of its mRNA. Our results are consistent with indirect evidence produced by other workers and with the detailed work that has been devoted to characterize UA cleavage in vitro and in vivo. We conclude that SC choice can influence strongly mRNA function and gene expression through effects not directly related to the codon-anticodon interaction. These effects should constrain heavily the nucleotide motif composition of the most abundant mRNAs in the transcriptome, in particular, their SC usage, a usage that must be reflected by cellular tRNA concentrations and thus defines for all other genes which SCs are translated fastest and most accurately. 
Furthermore, the need to avoid such effects genome-wide appears serious enough to have favored the evolution of biases in context-dependent mutation that reduce the occurrence of intrinsically unfavorable motifs, and/or, when possible, to have induced the molecular machinery mediating such effects to rely opportunistically on already existing motif rarities and abundances. This may explain why nucleotide motif preferences are very similar in transcribed and nontranscribed mammalian DNA even though the preferences appear to be adaptive only in transcribed DNA.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Theor Biol. 1985 Oct 7;116(3):343-8.&lt;br /&gt;Genetic code redundancy and the evolutionary stability of protein secondary structure. Dufton MJ. The genetic code has an inherent bias towards some amino acids because of the variable number of synonymous codons per amino acid. The extent to which these biases are expressed in protein secondary structure is described through the analysis of the overall amino acid compositions of the alpha-helix, beta-sheet, beta-turn and random coil segments elucidated by X-ray crystallography. Given the concept of neutral mutation in proteins, the allocation of synonyms in the genetic code appears to protect secondary structures from amino acid changes and discourages the appearance of chemically complex residues. The level of protection is similar for each structural form, despite their clear preferences for certain amino acids. The organization of the code is therefore relevant to the preservation of conformation seen in the evolution of many protein families.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Mol Biol Evol. 
2001 May;18(5):757-62.&lt;br /&gt;The elevated GC content at exonic third sites is not evidence against neutralist models of isochore evolution. Duret L, Hurst LD. Pole BioInformatique Lyonnais, Laboratoire BBE-UMR Centre National de la Recherche Scientifique 5558, Universite Claude Bernard-Lyon 1, Villeurbanne, France. The human genome is divided into isochores, large stretches (&gt;&gt;300 kb) of genomic DNA with more or less consistent GC content. Mutational/neutralist and selectionist models have been put forward to explain their existence. A major criticism of the mutational models is that they cannot account for the higher GC content at fourfold-redundant silent sites within exons (GC4) than in flanking introns (GCi). Indeed, it has been asserted that it is hard to envisage a mutational bias explanation, as it is difficult to see how repair enzymes might act differently in exons and their flanking introns. However, this rejection, we note, ignores the effects of transposable elements (TEs), which are a major component of introns and tend to cause them to have a GC content different from (usually lower than) that dictated by point mutational processes alone. As TEs tend not to insert at the extremities of introns, this model predicts that GC content at the extremities of introns should be more like that at GC4 than are the intronic interiors. This we show to be true. The model also correctly predicts that small introns should have a composition more like that at GC4 than large introns. We conclude that the logic of the previous rejection of neutralist models is unsafe.&lt;br /&gt;&lt;br /&gt;Bioinformatics. 
2002 Oct;18 Suppl 2:S91.&lt;br /&gt;Detecting genomic features under weak selective pressure: the example of codon usage in animals and plants. Duret L. Laboratoire de Biometrie et Biologie Evolutive, Lyon, France. Large scale experiments of gene inactivation in yeast have shown that 50% of genes have no detectable impact on the phenotype, and similar observations have been made in other model organisms. This apparent paradox is probably due to the fact that many genes only have a marginal contribution to the fitness of organisms. Because of the size of populations and the number of generations that can be studied in laboratories, experimental approaches only permit to detect functional elements that have a strong phenotypic impact. Comparative sequence analysis can help to solve this problem: the analysis of sequences evolution permits to detect the action of selection, and hence to reveal functional features of genomes. This approach will be illustrated by the study of synonymous codon usage in animals and plants.&lt;br /&gt;&lt;br /&gt;Nucleic Acids Res. 
2002 Jun 1;30(11):2515-23.&lt;br /&gt;Comprehensive analysis of amino acid and nucleotide composition in eukaryotic genomes, comparing genes and pseudogenes. Echols N, Harrison P, Balasubramanian S, Luscombe NM, Bertone P, Zhang Z, Gerstein M. Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, Box 208114, New Haven, CT 06520-8114, USA. Based on searches for disabled homologs to known proteins, we have identified a large population of pseudogenes in four sequenced eukaryotic genomes-the worm, yeast, fly and human (chromosomes 21 and 22 only). Each of our nearly 2500 pseudogenes is characterized by one or more disablements mid-domain, such as premature stops and frameshifts. Here, we perform a comprehensive survey of the amino acid and nucleotide composition of these pseudogenes in comparison to that of functional genes and intergenic DNA. We show that pseudogenes invariably have an amino acid composition intermediate between genes and translated intergenic DNA. Although the degree of intermediacy varies among the four organisms, in all cases, it is most evident for amino acid types that differ most in occurrence between genes and intergenic regions. The same intermediacy also applies to codon frequencies, especially in the worm and human. 
Moreover, the intermediate composition of pseudogenes applies even though the composition of the genes in the four organisms is markedly different, showing a strong correlation with the overall A/T content of the genomic sequence. Pseudogenes can be divided into 'ancient' and 'modern' subsets, based on the level of sequence identity with their closest matching homolog (within the same genome). Modern pseudogenes usually have a much closer sequence composition to genes than ancient pseudogenes. Collectively, our results indicate that the composition of pseudogenes that are under no selective constraints progressively drifts from that of coding DNA towards non-coding DNA. Therefore, we propose that the degree to which pseudogenes approach a random sequence composition may be useful in dating different sets of pseudogenes, as well as to assess the rate at which intergenic DNA accumulates mutations. Our compositional analyses with the interactive viewer are available over the web at http://genecensus.org/pseudogene.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Proteins. 1996 Jun;25(2):169-79.&lt;br /&gt;Prediction of secondary structural content of proteins from their amino acid composition alone. II. The paradox with secondary structural class. Eisenhaber F, Frommel C, Argos P. Institut fur Biochemie der Charite, Medizinische Fakultat, Humboldt-Universitat zu Berlin, Berlin-Mitte, Germany. The success rates reported for secondary structural class prediction with different methods are contradictory. 
On one side, the problem of recognizing the secondary structural class of a protein knowing only its amino acid composition appears completely solved by simply applying jury decision with an elliptically scaled distance function. Chou and coworkers repeatedly (see Crit. Rev. Biochem. Mol. Biol. 30:275-349, 1995) published prediction accuracies near 100%. On the other hand, traditional secondary structure prediction techniques achieve success rates of about 70% for the secondary structural state per residue and about 75% for structural class only with extensive input information (full sequence of the query protein, its amino acid composition and length, multiple alignments with homologous sequences). In this article, we resolve the paradox and consider (1) the question of the secondary structural class definition, (2) the role of the representativity of the test set of protein tertiary structure for the current state of the Protein Data Bank (PDB); and (3) we estimate the real impact of amino acid composition on secondary structural class. We formulate three objective criteria for a reasonable definition of secondary structural classes and show that only the criterion of Nakashima et al. (J. Biochem. 99:153-162, 1986) complies with all of them. Only this definition matches the distribution of secondary structural content in representative PDB subsets, whereas other criteria leave many proteins (up to 65% of all PDB entries) simply unassigned. We review critically specialized secondary-structural class prediction methods, especially those of Chou and coworkers, which claim almost 100% accuracy using only amino acid composition, and resolve the paradox that these prediction accuracies are better than those from secondary structure predictions from multiple alignments. 
We show (i) that these techniques rely on a preselection of test sets which removes irregular proteins and other proteins without any class assignment (about 35% of all PDB entries); and (ii) that even for preselected representative test sets, the success rate drops to 60% and lower for a 4-type classification (alpha, beta, alpha + beta, alpha/beta). The prediction accuracies fall to about 50% if the secondary structural class definition of Nakashima et al. is applied and only few irregular proteins are preselected and removed from automatically generated, representative subsets of the PDB. We have applied two new vector decomposition methods for secondary structural content prediction from amino acid composition alone, with and without consideration of amino acid compositional coupling in the learning set of tertiary structures respectively, to the problem of class prediction and achieve about 60% correct assignment among four classes (alpha, beta, mixed, irregular) as well as single sequence-based secondary structure prediction methods like GORIII and COMBI. Our results demonstrate that 60% correctness is the upper limit for a 4-type class prediction from amino acid composition alone for an unknown query protein and that consideration of compositional coupling does not improve the prediction success. The prediction program SSCP offering secondary structural class assignment for query compositions and sequences has been made available as a World Wide Web and E-mail service.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Mol Biol Evol. 
1994 Nov;11(6):875-85.&lt;br /&gt;Evolution of base composition in the insulin and insulin-like growth factor genes.&lt;br /&gt;Ellsworth DL, Hewett-Emmett D, Li WH.&lt;br /&gt;Center for Demographic and Population Genetics, University of Texas Health Science Center at Houston 77225.&lt;br /&gt;The genomes of homeothermic (warm-blooded) vertebrates are mosaic interspersions of homogeneously GC-rich and GC-poor regions (isochores). Evolution of genome compartmentalization and GC-rich isochores is hypothesized to reflect either selective advantages of an elevated GC content or chromosome location and mutational pressure associated with the timing of DNA replication in germ cells. To address the present controversy regarding the origins and maintenance of isochores in homeothermic vertebrates, newly obtained as well as published nucleotide sequences of the insulin and insulin-like growth factor (IGF) genes, members of a well-characterized gene family believed to have evolved by repeated duplication and divergence, were utilized to examine the evolution of base composition in nonconstrained (flanking) and weakly constrained (introns and fourfold degenerate sites) regions. A phylogeny derived from amino acid sequences supports a common evolutionary history for the insulin/IGF family genes. In cold-blooded vertebrates, insulin and the IGFs were similar in base composition. 
In contrast, insulin and IGF-II demonstrate dramatic increases in GC richness in mammals, but no such trend occurred in IGF-I. Base composition of the coding portions of the insulin and IGF genes across vertebrates correlated (r = 0.90) with that of the introns and flanking regions. The GC content of homologous introns differed dramatically between insulin/IGF-II and IGF-I genes in mammals but was similar to the GC level of noncoding regions in neighboring genes. Our findings suggest that the base composition of introns and flanking regions is determined by chromosomal location and the mutational pressure of the isochore in which the sequences are embedded. An elevated GC content at codon third positions in the insulin and the IGF genes may reflect selective constraints on the usage of synonymous codons.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Evol. 1991 Nov;33(5):442-9.&lt;br /&gt;An analysis of codon usage in mammals: selection or mutation bias?&lt;br /&gt;Eyre-Walker AC.&lt;br /&gt;Institute of Cell, Animal and Population Biology, University of Edinburgh, United Kingdom.&lt;br /&gt;A new statistical test has been developed to detect selection on silent sites. This test compares the codon usage within a gene and thus does not require knowledge of which genes are under the greatest selection, that there exist common trends in codon usage across genes, or that genes have the same mutation pattern. It also controls for mutational biases that might be introduced by the adjacent bases. The test was applied to 62 mammalian sequences, and significant codon usage biases were detected in all three species examined (humans, rats, and mice). 
However, these biases appear not to be the consequence of selection, but of the first base pair in the codon influencing the mutation pattern at the third position.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Nucleic Acids Res. 2002 Mar 1;30(5):1192-7.&lt;br /&gt;Regularities of context-dependent codon bias in eukaryotic genes.&lt;br /&gt;Fedorov A, Saxonov S, Gilbert W.&lt;br /&gt;Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA. afedorov@fas.harvard.edu&lt;br /&gt;Nucleotides surrounding a codon influence the choice of this particular codon from among the group of possible synonymous codons. The strongest influence on codon usage arises from the nucleotide immediately following the codon and is known as the N1 context. We studied the relative abundance of codons with N1 contexts in genes from four eukaryotes for which the entire genomes have been sequenced: Homo sapiens, Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana. For all the studied organisms it was found that 90% of the codons have a statistically significant N1 context-dependent codon bias. The relative abundance of each codon with an N1 context was compared with the relative abundance of the same 4mer oligonucleotide in the whole genome. This comparison showed that in about half of all cases the context-dependent codon bias could not be explained by the sequence composition of the genome. 
Ranking statistics were applied to compare context-dependent codon biases for codons from different synonymous groups. We found regularities in N1 context-dependent codon bias with respect to the codon nucleotide composition. Codons with the same nucleotides in the second and third positions and the same N1 context have a statistically significant correlation of their relative abundances.&lt;br /&gt;&lt;br /&gt;Proteins. 1997 Feb;27(2):249-71.&lt;br /&gt;Patterns, structures, and amino acid frequencies in structural building blocks, a protein secondary structure classification scheme.&lt;br /&gt;Fetrow JS, Palumbo MJ, Berg G.&lt;br /&gt;Department of Biological Sciences, University at Albany, SUNY 12222, USA. acque@isadora.albany.edu&lt;br /&gt;To study local structures in proteins, we previously developed an autoassociative artificial neural network (autoANN) and clustering tool to discover intrinsic features of macromolecular structures. The hidden unit activations computed by the trained autoANN are a convenient low-dimensional encoding of the local protein backbone structure. Clustering these activation vectors results in a unique classification of protein local structural features called Structural Building Blocks (SBBs). Here we describe application of this method to a larger database of proteins, verification of the applicability of this method to structure classification, and subsequent analysis of amino acid frequencies and several commonly occurring patterns of SBBs. 
The SBB classification method has several interesting properties: 1) it identifies the regular secondary structures, alpha helix and beta strand; 2) it consistently identifies other local structure features (e.g., helix caps and strand caps); 3) strong amino acid preferences are revealed at some positions in some SBBs; and 4) distinct patterns of SBBs occur in the "random coil" regions of proteins. Analysis of these patterns identifies interesting structural motifs in the protein backbone structure, indicating that SBBs can be used as "building blocks" in the analysis of protein structure. This type of pattern analysis should increase our understanding of the relationship between protein sequence and local structure, especially in the prediction of protein structures.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;APMIS. 2003 Jun;111(6):605-18.&lt;br /&gt;The genome of Campylobacter jejuni: codon and amino acid usage.&lt;br /&gt;Fuglsang A.&lt;br /&gt;Royal Danish School of Pharmacy, Institute of Pharmacology, Copenhagen, Denmark. anfu@dfh.dk&lt;br /&gt;The genes from the genome of the AT-rich bacterium Campylobacter jejuni were analysed and characterised with respect to codon usage and amino acid usage. Codon usage is generally biased for all amino acids having synonymous codons, so that AT-rich synonyms are most frequently used. Markov chain analysis showed that codon bias, and over- or underrepresentation of the corresponding tri-letter words, are not related. Predicted secondary structure, lipophilicity, codon position within the gene, strand, and position on the (+)-strand were all shown to be determinants of codon usage, and these effects were in part directly explained by compositional phenomena. 
Codon context and the GC-content at the wobble position of the fourfold degenerate sites exert indirect effects on codon usage. The factors that affect codon usage seem to affect all amino acids, rather than selected amino acids. The usage of amino acids correlates well with the GC-content of genes, i.e. usage of amino acids encoded by GC-rich codons increases with GC-content and vice versa.&lt;br /&gt;&lt;br /&gt;C R Acad Sci Hebd Seances Acad Sci D. 1976 Dec 8;283(15):1667-70.&lt;br /&gt;[Composition and variability of bases in degenerate and non-degenerate positions in codons of various types of mRNA]&lt;br /&gt;Grantham R.&lt;br /&gt;Codon base frequencies have been studied in all published mRNA sequences of more than 50 codons. Bases are not "random" in any codon position. Variability in degenerate position III is of the same order as in the other two positions (non-degenerate). The base distribution in each codon position is significantly different from that in each other position. Position I has the most distinctive make up. Correlations demonstrate that in this ensemble of 1,265 codons positions I and II have the same type of composition as in the decoded mRNA for the average protein.&lt;br /&gt;&lt;br /&gt;Nucleic Acids Res. 
1980 Jan 11;8(1):r49-r62.&lt;br /&gt;Codon catalog usage and the genome hypothesis.&lt;br /&gt;Grantham R, Gautier C, Gouy M, Mercier R, Pave A.&lt;br /&gt;Frequencies for each of the 61 amino acid codons have been determined in every published mRNA sequence of 50 or more codons. The frequencies are shown for each kind of genome and for each individual gene. A surprising consistency of choices exists among genes of the same or similar genomes. Thus each genome, or kind of genome, appears to possess a "system" for choosing between codons. Frameshift genes, however, have widely different choice strategies from normal genes. Our work indicates that the main factors distinguishing between mRNA sequences relate to choices among degenerate bases. These systematic third base choices can therefore be used to establish a new kind of genetic distance, which reflects differences in coding strategy. The choice patterns we find seem compatible with the idea that the genome and not the individual gene is the unit of selection. Each gene in a genome tends to conform to its species' usage of the codon catalog; this is our genome hypothesis.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Nucleic Acids Res. 
1980 May 10;8(9):1893-912.&lt;br /&gt;Codon frequencies in 119 individual genes confirm consistent choices of degenerate bases according to genome type.&lt;br /&gt;Grantham R, Gautier C, Gouy M.&lt;br /&gt;The poor printing of our previous Figure 2 (1) is corrected. Codon usage in mRNA sequences just published is also given. A new correspondence analysis is done, based on simultaneous comparison in all mRNA of use of the 61 codons. This analysis reinforces our claim that most genes in a genome, or genome type, have the same coding strategy; that is, they show similar choices among synonymous codons, or among degenerate bases (2). Like analysis on frequency variation in the amino acids coded reveals an entirely different pattern.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
Nucleic Acids Res. 1981 Jan 10;9(1):r43-74.&lt;br /&gt;Codon catalog usage is a genome strategy modulated for gene expressivity.&lt;br /&gt;Grantham R, Gautier C, Gouy M, Jacobzone M, Mercier R.&lt;br /&gt;The nucleic acid sequence bank now contains 161 mRNAs, 43 new genes are added. One sequence, that of B. mori fibroin, is dropped due to uncertainty on the starting point for translation. Frequencies of all codons are given for each gene added and for each genome type in the total bank. A new series of correspondence analyses on codon use is presented, substantiating the genome hypothesis. Internal regulation of mRNA expression by different third base choices between quartet and duet codons is proposed for bacterial genes.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 
1982 Jun;18(3):199-209.&lt;br /&gt;Preferential codon usage in prokaryotic genes: the optimal codon-anticodon interaction energy and the selective codon usage in efficiently expressed genes.&lt;br /&gt;Grosjean H, Fiers W.&lt;br /&gt;By considering the nucleotide sequence of several highly expressed coding regions in bacteriophage MS2 and mRNAs from Escherichia coli, it is possible to deduce some rules which govern the selection of the most appropriate synonymous codons NNU or NNC read by tRNAs having GNN, QNN or INN as anticodon. The rules fit with the general hypothesis that an efficient in-phase translation is facilitated by proper choice of degenerate codewords promoting a codon-anticodon interaction with intermediate strength (optimal energy) over those with very strong or very weak interaction energy. Moreover, codons corresponding to minor tRNAs are clearly avoided in these efficiently expressed genes. These correlations are clearcut in the normal reading frame but not in the corresponding frameshift sequences +1 and +2. We hypothesize that both the optimization of codon-anticodon interaction energy and the adaptation of the population to codon frequency or vice versa in highly expressed mRNAs of E. coli are part of a strategy that optimizes the efficiency of translation. Conversely, codon usage in weakly expressed genes such as repressor genes follows exactly the opposite rules. 
It may be concluded that, in addition to the need for coding an amino acid sequence, the energetic consideration for codon-anticodon pairing, as well as the adaptation of codons to the tRNA population, may have been important evolutionary constraints on the selection of the optimal nucleotide sequence.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;IEEE Trans Nanobioscience. 2003 Sep;2(3):150-7.&lt;br /&gt;Folding type specific secondary structure propensities of synonymous codons.&lt;br /&gt;Gu W, Zhou T, Ma J, Sun X, Lu Z.&lt;br /&gt;Key Laboratory of Molecular and Biomolecular Electronics, Southeast University, Ministry of Education, Nanjing 210096, China.&lt;br /&gt;We have proposed new amino acid secondary structure propensities in proteins with different folding types based on synonymous codons. They have been derived from 200 all alpha, all beta, alpha/beta, and alpha + beta proteins of known structures and their coding genes. The secondary structure propensities of the same codon in gene coding for different folding type proteins are not the same. For instance, amino acid Ile coded by AUU is indifferent to form the alpha unit in the alpha + beta protein class, but it is a former and a breaker for the alpha unit in the all alpha protein class and the alpha/beta class, respectively. On the other hand, the secondary structure propensities of different synonymous codons in the coding genes with the same folding type are also not all the same. As an example, CGU, CGG, and AGA, which are synonymous codons of Arg, are preferential to form the alpha unit in all alpha proteins, while CGA is an alpha unit breaker and the other two synonymous codons, CGC and AGG, are indifferent to form or break the alpha unit. 
As a result, protein secondary structure information contained both in mRNA sequences and in amino acid sequences has been introduced in these codon-based amino acid secondary structure propensities. These codon-based amino acid secondary structure propensities are helpful to in vitro protein design and protein secondary structure prediction.&lt;br /&gt;&lt;br /&gt;Biosystems. 2004 Feb;73(2):89-97.&lt;br /&gt;The relationship between synonymous codon usage and protein structure in Escherichia coli and Homo sapiens.&lt;br /&gt;Gu W, Zhou T, Ma J, Sun X, Lu Z.&lt;br /&gt;Key Laboratory of Molecular and Biomolecular Electronics (Southeast University), Ministry of Education, Nanjing 210096, China.&lt;br /&gt;The role of silent position in the codon on the protein structure is an interesting and yet unclear problem. In this paper, 563 Homo sapiens genes and 417 Escherichia coli genes coding for proteins with four different folding types have been analyzed using variance analysis, a multivariate analysis method newly used in codon usage analysis, to find the correlation between amino acid composition, synonymous codon, and protein structure in different organisms. It has been found that in E. coli, both amino acid compositions in differently folded proteins and synonymous codon usage in different gene classes coding for differently folded proteins are significantly different. It was also found that only amino acid composition is different in different protein classes in H. sapiens. There is no universal correlation between synonymous codon usage and protein structure in these two different organisms. 
Further analysis has shown that GC content on the second codon position can distinguish coding genes for different folded proteins in both organisms.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Biochem Biophys Res Commun. 2000 Mar 24;269(3):692-6.&lt;br /&gt;Studies on the relationships between the synonymous codon usage and protein secondary structural units.&lt;br /&gt;Gupta SK, Majumdar S, Bhattacharya TK, Ghosh TC.&lt;br /&gt;Distributed Information Centre, Bose Institute, P 1/12, C.I.T. Scheme, VII M, Calcutta, 700 054, India.&lt;br /&gt;The relationship between the synonymous codon usage and protein secondary structural elements (alpha helices and beta sheets) was reinvestigated by taking structural information of proteins from the Protein Data Bank (PDB) and their corresponding mRNA sequences from GenBank for four different organisms: E. coli, B. subtilis, S. cerevisiae, and Homo sapiens. It was observed that synonymous codon families have non-random codon usage, but there does not exist any species-invariant universal correlation between the synonymous codon usage and protein secondary structural elements. The secondary structural units of proteins can be distinguished from the occurrences of bases at the second codon position.&lt;br /&gt;&lt;br /&gt;Pac Symp Biocomput. 
1998;:106-17.&lt;br /&gt;DINAMO: a coupled sequence alignment editor/molecular graphics tool for interactive homology modeling of proteins.&lt;br /&gt;Hansen M, Bentz J, Baucom A, Gregoret L.&lt;br /&gt;Department of Biology, University of California, Santa Cruz 95064, USA.&lt;br /&gt;Gaining functional information about a novel protein is a universal problem in biomedical research. With the explosive growth of the protein sequence and structural databases, it is becoming increasingly common for researchers to attempt to build a three-dimensional model of their protein of interest in order to gain information about its structure and interactions with other molecules. The two most reliable methods for predicting the structure of a protein are homology modeling, in which the novel sequence is modeled on the known three-dimensional structure of a related protein, and fold recognition (threading), where the sequence is scored against a library of fold models, and the highest scoring model is selected. The sequence alignment to a known structure can be ambiguous, and human intervention is often required to optimize the model. We describe an interactive model building and assessment tool in which a sequence alignment editor is dynamically coupled to a molecular graphics display. By means of a set of assessment tools, the user may optimize his or her alignment to satisfy the known heuristics of protein structure. Adjustments to the sequence alignment made by the user are reflected in the displayed model by color and other visual cues. For instance, residues are colored by hydrophobicity in both the three-dimensional model and in the sequence alignment. 
This aids the user in identifying undesirable buried polar residues. Several different evaluation metrics may be selected including residue conservation, residue properties, and visualization of predicted secondary structure. These characteristics may be mapped to the model both singly and in combination. DINAMO is a Java-based tool that may be run either over the web or installed locally. Its modular architecture also allows Java-literate users to add plug-ins of their own design.&lt;br /&gt;&lt;br /&gt;Genetics. 1994 Sep;138(1):227-34.&lt;br /&gt;Selection intensity for codon bias.&lt;br /&gt;Hartl DL, Moriyama EN, Sawyer SA.&lt;br /&gt;Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138.&lt;br /&gt;The patterns of nonrandom usage of synonymous codons (codon bias) in enteric bacteria were analyzed. Poisson random field (PRF) theory was used to derive the expected distribution of frequencies of nucleotides differing from the ancestral state at aligned sites in a set of DNA sequences. This distribution was applied to synonymous nucleotide polymorphisms and amino acid polymorphisms in the gnd and putP genes of Escherichia coli. 
For the gnd gene, the average intensity of selection against disfavored synonymous codons was estimated as approximately 7.3 x 10(-9); this value is significantly smaller than the estimated selection intensity against selectively disfavored amino acids in observed polymorphisms (2.0 x 10(-8)), but it is approximately of the same order of magnitude. The selection coefficients for optimal synonymous codons estimated from PRF theory were consistent with independent estimates based on codon usage for threonine and glycine. Across 118 genes in E. coli and Salmonella typhimurium, the distribution of estimated selection coefficients, expressed as multiples of the effective population size, has a mean and standard deviation of 0.5 +/- 0.4. No significant differences were found in the degree of codon bias between conserved positions and replacement positions, suggesting that translational misincorporation is not an important selective constraint among synonymous polymorphic codons in enteric bacteria. However, across the first 100 codons of the genes, conserved amino acids with identical codons have significantly greater codon bias than that of either synonymous or nonidentical codons, suggesting that there are unique selective constraints, perhaps including mRNA secondary structures, in this part of the coding region.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 2003 Jul 17;312:197-206.&lt;br /&gt;Erratum in: Gene. 2003 Nov 27;320:191.&lt;br /&gt;Low hanging fruit: a subset of human cSNPs is both highly non-uniform and predictable.&lt;br /&gt;Horvath MM, Fondon JW 3rd, Garner HR.&lt;br /&gt;
McDermott Center for Human Growth and Development, The University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390-8591, USA. monica.horvath@utsouthwestern.edu&lt;br /&gt;We present a point mutation classification method that contrasts SNP databases and has the potential to illuminate the relative mutational load of genes caused by codon bias. We group point variation gleaned from public databases by their wild-type and mutant codons, e.g. codon mutation classes (CMCs, 576 possible such as ACG--&gt;ATG), whose frequencies in a database are assembled into a BLOSUM-style matrix describing the likelihood of observing all possible single base codon changes as tuned by the intertwined effects of mutation rate and selection. The rankings of the CMCs in any database are reshuffled according to the population stratification of the typical genotyping experiment producing that resource's data. Analysis of four independent databases reveals that a considerable fraction of mutation in functional genes can be described by a few CMCs regardless of gene identity or population stratification in the genotyping experiment. For example, the top 5% (29/576) of CMCs account for 27.4% of the observed variants in dbSNP while the bottom 5% account for only 0.02%. For non-synonymous disease-causing mutation, 40.8% are described by the top 5% of all possible non-silent CMCs (22/438). Overall, the most observed polymorphism is a G--&gt;A transition at CpG dinucleotides causing ACG, TCG, GCG, and CCG to frequently undergo silent mutation in any gene due to the putative lack of impact on the protein product. In order to assess how well CMC spectrums estimate the aggregate non-synonymous mutational trends of a single gene, a CMC matrix was applied to seven unrelated genes to compute the most likely point mutations. In excess of 87% of these mutation predictions are historically known to play an important role in a disease state according to published literature. 
CMC-based mutation prediction may aid design and execution of direct association genotyping studies.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Microb Comp Genomics. 1998;3(4):243-53.&lt;br /&gt;Relationship between codon usage and sequence-dependent curvature of genomes.&lt;br /&gt;Jauregui R, O'Reilly F, Bolivar F, Merino E. Laboratorio de Biologia Computacional, Centro de Investigacion sobre Fijacion de Nitrogeno, Morelos, Mexico.&lt;br /&gt;Static DNA curvature distributions of full-sequenced genomes and large DNA contigs from different organisms were calculated. Very distinctive differences among histogram profiles coming from archaebacteria, eubacteria, and eukaryotes were observed. Eubacterial profiles were, on average, more curved than were archaeal and eukaryotic profiles. A comparative analysis between real and randomized DNA sequences revealed that eubacterial genomes presented, overall, higher curvature values than random sequences. An opposite portrait was exhibited by archaeal and eukaryotic genomes. They displayed a lower frequency of curved regions than their corresponding randomized sequences. The contributions of coding and intergenic regions to the curvature profile were also analyzed. Intergenic regions, on average, were found to be more curved than the overall genomic sequences, especially in prokaryotic organisms. Nevertheless, because of their small size with respect to coding regions, the contribution of intergenic sequences to the overall curvature profile tended to be minor. A clear relationship between codon usage and DNA curvature was demonstrated, and a proposal of the possible coevolution of both systems is discussed. 
Finally, we present a procedure to quantify the deviation of a curvature profile from randomness through a formal statistical analysis.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Biol. 1996 Oct 4;262(4):459-72.&lt;br /&gt;What drives codon choices in human genes?&lt;br /&gt;Karlin S, Mrazek J. Department of Mathematics, Stanford University, CA 94305-2125, USA.&lt;br /&gt;Synonymous codon usage is biased and the bias seems to be different in different organisms. Factors with proposed roles in causing codon bias include&lt;br /&gt;(1) degree and timing of gene expression,&lt;br /&gt;(2) codon-anticodon interactions,&lt;br /&gt;(3) transcription and translation rate and fidelity,&lt;br /&gt;(4) codon context, and&lt;br /&gt;(5) global and local G + C content.&lt;br /&gt;We offer a new perspective and new methods for elucidating codon choices applied especially to the human genome. We present data supporting the thesis that codon choices for human genes are largely a consequence of two factors:&lt;br /&gt;(1) amino acid constraints,&lt;br /&gt;(2) maintaining DNA structures dependent on base-step conformational tendencies consistent with the organism's genome signature that is determined by genome-wide processes of DNA modification, replication and repair.&lt;br /&gt;The related codon signature defined as the dinucleotide relative abundances at the distinct codon positions (1,2), (2,3), and (3,4) (4 = 1 of the next codon) accommodates both the global genome signature and amino acid constraints. In human genes, codon positions (2,3) and (3,4) containing the silent site have similar codon signatures reflecting DNA symmetry. 
Strong CG and TA dinucleotide underrepresentation is observed at all codon positions as well as in non-coding regions. Estimates of synonymous codon usage based on codon signatures are in excellent agreement with the actual codon usage in human and general vertebrate genes. These properties are largely independent of the isochore compartment (G + C content), gene size, and transcriptional and translational constraints. We hypothesize that major influences on codon usage in human genes result from residue preferences and diresidue associations in proteins coupled to biases on the DNA level, related to replication and repair processes and/or DNA structural requirements.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Genes Genet Syst. 2003 Oct;78(5):343-52.&lt;br /&gt;Patterns of codon usage bias in three dicot and four monocot plant species.&lt;br /&gt;Kawabe A, Miyashita NT. Laboratory of Plant Genetics, Graduate School of Agriculture, Kyoto University, Japan. akirakawabe@hotmail.com&lt;br /&gt;Codon usage in nuclear genes of four monocot and three dicot species was analyzed to find general patterns in codon choice of plant species. Codon bias was correlated with GC content at the third codon position. GC contents were higher in monocot species than in dicot species at all codon positions. The high GC contents of monocot species might be the result of relatively strong mutational bias that occurred in the lineage of the Poaceae species. In both dicot and monocot species, the effective number of codons (ENCs) for most genes was similar to that for the expected ENCs based on the GC content at the third codon positions. 
G and C ending codons were detected as the "preferred" codons in monocot species, as in Drosophila. Also, many "preferred" codons are the same in dicot species. Pyrimidine (C and T) is used more frequently than purine (G and A) in four-fold degenerate codon groups.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Mol Biol Evol. 1997 Jun;14(6):637-43.&lt;br /&gt;Codon bias and plasticity in immunoglobulins.&lt;br /&gt;Kepler TB. Department of Statistics, North Carolina State University, Raleigh 27695-8203, USA. kepler@unity.ncsu.edu&lt;br /&gt;Immunoglobulin genes experience Darwinian evolution twice. In addition to the germline evolution all genes experience, immunoglobulins are subjected, upon exposure to antigen, to somatic hypermutation. This is accompanied by selection for high affinity to the eliciting antigen and frequently results in a significant increase in the specificity of the responding population. The hypermutation mechanism displays a strong sequence specificity. Thus arises the opportunity to manipulate codon bias in a site-specific manner so as to direct hypermutation to those parts of the gene that encode the antigen-binding portions of the molecule and away from those that encode the structurally conserved regions. This segregation of mutability would clearly be advantageous; it would enhance the generation of potentially useful variants while keeping mutational loss to acceptably low levels. 
But it is not clear that the advantage gained would be large enough to produce a measurable effect within the background stochasticity of the evolutionary process. I have performed a pair of statistical tests to determine whether site-specific codon bias in human immunoglobulin genes is correlated with the sequence specificity of the somatic mutation mechanism. The sequence specificity of the mutator was determined by analysis of a database of published immunoglobulin intron sequences that had experienced somatic mutation but not selection. The site-specific codon bias was determined by analysis of published sequences of human germline immunoglobulin V genes. Both tests strongly suggest that evolution has acted to enhance the plasticity of immunoglobulin genes under somatic hypermutation [I would say that the unexpected plasticity of immunoglobulin genes undergoing somatic mutation is evidence for evolutionary selection].&lt;br /&gt;&lt;br /&gt;Mol Biol Evol. 2004 Feb;21(2):286-94. Epub 2003 Dec 05.&lt;br /&gt;Effect of strong directional selection on weakly selected mutations at linked sites: implication for synonymous codon usage.&lt;br /&gt;Kim Y. Department of Biological Statistics and Computational Biology, Cornell University, New York, USA. ykim@mail.rochester.edu&lt;br /&gt;The fixation of weakly selected mutations can be greatly influenced by strong directional selection at linked loci. Here, I investigate a two-locus model in which weakly selected, reversible mutations occur at one locus and recurrent strong directional selection occurs at the other locus. 
This model is analogous to selection on codon usage at synonymous sites linked to nonsynonymous sites under strong directional selection. Two approximations obtained here describe the expected frequency of the weakly selected preferred alleles at equilibrium. These approximations, as well as simulation results, show that the level of codon bias declines with an increasing rate of substitution at the strongly selected locus, as expected from the well-understood theory that selection at one locus reduces the efficacy of selection at linked loci. These solutions are used to examine whether the negative correlation between codon bias and nonsynonymous substitution rates recently observed in Drosophila can be explained by this hitchhiking effect. It is shown that this observation can be reasonably well accounted for if a large fraction of the nonsynonymous substitutions on genes in the data set are driven by strong directional selection.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Proc Natl Acad Sci U S A. 1981 Sep;78(9):5773-7.&lt;br /&gt;Possibility of extensive neutral evolution under stabilizing selection with special reference to nonrandom usage of synonymous codons.&lt;br /&gt;Kimura M.&lt;br /&gt;The rate of evolution in terms of the number of mutant substitutions in a finite population is investigated assuming a quantitative character subject to stabilizing selection, which is known to be the most prevalent type of natural selection. It is shown that, if a large number of segregating loci (or sites) are involved, the average selection coefficient per mutant under stabilizing selection may be exceedingly small. 
These mutants are very slightly deleterious but nearly neutral, so that mutant substitutions are mainly controlled by random drift, although the rate of evolution may be lower as compared with the situation in which all the mutations are strictly neutral. This is treated quantitatively by using the diffusion equation method in population genetics. A model of random drift under stabilizing selection is then applied to the problem of "nonrandom" or unequal usage of synonymous codons, and it is shown that such nonrandomness can readily be understood within the framework of the neutral mutation--random drift hypothesis (the neutral theory, for short) of molecular evolution.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Biochim Biophys Acta. 1988 Jul 13;950(2):215-20.&lt;br /&gt;Nucleotide sequence of a cDNA clone encoding mouse transition protein 1.&lt;br /&gt;Kleene KC, Borzorgzadeh A, Flynn JF, Yelick PC, Hecht NB. Department of Biology, University of Massachusetts, Boston 02125.&lt;br /&gt;We have determined the nucleotide sequence of cDNA clones encoding mouse transition protein 1 (TP1), a basic nuclear protein involved in nuclear condensation during spermiogenesis. The nucleotide sequence predicts that transition protein 1 in rats and mice differs by only one amino acid. The rate of substitution of nucleotides in the coding region of mouse and rat transition protein 1 mRNA is close to the average of many proteins in rats and mice, and the usage of degenerate codons is typical of the mouse. The identification of this cDNA clone, in conjunction with previous work (Kleene et al. (1983) Dev. Biol. 98, 455-464; Hecht et al. (1986) Exp. Cell Res. 
164, 183-190), demonstrates that the mRNA for mouse transition protein 1 accumulates during the haploid phase of spermatogenesis.&lt;br /&gt;&lt;br /&gt;Genome Biol. 2001;2(4):RESEARCH0010. Epub 2001 Mar 22.&lt;br /&gt;A simple model based on mutation and selection explains trends in codon and amino-acid usage and GC composition within and across genomes.&lt;br /&gt;Knight RD, Freeland SJ, Landweber LF. Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA.&lt;br /&gt;BACKGROUND: Correlations between genome composition (in terms of GC content) and usage of particular codons and amino acids have been widely reported, but poorly explained. We show here that a simple model of processes acting at the nucleotide level explains codon usage across a large sample of species (311 bacteria, 28 archaea and 257 eukaryotes). The model quantitatively predicts responses (slope and intercept of the regression line on genome GC content) of individual codons and amino acids to genome composition. RESULTS: Codons respond to genome composition on the basis of their GC content relative to their synonyms (explaining 71-87% of the variance in response among the different codons, depending on measure). Amino-acid responses are determined by the mean GC content of their codons (explaining 71-79% of the variance). Similar trends hold for genes within a genome. 
Position-dependent selection for error minimization explains why individual bases respond differently to directional mutation pressure. CONCLUSIONS: Our model suggests that GC content drives codon usage (rather than the converse). It unifies a large body of empirical evidence concerning relationships between GC content and amino-acid or codon usage in disparate systems. The relationship between GC content and codon and amino-acid usage is ahistorical; it is replicated independently in the three domains of living organisms, reinforcing the idea that genes and genomes at mutation/selection equilibrium reproduce a unique relationship between nucleic acid and protein composition. Thus, the model may be useful in predicting amino-acid or nucleotide sequences in poorly characterized taxa.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Indian J Biochem Biophys. 1995 Dec;32(6):417-23.&lt;br /&gt;Contextual constraints in the choice of synonymous codons.&lt;br /&gt;Kolaskar AS, Joshi B, Reddy BV. Distributed Information Center, University of Pune. kolaskar@bioinfo.ernet.in&lt;br /&gt;From EMBL Nucleotide Sequence Database, protein coding sequences of all E. coli and its DNA phages, were extracted using our computer programme. Same programme has been used to form a database of sequence of oligonucleotides of length 18 nucleotides on both sides of each of the 61 codons. From analysis of this database and study of variations in twist parameter (Tw) values, as an indicator of sequence dependent variations in B-DNA helix, a method is developed to fix the codon among the set of synonymous codons. The accuracy of the method was checked on enlarged data set by adding data from more prokaryotes. 
Our method assigns the codon 85-90% times correctly if the selection has to be made between codons having different sequence in terms of R and Y. The accuracy of the method is somewhat lower when choice of the codon has to be made between codons having same codes in terms of R and Y. This study points out that the major factors which decide the choice of a codon from a set of synonymous codons are contextual constraints arising from flanking regions.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Biochimie. 1985 May;67(5):469-73.&lt;br /&gt;The missense errors in protein can be controlled by selective synonymous codon usage at the level of transcription.&lt;br /&gt;Konopka AK, Brendel V.&lt;br /&gt;In the cases of the 6-fold degenerate residues and the stop signal, selective codon usage at the level of transcription can account for a 10-20% variation in their mistranslation rate. For all other residues, the mistranslation rate is dependent upon the degree of degeneracy only, but not upon the pattern of synonymous codon usage.&lt;br /&gt;&lt;br /&gt;Biochimie. 1985 May;67(5):455-68.&lt;br /&gt;Theory of degenerate coding and informational parameters of protein coding genes.&lt;br /&gt;Konopka AK.&lt;br /&gt;The theory of degenerate coding is presented in a way enabling further application to molecular biology. There are two kinds of redundancy of a degenerate code. 
The first is due to the excess in codon length and the second to the code degeneracy. If the code is asymmetrically degenerate, the second kind of redundancy can be profitable for control of error rate. This control can be performed just by selective synonymous codon usage. Utilisation of the genetic code is partially influenced by this theoretical possibility. In particular the degree of error protectivity is well correlated with deviation from equiprobability in synonymous codon usage. The biological significance of this fact is discussed.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Am J Hum Genet. 1998 Aug;63(2):474-88.&lt;br /&gt;Neighboring-nucleotide effects on the rates of germ-line single-base-pair substitution in human genes.&lt;br /&gt;Krawczak M, Ball EV, Cooper DN. Institute of Medical Genetics, University of Wales College of Medicine, Cardiff CF4 4XN, United Kingdom. krawczak@cardiff.ac.uk&lt;br /&gt;The spectrum of single-base-pair substitutions logged in The Human Gene Mutation Database (HGMD), comprising 7,271 different lesions in the coding regions of 547 different human genes, was analyzed for nearest-neighbor effects on relative mutation rates. Owing to its retrospective nature, HGMD allows mutation rates to be estimated only in relative terms. Therefore, a novel methodology was devised in order to obtain these estimates in iterative fashion, correcting, at the same time, for the confounding effects of differential codon usage and for the fact that different types of amino acid replacement come to clinical attention with different probabilities. 
Over and above the hypermutability of CpG dinucleotides, reflected in transition rates five times the base mutation rate, only a subtle and locally confined influence of the surrounding DNA sequence on relative single-base-pair substitution rates was observed, which extended no farther than 2 bp from the substitution site. A disparity between the two DNA strands was evidenced by the fact that, when substitution rates were estimated conditional on the 5' and 3' flanking nucleotides, a significant rate difference emerged for 10 of 96 possible pairs of complementary substitutional events. Mutational bias, favoring substitutions toward flanking bases, a phenomenon reminiscent of misalignment mutagenesis, was apparent and exhibited both directionality and reading-frame sensitivity. No specific preponderance of repeat-sequence motifs was observed in the vicinity of nucleotide substitutions, but a moderate correlation between the relative mutability and thermodynamic stability of DNA triplets emerged, suggesting either inefficient DNA replication in regions of high stability or the transient stabilization of misaligned intermediates.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Plant Mol Biol. 1996 May;31(2):337-54.&lt;br /&gt;Function of 3' non-coding sequences and stop codon usage in expression of the chloroplast psaB gene in Chlamydomonas reinhardtii.&lt;br /&gt;Lee H, Bingham SE, Webber AN. Department of Botany, Arizona State University, Tempe 85287-1601, USA.&lt;br /&gt;The rate of mRNA decay is an important step in the control of gene expression in prokaryotes, eukaryotes and cellular organelles. Factors that determine the rate of mRNA decay in chloroplasts are not well understood. 
Chloroplast mRNAs typically contain an inverted repeat sequence within the 3' untranslated region that can potentially fold into a stem-loop structure. These stem-loop structures have been suggested to stabilize the mRNA by preventing degradation by exonuclease activity, although such a function in vivo has not been clearly established. Secondary structures within the translation reading frame may also determine the inherent stability of an mRNA. To test the function of the inverted repeat structures in chloroplast mRNA stability mutants were constructed in the psaB gene that eliminated the 3' flanking sequences of psaB or extended the open reading frame into the 3' inverted repeat. The mutant psaB genes were introduced into the chloroplast genome of Chlamydomonas reinhardtii. Mutants lacking the 3' stem-loop exhibited a 75% reduction in the level of psaB mRNA. The accumulation of photosystem I complexes was also decreased by a corresponding amount indicating that the mRNA level is limiting to PsaB protein synthesis. Pulse-chase labeling of the mRNA showed that the decay rate of the psaB mRNA was significantly increased demonstrating that the stem-loop structure is required for psaB mRNA stability. When the translation reading frame was extended into the 3' inverted repeat the mRNA level was reduced to only 2% of wild-type indicating that ribosome interaction with stem-loop structures destabilizes chloroplast mRNAs. The non-photosynthetic phenotype of the mutant with an extended reading frame allowed us to test whether infrequently used stop codons (UAG and UGA) can terminate translation in vivo. Both UAG and UGA are able to effectively terminate PsaB synthesis although UGA is never used in any of the Chlamydomonas chloroplast genes that have been sequenced.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Evol. 
2002 May;54(5):625-37.&lt;br /&gt;Codon usage by transposable elements and their host genes in five species.&lt;br /&gt;Lerat E, Capy P, Biemont C. Laboratoire Biometrie et Biologie Evolutive, UMR CNRS 5558, Universite Lyon 1, 69622 Villeurbanne Cedex, France.&lt;br /&gt;We compared the codon usage of sequences of transposable elements (TEs) with that of host genes from the species Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Saccharomyces cerevisiae, and Homo sapiens. Factorial correspondence analysis showed that, regardless of the base composition of the genome, the TEs differed from the genes of their host species by their AT-richness. In all species, the percentage of A + T on the third codon position of the TEs was higher than that on the first codon position and lower than that in the noncoding DNA of the genomes. This indicates that the codon choice is not simply the outcome of mutational bias but is also subject to selection constraints. A tendency toward higher A + T on the third position than on the first position was also found in the host genes of A. thaliana, C. elegans, and S. cerevisiae but not in those of D. melanogaster and H. sapiens. This strongly suggests that the AT choice is a host-independent characteristic common to all TEs. The codon usage of TEs generally appeared to be different from the mean of the host genes. In the AT-rich genomes of Arabidopsis thaliana, Caenorhabditis elegans, and Saccharomyces cerevisiae, the codon usage bias of TEs was similar to that of weakly expressed genes. In the GC-rich genome of D. 
melanogaster, however, the bias in codon usage of the TEs clearly differed from that of weakly expressed genes. These findings suggest that selection acts on TEs and that TEs may display specific behavior within the host genomes.&lt;br /&gt;&lt;br /&gt;Mol Biol Evol. 1985 Mar;2(2):150-74.&lt;br /&gt;A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes.&lt;br /&gt;Li WH, Wu CI, Luo CC. Center for Demographic and Population Genetics, University of Texas, Houston 77225.&lt;br /&gt;A new method is proposed for estimating the number of synonymous and nonsynonymous nucleotide substitutions between homologous genes. In this method, a nucleotide site is classified as nondegenerate, twofold degenerate, or fourfold degenerate, depending on how often nucleotide substitutions will result in amino acid replacement; nucleotide changes are classified as either transitional or transversional, and changes between codons are assumed to occur with different probabilities, which are determined by their relative frequencies among more than 3,000 changes in mammalian genes. The method is applied to a large number of mammalian genes. The rate of nonsynonymous substitution is extremely variable among genes; it ranges from 0.004 X 10(-9) (histone H4) to 2.80 X 10(-9) (interferon gamma), with a mean of 0.88 X 10(-9) substitutions per nonsynonymous site per year. 
The rate of synonymous substitution is also variable among genes; the highest rate is three to four times higher than the lowest one, with a mean of 4.7 X 10(-9) substitutions per synonymous site per year. The rate of nucleotide substitution is lowest at nondegenerate sites (the average being 0.94 X 10(-9)), intermediate at twofold degenerate sites (2.26 X 10(-9)), and highest at fourfold degenerate sites (4.2 X 10(-9)). The implication of our results for the mechanisms of DNA evolution and that of the relative likelihood of codon interchanges in parsimonious phylogenetic reconstruction are discussed.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Biopolymers. 1999 May;49(6):481-95.&lt;br /&gt;Protein loops on structurally similar scaffolds: database and conformational analysis.&lt;br /&gt;Li W, Liu Z, Lai L. Institute of Physical Chemistry, Peking University, Beijing, P. R. China.&lt;br /&gt;A general problem in comparative modeling and protein design is the conformational evaluation of loops with a certain sequence in specific environmental protein frameworks. Loops of different sequences and structures on similar scaffolds are common in the Protein Data Bank (PDB). In order to explore both structural and sequential diversity of them, a data base of loops connecting similar secondary structure fragments is constructed by searching the data base of families of structurally similar proteins and PDB. A total of 84 loop families having 2-13 residues are found among the well-determined structures of resolution better than 2.5 A. Eight alpha-alpha, 20 alpha-beta, 19 beta-alpha, and 37 beta-beta families are identified. Every family contains more than 5 loop motifs. 
In each family, no loops share the same sequence and all the frameworks are well superimposed. Forty-three new loop classes are distinguished in the data base. The structural variability of loops in homologous proteins is examined and shown in 44 families. Motif families are characterized with geometric parameters and sequence patterns. The conformations of loops in each family are clustered into subfamilies using the average linkage cluster analysis method. Information such as geometric properties, sequence profile, sequential and structural variability in loop, structural alignment parameters, sequence similarities, and clustering results are provided. Correlations between the conformation of loops and loop sequence, motif sequence, and global sequence of PDB chain are examined in order to find how loop structures depend on their sequences and how they are affected by the local and global environment. Strong correlations (R &gt; 0.75) are only found in 24 families. The best R value is 0.98. The data base is available through the Internet.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Mol Biol Evol. 2001 Nov;18(11):2040-7. PMID: 11606700&lt;br /&gt;Evaluation of methods for determination of a reconstructed history of gene sequence evolution.&lt;br /&gt;Liberles DA.&lt;br /&gt;Department of Biochemistry and Biophysics and Stockholm Bioinformatics Center, Stockholm University, Stockholm, Sweden. 
liberles@sbc.su.se&lt;br /&gt;With whole-genome sequences being completed at an increasing rate, it is important to develop and assess tools to analyze them. Following annotation of the protein content of a genome, one can compare sequences with previously characterized homologous genes to detect novel functions within specific proteins in the evolution of the newly sequenced genome. One common statistical method to detect such changes is to compare the ratios of nonsynonymous (K(a)) to synonymous (K(s)) nucleotide substitution rates. Here, the effects of several parameters that can influence this calculation (sequence reconstruction method, phylogenetic tree branch length weighting, GC content, and codon bias) are examined. Also, two new alternative measures of adaptive evolution, the point accepted mutations (PAM)/neutral evolutionary distance (NED) ratio and the sequence space assessment (SSA) statistic are presented. All of these methods are compared using two sequence families: the recent divergence of leptin orthologs in primates, and the more ancient divergence of the deoxyribonucleoside kinase family. The examination of these and other measures to detect changes of gene function along branches of a phylogenetic tree will become increasingly important in the postgenomic era.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Evol. 2003 Nov;57(5):538-45. PMID: 14738312&lt;br /&gt;Nonrandom intragenic variations in patterns of codon bias implicate a sequential interplay between transitional genetic drift and functional amino acid selection.&lt;br /&gt;Lin K, Tan SB, Kolatkar PR, Epstein RJ.&lt;br /&gt;
Institute of Molecular and Cell Biology, Bioinformatics Centre, National University of Singapore, 117609, Singapore.&lt;br /&gt;Although most codon third bases appear to be functionless, the synonymous codons so defined exhibit a strikingly nonrandom distribution (codon bias) within human and other genes. To examine this phenomenon further, we generated a database of DNA sequences encoding human transmembrane cell-surface receptor proteins. Using this database we show here that the guanine and cytosine content of codon third bases (GC3) varies intragenically with&lt;br /&gt;(1)   the nature of the specified receptor domains (transmembrane &gt; extracellular &gt; intracellular domains; p &lt; 0.001),&lt;br /&gt;(2)   the phenotype of the encoded amino acids (hydrophobic &gt; hydrophilic &gt; neutral amino acids; p &lt; 0.001), and&lt;br /&gt;(3)     the receptor affiliation of the transmembrane (G-protein-coupled receptors &gt; receptor tyrosine kinases; p &lt; 0.001).&lt;br /&gt;Within gene regions specifying transmembrane domains, GC3 declines as domain functionality becomes redundant with increasing hydrophobicity (p &lt;&gt; XTG mechanism of codon bias, the G3:A3 ratio of codons specifying the transmembrane amino acid glycine (GGZ) is intermediate between that of its functional homolog alanine (GCZ) and that of hydrophobic valine (GTZ), even though the C3:T3 ratios are similar. Conversely, nearest-neighbor analysis of third bases 5' to codons specifying valine and leucine (CTZ) confirms a significant difference in C3:T3 but not G3:A3 ratios (i.e., C3/G1 --&gt; T3/G1 &gt; C3/A1; p &lt; 0.001), consistent with the functionally advantageous retention of hydrophobic residues. These data raise the possibility that patterns of intragenic codon bias reflect a balance between negative and positive selection, suggesting in turn that analysis of codon third-base usage may help to predict the functional significance of encoded products.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Biol. 
1983 Jan 25;163(3):363-76. PMID: 6834430&lt;br /&gt;Contextual constraints on synonymous codon choice.&lt;br /&gt;Lipman DJ, Wilbur WJ.&lt;br /&gt;We have studied the statistical constraints on synonymous codon choice to evaluate various proposals regarding the origin of the bias in synonymous codon usage observed by Fiers et al. (1975), Air et al. (1976), Grantham et al. (1980) and others. We have determined the statistical dependence of the degenerate third base on either of its nearest neighbors in mitochondrial, prokaryotic, and eukaryotic coding sequences. We noted an increasing dependence of the third base on its nearest neighbors in moving from mitochondria to prokaryotes to eukaryotes. A statistical model assuming random equiprobable selection of synonymous codons was found grossly adequate for the mitochondria, but totally inadequate for prokaryotes and eukaryotes. A model assuming selection of synonymous codons reflecting a genomic strategy, i.e. the genome hypothesis of Grantham et al. (1980), gave a good approximation of the mitochondrial sequences. A statistical model which exactly maintains codon frequency, but allows the position of corresponding synonymous codons to vary was only grossly adequate for prokaryotes and totally inadequate for eukaryotes. The results of these simulations are consistent with the measures on experimental sequences and suggest that a "frequency constraint" model such as that of Grantham et al. (1980) may be an adequate explanation of the codon usage in mitochondria. However, in addition to this frequency constraint, there may be constraints on synonymous codon choice in prokaryotes due to codon context. 
Furthermore, any proposal to explain codon usage in eukaryotes must involve a constraint on the context of a codon in the sequence.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Evol. 1984-85;21(2):161-7. PMID: 6442990&lt;br /&gt;Interaction of silent and replacement changes in eukaryotic coding sequences.&lt;br /&gt;Lipman DJ, Wilbur WJ.&lt;br /&gt;We examined the codon usages in well-conserved and less-well-conserved regions of vertebrate protein genes and found them to be similar. Despite this similarity, there is a statistically significant decrease in codon bias in the less-well-conserved regions. Our analysis suggests that although those codon changes initially fixed under amino acid replacements tend to follow the overall codon usage pattern, they also reduce the bias in codon usage. This decrease in codon bias leads one to predict that the rate of change of synonymous codons should be greater in those regions that are less well conserved at the amino acid level than in the better-conserved regions. Our analysis supports this prediction. Furthermore, we demonstrate a significantly elevated rate of change of synonymous codons among the adjacent codons 5' to amino acid replacement positions. This provides further support for the idea that there are contextual constraints on the choice of synonymous codons in eukaryotes.&lt;br /&gt;&lt;br /&gt;Bioinformatics. 
1999 Jul-Aug;15(7-8):612-21. PMID: 10487869&lt;br /&gt;GeneBuilder: interactive in silico prediction of gene structure.&lt;br /&gt;Milanesi L, D'Angelo D, Rogozin IB.&lt;br /&gt;Istituto di Tecnologie Biomediche Avanzate CNR, via Fratelli Cervi 93, 20090 Segrate, Milano, Italy. milanesi@itba.mi.cnr.it&lt;br /&gt;MOTIVATION: Prediction of gene structure in newly sequenced DNA becomes very important in large genome sequencing projects. This problem is complicated due to the exon-intron structure of eukaryotic genes and because gene expression is regulated by many different short nucleotide domains. In order to be able to analyse the full gene structure in different organisms, it is necessary to combine information about potential functional signals (promoter region, splice sites, start and stop codons, 3' untranslated region) together with the statistical properties of coding sequences (coding potential), information about homologous proteins, ESTs and repeated elements. RESULTS: We have developed the GeneBuilder system which is based on prediction of functional signals and coding regions by different approaches in combination with similarity searches in proteins and EST databases. The potential gene structure models are obtained by using a dynamic programming method. The program permits the use of several parameters for gene structure prediction and refinement. During gene model construction, selecting different exon homology levels with a protein sequence selected from a list of homologous proteins can improve the accuracy of the gene structure prediction. In the case of low homology, GeneBuilder is still able to predict the gene structure. 
The GeneBuilder system has been tested by using the standard set (Burset and Guigo, Genomics, 34, 353-367, 1996) and its performance is 0.89 sensitivity and 0.91 specificity at the nucleotide level. The total correlation coefficient is 0.88. AVAILABILITY: The GeneBuilder system is implemented as a part of the WebGene at the URL: http://www.itba.mi.cnr.it/webgene and TRADAT (TRAnscription Database and Analysis Tools) launcher URL: http://www.itba.mi.cnr.it/tradat.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Proc Natl Acad Sci U S A. 1981 Sep;78(9):5739-43. PMID: 6795634&lt;br /&gt;Extraordinarily high evolutionary rate of pseudogenes: evidence for the presence of selective pressure against changes between synonymous codons.&lt;br /&gt;Miyata T, Hayashida H.&lt;br /&gt;Comparisons of nucleotide sequences of several pseudogenes described to date, including alpha- and beta-globin and immunoglobulin kappa-type variable domain pseudogenes, with those of functional counterparts revealed that pseudogenes accumulate mutations at an extremely high rate uniformly over their entirety. It is remarkable that the evolutionary rate exceeds the rate of changes between synonymous codons, the highest known rate, in functional genes. Because no pseudogenes appear to function, this result strongly supports the neutral theory. In addition this result apparently indicates the presence of selective pressure against changes between synonymous codons in functional genes. 
Close examinations of codon utilization patterns in pseudogenes and functional genes revealed a significant correlation between the rate of changes at synonymous codon sites and the strength of bias in code word usage. This implies that even synonymous codon changes are not completely free from selective pressure but are constrained in part, although presumably weakly, depending on the degree of bias in code word usage. We also reexamined alignment between mouse beta h3 (pseudogene) and beta maj sequences and found a unique structure of the beta h3 that is homologous in sequence to the beta maj gene overall but contains a long deletion (about 150 base pairs) in the middle of the gene.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Nucleic Acids Res. 1979 Dec 20;7(8):2431-8. PMID: 523321&lt;br /&gt;The preferential codon usages in variable and constant regions of immunoglobulin genes are quite distinct from each other.&lt;br /&gt;Miyata T, Hayashida H, Yasunaga T, Hasegawa M.&lt;br /&gt;The patterns of codon utilization in the variable and constant regions of immunoglobulin genes are compared. It is shown that, in these regions, codon utilizations are quite distinct from one another: For most degenerate codons, there is a selective bias that prefers C and/or G ending codons to U and/or A ending codons in the constant region compared with the bias in the variable region. This would strongly suggest that, in immunoglobulin genes, the bias in code word usage is determined by factors other than those concerned with the translational mechanism, such as tRNA availability and codon-anticodon interaction. 
A possibility is also suggested that this difference of code word usage between them is due to the existence of secondary structure in the constant region but not in the variable region.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
J Mol Evol. 2000 Feb;50(2):184-93. PMID: 10684352&lt;br /&gt;Codon usage in plastid genes is correlated with context, position within the gene, and amino acid content.&lt;br /&gt;Morton BR, So BG.&lt;br /&gt;Department of Biological Sciences, Barnard College, Columbia University, 3009 Broadway, New York, NY 10027, USA. bmorton@barnard.columbia.edu&lt;br /&gt;Highly expressed plastid genes display codon adaptation, which is defined as a bias toward a set of codons which are complementary to abundant tRNAs. This type of adaptation is similar to what is observed in highly expressed Escherichia coli genes and is probably the result of selection to increase translation efficiency. In the current work, the codon adaptation of plastid genes is studied with regard to three specific features that have been observed in E. coli and which may influence translation efficiency. These features are (1) a relatively low codon adaptation at the 5' end of highly expressed genes, (2) an influence of neighboring codons on codon usage at a particular site (codon context), and (3) a correlation between the level of codon adaptation of a gene and its amino acid content. All three features are found in plastid genes. 
First, highly expressed plastid genes have a noticeable decrease in codon adaptation over the first 10-20 codons. Second, for the twofold degenerate NNY codon groups, highly expressed genes have an overall bias toward the NNC codon, but this is not observed when the 3' neighboring base is a G. At these sites highly expressed genes are biased toward NNT instead of NNC. Third, plastid genes that have higher codon adaptations also tend to have an increased usage of amino acids with a high G + C content at the first two codon positions and GNN codons in particular. The correlation between codon adaptation and amino acid content exists separately for both cytosolic and membrane proteins and is not related to any obvious functional property. It is suggested that at certain sites selection discriminates between nonsynonymous codons based on translational, not functional, differences, with the result that the amino acid sequence of highly expressed proteins is partially influenced by selection for increased translation efficiency.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Nucleic Acids Res. 1989 Jan 25;17(2):477-98. PMID: 2644621&lt;br /&gt;Codon usage in plant genes.&lt;br /&gt;Murray EE, Lotzer J, Eberle M.&lt;br /&gt;Agrigenetics Advanced Sciences Company, Madison, WI 53713.&lt;br /&gt;We have examined codon bias in 207 plant gene sequences collected from GenBank and the literature. 
When this sample was further divided into 53 monocot and 154 dicot genes, the pattern of relative use of synonymous codons was shown to differ between these taxonomic groups, primarily in the use of G + C in the degenerate third base. Maize and soybean codon bias were examined separately and followed the monocot and dicot codon usage patterns, respectively. Codon preference in ribulose 1,5 bisphosphate and chlorophyll a/b binding protein, two of the most abundant proteins in leaves, was investigated. These highly expressed genes are more restricted in their codon usage than plant genes in general.&lt;br /&gt;&lt;br /&gt;Mol Biol Evol. 2001 Sep;18(9):1703-7. PMID: 11504850&lt;br /&gt;Translational selection on codon usage in Xenopus laevis.&lt;br /&gt;Musto H, Cruveiller S, D'Onofrio G, Romero H, Bernardi G.&lt;br /&gt;Laboratorio di Evoluzione Molecolare, Stazione Zoologica Anton Dohrn, Naples, Italy.&lt;br /&gt;A correspondence analysis of codon usage in Xenopus laevis revealed that the first axis is strongly correlated with the base composition at third codon positions. The second axis discriminates between putatively highly expressed genes and the other coding sequences, with expression levels being confirmed by the analysis of expressed sequence tag frequencies. The comparison of codon usage of the sequences displaying the extreme values on the second axis indicates that several codons are statistically more frequent among the highly expressed (mainly housekeeping) genes. Translational selection appears, therefore, to influence synonymous codon usage in Xenopus.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Immunol. 
1995 Nov 1;155(9):4322-9. PMID: 7594591&lt;br /&gt;Influence of coding-end sequence on coding-end processing in V(D)J recombination.&lt;br /&gt;Nadel B, Feeney AJ.&lt;br /&gt;Department of Immunology, Scripps Research Institute, La Jolla, CA 92037, USA.&lt;br /&gt;The large diversity of the Ig and TCR repertoires is accounted for by combinatorial assembly of the germ-line-encoded V, D, and J gene segments, as well as extensive modification at the junctions during the recombination process. Those modifications, termed coding-end processing, consist of removal and addition of an apparently random number of nucleotides. To obtain further insights into the mechanism of the coding-end processing, we constructed a large data base of several Ig and TCR coding ends obtained in vivo, using conditions that avoid potential bias by cellular selection events. We show that the processing patterns are not random, but rather specific for each coding end, suggesting that specific motifs in the coding-end sequence influence the processing. We found a good correlation between the presence of internal stretches of at least three A.T nucleotides, absence of stretches of G.C nucleotides, and high average nucleotide deletion. Based on a detailed analysis of the processing patterns, we propose that nicks of the hairpin intermediate take place preferentially in potential open structures formed by weaker pairings of A.T stretches. Together, these findings indicate that the sequence of the coding end plays an important role in nonrandom aspects of the recombination mechanism. 
This suggests that coding-end sequences might have been selected throughout evolution to participate in an early control of the development of the primary repertoire.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Nucleic Acids Res. 2003 Sep 1;31(17):5195-201. PMID: 12930971&lt;br /&gt;Comparative analysis of the base biases at the gene terminal portions in seven eukaryote genomes.&lt;br /&gt;Niimura Y, Terabe M, Gojobori T, Miura K.&lt;br /&gt;Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, 1111, Yata, Mishima, Shizuoka 411-8540, Japan. nxy10@psu.edu&lt;br /&gt;Adenine nucleotides have been found to appear preferentially in the regions after the initiation codons or before the termination codons of bacterial genes. Our previous experiments showed that AAA and AAT, the two most frequent second codons in Escherichia coli, significantly enhance translation efficiency. To determine whether such a characteristic feature of base frequencies exists in eukaryote genes, we performed a comparative analysis of the base biases at the gene terminal portions using the proteomes of seven eukaryotes. Here we show that the base appearance at the codon third positions of gene terminal regions is highly biased in eukaryote genomes, although the codon third positions are almost free from amino acid preference. 
The bias changes depending on its position in a gene, and is characteristic of each species. We also found that bias is most outstanding at the second codon, the codon after the initiation codon. NCN is preferred in every genome; in particular, GCG is strongly favored in human and plant genes. The presence of the bias implies that the base sequences at the second codon affect translation efficiency in eukaryotes as well as bacteria.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Nucleic Acids Res. 2000 Oct 1;28(19):3801-10. PMID: 11000273&lt;br /&gt;Amino acid and nucleotide recurrence in aligned sequences: synonymous substitution patterns in association with global and local base compositions.&lt;br /&gt;Nishizawa M, Nishizawa K.&lt;br /&gt;Department of Biochemistry, Teikyo University School of Medicine, Kaga, Itabashi, Tokyo 173, Japan.&lt;br /&gt;The tendency for repetitiveness of nucleotides in DNA sequences has been reported for a variety of organisms. We show that the tendency for repetitive use of amino acids is widespread and is observed even for segments conserved between human and Drosophila melanogaster at the level of &gt;50% amino acid identity. This indicates that repetitiveness influences not only the weakly constrained segments but also those sequence segments conserved among phyla. Not only glutamine (Q) but also many of the 20 amino acids show a comparable level of repetitiveness. Repetitiveness in bases at codon position 3 is stronger for human than for D. melanogaster, whereas local repetitiveness in intron sequences is similar between the two organisms. While genes for immune system-specific proteins, but not ancient human genes (i.e. 
human homologs of Escherichia coli genes), have repetitiveness at codon bases 1 and 2, repetitiveness at codon base 3 for these groups is similar, suggesting that the human genome has at least two mechanisms generating local repetitiveness. Neither amino acid nor nucleotide repetitiveness is observed beyond the exon boundary, denying the possibility that such repetitiveness could mainly stem from natural selection on mRNA or protein sequences. Analyses of mammalian sequence alignments show that while the 'between gene' GC content heterogeneity, which is linked to 'isochores', is a principal factor associated with the bias in substitution patterns in human, 'within gene' heterogeneity in nucleotide composition is also associated with such bias on a more local scale. The relationship amongst the various types of repetitiveness is discussed.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Evol. 2003 Sep;57(3):325-34. PMID: 14629042&lt;br /&gt;Strand compositional asymmetries of nuclear DNA in eukaryotes.&lt;br /&gt;Niu DK, Lin K, Zhang DY.&lt;br /&gt;MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing 100875, China.&lt;br /&gt;Both DNA replication and transcription are structurally asymmetric processes. An asymmetric nucleotide substitution pattern has been observed between the leading and the lagging strand, and between the coding and the noncoding strand, in eubacterial, viral, and organelle genomes. Similar studies in eukaryotes have been rare, because the origins of replication in nuclear genomes are mostly unknown and the replicons are much shorter than those of prokaryotes. 
To circumvent these predicaments, all possible pairs of neighboring genes that are located on different strands of nuclear DNA were selected from the complete genomes of Saccharomyces cerevisiae, Schizosaccharomyces pombe, Plasmodium falciparum, Encephalitozoon cuniculi, Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Anopheles gambiae, Mus musculus, and Homo sapiens. For such a pair of genes, one is likely coded from the leading strand and the other from the lagging strand. By examining the introns and the fourfold degenerate sites of codons in the genes of each pair, we found that the relative frequencies of T vs. A and of G vs. C are significantly skewed in most eukaryotes studied. In a gene pair, the potential effects of replication- and transcription-associated mutation bias on strand asymmetry are in the same direction for one gene where leading strand synthesis shares the same template with transcription, while they tend to be canceled out in the other gene. Our study demonstrates that DNA replication-associated and transcription-associated mutation bias and/or selective codon usage bias may affect the strand nucleotide composition asymmetrically in eukaryotic genomes.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Biochim Biophys Acta. 1982 Aug 30;698(2):111-5. PMID: 7126583&lt;br /&gt;RNA folding is unaffected by the nonrandom degenerate codon choice.&lt;br /&gt;Nussinov R.&lt;br /&gt;The frequent suggestion that the nonrandom codon usage is explained by its forming more stable mRNAs is tested in 22 genes. Only the histones, globins, and the rat preproinsulin gene show a correlation between the preferred degenerate codons and the stability of the secondary structure of their mRNAs. 
However, the examined members from the histone and globin gene families, both among the oldest, in evolutionary sense, eukaryotic genes, have a high GC content (approx. 56% compared to an average of 42% in all eukaryotes) which is reflected in their degenerate codon choice and thus in their more stable folding.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Adv Exp Med Biol. 1985;190:627-36.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=4083166&amp;amp;tool=ExternalSearch"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;The notion of primordial building blocks in construction of genes and transcriptional and processing errors due to random occurrence of oligonucleotide signal sequences.&lt;br /&gt;Ohno S.&lt;br /&gt;Contrary to the currently popular belief, genes (flanking and internal noncoding sequences included) that specify beta-sheet and alpha-helical proteins are not unique sequences, rather they are degenerate repeats of short primordial building block sequences that are 45 to 48 bases long in the case of genes belonging to the beta-2-microglobulin superfamily. Accordingly, a large number of base decamers, nonamers, octamers, heptamers and hexamers recur within every gene. One consequence of the above is the random and inadvertent occurrence within genes of various oligonucleotide signal sequences for initiation and termination of transcription as well as for processing of transcripts by removal of intervening sequences. Inadvertent transcription of nonsense sequences and missplicing of transcripts may increase with age and contribute to the aging process. There is little doubt that the life span, being one of the species' characteristics, is genetically programmed. The question remains, however, as to whether or not such a program is embodied in each and every somatic cell type. 
If the cessation of cell proliferation is regarded synonymous with senescence, one is placed in the awkward position of having to state that most neurons of the central nervous system enter the state of senescence at the neonatal stage. An alternative to the above is the assumption of central control; e.g., the programmed secretion of an aging peptide hormone by the pituitary. To be sure somatic cells accumulate randomly sustained mutations as do germ cells and whatever other genetic mishaps (e.g., deletions, duplications) that may affect somatic cells also occur in germ cells. Yet, the monophyletic germ line on this earth has persisted for three billion years and has the potential of being immortal. Furthermore there can be no direct cause-and-effect relationship between the process of differentiation and the loss of immortality, for spermatozoa are one of the most, if not the most, differentiated cell types that can be found in the body. Nevertheless, if one's scope is confined to the types of genetic mishaps that may afflict somatic cells in their given life span, the one particular type that has hitherto escaped notice should be considered.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Biol. 1998 Aug 7;281(1):31-48.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=9680473"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;Specific correlations between relative synonymous codon usage and protein secondary structure.&lt;br /&gt;Oresic M, Shalloway D. Section of Biochemistry Molecular and Cell Biology, Cornell University, Ithaca, NY 14853, USA.&lt;br /&gt;We found significant species-specific correlations between the use of two synonymous codons and protein secondary structure units by comparing the three-dimensional structures of human and Escherichia coli proteins with their mRNA sequences. 
The correlations are not explained by codon-context, expression level, GC/AU content, or positional effects. The E. coli correlation is between Asn AAC and the C-terminal regions of beta-sheet segments; it may result from selection for translational accuracy, suggesting the hypothesis that downstream Asn residues are important for beta-sheet formation. The correlation in human proteins is between Asp GAU and the N termini of alpha-helices; it may be important for eukaryote-specific sequential, cotranslational folding. The kingdom-specific correlations may reflect kingdom-specific differences in translational mechanisms. The correlations may help identify residues that are important for secondary structure formation, be useful in secondary structure prediction algorithms, and have implications for recombinant gene expression.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Evol. 2003 Apr;56(4):473-84.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=12664167"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;Tracing specific synonymous codon-secondary structure correlations through evolution.&lt;br /&gt;Oresic M, Dehn M, Korenblum D, Shalloway D. Department of Molecular Biology and Genetics, Cornell University, 265 Biotechnology Building, Ithaca, NY 14853, USA.&lt;br /&gt;We previously showed that GAU codons are preferred (relative to synonymous GAC codons) for encoding aspartates specifically at the N-termini of alpha-helices in human, but not in E. coli, proteins. To test if this difference reflected a general difference between eucaryotes and procaryotes, we now extended the analysis to include the proteins and coding sequences of mammals, vertebrates, S. cerevisiae, and plants. 
We found that the GAU-alpha-helix correlation is also strong in non-human mammalian and vertebrate proteins but is much weaker or insignificant in S. cerevisiae and plants. The vertebrate correlations are of sufficient strength to enhance alpha-helix N-terminus prediction. Additional results, including the observation that the correlation is significantly enhanced when proteins that are known to be correctly expressed in recombinant prokaryotic systems are excluded, suggest that the correlation is induced at the level of protein translation and folding and not at the nucleic acid level. To the best of our knowledge, it is not explicable by the canonical picture of protein expression and folding, suggesting the existence of a novel evolutionary selection mechanism. One possible explanation is that some alpha-helix N-terminal GAU codons may facilitate correct co-translational folding in vertebrates.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Nucleic Acids Res. 2002 Jan 1;30(1):289-93.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=11752317"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;SUPFAM--a database of potential protein superfamily relationships derived by comparing sequence-based and structure-based families: implications for structural genomics and function annotation in genomes.&lt;br /&gt;Pandit SB, Gosar D, Abhiman S, Sujatha S, Dixit SS, Mhatre NS, Sowdhamini R, Srinivasan N. 
Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560 012, India.&lt;br /&gt;Members of a superfamily of proteins could result from divergent evolution of homologues with insignificant similarity in the amino acid sequences. A superfamily relationship is detected commonly after the three-dimensional structures of the proteins are determined using X-ray analysis or NMR. The SUPFAM database described here relates two homologous protein families in a multiple sequence alignment database of either known or unknown structure. The present release (1.1), which is the first version of the SUPFAM database, has been derived by analysing Pfam, which is one of the commonly used databases of multiple sequence alignments of homologous proteins. The first step in establishing SUPFAM is to relate Pfam families with the families in PALI, which is an alignment database of homologous proteins of known structure that is derived largely from SCOP. The second step involves relating Pfam families which could not be associated reliably with a protein superfamily of known structure. The profile matching procedure, IMPALA, has been used in these steps. The first step resulted in identification of 1280 Pfam families (out of 2697, i.e. 47%) which are related, either by close homologous connection to a SCOP family or by distant relationship to a SCOP family, potentially forming new superfamily connections. Using the profiles of 1417 Pfam families with apparently no structural information, an all-against-all comparison involving a sequence-profile match using IMPALA resulted in clustering of 67 homologous protein families of Pfam into 28 potential new superfamilies. Expansion of groups of related proteins of yet unknown structural information, as proposed in SUPFAM, should help in identifying 'priority proteins' for structure determination in structural genomics initiatives to expand the coverage of structural information in the protein sequence space. 
For example, we could assign 858 distinct Pfam domains in 2203 of the gene products in the genome of Mycobacterium tuberculosis. Fifty-one of these Pfam families of unknown structure could be clustered into 17 potentially new superfamilies forming good targets for structural genomics. The SUPFAM database can be accessed at &lt;a href="http://pauling.mbu.iisc.ernet.in/~supfam"&gt;http://pauling.mbu.iisc.ernet.in/~supfam&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 2000 Dec 30;261(1):85-91.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=11164040&amp;amp;tool=ExternalSearch"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;Analysis of oligonucleotide AUG start codon context in eukariotic mRNAs.&lt;br /&gt;Pesole G, Gissi C, Grillo G, Licciulli F, Liuni S, Saccone C. Dipartimento di Fisiologia e Biochimica Generali, Universita di Milano, Via Celoria 26, 20133, Milan, Italy. graziano.pesole@unimi.it&lt;br /&gt;The AUG start codon context features have been investigated by analyzing eukaryotic mRNAs belonging to various taxonomic groups. The functional relevance of each specific position surrounding the AUG start codon has been established as a function of the measured shift between base composition observed at that particular position, and base composition averaged over all the 5' untranslated regions. A more detailed analysis carried out on human genes belonging to different isochores showed significant isochore-specific features that cannot be explained only by a mutational bias effect. 
The most represented heptamers spanning from position -3 to +4 with respect to the initiator AUG have been determined for mRNAs belonging to different taxonomic groups and a web page utility has been set up (http://bigarea.area.ba.cnr.it:8000/BioWWW/ATG.html) to determine the relative abundance of a user submitted oligonucleotide context in a given species or taxon.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Physiologie. 1986 Jul-Sep;23(3):209-12.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=3095864"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;The distribution of codons by classes of triplets in the sequence of genes.&lt;br /&gt;Portelli C, Ursu C, Portelli AP.&lt;br /&gt;According to a criterion of symmetry-asymmetry, the triplets of the genetic code can be divided into four classes. In the genes of viruses and human mitochondria, the frequency by which a codon is followed by a codon of the same class is higher than that theoretically estimated. This is the consequence of the fact that in an initial stage of evolution many codons were duplicated.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Biochim Biophys Acta. 1995 Apr 26;1261(3):394-400.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=7742368"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;Correlation between codon usage, regional genomic nucleotide composition, and amino acid composition in the cytochrome P-450 gene superfamily.&lt;br /&gt;Porter TD. 
Division of Pharmacology and Experimental Therapeutics, College of Pharmacy, University of Kentucky, Lexington 40536-0082, USA.&lt;br /&gt;The codon usage bias of 110 mammalian cytochrome P-450 genes has been determined and analyzed in relation to a variety of genetic, biochemical, and physiological parameters. In those P-450 genes exhibiting biased usage the preferred codons generally do not differ among the four species examined (rat, rabbit, man, and mouse) or from the predominantly used codons identified for all sequenced genes in a recent data base analysis (Wada et al. (1992) Nucleic Acids Res. 20 (Suppl.), 2111-2118). Codon usage bias does not correlate with evolutionary relationships, evolutionary age, or with the extent of evolutionary conservation of orthologous proteins; there is no obvious correlation with the level of expression of a given P-450, with its inducibility, nor with its physiologic role; and neither the preferred codons nor the degree of bias differ for P-450s expressed in different tissues. Codon usage bias does correlate with the C+G content at the codon third position, and thus preferred codons usually end in C or G; for those P-450s for which gene sequences are available this bias also correlates with the C + G content of the intronic and flanking regions of these genes. Moreover, a lesser increase in the C + G content at the codon first and second positions is also evident in genes located in regions of high C + G content; this leads to predictable differences in the amino acid compositions of P-450 enzymes that correlate with genomic nucleotide composition and the degree of bias in codon usage.&lt;br /&gt;&lt;br /&gt;Biol Chem. 
1997 Dec;378(12):1521-30.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=9461351"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;Comment in: &lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=pubmed&amp;amp;dopt=Abstract&amp;list_uids=9461337"&gt;Biol Chem. 1997 Dec;378(12):1393-5.&lt;/a&gt;&lt;br /&gt;Long non-stop reading frames on the antisense strand of heat shock protein 70 genes and prion protein (PrP) genes are conserved between species.&lt;br /&gt;Rother KI, Clay OK, Bourquin JP, Silke J, Schaffner W. Institut fur Molekularbiologie II, Universitat Zurich, Switzerland.&lt;br /&gt;Several mammalian genes, including heat shock protein (Hsp70) and prion protein (PrP) genes, have been reported to have long open reading frames (ORFs) or non-stop reading frames (NRFs) in the antisense direction. A simple explanation would be that these long antisense reading frames, which are usually in the same triplet frame as the coding strand, are the fortuitous byproduct of a high overall [G+C] content with concomitant preference for G/C over A/T in the third codon position, a preference for RNY type codons (purine/any nucleotide/pyrimidine), and/or a bias against serine and leucine, the only amino acids with codons that can be read as stop codons in the antisense direction. The PrP genes and most heat shock genes with long antisense NRFs (aNRFs) are indeed relatively [G+C] rich but do not show a bias against serine and leucine. In several vertebrates investigated, at least one of the Hsp70 genes has a long antisense reading frame, and we found that some, though not all, putative stop codons in long Hsp70 antisense reading frames were due to sequencing errors. 
The PrP gene contains an extended antisense open reading frame in all 45 eutherian mammals tested, but not in a marsupial and in a bird. In the PrP gene, the long, protein-coding exon also harbors the antisense nonstop reading frame. In both Hsp70 and PrP genes, the putative antisense protein sequence is well conserved. Even though there is no clear evidence in Hsp70 or PrP genes for the existence of the respective antisense proteins, we speculate that such antisense proteins serve to regulate the genuine Hsp and PrP proteins under special circumstances. Alternatively, regulation might occur at the RNA level, and the antisense RNA would merely lack stop codons to prevent its rapid degradation by an mRNA quality control mechanism that is triggered by premature stop codons. We note that both Hsp and PrP are involved in physiological or pathological protein aggregation phenomena, that scrapie prions have been reported to modify the expression or localization of heat shock proteins, and that in yeast, propagation of a prion-like state (PSI+) depends on a heat shock (Hsp104) protein.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 2001 Oct 3;276(1-2):101-5.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=11591476"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;Correlation between sequence conservation of the 5' untranslated region and codon usage bias in Mus musculus genes.&lt;br /&gt;Sakai H, Washio T, Saito R, Shinagawa A, Itoh M, Shibata K, Carninci P, Konno H, Kawai J, Hayashizaki Y, Tomita M. Laboratory for Bioinformatics, Keio University, 5322 Endo, Fujisawa, 252-8520, Kanagawa, Japan.&lt;br /&gt;The codon adaptation index (CAI) values of all protein-coding sequences of the full-length cDNA libraries of Mus musculus were computed based on the RIKEN mouse full-length cDNA library. 
We have also computed the extent of consensus in flanking sequences of the initiator ATG codon based on the 'relative entropy' values of respective nucleotide positions (from -20 to +12 bp relative to the initiator ATG codon) for each group of genes classified by CAI values. With regard to the two nucleotide positions (-3 and +4) known to be highly conserved in Kozak's consensus sequence, a clear correlation between CAI values and relative entropy values was observed at position -3 but this was not significant at position +4, although a significant correlation was found at position -1 of the consensus sequence. Further, although no correlation was observed at any additional positions, relative entropy values were very high at positions -4, -6, and -8 in genes with high CAI values. These findings suggest that the extent of conservation in the flanking sequence of the initiator ATG codon including Kozak's consensus sequence was an important factor in modulation of the translation efficiency as well as synonymous codon usage bias particularly in highly expressed genes.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 2002 Oct 30;300(1-2):89-95.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=12468090&amp;amp;tool=ExternalSearch"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;On biased distribution of introns in various eukaryotes.&lt;br /&gt;Sakurai A, Fujimori S, Kochiwa H, Kitamura-Abe S, Washio T, Saito R, Carninci P, Hayashizaki Y, Tomita M. Institute for Advanced Biosciences, Keio University, 5322 Endo, Fujisawa-city, Kanagawa, Japan.&lt;br /&gt;We conducted comprehensive analyses on intron positions in the Mus musculus genome by comparing genomic sequences in the GenBank database and cDNA sequences in the mouse cDNA library recently developed by Riken Genomic Sciences Center. 
Our results confirm that introns have a tendency to be located toward the 5' end of the gene. The same type of analysis was conducted in the coding region of seven eukaryotes (Saccharomyces cerevisiae, Plasmodium falciparum, Caenorhabditis elegans, Drosophila melanogaster, M. musculus, Homo sapiens, Arabidopsis thaliana). Introns in genes with a single intron have a locational bias toward the 5' end in all species except A. thaliana. We also measured the distance from the start codon to the position of the intron, and found that single introns prefer the location immediately after the start codon in S. cerevisiae and P. falciparum. We discuss three possible explanations for these findings: (1) they are the consequence of intron loss by reverse-transcriptase; (2) they are necessary to accommodate the function; and (3) they are concerned with the mechanism of pre-mRNA splicing.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 2003 Jun 5;311:35-42.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=12853136"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;Quantifying the species-specificity in genomic signatures, synonymous codon choice, amino acid usage and G+C content.&lt;br /&gt;Sandberg R, Branden CI, Ernberg I, Coster J. Microbiology and Tumor Biology Center, Karolinska Institute, S-171 77 Stockholm, Sweden. Rickard.Sandburg@mtc.ki.se&lt;br /&gt;Each prokaryote has a unique genomic signature as evidenced by a set of species-specific frequencies of short oligonucleotides. With respect to genomic signatures a bacterial genome is homogenous and the variation within a genome is smaller than the variations between genomes of different species. This study quantifies the species-specificity of genomic signatures in the complete genomes of 57 prokaryotes. 
The species-specificity in the genomic signature was related to the quantification of other sequence biases, such as G+C content, synonymous codon choice and amino acid usage. The results confirm that the genomic signature is genome-wide with high species-specificity in both coding and non-coding regions. In coding regions the species-specific bias in synonymous codon choice was comparable to the genomic signature, while the bias in amino acid usage only captured about 50% of the species-specific bias in the genomic signature. A correlation between the species-specificity in synonymous codon choice and amino acid usage was identified, in which proteins with species-specific amino acid usage were also coded with species-specific synonymous codon choice. However, we demonstrated that the G+C content captures only approximately 40% of the species-specificity in the genomic signature, and is insufficient to explain the species specificity in the non-coding regions. Thus, the species-specific bias in non-coding regions remains largely unknown. Further, we compared the genomic signature in relation to phylogenetic distance. This was performed in order to illustrate the feasibility of a hierarchical classification scheme in future applications of the described classification methodology in screening for horizontal gene transfer and biodiversity studies.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Proc Natl Acad Sci U S A. 1986 Apr;83(7):2133-7.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=3457379"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;Origin of eukaryotic introns: a hypothesis, based on codon distribution statistics in genes, and its implications.&lt;br /&gt;Senapathy P.&lt;br /&gt;A hypothesis for the origin of introns in eukaryotic genes is developed. 
By computer simulation it was found that the reading-frame lengths in a random nucleotide sequence are distributed in a negative exponential manner and that there exists an upper limit of about 200 codons in the length of the reading frames (RFs). These characteristics suggest that, if primordial DNA contained a random nucleotide sequence, the most primitive cells would have been under selective pressure to eliminate interfering stop codons in order to increase the length of RFs. Further, they indicate that the only possible way that a coding sequence that is considerably longer than 600 nucleotides could be derived from the short coding sequences occurring in a random sequence would be to splice the short coding sequences and to eliminate the stretches of sequences containing clusters of inframe stop codons. Thus, introns are suggested to be those stretches of sequences containing interfering stop codons that were originally earmarked in the first primitive cells to be eliminated in order to enable the coding for long polypeptides. Because the statistical characteristics of codon distributions in today's eukaryotic DNA sequences resemble closely those of a random sequence and because the upper limit in the length of RFs (200 codons) in a random sequence corresponds precisely to the observed maximum length of exons in today's eukaryotic genes (600 nucleotides), it is suggested that introns originated in the most primitive unicellular eukaryotes when they evolved from primordial sequences. The data from the prokaryotic gene sequences indicate that prokaryotic genes may have been derived originally from primitive unicellular eukaryotic genes by losing introns from them.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Immunol. 
1999 Jul 1;163(1):259-68.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=10384124&amp;amp;tool=ExternalSearch"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;Predicting regional mutability in antibody V genes based solely on di- and trinucleotide sequence composition.&lt;br /&gt;Shapiro GS, Aviszus K, Ikle D, Wysocki LJ. Department of Pediatrics, Division of Basic Sciences, National Jewish Medical and Research Center, Denver, CO 80206, USA.&lt;br /&gt;Somatic mutations are not distributed randomly throughout Ab V region genes. A sequence-specific target bias is revealed by a defined hierarchy of mutability among di- and trinucleotide sequences located within Ig intronic DNA. Here we report that the di- and trinucleotide mutability preference pattern is shared by mouse intronic JH and Jkappa clusters and by human VH genes, suggesting that a common mutation mechanism exists for all Ig V genes of both species. Using di- and trinucleotide target preferences, we performed a comprehensive analysis of human and murine germline V genes to predict regional mutabilities. Heavy chain genes of both species exhibit indistinguishable patterns in which complementarity-determining region 1 (CDR1), CDR2, and framework region 3 (FR3) are predicted to be more mutable than FR1 and FR2. This prediction is borne out by empirical mutation data from nonproductively rearranged human VH genes. Analysis of light chain genes in both species also revealed a common, but unexpected, pattern in which FR2 is predicted to be highly mutable. 
While our analyses of nonfunctional Ig genes accurately predict regional mutation preferences in VH genes, observed relative mutability differences between regions are more extreme than expected. This cannot be readily accounted for by nascent mRNA secondary structure or by a supplemental gene conversion mechanism that might favor nucleotide replacements in CDR. Collectively, our data support the concept of a common mutation mechanism for heavy and light chain genes of mice and humans with regional bias that is qualitatively, but not quantitatively, accounted for by short nucleotide sequence composition.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Curr Opin Genet Dev. 1994 Dec;4(6):851-60.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=7888755"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;Codon usage and genome evolution.&lt;br /&gt;Sharp PM, Matassi G. Department of Genetics, University of Nottingham, Queens Medical Centre, UK.&lt;br /&gt;The rates and patterns of evolution at silent sites in codons reveal much about the basic features of molecular evolution. Recent increases in the amount of sequence data available for various species and more precise knowledge of the chromosomal locations of those sequences, coming in particular from genome projects, reveal that some features of molecular evolution vary around the genome.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Philos Trans R Soc Lond B Biol Sci. 
1995 Sep 29;349(1329):241-7.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=8577834"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;DNA sequence evolution: the sounds of silence.&lt;br /&gt;Sharp PM, Averof M, Lloyd AT, Matassi G, Peden JF. Department of Genetics, University of Nottingham, Queens Medical Centre, U.K.&lt;br /&gt;Silent sites (positions that can undergo synonymous substitutions) in protein-coding genes can illuminate two evolutionary processes. First, despite being silent, they may be subject to natural selection. Among eukaryotes this is exemplified by yeast, where synonymous codon usage patterns are shaped by selection for particular codons that are more efficiently and/or accurately translated by the most abundant tRNAs; codon usage across the genome, and the abundance of different tRNA species, are highly co-adapted. Second, in the absence of selection, silent sites reveal underlying mutational patterns. Codon usage varies enormously among human genes, and yet silent sites do not appear to be influenced by natural selection, suggesting that mutation patterns vary among regions of the genome. At first, the yeast and human genomes were thought to reflect a dichotomy between unicellular and multicellular organisms. However, it now appears that natural selection shapes codon usage in some multicellular species (e.g. Drosophila and Caenorhabditis), and that regional variations in mutation biases occur in yeast. Silent sites (in serine codons) also provide evidence for mutational events changing adjacent nucleotides simultaneously.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Tsitologiia. 
2003;45(7):702-6.&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;from_uid=14989164&amp;amp;tool=ExternalSearch"&gt;Related Articles&lt;/a&gt;&lt;br /&gt;Amino acid code of protein secondary structure.&lt;br /&gt;Shestopalov BV. Institute of Cytology RAS, St. Petersburg. shest@mail.cytspb.rssi.ru&lt;br /&gt;The calculation of protein three-dimensional structure from the amino acid sequence is a fundamental problem to be solved. This paper presents principles of the code theory of protein secondary structure, and their consequence--the amino acid code of protein secondary structure. The doublet code model of protein secondary structure, developed earlier by the author (Shestopalov, 1990), is part of this theory. The theory rests on the following principles:&lt;br /&gt;1)      the name secondary structure is assigned to the conformation, stabilized only by the nearest (intraresidual) and middle-range (at a distance no more than that between residues i and i + 5) interactions;&lt;br /&gt;2)      the secondary structure consists of regular (alpha-helical and beta-structural) and irregular (coil) segments;&lt;br /&gt;3)      the alpha-helices, beta-strands and coil segments are encoded, respectively, by residue pairs (i, i + 4), (i, i + 2), (i, i + 1), according to the numbers of residues per period, 3.6, 2, 1;&lt;br /&gt;4)      all such pairs in the amino acid sequence are codons for elementary structural elements, or structurons;&lt;br /&gt;5)      the codons are divided into 21 types depending on their strength, i.e. 
their encoding capability;&lt;br /&gt;6) overlaps of structurons of one and the same structure generate longer segments of this structure;&lt;br /&gt;7) overlapping of structurons of different structures is forbidden, and therefore selection of codons is required; the codon selection is hierarchical;&lt;br /&gt;8) the code theory of protein secondary structure generates six variants of the amino acid code of protein secondary structure.&lt;br /&gt;There are two possible kinds of model construction based on the theory: the physical one using physical properties of amino acid residues, and the statistical one using results of statistical analysis of a great body of structural data. Some evident consequences of the theory are:&lt;br /&gt;a) the theory can be used for calculating the secondary structure from the amino acid sequence as a partial solution of the problem of calculation of protein three-dimensional structure from the amino acid sequence, and the calculated secondary structure and codon strength distribution can be used for simulating the next step of protein folding;&lt;br /&gt;b) one can propose that the same secondary structures can be folded into different tertiary structures and, vice versa, different secondary structures can be folded into the same tertiary structures, provided codon distributions are considered also;&lt;br /&gt;c) codons can be considered as first elements of protein three-dimensional structure language.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 
1997 Jul 18;194(1):143-55.&lt;br /&gt;The majority of long non-stop reading frames on the antisense strand can be explained by biased codon usage.&lt;br /&gt;Silke J.&lt;br /&gt;Institut für Molekularbiologie II, Universität Zürich, Switzerland.&lt;br /&gt;In recent studies it has been suggested that long reading frames on the antisense strand of open reading frames (ORFs) are more frequent than expected. The vertebrate DNA database was searched for long (greater than 900 bp) antisense non-stop reading frames (aNRFs) that overlap known coding regions. The sequences obtained were predominantly positioned in DNA with a high usage of G or C in the third codon position of the sense ORF. The major class of sequences revealed by the search was that of the heat-shock protein 70 kDa (Hsp70) family. A long Hsp70 aNRF was found in many Hsp70 sequences and occurred in species as diverse as fish, flies, fungi and bacteria. The role of codon usage bias was analysed both in the specific case of the Hsp70 genes and in a general species-wide context. The data obtained showed that even the very long aNRFs present in the Hsp70 family could be explained by codon usage bias on the sense strand. Codon usage bias is determined by GC content at the third codon position of the sense ORF and, in some species, by a high expression level of the gene in question. Such an explanation for the occurrence of long aNRFs cannot exclude that some aNRFs are transcribed and translated.&lt;br /&gt;&lt;br /&gt;Mol Biol Evol. 
2000 Nov;17(11):1581-8.&lt;br /&gt;Nucleotide bias causes a genomewide bias in the amino acid composition of proteins.&lt;br /&gt;Singer GA, Hickey DA.&lt;br /&gt;Department of Biology, University of Ottawa, Ottawa, Ontario, Canada.&lt;br /&gt;We analyzed the nucleotide contents of several completely sequenced genomes, and we show that nucleotide bias can have a dramatic effect on the amino acid composition of the encoded proteins. By surveying the genes in 21 completely sequenced eubacterial and archaeal genomes, along with the entire Saccharomyces cerevisiae genome and two Plasmodium falciparum chromosomes, we show that biased DNA encodes biased proteins on a genomewide scale. The predicted bias affects virtually all genes within the genome, and it could be clearly seen even when we limited the analysis to sets of homologous gene sequences. Parallel patterns of compositional bias were found within the archaea and the eubacteria. We also found a positive correlation between the degree of amino acid bias and the magnitude of protein sequence divergence. We conclude that mutational bias can have a major effect on the molecular evolution of proteins. These results could have important implications for the interpretation of protein-based molecular phylogenies and for the inference of functional protein adaptation from comparative sequence data.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Exp Clin Immunogenet. 
2000;17(1):29-41.&lt;br /&gt;Two human gene families display preferences for different nucleotides and have distinct codon usage patterns.&lt;br /&gt;Skerka C, Abel WO, Zipfel PF.&lt;br /&gt;Bernhard Nocht Institut für Tropenmedizin, Institut für Allgemeine Botanik, Universität Hamburg, Hamburg, Germany.&lt;br /&gt;Analysis of base composition has proven important for functional gene analysis. By comparing base composition and codon usage between two specific human gene families we were able to show a highly conserved nucleotide distribution among the members of one gene family and a significant difference between the two families. The two groups selected for analysis were the human factor H gene family, which represents six secreted human plasma proteins with functions in immune defense, and a class of four human zinc finger proteins, termed early growth response (EGR) proteins, which represent DNA-binding transcription factors. The nucleotide distribution of each gene family is distinct: members of the factor H gene family represent AT-rich genes, displaying an overall AT nucleotide content of 62.8% and a particular preference for A nucleotides (33.9%). In contrast, the EGR genes are GC-rich (55.9%) and C nucleotides are used in 31.2%. This nucleotide difference affects codon usage among synonymous codons and is considered of biological significance, as it affects DNA stability. The codon preference is particularly high at codon position 3, where each family selects for codons which have the preferred nucleotide at this silent third position. 
At position 3, A nucleotides are preferred by factor H genes in 36.3% of the 2,503 codons analyzed, compared to 10% of the 1,876 codons analyzed for the EGR family. In contrast, C nucleotides are used by the EGR family in 48.1%, compared to 16% of the triplets used by the factor H gene family. This comparison of two human gene families shows that nucleotide distribution and codon usage is not uniform within the human organism and the described differences most likely represent selection constraints between the polymorphic factor H and highly conserved EGR genes.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Comput Biol. 1994 Spring;1(1):39-50.&lt;br /&gt;Combined use of sequence similarity and codon bias for coding region identification.&lt;br /&gt;States DJ, Gish W.&lt;br /&gt;Institute for Biomedical Computing, Washington University, St. Louis, MO 63108, USA.&lt;br /&gt;A computer program called BLASTX was previously shown to be effective in identifying and assigning putative function to likely protein coding regions by detecting significant similarity between a conceptually translated nucleotide query sequence and members of a protein sequence database. We present and assess the sensitivity of a new option to this software tool, herein called BLASTC, which employs information obtained from biases in codon utilization, along with the information obtained from sequence similarity. A rationale for combining these diverse information sources was derived, and analyses of the information available from codon utilization in several species were performed, with wide variation seen. 
Codon bias information was found on average to improve the sensitivity of detection of short coding regions of human origin by about a factor of 5. The implications of combining information sources on the interpretation of positive findings are discussed.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 2001 Jan 24;263(1-2):273-84.&lt;br /&gt;Codon bias at the 3'-side of the initiation codon is correlated with translation initiation efficiency in Escherichia coli.&lt;br /&gt;Stenstrom CM, Jin H, Major LL, Tate WP, Isaksson LA.&lt;br /&gt;Department of Microbiology, Stockholm University, S-106 91 Stockholm, Sweden.&lt;br /&gt;The codon that follows the AUG initiation triplet (+2 codon) affects gene expression in Escherichia coli. We have extended this analysis using two model genes lacking any apparent Shine-Dalgarno sequence. Depending on the identity of the +2 codon a difference in gene expression up to 20-fold could be obtained. The effects did not correlate with the levels of intracellular pools of cognate tRNA for the +2 codon, with putative secondary mRNA structures, or with mRNA stability. However, most +2 iso-codons that were decoded by the same species of tRNA gave pairwise similar effects, suggesting that the effect on gene expression was associated with the decoding tRNA. High adenine content of the +2 codon was associated with high gene expression. Of the fourteen +2 codons that mediated the highest efficiency, all except two had an adenine as the first base of the codon. Analysis of the 3540 E. coli genes from the TransTerm database revealed that codons associated with high gene expression in the two expression systems are over-represented at the +2 position in natural genes. 
Codons that are associated with low gene expression are under-represented. The data suggest that evolution has favored codons at the +2 position that give high translation initiation.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 1999 Sep 30;238(1):53-8.&lt;br /&gt;Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position.&lt;br /&gt;Sueoka N.&lt;br /&gt;University of Colorado, Department of Molecular, Cellular, and Developmental Biology, Boulder 80309-0347, USA. sueoka@stripe.colorado.edu&lt;br /&gt;The genome of higher eukaryotes consists of genes having a widely heterogeneous base composition at the third codon position. Ubiquitous variability of the DNA base composition has the following two aspects:&lt;br /&gt;intragenomic heterogeneity of the G+C content and&lt;br /&gt;the amino-acid-specific translation-coupled biases from the Parity Rule 2 (PR2).&lt;br /&gt;PR2 is an intrastrand rule where A = T and G = C are expected if there is no bias in mutation and selection between the two complementary strands of DNA. To examine whether or not the biases from PR2 are responsible for the wide heterogeneity of the DNA G+C content in human, the third codon position of 846 human genes was analyzed. Genes were separated into six groups according to their G+C content of the third codon position, and each group was examined for the translation-coupled PR2 biases in the nucleotide composition of the third codon position for two- and four-codon amino acids. 
The results show that genes in the different G+C content groups have similar PR2 biases, indicating that the intragenomic heterogeneity of the G+C content is not correlated with translation-coupled biases from the PR2. Therefore, the heterogeneity of the G+C content is likely to be determined by some other mechanism (e.g. locally variable directional mutation pressures) than amino-acid-specific selections for the codon preference.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 2000 Dec 30;261(1):53-62.&lt;br /&gt;DNA G+C content of the third codon position and codon usage biases of human genes.&lt;br /&gt;Sueoka N, Kawanishi Y.&lt;br /&gt;University of Colorado, Department of Molecular, Cellular, and Developmental Biology, Boulder, CO 80309-0347, USA. sueoka@stripe.colorado.edu&lt;br /&gt;The human genome, as in other eukaryotes, has a wide heterogeneity in the DNA base composition. The evolutionary basis for this heterogeneity has been unknown. A previous study of the human genome (846 genes analyzed) has shown that, in the major range of the G+C content in the third codon position (0.25-0.75), biases from the Parity Rule 2 (PR2) among the synonymous codons of the four-codon amino acids are similar except in the highest G+C range (Sueoka, N., 1999. Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position. Gene 238, 53-58.). PR2 is an intra-strand rule where A=T and G=C are expected when there are no biases between the two complementary strands of DNA in mutation and selection rates (substitution rates). In this study, 14,026 human genes were analyzed. 
In addition, the third codon positions of two-codon amino acids were analyzed. New results show the following: (a) The G+C contents of the third codon position of human genes are scattered in the G+C range of 0.22-0.96 in the third codon position. (b) The PR2 biases are similar in the range of 0.25-0.75, whereas, in the high G+C range (0.75-0.96; 13% of the genes), the PR2-bias fingerprints are different from those of the major range. (c) Unlike the PR2 biases, the G+C contents of the third codon position for both four-codon and two-codon amino acids are all correlated almost perfectly with the G+C content of the third codon position over the total G+C ranges. These results support the notion that the directional mutation pressure, rather than the directional selection pressure, is mainly responsible for the heterogeneity of the G+C content of the third codon position.&lt;br /&gt;&lt;br /&gt;Gene. 2004 Jun 23;335:19-23.&lt;br /&gt;The 'weighted sum of relative entropy': a new index for synonymous codon usage bias.&lt;br /&gt;Suzuki H, Saito R, Tomita M.&lt;br /&gt;Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan.&lt;br /&gt;Shannon entropy from information theory has been applied to estimate the degree of deviation from equal usage of synonymous codons; however, previous attempts have failed to take into account all three aspects of amino acid usage, i.e. (i) the number of distinct amino acids,&lt;br /&gt;(ii) their relative frequencies, and&lt;br /&gt;(iii) their degree of codon degeneracy.&lt;br /&gt;A new index taking into account all of these aspects is proposed. 
The index, designated as the 'weighted sum of relative entropy' (E(w)), is defined as the sum of the relative entropy of each amino acid weighted by its relative frequency in the sequence. In this paper, we demonstrate that E(w) allows us to avoid some amino acid usage biases and can yield results contradictory to those obtained by previous methods.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Biotechniques. 1991 Jun;10(6):782-4.&lt;br /&gt;A Macintosh computer program for designing DNA sequences that code for specific peptides and proteins.&lt;br /&gt;Tamura T, Holbrook SR, Kim SH.&lt;br /&gt;Melvin Calvin Laboratory, University of California, Berkeley 94720.&lt;br /&gt;A computer program (PINCERS) is described for use in the design of synthetic genes and mixed-probe DNA sequences. A protein sequence is reverse translated with generation of synonymous codons at each position producing a degenerate sequence. In order to locate potential restriction enzyme sites, the degenerate sequence is searched with a library of restriction enzymes for sites that utilize any combination of synonymous codons. These sites are indicated in a map so that they may be incorporated into the synthetic gene sequence. The program allows the user to select the appropriate codon usage table for the organism of interest and then to set a threshold usage frequency below which codons are not generated. PINCERS may also be used to assist in planning the synthesis of mixed-probe DNA sequences for cross-hybridization experiments. 
It can identify regions of specified length within the protein sequence that have the least overall degeneracy, thereby minimizing the number of probes to be synthesized and, therefore, maximizing the concentration of a given probe sequence.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Evol. 1989 Apr;28(4):286-98.&lt;br /&gt;Nucleic acid composition, codon usage, and the rate of synonymous substitution in protein-coding genes.&lt;br /&gt;Ticher A, Graur D.&lt;br /&gt;Department of Zoology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel.&lt;br /&gt;Based on the rates of synonymous substitution in 42 protein-coding gene pairs from rat and human, a correlation is shown to exist between the frequency of the nucleotides in all positions of the codon and the synonymous substitution rate. The correlation coefficients were positive for A and T and negative for C and G. This means that AT-rich genes accumulate more synonymous substitutions than GC-rich genes. Biased patterns of mutation could not account for this phenomenon. Thus, the variation in synonymous substitution rates and the resulting unequal codon usage must be the consequence of selection against A and T in synonymous positions. Most of the variation in rates of synonymous substitution can be explained by the nucleotide composition in synonymous positions. Codon-anticodon interactions, dinucleotide frequencies, and contextual factors influence neither the rates of synonymous substitution nor codon usage. Interestingly, the nucleotide in the second position of codons (always a nonsynonymous position) was found to affect the rate of synonymous substitution. 
This finding links the rate of nonsynonymous substitution with the synonymous rate. Consequently, highly conservative proteins are expected to be encoded by genes that evolve slowly in terms of synonymous substitutions, and are consequently highly biased in their codon usage.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Genetics. 2001 Nov;159(3):1191-9.&lt;br /&gt;Codon usage bias covaries with expression breadth and the rate of synonymous evolution in humans, but this is not evidence for selection.&lt;br /&gt;Urrutia AO, Hurst LD.&lt;br /&gt;Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, United Kingdom.&lt;br /&gt;In numerous species, from bacteria to Drosophila, evidence suggests that selection acts even on synonymous codon usage: codon bias is greater in more abundantly expressed genes, [which, being translated (no pun intended!), means that there is a second order of non-randomness: (1) synonymous codon bias and (2) differential evolution (=selection) rate] the rate of synonymous evolution is lower in genes with greater codon bias, and there is consistency between genes in the same species in which codons are preferred. In contrast, in mammals, while nonequal use of alternative codons is observed, the bias is attributed to the background variance in nucleotide concentrations, reflected in the similar nucleotide composition of flanking noncoding and exonic third sites. However, a systematic examination of the covariants of codon usage controlling for background nucleotide content has yet to be performed. Here we present a new method to measure codon bias that corrects for background nucleotide content and apply this to 2396 human genes. 
Nearly all (99%) exhibit a higher amount of codon bias than expected by chance. The patterns associated with selectively driven codon bias are weakly recovered: Broadly expressed genes have a higher level of bias than do tissue-specific genes, the bias is higher for genes with lower rates of synonymous substitutions, and certain codons are repeatedly preferred. However, while these patterns are suggestive, the first two patterns appear to be methodological artifacts. The last pattern reflects in part biases in usage of nucleotide pairs. We conclude that we find no evidence for selection on codon usage in humans.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Bioinformatics. 2003 Jan;19(1):159-60.&lt;br /&gt;XdomView: protein domain and exon position visualization.&lt;br /&gt;Vivek G, Tan TW, Ranganathan S.&lt;br /&gt;Department of Biochemistry, National University of Singapore, Singapore 119260.&lt;br /&gt;SUMMARY: The relationship between intron distribution in the eukaryotic gene and protein structural elements is essential for understanding the origin and evolution of genes. XdomView is a web-based viewer mapping protein structural domains and intron positions in eukaryotic homologues to its tertiary structure. The association of sequence signals to 3D structure in XdomView provides a valuable visualization environment for eukaryotic gene organization, gene evolution, protein folding and protein structure classification. AVAILABILITY: Freely available from &lt;a href="http://surya.bic.nus.edu.sg/xdom"&gt;http://surya.bic.nus.edu.sg/xdom&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Gene. 
1981 May;13(4):355-64.&lt;br /&gt;Preferential codon usage in genes.&lt;br /&gt;Wain-Hobson S, Nussinov R, Brown RJ, Sussman JL.&lt;br /&gt;We present a method which permits comparison of the preferential use of degenerate codons within any gene. The method makes use of the triplet frequencies in the noncoding frames to assess whether a preference is specific to the reading frame. Preference is given a statistical meaning by use of the analysis of variance coupled to Duncan's multiple range test. Preferential use of degenerate codons is gene-specific and independent of gene size. The data suggest that any correlation between codon frequency distribution and tRNA levels is unreliable. In those animal genes examined, codons ending in C or G are preferred; in animal viruses tested, codons ending in U or A are preferred. Similarly, the bacterial genes and the genes of single-stranded DNA phages that we analyzed differed from each other as well as from eukaryotic genes in the third base of the codon.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Biol. 1994 May 20;238(5):693-708.&lt;br /&gt;Use of amino acid environment-dependent substitution tables and conformational propensities in structure prediction from aligned sequences of homologous proteins. II. Secondary structures.&lt;br /&gt;Wako H, Blundell TL.&lt;br /&gt;
Department of Crystallography, Birkbeck College, University of London, U.K.&lt;br /&gt;A three-step method is presented to predict secondary structures of proteins, by utilizing aligned sequences of homologous proteins. Mean propensities and amino acid substitution patterns at a given site in the aligned sequences are first evaluated for four conformational states (i.e. alpha-helix, beta-strand, buried coil and exposed coil). Capping rules are applied in order to define boundaries of the secondary-structure segments more precisely. In the second step beta-strand is predicted by searching regions predicted as coil for the two patterns characteristic of alternating and fully buried beta-strands. The complete sequences of the solvent-accessibility classes predicted by substitution tables and propensities are also searched using Fourier transform methods for alpha-helical periodicity. After applying capping rules, the alpha-helices and beta-strands predicted in the second step replace, where appropriate, the conformational states predicted in the first step. Finally, in the third step, if one of the four conformational states is assigned to the residues at an equivalent site of aligned sequences in more than a given fraction of the proteins, such a state is reassigned to all the residues at that site. The method is applied to 13 protein families, which contain four folding types, alpha, beta, alpha/beta and alpha + beta. The accuracy of the prediction ranges from 60 to 79% (mean percentage over the 13 families is 69%). For comparison the Garnier-Osguthorpe-Robson (GOR) method is also applied to them. Although the mean prediction accuracy for the GOR method, 58%, can be improved to 63% by applying the second and third steps in this method, there remain four families with less than 55% accuracy. The mean accuracy is relatively higher and poor predictions are reduced in this method.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Bioinformatics. 
2000 Nov;16(11):988-1002.&lt;br /&gt;Iterative sequence/secondary structure search for protein homologs: comparison with amino acid sequence alignments and application to fold recognition in genome databases.&lt;br /&gt;Wallqvist A, Fukunishi Y, Murphy LR, Fadel A, Levy RM.&lt;br /&gt;Department of Chemistry, Rutgers University, Wright-Rieman Laboratories, 610 Taylor Rd, Piscataway, NJ 08854-8087, USA. anders@rutchem.rutgers.edu&lt;br /&gt;MOTIVATION: Sequence alignment techniques have been developed into extremely powerful tools for identifying the folding families and function of proteins in newly sequenced genomes. For a sufficiently low sequence identity it is necessary to incorporate additional structural information to positively detect homologous proteins. We have carried out an extensive analysis of the effectiveness of incorporating secondary structure information directly into the alignments for fold recognition and identification of distant protein homologs. A secondary structure similarity matrix based on a database of three-dimensionally aligned proteins was first constructed. An iterative application of dynamic programming was used which incorporates linear combinations of amino acid and secondary structure sequence similarity scores. Initially, only primary sequence information is used. 
Subsequently contributions from secondary structure are phased in and new homologous proteins are positively identified if their scores are consistent with the predetermined error rate. RESULTS: We used the SCOP40 database, where only PDB sequences that have 40% homology or less are included, to calibrate homology detection by the combined amino acid and secondary structure sequence alignments. Combining predicted secondary structure with sequence information results in an 8-15% increase in homology detection within SCOP40 relative to the pairwise alignments using only amino acid sequence data at an error rate of 0.01 errors per query; a 35% increase is observed when the actual secondary structure sequences are used. Incorporating predicted secondary structure information in the analysis of six small genomes yields an improvement in the homology detection of approximately 20% over SSEARCH pairwise alignments, but no improvement in the total number of homologs detected over PSI-BLAST, at an error rate of 0.01 errors per query. However, because the pairwise alignments based on combinations of amino acid and secondary structure similarity are different from those produced by PSI-BLAST and the error rates can be calibrated, it is possible to combine the results of both searches. An additional 25% relative improvement in the number of genes identified at an error rate of 0.01 is observed when the data is pooled in this way. Similarly for the SCOP40 dataset, PSI-BLAST detected 15% of all possible homologs, whereas the pooled results increased the total number of homologs detected to 19%. These results are compared with recent reports of homology detection using sequence profiling methods. 
AVAILABILITY: Secondary structure alignment homepage at http://lutece.rutgers.edu/ssas CONTACT: anders@rutchem.rutgers.edu; ronlevy@lutece.rutgers.edu Supplementary Information: Genome sequence/structure alignment results at &lt;a href="http://lutece.rutgers.edu/ss_fold_predictions"&gt;http://lutece.rutgers.edu/ss_fold_predictions&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Nucleic Acids Res. 2002 Nov 1;30(21):e120.&lt;br /&gt;Designing gene libraries from protein profiles for combinatorial protein experiments.&lt;br /&gt;Wang W, Saven JG.&lt;br /&gt;Department of Chemistry, University of Pennsylvania, Philadelphia, PA 19104-6323, USA.&lt;br /&gt;Protein combinatorial libraries provide new ways to probe the determinants of folding and to discover novel proteins. Such libraries are often constructed by expressing an ensemble of partially random gene sequences. Given the intractably large number of possible sequences, some limitation on diversity must be imposed. A non-uniform distribution of nucleotides can be used to reduce the number of possible sequences and encode peptide sequences having a predetermined set of amino acid probabilities at each residue position, i.e., the amino acid sequence profile. Such profiles can be determined by inspection, multiple sequence alignment or physically-based computational methods. Here we present a computational method that takes as input a desired sequence profile and calculates the individual nucleotide probabilities among partially random genes. The calculated gene library can be readily used in the context of standard DNA synthesis to generate a protein library with essentially the desired profile. 
The fidelity between the desired profile and the calculated one coded by these partially random genes is quantitatively evaluated using the linear correlation coefficient and a relative entropy, each of which provides a measure of profile agreement at each position of the sequence. On average, this method of identifying such codon frequencies performs as well as or better than other methods with regard to fidelity to the original profile. Importantly, the method presented here provides much better yields of complete sequences that do not contain stop codons, a feature that is particularly important when all or large fractions of a gene are subject to combinatorial mutation.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Evol. 1984-85;21(2):169-81.&lt;br /&gt;Codon equilibrium I: Testing for homogeneous equilibrium.&lt;br /&gt;Wilbur WJ.&lt;br /&gt;&lt;br /&gt;We present theoretical considerations that suggest that synonymous-codon usage might be expected to be close to an equilibrium distribution given a very homogeneous process of silent substitution. By homogeneous we mean that substitution depends only on the two bases involved, so that 12 base-substitution rates completely describe the silent substitution process. We have developed a method of statistically testing for such homogeneous equilibrium and applied it to reported data on the codon usages of different classes of organisms. Weakly expressed bacterial sequences and both mammalian and nonmammalian eukaryotic sequences deviate significantly from a random pattern of codon usage, in the direction of homogeneous equilibrium. 
On the other hand, highly expressed bacterial sequences do not exhibit homogeneous equilibrium, which may be correlated with recent experimental results showing that they are optimized to accept the most abundant tRNAs. To examine the effect of amino acid replacements on the homogeneous model of silent substitution, we divided the amino acids with degenerate codes into two classes, those with high mutabilities and those with low, and performed the same analysis on bacterial and eukaryotic data sets. The codon sets of the highly mutable class of amino acids are not further from homogeneous equilibrium than are the codon sets of the class with low mutabilities. We also found for the eukaryotic data that these independent classes of codon sets show very similar equilibrium patterns. The various results suggest a high level of uniformity in the process of silent fixation in the different synonymous-codon sets, especially in eukaryotes.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Trends Genet. 2004 Nov;20(11):534-8.&lt;br /&gt;Evidence for codon bias selection at the pre-mRNA level in eukaryotes.&lt;br /&gt;Willie E, Majewski J.&lt;br /&gt;Laboratory of Statistical Genetics, The Rockefeller University, PO Box 192, 1230 York Avenue, New York, NY 10021, USA.&lt;br /&gt;&lt;br /&gt;We investigated codon usage patterns across eukaryotic exons. We have shown that in humans codon preference varies with distance from the splice sites. This is consistent with the distribution of RNA elements involved in splicing regulation. Our results provide the first evidence that selection at the pre-mRNA level influences codon usage in humans. 
We also show that systematic trends in codon usage are found in other eukaryotes, suggesting that pre-mRNA level selection for codon usage could be a widespread phenomenon in organisms that undergo RNA splicing.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Biol Chem. 1980 Apr 10;255(7):2807-15.&lt;br /&gt;Nucleotide sequence of the coding portion of human alpha globin messenger RNA.&lt;br /&gt;Wilson JT, Wilson LB, Reddy VB, Cavallesco C, Ghosh PK, deRiel JK, Forget BG, Weissman SM.&lt;br /&gt;&lt;br /&gt;The nucleotide sequence of the coding portion of human alpha globin mRNA has been determined by sequence analysis using human alpha globin cDNA cloned in bacterial plasmids. The sequence was obtained by a combination of direct sequence analysis of the cloned cDNA and analysis of cDNA obtained by primer extension, using short restriction endonuclease fragments of cloned alpha cDNA that were hybridized to human globin mRNA and elongated on the mRNA template by viral reverse transcriptase. The human alpha globin mRNA has an unexpectedly high G + C base composition (64.7%), similar to that observed for rabbit globin alpha mRNA, and displays a striking bias in the use of synonym codons for various amino acids. The bias in codon usage of human alpha globin mRNA is similar, with some exceptions, to that previously observed for rabbit alpha globin mRNA as well as for human and rabbit beta globin mRNAs. 
A detailed restriction endonuclease map of the human alpha globin cDNA is presented.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Proteins. 2004 Jan 1;54(1):71-87.&lt;br /&gt;Gaps in structurally similar proteins: towards improvement of multiple sequence alignment.&lt;br /&gt;Wrabl JO, Grishin NV.&lt;br /&gt;Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas 75390-9050, USA.&lt;br /&gt;&lt;br /&gt;An algorithm was developed to locally optimize gaps from the FSSP database. Over 2 million gaps were identified from all versus all FSSP structure comparisons, and datasets of non-identical gaps and flanking regions comprising between 90,000 and 135,000 sequence fragments were extracted for statistical analysis. Relative to background frequencies, gaps were enriched in residue types with small side chains and high turn propensity (D, G, N, P, S), and were depleted in residue types with hydrophobic side chains (C, F, I, L, V, W, Y). In contrast, regions flanking a gap exhibited opposite trends in amino acid frequencies, i.e., enrichment in hydrophobic residues and a high degree of secondary structure. Log-odds scores of residue type as a function of position in or around a gap were derived from the statistics. Three simple experiments demonstrated that these scores contained significant predictive information. First, regions where gaps were observed in single sequences taken from HOMSTRAD structure-based multiple sequence alignments generally scored higher than regions where gaps were not observed. 
Second, given the correct pairwise-aligned cores, the actual positions of gaps could be reproduced from sequence more accurately using the structurally-derived statistics than by using random pairwise alignments. Finally, revision of the Clustal-W residue-specific gap opening parameters with this new information improved the agreement of Clustal-W alignments with the structure-based alignments. At least three applications for these results are envisioned: improvement of gap penalties in pairwise (or multiple) sequence alignment, prediction of regions of single sequences likely (or unlikely) to contain indels, and more accurate placement of gaps in automated pairwise structure alignment.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;FEBS Lett. 1998 Aug 28;434(1-2):93-6.&lt;br /&gt;Erratum in: FEBS Lett 1998 Oct 16;437(1-2):164. Tao, X [corrected to Xie, T]; Dafu, D [corrected to Ding, D].&lt;br /&gt;The relationship between synonymous codon usage and protein structure.&lt;br /&gt;Xie T, Ding D.&lt;br /&gt;Shanghai Institute of Biochemistry, Academia Sinica, People's Republic of China.&lt;br /&gt;&lt;br /&gt;The hypothesis that synonymous codon usage is related to protein three-dimensional structure is examined by investigating the correlation between synonymous codon usage and protein secondary structure. All except two codons in E. coli show the same secondary structural preference for alpha-helix, beta-strand or coil as that of amino acids to be encoded by the respective codons, while 17 codons show secondary structural bias in mammalian proteins. The results indicate that there is no significant correlation between synonymous codon usage and protein secondary structure in E. 
coli, but there is a correlation in mammals. It could be deduced that synonymous codons carry much less structural information in prokaryotes than in eukaryotes due to their divergent evolutionary mechanism.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;J Mol Evol. 1998 Apr;46(4):409-18.&lt;br /&gt;Synonymous and nonsynonymous rate variation in nuclear genes of mammals.&lt;br /&gt;Yang Z, Nielsen R.&lt;br /&gt;Department of Integrative Biology, University of California, Berkeley, 94720-3140, USA. z.yang@ucl.ac.uk&lt;br /&gt;&lt;br /&gt;A maximum likelihood approach was used to estimate the synonymous and nonsynonymous substitution rates in 48 nuclear genes from primates, artiodactyls, and rodents. A codon-substitution model was assumed, which accounts for the genetic code structure, transition/transversion bias, and base frequency biases at codon positions. Likelihood ratio tests were applied to test the constancy of nonsynonymous to synonymous rate ratios among branches (evolutionary lineages). It is found that at 22 of the 48 nuclear loci examined, the nonsynonymous/synonymous rate ratio varies significantly across branches of the tree. The result provides strong evidence against a strictly neutral model of molecular evolution. Our likelihood estimates of synonymous and nonsynonymous rates differ considerably from previous results obtained from approximate pairwise sequence comparisons. The differences between the methods are explored by detailed analyses of data from several genes. 
Transition/transversion rate bias and codon frequency biases are found to have significant effects on the estimation of synonymous and nonsynonymous rates, and approximate methods do not adequately account for those factors. The likelihood approach is preferable, even for pairwise sequence comparison, because more realistic models about the mutation and substitution processes can be incorporated in the analysis.&lt;br /&gt;&lt;br /&gt;J Mol Biol. 2000 Aug 18;301(3):691-711.&lt;br /&gt;An integrated approach to the analysis and modeling of protein sequences and structures. III. A comparative study of sequence conservation in protein structural families using multiple structural alignments.&lt;br /&gt;Yang AS, Honig B.&lt;br /&gt;Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA. ay1@columbia.edu&lt;br /&gt;&lt;br /&gt;The information required to generate a protein structure is contained in its amino acid sequence, but how three-dimensional information is mapped onto a linear sequence is still incompletely understood. Multiple structure alignments of similar protein structures have been used to investigate conserved sequence features but contradictory results have been obtained, due, in large part, to the absence of objective criteria to be used in the construction of sequence profiles and in the quantitative comparison of alignment results. Here, we report a new procedure for multiple structure alignment and use it to construct structure-based sequence profiles for similar proteins. 
The definition of "similar" is based on the structural alignment procedure and on the protein structural distance (PSD) described in paper I of this series, which offers an objective measure for protein structure relationships. Our approach is tested in two well-studied groups of proteins: serine proteases and Ig-like proteins. It is demonstrated that the quality of a sequence profile generated by a multiple structure alignment is quite sensitive to the PSD used as a threshold for the inclusion of proteins in the alignment. Specifically, if the proteins included in the aligned set are too distant in structure from one another, there will be a dilution of information and patterns that are relevant to a subset of the proteins are likely to be lost.&lt;br /&gt;In order to understand better how the same three-dimensional information can be encoded in seemingly unrelated sequences, structure-based sequence profiles are constructed for subsets of proteins belonging to nine superfolds. We identify patterns of relatively conserved residues in each subset of proteins. It is demonstrated that the most conserved residues are generally located in the regions where tertiary interactions occur and that are relatively conserved in structure. Nevertheless, the conservation patterns are relatively weak in all cases studied, indicating that structure-determining factors that do not require a particular sequential arrangement of amino acids, such as secondary structure propensities and hydrophobic interactions, are important in encoding protein fold information. In general, we find that similar structures can fold without having a set of highly conserved residue clusters or a well-conserved sequence profile; indeed, in some cases there is no apparent conservation pattern common to structures with the same fold. 
Thus, when a group of proteins exhibits a common and well-defined sequence pattern, it is more likely that these sequences have a close evolutionary relationship rather than the similarities having arisen from the structural requirements of a given fold.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Genome Res. 2002 Jun;12(6):944-55.&lt;br /&gt;Shannon information theoretic computation of synonymous codon usage biases in coding regions of human and mouse genomes.&lt;br /&gt;Zeeberg B.&lt;br /&gt;Laboratory of Molecular Pharmacology, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA. barry@discover.nci.nih.gov&lt;br /&gt;&lt;br /&gt;Exonic GC of human mRNA reference sequences (RefSeqs), as well as A, C, G, and T in codon position 3 are linearly correlated with genomic GC. These observations utilize information from the completed human genome sequence and a large, high-quality set of human and mouse coding sequences, and are in accord with similar determinations published by others. A Shannon Information Theoretic measure of bias in synonymous codon usage was developed. When applied to either human or mouse RefSeqs, this measure is nonlinearly correlated with genomic, exonic, and third codon position A, C, G, and T. Information values between orthologous mouse and human RefSeqs are linearly correlated: mouse = 0.092 + 0.55 human. Mouse genes were consistently placed in genomic regions whose GC content was closer to 50% than was the GC content of the human ortholog. 
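&lt;br /&gt;The link between an information measure and GC content can be illustrated with a toy calculation. The Python sketch below is a stand-in built on stated assumptions, not the measure defined in the paper: it looks at a single site, assumes G = C and A = T, and reports log2(4) minus the Shannon entropy of the base frequencies. The function names shannon_entropy and base_information are hypothetical.&lt;br /&gt;

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy in bits, skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probabilities if p != 0)

def base_information(gc_fraction):
    """Information (bits) of a base composition relative to a uniform one.

    Assumption for illustration only: G and C share the GC fraction
    equally, and A and T share the remainder equally.
    """
    g = c = gc_fraction / 2
    a = t = (1 - gc_fraction) / 2
    return math.log2(4) - shannon_entropy([a, c, g, t])

# Tabulate the toy measure across a range of GC fractions.
for gc in (0.3, 0.4, 0.5, 0.6, 0.7):
    print(gc, round(base_information(gc), 4))
```

Under these assumptions the value is zero at 50% GC and grows as the composition moves away from 50% in either direction.&lt;br /&gt;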
Since the (nonlinear) information versus percent GC curve has a minimum at 50% GC and monotonically increases with increasing distance from 50% GC, this phenomenon directly results in the low slope of 0.55. This appears to be a manifestation of an evolutionary strategy for placement of genes in regions of the genome with a GC content that relates synonymous codon bias and protein folding.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111517636654700199?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111517636654700199/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111517636654700199' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111517636654700199'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111517636654700199'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/05/proof-for-helyxzion-anvil.html' title='PROOF FOR  HELYXZION ANVIL'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111514158298445737</id><published>2005-05-03T10:33:00.000-07:00</published><updated>2005-05-03T10:33:02.983-07:00</updated><title type='text'>helyxzion</title><content type='html'>the door is open walk in&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/12538363-111514158298445737?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111514158298445737/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111514158298445737' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111514158298445737'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111514158298445737'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/05/helyxzion.html' title='helyxzion'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111513002779569539</id><published>2005-05-03T07:20:00.000-07:00</published><updated>2005-05-03T07:45:23.493-07:00</updated><title type='text'>HELYXZION REPORT ON</title><content type='html'>“BENT” DNA &amp; PROTEIN BINDING&lt;br /&gt;Slight structural variations within a stretch of DNA cause it to bend in ways that determine whether a gene sits dormant or cranks out thousands of copies. Scientists are uncovering how molecular dynamics govern this bending.&lt;br /&gt;Blue eyes or brown, tall or short, susceptible to cancer or diabetes--our characteristics are written in the code of a twisting double helix. Unraveling that code has occupied molecular biologists since 1953, when James Watson and Francis Crick first proposed the double helical structure of DNA. 
This long molecule resembles a twisting rope ladder, the legs of which are identical groups of sugars and phosphates. The rungs, however, are composed of four kinds of bases. In genes, the regions that code for proteins, the bases exist in any of 64 different groupings of three. These groupings, called codons, form the words of the genetic language--a language that carries the instructions for building and maintaining a living organism.&lt;br /&gt;Over the years scientists have learned that a source of nuance in the meaning of this genetic language lies in structural variations that arise from differences in the sequence not of codons but of the bases themselves. Differences in base sequence encourage certain stretches of the double helix to fold back upon themselves as if making a U-turn, while other stretches are straight as rods. Some stretches twist tightly, like an overwound rubber band, whereas others curve gently. These twists and bends often influence whether a gene sits dormant or turns on to crank out thousands of copies of a protein. Helyxzion “ANVIL” gives the researcher a clear, user-friendly, easy-to-understand graphical interface for the complete understanding of DNA.&lt;br /&gt;Helyxzion will quicken advancements in all areas of biotech and genetic research. Today genetic engineers can design DNA with any base sequence, but they still cannot predict how it will twist or bend.&lt;br /&gt;Until now, with the Helyxzion ANVIL (Advanced Nucleotide Interpretive Language). The HELYXZION “ANVIL” algorithm's consistently accurate prediction of protein structures is just one of the many key features of “ANVIL”. As in any language, there is more than just nouns and pronouns. 
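&lt;br /&gt;As a small aside on the codon arithmetic mentioned earlier in this report: four bases read three at a time give 4 x 4 x 4 = 64 possible triplets. The sketch below is only an illustration in Python and is not part of ANVIL; the function names all_codons and split_into_codons are made up for this example.&lt;br /&gt;

```python
from itertools import product

BASES = "ACGT"  # DNA alphabet; RNA would use U in place of T

def all_codons(bases=BASES):
    """Return every ordered triplet of bases as a string."""
    return ["".join(triplet) for triplet in product(bases, repeat=3)]

def split_into_codons(sequence):
    """Split a sequence into consecutive triplets, dropping any incomplete tail."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]

codons = all_codons()
print(len(codons))                       # 4 ** 3 = 64
print(split_into_codons("ATGGCCTAAGG"))  # ['ATG', 'GCC', 'TAA']
```

The two trailing bases of the example sequence are dropped because they do not complete a triplet.&lt;br /&gt;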
The so-called “JUNK DNA” is not junk at all, and HELYXZION is renaming non-protein-coding DNA to accurately describe its true function.&lt;br /&gt;Additional work is being done to improve predictability: molecular biophysicist David Beveridge and his colleagues at Wesleyan University in Middletown, CT, are modeling DNA on the SGI Cray Origin2000 computer at NCSA. Graduate student Matthew Young generated the first glimpse of the bending action of a well-studied stretch of DNA called phased A-tract. The results, reported in the Journal of Molecular Biology, may lead to a fundamental understanding of the molecular forces that bend DNA.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;A lot can happen to DNA in 5 nanoseconds. The phased A-tract DNA--straight as a rod at the start of the simulation--bent, straightened, and bent again. To test the accuracy of the HELYXZION model, Young and Beveridge also simulated a piece of DNA that was not expected to bend. The results revealed a crucial difference in how the two pieces of DNA behave: While the control DNA moved around in solution and bent temporarily, on average it was straight. The phased A-tract DNA, however, bent about 30 degrees in a single direction. Because the angle of bending was consistent with the results of lab experiments, they could scrutinize the details of the simulation to identify which of the three models most accurately represents phased A-tract bending. "The results essentially support the junction theory; that is, an essentially straight A-tract that is bent at junctions," says Beveridge. But bending also occurs in the stretches of DNA that flank the A-tract.&lt;br /&gt;Now that they have a basic understanding of how this piece of DNA bends, Beveridge and his colleagues will run DNA simulations to see how varying conditions inside the cell, such as salt concentration, affect bending. 
A preliminary run with the A-tract DNA has found that a saltier solution--one with a higher concentration of sodium--increased A-tract bending. Other DNA sequences also may bend and twist differently at different salt concentrations. Cells may, in fact, influence gene expression by altering the salt concentration inside their nucleus, Beveridge says.&lt;br /&gt;Discerning the specific atomic forces that underlie DNA bending is not enough to understand the biological role of bent DNA, he adds. Genes are turned on when specific gene-activating proteins bind to specific stretches of DNA, and those segments of DNA are often bent. The next step, which Beveridge and his colleagues have already begun, is to run molecular dynamics simulations gauging the atomic forces that help proteins bind to bent DNA. The results could help scientists comprehend the molecular dynamics of how genes are turned on and off. Then, the nuances of HELYXZION the language of DNA will be that much clearer.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111513002779569539?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111513002779569539/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111513002779569539' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111513002779569539'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111513002779569539'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/05/helyxzion-report-on.html' title='HELYXZION REPORT ON'/><author><name>Dr.Charles 
Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111512965216425579</id><published>2005-05-03T07:14:00.000-07:00</published><updated>2006-02-25T08:44:21.666-08:00</updated><title type='text'>Helyxzion: The Language of DNA</title><content type='html'>&lt;a href="http://www.helyxzion.com/"&gt;Helyxzion: The Language of DNA&lt;/a&gt;&lt;br /&gt;Press release: Helyxzion has discovered that DNA is not degenerative.&lt;br /&gt;A universal compositional correlation among codon positions.&lt;br /&gt;example by:&lt;br /&gt;D'Onofrio G, Bernardi G.&lt;br /&gt;&lt;br /&gt;Laboratoire de Genetique Moleculaire, Institut Jacques Monod, Paris, France.&lt;br /&gt;&lt;br /&gt;We have investigated the compositional distributions of third codon positions of genes from the 16 prokaryotes and seven eukaryotes for which the largest numbers of coding sequences are available in data banks. In prokaryotes, both narrow and broad distributions were found. In eukaryotes, distributions were very broad (except for Saccharomyces cerevisiae) and remarkably different for different genomes. In low-GC genomes, third codon positions were lower in GC than first + second codon positions and trailed towards high GC; the opposite situation was found for high-GC genomes. In all genomes, first codon positions were higher in GC than second codon positions. We then investigated the compositional correlations between third and first + second codon positions in prokaryotic genomes (the 16 mentioned above plus 87 additional ones) and in genome compartments of eukaryotes. A general, common relationship was found, which also holds within the same (heterogeneous) genomes. 
This universal correlation is due to the fact that the relative effects of compositional constraints on different codon positions are the same, on the average, whatever the genome under consideration.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111512965216425579?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111512965216425579/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111512965216425579' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111512965216425579'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111512965216425579'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/05/helyxzion-language-of-dna.html' title='Helyxzion: The Language of DNA'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111512892653659080</id><published>2005-05-03T07:02:00.000-07:00</published><updated>2005-05-03T07:02:06.536-07:00</updated><title type='text'>helyxzion We're Steadily Marching</title><content type='html'>&lt;a href="http://helyxzion.modblog.com/core.mod?show=poll&amp;amp;poll_id=22426"&gt;ModBlog - helyxzion's ModBlog&lt;/a&gt;Genomics: "We're Steadily Marching"&lt;br /&gt;Dr. 
Charles Stevens, a pioneer in genetic research, explains why progress has been slower than expected but treatments are on the way &lt;br /&gt; &lt;br /&gt;When it was announced with huge fanfare in the summer of 2000 that researchers had successfully charted the entire genetic code of human beings, two scientists -- Craig Venter and Francis Collins -- received most of the attention. Yet the historic deciphering of the human genome was the work of many pioneers. Helyxzion’s technology is the only one devoted to the full understanding of the genome as a language.&lt;br /&gt;&lt;br /&gt;Helyxzion is at the center of this rapidly developing field, playing a major role in analyzing sequences and understanding the complex human genome. Since then, HELYXZION has announced the formation of an international consortium for collaboration in reading the human genetic code and is also deciphering the genes of the rat, honeybee, monkey, and others. At the same time, HELYXZION is developing a new sequencing technology (using CT scan and pulsed laser) as well as trying to use the new genetic information to make a difference in medicine. &lt;br /&gt;&lt;br /&gt;"HARD PROBLEM."  The idea that knowledge about genetic defects could be used to transform treatments is what first drove the US-born Stevens to begin research on the “language of DNA” in 1986, just when scientists were beginning to use a new technique, called polymerase chain reaction (PCR), which allowed them to copy and study DNA. "Our big interest was finding changes in genes that caused disease," he says. &lt;br /&gt;&lt;br /&gt;WHAT'S AHEAD &lt;br /&gt;Dr. Stevens realized how powerful it would be if scientists could do analyses of this sort not just on one gene but on humanity's full set of tens of thousands. That led him to establish HELYXZION in 2002, and to become a major figure in the human-genome project. 
&lt;br /&gt;&lt;br /&gt;Now he has a vision of a future in which scientists are able to pinpoint key genetic variations in an individual, understand the biological -- and disease -- consequences of those variations, and devise personalized treatments. "I think we will see this going from genome to bedside in three years," he says. &lt;br /&gt;&lt;br /&gt; &lt;br /&gt;&lt;br /&gt;Q: The mapping of the human genome was completed three years ago, but as yet there haven't been any real breakthroughs in new drugs and treatments -- and some of the first drugs based on new gene findings have flopped. Should we be pessimistic that results will take much longer than hoped for, or are there encouraging developments?&lt;br /&gt;A: I do feel a bit better than a year ago. There's a phenomenon we have to have faith in -- that when we find an allele [a gene variant that differs among individuals and could be associated with a disease], it's in a gene that's contributing to a disorder. &lt;br /&gt;&lt;br /&gt;Q: Is there a good example of this?&lt;br /&gt;A: Look at the genes linked to schizophrenia. There are three or four genes, and a similar number has been found in diabetes and hypertension. No one has yet found a gene that causes a 10% to 20% increase in risk. But even genes that contribute less prominently to the increase in risk can tell you something about the disorder. Progress is happening slower than we would like, but it's coming along. &lt;br /&gt;&lt;br /&gt;Q: Your own team is looking at genes for the molecules that act as channels across cell membranes. Have you been able to find genetic variations linked to diseases?&lt;br /&gt;A: Yes, we've been studying ion-channel defects in epilepsy. There's a spectrum of genetic differences, some strongly inherited and some not. We have faith that ion channels play a role in the disease, and we have found variants of the genes. 
We're now busy cloning the genes and making the aberrant proteins, and putting them back in to see what the ion channels do. &lt;br /&gt;&lt;br /&gt;One interesting phenomenon is that we have found more variation in these ion-channel genes than we might have expected. In a perfect world, all the genes would be normal [and the same], except for those that are broken in a sick person. Instead, we've found more variations in apparently healthy people, as well as hints of changes in genes at several sites. Several patients have several changes. &lt;br /&gt;&lt;br /&gt;Q: So you've found some gene variants linked to disease. What next?&lt;br /&gt;A: Then we have to ask, which really are the important variants -- and what's their functional significance? That's a big step. Of course, it's easy to say that we've just created a new question [i.e. now that a gene has been found, what does it do?], but we're steadily marching to a resolution. &lt;br /&gt;&lt;br /&gt;Q: There are some interesting things about the rat genome. For instance, the rat has a lot more genes than mice or men for dealing with toxic substances -- as you might expect for an animal which has evolved to thrive in nasty habitats.&lt;br /&gt;A: The expansion of detoxification genes was of great interest. Also the expansion of genes for olfactory receptors. Then there was the issue of evolutionary speed -- the rate of genome change. It seems to be very fast in the rodents, compared to primates. &lt;br /&gt;&lt;br /&gt;Q: And soon, we'll be able to compare those genomes with other animals?&lt;br /&gt;A: We're talking about getting a lot of mammals. That will help fill out the tree. But it's really about finding the mechanism of genome changes -- which we can use to think about human disease. &lt;br /&gt;&lt;br /&gt;Q: How so?&lt;br /&gt;A: Genetic disease arises from changes in the genome, and that's reflective of the dynamic nature of the genome. 
When you look at a mouse or rat and see an insertion [of DNA] or a deletion, the process that gave rise to that can also happen in other species. &lt;br /&gt;&lt;br /&gt;That can enlighten us about all the potential things a genome can do. For instance, a disease caused by deletions [of DNA] might be in some part of the genome that's more susceptible to deletions. So if we look across species, we can ask why a genetic transformation occurs at particular spots. &lt;br /&gt;&lt;br /&gt;Q: So overall, progress may be slower than initially hoped for, but there's progress nonetheless?&lt;br /&gt;A: Biology is tough, and we have to take what we can get. But I'm pretty upbeat now.&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111512892653659080?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111512892653659080/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111512892653659080' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111512892653659080'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111512892653659080'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/05/helyxzion-were-steadily-marching.html' title='helyxzion We&apos;re Steadily Marching'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111506494439132956</id><published>2005-05-02T13:14:00.000-07:00</published><updated>2005-05-02T13:15:44.393-07:00</updated><title type='text'>Helyxzion: doing a spin with DNA</title><content type='html'>Doing A Spin With DNA&lt;br /&gt;The enzyme topoisomerase IB releases the torsion built up in DNA strands. During their investigations, the researchers could follow a single topoisomerase-enzyme molecule over time as it acted on a single DNA molecule. The topoisomerase clamps onto the DNA, cuts through one of the two DNA strands, and then lets the DNA unwind before sticking the broken ends back together again. With the help of sensitive measuring devices, the researchers could measure various parameters such as the friction of the rotating DNA in a cavity of the enzyme. The research has provided new insights into the interactions between DNA and the enzyme, which are of fundamental importance for understanding cell division.&lt;br /&gt;DNA consists of two long strands joined together by pairs of bases. Both strands wind around each other in the form of a double helix with the base pairs acting as the 'stairs' in a staircase. The sequence of these base pairs stores genetic information. During cell division genetic material is copied and the enzymes responsible for this must be able to transcribe the base sequences. This is only possible if the portion of DNA to be transcribed is unwound. This winding and unwinding of the DNA gives rise to torsional forces in the DNA, the magnitude of which increases as cell division progresses. These forces can delay the process of cell division and under certain conditions even stop it. 
Topoisomerase IB can reduce these torsional forces.&lt;br /&gt;The enzyme releases the torsion from the DNA as follows: The enzyme surrounds the double-stranded DNA like a clamp and then temporarily cuts through one of the two DNA strands. The accumulated torsional forces in the DNA are then spun out around the intact strand. After a number of turns the topoisomerase once again firmly grabs the spinning DNA and 'glues' (ligates) the broken strands neatly back together again. The researchers were able to determine the exact number of turns removed by the topoisomerase between 'cutting' and 'gluing'.&lt;br /&gt;The precise mechanism of topoisomerase IB is also important for cancer research. Drugs which inhibit the functioning of topoisomerase IB are already in clinical use, but can possibly be improved using the knowledge from this study.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111506494439132956?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111506494439132956/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111506494439132956' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111506494439132956'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111506494439132956'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/05/helyxzion-doing-spin-with-dna.html' title='Helyxzion: doing a spin with DNA'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111505564328017383</id><published>2005-05-02T10:39:00.000-07:00</published><updated>2005-05-02T10:40:43.280-07:00</updated><title type='text'>abstract helyxzion modled</title><content type='html'>Abstract:&lt;br /&gt;&lt;br /&gt;HELYXZION MODELED  proteins are a new evolvable method of HELYXIZONICLLY mapping genotype to phenotype through a developmental process, where HELYXIZONgenes are expressed into HELYXZIONICproteins comprised of subsets of  HELYXZION. The resulting network of HELYXZIONgene and proteinICHELYXZION interactions can be designed by evolution to produce specific patterns that in turn can be used to solve problems. Here the use of regulatory networks for learning a robot path through a series of obstacles is described. The results indicate the ability of this system to learn regularities in solutions and automatically create and use modules AS HELYXZION genes CAN BE REGULATED AND CONTROLLED………&lt;br /&gt;HELYXZION MODEL:&lt;br /&gt;  Proteins= HELXYIONS , XION/YION, ZION AND XZYION HELYXZION. HELYXZION IS DISCOVERING NEW methods of HELXOGENEICmapping, HELXOgenotype to HELYXO(IONICIONS)phenotype through a developmental process, where HELYXIZONgenes ( THE WORK GENE GENES IN HELYXIZON ARE: BYTRANSLATED DNA into HELYXZION TO.&lt;br /&gt;&lt;br /&gt;HELXZYIONICproteins comprised of subsets of  HELYXZION. The resulting network of HELYXZIONgene and proteinICHELYXZION interactions can be designed by evolution to produce specific patterns that in turn can be used to solve problems. Here the use of regulatory networks for learning a robot path through a series of obstacles is described. 
The results indicate the ability of this system to learn regularities in solutions and automatically create and use modules AS HELYXZION genes CAN BE REGULATED AND CONTROLLED……HELYYXXZZION COVERS ALL DNAs, RNAs, AMINO ACIDS, AND THE HELYXZIOPROTEINS THEY FORM OR CONTROL……ALL ELSE IS “JUNK” (THERE IS NEARER TO 0% JUNK IN NATURE) AND NON-CODING DNA………MY BEST WISHES TO YOU. IF YOU DO FIND ANY JUNK DNA,&lt;br /&gt;IT IS YOURS…… KEEP IT. IF THERE TRULY IS SUCH A THING AS JUNK DNA, PLEASE PROVE IT…..HELYXZION MAKES NO CLAIM TO “JUNK DNA” OR “NON-CODING DNA”  AND UNDER J.V. AGREEMENT TO FRACTICALS OF PRIME SEQUENCES. PATENT PENDING.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;ONLY AS TO HOW GENETIC INFORMATION GENOMIC DATA&lt;br /&gt;Gene A DNA SEGMENT CONTAINING BIOLOGICAL INFORMATION AND HENCE CODING FOR AN RNA AND/OR A POLYPEPTIDE MOLECULE&lt;br /&gt;CONTAINS INFORMATION. HELYXZION 101………………………..&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111505564328017383?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111505564328017383/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111505564328017383' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111505564328017383'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111505564328017383'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/05/abstract-helyxzion-modled.html' title='abstract helyxzion modled'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111497884803655274</id><published>2005-05-01T13:20:00.000-07:00</published><updated>2005-07-29T09:31:20.293-07:00</updated><title type='text'>helyxzions discovery: synonymous codons unique</title><content type='html'>&lt;a href="http://www.helyxzion.com/"&gt;Helyxzion: The Language of DNA&lt;/a&gt;Helyxzion's discovery (which dates back to 1984) reveals that the relationships between the synonymous codon usage of amino acids and their protein secondary, tertiary and quaternary structural units are not the same. Each codon is unique in the way it is used, being assembled into the growing polypeptide chain in different orientations which result in different structural properties, and in most cases different functional proteins.&lt;br /&gt;We have proposed new amino acid secondary structure propensities in proteins with different folding types based on synonymous codons. They have been derived from 200 all alpha, all beta, alpha/beta, and alpha + beta proteins of known structures and their coding genes. The secondary structure propensities of the same codon in genes coding for proteins of different folding types are not the same. For instance, amino acid Ile coded by AUU is indifferent to forming the alpha unit in the alpha + beta protein class, but it is a former and a breaker for the alpha unit in the all alpha protein class and the alpha/beta class, respectively. On the other hand, the secondary structure propensities of different synonymous codons in the coding genes with the same folding type are also not all the same. 
As an example, CGU, CGG, and AGA, which are synonymous codons of Arg, are preferential to form the alpha unit in all alpha proteins, while CGA is an alpha unit breaker and the other two synonymous codons, CGC and AGG, are indifferent to form or break the alpha unit. As a result, protein secondary structure information contained both in mRNA sequences and in amino acid sequences has been introduced in these codon-based amino acid secondary structure propensities. These codon-based amino acid secondary structure propensities are helpful to in vitro protein design and protein secondary, tertiary and quaternary structure prediction.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111497884803655274?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111497884803655274/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111497884803655274' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111497884803655274'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111497884803655274'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/05/helyxzions-discovery-synonymous-codons.html' title='helyxzions discovery: synonymous codons unique'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111496765958440963</id><published>2005-05-01T10:14:00.000-07:00</published><updated>2006-02-25T08:45:18.570-08:00</updated><title type='text'>Helyxzion is changing the way bioteches do business</title><content type='html'>&lt;a href="http://www.google.com/search?q=MicroCaps+&amp;amp;hl=en&amp;lr=&amp;amp;rls=RNWE,RNWE:2005-12,RNWE:en&amp;as_qdr=all&amp;amp;start=90&amp;sa=N"&gt;Google Search: &lt;/a&gt;Helyxzion is changing the way biotechs do business, although biotech businesses have been slower than some industries to embrace e-commerce. "It appears to me that although researchers have been using the Internet for a while, the business opportunity in the Internet and biotech is only getting started," said Helyxzion's CEO, Dr. Stevens. "There's an explosion of biotech information, tools and services on-line." Unfortunately, most are not of much real use, as they are old technology in new boxes.&lt;br /&gt;Helyxzion's focus is on the importance of offering researchers both comprehensive and reliable information. Without that, "you are part of the problem, not part of the solution."&lt;br /&gt;The most rapidly evolving market is for scientific analysis of “intron DNA”, which is the biggest untouched gold mine of biotechnology. For its success, it must have a critical mass of high-quality information; it must be simple to navigate and access; and it must have high functionality. 
Helyxzion is right on the mark.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111496765958440963?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111496765958440963/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111496765958440963' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111496765958440963'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111496765958440963'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/05/helyxzion-is-changing-way-bioteches-do.html' title='Helyxzion is changing the way bioteches do business'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111489257489922914</id><published>2005-04-30T13:22:00.000-07:00</published><updated>2005-04-30T13:22:54.900-07:00</updated><title type='text'>Biochemicons joint venture with Helyxzion</title><content type='html'>BCC is the European Representative of HELYXZION LLC and, based on the identification of the “Language of DNA”, we are one of the European leaders in the area of DNA-Protein-based Nanotechnology and Bioinformatics.
&lt;br /&gt;&lt;br /&gt;Biochemicon founded in October 2000 by a high motivated team of English, American and Austrian scientists for the purpose of using DNA not only in the conventional strands, but also using their enormous variability ample scope for Designing Molecules.&lt;br /&gt;&lt;br /&gt;A central focus of our mission in the post genomic area is the understanding of how cellular phenomena arise from the connectivity of genes and proteins. This connectivity generates molecular networks, and a systematic understanding requires the development of a mathematical framework for describing the circuitry of life. &lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111489257489922914?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111489257489922914/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111489257489922914' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111489257489922914'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111489257489922914'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/04/biochemicons-joint-venture-with.html' title='Biochemicons joint venture with Helyxzion'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111489247540669993</id><published>2005-04-30T13:21:00.000-07:00</published><updated>2005-04-30T13:21:15.406-07:00</updated><title type='text'>Helyxzion joint venture with BIOCHEMICON - The Post Gene NBI - Tech Company</title><content type='html'>&lt;a href="http://www.biochemicon.com/gross.html"&gt;WELCOME TO BIOCHEMICON - The Post Gene NBI - Tech Company&lt;/a&gt;BCC is the European Representative of HELYXZION LLC and, based on the identification of the “Language of DNA”, we are one of the European leaders in the area of DNA-Protein-based Nanotechnology and Bioinformatics.&lt;br /&gt;&lt;br /&gt;Biochemicon was founded in October 2000 by a highly motivated team of English, American and Austrian scientists for the purpose of using DNA not only in the conventional strands, but also using its enormous variability, which offers ample scope for designing molecules.&lt;br /&gt;&lt;br /&gt;A central focus of our mission in the post-genomic era is the understanding of how cellular phenomena arise from the connectivity of genes and proteins. This connectivity generates molecular networks, and a systematic understanding requires the development of a mathematical framework for describing the circuitry of life. 
&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111489247540669993?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111489247540669993/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111489247540669993' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111489247540669993'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111489247540669993'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/04/helyxzion-joint-venture-with.html' title='Helyxzion joint venture with BIOCHEMICON - The Post Gene NBI - Tech Company'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111489226586976340</id><published>2005-04-30T13:17:00.000-07:00</published><updated>2005-05-02T09:02:57.303-07:00</updated><title type='text'>Helyxzion: The Language of DNA</title><content type='html'>Helyxzion: Angel with broken wing: "The development of the Helyxzion TRD Viewer is a critical first step toward the complete understanding of the nature and function of DNA. Charles Stevens' research in this area continues to be inspired by the life of his aunt, Mary Lorraine Becker. 
Artist: Title: Angel With a Broken Wing&lt;br /&gt;When Mary was stricken with polio at the age of 6, doctors told her family that she would probably never walk or, worse, that she would die prematurely. 
Mary defied this prognosis; she taught herself to walk with the aid of braces and crutches and lived to the age of 86.&lt;br /&gt;Mary was an accomplished artist, and Helyxzion LLC chose her self-portrait, 'Angel With A Broken Wing,' as a constant reminder that our goal is to ensure that no one will ever have to 'live a lifetime with a broken wing.'&lt;br /&gt;The work undertaken by Charles Stevens and Helyxzion LLC is dedicated to the memory of this brave and extraordinary woman."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111489226586976340?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111489226586976340/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111489226586976340' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111489226586976340'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111489226586976340'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/04/helyxzion-language-of-dna_30.html' title='Helyxzion: The Language of DNA'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111489156537385768</id><published>2005-04-30T13:06:00.000-07:00</published><updated>2006-02-25T08:44:41.236-08:00</updated><title type='text'>Helyxzion: THE 21ST AND 22ND AMINO ACID:</title><content type='html'>&lt;a 
href="http://www.helyxzion.com/"&gt;Helyxzion: The Language of DNA&lt;/a&gt;HELYXZION:&lt;br /&gt;THE NEXT GENERATION OF THE ‘ANVIL’ VIEWER WILL INCORPORATE&lt;br /&gt;THE 21ST AND 22ND AMINO ACID:&lt;br /&gt;&lt;br /&gt;Selenocysteine has been recognized for the past fifteen years as the 21st "natural" amino acid. It is now known to occur in several dozen proteins, its mRNA codon being UGA, which usually serves as a stop codon but, with a specific downstream sequence forming a loop and a specific translational elongation factor, is recognized as the site of selenocysteine incorporation into proteins.&lt;br /&gt;Selenocysteine has been called the 21st amino acid. It is an essential constituent of 3 proteins in E. coli: the formate dehydrogenases FDHO, FDHN, and FDHH; each of them contains one selenocysteine residue. The biosynthesis of selenocysteine is unique among amino acid biosyntheses because its defining step occurs while attached to a tRNA molecule. The tRNA molecule, tRNAsec, serves only to insert selenocysteine into these proteins. tRNAsec has an anticodon that recognizes the stop codon UGA; it also has properties that allow it to be charged with serine by serS-encoded seryl-tRNA synthetase. But it cannot insert serine at a UGA codon. Only after it has been converted to selenocysteyl-tRNAsec by the action of selenocysteine synthetase can it recognize certain UGAs as sense codons and insert selenocysteine there. Recognition of UGA as a sense codon and insertion of selenocysteine depends on adjacent sequences in mRNA termed SECIS (selenocysteine insertion sequence) and a special elongation factor termed SELB, the product of selB, which acts in place of EF-Tu in this special case.&lt;br /&gt;Selenophosphate donates selenium to seryl-tRNAsec thereby converting it to selenocysteyl-tRNAsec. The physiological source of its selenium, shown here as selenide, is not known for certain. 
Selenide is synthesized, via selenite, from selenate by the same enzymes that reduce sulfate, via sulfite, to sulfide.&lt;br /&gt;Selenocysteine is recognized as the 21st amino acid in ribosome-mediated protein synthesis and its specific incorporation is directed by the UGA codon. Unique tRNAs that have complementary UCA anticodons are aminoacylated with serine, the seryl-tRNA is converted to selenocysteyl-tRNA and the latter binds specifically to a special elongation factor and is delivered to the ribosome. Recognition elements within the mRNAs are essential for translation of UGA as selenocysteine. A reactive oxygen-labile compound, selenophosphate, is the selenium donor required for synthesis of selenocysteyl-tRNA. Selenophosphate synthetase, which forms selenophosphate from selenide and ATP, is found in various prokaryotes, eukaryotes, and archaebacteria. The distribution and properties of selenocysteine-containing enzymes and proteins that have been discovered to date are discussed. Artificial selenoenzymes such as selenosubtilisin have been produced by chemical modification. Genetic engineering techniques also have been used to replace cysteine residues in proteins with selenocysteine. The mechanistic roles of selenocysteine residues in the glutathione peroxidase family of enzymes, the 5' deiodinases, formate dehydrogenases, glycine reductase, and a few hydrogenases are discussed. In some cases a marked decrease in catalytic activity of an enzyme is observed when a selenocysteine residue is replaced with cysteine. This substitution caused complete loss of glycine reductase selenoprotein A activity.&lt;br /&gt;Pyrrolysine is a lysine derivative encoded by the UAG codon in methylamine methyltransferase genes of Methanosarcina barkeri. Near a methyltransferase gene cluster is the pylT gene, which encodes an unusual transfer RNA (tRNA) with a CUA anticodon. 
The adjacent pylS gene encodes a class II aminoacyl-tRNA synthetase that charges the pylT-derived tRNA with lysine but is not closely related to known lysyl-tRNA synthetases. Homologs of pylS and pylT are found in a Gram-positive bacterium. Charging a tRNA(CUA) with lysine is a likely first step in translating UAG amber codons as pyrrolysine in certain methanogens. Our results indicate that pyrrolysine is the 22nd genetically encoded natural amino acid.&lt;br /&gt;Some methanogenic archaea can begin methane formation with methyltransferases that demethylate trimethylamine, dimethylamine, or monomethylamine. The genes encoding these methyltransferases each contain an in-frame UAG (amber) codon that does not cease translation during synthesis of the full-length methyltransferase. Recently, the structure of the monomethylamine methyltransferase MtmB was determined. A novel amino acid was observed at the amber codon position whose structure is lysine with the epsilon nitrogen in amide linkage with a pyrroline ring. The name pyrrolysine has been proposed for this residue. In addition, a tRNA is found cotranscribed with a gene for an unusual lysyl-tRNA synthetase in methanogens possessing methylamine methyltransferase genes. This tRNA has a CUA anticodon, and is a naturally occurring amber decoding tRNA. Taken together, these data indicate that pyrrolysine is a novel genetically encoded amino acid, the first discovered since selenocysteine was found to be encoded by UGA in 1986. TWO YEARS AFTER Dr. 
Stevens' prediction that DNA coded for 23 amino acids!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111489156537385768?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111489156537385768/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111489156537385768' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111489156537385768'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111489156537385768'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/04/helyxzion-21st-and-22nd-amino-acid.html' title='Helyxzion: THE 21ST AND 22ND AMINO ACID:'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111489086401330776</id><published>2005-04-30T12:54:00.000-07:00</published><updated>2005-04-30T12:54:24.013-07:00</updated><title type='text'>HELYXION “ANVIL” advanced nucleotide visual interpretive language</title><content type='html'>&lt;a href="http://helyxzion.blogspot.com/"&gt;genedoctor&lt;/a&gt;HELYXION “ANVIL” advanced nucleotide visual interpretive language&lt;br /&gt;For the last few decades, scientists have been intently sequencing the genes of dozens of organisms, from bacteria to humans. 
The effort, which culminated in 2000 with the sequencing of the human genome's roughly thirty thousand genes, reflects researchers' increasing adeptness at "reading" the language of DNA. It's a biological literacy that has meant dramatic advances in understanding the genetic basis of health and disease, bringing with them the promise of safer and more effective drugs.&lt;br /&gt;But now Helyxzion's small group of researchers is looking to a far more ambitious goal than simply sequencing DNA and “reading” the sequence of genetic material: they are attempting to write entirely new genomes from scratch. In essence, they hope to create new synthetic forms of life, the likes of which have never before existed, by painstakingly spelling out exact sequences of DNA that hold all the instructions for the new organisms.&lt;br /&gt;It is biotech's most brazen attempt, so far, to play God. Dr. Stevens, a visionary of genomics, is leading the charge. After all, it was Dr. Stevens, the president and founder of Helyxzion, who headed the controversial private effort to “crack the code” and start reading DNA, and to do so ahead of the public Human Genome Project. He is working with joint venture partners Dr. Water Battistutti of BioChemicon and Dr. Malcolm J. Simons, the founder of Simons Haplomics and the inventor of far-reaching patents for non-coding DNA diagnostics and gene discovery. It's a project that would not only help meet Helyxzion's goal of creating high-utility microorganisms specifically designed to mop up carbon dioxide, say, or produce hydrogen fuel with the utmost efficiency; it's a project that could also upend genetic engineering itself.&lt;br /&gt;Helyxzion's objective is not merely to tweak existing life forms by inserting genes that confer specific traits, the main tactic in conventional genetic engineering. 
Instead, Helyxzion will assemble an entire genome, DNA letter by DNA letter, putting together only the genes it wants: those necessary for an organism's survival and those that will allow it to carry out a desired task. "The long-term advantage of creating an organism from a chemically synthesized genome is that it allows complete flexibility of design." No longer limited to nature's repertoire, researchers could create a wide variety of synthetic organisms, each made to perform a specific chore, such as sopping up oil slicks or producing a plastic. And because such a bacterium would devote most of its energy to its assigned job, it could, in theory, be much more efficient than a counterpart made via conventional genetic engineering.&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111489086401330776?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111489086401330776/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111489086401330776' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111489086401330776'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111489086401330776'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/04/helyxion-anvil-advanced-nucleotide.html' title='HELYXION “ANVIL” advanced nucleotide visual interpretive language'/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111488533108413545</id><published>2005-04-30T11:22:00.000-07:00</published><updated>2005-04-30T11:22:11.083-07:00</updated><title type='text'>Helyxzion is driven by the creation </title><content type='html'>&lt;a href="http://www.helyxzion.com/"&gt;Helyxzion: The Language of DNA&lt;/a&gt;Helyxzion is driven by the creation of IT tools that efficiently manipulate the vast amount of data generated by life science research. Helyxzion's research tools can be obtained through licensing agreements, joint ventures and limited partnerships. All of these choices include licensing agreements, provide immediate access to Helyxzion's current computational tools, and offer the flexibility to switch to new products as they mature. In addition, license agreements contribute to the bottom line by allowing for cost savings and a smaller commitment of capital.&lt;br /&gt;The challenge of crafting an effective licensing agreement involves identifying areas where buyer and seller have conflicting goals. The agreement must reconcile the strategic concerns of Helyxzion (the vendor, or licensor) and the research company (the licensee) while addressing cost issues, competition risks, and the exclusivity of the licensed Bioinformatics tools. More importantly, the Bioinformatics field is affected by governmental involvement, intellectual property considerations and royalty arrangements that are not relevant to traditional software licenses.&lt;br /&gt;Basic Terms&lt;br /&gt;The basic form of a licensing agreement involving Helyxzion's Bioinformatics tools does not vary much from a standard software or technology license. 
For example, the grant of license, representations and warranties, acceptance and delivery provisions, confidentiality limitations, termination rights, indemnification and limitation of liability clauses are standard fare in both types of agreements.&lt;br /&gt;It has often been said, however, that the devil is in the details. Accordingly, small refinements to the legal clauses in Helyxzion's agreements should reflect the nature of the biotech industry. For example, in other industries, it is typically a foreign notion that physical injury can result from the use of software and IT tools. However, IT vendors that are party to Bioinformatics agreements may require that research companies indemnify them from future legal claims based on death, illness, or bodily injury arising out of the use of the Bioinformatics tool by the licensee.&lt;br /&gt;Managing Risk&lt;br /&gt;One major concern for labs and research companies is the competition risk posed by other players in the industry working on similar projects. One way for research companies to secure a competitive advantage is to have exclusive access to functional research tools. However, exclusivity has costs. To lock out other competitors, research companies need to provide Helyxzion with enough incentive to cover the profits otherwise available through multiple license agreements.&lt;br /&gt;One way to reduce exclusivity costs is to take a niche-market approach and fashion licenses that exclude specific companies from using the same tools and information to develop similar products for similar markets. Human Genome Sciences (HGS) and SmithKline Beecham entered into a license agreement of this kind worth over $125 million in 1993. The agreement was subsequently updated to allow several companies into the niche market that had previously been excluded. 
Ultimately, the value of exclusive agreements depends on the usefulness and uniqueness of the Helyxzion technology and on how narrowly or broadly the niche market is defined.&lt;br /&gt;Because Helyxzion views exclusivity provisions as obstacles to future revenue, it would prefer to define the niche market narrowly by limiting the list of excluded companies to the licensee’s direct competitors. However, licensees with greater bargaining power may be in a position to broaden the boundaries of exclusivity by preventing Helyxzion from licensing the same Bioinformatics tools to entities that are, in the licensee’s opinion, similarly situated to the licensee (e.g., those that engage in biomanufacturing or in drug delivery research). Whatever the case, exclusivity arrangements may raise anticompetitive concerns.&lt;br /&gt;In some cases, the competition risk is between the licensor and licensee. As research companies develop tools for internal use, they have the opportunity to assume the role Helyxzion plays and license those tools to others. Selling computational tools allows a research company to hedge against the cost and unpredictability of its own research efforts. The risk borne in exchange is the possibility of another company using the proprietary tools to make advancements in the research company’s core product field. Once again, a carefully drafted agreement can help alleviate this concern through the use of niche agreements that protect a company’s prime market.&lt;br /&gt;It is important to note that public entities (e.g., government-funded labs and universities) that license Helyxzion's technology may be limited in their ability to agree to exclusivity provisions. Typically, a public entity’s main goal is to advance scientific discovery for the public good by ensuring that the Bioinformatics tools are widely available to both academic and commercial scientists. 
Because federal grants are typically tied to the advancement of such goals, public entities are hampered in their ability to agree to exclusivity provisions.&lt;br /&gt;Open data sources&lt;br /&gt;The data sources underlying Bioinformatics research are often independent of the computational applications. Primary protein and nucleic acid sequence databases are pervasive; two well-known genome databases are the public GenBank database and Celera's private subscription database. Other genome databases are available as well. Currently, the patent rights to the burgeoning body of life science data are still murky. As patent issues are resolved, the accessibility of these data systems may be affected.&lt;br /&gt;Research companies that rely on open data sources should protect themselves by transferring liability and risk to Helyxzion, obligating Helyxzion to ensure that the research company has ongoing access to the data source. Such additional requirements on Helyxzion may raise the price of the license, but Helyxzion's transaction costs associated with arranging data access could be spread out among multiple licensees.&lt;br /&gt;To mitigate costs, Helyxzion may engage in co-exclusive licenses with data source providers. A co-exclusive license would require each company to use the other as the sole provider of services. The data provider can market the system as a whole, but bear only the costs of maintaining the data. Helyxzion realizes similar efficiencies while reducing transaction costs for licensees.&lt;br /&gt;Depending upon the relative bargaining power of the parties to the license agreement, such obligations to provide access to data or co-exclusive licensing arrangements may not be viable alternatives. 
Another option for the research company would be to negotiate a termination right within the license agreement, granting the research company an option to terminate for cause should access to open data be affected. This course of action, however, is merely a quick fix that allows the research company to stop making licensing or royalty payments to Helyxzion once access to open data has been affected. The burden of securing access to open data sources still rests with the research company.&lt;br /&gt;Intellectual Property&lt;br /&gt;While innovation drives success, use of intellectual property laws can alleviate competitive harms. The nature of the protection will be a key element in determining the bargaining value of the Helyxzion products. Software patent and copyright protections can protect the source code and hardware involved in IT tools. For Bioinformatics license agreements, however, the patenting of the underlying algorithms within the software itself presents an interesting new issue.&lt;br /&gt;Courts have recognized that certain non-abstract mathematical algorithms used in computational calculations are patentable. This may allow software vendors to patent the underlying methodology of their systems. With a patent in hand, the licensor gains bargaining power in determining licensing terms. Currently, the field is in its infancy. The USPTO reported 110 pending applications in its Bioinformatics division as of October 1, 2000, with the number of applications expected to increase. Therefore, the possibility of patent infringement remains a constant issue in any licensing agreement.&lt;br /&gt;In typical license agreements, patent infringement risk is allocated to the licensor, or in this case Helyxzion. However, both sides to a transaction should engage in due diligence with regard to ownership rights. 
Licensors need to prepare for the economic impact of infringement claims, while licensees should be aware of potential disruption to their biotech production or research efforts.&lt;br /&gt;Trade Secrets&lt;br /&gt;Helyxzion's technology is, for the most part, a trade secret. Generally, a trade secret is information that is both "secret" (not generally known in the trade or readily discernible) and commercially valuable to its owner. Typical trade secrets include secret recipes, compositions of matter, formulas, mathematical algorithms, computer software and applications, processes, patterns, devices, or compilations of information such as customer lists and business plans. Trade secrets can be lost when the owner does not protect the confidentiality of the information, or when a third party independently develops the information and makes it public. Trade secrets are primarily protected under state law. However, the federal Economic Espionage Act of 1996 (EEA) criminalizes the interstate theft of certain trade secrets.&lt;br /&gt;&lt;br /&gt;Public Entity Interactions&lt;br /&gt;Few industries bear as close a relationship with public institutions as the life sciences. Life science companies are constantly exposed to competitive effects not only from each other, but also from academics and public researchers. And as stated previously, public entities follow a different paradigm than private companies regarding research access policies.&lt;br /&gt;Public institutions can play the role of both licensor and licensee for many of the new emerging technologies in Bioinformatics. Many public institutions operate on the principles of open information sharing. Shared genomic databases are being created by consortiums of public entities. For example, the National Institutes of Health (“NIH”) maintains the GenBank database, which relies on public submissions. It also provides a variety of research tools to the public to help with analyzing data. 
In conjunction with the open sharing environment, public institutions are licensing their computational tools to private institutions. The licenses pose significant competitive harm to private firms. In addition, the open sharing environment between public institutions destabilizes pricing efforts as the market is forced to contend with free alternatives.&lt;br /&gt;For private firms, licensing technology from public institutions is a convenient way to establish relationships with experts in the field. However, obtaining a license from public institutions carries additional risks. Under the Bayh-Dole Act (the University and Small Business Patent Procedures Act of 1980), the federal government allows certain businesses and non-profit institutions to retain rights to technology developed with federal funds, either through grants or contracts. The Bayh-Dole Act and subsequent supporting statutes allow the government to retain rights in the technology, such as a non-exclusive license to practice the invention on behalf of the United States, the right to prohibit future assignments, and a portion of the royalties, to name a few. A license arrangement should consider whether government involvement is possible and work to minimize the impact in the event that the government chooses to exercise its rights.&lt;br /&gt;Licenses with public institutions typically contain the same terms and agreements as any other licensing agreement. The Working Group on Research Tools at the NIH reported that, to overcome the dissatisfaction of public entities that are party to these transactions, improved access to research tools for public researchers should be considered. For example, the NIH focuses on free access to tools without legal agreements and the use of standard licenses to speed up the transfer of tools and lessen the time needed to negotiate licensing terms. 
Use of a standard form or a modified version of the standard Materials Transfer Agreements typically used for biotechnology licenses may allow an IT vendor to reach agreements with public institutions more effectively. The benefits gained are lower transaction costs, along with the added revenue provided from these sources.&lt;br /&gt;&lt;br /&gt;Royalty Arrangements&lt;br /&gt;A somewhat controversial compensation method available in the biotechnology arena is reach-through licensing. Rather than paying for the Helyxzion technology at the outset, research companies that negotiate for reach-through royalty provisions defer the payments by granting Helyxzion a stake in the final product developed with the licensed technology. Depending on the negotiated terms, Helyxzion may claim a percentage of the sale of the developed product as a royalty payment or demand some sort of ownership or licensing rights to the developed product. For Helyxzion's licensing to public entities, this is one method of making tools available at minimal cost in exchange for a potentially valuable future stake.&lt;br /&gt;Given the uncertainty involved in any type of science research, research companies that invest heavily in tools early in the research process may not see early economic gains. Reach-through royalty provisions provide an effective method for gaining access to resources when the future of the product is still in question. However, the research company may run into problems if reach-through royalties become the norm rather than an exception. If reach-through terms are negotiated with multiple IT vendors, the research company faces a stacking problem where multiple agreements dilute the value of the end product.&lt;br /&gt;For Helyxzion, reach-through rights provide an alternative to standard pricing agreements. In exchange for valuable computational methods, vendors can enjoy rights to the vast array of potential end products. 
However, this strategy bears two substantial risks. First, Helyxzion must assume a greater cost burden in the short term. While Bioinformatics may speed up research efforts, it will still be a matter of years before viable products are produced and generate sales revenue. In the interim, Helyxzion faces reduced cash flows while waiting for royalties to materialize. Licensors need to carefully evaluate the financial implications of pursuing such deferred payment plans.&lt;br /&gt;The second risk is the task of monitoring. In addition to the financial hurdles, Helyxzion, if it wants a piece of any royalty jackpot, will be forced to audit the research company’s research efforts. To successfully claim royalty rights, Helyxzion will need to show integral use of its product in the development cycle. The cost of monitoring is ongoing, from early-stage research and development through production. Add the costs of enforcing the agreement when valuable end products are at stake, and the value of reach-through provisions may be reduced.&lt;br /&gt;Conclusion&lt;br /&gt;As the Bioinformatics field develops, there is opportunity to structure deals in new and unique ways. Parties to agreements need flexibility on pricing and terms as the field matures. Efficient licensing agreements in Bioinformatics need to balance risks without burdening the agreement with needless complexity. Effective licenses will involve careful choices regarding the nature of the license, proper management of intellectual property, and the allocation of risks and rewards.
&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12538363-111488533108413545?l=helyxzion.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://helyxzion.blogspot.com/feeds/111488533108413545/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=12538363&amp;postID=111488533108413545' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111488533108413545'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12538363/posts/default/111488533108413545'/><link rel='alternate' type='text/html' href='http://helyxzion.blogspot.com/2005/04/helyxzion-is-driven-by-creation.html' title='Helyxzion is driven by the creation '/><author><name>Dr.Charles Stevens</name><uri>http://www.blogger.com/profile/09002150879563547497</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12538363.post-111488510067743622</id><published>2005-04-30T11:18:00.000-07:00</published><updated>2005-04-30T11:18:20.676-07:00</updated><title type='text'>HELXYZION AND THE HISTORY OF DNA</title><content type='html'>&lt;a href="http://www.google.com/webhp?sourceid=navclient&amp;amp;ie=UTF-8&amp;amp;rls=RNWE,RNWE:2005-12,RNWE:en"&gt;Google&lt;/a&gt;•	Simplicity – Understanding DNA with the Helyxzion viewer is much simpler and easier to learn than Perl. Perl has a complex syntax where most of the characters on the keyboard have special meanings you have to memorise. As a result, you'll need to be an expert programmer to learn Perl. DNA is just the opposite. 
To get started with DNA, all you need to learn are four basic nucleotides, plus a few simple rules governing how they translate into useful biology. &lt;br /&gt;•	Maintainability - people often refer to Perl code as 'write-only'. If you come back to a Perl script after a couple of months you probably won't be able to understand it anymore. DNA can be left for millions of years and remains perfectly readable. A Perl script is correct if it's halfway readable and gets the job done before your boss fires you. DNA is correct if it's halfway readable and gets the job done before you become extinct. &lt;br /&gt;•	Graphical user interface - biology provides DNA with GUI facilities that are amazingly simple and powerful. There's nothing quite like it in any other language. &lt;br /&gt;•	Development tools - no more debugging or having to remember to make backups. DNA is self-repairing and self-replicating. &lt;br /&gt;•	Efficiency - DNA has parallel processing and built-in genetic algorithms, thus fitting enormous power into the size of a very small thing. &lt;br /&gt;•	Scope - DNA can be used in many more situations, for many more purposes, than Perl. Perl is really a text manipulation language, whereas DNA is a more general application development language. &lt;br /&gt;Arguing whether Perl is better than DNA is like arguing whether daddy is better than chips. DNA has brought the world beer, pie, camels and Buffy. What has Perl ever done for us?&lt;br /&gt;&lt;br /&gt;Leonard Adleman, the man who invented the "DNA computer," claims that the riches lie along the side of the road rather than at the end of the rainbow. Engineers working toward replacing computers with molecular-sized DNA machinery, he said, will discover ways to use DNA for all manner of applications, and those discoveries will overshadow their progress toward developing a true DNA computer.
&lt;br /&gt;"I don't expect that we will ever have a PC that's a DNA computer, for instance, but we will be able to do things with DNA that we can't with any other type of technology," said Adleman, a professor at the University of Southern California.&lt;br /&gt;The amazing density of DNA as an information storage medium (a single cubic centimeter of DNA holds more information than a trillion CDs) prompted Adleman's first demonstration, in 1994, that DNA molecules could solve computational problems. He used strips of DNA like the tape in a Turing machine, and performed read and write operations with the tools of genetic engineering. In 2000, Adleman upped the ante by using more modern genetic engineering techniques to solve a six-variable problem, albeit one that some humans could solve by hand. His next milestone, slated for 2002, is a 20-variable problem that any human being would find daunting.&lt;br /&gt;"These demonstrations are to show people how DNA can be directed to process information in very specific ways, but I think others will think of much cooler applications," Adleman said.&lt;br /&gt;DNA computers work by encoding the problem to be solved in the language of DNA: the base-four values A, T, C and G. Using this base-four number system, the solution to any conceivable problem can be encoded along a DNA strand in the manner of a Turing machine tape. Every possible sequence can be chemically created in a test tube on trillions of different DNA strands, and the strands encoding correct solutions can then be isolated using genetic engineering tools.&lt;br /&gt;This massive process-of-elimination method of finding solutions to problems, a kind of Darwinian survival of the fittest at the molecular level, has evolved to become the universal method of storing and processing information in living things. Plants, animals, humans, bacteria, viruses: literally all living things use DNA to store and process the biological information that directs the processes of life. 
The density of this information is enormous, scaling all the way up to the history of an entire species and scaling down to individual molecules. DNA is essentially godlike in that it "remembers" the history of a species, from single cells to higher animals.&lt;br /&gt;"We have in our hands the legacy from about 3 billion years of evolution, which we have never been able to tap into before now. DNA's legacy is the machinery inside the living cell," Adleman explained.&lt;br /&gt;&lt;br /&gt;The history of genetics undoubtedly goes back to "Let there be light," or, at the least, "Go forth and multiply." In his White House speech on June 26, 2000, marking the completion of the first draft of the Human Genome Project, former President Clinton called the genome "the language in which God created life." For those of a less religious inclination, genetics is certainly the language of nature and all living things.&lt;br /&gt;It is probably impossible to give a complete and accurate history of genetics. Too many people were involved in too many experiments and discoveries. But here, in a timeline format, we’ll attempt to look at some of the more important discoveries and the people who made them.&lt;br /&gt;&lt;br /&gt;THE LATE 1800s: FUNDAMENTALS&lt;br /&gt;&lt;br /&gt;In 1859 – just two years before the American Civil War began – Charles Darwin published his treatise on evolution. This work, "On the Origin of Species," was so exciting and controversial that it is still the subject of emotional debate 142 years later. (Is evolution taught in your school system?) It cast such a long and dark shadow that some of the most basic and fundamental scientific discoveries in history struggled to be noticed. Case in point: Gregor Mendel.&lt;br /&gt;1865 – An Augustinian monk named Gregor Mendel presented his experiments on peas to the Natural Science Society in Brünn, Austria. His research dealt with invisible "factors" (later to be called genes) that provided the basis for visible traits. 
His work was widely ignored until 1900, when it was rediscovered by Hugo de Vries, Erich von Tschermak and Carl Correns. Today, Mendel is considered the Father of Modern Genetics.&lt;br /&gt;1866 – Mendel publishes "Experiments on Plant Hybridization."&lt;br /&gt;1871 – Friedrich Miescher successfully isolates nuclein from pus cells obtained from discarded bandages. Nuclein contains nucleic acid. It would be several generations before a connection was made between nucleic acid and heredity.&lt;br /&gt;- DNA was isolated from the sperm of trout.&lt;br /&gt;- Darwin published his follow-up to "On the Origin of Species" titled "The Descent of Man, and Selection in Relation to Sex."&lt;br /&gt;1875 – Darwin proposed the 
