Tag archive: Matemática

Bits of Mystery DNA, Far From ‘Junk,’ Play Crucial Role (N.Y.Times)

By GINA KOLATA

Published: September 5, 2012

Among the many mysteries of human biology is why complex diseases like diabetes, high blood pressure and psychiatric disorders are so difficult to predict and, often, to treat. An equally perplexing puzzle is why one individual gets a disease like cancer or depression, while an identical twin remains perfectly healthy.

Béatrice de Géa for The New York Times. “It is like opening a wiring closet and seeing a hairball of wires,” Mark Gerstein of Yale University said of the DNA intricacies.

Now scientists have discovered a vital clue to unraveling these riddles. The human genome is packed with at least four million gene switches that reside in bits of DNA that once were dismissed as “junk” but that turn out to play critical roles in controlling how cells, organs and other tissues behave. The discovery, considered a major medical and scientific breakthrough, has enormous implications for human health because many complex diseases appear to be caused by tiny changes in hundreds of gene switches.

The findings, which are the fruit of an immense federal project involving 440 scientists from 32 laboratories around the world, will have immediate applications for understanding how alterations in the non-gene parts of DNA contribute to human diseases, which may in turn lead to new drugs. They can also help explain how the environment can affect disease risk. In the case of identical twins, small changes in environmental exposure can slightly alter gene switches, with the result that one twin gets a disease and the other does not.

As scientists delved into the “junk” — parts of the DNA that are not actual genes containing instructions for proteins — they discovered a complex system that controls genes. At least 80 percent of this DNA is active and needed. The result of the work is an annotated road map of much of this DNA, noting what it is doing and how. It includes the system of switches that, acting like dimmer switches for lights, control which genes are used in a cell and when they are used, and determine, for instance, whether a cell becomes a liver cell or a neuron.

“It’s Google Maps,” said Eric Lander, president of the Broad Institute, a joint research endeavor of Harvard and the Massachusetts Institute of Technology. In contrast, the project’s predecessor, the Human Genome Project, which determined the entire sequence of human DNA, “was like getting a picture of Earth from space,” he said. “It doesn’t tell you where the roads are, it doesn’t tell you what traffic is like at what time of the day, it doesn’t tell you where the good restaurants are, or the hospitals or the cities or the rivers.”

The new result “is a stunning resource,” said Dr. Lander, who was not involved in the research that produced it but was a leader in the Human Genome Project. “My head explodes at the amount of data.”

The discoveries were published on Wednesday in six papers in the journal Nature and in 24 papers in Genome Research and Genome Biology. In addition, The Journal of Biological Chemistry is publishing six review articles, and Science is publishing yet another article.

Human DNA is “a lot more active than we expected, and there are a lot more things happening than we expected,” said Ewan Birney of the European Molecular Biology Laboratory-European Bioinformatics Institute, a lead researcher on the project.

In one of the Nature papers, researchers link the gene switches to a range of human diseases — multiple sclerosis, lupus, rheumatoid arthritis, Crohn’s disease, celiac disease — and even to traits like height. In large studies over the past decade, scientists found that minor changes in human DNA sequences increase the risk that a person will get those diseases. But those changes were in the junk, now often referred to as the dark matter — they were not changes in genes — and their significance was not clear. The new analysis reveals that a great many of those changes alter gene switches and are highly significant.

“Most of the changes that affect disease don’t lie in the genes themselves; they lie in the switches,” said Michael Snyder, a Stanford University researcher for the project, called Encode, for Encyclopedia of DNA Elements.

And that, said Dr. Bradley Bernstein, an Encode researcher at Massachusetts General Hospital, “is a really big deal.” He added, “I don’t think anyone predicted that would be the case.”

The discoveries also can reveal which genetic changes are important in cancer, and why. As they began determining the DNA sequences of cancer cells, researchers realized that most of the thousands of DNA changes in cancer cells were not in genes; they were in the dark matter. The challenge is to figure out which of those changes are driving the cancer’s growth.

“These papers are very significant,” said Dr. Mark A. Rubin, a prostate cancer genomics researcher at Weill Cornell Medical College. Dr. Rubin, who was not part of the Encode project, added, “They will definitely have an impact on our medical research on cancer.”

In prostate cancer, for example, his group found mutations in important genes that are not readily attacked by drugs. But Encode, by showing which regions of the dark matter control those genes, gives another way to attack them: target those controlling switches.

Dr. Rubin, who also used the Google Maps analogy, explained: “Now you can follow the roads and see the traffic circulation. That’s exactly the same way we will use these data in cancer research.” Encode provides a road map with traffic patterns for alternate ways to go after cancer genes, he said.

Dr. Bernstein said, “This is a resource, like the human genome, that will drive science forward.”

The system, though, is stunningly complex, with many redundancies. Just the idea of so many switches was almost incomprehensible, Dr. Bernstein said.

There also is a sort of DNA wiring system that is almost inconceivably intricate.

“It is like opening a wiring closet and seeing a hairball of wires,” said Mark Gerstein, an Encode researcher from Yale. “We tried to unravel this hairball and make it interpretable.”

There is another sort of hairball as well: the complex three-dimensional structure of DNA. Human DNA is such a long strand — about 10 feet of DNA stuffed into a microscopic nucleus of a cell — that it fits only because it is tightly wound and coiled around itself. When they looked at the three-dimensional structure — the hairball — Encode researchers discovered that small segments of dark-matter DNA are often quite close to genes they control. In the past, when they analyzed only the uncoiled length of DNA, those controlling regions appeared to be far from the genes they affect.

The project began in 2003, as researchers began to appreciate how little they knew about human DNA. In recent years, some began to find switches in the 99 percent of human DNA that is not genes, but they could not fully characterize or explain what a vast majority of it was doing.

The thought before the start of the project, said Thomas Gingeras, an Encode researcher from Cold Spring Harbor Laboratory, was that only 5 to 10 percent of the DNA in a human being was actually being used.

The big surprise was not only that almost all of the DNA is used but also that a large proportion of it is gene switches. Before Encode, said Dr. John Stamatoyannopoulos, a University of Washington scientist who was part of the project, “if you had said half of the genome and probably more has instructions for turning genes on and off, I don’t think people would have believed you.”

By the time the National Human Genome Research Institute, part of the National Institutes of Health, embarked on Encode, major advances in DNA sequencing and computational biology had made it conceivable to try to understand the dark matter of human DNA. Even so, the analysis was daunting — the researchers generated 15 trillion bytes of raw data. Analyzing the data required the equivalent of more than 300 years of computer time.

Just organizing the researchers and coordinating the work was a huge undertaking. Dr. Gerstein, one of the project’s leaders, has produced a diagram of the authors with their connections to one another. It looks nearly as complicated as the wiring diagram for the human DNA switches. Now that part of the work is done: the hundreds of authors have written their papers.

“There is literally a flotilla of papers,” Dr. Gerstein said. But, he added, more work has yet to be done — there are still parts of the genome that have not been figured out.

That, though, is for the next stage of Encode.

*   *   *

Published: September 5, 2012

Rethinking ‘Junk’ DNA

A large group of scientists has found that so-called junk DNA, which makes up most of the human genome, does much more than previously thought.

GENES: Each human cell contains about 10 feet of DNA, coiled into a dense tangle. But only a very small percentage of DNA encodes genes, which control inherited traits like eye color, blood type and so on.

JUNK DNA: Stretches of DNA around and between genes seemed to do nothing, and were called junk DNA. But now researchers think that the junk DNA contains a large number of tiny genetic switches, controlling how genes function within the cell.

REGULATION: The many genetic regulators seem to be arranged in a complex and redundant hierarchy. Scientists are only beginning to map and understand this network, which regulates how cells, organs and tissues behave.

DISEASE: Errors or mutations in genetic switches can disrupt the network and lead to a range of diseases. The new findings will spur further research and may lead to new drugs and treatments.

 

Evolution could explain the placebo effect (New Scientist)

06 September 2012 by Colin Barras

Magazine issue 2881

ON THE face of it, the placebo effect makes no sense. Someone suffering from a low-level infection will recover just as nicely whether they take an active drug or a simple sugar pill. This suggests people are able to heal themselves unaided – so why wait for a sugar pill to prompt recovery?

New evidence from a computer model offers a possible evolutionary explanation, and suggests that the immune system has an on-off switch controlled by the mind.

It all starts with the observation that something similar to the placebo effect occurs in many animals, says Peter Trimmer, a biologist at the University of Bristol, UK. For instance, Siberian hamsters do little to fight an infection if the lights above their lab cage mimic the short days and long nights of winter. But changing the lighting pattern to give the impression of summer causes them to mount a full immune response.

Likewise, people who think they are taking a drug but are really receiving a placebo can have a response twice that of those who receive no pills (Annals of Family Medicine, doi.org/cckm8b). In Siberian hamsters and people, the intervention creates a mental cue that kick-starts the immune response.

There is a simple explanation, says Trimmer: the immune system is costly to run – so costly that a strong and sustained response could dangerously drain an animal’s energy reserves. In other words, as long as the infection is not lethal, it pays to wait for a sign that fighting it will not endanger the animal in other ways.

Nicholas Humphrey, a retired psychologist formerly at the London School of Economics, first proposed this idea a decade ago, but only now has evidence to support it emerged from a computer model designed by Trimmer and his colleagues.

According to Humphrey’s picture, the Siberian hamster subconsciously acts on a cue that it is summer because food supplies to sustain an immune response are plentiful at that time of year. We subconsciously respond to treatment – even a sham one – because it comes with assurances that it will weaken the infection, allowing our immune response to succeed rapidly without straining the body’s resources.

Trimmer’s simulation is built on this assumption – that animals need to spend vital resources on fighting low-level infections. The model revealed that, in challenging environments, animals lived longer and sired more offspring if they endured infections without mounting an immune response. In more favourable environments, it was best for animals to mount an immune response and return to health as quickly as possible (Evolution and Human Behavior, doi.org/h8p). The results show a clear evolutionary benefit to switching the immune system on and off depending on environmental conditions.
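
To make the trade-off concrete, here is a minimal sketch in Python of the kind of cost-benefit bookkeeping such a simulation performs. It is not Trimmer's published model: the infection rates, energy costs and reproduction rule below are all invented for illustration.

```python
import random

def lifetime_fitness(respond, food_per_day, days=365, seed=0):
    """Toy forager: earns food daily, pays infection costs, and converts
    surplus energy into offspring. Every number here is illustrative."""
    rng = random.Random(seed)
    energy, offspring, infected = 5.0, 0, False
    for _ in range(days):
        energy += food_per_day - 1.0        # daily income minus metabolism
        if not infected and rng.random() < 0.03:
            infected = True                 # pick up a low-level infection
        if infected:
            if respond:
                energy -= 3.0               # immune response is expensive...
                if rng.random() < 0.5:
                    infected = False        # ...but clears the infection fast
            else:
                energy -= 0.2               # tolerate: small chronic drain...
                if rng.random() < 0.1:
                    infected = False        # ...and slow spontaneous clearance
        if energy <= 0:
            break                           # reserves exhausted
        if not infected and energy > 2.0:
            offspring += 1                  # spare, healthy days yield offspring
            energy -= 1.0
    return offspring

for env, food in [("harsh winter", 1.1), ("rich summer", 3.0)]:
    for respond in (True, False):
        runs = [lifetime_fitness(respond, food, seed=s) for s in range(200)]
        print(f"{env}, respond={respond}: mean offspring {sum(runs)/len(runs):.0f}")
```

With these particular invented numbers, tolerating tends to win in the harsh environment and responding in the rich one, echoing the qualitative result reported from the model; different numbers could easily reverse that, which is why the published work explores the parameter space systematically.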

“I’m pleased to see that my theory stands up to computational modelling,” says Humphrey. If the idea is right, he adds, it means we have misunderstood the nature of placebos. Farming and other innovations in the past 10,000 years mean that many people have a stable food supply and can safely mount a full immune response at any time – but our subconscious switch has not yet adapted to this. A placebo tricks the mind into thinking it is an ideal time to switch on an immune response, says Humphrey.

Paul Enck at the University of Tübingen in Germany says it is an intriguing idea, but points out that there are many different placebo responses, depending on the disease. It is unlikely that a single mechanism explains them all, he says.

First Holistic View of How Human Genome Actually Works: ENCODE Study Produces Massive Data Set (Science Daily)

ScienceDaily (Sep. 5, 2012) — The Human Genome Project produced an almost complete order of the 3 billion pairs of chemical letters in the DNA that embodies the human genetic code — but little about the way this blueprint works. Now, after a multi-year concerted effort by more than 440 researchers in 32 labs around the world, a more dynamic picture gives the first holistic view of how the human genome actually does its job.

William Noble, professor of genome sciences and computer science, in the data center at the William H. Foege Building. Noble, an expert on machine learning, and his team designed artificial intelligence programs to analyze ENCODE data. These computer programs can learn from experience, recognize patterns, and organize information into categories understandable to scientists. The center houses systems for a wide variety of genetic research. The computer center has the capacity to store and analyze a tremendous amount of data, the equivalent of a 670-page autobiography of each person on earth, uncompressed. The computing resources analyze over 4 petabytes of genomic data a year. (Credit: Clare McLean, Courtesy of University of Washington)

During the new study, researchers linked more than 80 percent of the human genome sequence to a specific biological function and mapped more than 4 million regulatory regions where proteins specifically interact with the DNA. These findings represent a significant advance in understanding the precise and complex controls over the expression of genetic information within a cell. The findings bring into much sharper focus the continually active genome in which proteins routinely turn genes on and off using sites that are sometimes at great distances from the genes themselves. They also identify where chemical modifications of DNA influence gene expression and where various functional forms of RNA, a form of nucleic acid related to DNA, help regulate the whole system.

“During the early debates about the Human Genome Project, researchers had predicted that only a few percent of the human genome sequence encoded proteins, the workhorses of the cell, and that the rest was junk. We now know that this conclusion was wrong,” said Eric D. Green, M.D., Ph.D., director of the National Human Genome Research Institute (NHGRI), a part of the National Institutes of Health. “ENCODE has revealed that most of the human genome is involved in the complex molecular choreography required for converting genetic information into living cells and organisms.”

NHGRI organized the research project producing these results; it is called the Encyclopedia of DNA Elements, or ENCODE. Launched in 2003, ENCODE’s goal of identifying all of the genome’s functional elements seemed just as daunting as sequencing that first human genome. ENCODE was launched as a pilot project to develop the methods and strategies needed to produce results and did so by focusing on only 1 percent of the human genome. By 2007, NHGRI concluded that the technology had sufficiently evolved for a full-scale project, in which the institute invested approximately $123 million over five years. In addition, NHGRI devoted about $40 million to the ENCODE pilot project, plus approximately $125 million to ENCODE-related technology development and model organism research since 2003.

The scale of the effort has been remarkable. Hundreds of researchers across the United States, United Kingdom, Spain, Singapore and Japan performed more than 1,600 sets of experiments on 147 types of tissue with technologies standardized across the consortium. The experiments relied on innovative uses of next-generation DNA sequencing technologies, which had only become available around five years ago, due in large part to advances enabled by NHGRI’s DNA sequencing technology development program. In total, ENCODE generated more than 15 trillion bytes of raw data and consumed the equivalent of more than 300 years of computer time to analyze.

“We’ve come a long way,” said Ewan Birney, Ph.D., of the European Bioinformatics Institute, in the United Kingdom, and lead analysis coordinator for the ENCODE project. “By carefully piecing together a simply staggering variety of data, we’ve shown that the human genome is simply alive with switches, turning our genes on and off and controlling when and where proteins are produced. ENCODE has taken our knowledge of the genome to the next level, and all of that knowledge is being shared openly.”

The ENCODE Consortium placed the resulting data sets, as soon as they were verified for accuracy and prior to publication, in several databases that can be freely accessed by anyone on the Internet. These data sets can be accessed through the ENCODE project portal (www.encodeproject.org) as well as at the University of California, Santa Cruz genome browser, http://genome.ucsc.edu/ENCODE/, the National Center for Biotechnology Information, http://www.ncbi.nlm.nih.gov/geo/info/ENCODE.html, and the European Bioinformatics Institute, http://useast.ensembl.org/Homo_sapiens/encode.html?redirect=mirror;source=www.ensembl.org.

“The ENCODE catalog is like Google Maps for the human genome,” said Elise Feingold, Ph.D., an NHGRI program director who helped start the ENCODE Project. “Simply by selecting the magnification in Google Maps, you can see countries, states, cities, streets, even individual intersections, and by selecting different features, you can get directions, see street names and photos, and get information about traffic and even weather. The ENCODE maps allow researchers to inspect the chromosomes, genes, functional elements and individual nucleotides in the human genome in much the same way.”

The coordinated publication set includes one main integrative paper and five related papers in the journal Nature; 18 papers in Genome Research; and six papers in Genome Biology. The ENCODE data are so complex that the three journals have developed a pioneering way to present the information in an integrated form that they call threads.

“Because ENCODE has generated so much data, we, together with the ENCODE Consortium, have introduced a new way to enable researchers to navigate through the data,” said Magdalena Skipper, Ph.D., senior editor at Nature, which produced the freely available publishing platform on the Internet.

Since the same topics were addressed in different ways in different papers, the new website, www.nature.com/encode, will allow anyone to follow a topic through all of the papers in the ENCODE publication set by clicking on the relevant thread at the Nature ENCODE explorer page. For example, thread number one compiles figures, tables, and text relevant to genetic variation and disease from several papers and displays them all on one page. ENCODE scientists believe this will illuminate many biological themes emerging from the analyses.

In addition to the threaded papers, six review articles are being published in the Journal of Biological Chemistry and two related papers in Science and one in Cell.

The ENCODE data are rapidly becoming a fundamental resource for researchers to help understand human biology and disease. More than 100 papers using ENCODE data have been published by investigators who were not part of the ENCODE Project, but who have used the data in disease research. For example, many regions of the human genome that do not contain protein-coding genes have been associated with disease. Instead, the disease-linked genetic changes appear to occur in vast tracts of sequence between genes where ENCODE has identified many regulatory sites. Further study will be needed to understand how specific variants in these genomic areas contribute to disease.

“We were surprised that disease-linked genetic variants are not in protein-coding regions,” said Mike Pazin, Ph.D., an NHGRI program director working on ENCODE. “We expect to find that many genetic changes causing a disorder are within regulatory regions, or switches, that affect how much protein is produced or when the protein is produced, rather than affecting the structure of the protein itself. The medical condition will occur because the gene is aberrantly turned on or turned off or abnormal amounts of the protein are made. Far from being junk DNA, this regulatory DNA clearly makes important contributions to human health and disease.”

Identifying regulatory regions will also help researchers explain why different types of cells have different properties. For example, why do muscle cells generate force while liver cells break down food? Scientists know that muscle cells turn on some genes that only work in muscle, but it has not been previously possible to examine the regulatory elements that control that process. ENCODE has laid a foundation for these kinds of studies by examining more than 140 of the hundreds of cell types found in the human body and identifying many of the cell type-specific control elements.

Despite the enormity of the dataset described in this historic collection of publications, it does not comprehensively describe all of the functional genomic elements in all of the different types of cells in the human body. NHGRI plans to invest in additional ENCODE-related research for at least another four years. During the next phase, ENCODE will increase the depth of the catalog with respect to the types of functional elements and cell types studied. It will also develop new tools for more sophisticated analyses of the data.

Journal References:

  1. Magdalena Skipper, Ritu Dhand, Philip Campbell. Presenting ENCODE. Nature, 2012; 489 (7414): 45. DOI: 10.1038/489045a
  2. Joseph R. Ecker, Wendy A. Bickmore, Inês Barroso, Jonathan K. Pritchard, Yoav Gilad, Eran Segal. Genomics: ENCODE explained. Nature, 2012; 489 (7414): 52. DOI: 10.1038/489052a
  3. The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature, 2012; 489 (7414): 57. DOI: 10.1038/nature11247

Design Help for Drug Cocktails for HIV Patients: Mathematical Model Helps Design Efficient Multi-Drug Therapies (Science Daily)

ScienceDaily (Sep. 2, 2012) — For years, doctors treating those with HIV have recognized a relationship between how faithfully patients take the drugs they prescribe, and how likely the virus is to develop drug resistance. More recently, research has shown that the relationship between adherence to a drug regimen and resistance is different for each of the drugs that make up the “cocktail” used to control the disease.

HIV is shown attaching to and infecting a T4 cell. The virus then inserts its own genetic material into the T4 cell’s host DNA. The infected host cell then manufactures copies of the HIV. (Credit: iStockphoto/Medical Art Inc.)

New research conducted by Harvard scientists could help explain why those differences exist, and may help doctors quickly and cheaply design new combinations of drugs that are less likely to result in resistance.

As described in a September 2 paper in Nature Medicine, a team of researchers led by Martin Nowak, Professor of Mathematics and of Biology and Director of the Program for Evolutionary Dynamics, have developed a technique medical researchers can use to model the effects of various treatments, and predict whether they will cause the virus to develop resistance.

“What we demonstrate in this paper is a prototype for predicting, through modeling, whether a patient at a given adherence level is likely to develop resistance to treatment,” Alison Hill, a PhD student in Biophysics and co-first author of the paper, said. “Compared to the time and expense of a clinical trial, this method offers a relatively easy way to make these predictions. And, as we show in the paper, our results match with what doctors are seeing in clinical settings.”

The hope, said Nowak, is that the new technique will take some of the guesswork out of what is now largely a trial-and-error process.

“This is a mathematical tool that will help design clinical trials,” he said. “Right now, researchers are using trial and error to develop these combination therapies. Our approach uses the mathematical understanding of evolution to make the process more akin to engineering.”

Creating a model that can make such predictions accurately, however, requires huge amounts of data.

To get that data, Hill and Daniel Scholes Rosenbloom, a PhD student in Organismic and Evolutionary Biology and the paper’s other first author, turned to Johns Hopkins University Medical School, where Professor of Medicine and of Molecular Biology and Genetics Robert F. Siliciano was working with PhD student Alireza Rabi (also co-first author) to study how the HIV virus reacted to varying drug dosages.

Such data proved critical to the model that Hill, Rabi and Rosenbloom eventually designed, because the level of the drug in patients — even those that adhere to their treatment perfectly — naturally varies. When drug levels are low — as they are between doses, or if a dose is missed — the virus is better able to replicate and grow. Higher drug levels, by contrast, may keep the virus in check, but they also increase the risk of mutant strains of the virus emerging, leading to drug resistance.

Armed with the data from Johns Hopkins, Hill, Rabi and Rosenbloom created a computer model that could predict whether and how much the virus, or a drug-resistant strain, was growing based on how strictly patients stuck to their drug regimen.

“Our model is essentially a simulation of what goes on during treatment,” Rosenbloom said. “We created a number of simulated patients, each of whom had different characteristics, and then we said, ‘Let’s imagine these patients have 60 percent adherence to their treatment — they take 60 percent of the pills they’re supposed to.’ Our model can tell us what their drug concentration is over time, and based on that, we can say whether the virus is growing or shrinking, and whether they’re likely to develop resistance.”
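
Here is a minimal sketch of the adherence-to-resistance logic Rosenbloom describes, not the published model: drug concentration decays between doses, doses are skipped at a given adherence level, and a wild-type strain and a hypothetical partially resistant strain grow whenever the concentration is too low to suppress them. Every pharmacological number below is an assumption made up for illustration.

```python
import math
import random

def simulate(adherence, days=180, seed=1):
    """Toy pharmacodynamics for one drug and one partially resistant mutant.
    Doses, half-life, IC50s and growth rates are invented for illustration."""
    rng = random.Random(seed)
    decay = math.log(2) / 1.0              # assumed drug half-life: 1 day
    ic50 = {"wild": 1.0, "mutant": 5.0}    # mutant needs 5x the drug (assumed)

    def net_growth(strain, conc):
        inhibition = conc / (conc + ic50[strain])   # simple Hill-type inhibition
        return 0.5 * (1.0 - inhibition) - 0.3       # replication minus clearance

    conc, wild, mutant = 0.0, 1e4, 1.0     # drug level and viral loads (arbitrary)
    for _ in range(days):
        if rng.random() < adherence:
            conc += 4.0                    # today's dose was actually taken
        conc *= math.exp(-decay)           # drug is cleared over the day
        new_mutants = wild * 1e-5 * max(net_growth("wild", conc), 0.0)
        wild *= math.exp(net_growth("wild", conc))
        mutant = mutant * math.exp(net_growth("mutant", conc)) + new_mutants
    return wild, mutant

for adh in (0.4, 0.6, 0.8, 1.0):
    w, m = simulate(adh)
    print(f"adherence {adh:.0%}: wild-type {w:12.1f}   resistant {m:12.1f}")
```

Under these invented numbers, dropping adherence widens the window in which the partially resistant strain can replicate; the published model performs the same bookkeeping with measured dose-response parameters for each drug in a cocktail.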

The model’s predictions, Rosenbloom explained, can then serve as a guide to researchers as they work to design new drug cocktails to combat HIV.

While their model does hold out hope for simplifying the process of designing drug “cocktails,” Hill and Rosenbloom said they plan to continue refining the model to take additional factors — such as multiple drug-resistant strains of the virus and varying drug concentrations in other parts of the body — into account.

“The prototype we have so far looks at concentrations of drugs in blood plasma,” Rosenbloom explained. “But a number of drugs don’t penetrate other parts of the body, like the brain or the gut, with the same efficiency, so it’s important to model these other areas where the concentrations of drugs might not be as high.”

Ultimately, though, both say their model can offer new hope to patients by helping doctors design better, cheaper and more efficient treatments.

“Over the past 10 years, the number of HIV-infected people receiving drug treatment has increased immensely,” Hill said. “Figuring out what the best ways are to treat people in terms of cost effectiveness, adherence and the chance of developing resistance is going to become even more important.”

Journal Reference:

  1. Daniel I S Rosenbloom, Alison L Hill, S Alireza Rabi, Robert F Siliciano, Martin A Nowak. Antiretroviral dynamics determines HIV evolution and predicts therapy outcome. Nature Medicine, 2012; DOI: 10.1038/nm.2892

*   *   *

Anti-HIV Drug Simulation Offers ‘Realistic’ Tool to Predict Drug Resistance and Viral Mutation

ScienceDaily (Sep. 2, 2012) — Pooling data from thousands of tests of the antiviral activity of more than 20 commonly used anti-HIV drugs, AIDS experts at Johns Hopkins and Harvard universities have developed what they say is the first accurate computer simulation to explain drug effects. Already, the model clarifies how and why some treatment regimens fail in some patients who lack evidence of drug resistance. Researchers say their model is based on specific drugs, precise doses prescribed, and on “real-world variation” in how well patients follow prescribing instructions.

Johns Hopkins co-senior study investigator and infectious disease specialist Robert Siliciano, M.D., Ph.D., says the mathematical model can also be used to predict how well a patient is likely to do on a specific regimen, based on their prescription adherence. In addition, the model factors in each drug’s ability to suppress viral replication and the likelihood that such suppression will spur development of drug-resistant, mutant HIV strains.

“With the help of our simulation, we can now tell with a fair degree of certainty what level of viral suppression is being achieved — how hard it is for the virus to grow and replicate — for a particular drug combination, at a specific dosage and drug concentration in the blood, even when a dose is missed,” says Siliciano, a professor at the Johns Hopkins University School of Medicine and a Howard Hughes Medical Institute investigator. This information, he predicts, will remove “a lot of the current trial and error, or guesswork, involved in testing new drug combination therapies.”

Siliciano says the study findings, to be reported in the journal Nature Medicine online Sept. 2, should help scientists streamline development and clinical trials of future combination therapies, by ruling out combinations unlikely to work.

One application of the model could be further development of drug combinations that can be contained in a single pill taken once a day. That could lower the chance of resistance, even if adherence is not perfect. Such future drug regimens, he says, will ideally strike a balance between optimizing viral suppression and minimizing risk of drug resistance.

Researchers next plan to expand their modeling beyond blood levels of virus to other parts of the body, such as the brain, where antiretroviral drug concentrations can be different from those measured in the blood. They also plan to expand their analysis to include multiple-drug-resistant strains of HIV.

Besides Siliciano, Johns Hopkins joint medical-doctoral student Alireza Rabi was a co-investigator in this study. Other study investigators included doctoral candidates Daniel Rosenbloom, M.S.; Alison Hill, M.S.; and co-senior study investigator Martin Nowak, Ph.D. — all at Harvard University.

Funding support for this study, which took two years to complete, was provided by the National Institutes of Health, with corresponding grant numbers R01-MH54907, R01-AI081600, R01-GM078986; the Bill and Melinda Gates Foundation; the Cancer Research Institute; the National Science Foundation; the Howard Hughes Medical Institute; Natural Sciences and Engineering Research Council of Canada; the John Templeton Foundation; and J. Epstein.

Currently, an estimated 8 million of the more than 34 million people in the world living with HIV are taking antiretroviral therapy to keep their disease in check. An estimated 1,178,000 in the United States are infected, including 23,000 in the state of Maryland.

Journal Reference:

  1. Daniel I S Rosenbloom, Alison L Hill, S Alireza Rabi, Robert F Siliciano, Martin A Nowak. Antiretroviral dynamics determines HIV evolution and predicts therapy outcome. Nature Medicine, 2012; DOI: 10.1038/nm.2892

Mathematics or Memory? Study Charts Collision Course in Brain (Science Daily)

ScienceDaily (Sep. 3, 2012) — You already know it’s hard to balance your checkbook while simultaneously reflecting on your past. Now, investigators at the Stanford University School of Medicine — having done the equivalent of wire-tapping a hard-to-reach region of the brain — can tell us how this impasse arises.

The area in red is the posterior medial cortex, the portion of the brain that is most active when people recall details of their own pasts. (Credit: Courtesy of Josef Parvizi)

The researchers showed that groups of nerve cells in a structure called the posterior medial cortex, or PMC, are strongly activated during a recall task such as trying to remember whether you had coffee yesterday, but just as strongly suppressed when you’re engaged in solving a math problem.

The PMC, situated roughly where the brain’s two hemispheres meet, is of great interest to neuroscientists because of its central role in introspective activities.

“This brain region is famously well-connected with many other regions that are important for higher cognitive functions,” said Josef Parvizi, MD, PhD, associate professor of neurology and neurological sciences and director of Stanford’s Human Intracranial Cognitive Electrophysiology Program. “But it’s very hard to reach. It’s so deep in the brain that the most commonly used electrophysiological methods can’t access it.”

In a study published online Sept. 3 in Proceedings of the National Academy of Sciences, Parvizi and his Stanford colleagues found a way to directly and sensitively record the output from this ordinarily anatomically inaccessible site in human subjects. By doing so, the researchers learned that particular clusters of nerve cells in the PMC that are most active when you are recalling details of your own past are strongly suppressed when you are performing mathematical calculations. Parvizi is the study’s senior author. The first and second authors, respectively, are postdoctoral scholars Brett Foster, PhD, and Mohammed Dastjerdi, PhD.

Much of our understanding of what roles different parts of the brain play has been obtained by techniques such as functional magnetic resonance imaging, which measures the amount of blood flowing through various brain regions as a proxy for activity in those regions. But changes in blood flow are relatively slow, making fMRI a poor medium for listening in on the high-frequency electrical bursts (approximately 200 times per second) that best reflect nerve-cell firing. Moreover, fMRI typically requires pooling images from several subjects into one composite image. Each person’s brain physiognomy is somewhat different, so the blending blurs the observable anatomical coordinates of a region of interest.

Nonetheless, fMRI imaging has shown that the PMC is quite active in introspective processes such as autobiographical memory processing (“I ate breakfast this morning”) or daydreaming, and less so in external sensory processing (“How far away is that pedestrian?”). “Whenever you pay attention to the outside world, its activity decreases,” said Parvizi.

To learn what specific parts of this region are doing during, say, recall versus arithmetic requires more-individualized anatomical resolution than an fMRI provides. Otherwise, Parvizi said, “if some nerve-cell populations become less active and others more active, it all washes out, and you see no net change.” So you miss what’s really going on.

For this study, the Stanford scientists employed a highly sensitive technique to demonstrate that introspective and externally focused cognitive tasks directly interfere with one another, because they impose opposite requirements on the same brain circuitry.

The researchers took advantage of a procedure performed on patients who were being evaluated for brain surgery at the Stanford Epilepsy Monitoring Unit, associated with Stanford University Medical Center. These patients were unresponsive to drug therapy and, as a result, suffered continuing seizures. The procedure involves temporarily removing small sections of a patient’s skull, placing a thin plastic film containing electrodes onto the surface of the brain near the suspected point of origin of that patient’s seizure (the location is unique to each patient), and then monitoring electrical activity in that region for five to seven days — all of it spent in a hospital bed. Once the epilepsy team identifies the point of origin of any seizures that occurred during that time, surgeons can precisely excise a small piece of tissue at that position, effectively breaking the vicious cycle of brain-wave amplification that is a seizure.

Implanting these electrode packets doesn’t mean piercing the brain or individual cells within it. “Each electrode picks up activity from about a half-million nerve cells,” Parvizi said. “It’s more like dotting the ceiling of a big room, filled with a lot of people talking, with multiple microphones. We’re listening to the buzz in the room, not individual conversations. Each microphone picks up the buzz from a different bunch of partiers. Some groups are more excited and talking more loudly than others.”

The experimenters found eight patients whose seizures were believed to be originating somewhere near the brain’s midline and who, therefore, had had electrode packets placed in the crevasse dividing the hemispheres. (The brain’s two hemispheres are spaced far enough apart to slip an electrode packet between them without incurring damage.)

The researchers got permission from these eight patients to bring in laptop computers and put the volunteers through a battery of simple tasks requiring modest intellectual effort. “It can be boring to lie in bed waiting seven days for a seizure to come,” said Foster. “Our studies helped them pass the time.” The sessions lasted about an hour.

On the laptop would appear a series of true/false statements falling into one of four categories. Three categories were self-referential, albeit with varying degrees of specificity. Most specific was so-called “autobiographical episodic memory,” an example of which might be: “I drank coffee yesterday.” The next category of statements was more generic: “I eat a lot of fruit.” The most abstract category, “self-judgment,” comprised sentences along the lines of: “I am honest.”

A fourth category differed from the first three in that it consisted of arithmetical equations such as: 67 + 6 = 75. Evaluating such a statement’s truth required no introspection but, instead, an outward, more sensory orientation.

For each item, patients were instructed to press “1” if a statement was true, “2” if it was false.

Significant portions of the PMC that were “tapped” by electrodes became activated during self-episodic memory processing, confirming the PMC’s strong role in recall of one’s past experiences. Interestingly, true/false statements involving less specifically narrative recall — such as, “I eat a lot of fruit” — induced relatively little activity. “Self-judgment” statements — such as, “I am attractive” — elicited none at all. Moreover, whether a volunteer judged a statement to be true or false made no difference with respect to the intensity, location or duration of electrical activity in activated PMC circuits.

This suggests, both Parvizi and Foster said, that the PMC is not the brain’s “center of self-consciousness” as some have proposed, but is more specifically engaged in constructing autobiographical narrative scenes, as occurs in recall or imagination.

Foster, Dastjerdi and Parvizi also found that the PMC circuitry activated by a recall task took close to a half-second to fire up, ruling out the possibility that this circuitry’s true role was in reading or making sense of the sentence on the screen. (These two activities are typically completed within the first one-fifth of a second or so.) Once activated, these circuits remained active for a full second.

Yet all the electrodes that lit up during the self-episodic condition were conspicuously deactivated during arithmetic calculation. In fact, the circuits being monitored by these electrodes were not merely passively silent, but actively suppressed, said Parvizi. “The more a circuit is activated during autobiographical recall, the more it is suppressed during math. It’s essentially impossible to do both at once.”
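
In outline, the comparison reported here amounts to contrasting each electrode's high-frequency power during recall trials with its power during arithmetic trials, both expressed relative to a pre-stimulus baseline. The sketch below illustrates that contrast on purely synthetic data; it is not the study's analysis pipeline, and the electrode names and effect sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

def trial_power(n_trials, effect):
    """Synthetic per-trial high-frequency band power for one electrode,
    as percent change from a pre-stimulus baseline (made-up numbers)."""
    return 100 * effect + rng.normal(0, 15, size=n_trials)

# Hypothetical electrodes: (effect during recall, effect during arithmetic).
# Positive means activation above baseline, negative means suppression.
electrodes = {"PMC-1": (+0.6, -0.4), "PMC-2": (+0.3, -0.2), "lateral-1": (0.0, 0.05)}

for name, (recall_eff, math_eff) in electrodes.items():
    recall = trial_power(40, recall_eff)     # 40 self-episodic memory trials
    arithmetic = trial_power(40, math_eff)   # 40 arithmetic trials
    print(f"{name:10s} recall {recall.mean():+6.1f}%   math {arithmetic.mean():+6.1f}%")
```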

The study was funded by the National Institutes of Health, with partial sponsorship from the Stanford Institute for NeuroInnovation and Translational Neuroscience.

Rooting out Rumors, Epidemics, and Crime — With Math (Science Daily)

ScienceDaily (Aug. 10, 2012) — A team of EPFL scientists has developed an algorithm that can identify the source of an epidemic or information circulating within a network, a method that could also be used to help with criminal investigations.

Investigators are well aware of how difficult it is to trace an unlawful act to its source. The job was arguably easier with old, Mafia-style criminal organizations, as their hierarchical structures more or less resembled predictable family trees.

In the Internet age, however, the networks used by organized criminals have changed. Innumerable nodes and connections escalate the complexity of these networks, making it ever more difficult to root out the guilty party. EPFL researcher Pedro Pinto of the Audiovisual Communications Laboratory and his colleagues have developed an algorithm that could become a valuable ally for investigators, criminal or otherwise, as long as a network is involved. The team’s research was published August 10, 2012, in the journal Physical Review Letters.

Finding the source of a Facebook rumor

“Using our method, we can find the source of all kinds of things circulating in a network just by ‘listening’ to a limited number of members of that network,” explains Pinto. Suppose you come across a rumor about yourself that has spread on Facebook and been sent to 500 people — your friends, or even friends of your friends. How do you find the person who started the rumor? “By looking at the messages received by just 15-20 of your friends, and taking into account the time factor, our algorithm can trace the path of that information back and find the source,” Pinto adds. This method can also be used to identify the origin of a spam message or a computer virus using only a limited number of sensors within the network.
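
The sketch below illustrates the general idea of observer-based source localization on a toy network; it is not Pinto's published estimator. Given the times at which a few monitored friends saw the rumor and an assumed per-hop delay, each candidate source is scored by how well its shortest-path distances to the observers explain the observed arrival pattern. The network, names, times and delay are all made up.

```python
from collections import deque

# Toy friendship network (adjacency list); entirely made up.
graph = {
    "ana": ["bob", "cara"], "bob": ["ana", "dan", "eve"],
    "cara": ["ana", "dan"], "dan": ["bob", "cara", "eve", "fay"],
    "eve": ["bob", "dan", "gil"], "fay": ["dan"], "gil": ["eve"],
}

def hop_distances(source):
    """Breadth-first search: number of hops from `source` to every node."""
    dist, queue = {source: 0}, deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Times (in hours) at which a few monitored friends saw the rumor.
observed = {"eve": 2.1, "cara": 1.0, "gil": 3.0}

def score(candidate, delay_per_hop=1.0):
    """Sum of squared mismatches between predicted and observed arrival
    times, allowing a common start-time offset (estimated as the mean gap)."""
    d = hop_distances(candidate)
    predicted = {o: delay_per_hop * d[o] for o in observed}
    offset = sum(observed[o] - predicted[o] for o in observed) / len(observed)
    return sum((observed[o] - predicted[o] - offset) ** 2 for o in observed)

best = min(graph, key=score)
print("most likely source:", best)
```

In this toy, the node "ana" best reconstructs the observed timing and is returned as the likely source; the published method tackles the same question at scale, with realistic propagation delays and only a small fraction of nodes observed.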

Trace the propagation of an epidemic

Out in the real world, the algorithm can be employed to find the primary source of an infectious disease, such as cholera. “We tested our method with data on an epidemic in South Africa provided by EPFL professor Andrea Rinaldo’s Ecohydrology Laboratory,” says Pinto. “By modeling water networks, river networks, and human transport networks, we were able to find the spot where the first cases of infection appeared by monitoring only a small fraction of the villages.”

The method would also be useful in responding to terrorist attacks, such as the 1995 sarin gas attack in the Tokyo subway, in which poisonous gas released in the city’s subterranean tunnels killed 13 people and injured nearly 1,000 more. “Using this algorithm, it wouldn’t be necessary to equip every station with detectors. A sample would be sufficient to rapidly identify the origin of the attack, and action could be taken before it spreads too far,” says Pinto.

Identifying the brains behind a terrorist attack

Computer simulations of the telephone conversations that could have occurred during the terrorist attacks on September 11, 2001, were used to test Pinto’s system. “By reconstructing the message exchange inside the 9/11 terrorist network extracted from publicly released news, our system spit out the names of three potential suspects — one of whom was found to be the mastermind of the attacks, according to the official enquiry.”

The validity of this method thus has been proven a posteriori. But according to Pinto, it could also be used preventatively — for example, to understand an outbreak before it gets out of control. “By carefully selecting points in the network to test, we could more rapidly detect the spread of an epidemic,” he points out. It could also be a valuable tool for advertisers who use viral marketing strategies by leveraging the Internet and social networks to reach customers. For example, this algorithm would allow them to identify the specific Internet blogs that are most influential for their target audience and to understand how the information in these articles spreads throughout the online community.

How Computation Can Predict Group Conflict: Fighting Among Captive Pigtailed Macaques Provides Clues (Science Daily)

ScienceDaily (Aug. 13, 2012) — When conflict breaks out in social groups, individuals make strategic decisions about how to behave based on their understanding of alliances and feuds in the group.

Researchers studied fighting among captive pigtailed macaques for clues about behavior and group conflict. (Credit: iStockphoto/Natthaphong Phanthumchinda)

But it’s been challenging to quantify the underlying trends that dictate how individuals make predictions, given they may only have seen a small number of fights or have limited memory.

In a new study, scientists at the Wisconsin Institute for Discovery (WID) at UW-Madison develop a computational approach to determine whether individuals behave predictably. With data from previous fights, the team looked at how much memory individuals in the group would need to make predictions themselves. The analysis proposes a novel estimate of “cognitive burden,” or the minimal amount of information an organism needs to remember to make a prediction.

The research draws from a concept called “sparse coding,” or the brain’s tendency to use fewer visual details and a small number of neurons to stow an image or scene. Previous studies support the idea that neurons in the brain react to a few large details such as the lines, edges and orientations within images rather than many smaller details.

“So what you get is a model where you have to remember fewer things but you still get very high predictive power — that’s what we’re interested in,” says Bryan Daniels, a WID researcher who led the study. “What is the trade-off? What’s the minimum amount of ‘stuff’ an individual has to remember to make good inferences about future events?”

To find out, Daniels — along with WID co-authors Jessica Flack and David Krakauer — drew comparisons from how brains and computers encode information. The results contribute to ongoing discussions about conflict in biological systems and how cognitive organisms understand their environments.

The study, published in the Aug. 13 edition of the Proceedings of the National Academy of Sciences, examined observed bouts of natural fighting in a group of 84 captive pigtailed macaques at the Yerkes National Primate Research Center. By recording individuals’ involvement — or lack thereof — in fights, the group created models that mapped the likelihood any number of individuals would engage in conflict in hypothetical situations.

To confirm the predictive power of the models, the group plugged in other data from the monkey group that was not used to create the models. Then, researchers compared these simulations with what actually happened in the group. One model looked at conflict as combinations of pairs, while another represented fights as sparse combinations of clusters, which proved to be a better tool for predicting fights. From there, by removing information until predictions became worse, Daniels and colleagues calculated the amount of information each individual needed to remember to make the most informed decision whether to fight or flee.

“We know the monkeys are making predictions, but we don’t know how good they are,” says Daniels. “But given this data, we found that the most memory it would take to figure out the regularities is about 1,000 bits of information.”
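
To make the kind of "memory" at stake concrete, here is a toy sketch of a pairwise co-occurrence predictor on synthetic fight data. It is deliberately crude: the published analysis used principled statistical models, including the sparse, cluster-based encoding described above, rather than the ad hoc estimate below, and every animal, alliance and probability here is invented.

```python
import random
from itertools import combinations

rng = random.Random(0)
monkeys = [f"m{i}" for i in range(10)]
# Two made-up alliances that tend to join fights as a block.
alliances = [{"m0", "m1", "m2"}, {"m5", "m6"}]

def sample_fight():
    """Generate a synthetic fight roster (purely illustrative data)."""
    fight = set()
    for group in alliances:
        if rng.random() < 0.4:
            fight |= group                       # whole alliance joins
    fight |= {m for m in monkeys if rng.random() < 0.1}  # lone joiners
    return fight

fights = [sample_fight() for _ in range(500)]

# The 'memory': per-individual and per-pair participation counts.
pair_counts = {p: 0 for p in combinations(monkeys, 2)}
solo_counts = {m: 0 for m in monkeys}
for f in fights:
    for m in f:
        solo_counts[m] += 1
    for p in combinations(sorted(f), 2):
        pair_counts[p] += 1

def p_joins(m, others):
    """Crude pairwise predictor: chance that `m` joins a fight already
    involving `others`, estimated from co-occurrence frequencies."""
    base = solo_counts[m] / len(fights)
    if not others:
        return base
    cond = [pair_counts[tuple(sorted((m, o)))] / max(solo_counts[o], 1) for o in others]
    return max(base, sum(cond) / len(cond))

print("P(m2 joins | m0, m1 fighting) =", round(p_joins("m2", ["m0", "m1"]), 2))
print("P(m9 joins | m0, m1 fighting) =", round(p_joins("m9", ["m0", "m1"]), 2))
```

The point of the sketch is only that remembered pairwise statistics already carry predictive power; the published question is how little of this kind of remembered information is actually needed before predictions stop improving, with roughly 1,000 bits as the estimate for the macaque group.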

Sparse coding appears to be a strong candidate for explaining the mechanism at play in the monkey group, but the team points out that it is only one possible way to encode conflict.

Because the statistical modeling and computation frameworks can be applied to different natural datasets, the research has the potential to influence other fields of study, including behavioral science, cognition, computation, game theory and machine learning. Such models might also be useful in studying collective behaviors in other complex systems, ranging from neurons to bird flocks.

Future research will seek to find out how individuals’ knowledge of alliances and feuds fine tunes their own decisions and changes the groups’ collective pattern of conflict.

The research was supported by the National Science Foundation, the John Templeton Foundation through the Santa Fe Institute, and UW-Madison.

A Century Of Weather Control (POP SCI)

Posted 7.19.12 at 6:20 pm – http://www.popsci.com

 

Keeping Pilots Updated, November 1930

It’s 1930 and, for obvious reasons, pilots want regular reports on the weather. What to do? Congress’s solution was to give the U.S. Weather Bureau cash to send them what they needed. It was a lot of cash, too: $1.4 million, or “more than one third the sum it spends annually for all of its work.”

About 13,000 miles of airway were monitored for activity, and reports were regularly sent via the now quaintly named “teletype”–an early fax machine, basically, that let a typed message be reproduced. Pilots were then radioed with the information.

From the article “Weather Man Makes the Air Safe.”

 

Battling Hail, July 1947

We weren’t shy about laying on the drama in this piece on hail–it was causing millions in damage across the country and we were sick of it. Our writer says, “The war against hail has been declared.” (Remember: this was only two years after World War II, which was a little more serious. Maybe our patriotism just wouldn’t wane.)

The idea was to scatter silver iodide as a form of “cloud seeding”–turning the moisture to snow before it hails. It’s a process that’s still toyed with today.

From the article “The War Against Hail.”

 

Hunting for a Tornado “Cure,” March 1958

1957 was a record-breaking year for tornadoes, and PopSci was forecasting even rougher skies for 1958. As described by an official tornado watcher: “They’re coming so fast and thick … that we’ve lost count.”

To try to stop them, researchers wanted to learn more. Meteorologists asked Congress for $5 million more a year to study tornadoes whirling through the Midwest’s Tornado Alley, then, hopefully, learn what they needed to do to stop them.

From the article “What We’re Learning About Tornadoes.”

 

Spotting Clouds With Nimbus, November 1963

Weather satellites were a boon to both forecasters and anyone affected by extreme weather. The powerful Hurricane Esther was discovered two days before anything else spotted it, leaving space engineers “justifiably proud.” The next satellite in line was the Nimbus, which Popular Science devoted multiple pages to covering, highlighting its ability to photograph cloud cover 24 hours a day and give us better insight into extreme weather.

Spoiler: the results really did turn out great, with Nimbus satellites paving the way for modern GPS devices.

From the article “The Weather Eye That Never Blinks.”

 

Saving Money Globally With Forecasts, November 1970

Optimism for weather satellites seemed to be reaching a high by the ’70s, with Popular Science recounting all the disasters predicted–how they “saved countless lives through early hurricane warnings”–and now even saying they’d save your vacation.

What they were hoping for then was an accurate five-day forecast for the world, which they predicted would save billions and make early warnings even better.

From the article “How New Weather Satellites Will Give You More Reliable Forecasts.”

 

Extreme Weather Alerts on the Radio, July 1979

Those weather alerts that come on your television during a storm–or at least one radio version of those–were documented by Popular Science in 1979. But rather than being something that anyone could tune in to, they were specialized radios you had to purchase, which seems like a less-than-great solution to the problem. But at this point the government had plans to set up weather monitoring stations near 90 percent of the country’s population, opening the door for people to find out fast what the weather situation was.

From the article “Weather-Alert Radios–They Could Save Your Life.”

 

Stopping “Bolts From the Blue,” May 1990

Here Popular Science let loose a whopper for anyone with a fear of extreme weather: lightning kills a lot more people every year than you think, and sometimes a lightning bolt will come and hit you even when there’s not a storm. So-called “bolts from the blue” were a part of the story on better predicting lightning, a phenomenon more manic than most types of weather. Improved sensors played a major part in better preparing people before a storm.

From the article “Predicting Deadly Lightning.”

 

Infrared Views of Weather, August 1983

Early access to computers let weather scientists get a 3-D, radar-based view of weather across the country. The system culled information from multiple sources and placed it in one viewable display. (The man pictured looks slightly bored for how revolutionary it is.) The system was an attempt to take global information and make it into “real-time local predictions.”

From the article “Nowcasting: New Weather Computers Pinpoint Deadly Storms.”

 

Modernizing the National Weather Service, August 1997

A year’s worth of weather detection for every American was coming at the price of “a Big Mac, fries, and a Coke,” the deputy director of the National Weather Service said in 1997. The computer age better tied together the individual parts of weather forecasting for the NWS, leaving a unified whole that could grab complicated meteorological information and interpret it in just a few seconds.

From the article “Weather’s New Outlook.”

 

Modeling Weather With Computers, September 2001

Computer simulations, we wrote, would help us predict future storms more accurately. But it took (at the time) the largest supercomputer around to give us the kinds of models we wanted. Judging by the image, we might’ve already made significant progress on the weather modeling front.

Researchers Produce First Complete Computer Model of an Organism (Science Daily)

ScienceDaily (July 21, 2012) — In a breakthrough effort for computational biology, the world’s first complete computer model of an organism has been completed, Stanford researchers reported last week in the journal Cell.

The Covert Lab incorporated more than 1,900 experimentally observed parameters into their model of the tiny parasite Mycoplasma genitalium. (Credit: Illustration by Erik Jacobsen / Covert Lab)

A team led by Markus Covert, assistant professor of bioengineering, used data from more than 900 scientific papers to account for every molecular interaction that takes place in the life cycle of Mycoplasma genitalium, the world’s smallest free-living bacterium.

By encompassing the entirety of an organism in silico, the paper fulfills a longstanding goal for the field. Not only does the model allow researchers to address questions that aren’t practical to examine otherwise, it represents a stepping-stone toward the use of computer-aided design in bioengineering and medicine.

“This achievement demonstrates a transforming approach to answering questions about fundamental biological processes,” said James M. Anderson, director of the National Institutes of Health Division of Program Coordination, Planning and Strategic Initiatives. “Comprehensive computer models of entire cells have the potential to advance our understanding of cellular function and, ultimately, to inform new approaches for the diagnosis and treatment of disease.”

The research was partially funded by an NIH Director’s Pioneer Award from the National Institutes of Health Common Fund.

From information to understanding

Biology over the past two decades has been marked by the rise of high-throughput studies producing enormous troves of cellular information. A lack of experimental data is no longer the primary limiting factor for researchers. Instead, it’s how to make sense of what they already know.

Most biological experiments, however, still take a reductionist approach to this vast array of data: knocking out a single gene and seeing what happens.

“Many of the issues we’re interested in aren’t single-gene problems,” said Covert. “They’re the complex result of hundreds or thousands of genes interacting.”

This situation has resulted in a yawning gap between information and understanding that can only be addressed by “bringing all of that data into one place and seeing how it fits together,” according to Stanford bioengineering graduate student and co-first author Jayodita Sanghvi.

Integrative computational models clarify data sets whose sheer size would otherwise place them outside human ken.

“You don’t really understand how something works until you can reproduce it yourself,” Sanghvi said.

Small is beautiful

Mycoplasma genitalium is a humble parasitic bacterium known mainly for showing up uninvited in human urogenital and respiratory tracts. But the pathogen also has the distinction of containing the smallest genome of any free-living organism — only 525 genes, as opposed to the 4,288 of E. coli, a more traditional laboratory bacterium.

Despite the difficulty of working with this sexually transmitted parasite, the minimalism of its genome has made it the focus of several recent bioengineering efforts. Notably, these include the J. Craig Venter Institute’s 2008 synthesis of the first artificial chromosome.

“The goal hasn’t only been to understand M. genitalium better,” said co-first author and Stanford biophysics graduate student Jonathan Karr. “It’s to understand biology generally.”

Even at this small scale, the quantity of data that the Stanford researchers incorporated into the virtual cell’s code was enormous. The final model made use of more than 1,900 experimentally determined parameters.

To integrate these disparate data points into a unified machine, the researchers modeled individual biological processes as 28 separate "modules," each governed by its own algorithm. These modules then communicated with each other after every time step, making for a unified whole that closely matched M. genitalium's real-world behavior.
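
Conceptually, the scheme is a loop over process modules acting on a shared cell state once per time step. A minimal sketch of that structure is below; it is my own illustration, not the published Stanford code, and the class names and rates are invented:

```python
# A minimal sketch of the "modules plus shared state" idea: each process has
# its own update rule, and all modules act on the shared state once per step.
class CellState(dict):
    """Shared state: concentrations, copy numbers, elapsed time, etc."""

class Transcription:
    def step(self, state, dt):
        state["mRNA"] = state.get("mRNA", 0.0) + 0.1 * dt           # toy production rate

class Degradation:
    def step(self, state, dt):
        state["mRNA"] = state.get("mRNA", 0.0) * (1.0 - 0.01 * dt)  # toy decay rate

def run(modules, state, dt=1.0, steps=3600):
    for _ in range(steps):
        for module in modules:        # the real model has 28 such modules,
            module.step(state, dt)    # each governed by its own algorithm
        state["time"] = state.get("time", 0.0) + dt
    return state

print(run([Transcription(), Degradation()], CellState()))
```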

Probing the silicon cell

The purely computational cell opens up procedures that would be difficult to perform in an actual organism, as well as opportunities to reexamine experimental data.

In the paper, the model is used to demonstrate a number of these approaches, including detailed investigations of DNA-binding protein dynamics and the identification of new gene functions.

The program also allowed the researchers to address aspects of cell behavior that emerge from vast numbers of interacting factors.

The researchers had noticed, for instance, that the length of individual stages in the cell cycle varied from cell to cell, while the length of the overall cycle was much more consistent. Consulting the model, the researchers hypothesized that the overall cell cycle’s lack of variation was the result of a built-in negative feedback mechanism.

Cells that took longer to begin DNA replication had time to amass a large pool of free nucleotides. The actual replication step, which uses these nucleotides to form new DNA strands, then passed relatively quickly. Cells that went through the initial step quicker, on the other hand, had no nucleotide surplus. Replication ended up slowing to the rate of nucleotide production.
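
The compensation is easy to see in a toy simulation. The sketch below is my own illustration with invented rates, not the Stanford model: cells with a longer, randomly varying initiation stage start replication with a larger nucleotide pool and therefore finish replication faster, so the total cycle length varies far less than either stage alone.

```python
# Toy illustration (invented rates, not the Stanford model) of the proposed
# negative feedback between initiation delay and replication speed.
import random

def cell_cycle(production=5.0, replication_need=2000.0, max_rate=40.0):
    init_delay = random.uniform(50.0, 300.0)   # variable pre-replication stage
    pool = production * init_delay             # nucleotides amassed meanwhile
    copied, t = 0.0, 0.0
    while copied < replication_need:           # replication stage
        pool += production                     # production continues during replication
        use = min(pool, max_rate)              # rate limited by the available pool
        pool -= use
        copied += use
        t += 1.0
    return init_delay, t, init_delay + t

random.seed(0)
runs = [cell_cycle() for _ in range(1000)]
for name, idx in [("initiation", 0), ("replication", 1), ("total cycle", 2)]:
    vals = [r[idx] for r in runs]
    mean = sum(vals) / len(vals)
    sd = (sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5
    print(f"{name:12s} mean={mean:7.1f}  sd={sd:6.1f}")
# The spread (sd) of the total cycle comes out far smaller than the spread of
# either stage, because a long initiation buys a short replication, and vice versa.
```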

These kinds of findings remain hypotheses until they’re confirmed by real-world experiments, but they promise to accelerate the process of scientific inquiry.

“If you use a model to guide your experiments, you’re going to discover things faster. We’ve shown that time and time again,” said Covert.

Bio-CAD

Much of the model’s future promise lies in more applied fields.

CAD — computer-aided design — has revolutionized fields from aeronautics to civil engineering by drastically reducing the trial-and-error involved in design. But our incomplete understanding of even the simplest biological systems has meant that CAD hasn’t yet found a place in bioengineering.

Computational models like that of M. genitalium could bring rational design to biology — allowing not only for computer-guided experimental regimes, but also for the wholesale creation of new microorganisms.

Once similar models have been devised for more experimentally tractable organisms, Karr envisions bacteria or yeast specifically designed to mass-produce pharmaceuticals.

Bio-CAD could also lead to enticing medical advances — especially in the field of personalized medicine. But these applications are a long way off, the researchers said.

“This is potentially the new Human Genome Project,” Karr said. “It’s going to take a really large community effort to get close to a human model.”

Stanford’s Department of Bioengineering is jointly operated by the School of Engineering and the School of Medicine.

Disorderly Conduct: Probing the Role of Disorder in Quantum Coherence (Science Daily)

ScienceDaily (July 19, 2012) — A new experiment conducted at the Joint Quantum Institute (JQI)* examines the relationship between quantum coherence, an important aspect of certain materials kept at low temperature, and the imperfections in those materials. These findings should be useful in forging a better understanding of disorder, and in turn in developing better quantum-based devices, such as superconducting magnets.

Figure 1 (top): Two thin planes of cold atoms are held in an optical lattice by an array of laser beams. Still another laser beam, passed through a diffusing material, adds an element of disorder to the atoms in the form of a speckle pattern. Figure 2 (bottom): Interference patterns resulting when the two planes of atoms are allowed to collide. In (b) the amount of disorder is just right and the pattern is crisp. In (c) too much disorder has begun to wash out the pattern. In (a) the pattern is complicated by the presence of vortices among the atoms, which are hard to see in this image taken from the side. (Credit: Matthew Beeler)

Most things in nature are imperfect at some level. Fortunately, imperfections — a departure, say, from an orderly array of atoms in a crystalline solid — are often advantageous. For example, copper wire, which carries so much of the world’s electricity, conducts much better if at least some impurity atoms are present.

In other words, a pinch of disorder is good. But there can be too much of this good thing. The issue of disorder is so important in condensed matter physics, and so difficult to understand directly, that some scientists have for years been trying to simulate, with thin vapors of cold atoms, the behavior of electrons flowing through solids trillions of times more dense. With their ability to control the local forces acting on these atoms, physicists hope to shed light on the more complicated case of solids.

That's where the JQI experiment comes in. Specifically, Steve Rolston and his colleagues have set up an optical lattice of rubidium atoms held at a temperature close to absolute zero. In such a lattice, the atoms are held in orderly proximity not by natural inter-atomic forces but by the forces exerted by an array of laser beams. These atoms, moreover, constitute a Bose-Einstein condensate (BEC), a special condition in which they all belong to a single quantum state.

This is appropriate since the atoms are meant to be a proxy for the electrons flowing through a solid superconductor. In some so called high temperature superconductors (HTSC), the electrons move in planes of copper and oxygen atoms. These HTSC materials work, however, only if a fillip of impurity atoms, such as barium or yttrium, is present. Theorists have not adequately explained why this bit of disorder in the underlying material should be necessary for attaining superconductivity.

The JQI experiment has tried to supply palpable data that can illuminate the issue of disorder. In solids, atoms are a fraction of a nanometer (billionth of a meter) apart. At JQI the atoms are about a micron (a millionth of a meter) apart. Actually, the JQI atom swarm consists of a 2-dimensional disk. “Disorder” in this disk consists not of impurity atoms but of “speckle.” When a laser beam strikes a rough surface, such as a cinderblock wall, it is scattered in a haphazard pattern. This visible speckle effect is what is used to slightly disorganize the otherwise perfect arrangement of Rb atoms in the JQI sample.

In superconductors, the slight disorder in the form of impurities ensures a very orderly “coherence” of the supercurrent. That is, the electrons moving through the solid flow as a single coordinated train of waves and retain their cohesiveness even in the midst of impurity atoms.

In the rubidium vapor, analogously, the slight disorder supplied by the speckle laser ensures that the Rb atoms retain their coordinated participation in the unified (BEC) quantum wave structure. But only up to a point. If too much disorder is added — if the speckle is too large — then the quantum coherence can go away. Probing this transition numerically was the object of the JQI experiment. The setup is illustrated in figure 1.

And how do you know when you’ve gone too far with the disorder? How do you know that quantum coherence has been lost? By making coherence visible.

The JQI scientists cleverly pry their disk-shaped gas of atoms into two parallel sheets, looking like two thin crepes, one on top of the other. Thereafter, if all the laser beams are turned off, the two planes will collide like miniature galaxies. If the atoms are in a coherent condition, their collision will result in a crisp interference pattern showing up on a video screen as a series of high-contrast dark and light stripes.

If, however, the imposed disorder is too high, resulting in a loss of coherence among the atoms, then the interference pattern is washed out. Figure 2 shows this effect at work. Frames b and c respectively show what happens when the degree of disorder is just right and when it is too much.

“Disorder figures in about half of all condensed matter physics,” says Steve Rolston. “What we’re doing is mimicking the movement of electrons in 3-dimensional solids using cold atoms in a 2-dimensional gas. Since there don’t seem to be any theoretical predictions to help us understand what we’re seeing, we’ve moved into new experimental territory.”

Where does the JQI work go next? Well, in figure 2a you can see that the interference pattern is still visible but somewhat garbled. That arises from the fact that for this amount of disorder several vortices — miniature whirlpools of atoms — have sprouted within the gas. Exactly such vortices among electrons emerge in superconductivity, limiting their ability to maintain a coherent state.

The new results are published in the New Journal of Physics: “Disorder-driven loss of phase coherence in a quasi-2D cold atom system,” by M C Beeler, M E W Reed, T Hong, and S L Rolston.

Another of the JQI scientists, Matthew Beeler, underscores the importance of understanding the transition from the coherent state to incoherent state owing to the fluctuations introduced by disorder: “This paper is the first direct observation of disorder causing these phase fluctuations. To the extent that our system of cold atoms is like a HTSC superconductor, this is a direct connection between disorder and a mechanism which drives the system from superconductor to insulator.”

Dummies guide to the latest “Hockey Stick” controversy (Real Climate)

http://www.realclimate.org

 — gavin @ 18 February 2005

by Gavin Schmidt and Caspar Amman

Due to popular demand, we have put together a ‘dummies guide’ which tries to describe what the actual issues are in the latest controversy, in language even our parents might understand. A pdf version is also available. More technical descriptions of the issues can be seen here and here.

This guide is in two parts: the first deals with the background to the technical issues raised by McIntyre and McKitrick (2005) (MM05), while the second part discusses the application of this to the original Mann, Bradley and Hughes (1998) (MBH98) reconstruction. The wider climate science context is discussed here, and the relationship to other recent reconstructions (the 'Hockey Team') can be seen here.

NB. All the data that were used in MBH98 are freely available for download at ftp://holocene.evsc.virginia.edu/pub/sdr/temp/nature/MANNETAL98/ (and also as supplementary data at Nature) along with a thorough description of the algorithm.
Part I: Technical issues:

1) What is principal component analysis (PCA)?

This is a mathematical technique that is used (among other things) to summarize the data found in a large number of noisy records so that the essential aspects can be seen more easily. The most common patterns in the data are captured in a number of 'principal components' which describe some percentage of the variation in the original records. Usually only a limited number of components ('PC's) have any statistical significance, and these can be used instead of the larger data set to give basically the same description.
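
As a concrete (and entirely synthetic) illustration of the idea, the sketch below builds 70 noisy records that share one common signal and computes how much variance the leading PCs explain; none of the numbers are taken from any real proxy network:

```python
# Synthetic illustration of PCA on noisy records: 70 series sharing one signal.
import numpy as np

rng = np.random.default_rng(0)
n_years, n_records = 150, 70
signal = np.cumsum(rng.normal(size=n_years))          # one shared pattern
records = (np.outer(signal, rng.uniform(0.5, 1.5, n_records))
           + rng.normal(scale=3.0, size=(n_years, n_records)))

X = records - records.mean(axis=0)                    # centre each record
_, s, vt = np.linalg.svd(X, full_matrices=False)      # PCA via SVD
explained = s**2 / np.sum(s**2)
print("variance explained by the first 5 PCs:", np.round(explained[:5], 3))
pc1 = X @ vt[0]                                       # leading PC time series
```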

2) What do these individual components represent?

Often the first few components represent something recognisable and physically meaningful (at least in climate data applications). If a large part of the data set has a trend, then the mean trend may show up as one of the most important PCs. Similarly, if there is a seasonal cycle in the data, that will generally be represented by a PC. However, remember that PCs are just mathematical constructs. By themselves they say nothing about the physics of the situation. Thus, in many circumstances, physically meaningful timeseries are 'distributed' over a number of PCs, each of which individually does not appear to mean much. Different methodologies or conventions can make a big difference in which pattern comes out on top. If the aim of the PCA analysis is to determine the most important pattern, then it is important to know how robust that pattern is to the methodology. However, if the idea is to more simply summarize the larger data set, the individual ordering of the PCs is less important, and it is more crucial to make sure that as many significant PCs as possible are included.

3) How do you know whether a PC has significant information?

PC significance

This determination is usually based on a 'Monte Carlo' simulation (so-called because of the random nature of the calculations). For instance, if you take 1000 sets of random data (that have the same statistical properties as the data set in question), and you perform the PCA analysis 1000 times, there will be 1000 examples of the first PC. Each of these will explain a different amount of the variation (or variance) in the original data. When ranked in order of explained variance, the tenth one down then defines the 99% confidence level: i.e. if your real PC explains more of the variance than 99% of the random PCs, then you can say that this is significant at the 99% level. This can be done for each PC in turn. (This technique was introduced by Preisendorfer et al. (1981), and is called the Preisendorfer N-rule).

The figure to the right gives two examples of this. Here each PC is plotted against the amount of fractional variance it explains. The blue line is the result from the random data, while the blue dots are the PC results for the real data. It is clear that at least the first two are significantly separated from the random noise line. In the other case, there are 5 (maybe 6) red crosses that appear to be distinguishable from the red line random noise. Note also that the first (‘most important’) PC does not always explain the same amount of the original data.
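
A hedged sketch of this kind of Monte Carlo test is below. The function names and the use of plain white noise for the random surrogates are my own simplifications; a real application would generate random data with the same statistical properties (e.g. autocorrelation) as the records being tested.

```python
# Sketch of a Preisendorfer-style significance test: compare the variance
# explained by each real PC against the same quantity from random data sets.
import numpy as np

def explained_variance(X):
    X = X - X.mean(axis=0)
    s = np.linalg.svd(X, compute_uv=False)
    return s**2 / np.sum(s**2)

def pc_significance(data, n_trials=1000, level=0.99):
    rng = np.random.default_rng(1)
    real = explained_variance(data)
    # white-noise surrogates for simplicity; the real test should match the
    # statistical properties (variance, autocorrelation) of the actual records
    null = np.array([explained_variance(rng.normal(size=data.shape))
                     for _ in range(n_trials)])
    threshold = np.quantile(null, level, axis=0)   # roughly the 10th largest of 1000
    return real, threshold, real > threshold

# usage with the synthetic 'records' array from the previous sketch:
# real, thresh, significant = pc_significance(records)
# print(np.where(significant)[0])   # indices of PCs above the 99% noise level
```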

4) What do different conventions for PC analysis represent?

Some different conventions exist regarding how the original data should be normalized. For instance, the data can be normalized to have an average of zero over the whole record, or over a selected sub-interval. The variance of the data is associated with departures from whatever mean was selected. So the pattern of data that shows the biggest departure from the mean will dominate the calculated PCs. If there is an a priori reason to be interested in departures from a particular mean, then this is a way to make sure that those patterns move up in the PC ordering. Changing conventions means that the explained variance of each PC can be different, the ordering can be different, and the number of significant PCs can be different.
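
The effect of the centering choice is easy to demonstrate on synthetic data. In the toy example below the 'calibration period' is simply assumed to be the last 50 of 150 time steps; nothing here is taken from the actual proxy data.

```python
# Illustrative only: how the centering convention can change the PC ordering.
import numpy as np

rng = np.random.default_rng(2)
n_steps, n_records = 150, 70
shape = np.concatenate([np.zeros(100), np.linspace(0.0, 3.0, 50)])   # flat, then rising
records = (np.outer(shape, rng.uniform(0.5, 1.5, n_records))
           + rng.normal(size=(n_steps, n_records)))

def explained(X, n=3):
    s = np.linalg.svd(X, compute_uv=False)
    return np.round((s**2 / np.sum(s**2))[:n], 3)

whole_record = records - records.mean(axis=0)        # zero mean over the full record
sub_period   = records - records[-50:].mean(axis=0)  # zero mean over the sub-period only

print("whole-record centering, leading PCs:", explained(whole_record))
print("sub-period centering,   leading PCs:", explained(sub_period))
# Records that depart most from the chosen mean dominate the leading PCs, so the
# ordering, the explained variances and the number of significant PCs can all shift.
```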

5) How can you tell whether you have included enough PCs?

This is rather easy to tell. If your answer depends on the number of PCs included, then you haven't included enough. Put another way, if the answer you get is the same as if you had used all the data without doing any PC analysis at all, then you are probably ok. However, the reason why the PC summaries are used in the first place in paleo-reconstructions is that using the full proxy set often runs into the danger of 'overfitting' during the calibration period (the time period when the proxy data are trained to match the instrumental record). This can lead to a decrease in predictive skill outside of that window, which is the actual target of the reconstruction. So in summary, PC selection is a trade-off: on one hand, the goal is to capture as much of the variability in the data (as represented by the different PCs) as possible, particularly when the variance explained by individual PCs is small, while on the other hand you don't want to include PCs that are not really contributing any more significant information.

Part II: Application to the MBH98 ‘Hockey Stick’

1) Where is PCA used in the MBH methodology?

When incorporating many tree ring networks into the multi-proxy framework, it is easier to use a few leading PCs rather than 70 or so individual tree ring chronologies from a particular region. The trees are often very closely located and so it makes sense to summarize the general information they all contain in relation to the large-scale patterns of variability. The relevant signal for the climate reconstruction is the signal that the trees have in common, not each individual series. In MBH98, the North American tree ring series were treated like this. There are a number of other places in the overall methodology where some form of PCA was used, but they are not relevant to this particular controversy.

2) What is the point of contention in MM05?

MM05 contend that the particular PC convention used in MBH98 in dealing with the N. American tree rings selects for the ‘hockey stick’ shape and that the final reconstruction result is simply an artifact of this convention.

3) What convention was used in MBH98?

MBH98 were particularly interested in whether the tree ring data showed significant differences from the 20th century calibration period, and therefore normalized the data so that the mean over this period was zero. As discussed above, this will emphasize records that have the biggest differences from that period (either positive or negative). Since the underlying data have a 'hockey stick'-like shape, it is therefore not surprising that the most important PC found using this convention resembles the 'hockey stick'. There are actually two significant PCs found using this convention, and both were incorporated into the full reconstruction.

PC1 vs PC4

4) Does using a different convention change the answer?

As discussed above, a different convention (MM05 suggest one that has zero mean over the whole record) will change the ordering, significance and number of important PCs. In this case, the number of significant PCs increases to 5 (maybe 6) from the original 2. This is the difference between the blue points (MBH98 convention) and the red crosses (MM05 convention) in the first figure. Also, PC1 in the MBH98 convention moves down to PC4 in the MM05 convention. This is illustrated in the figure on the right: the red curve is the original PC1 and the blue curve is MM05 PC4 (adjusted to have the same variance and mean). But as we stated above, the underlying data have a hockey stick structure, and so in either case the 'hockey stick'-like PC explains a significant part of the variance. Therefore, using the MM05 convention, more PCs need to be included to capture the significant information contained in the tree ring network.

This figure shows the difference in the final result when you use the original convention with 2 PCs (blue) versus the MM05 convention with 5 PCs (red). The MM05-based reconstruction is slightly less skillful when judged over the 19th century validation period but is otherwise very similar. In fact, any calibration convention will lead to approximately the same answer as long as the PC decomposition is done properly and one determines how many PCs are needed to retain the primary information in the original data.

different conventions
5) What happens if you just use all the data and skip the whole PCA step?

This is a key point. If the PCs being used were inadequate in characterizing the underlying data, then the answer you get using all of the data will be significantly different. If, on the other hand, enough PCs were used, the answer should be essentially unchanged. This is shown in the figure below. The reconstruction using all the data is in yellow (the green line is the same thing but with the 'St-Anne River' tree ring chronology taken out). The blue line is the original reconstruction, and as you can see the correspondence between them is high. The validation is slightly worse, illustrating the trade-off mentioned above: when using all of the data, over-fitting during the calibration period (due to the increased number of degrees of freedom) leads to a slight loss of predictability in the validation step.

No PCA comparison

6) So how do MM05 conclude that this small detail changes the answer?

MM05 claim that the reconstruction using only the first 2 PCs with their convention is significantly different from MBH98. Since PCs 3, 4 and 5 (at least) are also significant, they are leaving out good data. It is mathematically wrong to retain the same number of PCs if the convention of standardization is changed. In this case, it causes a loss of information that is very easily demonstrated: first, by showing that any such results do not resemble the results from using all the data, and second, by checking the validation of the reconstruction for the 19th century. The MM version of the reconstruction can be matched by simply removing the N. American tree ring data along with the 'St Anne River' Northern treeline series from the reconstruction (shown in yellow below). Compare this curve with the ones shown above.

No N. American tree rings

As you might expect, throwing out data also worsens the validation statistics, as can be seen by eye when comparing the reconstructions over the 19th century validation interval. Compare the green line in the figure below to the instrumental data in red. To their credit, MM05 acknowledge that their alternate 15th century reconstruction has no skill.

validation period

7) Basically then the MM05 criticism is simply about whether selected N. American tree rings should have been included, not that there was a mathematical flaw?

Yes. Their argument since the beginning has essentially not been about methodological issues at all, but about 'source data' issues. Particular concerns with the "bristlecone pine" data were addressed in the follow-up paper MBH99, but the fact remains that including these data improves the statistical validation over the 19th century period, and they therefore should be included.

Hockey Team (*used under GFDL license)

8) So does this all matter?

No. If you use the MM05 convention and include all the significant PCs, you get the same answer. If you don’t use any PCA at all, you get the same answer. If you use a completely different methodology (i.e. Rutherford et al, 2005), you get basically the same answer. Only if you remove significant portions of the data do you get a different (and worse) answer.

9) Was MBH98 the final word on the climate of last millennium?

Not at all. There has been significant progress on many aspects of climate reconstructions since MBH98. Firstly, there are more and better quality proxy data available. There are new methodologies, such as those described in Rutherford et al (2005) or Moberg et al (2005), that address recognised problems with incomplete data series and the challenge of incorporating lower resolution data into the mix. Progress is likely to continue on all these fronts. As of now, all of the 'Hockey Team' reconstructions (shown left) agree that the late 20th century is anomalous in the context of the last millennium, and possibly the last two millennia.

To prevent environmental catastrophes (FAPERJ)

Vilma Homero

05/07/2012

 Nelson Fernandes / UFRJ
 
New methods can predict where and when landslides will occur in the mountain region

When several areas of Nova Friburgo, Petrópolis and Teresópolis suffered landslides in January 2011, burying more than a thousand people under tons of mud and debris, the question left hanging in the air was whether the disaster could have been mitigated. If it is up to the Instituto de Geociências of the Universidade Federal do Rio de Janeiro (UFRJ), the consequences of environmental cataclysms like these will become ever smaller. To that end, researchers there are developing a series of multidisciplinary projects to make risk-analysis systems feasible. One of them is Prever, which, with the support of computer programs, combines advances in remote sensing, geoprocessing, geomorphology and geotechnical methodologies with mathematical modeling for weather prediction in the areas most susceptible to landslides, such as the mountain region. "Although conditions in the various municipalities of that region differ considerably, what they have in common is a lack of methodologies aimed at predicting this type of risk. The essential task now is to develop methods capable of predicting the spatial and temporal location of these processes, that is, knowing 'where' and 'when' these landslides may occur," explains geologist Nelson Ferreira Fernandes, professor in UFRJ's Departamento de Geografia and a FAPERJ Cientista do Nosso Estado grantee.

To devise real-time risk-prediction methods that include mass movements triggered by rainfall, the researchers are producing maps from successive satellite images, which are cross-checked against geological and geotechnical maps. "Prever combines climate-simulation models and forecasts of extreme rainfall events, developed in meteorology, with mathematical prediction models, plus the information produced by geomorphology and geotechnics, which indicates the areas most susceptible to landslides. In this way we can draw up risk forecasts in real time, classifying the results according to the severity of the risk, which varies continuously in space and time," explains Nelson.

To this end, the Departments of Geography, Geology and Meteorology of UFRJ's Instituto de Geociências have joined forces with the Faculdade de Geologia of the Universidade do Estado do Rio de Janeiro (Uerj) and the Departamento de Engenharia Civil of the Pontifícia Universidade Católica (PUC-Rio). By overlaying the information, the resulting images can pinpoint the areas most sensitive to landslides. "Adding this academic knowledge to data from state agencies, such as the Núcleo de Análise de Desastres (Nade) of the Departamento de Recursos Minerais (DRM-RJ), which provides technical support to the Civil Defense, we will not only be constantly updating the maps used today by state government agencies and the Civil Defense, but also enabling more precise planning for decision-making."

Publicity image / UFRJ
A simulation shows the possibility of a mass landslide in the Jacarepaguá region

This new mapping also means better quality, greater precision and more detailed images. "Obviously, with better tools in hand, meaning more detailed and accurate maps, public administrators will also be able to plan and act more accurately and in real time," says Nelson. According to the researcher, these maps need constant updating to keep up with the dynamic interference of human occupation with the topography of the various regions. "This has been happening through the cutting of hillsides, the occupation of landfill areas, or changes resulting from river drainage. All of this alters the topography and, under heavier and more prolonged rainfall, can make certain soils more prone to landslides or to flooding," Nelson notes.

But disaster and environmental-risk analysis systems also encompass other lines of research. Prever works along two distinct lines of action. "One of them is climate, in which we detect the areas where rainfall will increase over the long term and provide information to decision-making and planning bodies. The other is very short-term forecasting, so-called nowcasting." On the long-term side, Professor Ana Maria Bueno Nunes, of the same university's Departamento de Meteorologia, has been working on the project "Implementação de um Sistema de Modelagem Regional: Estudos de Tempo e Clima", under her coordination, which proposes a reconstruction of South America's hydroclimate as an extension of that project.

"By combining satellite precipitation data with information from atmospheric stations, it is possible, through computational modeling, to produce precipitation estimates. This way we can not only know when rainfall will be heavier or more prolonged, but also look back at past maps to see which convergence of factors produced a disaster situation. Reconstruction is a way of studying the past to understand present scenarios that look similar. And with that we help improve the forecasting models," says Ana. This information, which will initially serve academic and scientific use, will provide increasingly detailed data on how heavy rains form, the kind capable of causing floods in certain areas. "This will allow us not only to better understand the conditions under which certain calamities occur, but also to predict when those conditions may recur. With the project we are also training even more specialized people in this area," says the researcher, whose work is supported by a FAPERJ research grant (Auxílio à Pesquisa, APQ 1).

Also a member of the project, Professor Gutemberg Borges França, of UFRJ, explains that there are three types of weather forecasting: synoptic forecasting, which covers roughly 6 hours to seven days ahead over a few thousand kilometres, such as the South American continent; mesoscale forecasting, covering roughly 6 hours to two days over a few hundred kilometres, such as the state of Rio de Janeiro; and short-term forecasting, or nowcasting, which ranges from a few minutes up to 3 to 6 hours over a specific area of a few kilometres, such as the Rio de Janeiro metropolitan region.

If long-term forecasts are important, short-term forecasts, or nowcasting, are too. According to Gutemberg, current numerical prediction models are still inadequate for short-term forecasting, which ends up being done largely on the basis of the meteorologist's experience, by interpreting information from the various available data sources, such as satellite images; surface and upper-air weather stations; radar and sodar (Sonic Detection and Ranging); and numerical models. "Even today, however, the meteorologist lacks objective tools to help integrate these different sources of information into a more accurate short-term forecast," Gutemberg argues. Today, Rio de Janeiro already has satellite receiving stations, an upper-air (radiosonde) station generating atmospheric profiles, surface weather stations and radar. UFRJ's Laboratório de Meteorologia Aplicada, in the Departamento de Meteorologia, has been developing short-term forecasting tools since 2005 using computational intelligence, with the aim of improving forecasts of extreme weather events for Rio de Janeiro. "With computational intelligence we get this information faster and more accurately," he sums up.

© FAPERJ – All articles may be reproduced, provided the source is cited.

University of Tennessee anthropologists find American heads are getting larger (University of Tennessee)

University of Tennessee at Knoxville


White Americans’ heads are getting bigger. That’s according to research by forensic anthropologists at the University of Tennessee, Knoxville.

Lee Jantz, coordinator of UT's Forensic Anthropology Center (FAC); Richard Jantz, professor emeritus and former director of the FAC; and Joanne Devlin, adjunct assistant professor, examined 1,500 skulls dating from the mid-1800s through the mid-1980s. They noticed that U.S. skulls have become larger, taller and narrower as seen from the front, and that faces have become significantly narrower and higher.

The researchers cannot pinpoint a reason as to why American head shapes are changing and whether it is primarily due to evolution or lifestyle changes.

“The varieties of changes that have swept American life make determining an exact cause an endlessly complicated proposition,” said Lee Jantz. “It likely results from modified growth patterns because of better nutrition, lower infant and maternal mortality, less physical work, and a breakdown of former ethnic barriers to marriage. Which of these is paramount we do not know.”

The researchers found that the average height from the base to the top of the skull in men has increased by eight millimeters (0.3 inches). The skull size has grown by 200 cubic centimeters, a space equivalent to a tennis ball. In women, the corresponding increases are seven millimeters and 180 cubic centimeters.

Skull height has increased 6.8 percent since the late 1800s, while body height has increased 5.6 percent and femur length has increased only about 2 percent. Also, skull height has continued to change, whereas the increase in body height has recently slowed or stopped.

The scientists also noted changes that illustrate our population is maturing sooner. This is reflected in the earlier closing of a separation in the bone structure of the skull called the spheno-occipital synchondrosis, which in the past was thought to fuse at about age twenty. Richard Jantz and Natalie Shirley, an adjunct assistant professor in the FAC, have found the bone is fusing much earlier — 14 for girls and 16 for boys.

America's obesity epidemic is the latest development that could affect skeletal shape, but its precise effects are unclear.

“This might affect skull shape by changing the hormonal environment, which in turn could affect timing of growth and maturation,” said Richard Jantz. “We know it has an effect on the long bones by increasing muscle attachment areas, increasing arthritis at certain joints, especially the knee, and increasing the weight bearing capacity.”

The research only assessed Americans of European ancestry because they provided the largest sample sizes to work with. Richard Jantz said changes in skeletal structure are taking place in many parts of the world, but tend to be less studied. He said research has uncovered shifts in skull shape in Europe though it is not as dramatic as seen in the U.S.

The findings were presented on April 14 in Portland, Ore., at the annual meeting of the American Association of Physical Anthropologists.

The U.S. Has Fallen Behind in Numerical Weather Prediction: Part I

March 28, 2012 – 05:00 AM
By Dr. Cliff Mass (Twitter @CliffMass)

It’s a national embarrassment. It has resulted in large unnecessary costs for the U.S. economy and needless endangerment of our citizens. And it shouldn’t be occurring.

What am I talking about? The third rate status of numerical weather prediction in the U.S. It is a huge story, an important story, but one the media has not touched, probably from lack of familiarity with a highly technical subject. And the truth has been buried or unavailable to those not intimately involved in the U.S. weather prediction enterprise. This is an issue I have mentioned briefly in previous blogs, and one many of you have asked to learn more about. It’s time to discuss it.

Weather forecasting today is dependent on numerical weather prediction, the numerical solution of the equations that describe the atmosphere. The technology of weather prediction has improved dramatically during the past decades as faster computers, better models, and much more data (mainly satellites) have become available.

Supercomputers are used for numerical weather prediction.

U.S. numerical weather prediction has fallen to third or fourth place worldwide, with the clear leader in global numerical weather prediction (NWP) being the European Center for Medium Range Weather Forecasting (ECMWF). And we have also fallen behind in ensembles (using many models to give probabilistic prediction) and high-resolution operational forecasting. We used to be the world leader decades ago in numerical weather prediction: NWP began and was perfected here in the U.S. Ironically, we have the largest weather research community in the world and the largest collection of universities doing cutting-edge NWP research (like the University of Washington!). Something is very, very wrong and I will talk about some of the issues here. And our nation needs to fix it.

But to understand the problem, you have to understand the competition and the players. And let me apologize upfront for the acronyms.

In the U.S., numerical weather prediction mainly takes place at the National Weather Service’s Environmental Modeling Center (EMC), a part of NCEP (National Centers for Environmental Prediction). They run a global model (GFS) and regional models (e.g., NAM).

The Europeans banded together decades ago to form the European Center for Medium-Range Weather Forecasting (ECMWF), which runs a very good global model. Several European countries run regional models as well.

The United Kingdom Met Office (UKMET) runs an excellent global model and regional models. So does the Canadian Meteorological Center (CMC).

There are other major global NWP centers such as the Japanese Meteorological Agency (JMA), the U.S. Navy (FNMOC), the Australian center, one in Beijing, among others. All of these centers collect worldwide data and do global NWP.

The problem is that both objective and subjective comparisons indicate that the U.S. global model is number 3 or number 4 in quality, resulting in our forecasts being noticeably inferior to the competition. Let me show you a rather technical graph (produced by the NWS) that illustrates this. This figure shows the quality of the 500 hPa forecast (about halfway up in the troposphere, approximately 18,000 ft) for the day 5 forecast. The top graph is a measure of forecast skill (closer to 1 is better) from 1996 to 2012 for several models (U.S. GFS: black; ECMWF: red; Canadian CMC: blue; UKMET: green; Navy FNG: orange). The bottom graph shows the difference between the U.S. and the other nations' model skill.

You first notice that forecasts are all getting better. That’s good. But you will notice that the most skillful forecast (closest to one) is clearly the red one…the European Center. The second best is the UKMET office. The U.S. (GFS model) is third…roughly tied with the Canadians.

Here is a global model comparison done by the Canadian Meteorological Center, for various global models from 2009-2012 for the 120 h forecast. This is a plot of error (RMSE, root mean square error), again for 500 hPa, and only for North America. Guess who is best again (lowest error)? The European Center (green circle). UKMET is next best, and the U.S. (NCEP, blue triangle) is back in the pack.
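
For readers who want the scores behind these plots, here is a hedged sketch of the two standard measures: the anomaly correlation used in the skill curves (closer to 1 is better) and the RMSE used in the Canadian comparison (lower is better). The toy fields and the 5600 m climatology value are invented for illustration.

```python
# Sketch of two standard forecast-verification scores on made-up 500 hPa fields.
import numpy as np

def anomaly_correlation(forecast, verification, climatology):
    f = forecast - climatology           # forecast anomaly relative to climatology
    o = verification - climatology       # observed anomaly
    return np.sum(f * o) / np.sqrt(np.sum(f**2) * np.sum(o**2))

def rmse(forecast, verification):
    return np.sqrt(np.mean((forecast - verification) ** 2))

rng = np.random.default_rng(0)
climo = np.full((73, 144), 5600.0)                        # toy 500 hPa height climatology (m)
truth = climo + 80.0 * rng.standard_normal(climo.shape)   # toy verifying analysis
fcst  = truth + 40.0 * rng.standard_normal(climo.shape)   # toy imperfect forecast

print("ACC :", round(float(anomaly_correlation(fcst, truth, climo)), 3))
print("RMSE:", round(float(rmse(fcst, truth)), 1), "m")
```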

Let's look at short-term errors. Here is a plot from a paper by Garrett Wedam, Lynn McMurdie and myself comparing various models at 24, 48, and 72 hr for sea level pressure along the West Coast. A bigger bar means more error. Guess who has the lowest errors by far? You guessed it, ECMWF.

I could show you a hundred of these plots, but the answers are very consistent. ECMWF is the worldwide gold standard in global prediction, with the British (UKMET) second. We are third or fourth (with the Canadians). One way to describe this is that the ECMWF model is not only better at the short range, but has about one day of additional predictability: their 8 day forecast is about as skillful as our 7 day forecast. Another way to look at it is that, given the current upward trend in skill, they are 5-7 years ahead of the U.S.

Most forecasters understand the frequent superiority of the ECMWF model. If you read the NWS forecast discussion, which is available online, you will frequently read how they often depend not on the U.S. model, but on the ECMWF. And during the January western WA snowstorm, it was the ECMWF model that first indicated the correct solution. Recently, I talked to the CEO of a weather/climate related firm that was moving up to Seattle. I asked him what model they were using: the U.S. GFS? He laughed; of course not…they were using the ECMWF.

A lot of U.S. firms are using the ECMWF and this is very costly, because the Europeans charge a lot to gain access to their gridded forecasts (hundreds of thousands of dollars per year). Can you imagine how many millions of dollars are being spent by U.S. companies to secure ECMWF predictions? But the cost of the inferior NWS forecasts is far greater than that, because many users cannot afford the ECMWF grids, and the NWS uses its global predictions to drive the higher-resolution regional models, which are NOT duplicated by the Europeans. All of U.S. NWP is dragged down by these second-rate forecasts, and the costs for the nation have to be huge, since so much of our economy is weather sensitive. Inferior NWP must be costing billions of dollars, perhaps many billions.

The question all of you must be wondering is why this bad situation exists. How did the most technologically advanced country in the world, with the largest atmospheric sciences community, end up with third-rate global weather forecasts? I believe I can tell you…in fact, I have been working on this issue for several decades (with little to show for it). Some reasons:

1. The U.S. has inadequate computer power available for numerical weather prediction. The ECMWF is running models with substantially higher resolution than ours because they have more resources available for NWP. This is simply ridiculous: the U.S. can afford the processors and disk space it would take. We are talking about millions or tens of millions of dollars at most to have the hardware we need. Part of the problem has been NWS procurement, which is not forward-leaning and relies on heavy-metal IBM machines at very high cost.

2. The U.S. has used inferior data assimilation. A key aspect of NWP is to assimilate the observations to create a good description of the atmosphere. The European Center, the UKMET Office, and the Canadians use 4DVAR, an advanced approach that requires lots of computer power. We use an older, inferior approach (3DVAR). The Europeans have been using 4DVAR for 20 years! Right now, the U.S. is working on another advanced approach (ensemble-based data assimilation), but it is not operational yet.
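
For the technically inclined, here are the generic textbook cost functions (not the specific operational NCEP or ECMWF formulations). 3DVAR fits the state at a single analysis time; 4DVAR also runs the forecast model M over a time window so that observations at many times constrain the initial state, which is what makes it so much more computationally demanding.

```latex
% 3D-Var: one analysis time
J_{3D}(\mathbf{x}) = \tfrac{1}{2}(\mathbf{x}-\mathbf{x}_b)^{\mathsf T}\mathbf{B}^{-1}(\mathbf{x}-\mathbf{x}_b)
                   + \tfrac{1}{2}\,(H(\mathbf{x})-\mathbf{y})^{\mathsf T}\mathbf{R}^{-1}(H(\mathbf{x})-\mathbf{y})

% 4D-Var: observations y_i spread over a window; the model M_i carries x_0 to time i
J_{4D}(\mathbf{x}_0) = \tfrac{1}{2}(\mathbf{x}_0-\mathbf{x}_b)^{\mathsf T}\mathbf{B}^{-1}(\mathbf{x}_0-\mathbf{x}_b)
                     + \tfrac{1}{2}\sum_{i=0}^{N}\big(H_i(M_i(\mathbf{x}_0))-\mathbf{y}_i\big)^{\mathsf T}\mathbf{R}_i^{-1}\big(H_i(M_i(\mathbf{x}_0))-\mathbf{y}_i\big)

% x_b: background (prior) state, B: background-error covariance,
% H: observation operator, R: observation-error covariance.
```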

3. The NWS numerical weather prediction effort has been isolated and has not taken advantage of the research community. NCEP's Environmental Modeling Center (EMC) is well known for its isolation and "not invented here" attitude. While the European Center has lots of visitors and workshops, such things are a rarity at EMC. Interactions with the university community have been limited, and EMC has been reluctant to use the models and approaches developed by the U.S. research community. (True story: some of the advances in probabilistic weather prediction at the UW have been adopted by the Canadians, while the NWS had little interest.) The National Weather Service has invested very little in extramural research, and when its budget is under pressure, university research is the first thing it reduces. And the U.S. NWP center has been housed in a decaying building outside of D.C., one too small for their needs as well. (Good news: a new building should be available soon.)

4. The NWS approach to weather-related research has been ineffective and divided. The government's weather research is NOT in the NWS, but rather in NOAA. Thus, the head of the NWS and his leadership team do not have authority over the people doing research in support of their mission. This has been an extraordinarily ineffective and wasteful system, with the NOAA research teams doing work that often has marginal benefit for the NWS.

5. Lack of leadership. This is the key issue. The folks in NCEP, NWS, and NOAA leadership have been willing to accept third-class status, providing lots of excuses, but not making the fundamental changes in organization and priority that could deal with the problem. Lack of resources for NWP is another issue…but that is a decision made by NOAA/NWS/Dept of Commerce leadership.

This note is getting long, so I will wait to talk about the other problems in the NWS weather modeling efforts, such as our very poor ensemble (probabilistic) prediction systems. One could write a paper on this…and I may.

I should stress that I am not alone in saying these things. A blue-ribbon panel did a review of NCEP in 2009 and came to similar conclusions (found here). And these issues are frequently noted at conferences, workshops, and meetings.

Let me note that the above is about the modeling aspects of the NWS, NOT the many people in the local forecast offices. This part of the NWS is first-rate. They suffer from inferior U.S. guidance and fortunately have access to the ECMWF global forecasts. And there are some very good people at NCEP that have lacked the resources required and suitable organization necessary to push forward effectively.

This problem at the National Weather Service is not a weather prediction problem alone, but an example of a deeper national malaise. It is related to other U.S. issues, like our inferior K-12 education system. Our nation, gaining world leadership in almost all areas, became smug, self-satisfied, and a bit lazy. We lost the impetus to be the best. We were satisfied to coast. And this attitude must end…in weather prediction, education, and everything else… or we will see our nation sink into mediocrity.

The U.S. can reclaim leadership in weather prediction, but I am not hopeful that things will change quickly without pressure from outside of the NWS. The various weather user communities and our congressional representatives must deliver a strong message to the NWS that enough is enough, that the time for accepting mediocrity is over. And the Weather Service requires the resources to be first rate, something it does not have at this point.

*  *  *

Saturday, April 7, 2012

Lack of Computer Power Undermines U.S. Numerical Weather Prediction (Revised)

In my last blog on this subject, I provided objective evidence of how U.S. numerical weather prediction (NWP), and particularly our global prediction skill, lags behind major international centers, such as the European Centre for Medium Range Weather Forecasting (ECMWF), the UKMET office, and the Canadian Meteorological Center (CMC). I mentioned briefly how the problem extends to high-resolution weather prediction over the U.S. and the use of ensemble (many model runs) weather prediction, both globally and over the U.S. Our nation is clearly number one in meteorological research and we certainly have the knowledge base to lead the world in numerical weather prediction, but for a number of reasons we are not. The cost of inferior weather prediction is huge: in lives lost, injuries sustained, and economic impacts unmitigated. Truly, a national embarrassment. And one we must change.

In this blog, I will describe in some detail one major roadblock in giving the U.S. state-of-the-art weather prediction:  inadequate computer resources.   This situation should clearly have been addressed years ago by leadership in the National Weather Service, NOAA, and the Dept of Commerce, but has not, and I am convinced will not without outside pressure.  It is time for the user community and our congressional representatives to intervene.  To quote Samuel L. Jackson, enough is enough. (…)

In the U.S. we are trying to use fewer computer resources to do more tasks than the global leaders in numerical weather prediction. (Note: U.S. NWP is done by the National Centers for Environmental Prediction's (NCEP) Environmental Modeling Center (EMC).) This chart tells the story:
Courtesy of Bill Lapenta, EMC.
ECMWF does global high resolution and ensemble forecasts, and seasonal climate forecasts.  UKMET office also does regional NWP (England is not a big country!) and regional air quality.  NCEP does all of this plus much, much more (high resolution rapid update modeling, hurricane modeling, etc.).   And NCEP has to deal with prediction over a continental-size country.

If you expected the U.S. to have a lot more computer power to balance all these responsibilities and tasks, you would be very wrong. Right now the U.S. NWS has two IBM supercomputers, each with 4,992 processors (IBM Power6 processors). One computer does the operational work; the other is for backup (research and testing runs are done on the backup). That is about 70 teraflops (trillion floating-point operations per second) for each machine.

NCEP (U.S.) Computer
The European Centre has a newer IBM machine with 8,192 much faster processors that delivers about 182 teraflops (yes, over twice as fast, and with far fewer tasks to do).

The UKMET office, serving a far, far smaller country, has two newer IBM machines, each with 7680 processors for 175 teraflops per machine.

Here is a figure, produced at NCEP, that compares the relative computer power of NCEP's machine with the European Centre's. The shading indicates computational activity and the x-axis for each represents a 24-h period. The relative heights allow you to compare computer resources. Not only does the ECMWF have much more computer power, but they are more efficient in using it, packing useful computations into every available minute.

Courtesy of Bill Lapenta, EMC
Recently, NCEP issued a request for proposals for a replacement computer system. You may not believe this, but the specifications were ONLY for a system at least equal to the one they have. A report in a computer magazine suggests that perhaps this new system (IBM got the contract) might be slightly less powerful (around 150 teraflops) than one of the UKMET office systems, but that is not known at this point.

The Canadians?  They have TWO machines like the European Centre’s!

So what kind of system does NCEP require to serve the nation in a reasonable way?

To start, we need to double the resolution of our global model to bring it into line with ECMWF (they are now at 15 km globally). Such resolution allows the global model to capture regional features (such as our mountains). Doubling horizontal resolution requires 8 times more computer power (twice the grid points in each horizontal direction, plus a correspondingly shorter time step). We need to use better physics (description of things like cloud processes and radiation). Double again. And we need better data assimilation (better use of observations to provide an improved starting point for the model). Double once more. So we need 32 times more computer power for the high-resolution global runs to allow us to catch up with ECMWF. Furthermore, we must do the same thing for the ensembles (running many lower resolution global simulations to get probabilistic information): 32 times more computer resources for that as well (we can use some of the gaps in the schedule of the high resolution runs to fit some of this in; that is what ECMWF does). There are some potential ways NCEP can work more efficiently as well. Right now NCEP runs our global model out to 384 hours four times a day (every six hours). To many of us this seems excessive; perhaps the longest periods (180 hr plus) could be done twice a day. So let's begin with a computer 32 times faster than the current one.
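
The arithmetic of that estimate, under the assumptions stated in the text (doubling resolution in both horizontal directions plus a shorter time step, and a factor of two each for better physics and better data assimilation), is simply:

```python
# Back-of-the-envelope version of the scaling argument above; the individual
# factors are the ones assumed in the text, not a formal cost model.
resolution_factor = 2 * 2 * 2      # x-points * y-points * time steps = 8
physics_factor = 2                 # better physics, per the text
assimilation_factor = 2            # better data assimilation, per the text
total = resolution_factor * physics_factor * assimilation_factor
print(f"required increase in computer power: ~{total}x")   # ~32x
```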

Many workshops and meteorological meetings (such as one on improvements in model physics that was held at NCEP last summer—I was the chair) have made a very strong case that the U.S. requires an ensemble prediction system that runs at 4-km horizontal resolution.  The current national ensemble system has a horizontal resolution about 32 km…and NWS plans to get to about 20 km in a few years…both are inadequate.   Here is an example of the ensemble output (mean of the ensemble members) for the NWS and UW (4km) ensemble systems:  the difference is huge–the NWS system does not even get close to modeling the impacts of the mountains.  It is similarly unable to simulate large convective systems.

Current NWS (NCEP) "high resolution" ensembles (32 km)
4 km ensemble mean from UW system
Let me make one thing clear. Probabilistic prediction based on ensemble forecasts and reforecasting (running models back over past years to get statistics of performance) is the future of weather prediction. The days of giving a single number for, say, temperature at day 5 are over. We need to let people know about uncertainty and probabilities. The NWS needs a massive increase of computer power to do this. It lacks this computer power now and does not seem destined to get it soon.
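
The simplest form of such a probabilistic product is just the fraction of ensemble members exceeding a threshold; the sketch below uses a made-up 10-member ensemble (operational systems would also calibrate against reforecast statistics).

```python
# Minimal, hypothetical illustration of turning an ensemble into a probability.
import numpy as np

members_day5_temp_c = np.array([18.2, 21.5, 19.8, 23.1, 17.4,
                                22.0, 20.6, 24.3, 19.1, 21.9])  # made-up 10 members
threshold = 22.0
prob = np.mean(members_day5_temp_c > threshold)   # fraction of members above threshold
print(f"P(T > {threshold} C at day 5) = {prob:.0%}")   # 20% for this toy ensemble
```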

A real champion within NOAA of the need for more computer power is Tom Hamill, an expert on data assimilation and model post-processing.   He and colleagues have put together a compelling case for more NWS computer resources for NWP.  Read it here.

Back-of-the-envelope calculations indicate that a good first step, 4-km national ensembles, would require about 20,000 processors to run in a timely manner, but it would revolutionize weather prediction in the U.S., including forecasting of convection and in mountainous areas. This high-resolution ensemble effort would meld with data assimilation over the long term.

And then there is running super-high resolution numerical weather prediction to get fine-scale details right.  Here in the NW my group runs a 1.3 km horizontal resolution forecast out twice a day for 48h.   Such capability is needed for the entire country.  It does not exist now due to inadequate computer resources.

The bottom line is that the NWS numerical modeling effort needs a huge increase of computer power to serve the needs of the country–and the potential impacts would be transformative.   We could go from having a third-place effort, which is slipping back into the pack, to a world leader.  Furthermore, the added computer power will finally allow NOAA to complete Observing System Simulation Experiments (OSSEs) and Observing System Experiments (OSEs) to make rational decisions about acquisitions of very expensive satellite systems.  The fact that this is barely done today is really amazing and a potential waste of hundreds of millions of dollars on unnecessary satellite systems.

But to do so will require a major jump in computational power, a jump our nation can easily afford. I would suggest that NWS's EMC should begin by securing at least a 100,000-processor machine, and down the road something considerably larger. Keep in mind my department has about 1,000 processors in our computational clusters, so this is not as large as you might think.

For a country with several billion-dollar weather disasters a year, investment in reasonable computer resources for NWP is obvious.
The cost? Well, I asked Art Mann of Silicon Mechanics (a really wonderful local vendor of computer clusters) to give me a rough quote: using fast AMD chips, you could have such a 100K-core machine for 11 million dollars (and this is without any discount!). OK, this is the U.S. government and they like expensive, heavy metal machines, so let's go for 25 million dollars. The National Center for Atmospheric Research (NCAR) is getting a new machine with around 75,000 processors and the cost will be around 25-35 million dollars. NCEP will want two machines, so let's budget 60 million dollars. We spend this much money on a single jet fighter, but we can't invest this amount to greatly improve forecasts and public safety in the U.S.? We have machines far larger than this for breaking codes, doing simulations of thermonuclear explosions, and simulating climate change.

Yes, that is a lot of money, but I suspect the cost of the machine would be paid back within a few months by improved forecasts.   Last year we had quite a few (over ten) billion-dollar storms; imagine the benefits of forecasting even a few of them better.  Or the benefits to the wind energy and utility industries, or to U.S. aviation, of even modestly improved forecasts.   And there is no doubt such computer resources would improve weather prediction.  The list of benefits is nearly endless.   Recent estimates suggest that routine weather events cost the U.S. economy nearly half a trillion dollars a year.  Add to that hurricanes, tornadoes, floods, and other extreme weather.  The business case is there.

As someone with an insider’s view of the process, it is clear to me that the current players are not going to move effectively without some external pressure.  In fact, the budgetary pressure on the NWS is very intense right now, and it is cutting away muscle and bone at this point (such as reducing IT staff in the forecast offices by over 120 people and cutting back on extramural research).  I believe it is time for weather-sensitive industries and local government, together with the general public, to let NOAA management and our congressional representatives know that this acute problem needs to be addressed, and addressed soon.   We are acquiring huge computer resources for climate simulations, but only a small fraction of that for weather prediction, which can clearly save lives and help the economy.  Enough is enough.

Posted by Cliff Mass Weather Blog at 8:38 PM

Doubtful significance (World Economics Association)

by G M Peter Swann [gmpswann@yahoo.co.uk]
World Economics Association Newsletter 2(2), April 2012, page 6.

In the February issue of this newsletter, Steve Keen (2012) makes some very good points about the use of mathematics in economics. Perhaps we should say that the problem is not so much the use of mathematics as the abuse of mathematics.

A particular issue that worries me is when econometricians make liberal use of assumptions, without realising how strong these are.

Consider the following example. First, you are shown a regression summary of the relationship between Y and X, estimated from 402 observations. The conventional t-statistic for the coefficient on X is 3.0. How would you react to that?

Most economists would remark that t = 3.0 implies significance at the 1% level, which is a strong confirmation of the relationship. Indeed, many researchers mark significance at the 1% level with three stars!

Second, consider the scatter diagram below. This also shows two variables Y and X, and is also based on 402 observations. What does this say about the relationship between Y and X?

Figure 1

I have shown this diagram to several colleagues and students, and typical reactions are either that there is no relationship, or that the relationship could be almost anything.
But the surprising fact is that the data in Figure 1 are exactly the same data as used to estimate the regression summary described earlier. How can such an amorphous scatter of points represent a statistically significant relationship? It is the result of a standard assumption of OLS regression: that the explanatory variable(s) X is/are independent of the noise term u.

So long as this independence assumption is true, we can estimate the relationship with surprising precision. To see this, rewrite the conventional t-statistic as

t = ψ √(N − k),

where ψ is a signal-to-noise ratio (describing the clarity of the scatter-plot) and N − k is the number of degrees of freedom (Swann, 2012). This formula can be used for bivariate and multivariate models.

In Figure 1, ψ is 0.15, which is quite low, but N − k = 400, which is large enough to make t = 3.0. More generally, even if the signal-to-noise ratio is very low, so that the relationship between Y and X is imperceptible from a scatter-plot, we can always obtain a significant t-statistic, so long as we have a large enough number of observations and so long as the independence assumption is true. But there is something doubtful about this ‘significance’.
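A minimal simulation, not part of Swann’s article, illustrates the point. Assuming Python with numpy and statsmodels, data are generated with a signal-to-noise ratio near 0.15 and 402 observations, and OLS still reports a t-statistic near 3 even though a scatter-plot of the data looks like a shapeless cloud:

```python
# Sketch: a "significant" t-statistic from an amorphous scatter.
# Assumes numpy and statsmodels are installed; all numbers are illustrative.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)

N = 402                       # observations, as in the article's example
psi = 0.15                    # target signal-to-noise ratio
x = rng.normal(size=N)
u = rng.normal(size=N)        # noise drawn independently of x (the key OLS assumption)
beta = psi * u.std() / x.std()
y = beta * x + u              # a real but very weak relationship

model = sm.OLS(y, sm.add_constant(x)).fit()
print(f"t-statistic on x: {model.tvalues[1]:.2f}")   # typically close to psi * sqrt(N - k), i.e. about 3
print(f"p-value:          {model.pvalues[1]:.4f}")   # typically 'significant' at conventional levels
```

Plotting y against x from such a simulation produces a cloud much like Figure 1, which is exactly the disconnect the article is pointing at.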

Is the independence assumption justified? In a context where data are noisy, where rough proxy variables are used, where endogeneity is pervasive, and so on, it does seem an exceptionally strong assumption.

What happens if we relax the independence assumption? When the signal to noise ratio is very low, the estimated relationship depends entirely on the assumption that replaces it. Swann (2012) shows that the relationship in Figure 1 could indeed be almost anything – depending on what we assume about the noise variable(s).

Some have suggested that this is not a problem in practice, because signal to noise ratios are usually large enough to avoid this difficulty. But, on the contrary, some evidence suggests the problem is generally worse than indicated by Figure 1.

Swann (2012) examined 100 econometric studies taken from 20 leading economics journals, yielding a sample of 2220 parameter estimates and the corresponding signal to noise ratios. Focussing on the parameter estimates that are significant (at the 5% level or better), we find that almost 80% of those have a signal to noise ratio even lower than that in Figure 1.

In summary, it appears that the problem of ‘doubtful significance’ is pervasive. The great majority of ‘significant relationships’ in this sample would be imperceptible from the corresponding scatter-plot. The ‘significance’ indicated by a high t-statistic derives from the large number of observations and the (very strong) independence assumption.

References

Keen S. (2012) “Maths for Pluralist Economics”, World Economics Association Newsletter 2 (1), 10-11

Swann G.M.P. (2012) Doubtful Significance, Working paper available at: https://sites.google.com/site/gmpswann/doubtful-significance

[Editor’s note: If you are interested in this topic, you may also wish to read D.A. Hollanders, “Five methodological fallacies in applied econometrics”, real-world economics review, issue no. 57, 6 September 2011, pp. 115-126, http://www.paecon.net/PAEReview/issue57/Hollanders57.pdf]

The languages of psychosis (Revista Fapesp)

A mathematical approach reveals the differences between the speech of people with mania and people with schizophrenia

CARLOS FIORAVANTI | Issue 194 – April 2012

How the study was done: interviewees recounted a dream, and the interviewer converted the most important words into points and the sentences into arrows in order to examine the structure of the language

For psychiatrists, and for most people, it is relatively easy to tell a person with psychosis apart from someone with no previously diagnosed mental disorder: those in the first group report delusions and hallucinations and sometimes present themselves as messiahs who will save the world. Distinguishing between the two types of psychosis, mania and schizophrenia, is not so simple, however, and demands a good deal of personal experience, knowledge and intuition from specialists. A mathematical approach developed at the Brain Institute of the Federal University of Rio Grande do Norte (UFRN) may make this differentiation, which is fundamental for establishing the most appropriate treatment for each illness, easier by quantitatively assessing the differences in the verbal language structures used by people with mania and people with schizophrenia.

The analysis strategy, based on graph theory, with words represented as points and the sequence between them in sentences represented as arrows, indicated that people with mania are much more verbose and repetitive than people with schizophrenia, who are generally laconic and focused on a single subject, without letting their thoughts wander. “Recurrence is a hallmark of the speech of patients with mania, who tell the same thing three or four times, whereas patients with schizophrenia say objectively what they have to say, without digressing, and their speech is poor in meanings,” says psychiatrist Natália Mota, a researcher at the institute. “In each group,” says Sidarta Ribeiro, the institute’s director, “the number of words, the structure of the language and other indicators are completely distinct.”

They believe they have taken the first steps toward an objective way of differentiating the two forms of psychosis, much as a blood count is used to confirm an infectious disease, provided that the next tests, with a larger sample of participants, confirm the consistency of the approach and that physicians agree to work with an assistant of this kind. The comparative tests described in an article recently published in the journal PLoS One indicated that the new approach achieves diagnostic accuracy on the order of 93%, whereas the psychometric scales currently in use, based on symptom-assessment questionnaires, reach only 67%. “They are complementary methods,” says Natália. “Psychometric scales and physicians’ experience remain indispensable.”

“The result is quite simple, even for those who do not understand mathematics,” says physicist Mauro Copelli of the Federal University of Pernambuco (UFPE), who took part in the study. The speech of people with mania appears as a tangle of points and lines, while that of people with schizophrenia appears as a straight line with few points. Graph theory, which produced these diagrams, has been used for centuries to examine, for example, the routes by which a traveler could visit all the cities in a region. More recently, it has been used to optimize air traffic, treating airports as a set of points, or nodes, connected to one another by airplanes.

“The first time I ran the graph program, the differences in language leapt out at me,” says Natália. In 2007, when she finished medical school and began her psychiatry residency at the UFRN hospital, Natália noticed that many differential diagnoses of mania and schizophrenia depended on the personal experience and subjective judgments of physicians (those who worked more with schizophrenia patients tended to find more cases of schizophrenia and fewer of mania), and there was often no consensus. It was already known that people with mania talk more and stray from the central topic much more easily than people with schizophrenia, but that seemed too generic to her.

At a scientific conference in Fortaleza in 2008 she spoke with Copelli, who already collaborated with Ribeiro and encouraged her to work with graphs. At first she resisted, because of her limited familiarity with mathematics, but soon the new theory struck her as simple and practical.

To carry the work forward, she recorded and, with the help of Nathália Lemos and Ana Cardina Pieretti, transcribed interviews with 24 people (eight with mania, eight with schizophrenia and eight with no diagnosed mental disorder), who were asked to recount a dream; any comment outside that theme was considered a flight of imagination, quite common among people with mania.

“Even in the transcripts, the accounts of the patients with mania were clearly longer than those of the patients with schizophrenia,” she says. She then eliminated less important elements such as articles and prepositions, divided each sentence into subject, verb and objects, represented as points or nodes, with the sequence between them in the sentence represented as arrows joining two nodes, and flagged the sentences that did not refer to the central theme of the account (the recent dream she had asked the interviewees to relate) and therefore marked a deviation of thought, common among people with mania.

A graph program downloaded free from the internet indicated the characteristics relevant for analysis, or attributes, and represented the main differences in speech among the participants, such as the number of nodes, the extent and density of the connections between points, recurrence, verbosity (or logorrhea) and deviation from the central topic. “It is super simple,” Natália assures. In validating and analyzing the results, she also had the collaboration of Osame Kinouchi of the University of São Paulo (USP) in Ribeirão Preto and Guillermo Cecchi of IBM’s Computational Biology Center in the United States.
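As a rough sketch of the kind of measures described above (not the actual program the team used), a speech graph can be built in Python with the networkx library: each word becomes a node, each consecutive pair of words becomes a directed edge, and attributes such as node count, edge count and recurrence fall out of the graph. The example transcripts below are invented.

```python
# Sketch of a speech graph: words as nodes, consecutive words as directed edges.
# Assumes the networkx library; the transcripts are invented for illustration.
import networkx as nx

def speech_graph(words):
    """Build a directed multigraph from an ordered list of words."""
    g = nx.MultiDiGraph()
    g.add_nodes_from(set(words))
    g.add_edges_from(zip(words, words[1:]))   # an arrow from each word to the next
    return g

# Invented example transcripts (content words only, articles and prepositions dropped)
mania_like = ("dreamed house house big dreamed river house dreamed "
              "river big house dreamed dreamed house river").split()
schizophrenia_like = "dreamed house river slept".split()

for label, words in [("mania-like", mania_like), ("schizophrenia-like", schizophrenia_like)]:
    g = speech_graph(words)
    repeated = g.number_of_edges() - nx.DiGraph(g).number_of_edges()  # recurrence: duplicated arrows
    print(f"{label}: {g.number_of_nodes()} nodes, {g.number_of_edges()} edges, {repeated} repeated edges")
```

The attributes used in the study were richer than this, but the principle is the same: the structure of the word network, rather than the content of the words, carries the diagnostic signal.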

The result: people with mania scored higher than people with schizophrenia on almost every item assessed. “The logorrhea typical of patients with mania does not result only from an excess of words, but from a discourse that keeps returning to the same topic, in comparison with the schizophrenia group,” she observed. Curiously, the participants in the control group, with no diagnosed mental disorder, showed speech structures of two types, sometimes redundant like the participants with mania, sometimes terse like those with schizophrenia, reflecting differences in their personalities or in their motivation, at that moment, to talk more or less. “That pathology shapes speech is nothing new,” she says. “Psychiatrists are trained to recognize these differences, but however experienced they may be, they can hardly say that a mania patient’s recurrence is 28% lower.”

“The interdisciplinary environment of the institute was essential for carrying out this study, because every day I was exchanging ideas with people from other fields. Nivaldo Vasconcelos, a computer engineer, helped me a lot,” she says. The Brain Institute, in operation since 2007, currently has 13 professors, 22 undergraduate students, 42 graduate students, 8 postdoctoral researchers and 30 technicians. “Having overcome the initial difficulties, we managed to put together a group of young, talented researchers,” Ribeiro celebrates. “The house we are in now has a large garden, and many nights we stay there until two or three in the morning, talking about science and drinking chimarrão.”

Scientific article
MOTA, N.B. et al. Speech graphs provide a quantitative measure of thought disorder in psychosis. PLoS ONE (in press).

The ‘perfect chaos’ of π (The Guardian)

One of the most important numbers is irrational

GRRLSCIENTIST, by The Guardian

π has fascinated mathematicians, engineers and other people for centuries. It is a mathematical constant defined as the ratio of a circle’s circumference (C) to its diameter (d): π = C/d.

This also explains why and how the number got its name: the lowercase Greek letter π was first adopted in 1706 as an abbreviation because it is the first letter of the Greek word for “perimeter”, specifically of a circle. The symbol is convenient because π is an irrational number, meaning that it cannot be expressed as a ratio a/b where a and b are integers; as a consequence, its decimal digits never terminate and never settle into an infinitely repeating sequence.

Even though we know that the decimal expansion of π begins approximately 3.14159, we do not know all of its digits: as of October 2011, more than 10 trillion digits of π had been computed, and the occurrence of those digits appears to be nearly perfectly statistically random. It is widely believed, though not yet proven, that any given finite sequence of digits occurs somewhere in π — which is the premise of this fun little π search engine. For example, my 8-digit university student ID number pops up after 3.24 million decimal places. My mobile number pops up after 9.69 million decimal places, although it does not show up within the first 200 million digits of π when I add the country and area codes. Where do your digits pop up in π?
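Here is a minimal sketch of the same kind of digit search such an engine performs, assuming Python with the mpmath library and limited to the first 100,000 decimal digits (the online engine searches far more):

```python
# Search for a digit sequence among the first decimal digits of pi.
# Assumes the mpmath library; 100,000 digits keeps the sketch fast.
from mpmath import mp

DIGITS = 100_000
mp.dps = DIGITS + 10                                      # working precision, in decimal places
pi_digits = mp.nstr(mp.pi, DIGITS).replace("3.", "", 1)   # keep only the digits after the decimal point

def find_in_pi(sequence: str) -> int:
    """1-based position of the first occurrence of `sequence` after the decimal point, or -1."""
    pos = pi_digits.find(sequence)
    return pos + 1 if pos >= 0 else -1

print(find_in_pi("1415"))     # 1, since pi = 3.1415...
print(find_in_pi("999999"))   # 762, the run of six nines known as the Feynman point
```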

Many formulae in mathematics, science, and engineering involve π, which makes it one of the most important mathematical constants. But who first rigorously calculated the value for this irrational number and how was it done? This interesting video explores those questions in more detail:

Those of you who enjoy music probably already know that there’s a song about π by the amazing British singer and songwriter, Kate Bush, where she sings its digits.

You can’t do the math without the words (University of Miami Press Release)

University of Miami anthropological linguist studies the anumeric language of an Amazonian tribe; the findings add new perspective to the way people acquire knowledge, perception and reasoning

Marie Guma Diaz
University of Miami

 VIDEO: Caleb Everett, assistant professor in the department of anthropology at the University of Miami College of Arts and Sciences, talks about the unique insight we gain about people by studying…

CORAL GABLES, FL (February 20, 2012)–Most people learn to count when they are children. Yet surprisingly, not all languages have words for numbers. A recent study published in the journal Cognitive Science shows that a few tongues lack number words and that, as a result, people in these cultures have a difficult time performing common quantitative tasks. The findings add new insight into the way people acquire knowledge, perception and reasoning.

The Piraha people of the Amazon are a group of about 700 semi-nomadic people living in small villages of about 10-15 adults along the Maici River, a tributary of the Amazon. According to University of Miami (UM) anthropological linguist Caleb Everett, the Piraha are surprisingly unable to represent exact amounts. Their language contains just three imprecise words for quantities: hòi means “small size or amount,” hoì means “somewhat larger amount,” and baàgiso means “to cause to come together,” or “many.” Linguists refer to languages that do not have number-specific words as anumeric.

“The Piraha is a really fascinating group because they are really only one or two groups in the world that are totally anumeric,” says Everett, assistant professor in the Department of Anthropology at the UM College of Arts and Sciences. “This is maybe one of the most extreme cases of language actually restricting how people think.”

His study “Quantity Recognition Among speakers of an Anumeric Language” demonstrates that number words are essential tools of thought required to solve even the simplest quantitative problems, such as one-to-one correspondence.

“I’m interested in how the language you speak affects the way that you think,” says Everett. “The question here is what tools like number words really allows us to do and how they change the way we think about the world.”

The work was motivated by contradictory results on the numerical performance of the Piraha. An earlier article reported that the people were incapable of performing simple numeric tasks with quantities greater than three, while another showed that they were capable of accomplishing such tasks.

Everett repeated all the field experiments of the two previous studies. The results indicated that the Piraha could not consistently perform simple mathematical tasks. For example, one test involved 14 adults in one village who were presented with lines of spools of thread and asked to create a matching line of empty rubber balloons. The participants were not able to perform the one-to-one correspondence when the quantities were greater than two or three.

The study provides a simple explanation for the controversy. Unbeknownst to other researchers, the villagers who participated in one of the previous studies had received basic numerical training from Keren Madora, an American missionary who has worked with the indigenous people of the Amazon for 33 years and is a co-author of this study. “Her knowledge of what had happened in that village was crucial. I understood then why they got the results that they did,” Everett says.

Madora used the Piraha language to create number words. For instance, she used the words “all the sons of the hand” to indicate the number four. The introduction of number words into the village provides a reasonable explanation for the disagreement between the previous studies.

The findings support the idea that language is a key component in processes of the mind. “When they’ve been introduced to those words, their performance improved, so it’s clearly a linguistic effect, rather than a generally cultural factor,” Everett says. The study highlights the unique insight we gain about people and society by studying mother languages.

“Preservation of mother tongues is important because languages can tell us about aspects of human history, human cognition, and human culture that we would not have access to if the languages are gone,” he says. “From a scientific perspective I think it’s important, but it’s most important from the perspective of the people, because they lose a lot of their cultural heritage when their languages die.”

Will one researcher’s discovery deep in the Amazon destroy the foundation of modern linguistics? (The Chronicle of Higher Education)

The Chronicle Review

By Tom Bartlett

March 20, 2012

Angry Words


A Christian missionary sets out to convert a remote Amazonian tribe. He lives with them for years in primitive conditions, learns their extremely difficult language, risks his life battling malaria, giant anacondas, and sometimes the tribe itself. In a plot twist, instead of converting them he loses his faith, morphing from an evangelist trying to translate the Bible into an academic determined to understand the people he’s come to respect and love.

Along the way, the former missionary discovers that the language these people speak doesn’t follow one of the fundamental tenets of linguistics, a finding that would seem to turn the field on its head, undermine basic assumptions about how children learn to communicate, and dethrone the discipline’s long-reigning king, who also happens to be among the most well-known and influential intellectuals of the 20th century.

It feels like a movie, and it may in fact turn into one—there’s a script and producers on board. It’s already a documentary that will air in May on the Smithsonian Channel. A play is in the works in London. And the man who lived the story, Daniel Everett, has written two books about it. His 2008 memoir Don’t Sleep, There Are Snakes, is filled with Joseph Conrad-esque drama. The new book, Language: The Cultural Tool, which is lighter on jungle anecdotes, instead takes square aim at Noam Chomsky, who has remained the pre-eminent figure in linguistics since the 1960s, thanks to the brilliance of his ideas and the force of his personality.

But before any Hollywood premiere, it’s worth asking whether Everett actually has it right. Answering that question is not straightforward, in part because it hinges on a bit of grammar that no one except linguists ever thinks about. It’s also made tricky by the fact that Everett is the foremost expert on this language, called Pirahã, and one of only a handful of outsiders who can speak it, making it tough for others to weigh in and leading his critics to wonder aloud if he has somehow rigged the results.

More than any of that, though, his claim is difficult to verify because linguistics is populated by a deeply factionalized group of scholars who can’t agree on what they’re arguing about and who tend to dismiss their opponents as morons or frauds or both. Such divisions exist, to varying degrees, in all disciplines, but linguists seem uncommonly hostile. The word “brutal” comes up again and again, as do “spiteful,” “ridiculous,” and “childish.”

With that in mind, why should anyone care about the answer? Because it might hold the key to understanding what separates us from the rest of the animals.

Imagine a linguist from Mars lands on Earth to survey the planet’s languages (presumably after obtaining the necessary interplanetary funding). The alien would reasonably conclude that the languages of the world are mostly similar with interesting but relatively minor variations.

As science-fiction premises go it’s rather dull, but it roughly illustrates Chomsky’s view of linguistics, known as Universal Grammar, which has dominated the field for a half-century. Chomsky is fond of this hypothetical and has used it repeatedly for decades, including in a 1971 discussion with Michel Foucault, during which he added that “this Martian would, if he were rational, conclude that the structure of the knowledge that is acquired in the case of language is basically internal to the human mind.”

In his new book, Everett, now dean of arts and sciences at Bentley University, writes about hearing Chomsky bring up the Martian in a lecture he gave in the early 1990s. Everett noticed a group of graduate students in the back row laughing and exchanging money. After the talk, Everett asked them what was so funny, and they told him they had taken bets on precisely when Chomsky would once again cite the opinion of the linguist from Mars.

The somewhat unkind implication is that the distinguished scholar had become so predictable that his audiences had to search for ways to amuse themselves. Another Chomsky nugget is the way he responds when asked to give a definition of Universal Grammar. He will sometimes say that Universal Grammar is whatever made it possible for his granddaughter to learn to talk but left the world’s supply of kittens and rocks speechless—a less-than-precise answer. Say “kittens and rocks” to a cluster of linguists and eyes are likely to roll.

Chomsky’s detractors have said that Universal Grammar is whatever he needs it to be at that moment. By keeping it mysterious, they contend, he is able to dodge criticism and avoid those who are gunning for him. It’s hard to murder a phantom.

Everett’s book is an attempt to deliver, if not a fatal blow, then at least a solid right cross to Universal Grammar. He believes that the structure of language doesn’t spring from the mind but is instead largely formed by culture, and he points to the Amazonian tribe he studied for 30 years as evidence. It’s not that Everett thinks our brains don’t play a role—they obviously do. But he argues that just because we are capable of language does not mean it is necessarily prewired. As he writes in his book: “The discovery that humans are better at building human houses than porpoises tells us nothing about whether the architecture of human houses is innate.”

The language Everett has focused on, Pirahã, is spoken by just a few hundred members of a hunter-gatherer tribe in a remote part of Brazil. Everett got to know the Pirahã in the late 1970s as an American missionary. With his wife and kids, he lived among them for months at a time, learning their language from scratch. He would point to objects and ask their names. He would transcribe words that sounded identical to his ears but had completely different meanings. His progress was maddeningly slow, and he had to deal with the many challenges of jungle living. His story of taking his family, by boat, to get treatment for severe malaria is an epic in itself.

His initial goal was to translate the Bible. He got his Ph.D. in linguistics along the way and, in 1984, spent a year studying at the Massachusetts Institute of Technology in an office near Chomsky’s. He was a true-blue Chomskyan then, so much so that his kids grew up thinking Chomsky was more saint than professor. “All they ever heard about was how great Chomsky was,” he says. He was a linguist with a dual focus: studying the Pirahã language and trying to save the Pirahã from hell. The second part, he found, was tough because the Pirahã are rooted in the present. They don’t discuss the future or the distant past. They don’t have a belief in gods or an afterlife. And they have a strong cultural resistance to the influence of outsiders, dubbing all non-Pirahã “crooked heads.” They responded to Everett’s evangelism with indifference or ridicule.

As he puts it now, the Pirahã weren’t lost, and therefore they had no interest in being saved. They are a happy people. Living in the present has been an excellent strategy, and their lack of faith in the divine has not hindered them. Everett came to convert them, but over many years found that his own belief in God had melted away.

So did his belief in Chomsky, albeit for different reasons. The Pirahã language is remarkable in many respects. Entire conversations can be whistled, making it easier to communicate in the jungle while hunting. Also, the Pirahã don’t use numbers. They have words for amounts, like a lot or a little, but nothing for five or one hundred. Most significantly, for Everett’s argument, he says their language lacks what linguists call “recursion”—that is, the Pirahã don’t embed phrases in other phrases. They instead speak only in short, simple sentences.

In a recursive language, additional phrases and clauses can be inserted in a sentence, complicating the meaning, in theory indefinitely. For most of us, the lack of recursion in a little-known Brazilian language may not seem terribly interesting. But when Everett published a paper with that finding in 2005, the news created a stir. There were magazine articles and TV appearances. Fellow linguists weighed in, if only in some cases to scoff. Everett had put himself and the Pirahã on the map.
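A toy illustration of what recursion buys a grammar (a sketch in Python, not anyone's linguistic formalism): a clause may embed another clause as its complement, so sentences can in principle be nested indefinitely.

```python
# Toy recursion: a sentence may contain another sentence as its complement.
import random

NAMES = ["Everett", "Chomsky", "the professor", "Gibson"]

def sentence(depth: int) -> str:
    """Build a claim; with depth > 0, embed another sentence inside it."""
    speaker = random.choice(NAMES)
    if depth == 0:
        return f"{speaker} is wrong"                        # a simple, non-embedded clause
    return f"{speaker} said that {sentence(depth - 1)}"     # recursive step: a clause inside a clause

for d in range(4):
    print(sentence(d))
# e.g. "Gibson said that the professor said that Everett said that Chomsky is wrong"
```

A language without recursion, as Everett describes Pirahã, would allow only the depth-zero sentences; each further layer of embedding would have to be spread across separate sentences.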

His paper might have received a shrug if Chomsky had not recently co-written a paper, published in 2002, that said (or seemed to say) that recursion was the single most important feature of human language. “In particular, animal communication systems lack the rich expressive and open-ended power of human language (based on humans’ capacity for recursion),” the authors wrote. Elsewhere in the paper, the authors wrote that the faculty of human language “at minimum” contains recursion. They also deemed it the “only uniquely human component of the faculty of language.”

In other words, Chomsky had finally issued what seemed like a concrete, definitive statement about what made human language unique, exposing a possible vulnerability. Before Everett’s paper was published, there had already been back and forth between Chomsky and the authors of a response to the 2002 paper, Ray Jackendoff and Steven Pinker. In the wake of that public disagreement, Everett’s paper had extra punch.

It’s been said that if you want to make a name for yourself in modern linguistics, you have to either align yourself with Chomsky or seek to destroy him. Either you are desirous of his approval or his downfall. With his 2005 paper, Everett opted for the latter course.

Because the pace of academic debate is just this side of glacial, it wasn’t until June 2009 that the next major chapter in the saga was written. Three scholars who are generally allies of Chomsky published a lengthy paper in the journal Language dissecting Everett’s claims one by one. What he considered unique features of Pirahã weren’t unique. What he considered “gaps” in the language weren’t gaps. They argued this in part by comparing Everett’s recent paper to work he published in the 1980s, calling it, slightly snidely, his earlier “rich material.” Everett wasn’t arguing with Chomsky, they claimed; he was arguing with himself. Young Everett thought Pirahã had recursion. Old Everett did not.

Everett’s defense was, in so many words, to agree. Yes, his earlier work was contradictory, but that’s because he was still under Chomsky’s sway when he wrote it. It’s natural, he argued, even when doing basic field work, cataloging the words of a language and the stories of a people, to be biased by your theoretical assumptions. Everett was a Chomskyan through and through, so much so that he had written the MSN Encarta encyclopedia entry on him. But now, after more years with the Pirahã, the scales had fallen from his eyes, and he saw the language on its own terms rather than those he was trying to impose on it.

David Pesetsky, a linguistics professor at MIT and one of the authors of the critical Language paper, thinks Everett was trying to gin up a “Star Wars-level battle between himself and the forces of Universal Grammar,” presumably with Everett as Luke Skywalker and Chomsky as Darth Vader.

Contradicting Everett meant getting into the weeds of the Pirahã language, a language that Everett knew intimately and his critics did not. “Most people took the attitude that this wasn’t worth taking on,” Pesetsky says. “There’s a junior-high-school corridor, two kids are having a fight, and everyone else stands back.” Everett wrote a lengthy reply that Pesetsky and his co-authors found unsatisfying and evasive. “The response could have been ‘Yeah, we need to do this more carefully,'” says Pesetsky. “But he’s had seven years to do it more carefully and he hasn’t.”

Critics haven’t just accused Everett of inaccurate analysis. He’s the sole authority on a language that he says changes everything. If he wanted to, they suggest, he could lie about his findings without getting caught. Some were willing to declare him essentially a fraud. That’s what one of the authors of the 2009 paper, Andrew Nevins, now at University College London, seems to believe. When I requested an interview with Nevins, his reply read, “I may be being glib, but it seems you’ve already analyzed this kind of case!” Below his message was a link to an article I had written about a Dutch social psychologist who had admitted to fabricating results, including creating data from studies that were never conducted. In another e-mail, after declining to expand on his apparent accusation, Nevins wrote that the “world does not need another article about Dan Everett.”

In 2007, Everett heard reports of a letter accusing him of racism, signed by Cilene Rodrigues, who is Brazilian and who co-wrote the paper with Pesetsky and Nevins. According to Everett, he got a call from a source informing him that Rodrigues, an honorary research fellow at University College London, had sent a letter to the organization in Brazil that grants permission for researchers to visit indigenous groups like the Pirahã. He then discovered that the organization, called FUNAI, the National Indian Foundation, would no longer grant him permission to visit the Pirahã, whom he had known for most of his adult life and who remain the focus of his research.

He still hasn’t been able to return. Rodrigues would not respond directly to questions about whether she had signed such a letter, nor would Nevins. Rodrigues forwarded an e-mail from another linguist who has worked in Brazil, which speculates that Everett was denied access to the Pirahã because he did not obtain the proper permits and flouted the law, accusations Everett calls “completely false” and “amazingly nasty lies.”

Whatever the reason for his being blocked, the question remains: Is Everett’s work racist? The accusation goes that because Everett says that the Pirahã do not have recursion, and that all human languages supposedly have recursion, Everett is asserting that the Pirahã are less than human. Part of this claim is based on an online summary, written by a former graduate student of Everett’s, that quotes traders in Brazil saying the Pirahã “talk like chickens and act like monkeys,” something Everett himself never said and condemns. The issue is sensitive because the Pirahã, who eschew the trappings of modern civilization and live the way their forebears lived for thousands of years, are regularly denigrated by their neighbors in the region as less than human. The fact that Everett is American, not Brazilian, lends the charge added symbolic weight.

When you read Everett’s two books about the Pirahã, it is nearly impossible to think that he believes they are inferior. In fact, he goes to great lengths not to condescend and offers defenses of practices that outsiders would probably find repugnant. In one instance he describes, a Pirahã woman died, leaving behind a baby that the rest of the tribe thought was too sick to live. Everett cared for the infant. One day, while he was away, members of the tribe killed the baby, telling him that it was in pain and wanted to die. He cried, but didn’t condemn, instead defending in the book their seemingly cruel logic.

Likewise, the Pirahã’s aversion to learning agriculture, or preserving meat, or the fact that they show no interest in producing artwork, is portrayed by Everett not as a shortcoming but as evidence of the Pirahã’s insistence on living in the present. Their nonhierarchical social system seems to Everett fair and sensible. He is critical of his own earlier attempts to convert the Pirahã to Christianity as a sort of “colonialism of the mind.” If anything, Everett is more open to a charge of romanticizing the Pirahã culture.

Other critics are more measured but equally suspicious. Mark Baker, a linguist at Rutgers University at New Brunswick, who considers himself part of Chomsky’s camp, mentions Everett’s “vested motive” in saying that the Pirahã don’t have recursion. “We always have to be a little careful when we have one person who has researched a language that isn’t accessible to other people,” Baker says. He is dubious of Everett’s claims. “I can’t believe it’s true as described,” he says.

Chomsky hasn’t exactly risen above the fray. He told a Brazilian newspaper that Everett was a “charlatan.” In the documentary about Everett, Chomsky raises the possibility, without saying he believes it, that Everett may have faked his results. Behind the scenes, he has been active as well. According to Pesetsky, Chomsky asked him to send an e-mail to David Papineau, a professor of philosophy at King’s College London, who had written a positive, or at least not negative, review of Don’t Sleep, There Are Snakes. The e-mail complained that Papineau had misunderstood recursion and was incorrectly siding with Everett. Papineau thought he had done nothing of the sort. “For people outside of linguistics, it’s rather surprising to find this kind of protection of orthodoxy,” Papineau says.

And what if the Pirahã don’t have recursion? Rather than ferreting out flaws in Everett’s work as Pesetsky did, Chomsky’s preferred response is to say that it doesn’t matter. In a lecture he gave last October at University College London, he referred to Everett’s work without mentioning his name, talking about those who believed that “exceptions to the generalizations are considered lethal.” He went on to say that a “rational reaction” to finding such exceptions “isn’t to say ‘Let’s throw out the field.'” Universal Grammar permits such exceptions. There is no problem. As Pesetsky puts it: “There’s nothing that says languages without subordinate clauses can’t exist.”

Except the 2002 paper on which Chomsky’s name appears. Pesetsky and others have backed away from that paper, arguing not that it was incorrect, but that it was “written in an unfortunate way” and that the authors were “trying to make certain things comprehensible about linguistics to a larger public, but they didn’t make it clear that they were simplifying.” Some say that Chomsky signed his name to the paper but that it was actually written by Marc Hauser, the former professor of psychology at Harvard University, who resigned after Harvard officials found him guilty of eight counts of research misconduct. (For the record, no one has suggested the alleged misconduct affected his work with Chomsky.)

Chomsky declined to grant me an interview. Those close to him say he sees Everett as seizing on a few stray, perhaps underexplained, lines from that 2002 paper and distorting them for his own purposes. And the truth, Chomsky has made clear, should be apparent to any rational person.

Ted Gibson has heard that one before. When Gibson, a professor of cognitive sciences at MIT, gave a paper on the topic at a January meeting of the Linguistic Society of America, held in Portland, Ore., Pesetsky stood up at the end to ask a question. “His first comment was that Chomsky never said that. I went back and found the slide,” he says. “Whenever I talk about this question in front of these people I have to put up the literal quote from Chomsky. Then I have to put it up again.”

Geoffrey Pullum, a professor of linguistics at the University of Edinburgh, is also vexed at how Chomsky and company have, in his view, played rhetorical sleight-of-hand to make their case. “They have retreated to such an extreme degree that it says really nothing,” he says. “If it has a sentence longer than three words then they’re claiming they were right. If that’s what they claim, then they weren’t claiming anything.” Pullum calls this move “grossly dishonest and deeply silly.”

Everett has been arguing about this for seven years. He says Pirahã undermines Universal Grammar. The other side says it doesn’t. In an effort to settle the dispute, Everett asked Gibson, who holds a joint appointment in linguistics at MIT, to look at the data and reach his own conclusions. He didn’t provide Gibson with data he had collected himself because he knows his critics suspect those data have been cooked. Instead he provided him with sentences and stories collected by his missionary predecessor. That way, no one could object that it was biased.

In the documentary about Everett, handing over the data to Gibson is given tremendous narrative importance. Everett is the bearded, safari-hatted field researcher boating down a river in the middle of nowhere, talking and eating with the natives. Meanwhile, Gibson is the nerd hunched over his keyboard back in Cambridge, crunching the data, examining it with his research assistants, to determine whether Everett really has discovered something. If you watch the documentary, you get the sense that what Gibson has found confirms Everett’s theory. And that’s the story you get from Everett, too. In our first interview, he encouraged me to call Gibson. “The evidence supports what I’m saying,” he told me, noting that he and Gibson had a few minor differences of interpretation.

But that’s not what Gibson thinks. Some of what he found does support Everett. For example, he’s confirmed that Pirahã lacks possessive recursion, phrases like “my brother’s mother’s house.” Also, there appear to be no conjunctions like “and” or “or.” In other instances, though, he’s found evidence that seems to undercut Everett’s claims—specifically, when it comes to noun phrases in sentences like “His mother, Itaha, spoke.”

That is a simple sentence, but inserting the mother’s name is a hallmark of recursion. Gibson’s paper, on which Everett is a co-author, states, “We have provided suggestive evidence that Pirahã may have sentences with recursive structures.”

If that turns out to be true, it would undermine the primary thesis of both of Everett’s books about the Pirahã. Rather than the hero who spent years in the Amazon emerging with evidence that demolished the field’s predominant theory, Everett would be the descriptive linguist who came back with a couple of books full of riveting anecdotes and cataloged a language that is remarkable, but hardly changes the game.

Everett only realized during the reporting of this article that Gibson disagreed with him so strongly. Until then, he had been saying that the results generally supported his theory. “I don’t know why he says that,” Gibson says. “Because it doesn’t. He wrote that our work corroborates it. A better word would be falsified. Suggestive evidence is against it right now and not for it.” Though, he points out, the verdict isn’t final. “It looks like it is recursive,” he says. “I wouldn’t bet my life on it.”

Another researcher, Ray Jackendoff, a linguist at Tufts University, was also provided the data and sees it slightly differently. “I think we decided there is some embedding but it is of limited depth,” he says. “It’s not recursive in the sense that you can have infinitely deep embedding.” Remember that in Chomsky’s paper, it was the idea that “open-ended” recursion was possible that separated human and animal communication. Whether the kind of limited recursion Gibson and Jackendoff have noted qualifies depends, like everything else in this debate, on the interpretation.

Everett thinks what Gibson has found is not recursion, but rather false starts, and he believes further research will back him up. “These are very short, extremely limited examples and they almost always are nouns clarifying other nouns,” he says. “You almost never see anything but that in these cases.” And he points out that there still doesn’t seem to be any evidence of infinite recursion. Says Everett: “There simply is no way, even if what I claim to be false starts are recursive instead, to say, ‘My mother, Susie, you know who I mean, you like her, is coming tonight.’”

The field has a history of theoretical disagreements that turn ugly. In the book The Linguistic Wars, published in 1995, Randy Allen Harris tells the story of another skirmish between Chomsky and a group of insurgent linguists called generative semanticists. Chomsky dismissed his opponents’ arguments as absurd. His opponents accused him of altering his theories when confronted and of general arrogance. “Chomsky has the impressive rhetorical talent of offering ideas which are at once tentative and fully endorsed, of appearing to take the if out of his arguments while nevertheless keeping it safely around,” writes Harris.

That rhetorical talent was on display in his lecture last October, in which he didn’t just disagree with other linguists, but treated their arguments as ridiculous and a mortal danger to the field. The style seems to be reflected in his political activism. Watch his 1969 debate on Firing Line against William F. Buckley Jr., available on YouTube, and witness Chomsky tie his famous interlocutor in knots. It is a thorough, measured evisceration. Chomsky is willing to deploy those formidable skills in linguistic arguments as well.

Everett is far from the only current Chomsky challenger. Recently there’s been a rise in so-called corpus linguistics, a data-driven method of evaluating a language, using computer software to analyze sentences and phrases. The method produces detailed information and, for scholars like Gibson, finally provides scientific rigor for a field he believes has been mired in never-ending theoretical disputes. That, along with the brain-scanning technology that linguists are increasingly making use of, may be able to help resolve questions about how much of the structure of language is innate and how much is shaped by culture.
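For a sense of what that looks like, here is a minimal corpus-style pass in Python over a handful of invented English sentences (no real Pirahã data), simply tallying surface markers that hint at embedded clauses:

```python
# Toy corpus-linguistics pass: tally surface markers of clause embedding.
# Both the corpus and the marker lists are invented for illustration.
from collections import Counter

corpus = [
    "the man caught a fish",
    "I said that the man caught a fish",
    "she left and he stayed",
    "the dog slept",
]

EMBEDDING_MARKERS = {"that", "which", "who"}
CONJUNCTIONS = {"and", "or", "but"}

counts = Counter()
for sent in corpus:
    tokens = sent.lower().split()
    counts["sentences"] += 1
    counts["embedding markers"] += sum(t in EMBEDDING_MARKERS for t in tokens)
    counts["conjunctions"] += sum(t in CONJUNCTIONS for t in tokens)

print(dict(counts))   # {'sentences': 4, 'embedding markers': 1, 'conjunctions': 1}
```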

But Chomsky has little use for that method. In his lecture, he deemed corpus linguistics nonscientific, comparing it to doing physics by describing the swirl of leaves on a windy day rather than performing experiments. This was “just statistical modeling,” he said, evidence of a “kind of pathology in the cognitive sciences.” Referring to brain scans, Chomsky joked that the only way to get a grant was to propose an fMRI.

As for Universal Grammar, some are already writing its obituary. Michael Tomasello, co-director of the Max Planck Institute for Evolutionary Anthropology, has stated flatly that “Universal Grammar is dead.” Two linguists, Nicholas Evans and Stephen Levinson, published a paper in 2009 titled “The Myth of Language Universals,” arguing that the “claims of Universal Grammar … are either empirically false, unfalsifiable, or misleading in that they refer to tendencies rather than strict universals.” Pullum has a similar take: “There is no Universal Grammar now, not if you take Chomsky seriously about the things he says.”

Gibson puts it even more harshly. Just as Chomsky doesn’t think corpus linguistics is science, Gibson doesn’t think Universal Grammar is worthwhile. “The question is, ‘What is it?’ How much is built-in and what does it do? There are no details,” he says. “It’s crazy to say it’s dead. It was never alive.”

Such proclamations have been made before and Chomsky, now 83, has a history of outmaneuvering and outlasting his adversaries. Whether Everett will be yet another in a long line of would-be debunkers who turn into footnotes remains to be seen. “I probably do, despite my best intentions, hope that I turn out to be right,” he says. “I know that it is not scientific. But I would be a hypocrite if I didn’t admit it.”

How Do You Say ‘Disagreement’ in Pirahã? (N.Y.Times)

By JENNIFER SCHUESSLER. Published: March 21, 2012

Dan Everett. Essential Media & Entertainment/Smithsonian Channel

In his 2008 memoir, “Don’t Sleep, There Are Snakes,” the linguist Dan Everett recalled the night members of the Pirahã — the isolated Amazonian hunter-gatherers he first visited as a Christian missionary in the late 1970s — tried to kill him.

Dr. Everett survived, and his life among the Pirahã, a group of several hundred living in northwest Brazil, went on mostly peacefully as he established himself as a leading scholarly authority on the group and one of a handful of outsiders to master their difficult language.

His life among his fellow linguists, however, has been far less idyllic, and debate about his scholarship is poised to boil over anew, thanks to his ambitious new book, “Language: The Cultural Tool,” and a forthcoming television documentary that presents an admiring view of his research among the Pirahã along with a darkly conspiratorial view of some of his critics.

Members of the Pirahã people of Amazonian Brazil, who have an unusual language, as seen in “The Grammar of Happiness.” Essential Media & Entertainment/Smithsonian Channel

In 2005 Dr. Everett shot to international prominence with a paper claiming that he had identified some peculiar features of the Pirahã language that challenged Noam Chomsky’s influential theory, first proposed in the 1950s, that human language is governed by “universal grammar,” a genetically determined capacity that imposes the same fundamental shape on all the world’s tongues.

The paper, published in the journal Current Anthropology, turned him into something of a popular hero but a professional lightning rod, embraced in the press as a giant killer who had felled the mighty Chomsky but denounced by some fellow linguists as a fraud, an attention seeker or worse, promoting dubious ideas about a powerless indigenous group while refusing to release his data to skeptics.

The controversy has been simmering in journals and at conferences ever since, fed by a slow trickle of findings by researchers who have followed Dr. Everett’s path down to the Amazon. In a telephone interview Dr. Everett, 60, who is the dean of arts and sciences at Bentley University in Waltham, Mass., insisted that he’s not trying to pick a fresh fight, let alone present himself as a rival to the man he calls “the smartest person I’ve ever met.”

“I’m a small fish in the sea,” he said, adding, “I do not put myself at Chomsky’s level.”

Dan Everett in the Amazon region of Brazil with the Pirahã in 1981. Courtesy Daniel Everett

Still, he doesn’t shy from making big claims for “Language: The Cultural Tool,” published last week by Pantheon. “I am going beyond my work with Pirahã and systematically dismantling the evidence in favor of a language instinct,” he said. “I suspect it will be extremely controversial.”

Even some of Dr. Everett’s admirers fault him for representing himself as a lonely voice of truth against an all-powerful Chomskian orthodoxy bent on stopping his ideas dead. It’s certainly the view advanced in the documentary, “The Grammar of Happiness,” which accuses unnamed linguists of improperly influencing the Brazilian government to deny his request to return to Pirahã territory, either with the film crew or with a research team from M.I.T., led by Ted Gibson, a professor of cognitive science. (It’s scheduled to run on the Smithsonian Channel in May.)

A Pirahã man in the film “The Grammar of Happiness.” Essential Media & Entertainment/Smithsonian Channel

Dr. Everett acknowledged that he had no firsthand evidence of any intrigues against him. But Miguel Oliveira, an associate professor of linguistics at the Federal University of Alagoas and the M.I.T. expedition’s Brazilian sponsor, said in an interview that Dr. Everett is widely resented among scholars in Brazil for his missionary past, anti-Chomskian stance and ability to attract research money.

“This is politics, everybody knows that,” Dr. Oliveira said. “One of the arguments is that he’s stealing something from the indigenous people to become famous. It’s not said. But that’s the way they think.”

Claims of skullduggery certainly add juice to a debate that, to nonlinguists, can seem arcane. In a sense what Dr. Everett has taken from the Pirahã isn’t gold or rare medicinal plants but recursion, a property of language that allows speakers to embed phrases within phrases — for example, “The professor said Everett said Chomsky is wrong” — infinitely.

In a much-cited 2002 paper Professor Chomsky, an emeritus professor of linguistics at M.I.T., writing with Marc D. Hauser and W. Tecumseh Fitch, declared recursion to be the crucial feature of universal grammar and the only thing separating human language from its evolutionary forerunners. But Dr. Everett, who had been publishing quietly on the Pirahã for two decades, announced in his 2005 paper that their language lacked recursion, along with color terms, number terms, and other common properties of language. The Pirahã, Dr. Everett wrote, showed these linguistic gaps not because they were simple-minded, but because their culture — which emphasized concrete matters in the here and now and also lacked creation myths and traditions of art making — did not require it.

To Dr. Everett, Pirahã was a clear case of culture shaping grammar — an impossibility according to the theory of universal grammar. But to some of his critics the paper was really just a case of Dr. Everett — who said he began questioning his own Chomskian ideas in the early 1990s, around the time he began questioning his faith — fixing the facts around his new theories.

In 2009 the linguists Andrew Nevins, Cilene Rodrigues and David Pesetsky, three of the fiercest early critics of Dr. Everett’s paper, published their own in the journal Language, disputing his linguistic claims and expressing “discomfort” with his overall account of the Pirahã’s simple culture. Their main source was Dr. Everett himself, whose 1982 doctoral dissertation, they argued, showed clear evidence of recursion in Pirahã.

“He was right the first time,” Dr. Pesetsky, an M.I.T. professor, said in an interview. “The first time he had reasons. The second time he had no reasons.”

Some scholars say the debate remains stymied by a lack of fresh, independently gathered data. Three different research teams, including one led by Dr. Gibson that traveled to the Pirahã in 2007, have published papers supporting Dr. Everett’s claim that there are no numbers in the Pirahã language. But efforts to go recursion hunting in the jungle — using techniques that range from eliciting sentences to having the Pirahã play specially designed video games — have so far yielded no published results.

Still, some have tried to figure out ways to press ahead, even without direct access to the Pirahã. After Dr. Gibson’s team was denied permission to return to Brazil in 2010, its members devised a method that minimized reliance on Dr. Everett’s data by analyzing instead a corpus of 1,000 sentences from Pirahã stories transcribed by another missionary in the region.

Their analysis, presented at the Linguistic Society of America’s annual meeting in January, found no embedded clauses but did uncover “suggestive evidence” of recursion in a more obscure grammatical corner. It’s a result that is hardly satisfying to Dr. Everett, who questions it. But his critics, oddly, seem no more pleased.

Dr. Pesetsky, who heard the presentation, dismissed the whole effort as biased from the start by its reliance on Dr. Everett’s grammatical classifications and basic assumptions. “They were taking for granted the correctness of the hypothesis they were trying to disconfirm,” he said.

But to Dr. Gibson, who said he does not find Dr. Everett’s cultural theory of language persuasive, such responses reflect the gap between theoretical linguists and data-driven cognitive scientists, not to mention the strangely calcified state of the recursion debate.

“Chomskians and non-Chomskians are weirdly illogical at times,” he said. “It’s like they just don’t want to have a cogent argument. They just want to contradict what the other guy is saying.”

Dr. Everett’s critics fault him for failing to release his field data, even seven years after the controversy erupted. He countered that he is currently working to translate his decades’ worth of material and hopes to post some transcriptions online “over the next several months.” The bigger outrage, he insisted, is what he characterized as other scholars’ efforts to accuse him of “racist research” and interfere with his access to the Pirahã.

Dr. Rodrigues, a professor of linguistics at the Pontifical Catholic University in Rio de Janeiro, acknowledged by e-mail that in 2007 she wrote a letter to Funai, the Brazilian government agency in charge of indigenous affairs, detailing her objections to Dr. Everett’s linguistic research and to his broader description of Pirahã culture.

She declined to elaborate on the contents of the letter, which she said was written at Funai’s request and did not recommend any particular course of action. But asked about her overall opinion of Dr. Everett’s research, she said, “It does not meet the standards of scientific evidence in our field.”

Whatever the reasons for Dr. Everett’s being denied access, he’s enlisting the help of the Pirahã themselves, who are shown at the end of “The Grammar of Happiness” recording an emotional plea to the Brazilian government.

“We love Dan,” one man says into the camera. “Dan speaks our language.”

Exterminate a species or two, save the planet (RT)

Published: 26 January, 2011, 14:43

Edited: 15 April, 2011, 05:18

Biologists have proposed a mathematical model that they hope can predict which species need to be eliminated from an unstable ecosystem, and in which order, to help it recover.

The counterintuitive idea of killing living things for the sake of biodiversity conservation stems from the complex web of connections within ecosystems. Eliminate a predator, and its prey thrives, depleting whatever it in turn feeds on. Such “cascading” impacts along food webs can be unpredictable and sometimes catastrophic.

Sagar Sahasrabudhe and Adilson Motter of Northwestern University in the US have shown that in some food-web models, the timely removal or suppression of one or several species can do quite the opposite and mitigate the damage caused by a local extinction. Their paper appears in Nature.

The trick is not an easy one, since the timing of a removal matters as much as which species is targeted. A real-world example Sahasrabudhe and Motter cite is that of island foxes on the Channel Islands off the coast of California. When feral pigs were introduced into the ecosystem, they attracted golden eagles, which preyed on the foxes as well. Simply reversing the situation by removing the pigs would have made the birds switch entirely to foxes, eventually driving them extinct. Instead, conservationists captured and relocated the eagles before eradicating the pigs, saving the fox population.
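
The order dependence in that example can be reproduced even in a deliberately crude caricature. The Python sketch below is not the Northwestern food-web model: it tracks a single fox population whose predation pressure depends on which of the other two species are still present, using invented parameters and a quasi-extinction threshold, purely to show that the same two interventions applied in a different order can end very differently.

```python
# A deliberately crude toy, NOT the published food-web model: one fox population
# whose per-capita predation pressure depends on which other species are present.
# Species names echo the Channel Islands story; every number here is invented.

R_FOX = 0.4           # intrinsic (logistic) growth rate of the foxes
ATTACK = 0.6          # eagle predation pressure on foxes when eagles are present
EXTINCT_BELOW = 0.01  # quasi-extinction threshold for the toy population

def run(schedule, t_end=60.0, dt=0.01):
    """Simulate fox density under a management schedule mapping time -> action."""
    foxes, pigs_present, eagles_present = 0.25, True, True
    t = 0.0
    while t < t_end:
        for when, action in schedule.items():
            if t >= when:
                if action == "remove_pigs":
                    pigs_present = False      # eradication of the introduced pigs
                elif action == "remove_eagles":
                    eagles_present = False    # capture and relocation of the eagles
        # Pigs dilute the eagles' pressure on foxes (the predator splits its attention).
        predation = (ATTACK if eagles_present else 0.0) / (2.0 if pigs_present else 1.0)
        foxes += dt * foxes * (R_FOX * (1.0 - foxes) - predation)
        if foxes < EXTINCT_BELOW:
            foxes = 0.0                       # below the threshold we call it extinct
        t += dt
    return foxes

# Same two interventions, opposite order:
pigs_first   = run({0.0: "remove_pigs", 25.0: "remove_eagles"})
eagles_first = run({0.0: "remove_eagles", 5.0: "remove_pigs"})
print(f"pigs eradicated first : final fox density = {pigs_first:.3f}")
print(f"eagles relocated first: final fox density = {eagles_first:.3f}")
```

With these made-up numbers, eradicating the pigs while the eagles are still present drives the toy fox population below the extinction threshold, whereas relocating the eagles first lets it recover.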

Of course, conservation scientists are not going to start making decisions based on the models straight away. Real ecosystems are not limited to predator-prey relationships, and processes such as parasitism, pollination and nutrient dynamics have to be taken into account as well. On the other hand, ecosystems were thought to be too complex to model at all just eight years ago, Martinez says. The new work gives more confidence that such modeling will have practical uses in the near future.

Challenges of the “data tsunami” (FAPESP)

Launched by the Instituto Microsoft Research-FAPESP de Pesquisas em TI, the book O Quarto Paradigma discusses the challenges of eScience, a new field dedicated to dealing with the immense volume of information that characterizes today’s science

07/11/2011

By Fábio de Castro

Agência FAPESP – If a few years ago the lack of data limited the advance of science, today the problem has been inverted. The development of new data-capture technologies, in the most varied fields and at the most varied scales, has generated such an immense volume of information that the excess has become a bottleneck for scientific progress.

In this context, computer scientists have been joining forces with specialists from different fields to develop new concepts and theories capable of handling the flood of data in contemporary science. The result is called eScience.

That is the theme debated in the book O Quarto Paradigma – Descobertas científicas na era da eScience (The Fourth Paradigm: scientific discovery in the era of eScience), released on November 3 by the Instituto Microsoft Research-FAPESP de Pesquisas em TI.

Edited by Tony Hey, Stewart Tansley and Kristin Tolle – all of Microsoft Research – the publication was launched at FAPESP’s headquarters, at an event attended by the Foundation’s scientific director, Carlos Henrique de Brito Cruz.

During the launch, Roberto Marcondes Cesar Jr., of the Instituto de Matemática e Estatística (IME) of the Universidade de São Paulo (USP), gave the talk “eScience in Brazil”. “The Fourth Paradigm: data-intensive computing advancing scientific discovery” was the topic of the talk by Daniel Fay, director of Earth, Energy and Environment at MSR.

Brito Cruz highlighted FAPESP’s interest in stimulating the development of eScience in Brazil. “FAPESP is very connected to this idea, because many of our projects and programs present this need for a greater capacity to manage large data sets. Our great challenge lies in the science behind this capacity to handle large volumes of data,” he said.

Initiatives such as the Programa FAPESP de Pesquisa sobre Mudanças Climáticas Globais (PFPMCG), BIOTA-FAPESP and the Programa FAPESP de Pesquisa em Bioenergia (BIOEN) are examples of programs with a great need to integrate and process immense volumes of data.

“We know that science advances when new instruments become available. On the other hand, scientists do not usually perceive the computer as a great new instrument that is revolutionizing science. FAPESP is interested in actions that make the scientific community aware of the great challenges in the field of eScience,” said Brito Cruz.

The book is a collection of 26 technical essays divided into four sections: “Earth and environment”, “Health and well-being”, “Scientific infrastructure” and “Scholarly communication”.

“The book speaks of the emergence of a new paradigm for scientific discovery. Thousands of years ago, the prevailing paradigm was that of experimental science, grounded in the description of natural phenomena. A few hundred years ago the paradigm of theoretical science emerged, symbolized by Newton’s laws. A few decades ago came computational science, simulating complex phenomena. Now we have arrived at the fourth paradigm, that of data-driven science,” said Fay.

With the advent of the new paradigm, he said, the nature of scientific discovery has changed completely. Complex models have come into play, spanning wide spatial and temporal scales and demanding ever more multidisciplinary interaction.

“The data, in incredible quantities, come from different sources and likewise require a multidisciplinary approach and, very often, real-time processing. Scientific communities are also more distributed. All of this has transformed the way discoveries are made,” said Fay.

Ecology, one of the fields most affected by large volumes of data, is an example of how the advance of science will increasingly depend on collaboration between academic researchers and computing specialists.

“We live in a storm of remote sensing, cheap ground sensors and data access over the internet. But extracting the variables that science requires from this mass of heterogeneous data remains a problem. It takes specialized knowledge of algorithms, file formats and data cleaning, for example, which is not always accessible to people working in ecology,” he explained.

The same occurs in fields such as medicine and biology – which benefit from new technologies for recording brain activity or sequencing DNA, for example – or astronomy and physics, as modern telescopes capture terabytes of information daily and the Large Hadron Collider (LHC) generates petabytes of data each year.

Virtual Institute

According to Cesar Jr., the community involved with eScience in Brazil is growing. The country has 2,167 undergraduate programs in information systems or computer science and engineering. In 2009 there were 45,000 graduates in these fields, and graduate education, between 2007 and 2009, comprised 32 programs, a thousand advisers, 2,705 master’s students and 410 doctoral students.

“Science has moved from the paradigm of data acquisition to that of data analysis. We have different technologies producing terabytes in many fields of knowledge and, today, we can say that these fields are focused on analyzing a deluge of data,” said Cesar Jr., a member of the coordination panel for FAPESP’s Computer Science and Engineering area.

In 2006, the Sociedade Brasileira de Computação (SBC) organized a meeting to identify the key problems and main challenges for the field. This led to several proposals for the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) to create a specific program for this type of problem.

“In 2009 we held a series of workshops at FAPESP, bringing together scientists from fields such as agriculture, climate change, medicine, transcriptomics, games, e-government and social networks to discuss this question. The initiative resulted in excellent collaborations between groups of scientists with similar problems and gave rise to several initiatives,” said Cesar Jr.

The calls for proposals of the Instituto Microsoft Research-FAPESP de Pesquisas em TI, he said, have been an important part of the set of initiatives to promote eScience, as has the organization of the Escola São Paulo de Ciência Avançada em Processamento e Visualização de Imagens Computacionais (São Paulo School of Advanced Science in Computational Image Processing and Visualization). In addition, FAPESP has supported several research projects related to the theme.

“The eScience community in São Paulo has been working with professionals from many fields and publishing in journals across several of them. That is an indication of the quality the community has acquired to face the great challenge we will have over the coming years,” said Cesar Jr., who wrote the preface to the Brazilian edition of the book.

  • O Quarto Paradigma
    Editors: Tony Hey, Stewart Tansley and Kristin Tolle
    Published: 2011
    Price: R$ 60
    Pages: 263
    More information: www.ofitexto.com.br

Fraud Case Seen as a Red Flag for Psychology Research (N.Y. Times)

By BENEDICT CAREY

Published: November 2, 2011

A well-known psychologist in the Netherlands whose work has been published widely in professional journals falsified data and made up entire experiments, an investigating committee has found. Experts say the case exposes deep flaws in the way science is done in a field, psychology, that has only recently earned a fragile respectability.

Joris Buijs/Pve

The psychologist Diederik Stapel in an undated photograph. “I have failed as a scientist and researcher,” he said in a statement after a committee found problems in dozens of his papers.

The psychologist, Diederik Stapel, of Tilburg University, committed academic fraud in “several dozen” published papers, many accepted in respected journals and reported in the news media, according to a report released on Monday by the three Dutch institutions where he has worked: the University of Groningen, the University of Amsterdam, and Tilburg. The journal Science, which published one of Dr. Stapel’s papers in April, posted an “editorial expression of concern” about the research online on Tuesday.

The scandal, involving about a decade of work, is the latest in a string of embarrassments in a field that critics and statisticians say badly needs to overhaul how it treats research results. In recent years, psychologists have reported a raft of findings on race biases, brain imaging and even extrasensory perception that have not stood up to scrutiny. Outright fraud may be rare, these experts say, but they contend that Dr. Stapel took advantage of a system that allows researchers to operate in near secrecy and massage data to find what they want to find, without much fear of being challenged.

“The big problem is that the culture is such that researchers spin their work in a way that tells a prettier story than what they really found,” said Jonathan Schooler, a psychologist at the University of California, Santa Barbara. “It’s almost like everyone is on steroids, and to compete you have to take steroids as well.”

In a prolific career, Dr. Stapel published papers on the effect of power on hypocrisy, on racial stereotyping and on how advertisements affect how people view themselves. Many of his findings appeared in newspapers around the world, including The New York Times, which reported in December on his study about advertising and identity.

In a statement posted Monday on Tilburg University’s Web site, Dr. Stapel apologized to his colleagues. “I have failed as a scientist and researcher,” it read, in part. “I feel ashamed for it and have great regret.”

More than a dozen doctoral theses that he oversaw are also questionable, the investigators concluded, after interviewing former students, co-authors and colleagues. Dr. Stapel has published about 150 papers, many of which, like the advertising study, seem devised to make a splash in the media. The study published in Science this year claimed that white people became more likely to “stereotype and discriminate” against black people when they were in a messy environment, versus an organized one. Another study, published in 2009, claimed that people judged job applicants as more competent if they had a male voice. The investigating committee did not post a list of papers that it had found fraudulent.

Dr. Stapel was able to operate for so long, the committee said, in large measure because he was “lord of the data,” the only person who saw the experimental evidence that had been gathered (or fabricated). This is a widespread problem in psychology, said Jelte M. Wicherts, a psychologist at the University of Amsterdam. In a recent survey, two-thirds of Dutch research psychologists said they did not make their raw data available for other researchers to see. “This is in violation of ethical rules established in the field,” Dr. Wicherts said.

In a survey of more than 2,000 American psychologists scheduled to be published this year, Leslie John of Harvard Business School and two colleagues found that 70 percent had admitted, anonymously, to cutting some corners in reporting data. About a third said they had reported an unexpected finding as predicted from the start, and about 1 percent admitted to falsifying data.

Also common is a self-serving statistical sloppiness. In an analysis published this year, Dr. Wicherts and Marjan Bakker, also at the University of Amsterdam, searched a random sample of 281 psychology papers for statistical errors. They found that about half of the papers in high-end journals contained some statistical error, and that about 15 percent of all papers had at least one error that changed a reported finding — almost always in opposition to the authors’ hypothesis.

The American Psychological Association, the field’s largest and most influential publisher of results, “is very concerned about scientific ethics and having only reliable and valid research findings within the literature,” said Kim I. Mills, a spokeswoman. “We will move to retract any invalid research as such articles are clearly identified.”

Researchers in psychology are certainly aware of the issue. In recent years, some have mocked studies showing correlations between activity on brain images and personality measures as “voodoo” science, and a controversy over statistics erupted in January after The Journal of Personality and Social Psychology accepted a paper purporting to show evidence of extrasensory perception. In cases like these, the authors being challenged are often reluctant to share their raw data. But an analysis of 49 studies appearing Wednesday in the journal PLoS One, by Dr. Wicherts, Dr. Bakker and Dylan Molenaar, found that the more reluctant scientists were to share their data, the more likely it was that the evidence contradicted their reported findings.

“We know the general tendency of humans to draw the conclusions they want to draw — there’s a different threshold,” said Joseph P. Simmons, a psychologist at the University of Pennsylvania’s Wharton School. “With findings we want to see, we ask, ‘Can I believe this?’ With those we don’t, we ask, ‘Must I believe this?’ ”

But reviewers working for psychology journals rarely take this into account in any rigorous way. Neither do they typically ask to see the original data. While many psychologists shade and spin, Dr. Stapel went ahead and drew any conclusion he wanted.

“We have the technology to share data and publish our initial hypotheses, and now’s the time,” Dr. Schooler said. “It would clean up the field’s act in a very big way.”

Mathematically Detecting Stock Market Bubbles Before They Burst (Science Daily)

ScienceDaily (Oct. 31, 2011) — From the dotcom bust in the late nineties to the housing crash in the run-up to the 2008 crisis, financial bubbles have been a topic of major concern. Identifying bubbles is important in order to prevent collapses that can severely impact nations and economies.

A paper published this month in the SIAM Journal on Financial Mathematics addresses just this issue. Opening fittingly with a quote from New York Federal Reserve President William Dudley emphasizing the importance of developing tools to identify and address bubbles in real time, authors Robert Jarrow, Younes Kchia, and Philip Protter propose a mathematical model to detect financial bubbles.

A financial bubble occurs when prices for assets, such as stocks, rise far above their actual value. Such an economic cycle is usually characterized by rapid expansion followed by a contraction, or sharp decline in prices.

“It has been hard not to notice that financial bubbles play an important role in our economy, and speculation as to whether a given risky asset is undergoing bubble pricing has approached the level of an armchair sport. But bubbles can have real and often negative consequences,” explains Protter, who has spent many years studying and analyzing financial markets.

“The ability to tell when an asset is or is not in a bubble could have important ramifications in the regulation of the capital reserves of banks as well as for individual investors and retirement funds holding assets for the long term. For banks, if their capital reserve holdings include large investments with unrealistic values due to bubbles, a shock to the bank could occur when the bubbles burst, potentially causing a run on the bank, as infamously happened with Lehman Brothers, and is currently happening with Dexia, a major European bank,” he goes on to explain, citing the significance of such inflated prices.

Using sophisticated mathematical methods, Protter and his co-authors answer the question of whether the price increase of a particular asset represents a bubble in real time. “[In this paper] we show that by using tick data and some statistical techniques, one is able to tell with a large degree of certainty, whether or not a given financial asset (or group of assets) is undergoing bubble pricing,” says Protter.

This question is answered by estimating an asset’s price volatility, which is stochastic or randomly determined. The authors define an asset’s price process in terms of a standard stochastic differential equation, which is driven by Brownian motion. Brownian motion, based on a natural process involving the erratic, random movement of small particles suspended in gas or liquid, has been widely used in mathematical finance. The concept is specifically used to model situations in which the current change in the value of a variable is unrelated to its past changes.
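
For concreteness, a local-volatility price process of this kind, driven by Brownian motion, can be written as

\[ dS_t = \sigma(S_t)\, dW_t , \]

where \(S_t\) is the asset price at time \(t\), \(W_t\) is a Brownian motion and \(\sigma(\cdot)\) is a volatility function of the current price level. This is a representative form, written here as an assumption, since the article does not reproduce the authors’ exact notation (in particular, any drift term is omitted).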

The key characteristic in determining a bubble is the volatility of an asset’s price, which, in the case of a bubble, is very high. The authors estimate the volatility by applying state-of-the-art estimators to real-time tick price data for a given stock. They then obtain the best possible extension of this estimated volatility to large values of the price using a technique called Reproducing Kernel Hilbert Spaces (RKHS), a widely used method in statistical learning.

“First, one uses tick price data to estimate the volatility of the asset in question for various levels of the asset’s price,” Protter explains. “Then, a special technique (RKHS with an optimization addition) is employed to extrapolate this estimated volatility function to large values for the asset’s price, where this information is not (and cannot be) available from tick data. Using this extrapolation, one can check the rate of increase of the volatility function as the asset price gets arbitrarily large. Whether or not there is a bubble depends on how fast this increase occurs (its asymptotic rate of increase).”
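
In this setting, “how fast this increase occurs” can be made precise with an integral test. The following is a standard strict-local-martingale criterion of the type the authors’ framework relies on, given here as a sketch rather than a quotation from the paper:

\[ \int_a^{\infty} \frac{x}{\sigma^2(x)}\, dx < \infty \ \text{ for some } a > 0 \quad\Longleftrightarrow\quad S \text{ is a strict local martingale, i.e., exhibits a bubble.} \]

For example, if the volatility grows like \(\sigma(x) \approx c\,x^{\beta}\) for large prices, the integral converges when \(\beta > 1\) (bubble) and diverges when \(\beta \le 1\) (no bubble).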

If it does not increase fast enough, there is no bubble within the model’s framework.
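
A heavily simplified sketch of that workflow, in Python, is shown below. It bins tick prices, estimates a local volatility at each price level from squared increments, and fits a power law \(\sigma(x) \approx c\,x^{\beta}\) as a stand-in for the authors’ RKHS extrapolation, flagging \(\beta > 1\) as bubble-like in line with the integral test above. The binning, the power-law fit and the simulated data are all assumptions made for illustration; this is not the authors’ estimator.

```python
import numpy as np

def local_volatility_exponent(prices, dt, n_bins=20):
    """Crude stand-in for the published pipeline: estimate sigma(price) by binning
    tick prices, then fit log(sigma) = log(c) + beta * log(price). The power-law
    fit replaces the authors' RKHS extrapolation and is an assumption here."""
    prices = np.asarray(prices, dtype=float)
    increments = np.diff(prices)          # tick-to-tick price changes
    levels = prices[:-1]                  # price level at which each change occurred
    edges = np.quantile(levels, np.linspace(0.0, 1.0, n_bins + 1))
    xs, sigmas = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (levels >= lo) & (levels < hi)
        if mask.sum() < 10:               # skip bins with too little data
            continue
        # Under dS = sigma(S) dW, Var(dS | S = x) is roughly sigma(x)^2 * dt.
        sigmas.append(np.sqrt(np.mean(increments[mask] ** 2) / dt))
        xs.append(levels[mask].mean())
    beta, _ = np.polyfit(np.log(xs), np.log(sigmas), 1)
    return beta

# Illustrative use on a simulated path with sigma(x) = 0.05 * x**1.5 (true beta = 1.5).
rng = np.random.default_rng(0)
dt, path = 1e-4, [10.0]
for _ in range(50_000):
    step = 0.05 * path[-1] ** 1.5 * np.sqrt(dt) * rng.standard_normal()
    path.append(max(path[-1] + step, 0.5))   # floor keeps the Euler scheme positive
beta = local_volatility_exponent(path, dt)
print(f"estimated beta = {beta:.2f} -> {'bubble-like' if beta > 1 else 'not bubble-like'}")
```

On the simulated path the true exponent is 1.5, so the fitted value should come out well above 1 and the toy check should label the path bubble-like.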

The authors test their methodology by applying the model to several stocks from the dot-com bubble of the nineties. They find fairly successful rates in their predictions, with higher accuracies in cases where market volatilities can be modeled more efficiently. This helps establish the strengths and weaknesses of the method.

The authors have also used the model to test more recent price increases to detect bubbles. “We have found, for example, that the IPO [initial public offering] of LinkedIn underwent bubble pricing at its debut, and that the recent rise in gold prices was not a bubble, according to our models,” Protter says.

It is encouraging to see that mathematical analysis can play a role in the diagnosis and detection of bubbles, which have contributed significantly to the economic upheavals of the past few decades.

Robert Jarrow is a professor at the Johnson Graduate School of Management at Cornell University in Ithaca, NY, and managing director of the Kamakura Corporation. Younes Kchia is a graduate student at Ecole Polytechnique in Paris, and Philip Protter is a professor in the Statistics Department at Columbia University in New York.

Professor Protter’s work was supported in part by NSF grant DMS-0906995.