Tag archive: Scientific method

The Petabyte Age: Because More Isn’t Just More — More Is Different (Wired)

WIRED Staff, Science, 06.23.2008 12:00 PM


Illustration: Marian Bantjes


Sensors everywhere. Infinite storage. Clouds of processors. Our ability to capture, warehouse, and understand massive amounts of data is changing science, medicine, business, and technology. As our collection of facts and figures grows, so will the opportunity to find answers to fundamental questions. Because in the era of big data, more isn’t just more. More is different.

The End of Theory:
The Data Deluge Makes the Scientific Method Obsolete

Feeding the Masses:
Data In, Crop Predictions Out

Chasing the Quark:
Sometimes You Need to Throw Information Away

Winning the Lawsuit:
Data Miners Dig for Dirt

Tracking the News:
A Smarter Way to Predict Riots and Wars

Spotting the Hot Zones:
Now We Can Monitor Epidemics Hour by Hour

Sorting the World:
Google Invents New Way to Manage Data

Watching the Skies:
Space Is Big — But Not Too Big to Map

Scanning Our Skeletons:
Bone Images Show Wear and Tear

Tracking Air Fares:
Elaborate Algorithms Predict Ticket Prices

Predicting the Vote:
Pollsters Identify Tiny Voting Blocs

Pricing Terrorism:
Insurers Gauge Risks, Costs

Visualizing Big Data:
Bar Charts for Words

Big data and the end of theory? (The Guardian)

Mark Graham, Fri 9 Mar 2012 14.39 GMT

Does big data have the answers? Maybe some, but not all, says Mark Graham

In 2008, Chris Anderson, then editor of Wired, wrote a provocative piece titled The End of Theory. Anderson was referring to the ways that computers, algorithms, and big data can potentially generate more insightful, useful, accurate, or true results than specialists or domain experts who traditionally craft carefully targeted hypotheses and research strategies.

This revolutionary notion has now entered not just the popular imagination, but also the research practices of corporations, states, journalists and academics. The idea is that the data shadows and information trails of people, machines, commodities and even nature can reveal secrets to us that we now have the power and prowess to uncover.

In other words, we no longer need to speculate and hypothesise; we simply need to let machines lead us to the patterns, trends, and relationships in social, economic, political, and environmental systems.

It is quite likely that you yourself have been the unwitting subject of a big data experiment carried out by Google, Facebook and many other large Web platforms. Google, for instance, has been able to collect extraordinary insights into what specific colours, layouts, rankings, and designs make people more efficient searchers. They do this by slightly tweaking their results and website for a few million searches at a time and then examining the often subtle ways in which people react.

Most large retailers similarly analyse enormous quantities of data from their databases of sales (which are linked to you by credit card numbers and loyalty cards) in order to make uncanny predictions about your future behaviours. In a now-famous case, the American retailer Target upset a Minneapolis man by knowing more about his teenage daughter’s sex life than he did. Target was able to predict his daughter’s pregnancy by monitoring her shopping patterns and comparing that information to an enormous database detailing billions of dollars of sales. This ultimately allows the company to make uncanny predictions about its shoppers.

More significantly, national intelligence agencies are mining vast quantities of non-public Internet data to look for weak signals that might indicate planned threats or attacks.

There can be no denying the significant power and potential of big data. And the huge resources being invested in both the public and private sectors to study it are a testament to this.

However, crucially important caveats are needed when using such datasets: caveats that, worryingly, seem to be frequently overlooked.

The raw informational material for big data projects is often derived from large user-generated or social media platforms (e.g. Twitter or Wikipedia). Yet, in all such cases we are necessarily only relying on information generated by an incredibly biased or skewed user-base.

Gender, geography, race, income, and a range of other social and economic factors all play a role in how information is produced and reproduced. People from different places and different backgrounds tend to produce different sorts of information. And so we risk ignoring a lot of important nuance if relying on big data as a social/economic/political mirror.

We can of course account for such bias by segmenting our data. Take the case of using Twitter to gain insights into last summer’s London riots. About a third of all UK Internet users have a Twitter profile; a subset of that group are the active tweeters who produce the bulk of content; and then a tiny subset of that group (about 1%) geocode their tweets (essential information if you want to know about where your information is coming from).

Despite the fact that we have a database of tens of millions of data points, we are necessarily working with subsets of subsets of subsets. Big data no longer seems so big. Such data thus serves to amplify the information produced by a small minority (a point repeatedly made by UCL’s Muki Haklay), and skew, or even render invisible, ideas, trends, people, and patterns that aren’t mirrored or represented in the datasets that we work with.
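The arithmetic behind “subsets of subsets of subsets” can be sketched in a few lines of Python. The one-third and 1% figures come from the text; the share of profile holders who tweet actively is not given, so the 25% used here is a purely hypothetical placeholder, as is the function name.

```python
def geocoded_share(twitter_share, active_share, geocode_share):
    """Fraction of all UK Internet users who end up in a geocoded-tweet dataset.

    Each argument is the fraction of the previous group that survives a filter.
    """
    return twitter_share * active_share * geocode_share


# 1/3 have a Twitter profile (from the text).
# 0.25 are active tweeters (HYPOTHETICAL: the text gives no number).
# 0.01 of those geocode their tweets (from the text).
share = geocoded_share(1 / 3, 0.25, 0.01)
print(f"Share of UK Internet users represented: {share:.4%}")
```

Whatever value is plugged in for the middle step, the product can never exceed about a third of one percent, which is why a database of tens of millions of points can still speak for a very small minority.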

Big data is undoubtedly useful for addressing and overcoming many important issues faced by society. But we need to ensure that we aren’t seduced by the promises of big data to render theory unnecessary.

We may one day get to the point where sufficient quantities of big data can be harvested to answer all of the social questions that most concern us. I doubt it though. There will always be digital divides; always be uneven data shadows; and always be biases in how information and technology are used and produced.

And so we shouldn’t forget the important role of specialists to contextualise and offer insights into what our data do, and maybe more importantly, don’t tell us.

Mark Graham is a research fellow at the Oxford Internet Institute and is one of the creators of the Floating Sheep blog

The Paradox of the Proof (Project Wordsworth)

By Caroline Chen

MAY 9, 2013

On August 31, 2012, Japanese mathematician Shinichi Mochizuki posted four papers on the Internet.

The titles were inscrutable. The volume was daunting: 512 pages in total. The claim was audacious: he said he had proved the ABC Conjecture, a famed, beguilingly simple number theory problem that had stumped mathematicians for decades.

Then Mochizuki walked away. He did not send his work to the Annals of Mathematics. Nor did he leave a message on any of the online forums frequented by mathematicians around the world. He just posted the papers, and waited.

Two days later, Jordan Ellenberg, a math professor at the University of Wisconsin-Madison, received an email alert from Google Scholar, a service which scans the Internet looking for articles on topics he has specified. On September 2, Google Scholar sent him Mochizuki’s papers: You might be interested in this.

“I was like, ‘Yes, Google, I am kind of interested in that!’” Ellenberg recalls. “I posted it on Facebook and on my blog, saying, ‘By the way, it seems like Mochizuki solved the ABC Conjecture.’”

The Internet exploded. Within days, even the mainstream media had picked up on the story. “World’s Most Complex Mathematical Theory Cracked,” announced the Telegraph. “Possible Breakthrough in ABC Conjecture,” reported the New York Times, more demurely.

On MathOverflow, an online math forum, mathematicians around the world began to debate and discuss Mochizuki’s claim. The question which quickly bubbled to the top of the forum, encouraged by the community’s “upvotes,” was simple: “Can someone briefly explain the philosophy behind his work and comment on why it might be expected to shed light on questions like the ABC conjecture?” asked Andy Putman, assistant professor at Rice University. Or, in plainer words: I don’t get it. Does anyone?

The problem, as many mathematicians were discovering when they flocked to Mochizuki’s website, was that the proof was impossible to read. The first paper, entitled “Inter-universal Teichmüller Theory I: Construction of Hodge Theaters,” starts out by stating that the goal is “to establish an arithmetic version of Teichmüller theory for number fields equipped with an elliptic curve…by applying the theory of semi-graphs of anabelioids, Frobenioids, the étale theta function, and log-shells.”

This is not just gibberish to the average layman. It was gibberish to the math community as well.

“Looking at it, you feel a bit like you might be reading a paper from the future, or from outer space,” wrote Ellenberg on his blog.

“It’s very, very weird,” says Columbia University professor Johan de Jong, who works in a related field of mathematics.

Mochizuki had created so many new mathematical tools and brought together so many disparate strands of mathematics that his paper was populated with vocabulary that nobody could understand. It was totally novel, and totally mystifying.

As Tufts professor Moon Duchin put it: “He’s really created his own world.”

It was going to take a while before anyone would be able to understand Mochizuki’s work, let alone judge whether or not his proof was right. In the ensuing months, the papers weighed like a rock in the math community. A handful of people approached it and began examining it. Others tried, then gave up. Some ignored it entirely, preferring to observe from a distance. As for the man himself, the man who had claimed to solve one of mathematics’ biggest problems, there was not a sound.

For centuries, mathematicians have strived towards a single goal: to understand how the universe works, and describe it. To this objective, math itself is only a tool — it is the language that mathematicians have invented to help them describe the known and query the unknown.

This history of mathematical inquiry is marked by milestones that come in the form of theorems and conjectures. Simply put, a theorem is an observation known to be true. The Pythagorean theorem, for example, makes the observation that for all right-angled triangles, the relationship between the lengths of the three sides, a, b and c, is expressed in the equation $a^2 + b^2 = c^2$. Conjectures are predecessors to a theorem: they are proposals for theorems, observations that mathematicians believe to be true, but are yet to be confirmed. When a conjecture is proved, it becomes a theorem and when that happens, mathematicians rejoice, and add the new theorem to their tally of the understood universe.

“The point is not to prove the theorem,” explains Ellenberg. “The point is to understand how the universe works and what the hell is going on.”

Ellenberg is doing the dishes while talking to me over the phone, and I can hear the sound of a small infant somewhere in the background. Ellenberg is passionate about explaining mathematics to the world. He writes a math column for Slate magazine and is working on a book called How Not To Be Wrong, which is supposed to help laypeople apply math to their lives.

The sounds of the dishes pause as Ellenberg explains what motivates him and his fellow mathematicians. I imagine him gesturing in the air with soapy hands: “There’s a feeling that there’s a vast dark area of ignorance, but all of us are pushing together, taking steps together to pick at the boundaries.”

The ABC Conjecture probes deep into the darkness, reaching at the foundations of math itself. First proposed by mathematicians David Masser and Joseph Oesterlé in the 1980s, it makes an observation about a fundamental relationship between addition and multiplication. Yet despite its deep implications, the ABC Conjecture is famous because, on the surface, it seems rather simple.

It starts with an easy equation: a + b = c.

The variables a, b, and c, which give the conjecture its name, have some restrictions. They need to be whole numbers, and a and b cannot share any common factors, that is, they cannot be divisible by the same prime number. So, for example, if a was 64, which equals $2^6$, then b could not be any number that is a multiple of two. In this case, b could be 81, which is $3^4$. Now a and b do not share any factors, and we get the equation 64 + 81 = 145.

It isn’t hard to come up with combinations of a and b that satisfy the conditions. You could come up with huge numbers, such as 3,072 + 390,625 = 393,697 (3,072 = $2^{10} \times 3$ and 390,625 = $5^8$, no overlapping factors there), or very small numbers, such as 3 + 125 = 128 (125 = $5 \times 5 \times 5$).
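The “no common factors” condition is easy to check mechanically: two whole numbers share no prime factor exactly when their greatest common divisor is 1. The following is a minimal sketch using Python's standard `math.gcd`; the function name and the deliberately bad pair (64, 6) are illustrative additions, not part of the original article.

```python
from math import gcd


def share_no_prime_factor(a, b):
    """True when positive whole numbers a and b have no prime factor in common."""
    return a > 0 and b > 0 and gcd(a, b) == 1


# Pairs taken from the text, plus one deliberately bad pair (64 and 6 are both even).
pairs = [(64, 81), (3072, 390625), (3, 125), (64, 6)]
for a, b in pairs:
    verdict = "allowed" if share_no_prime_factor(a, b) else "not allowed"
    print(f"{a} + {b} = {a + b}: {verdict}")
```

Checking a against b is enough: any prime dividing both a and the sum a + b would also have to divide b, so the sum is automatically free of common factors too.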

What the ABC conjecture then says is that the properties of a and b affect the properties of c. To understand the observation, it first helps to rewrite these equations a + b = c into versions made up of the prime factors:

Our first equation, 64 + 81 = 145, is equivalent to $2^6 + 3^4 = 5 \times 29$.

Our second example, 3,072 + 390,625 = 393,697, is equivalent to $2^{10} \times 3 + 5^8 = 393{,}697$ (which happens to be prime!).

Our last example, 3 + 125 = 128, is equivalent to $3 + 5^3 = 2^7$.

The first two equations are not like the third, because in the first two equations, you have lots of prime factors on the left-hand side of the equation, and very few on the right-hand side. The third example is the opposite: there are more primes on the right-hand side of the equation (seven) than on the left (only four). As it turns out, in all the possible combinations of a, b, and c, situation three is pretty rare. The ABC Conjecture essentially says that when there are lots of prime factors on the left-hand side of the equation then, usually, there will be not very many on the right-hand side.
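The counting described here can be reproduced with a short script. This is a rough sketch that uses plain trial division, which is fine for numbers of this size; the function names are illustrative, and prime factors are counted with repetition, as the paragraph does.

```python
def prime_factors(n):
    """Prime factors of n, repeated according to multiplicity (trial division)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


def count_sides(a, b):
    """Number of prime factors on the left (a and b) and on the right (a + b)."""
    left = len(prime_factors(a)) + len(prime_factors(b))
    right = len(prime_factors(a + b))
    return left, right


for a, b in [(64, 81), (3072, 390625), (3, 125)]:
    left, right = count_sides(a, b)
    print(f"{a} + {b} = {a + b}: {left} prime factors on the left, {right} on the right")
```

For 64 + 81 the counts come out as 10 against 2, and for 3 + 125 as 4 against 7, matching the “only four” and “seven” quoted in the text.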

Of course, “lots of,” “not very many,” and “usually” are very vague words, and in a formal version of the ABC Conjecture, all these terms are spelled out in more precise math-speak. But even in this watered-down version, one can begin to appreciate the conjecture’s implications. The equation is based on addition, but the conjecture’s observation is more about multiplication.
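For readers who want the math-speak, the conjecture is usually stated in terms of the radical of a number, the product of its distinct prime factors. The statement below is the standard textbook formulation rather than anything quoted in this article, and the symbols rad and ε are introduced here only for that purpose.

```latex
\[
\operatorname{rad}(n) = \prod_{p \mid n,\ p \text{ prime}} p
\]
For every $\varepsilon > 0$, only finitely many triples of positive whole numbers
with $a + b = c$ and no common factor satisfy
\[
c > \operatorname{rad}(abc)^{1+\varepsilon}.
\]
Checking two of the examples:
\[
\operatorname{rad}(64 \cdot 81 \cdot 145) = 2 \cdot 3 \cdot 5 \cdot 29 = 870 > 145,
\qquad
\operatorname{rad}(3 \cdot 125 \cdot 128) = 2 \cdot 3 \cdot 5 = 30 < 128.
\]
```

In this language, “usually” means “with only finitely many exceptions”, and 3 + 125 = 128 is one of the rare triples in which c is larger than the radical.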

“It really is about something very, very basic, about a tight constraint that relates multiplicative and additive properties of numbers,” says Minhyong Kim, professor at Oxford University. “If there’s something new to discover about that, you might expect it to be very influential.”

This is not intuitive. While mathematicians came up with addition and multiplication in the first place, based on their current knowledge of mathematics, there is no reason for them to presume that the additive properties of numbers would somehow influence or affect their multiplicative properties.

“There’s very little evidence for it,” says Peter Sarnak, professor at Princeton University, who is a self-described skeptic of the ABC conjecture. “I’ll only believe it when it’s proved.”

But if it were true? Mathematicians say that it would reveal a deep relationship between addition and multiplication that they never knew of before.

Even Sarnak, the skeptic, acknowledges this.

“If it’s true, then it will be the most powerful thing we have,” he says.

It would be so powerful, in fact, that it would automatically unlock many legendary math puzzles. One of these would be Fermat’s Last Theorem, an infamous math problem that was proposed in 1637, and solved only recently by Andrew Wiles in 1993. Wiles’ proof earned him more than 100,000 Deutsche marks in prize money (equivalent to about $50,000 in 1997), a reward that was offered almost a century before, in 1908. Wiles did not solve Fermat’s Last Theorem via the ABC conjecture (he took a different route), but if the ABC conjecture were to be true, then the proof for Fermat’s Last Theorem would be an easy consequence.
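One common way to see the link is sketched below, under an assumption stronger than anything stated in the article: suppose the conjecture held in the explicit form c < rad(abc)² for every triple with no common factor, where rad(n) is the product of the distinct primes dividing n. Both the notation and the explicit exponent 2 are assumptions of this sketch.

```latex
Suppose $x^n + y^n = z^n$ with $x, y, z$ positive whole numbers sharing no common factor.
Put $a = x^n$, $b = y^n$, $c = z^n$. Raising a number to a power does not change
which primes divide it, so
\[
\operatorname{rad}(abc) = \operatorname{rad}(xyz) \le xyz < z^3 .
\]
The assumed bound then gives
\[
z^n = c < \operatorname{rad}(abc)^2 < z^6
\quad\Longrightarrow\quad n < 6 .
\]
```

That would leave only the exponents 3, 4 and 5, which were settled by classical methods long before Wiles. With the conjecture in its usual form, where the constant involved is not explicit, the same argument covers only sufficiently large exponents.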

Because of its simplicity, the ABC Conjecture is well-known by all mathematicians. CUNY professor Lucien Szpiro says that “every professional has tried at least one night” to theorize about a proof. Yet few people have seriously attempted to crack it. Szpiro, whose eponymous conjecture is a precursor of the ABC Conjecture, presented a proof in 2007, but it was soon found to be problematic. Since then, nobody has dared to touch it, not until Mochizuki.

When Mochizuki posted his papers, the math community had much reason to be enthusiastic. They were excited not just because someone had claimed to prove an important conjecture, but because of who that someone was.

Mochizuki was known to be brilliant. Born in Tokyo, he moved to New York with his parents, Kiichi and Anne Mochizuki, when he was 5 years old. He left home for high school, attending Phillips Exeter Academy, a selective prep school in New Hampshire. There, he whipped through his academics with lightning speed, graduating after two years, at age 16, with advanced placements in mathematics, physics, American and European history, and Latin.

Then Mochizuki enrolled at Princeton University where, again, he finished ahead of his peers, earning his bachelor’s degree in mathematics in three years and moving quickly on to his Ph.D., which he received at age 23. After lecturing at Harvard University for two years, he returned to Japan, joining the Research Institute for Mathematical Sciences at Kyoto University. In 2002, he became a full professor at the unusually young age of 33. His early papers were widely acknowledged to be very good work.

Academic prowess is not the only characteristic that set Mochizuki apart from his peers. His friend, Oxford professor Minhyong Kim, says that Mochizuki’s most outstanding characteristic is his intense focus on work.

“Even among many mathematicians I’ve known, he seems to have an extremely high tolerance for just sitting and doing mathematics for long, long hours,” says Kim.

Mochizuki and Kim met in the early 1990s, when Mochizuki was still an undergraduate student at Princeton. Kim, on exchange from Yale University, recalls Mochizuki making his way through the works of French mathematician Alexander Grothendieck, whose books on algebraic and arithmetic geometry are a must-read for any mathematician in the field.

“Most of us gradually come to understand [Grothendieck’s works] over many years, after dipping into it here and there,” said Kim. “It adds up to thousands and thousands of pages.”

But not Mochizuki.

“Mochizuki…just read them from beginning to end sitting at his desk,” recalls Kim. “He started this process when he was still an undergraduate, and within a few years, he was just completely done.”

A few years after returning to Japan, Mochizuki turned his focus to the ABC Conjecture. Over the years, word got around that he believed he had cracked the puzzle, and Mochizuki himself said that he expected results by 2012. So when the papers appeared, the math community was waiting, and eager. But then the enthusiasm stalled.

“His other papers – they’re readable, I can understand them and they’re fantastic,” says de Jong, who works in a similar field. Pacing in his office at Columbia University, de Jong shook his head as he recalled his first impression of the new papers. They were different. They were unreadable. After working in isolation for more than a decade, Mochizuki had built up a structure of mathematical language that only he could understand. To even begin to parse the four papers posted in August 2012, one would have to read through hundreds, maybe even thousands, of pages of previous work, none of which had been vetted or peer-reviewed. It would take at least a year to read and understand everything. De Jong, who was about to go on sabbatical, briefly considered spending his year on Mochizuki’s papers, but when he saw the height of the mountain, he quailed.

“I decided, I can’t possibly work on this. It would drive me nuts,” he said.

Soon, frustration turned into anger. Few professors were willing to directly critique a fellow mathematician, but almost every person I interviewed was quick to point out that Mochizuki was not following community standards. Usually, they said, mathematicians discuss their findings with their colleagues. Normally, they publish pre-prints to widely respected online forums. Then they submit their papers to the Annals of Mathematics, where papers are refereed by eminent mathematicians before publication. Mochizuki was bucking the trend. He was, according to his peers, “unorthodox.”

But what roused their ire most was Mochizuki’s refusal to lecture. Usually, after publication, a mathematician lectures on his papers, travelling to various universities to explain his work and answer questions from his colleagues. Mochizuki has turned down multiple invitations.

“A very prominent research university has asked him, ‘Come explain your result,’ and he said, ‘I couldn’t possibly do that in one talk,’” says Cathy O’Neil, de Jong’s wife, a former math professor better known as the blogger “Mathbabe.”

“And so they said, ‘Well then, stay for a week,’ and he’s like, ‘I couldn’t do it in a week.’

“So they said, ‘Stay for a month. Stay as long as you want,’ and he still said no.

“The guy does not want to do it.”

Kim sympathizes with his frustrated colleagues, but suggests a different reason for the rancor. “It really is painful to read other people’s work,” he says. “That’s all it is… All of us are just too lazy to read them.”

Kim is also quick to defend his friend. He says Mochizuki’s reticence is due to being a “slightly shy character” as well as his assiduous work ethic. “He’s a very hard working guy and he just doesn’t want to spend time on airplanes and hotels and so on.”

O’Neil, however, holds Mochizuki accountable, saying that his refusal to cooperate places an unfair burden on his colleagues.

“You don’t get to say you’ve proved something if you haven’t explained it,” she says. “A proof is a social construct. If the community doesn’t understand it, you haven’t done your job.”

Today, the math community faces a conundrum: the proof to a very important conjecture hangs in the air, yet nobody will touch it. For a brief moment in October, heads turned when Yale graduate student Vesselin Dimitrov pointed out a potential contradiction in the proof, but Mochizuki quickly responded, saying he had accounted for the problem. Dimitrov retreated, and the flicker of activity subsided.

As the months pass, the silence has also begun to call into question a basic premise of mathematical academia. Duchin explains the mainstream view this way: “Proofs are right or wrong. The community passes verdict.”

This foundational stone is one that mathematicians are proud of. The community works together; they are not cut-throat or competitive. Colleagues check each other’s work, spending hours upon hours verifying that a peer got it right. This behavior is not just altruistic, but also necessary: unlike in medical science, where you know you’re right if the patient is cured, or in engineering, where the rocket either launches or it doesn’t, theoretical math, better known as “pure” math, has no physical, visible standard. It is entirely based on logic. To know you’re right means you need someone else, preferably many other people, to walk in your footsteps and confirm that every step was made on solid ground. A proof in a vacuum is no proof at all.

Even an incorrect proof is better than no proof, because if the ideas are novel, they may still be useful for other problems, or inspire another mathematician to figure out the right answer. So the most pressing question isn’t whether or not Mochizuki is right — the more important question is, will the math community fulfill their promise, step up to the plate and read the papers?

The prospects seem thin. Szpiro is among the few who have made attempts to understand short segments of the paper. He holds a weekly workshop with his post-doctoral students at CUNY to discuss the paper, but he says they are limited to “local” analysis and do not understand the big picture yet. The only other known candidate is Go Yamashita, a colleague of Mochizuki at Kyoto University. According to Kim, Mochizuki is holding a private seminar with Yamashita, and Kim hopes that Yamashita will then go on to share and explain the work. If Yamashita does not pull through, it is unclear who else might be up to the task.

For now, all the math community can do is wait. While they wait, they tell stories, and recall great moments in math — the year Wiles cracked Fermat’s Last Theorem; how Perelman proved the Poincaré Conjecture. Columbia professor Dorian Goldfeld tells the story of Kurt Heegner, a high school teacher in Berlin, who solved a classic problem proposed by Gauss. “Nobody believed it. All the famous mathematicians pooh-poohed it and said it was wrong.” Heegner’s paper gathered dust for more than a decade until finally, four years after his death, mathematicians realized that Heegner had been right all along. Kim recalls Yoichi Miyaoka’s proposed proof of Fermat’s Last Theorem in 1988, which garnered a lot of media attention before serious flaws were discovered. “He became very embarrassed,” says Kim.

As they tell these stories, Mochizuki and his proofs hang in the air. All these stories are possible outcomes. The only question is – which?

Kim is one of the few people who remains optimistic about the future of this proof. He is planning a conference at Oxford University this November, and hopes to invite Yamashita to come and share what he has learned from Mochizuki. Perhaps more will be made clear, then.

As for Mochizuki, who has refused all media requests, who seems so reluctant to promote even his own work, one has to wonder if he is even aware of the storm he has created.

On his website, one of the only photos of Mochizuki available on the Internet shows a middle-aged man with old-fashioned 90’s style glasses, staring up and out, somewhere over our heads. A self-given title runs over his head. It is not “mathematician” but, rather, “Inter-universal Geometer.”

What does it mean? His website offers no clues. There are his papers, thousands of pages long, reams upon reams of dense mathematics. His resume is spare and formal. He reports his marital status as “Single (never married).” And then there is a page called Thoughts of Shinichi Mochizuki, which has only 17 entries. “I would like to report on my recent progress,” he writes, February 2009. “Let me report on my progress,” October 2009. “Let me report on my progress,” April 2010, June 2011, January 2012. Then follows math-speak. It is hard to tell if he is excited, daunted, frustrated, or enthralled.

Mochizuki has reported all this progress for years, but where is he going? This “inter-universal geometer,” this possible genius, may have found the key that would redefine number theory as we know it. He has, perhaps, charted a new path into the dark unknown of mathematics. But for now, his footsteps are untraceable. Wherever he is going, he seems to be travelling alone.