Tag archive: Ethics and Artificial Intelligence

The Facebook whistleblower says its algorithms are dangerous. Here’s why. (MIT Technology Review)

Frances Haugen’s testimony at the Senate hearing today raised serious questions about how Facebook’s algorithms work—and echoes many findings from our previous investigation.

October 5, 2021

Karen Hao

Facebook whistleblower Frances Haugen testifies during a Senate Committee October 5. Drew Angerer/Getty Images

On Sunday night, the primary source for the Wall Street Journal’s Facebook Files, an investigative series based on internal Facebook documents, revealed her identity in an episode of 60 Minutes.

Frances Haugen, a former product manager at the company, says she came forward after she saw Facebook’s leadership repeatedly prioritize profit over safety.

Before quitting in May of this year, she combed through Facebook Workplace, the company’s internal employee social media network, and gathered a wide swath of internal reports and research in an attempt to conclusively demonstrate that Facebook had willfully chosen not to fix the problems on its platform.

Today she testified in front of the Senate on the impact of Facebook on society. She reiterated many of the findings from the internal research and implored Congress to act.

“I’m here today because I believe Facebook’s products harm children, stoke division, and weaken our democracy,” she said in her opening statement to lawmakers. “These problems are solvable. A safer, free-speech respecting, more enjoyable social media is possible. But there is one thing that I hope everyone takes away from these disclosures, it is that Facebook can change, but is clearly not going to do so on its own.”

During her testimony, Haugen particularly blamed Facebook’s algorithm and platform design decisions for many of its issues. This is a notable shift from the existing focus of policymakers on Facebook’s content policy and censorship—what does and doesn’t belong on Facebook. Many experts believe that this narrow view leads to a whack-a-mole strategy that misses the bigger picture.

“I’m a strong advocate for non-content-based solutions, because those solutions will protect the most vulnerable people in the world,” Haugen said, pointing to Facebook’s uneven ability to enforce its content policy in languages other than English.

Haugen’s testimony echoes many of the findings from an MIT Technology Review investigation published earlier this year, which drew upon dozens of interviews with Facebook executives, current and former employees, industry peers, and external experts. We pulled together the most relevant parts of our investigation and other reporting to give more context to Haugen’s testimony.

How does Facebook’s algorithm work?

Colloquially, we use the term “Facebook’s algorithm” as though there’s only one. In fact, Facebook decides how to target ads and rank content based on hundreds, perhaps thousands, of algorithms. Some of those algorithms tease out a user’s preferences and boost that kind of content up the user’s news feed. Others are for detecting specific types of bad content, like nudity, spam, or clickbait headlines, and deleting or pushing them down the feed.

All of these algorithms are known as machine-learning algorithms. As I wrote earlier this year:

Unlike traditional algorithms, which are hard-coded by engineers, machine-learning algorithms “train” on input data to learn the correlations within it. The trained algorithm, known as a machine-learning model, can then automate future decisions. An algorithm trained on ad click data, for example, might learn that women click on ads for yoga leggings more often than men. The resultant model will then serve more of those ads to women.
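The mechanics of that ad-click example can be sketched in a few lines of Python. The click logs, group labels, and threshold below are invented for illustration; this is the general shape of learning correlations from engagement data, not Facebook's actual pipeline.

```python
from collections import defaultdict

# Hypothetical click logs: (user_group, clicked) pairs.
click_log = [
    ("women", True), ("women", True), ("women", False),
    ("men", False), ("men", True), ("men", False),
]

def train_ctr_model(log):
    """'Train' on input data: learn each group's click-through rate."""
    clicks, views = defaultdict(int), defaultdict(int)
    for group, clicked in log:
        views[group] += 1
        clicks[group] += int(clicked)
    return {g: clicks[g] / views[g] for g in views}

model = train_ctr_model(click_log)

def should_serve_ad(group, threshold=0.5):
    """The trained model automates the decision: serve the ad
    wherever the predicted click-through rate clears a bar."""
    return model.get(group, 0.0) >= threshold

# Women clicked 2 of 3 times, men 1 of 3, so this toy model
# ends up serving the ad to women and not to men.
```

Real ad-targeting models are vastly more complex, but the loop is the same: historical engagement in, a decision rule out.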

And because Facebook collects enormous amounts of user data, it can

develop models that learned to infer the existence not only of broad categories like “women” and “men,” but of very fine-grained categories like “women between 25 and 34 who liked Facebook pages related to yoga,” and [target] ads to them. The finer-grained the targeting, the better the chance of a click, which would give advertisers more bang for their buck.

The same principles apply for ranking content in news feed:

Just as algorithms [can] be trained to predict who would click what ad, they [can] also be trained to predict who would like or share what post, and then give those posts more prominence. If the model determined that a person really liked dogs, for instance, friends’ posts about dogs would appear higher up on that user’s news feed.

Before Facebook began using machine-learning algorithms, teams used design tactics to increase engagement. They’d experiment with things like the color of a button or the frequency of notifications to keep users coming back to the platform. But machine-learning algorithms create a much more powerful feedback loop. Not only can they personalize what each user sees, they will also continue to evolve with a user’s shifting preferences, perpetually showing each person what will keep them most engaged.
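The ranking-plus-feedback loop described above can be condensed into a toy sketch. The topics, posts, and affinity-update rule are invented for illustration; the point is only how personalization and evolving preferences reinforce each other.

```python
# Learned user preferences (made-up starting values).
affinity = {"dogs": 0.5, "politics": 0.5}

posts = [
    {"id": 1, "topic": "dogs"},
    {"id": 2, "topic": "politics"},
    {"id": 3, "topic": "dogs"},
]

def rank_feed(posts, affinity):
    """Order posts by predicted engagement (here, topic affinity)."""
    return sorted(posts, key=lambda p: affinity[p["topic"]], reverse=True)

def record_like(post, affinity, lr=0.1):
    """The feedback loop: each like nudges that topic's affinity up,
    which reshapes every subsequent feed."""
    affinity[post["topic"]] += lr

# The user likes one dog post; dog content now outranks everything else,
# inviting more likes and further reinforcement.
record_like(posts[0], affinity)
feed = rank_feed(posts, affinity)
```

Run repeatedly, a loop like this drifts toward whatever the user engages with most, which is exactly the dynamic the article describes.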

Who runs Facebook’s algorithm?

Within Facebook, there’s no one team in charge of this content-ranking system in its entirety. Engineers develop and add their own machine-learning models into the mix, based on their team’s objectives. For example, teams focused on removing or demoting bad content, known as the integrity teams, will only train models for detecting different types of bad content.

This was a decision Facebook made early on as part of its “move fast and break things” culture. It developed an internal tool known as FBLearner Flow that made it easy for engineers without machine-learning experience to develop whatever models they needed. By one measure, more than a quarter of Facebook’s engineering team was already using it in 2016.

Many of the current and former Facebook employees I’ve spoken to say that this is part of why Facebook can’t seem to get a handle on what it serves up to users in the news feed. Different teams can have competing objectives, and the system has grown so complex and unwieldy that no one can keep track anymore of all of its different components.

As a result, the company’s main process for quality control is through experimentation and measurement. As I wrote:

Teams train up a new machine-learning model on FBLearner, whether to change the ranking order of posts or to better catch content that violates Facebook’s community standards (its rules on what is and isn’t allowed on the platform). Then they test the new model on a small subset of Facebook’s users to measure how it changes engagement metrics, such as the number of likes, comments, and shares, says Krishna Gade, who served as the engineering manager for news feed from 2016 to 2018.

If a model reduces engagement too much, it’s discarded. Otherwise, it’s deployed and continually monitored. On Twitter, Gade explained that his engineers would get notifications every few days when metrics such as likes or comments were down. Then they’d decipher what had caused the problem and whether any models needed retraining.
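The deploy-or-discard loop Gade describes can be sketched like this. The engagement numbers, the tolerance threshold, and the monitoring rule are illustrative assumptions, not Facebook's real infrastructure.

```python
def evaluate_model(baseline_engagement, test_engagement, max_drop=0.02):
    """Quality control by experiment: test a candidate model on a
    subset of users and discard it if engagement falls too far."""
    drop = (baseline_engagement - test_engagement) / baseline_engagement
    return "deploy" if drop <= max_drop else "discard"

def monitor(daily_metrics, alert_drop=0.05):
    """Post-deploy monitoring: flag the days on which metrics such as
    likes or comments sagged past an alert threshold."""
    baseline = daily_metrics[0]
    return [day for day, m in enumerate(daily_metrics)
            if (baseline - m) / baseline > alert_drop]

# A model that trims engagement by 1% ships; a 10% hit is discarded.
# Monitoring then flags any day where engagement dips more than 5%.
```

Note what this process optimizes for: the gate is engagement, not accuracy or downstream harm, which is the crux of Haugen's criticism.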

How has Facebook’s content ranking led to the spread of misinformation and hate speech?

During her testimony, Haugen repeatedly came back to the idea that Facebook’s algorithm incites misinformation, hate speech, and even ethnic violence. 

“Facebook … knows—they have admitted in public—that engagement-based ranking is dangerous without integrity and security systems but then not rolled out those integrity and security systems in most of the languages in the world,” she told the Senate today. “It is pulling families apart. And in places like Ethiopia it is literally fanning ethnic violence.”

Here’s what I’ve written about this previously:

The machine-learning models that maximize engagement also favor controversy, misinformation, and extremism: put simply, people just like outrageous stuff.

Sometimes this inflames existing political tensions. The most devastating example to date is the case of Myanmar, where viral fake news and hate speech about the Rohingya Muslim minority escalated the country’s religious conflict into a full-blown genocide. Facebook admitted in 2018, after years of downplaying its role, that it had not done enough “to help prevent our platform from being used to foment division and incite offline violence.”

As Haugen mentioned, Facebook has also known this for a while. Previous reporting has found that it’s been studying the phenomenon since at least 2016.

In an internal presentation from that year, reviewed by the Wall Street Journal, a company researcher, Monica Lee, found that Facebook was not only hosting a large number of extremist groups but also promoting them to its users: “64% of all extremist group joins are due to our recommendation tools,” the presentation said, predominantly thanks to the models behind the “Groups You Should Join” and “Discover” features.

In 2017, Chris Cox, Facebook’s longtime chief product officer, formed a new task force to understand whether maximizing user engagement on Facebook was contributing to political polarization. It found that there was indeed a correlation, and that reducing polarization would mean taking a hit on engagement. In a mid-2018 document reviewed by the Journal, the task force proposed several potential fixes, such as tweaking the recommendation algorithms to suggest a more diverse range of groups for people to join. But it acknowledged that some of the ideas were “antigrowth.” Most of the proposals didn’t move forward, and the task force disbanded.

In my own conversations, Facebook employees also corroborated these findings.

A former Facebook AI researcher who joined in 2018 says he and his team conducted “study after study” confirming the same basic idea: models that maximize engagement increase polarization. They could easily track how strongly users agreed or disagreed on different issues, what content they liked to engage with, and how their stances changed as a result. Regardless of the issue, the models learned to feed users increasingly extreme viewpoints. “Over time they measurably become more polarized,” he says.

In her testimony, Haugen also repeatedly emphasized how these phenomena are far worse in regions that don’t speak English because of Facebook’s uneven coverage of different languages.

“In the case of Ethiopia there are 100 million people and six languages. Facebook only supports two of those languages for integrity systems,” she said. “This strategy of focusing on language-specific, content-specific systems for AI to save us is doomed to fail.”

She continued: “So investing in non-content-based ways to slow the platform down not only protects our freedom of speech, it protects people’s lives.”

I explore this more in a different article from earlier this year on the limitations of large language models, or LLMs:

Despite LLMs having these linguistic deficiencies, Facebook relies heavily on them to automate its content moderation globally. When the war in Tigray[, Ethiopia] first broke out in November, [AI ethics researcher Timnit] Gebru saw the platform flounder to get a handle on the flurry of misinformation. This is emblematic of a persistent pattern that researchers have observed in content moderation. Communities that speak languages not prioritized by Silicon Valley suffer the most hostile digital environments.

Gebru noted that this isn’t where the harm ends, either. When fake news, hate speech, and even death threats aren’t moderated out, they are then scraped as training data to build the next generation of LLMs. And those models, parroting back what they’re trained on, end up regurgitating these toxic linguistic patterns on the internet.

How does Facebook’s content ranking relate to teen mental health?

One of the more shocking revelations from the Journal’s Facebook Files was Instagram’s internal research, which found that its platform is worsening mental health among teenage girls. “Thirty-two percent of teen girls said that when they felt bad about their bodies, Instagram made them feel worse,” researchers wrote in a slide presentation from March 2020.

Haugen connects this phenomenon to engagement-based ranking systems as well, which she told the Senate today “is causing teenagers to be exposed to more anorexia content.”

“If Instagram is such a positive force, have we seen a golden age of teenage mental health in the last 10 years? No, we have seen escalating rates of suicide and depression amongst teenagers,” she continued. “There’s a broad swath of research that supports the idea that the usage of social media amplifies the risk of these mental health harms.”

In my own reporting, I heard from a former AI researcher who also saw this effect extend to Facebook.

The researcher’s team…found that users with a tendency to post or engage with melancholy content—a possible sign of depression—could easily spiral into consuming increasingly negative material that risked further worsening their mental health.

But as with Haugen, the researcher found that leadership wasn’t interested in making fundamental algorithmic changes.

The team proposed tweaking the content-ranking models for these users to stop maximizing engagement alone, so they would be shown less of the depressing stuff. “The question for leadership was: Should we be optimizing for engagement if you find that somebody is in a vulnerable state of mind?” he remembers.
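One way to read the team's proposal is as a change to the ranking objective. This sketch, with invented scores and a hypothetical vulnerability flag, shows the general shape of no longer maximizing engagement alone for at-risk users; it is not the team's actual model.

```python
def rank_score(post, user_vulnerable, w_engagement=1.0, w_mood=0.0):
    """Blend predicted engagement with an emotional-tone penalty.

    post: dict with 'engagement' (predicted likes/comments) and
    'negativity' (0..1, how melancholy the content is) -- both invented.
    """
    if user_vulnerable:
        # Stop optimizing for engagement alone: penalize negative
        # content heavily for users in a vulnerable state of mind.
        w_mood = 2.0
    return w_engagement * post["engagement"] - w_mood * post["negativity"]

sad_post = {"engagement": 0.9, "negativity": 0.8}
neutral_post = {"engagement": 0.6, "negativity": 0.1}

# For a typical user the sad post ranks first (it engages more);
# for a vulnerable user the penalty flips the ordering.
```

The trade-off the team ran into is visible in the numbers: the reweighted feed deliberately surfaces the less engaging post, which is precisely why leadership balked.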

But anything that reduced engagement, even for reasons such as not exacerbating someone’s depression, led to a lot of hemming and hawing among leadership. With their performance reviews and salaries tied to the successful completion of projects, employees quickly learned to drop those that received pushback and continue working on those dictated from the top down….

That former employee, meanwhile, no longer lets his daughter use Facebook.

How do we fix this?

Haugen is against breaking up Facebook or repealing Section 230 of the US Communications Decency Act, which protects tech platforms from taking responsibility for the content they distribute.

Instead, she recommends carving out a more targeted exemption in Section 230 for algorithmic ranking, which she argues would “get rid of the engagement-based ranking.” She also advocates for a return to Facebook’s chronological news feed.

Ellery Roberts Biddle, a projects director at Ranking Digital Rights, a nonprofit that studies social media ranking systems and their impact on human rights, says a Section 230 carve-out would need to be vetted carefully: “I think it would have a narrow implication. I don’t think it would quite achieve what we might hope for.”

In order for such a carve-out to be actionable, she says, policymakers and the public would need to have a much greater level of transparency into how Facebook’s ad-targeting and content-ranking systems even work. “I understand Haugen’s intention—it makes sense,” she says. “But it’s tough. We haven’t actually answered the question of transparency around algorithms yet. There’s a lot more to do.”

Nonetheless, Haugen’s revelations and testimony have brought renewed attention to what many experts and Facebook employees have been saying for years: that unless Facebook changes the fundamental design of its algorithms, it will not make a meaningful dent in the platform’s issues. 

Her intervention also raises the prospect that if Facebook cannot put its own house in order, policymakers may force the issue.

“Congress can change the rules that Facebook plays by and stop the many harms it is now causing,” Haugen told the Senate. “I came forward at great personal risk because I believe we still have time to act, but we must act now.”

Artificial intelligence and the ethical dilemmas we are not ready to solve (Estadão)

André Aléxis de Almeida, October 4, 2021

During the novel-coronavirus pandemic, a series of innovations linked to artificial intelligence (AI) emerged. One example is the project “IACOV-BR: Inteligência Artificial para Covid-19 no Brasil” (Artificial Intelligence for Covid-19 in Brazil), from the Laboratório de Big Data e Análise Preditiva em Saúde at the School of Public Health of the Universidade de São Paulo (USP), which develops machine-learning algorithms to anticipate the diagnosis and prognosis of the disease and is conducted with partner hospitals across several regions of Brazil to support physicians and administrators.

Meanwhile, research from the Universidade Federal de São Paulo (Unifesp), in partnership with Rede D’Or and the Instituto Tecnológico de Aeronáutica (ITA), showed in a pilot phase that it is possible to quickly identify the severity of SARS-CoV-2 infections seen in emergency rooms by using AI to analyze a range of clinical markers and patients’ blood-test results.

These are just two cases, both Brazilian, among countless examples showing how the development and refinement of AI can benefit society. We must stress, however, that the technology is the proverbial double-edged sword. On one side, it advances humanity, optimizes processes, and drives disruption. On the other, it creates divergences and paradoxes and raises problems and dilemmas that once seemed unimaginable.

In 2020, for example, the police department of Detroit, in the US Midwest, was sued for arresting a Black man whom facial recognition software had wrongly identified as the perpetrator of a theft.

In addition, a study published in the journal Science in October 2019 found that software used in hospital care in the US favored white patients over Black patients in the queue for special programs aimed at treating chronic conditions such as kidney disease and diabetes. According to the researchers, the technology had been developed by an insurance company’s subsidiary and was used in the care of roughly 70 million patients.

More recently, in 2021, the Russian startup Xsolla laid off about 150 employees based on big-data analysis. Workers’ data from tools such as Jira (software for tracking tasks and monitoring projects), Gmail, and the corporate wiki Confluence, along with conversations and documents, was evaluated to classify them as “engaged” and “productive” in the remote-work environment. Those who fell short of expectations were dismissed. Controversial, to say the least, since an evaluation of results was replaced by the mere surveillance of employees.

Again, these are just a few examples in a sea of similar controversies, and the reality is that managers are not prepared to deal with them. The study “The State of Responsible AI: 2021,” produced by FICO in partnership with the market-intelligence firm Corinium, found that 65% of organizations cannot explain how the decisions or predictions of their AI models are made. The survey was based on conversations with 100 leaders of large global companies, including Brazilian ones. Furthermore, 73% of respondents said they were struggling to obtain executive support for prioritizing ethics and responsible AI practices.

Artificial-intelligence software and applications, which involve techniques such as big data and machine learning, are not perfect precisely because they were programmed by human beings. There is a difference, perhaps subtle at first glance, between being intelligent and being wise, and machines, at least for now, are not yet wise. In an algorithmic world, responsible AI, guided by ethics, should be the governance model. Everything indicates, however, as the FICO study showed, that neither executives nor programmers know how to steer in that direction.

This is where regulatory frameworks come in: they shed light on an issue, seek to prevent conflicts, and, when conflicts do occur, show how the problems should be resolved.

Just as it did with personal-data protection, the European Union is seeking to take the lead and become the global model for AI regulation. There, the debate is still incipient but already includes points such as creating an authority to promote AI rules in each EU member state. The regulation also targets AI that could endanger citizens’ safety and fundamental rights, as well as the need for greater transparency in the use of automation, such as chatbots.

In Brazil, the Legal Framework for Artificial Intelligence (Bill 21/2020) is already making its way through the National Congress, and the Chamber of Deputies has approved an urgency regime for it, which waives certain procedural formalities. Beyond the whole problem of the lack of in-depth legislative debate on the subject, the bill’s substitute text proved to be a real bombshell on the question of liability, stating that:

“(…) rules on the liability of agents operating in the development and operation chain of artificial-intelligence systems must, unless otherwise provided, be based on subjective (fault-based) liability, take into account the effective participation of those agents, the specific harms to be avoided or remedied, and how those agents can demonstrate compliance with the applicable rules through reasonable efforts consistent with international standards and market best practices.”

Whereas strict (objective) liability requires only proof of a causal link, subjective liability presupposes intent or fault. This means that agents in the development and operation chain of AI systems will only answer for harm those systems cause if it is proven that they intended the harmful result or acted with negligence, recklessness, or incompetence. Moreover, who are these agents? There is no definition whatsoever of who these operators would be.

In the rush to regulate, we risk ending up, as with so many other laws in our country, with for-show legislation (“para inglês ver”) that hinders more than it helps and that delivers injustice rather than justice. For now, Brazil has no recorded cases like those described at the beginning of this text, but inevitably it will; it is only a matter of time. And when that happens, the risk we run is of holding legislation incompatible with constitutional principles, legislation that does not protect citizens but makes them even more vulnerable.

*André Aléxis de Almeida is a lawyer, a specialist in constitutional law, holds a master’s degree in business law, and serves as a legal mentor to companies