Later On

A blog written for those whose interests more or less match mine.

Archive for the ‘Software’ Category

Can an Algorithm Tell When Kids Are in Danger?


Dan Hurley reports in the NY Times Magazine:

The call to Pittsburgh’s hotline for child abuse and neglect came in at 3:50 p.m. on the Wednesday after Thanksgiving 2016. Sitting in one of 12 cubicles, in a former factory now occupied by the Allegheny County Police Department and the back offices of the department of Children, Youth and Families, the call screener, Timothy Byrne, listened as a preschool teacher described what a 3-year-old child had told him. The little girl had said that a man, a friend of her mother’s, had been in her home when he “hurt their head and was bleeding and shaking on the floor and the bathtub.” The teacher said he had seen on the news that the mother’s boyfriend had overdosed and died in the home.

According to the case records, Byrne searched the department’s computer database for the family, finding allegations dating back to 2008: parental substance abuse, inadequate hygiene, domestic violence, inadequate provision of food and physical care, medical neglect and sexual abuse by an uncle involving one of the girl’s two older siblings. But none of those allegations had been substantiated. And while the current claim, of a man dying of an overdose in the child’s home, was shocking, it fell short of the minimal legal requirement for sending out a caseworker to knock on the family’s door and open an investigation.

Before closing the file, Byrne had to estimate the risk to the child’s future well-being. Screeners like him hear far more alarming stories of children in peril nearly every day. He keyed into the computer: “Low risk.” In the box where he had to select the likely threat to the children’s immediate safety, he chose “No safety threat.”

Had the decision been left solely to Byrne — as these decisions are left to screeners and their supervisors in jurisdictions around the world — that might have been the end of it. He would have, in industry parlance, screened the call out. That’s what happens to around half of the 14,000 or so allegations received each year in Allegheny County — reports that might involve charges of serious physical harm to the child, but can also include just about anything that a disgruntled landlord, noncustodial parent or nagging neighbor decides to call about. Nationally, 42 percent of the four million allegations received in 2015, involving 7.2 million children, were screened out, often based on sound legal reasoning but also because of judgment calls, opinions, biases and beliefs. And yet more United States children died in 2015 as a result of abuse and neglect — 1,670, according to the federal Administration for Children and Families; or twice that many, according to leaders in the field — than died of cancer.

This time, however, the decision to screen out or in was not Byrne’s alone. In August 2016, Allegheny County became the first jurisdiction in the United States, or anywhere else, to let a predictive-analytics algorithm — the same kind of sophisticated pattern analysis used in credit reports, the automated buying and selling of stocks and the hiring, firing and fielding of baseball players on World Series-winning teams — offer up a second opinion on every incoming call, in hopes of doing a better job of identifying the families most in need of intervention. And so Byrne’s final step in assessing the call was to click on the icon of the Allegheny Family Screening Tool.

After a few seconds, his screen displayed a vertical color bar, running from a green 1 (lowest risk) at the bottom to a red 20 (highest risk) on top. The assessment was based on a statistical analysis of four years of prior calls, using well over 100 criteria maintained in eight databases for jails, psychiatric services, public-welfare benefits, drug and alcohol treatment centers and more. For the 3-year-old’s family, the score came back as 19 out of a possible 20.
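
The article does not spell out what is inside the screening tool, but a minimal sketch can show how a model of this general kind might turn a family's record into a 1-to-20 score. Everything below (the features, the weights, and the percentile binning) is a hypothetical illustration, not the county's actual model:

```python
# Illustrative sketch only: the article does not disclose the Allegheny tool's
# model, features, or weights, so everything below is a hypothetical stand-in
# showing how a predicted probability could become a 1-to-20 screening score.
import numpy as np

def predicted_risk(features, weights, bias):
    """Toy logistic model: map a family's feature vector to a probability."""
    z = np.dot(features, weights) + bias
    return 1.0 / (1.0 + np.exp(-z))

def screening_score(probability, historical_probs):
    """Rank the probability against prior calls and bin it into ventiles (1-20)."""
    percentile = np.mean(np.asarray(historical_probs) <= probability)
    return int(min(20, max(1, np.ceil(percentile * 20))))

# Five hypothetical binary indicators (e.g., prior allegations, parental
# substance-abuse record, prior jail booking, public-benefit involvement,
# prior screened-out calls) with made-up weights.
rng = np.random.default_rng(0)
historical = rng.uniform(0, 1, size=10_000)  # stand-in for four years of prior calls
family = np.array([1, 1, 1, 0, 1], dtype=float)
prob = predicted_risk(family, weights=np.array([0.9, 1.1, 0.8, 0.4, 0.7]), bias=-2.0)
print(screening_score(prob, historical))     # a score toward the high end of the scale
```

In a scheme like this, the score expresses where a family's predicted risk falls relative to past calls, not an absolute probability of harm.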

Over the course of an 18-month investigation, officials in the county’s Office of Children, Youth and Families (C.Y.F.) offered me extraordinary access to their files and procedures, on the condition that I not identify the families involved. Exactly what in this family’s background led the screening tool to score it in the top 5 percent of risk for future abuse and neglect cannot be known for certain. But a close inspection of the files revealed that the mother was attending a drug-treatment center for addiction to opiates; that she had a history of arrest and jail on drug-possession charges; that the three fathers of the little girl and her two older siblings had significant drug or criminal histories, including allegations of violence; that one of the older siblings had a lifelong physical disability; and that the two younger children had received diagnoses of developmental or mental-health issues.

Finding all that information about the mother, her three children and their three fathers in the county’s maze of databases would have taken Byrne hours he did not have; call screeners are expected to render a decision on whether or not to open an investigation within an hour at most, and usually in half that time. Even then, he would have had no way of knowing which factors, or combinations of factors, are most predictive of future bad outcomes. The algorithm, however, searched the files and rendered its score in seconds. And so now, despite Byrne’s initial skepticism, the high score prompted him and his supervisor to screen the case in, marking it for further investigation. Within 24 hours, a C.Y.F. caseworker would have to “put eyes on” the children, meet the mother and see what a score of 19 looks like in flesh and blood.

For decades, debates over how to protect children from abuse and neglect have centered on which remedies work best: Is it better to provide services to parents to help them cope or should the kids be whisked out of the home as soon as possible? If they are removed, should they be placed with relatives or with foster parents? Beginning in 2012, though, two pioneering social scientists working on opposite sides of the globe — Emily Putnam-Hornstein, of the University of Southern California, and Rhema Vaithianathan, now a professor at the Auckland University of Technology in New Zealand — began asking a different question: Which families are most at risk and in need of help? “People like me are saying, ‘You know what, the quality of the services you provide might be just fine — it could be that you are providing them to the wrong families,’ ” Vaithianathan told me.

Vaithianathan, who is in her early 50s, emigrated from Sri Lanka to New Zealand as a child; Putnam-Hornstein, a decade younger, has lived in California for years. Both share an enthusiasm for the prospect of using public databases for the public good. Three years ago, the two were asked to investigate how predictive analytics could improve Allegheny County’s handling of maltreatment allegations, and they eventually found themselves focused on the call-screening process. They were brought in following a series of tragedies in which children died after their family had been screened out — the nightmare of every child-welfare agency.

One of the worst failures occurred on June 30, 2011, when firefighters were called to a blaze coming from a third-floor apartment on East Pittsburgh-McKeesport Boulevard. When firefighters broke down the locked door, the body of 7-year-old KiDonn Pollard-Ford was found under a pile of clothes in his bedroom, where he had apparently sought shelter from the smoke. KiDonn’s 4-year-old brother, KrisDon Williams-Pollard, was under a bed, not breathing. He was resuscitated outside, but died two days later in the hospital.

The children, it turned out, had been left alone by their mother, Kiaira Pollard, 27, when she went to work that night as an exotic dancer. She was said by neighbors to be an adoring mother of her two kids; the older boy was getting good grades in school. For C.Y.F., the bitterest part of the tragedy was that the department had received numerous calls about the family but had screened them all out as unworthy of a full investigation.

Incompetence on the part of the screeners? No, says Vaithianathan, who spent months with Putnam-Hornstein burrowing through the county’s databases to build their algorithm, based on all 76,964 allegations of maltreatment made between April 2010 and April 2014. “What the screeners have is a lot of data,” she told me, “but it’s quite difficult to navigate and know which factors are most important. Within a single call to C.Y.F., you might have two children, an alleged perpetrator, you’ll have Mom, you might have another adult in the household — all these people will have histories in the system that the person screening the call can go investigate. But the human brain is not that deft at harnessing and making sense of all that data.”

She and Putnam-Hornstein linked many dozens of data points — just about everything known to the county about each family before an allegation arrived — to predict how the children would fare afterward. What they found was startling and disturbing: 48 percent of the lowest-risk families were being screened in, while 27 percent of the highest-risk families were being screened out. Of the 18 calls to C.Y.F. between 2010 and 2014 in which a child was later killed or gravely injured as a result of parental maltreatment, eight cases, or 44 percent, had been screened out as not worth investigation.

According to Rachel Berger, a pediatrician who directs the child-abuse research center at Children’s Hospital of Pittsburgh and who led research for the federal Commission to Eliminate Child Abuse and Neglect Fatalities, the problem is not one of finding a needle in a haystack but of finding the right needle in a pile of needles. “All of these children are living in chaos,” she told me. “How does C.Y.F. pick out which ones are most in danger when they all have risk factors? You can’t believe the amount of subjectivity that goes into child-protection decisions. That’s why I love predictive analytics. It’s finally bringing some objectivity and science to decisions that can be so unbelievably life-changing.”

The morning after the algorithm prompted C.Y.F. to investigate the family of the 3-year-old who witnessed a fatal drug overdose, a caseworker named Emily Lankes knocked on their front door. The weathered, two-story brick building was surrounded by razed lots and boarded-up homes. No one answered, so Lankes drove to the child’s preschool. The little girl seemed fine. Lankes then called the mother’s cellphone. The woman asked repeatedly why she was being investigated, but agreed to a visit the next afternoon.

The home, Lankes found when she returned, had little furniture and no beds, though the 20-something mother insisted that she was in the process of securing those and that the children slept at relatives’ homes. All the appliances worked. There was food in the refrigerator. The mother’s disposition was hyper and erratic, but she insisted that she was clean of drugs and attending a treatment center. All three children denied having any worries about how their mother cared for them. Lankes would still need to confirm the mother’s story with her treatment center, but for the time being, it looked as though the algorithm had struck out.

Charges of faulty forecasts have accompanied the emergence of predictive analytics into public policy. And when it comes to criminal justice, where analytics are now entrenched as a tool for judges and parole boards, even larger complaints have arisen about the secrecy surrounding the workings of the algorithms themselves — most of which are developed, marketed and closely guarded by private firms. That’s a chief objection lodged against two Florida companies: Eckerd Connects, a nonprofit, and its for-profit partner, MindShare Technology. Their predictive-analytics package, called Rapid Safety Feedback, is now being used, the companies say, by child-welfare agencies in Connecticut, Louisiana, Maine, Oklahoma and Tennessee. Early last month, the Illinois Department of Children and Family Services announced that it would stop using the program, for which it had already been billed $366,000 — in part because Eckerd and MindShare refused to reveal details about what goes into their formula, even after the deaths of children whose cases had not been flagged as high risk.

The Allegheny Family Screening Tool developed by Vaithianathan and Putnam-Hornstein is different: It is owned by the county. Its workings are public. Its criteria are described in academic publications and picked apart by local officials. At public meetings held in downtown Pittsburgh before the system’s adoption, lawyers, child advocates, parents and even former foster children asked hard questions not only of the academics but also of the county administrators who invited them. . .

Continue reading.

Once again we see that government services can be better than services provided by for-profit corporations with proprietary interests.

Written by LeisureGuy

22 March 2018 at 5:30 pm

What Facebook Did to American Democracy—and why it was so hard to see it coming


Alexis Madrigal writes in the Atlantic:

In the media world, as in so many other realms, there is a sharp discontinuity in the timeline: before the 2016 election, and after.

Things we thought we understood—narratives, data, software, news events—have had to be reinterpreted in light of Donald Trump’s surprising win as well as the continuing questions about the role that misinformation and disinformation played in his election.

Tech journalists covering Facebook had a duty to cover what was happening before, during, and after the election. Reporters tried to see past their often liberal political orientations and the unprecedented actions of Donald Trump to see how 2016 was playing out on the internet. Every component of the chaotic digital campaign has been reported on, here at The Atlantic, and elsewhere: Facebook’s enormous distribution power for political information, rapacious partisanship reinforced by distinct media information spheres, the increasing scourge of “viral” hoaxes and other kinds of misinformation that could propagate through those networks, and the Russian information ops agency.

But no one delivered the synthesis that could have tied together all these disparate threads. It’s not that this hypothetical perfect story would have changed the outcome of the election. The real problem—for all political stripes—is understanding the set of conditions that led to Trump’s victory. The informational underpinnings of democracy have eroded, and no one has explained precisely how.

* * *

We’ve known since at least 2012 that Facebook was a powerful, non-neutral force in electoral politics. In that year, a combined University of California, San Diego and Facebook research team led by James Fowler published a study in Nature, which argued that Facebook’s “I Voted” button had driven a small but measurable increase in turnout, primarily among young people.

Rebecca Rosen’s 2012 story, “Did Facebook Give Democrats the Upper Hand?” relied on new research from Fowler, et al., about the presidential election that year. Again, the conclusion of their work was that Facebook’s get-out-the-vote message could have driven a substantial chunk of the increase in youth voter participation in the 2012 general election. Fowler told Rosen that it was “even possible that Facebook is completely responsible” for the youth voter increase. And because a higher proportion of young people vote Democratic than the general population, the net effect of Facebook’s GOTV effort would have been to help the Dems.

The research showed that a small design change by Facebook could have electoral repercussions, especially with America’s electoral-college format in which a few hotly contested states have a disproportionate impact on the national outcome. And the pro-liberal effect it implied became enshrined as an axiom of how campaign staffers, reporters, and academics viewed social media.

In June 2014, Harvard Law scholar Jonathan Zittrain wrote an essay in New Republic called, “Facebook Could Decide an Election Without Anyone Ever Finding Out,” in which he called attention to the possibility of Facebook selectively depressing voter turnout. (He also suggested that Facebook be seen as an “information fiduciary,” charged with certain special roles and responsibilities because it controls so much personal data.)

In late 2014, The Daily Dot called attention to an obscure Facebook-produced case study on how strategists defeated a statewide measure in Florida by relentlessly focusing Facebook ads on Broward and Dade counties, Democratic strongholds. Working with a tiny budget that would have allowed them to send a single mailer to just 150,000 households, the digital-advertising firm Chong and Koster was able to obtain remarkable results. “Where the Facebook ads appeared, we did almost 20 percentage points better than where they didn’t,” testified a leader of the firm. “Within that area, the people who saw the ads were 17 percent more likely to vote our way than the people who didn’t. Within that group, the people who voted the way we wanted them to, when asked why, often cited the messages they learned from the Facebook ads.”

In April 2016, Rob Meyer published “How Facebook Could Tilt the 2016 Election” after a company meeting in which some employees apparently put the stopping-Trump question to Mark Zuckerberg. Based on Fowler’s research, Meyer reimagined Zittrain’s hypothetical as a direct Facebook intervention to depress turnout among non-college graduates, who leaned Trump as a whole.

Facebook, of course, said it would never do such a thing. “Voting is a core value of democracy and we believe that supporting civic participation is an important contribution we can make to the community,” a spokesperson said. “We as a company are neutral—we have not and will not use our products in a way that attempts to influence how people vote.”

They wouldn’t do it intentionally, at least.

As all these examples show, though, the potential for Facebook to have an impact on an election was clear for at least half a decade before Donald Trump was elected. But rather than focusing specifically on the integrity of elections, most writers—myself included, some observers like Sasha Issenberg, Zeynep Tufekci, and Daniel Kreiss excepted—bundled electoral problems inside other, broader concerns like privacy, surveillance, tech ideology, media-industry competition, or the psychological effects of social media.

The same was true even of people inside Facebook. “If you’d come to me in 2012, when the last presidential election was raging and we were cooking up ever more complicated ways to monetize Facebook data, and told me that Russian agents in the Kremlin’s employ would be buying Facebook ads to subvert American democracy, I’d have asked where your tin-foil hat was,” wrote Antonio García Martínez, who managed ad targeting for Facebook back then. “And yet, now we live in that otherworldly political reality.”

Not to excuse us, but this was back on the Old Earth, too, when electoral politics was not the thing that every single person talked about all the time. There were other important dynamics to Facebook’s growing power that needed to be covered.

* * *

Facebook’s draw is its ability to give you what you want. Like a page, get more of that page’s posts; like a story, get more stories like that; interact with a person, get more of their updates. The way Facebook determines the ranking of the News Feed is the probability that you’ll like, comment on, or share a story. Shares are worth more than comments, which are both worth more than likes, but in all cases, the more likely you are to interact with a post, the higher up it will show in your News Feed. Two thousand kinds of data (or “features” in the industry parlance) get smelted in Facebook’s machine-learning system to make those predictions.
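
As a rough illustration of the ranking logic described above (predicted engagement, with shares weighted above comments and comments above likes), here is a minimal sketch. The weights and probabilities are invented, and the real system reportedly draws on thousands of features rather than three:

```python
# Hypothetical sketch of the engagement-weighted ranking the article describes:
# stories ordered by expected engagement, with shares worth more than comments
# and comments worth more than likes. The weights and probabilities are
# invented; the real system reportedly uses thousands of features.
from dataclasses import dataclass

@dataclass
class Story:
    title: str
    p_like: float     # predicted probability the viewer will like the post
    p_comment: float  # ... comment on it
    p_share: float    # ... share it

# Assumed relative weights (share > comment > like) -- not Facebook's actual values.
W_LIKE, W_COMMENT, W_SHARE = 1.0, 4.0, 8.0

def engagement_score(s: Story) -> float:
    return W_LIKE * s.p_like + W_COMMENT * s.p_comment + W_SHARE * s.p_share

def rank_feed(stories):
    # Higher expected engagement appears higher in the feed.
    return sorted(stories, key=engagement_score, reverse=True)

feed = rank_feed([
    Story("Cousin's vacation photos", p_like=0.30, p_comment=0.05, p_share=0.01),
    Story("Outrage-bait political post", p_like=0.20, p_comment=0.15, p_share=0.10),
    Story("Local news article", p_like=0.10, p_comment=0.02, p_share=0.02),
])
for s in feed:
    print(round(engagement_score(s), 2), s.title)
```

Running this ranks the outrage-bait post first even though the vacation photos are more likely to be liked, which is the dynamic the next paragraphs describe: the feed optimizes for predicted engagement, not editorial judgment.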

What’s crucial to understand is that, from the system’s perspective, success is correctly predicting what you’ll like, comment on, or share. That’s what matters. People call this “engagement.” There are other factors, as Slate’s Will Oremus noted in this rare story about the News Feed ranking team. But who knows how much weight they actually receive, or for how long, as the system evolves. For example, one change that Facebook highlighted to Oremus in early 2016—taking into account how long people look at a story, even if they don’t click it—was subsequently dismissed by Lars Backstrom, the VP of engineering in charge of News Feed ranking, in a May 2017 technical talk as a “noisy” signal that’s also “biased in a few ways,” making it “hard to use.”

Facebook’s engineers do not want to introduce noise into the system. Because the News Feed, this machine for generating engagement, is Facebook’s most important technical system. Their success predicting what you’ll like is why users spend an average of more than 50 minutes a day on the site, and why even the former creator of the “like” button worries about how well the site captures attention. News Feed works really well.

But as far as “personalized newspapers” go, this one’s editorial sensibilities are limited. Most people are far less likely to engage with viewpoints that they find confusing, annoying, incorrect, or abhorrent. And this is true not just in politics, but the broader culture.

That this could be a problem was apparent to many. Eli Pariser’s The Filter Bubble, which came out in the summer of 2011, became the most widely cited distillation of the effects Facebook and other internet platforms could have on public discourse.

Pariser began researching the book when he noticed that conservative people, whom he’d befriended on the platform despite his left-leaning politics, had disappeared from his News Feed. “I was still clicking my progressive friends’ links more than my conservative friends’ — and links to the latest Lady Gaga videos more than either,” he wrote. “So no conservative links for me.”

Through the book, he traces the many potential problems that the “personalization” of media might bring. Most germane to this discussion, he raised the point that if every one of the billion News Feeds is different, how can anyone understand what other people are seeing and responding to?

“The most serious political problem posed by filter bubbles is that they make it increasingly difficult to have a public argument. As the number of different segments and messages increases, it becomes harder and harder for the campaigns to track who’s saying what to whom,” Pariser wrote. “How does a [political] campaign know what its opponent is saying if ads are only targeted to white Jewish men between 28 and 34 who have expressed a fondness for U2 on Facebook and who donated to Barack Obama’s campaign?”

This did, indeed, become an enormous problem. When I was editor in chief of Fusion, we set about trying to track the “digital campaign” with several dedicated people. What we quickly realized was that there was both too much data—the noisiness of all the different posts by the various candidates and their associates—and too little. Targeting made it impossible to track the actual messaging that the campaigns were paying for. On Facebook, the campaigns could show ads only to the people they targeted. We couldn’t see the messages that were actually reaching people in battleground areas. From the outside, it was a technical impossibility to know what ads were running on Facebook, one that the company had fought to keep intact.

Pariser suggests in his book, “one simple solution to this problem would simply be to require campaigns to immediately disclose all of their online advertising materials and to whom each ad is targeted.” Which could happen in future campaigns.

Imagine if this had happened in 2016. . .

Continue reading.

Written by LeisureGuy

21 March 2018 at 11:36 am

Wow! This Facebook thing is a fusion bomb.


From Brian Stelter’s Reliable Sources:

Exec summary: Scroll down for Ralph Peters’ scorching statement about Fox News, NYMag’s new hire, Google’s subscription help, and another “Black Panther” record… Plus, a snow day delay in the AT&T trial…


What will Facebook do?

Lawmakers in the U.S. and the U.K. are asking Mark Zuckerberg to testify… FTC officials are making inquiries… And Cambridge Analytica is suspending CEO Alexander Nix.

What will Wednesday bring? Maybe a public statement from Zuckerberg and/or Sheryl Sandberg. As CNN’s Laurie Segall reported, frustration is brewing inside Facebook about the company’s response to this crisis…

–> FB says Zuckerberg and Sandberg are “working around the clock to get all the facts… The entire company is outraged we were deceived…”

–> Zuck’s former mentor Roger McNamee told Christiane Amanpour that Facebook is confronting a crisis of public trust “that is going to destroy the company…”

–> Wired’s Nicholas Thompson and Fred Vogelstein summed it up well here: “A Hurricane Flattens Facebook”

Meet the data scientist

Donie O’Sullivan emails: We tracked down Aleksandr Kogan, the scientist who swept up Facebook data on millions of Americans for Cambridge Analytica. He says Facebook is making him a scapegoat… He suspects thousands of other developers gathered Facebook data just like him… And he says he’s willing to talk to Congress…

–> Kogan’s point about FB: “Using users’ data for profit is their business model…”

–> Donie adds: For a guy who has prompted so much international intrigue these past few days, Kogan seems quite calm about it all. He seems to find it more surreal than anything else…

Continue reading.

And definitely watch this video.

Written by LeisureGuy

20 March 2018 at 9:08 pm

There’s no end to it: Facebook employs the psychologist whose firm sold data to Cambridge Analytica


Paul Lewis and Julia Carrie Wong report in the Guardian:

The co-director of a company that harvested data from tens of millions of Facebook users before selling it to the controversial data analytics firm Cambridge Analytica is currently working for the tech giant as an in-house psychologist.

Joseph Chancellor was one of two founding directors of Global Science Research (GSR), the company that harvested Facebook data using a personality app under the guise of academic research and later shared the data with Cambridge Analytica.

He was hired to work at Facebook as a quantitative social psychologist around November 2015, roughly two months after leaving GSR, which had by then acquired data on millions of Facebook users.

Chancellor is still working as a researcher at Facebook’s Menlo Park headquarters in California, where psychologists frequently conduct research and experiments using the company’s vast trove of data on more than 2 billion users.

It is not known how much Chancellor knew of the operation to harvest the data of more than 50 million Facebook users and pass their information on to the company that went on to run data analytics for Donald Trump’s presidential campaign.

Chancellor was a director of GSR along with Aleksandr Kogan, a more senior Cambridge University psychologist who is said to have devised the scheme to harvest Facebook data from people who used a personality app that was ostensibly acquiring data for academic research.

On Friday, Facebook announced it had suspended both Kogan and Cambridge Analytica from using the platform, pending an investigation.

Facebook said in a statement Kogan “gained access to this information in a legitimate way and through the proper channels” but “did not subsequently abide by our rules” because he passed the information on to third parties. Kogan maintains that he did nothing illegal and had a “close working relationship” with Facebook.

Facebook appears to have taken no action against Chancellor – Kogan’s business partner at the time their company acquired the data, using an app called thisisyourdigitallife.

Cambridge Analytica – a company owned by the hedge fund billionaire Robert Mercer, and headed at the time by Trump’s key adviser Steve Bannon – used the data to build sophisticated psychological profiles of US voters.

Facebook’s deputy general counsel has described the data harvesting scheme as “a scam” and “a fraud”. He singled out Kogan, an assistant professor at Cambridge University, as having “lied to us and violated our platform policies” by passing the data on to Cambridge Analytica.

Facebook’s public statements have omitted any reference to GSR, the company Kogan incorporated in May 2014 with Chancellor, who was at the time a postdoctoral research assistant. . .

Continue reading.

And having hired him, Facebook will make sure he doesn’t talk. See this Guardian article: “‘They’ll squash you like a bug’: how Silicon Valley keeps a lid on leakers.”

Written by LeisureGuy

18 March 2018 at 4:06 pm

Hackers, fed up with Twitter bots, are hunting them down themselves


Yael Grauer reports in the Intercept:

ONCE A MERE nuisance for Twitter, accounts created by software programs pretending to be human — “bots” — have become a major headache for the social network. In October, Twitter’s general counsel told a Senate committee investigating disinformation that Russian bots tweeted 1.4 million times during the run-up to the last presidential election, and such bots would later be implicated in hundreds of tweets that followed a school shooting in Florida. In January, the New York Times detailed how U.S. companies, executives, journalists, and celebrities often purchase bots as followers in an attempt to make themselves seem more popular.

The fallout for the company has been withering. In Vanity Fair last month, writer Nick Bilton, who has tracked the company closely as an author and journalist, accused Twitter of “turning a blind eye to the problem” of bots for years in order to artificially inflate its count of active users. Meanwhile, disgruntled former Twitter executives told Maya Kosoff, also in Vanity Fair, that the social network was throwing too many humans and too little technology at the problem of bots and other misbehavior. “You had this unsophisticated human army with no real scalable platform to plug into,” one said.

Even if Twitter hasn’t invested much in anti-bot software, some of its most technically proficient users have. They’re writing and refining code that can use Twitter’s public application programming interface, or API, as well as Google and other online interfaces, to ferret out fake accounts and bad actors. The effort, at least among the researchers I spoke with, has begun with hunting bots designed to promote pornographic material — a type of fake account that is particularly easy to spot — but the plan is to eventually broaden the hunt to other types of bots. The bot-hunting programming and research has been a strictly volunteer, part-time endeavor, but the efforts have collectively identified tens of thousands of fake accounts, underlining just how much low-hanging fruit remains for Twitter to prune.

Autodidacts at Automaton Detection

Among the part-time bot-hunters is French security researcher and freelance Android developer Baptiste Robert, who in February of this year noticed that Twitter accounts with profile photos of scantily clad women were liking his tweets or following him on Twitter. Aside from the sexually suggestive images, the bots had similarities. Not only did these Twitter accounts typically include profile photos of adult actresses, but they also had similar bios, followed similar accounts, liked more tweets than they retweeted, had fewer than 1,000 followers, and directed readers to click the link in their bios.

One of the first accounts Robert looked at, which is now suspended, linked to a site registered with a Russian email address that was also connected to several other domains. Robert said it looked like the various phishing sites had an identical schema and were likely operated by the same person.

So, Robert decided to create a proof-of-concept bot to show his followers that finding these accounts is pretty easy. After determining a set of similarities between the bots, he used advanced Google queries and Google reverse image search to find more of them. He then wrote a few hundred lines of code in the programming language Python to create a bot to hunt down and expose fake accounts. “It took less than one hour to write the first version of @PornBotHunter,” Robert said. “The first idea was to show how easy it was to find these bots by just using Google search.” The bot hunter tweets information on porn bots it has detected every hour, including Twitter handles, profile pictures, number of followers, longevity of the profile, biographical links, and whether Twitter has suspended the account. It also posts lengthier reports about its activities to Pastebin, a text hosting site popular among security researchers. Robert also allows people to report false positives to his regular Twitter account.
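
Robert's code is not reproduced in the article, but a minimal sketch of the kind of heuristic scoring it describes (small follower counts, more likes than retweets, a link in the bio, a profile photo that turns up in reverse image search) might look like the following. The field names and thresholds are assumptions for illustration:

```python
# A rough sketch of the heuristic signals described above -- not Baptiste
# Robert's actual @PornBotHunter code. Account data is assumed to have been
# fetched already (for example via Twitter's public API); the field names
# and thresholds here are assumptions for illustration.
SUSPICIOUS_BIO_PHRASES = ("click the link", "check my bio", "hot pics")

def bot_signals(account: dict) -> int:
    """Count how many telltale signals an account matches."""
    signals = 0
    if account.get("followers_count", 0) < 1000:
        signals += 1
    if account.get("likes_count", 0) > account.get("retweets_count", 0):
        signals += 1
    if account.get("bio_has_link", False):
        signals += 1
    bio = account.get("bio", "").lower()
    if any(phrase in bio for phrase in SUSPICIOUS_BIO_PHRASES):
        signals += 1
    if account.get("profile_image_reverse_matches", 0) > 0:
        # e.g. hits from a reverse image search of the profile photo
        signals += 1
    return signals

suspect = {
    "followers_count": 240,
    "likes_count": 3100,
    "retweets_count": 12,
    "bio": "DM me and click the link below xx",
    "bio_has_link": True,
    "profile_image_reverse_matches": 7,
}
print(bot_signals(suspect))  # 5 of 5 signals -> worth flagging for human review
```

A count like this only flags candidates for review; the article notes that Robert lets people report false positives, which any heuristic of this sort will produce.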

Robert is quick to admit that the software is just a proof of concept. He is planning to rewrite it to catch other types of bots as well, ranging from cryptocurrency bots to political bots. He also hopes to create a framework that’ll help people see how many bots are following them on Twitter. Once the project is stable and reviewed, he plans to open-source the code.

Still, it’s fascinating that a tool put together in just an hour is catching bots before Twitter does itself. As of March 1, @PornBotHunter has listed 197 spammy, apparently fake accounts in Pastebin, and 66, roughly a third, have yet to be suspended by Twitter. The others were suspended soon after being indexed by Google or after Robert reported them to Twitter. (A handful of the remaining active accounts, which may have only been compromised temporarily, do not appear to be spam accounts run by bots.) . . .

Continue reading.

It seems obvious that Twitter doesn’t want to close down the bots because bots increase the number of apparent users, a conflict between what is good for Twitter (at least in the short run) and what is good for the community.

Written by LeisureGuy

17 March 2018 at 11:16 am

Most lawyers don’t understand cryptography. So why do they dominate tech policy debates?


Henry Farrell writes in the Washington Post:

On Wednesday, the Trump administration appointed the renowned computer science professor Ed Felten to the Privacy and Civil Liberties Oversight Board (PCLOB). This is the first time that a nonlawyer has been appointed to the board, even though it has oversight responsibilities for a variety of complex technological issues.

The bias toward lawyers reflects a more general problem in the U.S. government. Lawyers dominate debates over privacy and technology policy, and people who have a deep understanding of complex technological questions, such as cryptography, are often shut out of the argument.

Some days ago, I interviewed Timothy Edgar, who served as the intelligence community’s first officer on civil liberties and is the author of the book “Beyond Snowden: Privacy, Mass Surveillance, and the Struggle to Reform the NSA,” about the reasons government policymaking isn’t as open to technological expertise as it ought to be.

The U.S. policy debate over surveillance mostly overlooks the ways in which cryptography could assure the privacy of data collected by the NSA and other entities. What broad benefits does cryptography offer?

When people think about cryptography, they mostly think about encrypting data and communications, like emails or instant messages, but modern cryptography offers many more capabilities. Today’s debate over surveillance ignores some of the ways these capabilities might allow the public to have the best of both worlds: robust intelligence collection with ironclad, mathematically rigorous privacy guarantees.

The problem is that many of these capabilities are counterintuitive. They seem like magic to those who are not aware of how cryptography has advanced over the past two decades. Because policymakers may not be aware of these advances, they view intelligence collection and privacy as a zero-sum game: more of one necessarily requires less of the other — but that’s a false trade-off.

Which specific techniques have cryptographers developed that could be applied to collected data?

Probably the most promising technology for ensuring the privacy of data that intelligence agencies are collecting is called encrypted search, something that my colleague at Brown, Prof. Seny Kamara, has helped pioneer. Imagine a large database that an intelligence agency like the NSA would like to query. The vast, vast majority of the data is irrelevant: It belongs to people that intelligence analysts should not be able to monitor. Of course, the agency could formulate queries and submit them to whoever owns the database, perhaps a telecommunications company or a digital services provider. But what if the agency is worried that its queries will reveal too much about its sensitive operations, and is not willing to take the chance that this information will leak?

Without encrypted search, the scenario I just outlined is a classic trade-off. Of course, the intelligence agency could simply forgo its queries, but if the stakes are too high — maybe the agency is trying to prevent a devastating terrorist attack — it could decide instead to engage in a highly intrusive intelligence practice called bulk collection. Bulk collection means the agency collects the entire database, including all the irrelevant information, hopefully with legal or policy safeguards to prevent abuse. Following the Snowden revelations in 2013, bulk collection of domestic data was reformed, but it remains an option when the NSA collects data outside the United States, even if that data includes communications with Americans.

Encrypted search allows us to do much better than this. The entire database is encrypted in a way that allows the intelligence agency to pose specific queries, which are also encrypted. Policymakers can decide what kinds of queries are appropriate. There are mathematically rigorous guarantees that ensure 1) the intelligence agency may only pose permissible queries, 2) the agency only receives the answers to those queries and does not receive any other data, and 3) the company will not learn what queries the agency has posed, offering the agency security for its operations.
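
As a very loose illustration of the shape of this idea (and not the encrypted-search constructions Kamara and colleagues have actually designed, which hide far more and support much richer queries), here is a toy sketch in which an oversight authority holds the keys, the agency bulk-collects only an encrypted index, and only authorized selectors can be looked up and decrypted. All names and the trust arrangement are assumptions for the example:

```python
# Toy illustration only: real encrypted-search schemes are far more
# sophisticated than this keyed-token lookup. Here an oversight authority
# holds the keys, the provider builds an encrypted index with those keys,
# the agency holds only the encrypted index, and only authorized selectors
# can be looked up and decrypted.
import base64, hashlib, hmac
from collections import defaultdict
from cryptography.fernet import Fernet  # third-party: pip install cryptography

INDEX_KEY = b"oversight-index-key-demo"    # held only by the oversight authority
MASTER_KEY = b"oversight-record-key-demo"  # held only by the oversight authority

def lookup_token(selector: str) -> str:
    # Deterministic keyed token: reveals nothing about the selector without the key.
    return hmac.new(INDEX_KEY, selector.encode(), hashlib.sha256).hexdigest()

def record_cipher(selector: str) -> Fernet:
    # Derive a per-selector encryption key from the authority's master key.
    raw = hmac.new(MASTER_KEY, selector.encode(), hashlib.sha256).digest()
    return Fernet(base64.urlsafe_b64encode(raw))

def build_encrypted_index(records):
    """Run on the provider side with authority-supplied keys: token -> ciphertexts."""
    index = defaultdict(list)
    for selector, payload in records:
        index[lookup_token(selector)].append(record_cipher(selector).encrypt(payload.encode()))
    return dict(index)

def authorize_query(selector: str, permissible: set):
    """Run by the oversight authority: only approved selectors yield usable material."""
    if selector not in permissible:
        raise PermissionError(f"query for {selector!r} is not permitted")
    return lookup_token(selector), record_cipher(selector)

# The agency bulk-collects only the encrypted index; it cannot read it on its own.
index = build_encrypted_index([
    ("+1-555-0100", "call record A"),
    ("+1-555-0100", "call record B"),
    ("+1-555-0199", "call record C"),  # irrelevant person: stays unreadable
])

token, cipher = authorize_query("+1-555-0100", permissible={"+1-555-0100"})
print([cipher.decrypt(c).decode() for c in index.get(token, [])])  # only records A and B
```

The point of the toy is property by property: the agency can only look up selectors the authority approves, it learns nothing about non-matching records, and the provider is never shown the queries at all. Real schemes achieve much stronger versions of these guarantees.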

Why is it that lawyers, rather than technologists, seem to dominate U.S. policy debates over technically complex subjects like surveillance and cryptography?

Lawyers have been dominating debates in the United States since at least the days when the French writer Alexis de Tocqueville wrote “Democracy in America” in 1831. De Tocqueville describes lawyers as occupying a place in American society similar to the aristocracies of Europe. If we examine just how many members of Congress, senior government officials and even business leaders are drawn from the legal profession today, it appears that little has changed in this regard in the subsequent two centuries. Lawyers tend to be verbal and overconfident [and thus are vulnerable to the Dunning-Kruger effect – LG]. Computer scientists are more prone to be reserved and even introverted.

The failure of lawyers and technologists to communicate well led the NSA to make some serious mistakes in the domestic bulk collection programs it was running until 2015, when they were reformed in the aftermath of the Snowden revelations. It has also, unfortunately, impeded the deployment of technologically based alternatives to intrusive intelligence programs.

Is this changing, and if it is changing, is it changing for the better or the worse? . . .

Continue reading.

Written by LeisureGuy

16 March 2018 at 1:25 pm

Enjoy Your Job in Software? You Have a Woman to Thank


Elaine Ou reports in Bloomberg:

The most tragic story of the computer industry is how a field once dominated by women became the domain of men. Contrary to popular belief, it wasn’t just a matter of the latter pushing out the former. To a large extent, men have women to thank for the very existence of their jobs.

Once upon a time, only programmers could interact with computers. It was considered a form of clerical work, like data entry or switchboard operation. Female programmers — known as computer “feeders,” because they fed data into a machine (hence the term “data feed”) — translated flow charts into logic operations, then punched the corresponding machine codes into cards.

A mathematician at Remington Rand, Dr. Grace Hopper recognized that human feeders were a bottleneck in the programming process. Hopper imagined that someday, nontechnical users could communicate directly with machines in English, bypassing the inefficient process of translating commands into cards. Although her employer dismissed the idea, Dr. Hopper went ahead and created her own English-like computer language called FLOW-MATIC.

At the same time, Hopper’s colleague Betty Holberton wrote the first automatic programming system — that is, a program that people can use to create or operate other programs. The two women contributed to what became one of the first widely used programming languages, COBOL (COmmon Business-Oriented Language).

COBOL obviated the need for human-to-machine translation, a process that in 1959 could require more than $600,000 and two years of effort for just one program. Software became both intelligible and reusable across different machines. Within 10 years, computer-feeding jobs were automated out of existence.

So women created the technology that took their jobs. But this gave rise to demand for all kinds of new tasks, such as developing the software that quickly became a critical component of every business sector, from banking to inventory control. Hopper’s vision of humans conversing with computers also led to tools such as Excel and Quickbooks, which provide accessible interfaces that translate users’ requests into code.

When people say that women are insufficiently represented in the computer industry, perhaps they’re defining it too narrowly. In a sense, everyone who uses a computer today — a management consultant armed with Microsoft Access, a teenager using Snapchat — is doing what the early programmers once did. Today’s database software is so far removed from the underlying computations that we don’t think of users as coders at all. . .

Continue reading.

Here’s an illustration from the article, giving a 1967 view of computers and programming.

Written by LeisureGuy

13 March 2018 at 9:03 am
