Tag Archives: technology

Just data?… White House confronts subtle algorithmic discrimination

Updated: On rereading this post from the weekend I realized that this was possibly the most boring post ever written on a topic that isn’t boring at all.  So I’ve re-edited this  – for fun-ness.


I’m loving yesterday’s White House report on big data. It takes on how to hold commercial use of big data to the same ethical standards to which we hold traditional business practices (theoretically, anyway). There’s also good discussion of important security and privacy issues. But the really welcome stuff in Big Data: Seizing Opportunities, Preserving Values is its discussion of how commercial data analysis is already shaping our daily lives in less obvious but potentially profound ways.

Let’s start with digital red-lining. As I alluded to in my previous post on high frequency trading, whether it’s bots or bank managers doing the “red-lining” of no-loan districts, it should be equally illegal. Or as the New York Times put it in their coverage of the WH report:

“the same technology that is often so useful in predicting places that would be struck by floods or diagnosing hard-to-find illnesses in infants also has “the potential to eclipse longstanding civil rights protections in how personal information is used in housing, credit, employment, health, education and the marketplace.””

But in addition to direct threats to civil rights protections, the report surfaces the small “d” discrimination inherent in new far more sophisticated mechanisms to characterize and segment consumers.  The structure of these algorithms is largely invisible, often proprietary and intentionally “seamless”, but their effect is to differently shape the world (the information, products, prices, opportunities) that each of us has access to through the internet and, increasingly, offline as well.

There’s nothing necessarily wrong with this; indeed, many of these algorithms are essential and powerful tools without which we’d struggle to make sense of the sea of data in which we already swim. But if we’re blithely oblivious, or treat algorithmic decision-making as somehow inscrutable (see my post on HFT), it doesn’t take a lot of imagination to come up with a whole host of ways things could go south in a hurry. As the report rather drily puts it in the recommendations section, for example:

“Consumers have a legitimate expectation of knowing whether the prices they are offered for goods and services are systematically different than the prices offered to others.”

So it’s great to see the Administration launch a civic discussion of how we enhance transparency and accountability in big data analytics. What’s more, the discipline of carrying this out could very well have knock-on benefits for the rest of the economy.

As the report also flags, big data and the tools to analyze it also hold immense potential to improve the performance, transparency, and accountability of complex systems like healthcare – which (I’d add) has long been rife with waste and inequity masked by old-fashioned complexity and technological obfuscation.

It’s time we got used to asking what’s happening behind the curtain.


Below, some highlights on these topics from the report itself:

On “Algorithms, Alternative Scoring and the Specter of Discrimination”

The business models and big data strategies now being built around the collection and use of consumer data, particularly among the “third-party” data services companies, raise important questions about how to ensure transparency and accountability in these practices. Powerful algorithms can unlock value in the vast troves of information available to businesses, and can help empower consumers, but also raise the potential of encoding discrimination in automated decisions. Fueled by greater access to data and powerful analytics, there are now a host of products that “score” individuals beyond the scope of traditional credit scores, which are regulated by law.110 These products attempt to statistically characterize everything from a consumer’s ability to pay to whether, on the basis of their social media posts, they are a “social influencer” or “socially influenced.”

While these scores may be generated for marketing purposes, they can also in practice be used similarly to regulated credit scores in ways that influence an individuals’ opportunities to find housing, forecast their job security, or estimate their health, outside of the protections of the Fair Credit Reporting Act or Equal Credit Opportunity Act.111 Details on what types of data are included in these scores and the algorithms used for assigning attributes to an individual are held closely by companies and largely invisible to consumers. That means there is often no meaningful avenue for either identifying harms or holding any entity in the decision-making chain accountable.”

On consumer and search related issues:

“The fusion of many different kinds of data, processed in real time, has the power to deliver exactly the right message, product, or service to consumers before they even ask. Small bits of data can be brought together to create a clear picture of a person to predict preferences or behaviors. These detailed personal profiles and personalized experiences are effective in the consumer marketplace and can deliver products and offers to precise segments of the population—like a professional accountant with a passion for knitting, or a home chef with a penchant for horror films.

Unfortunately, “perfect personalization” also leaves room for subtle and not-so-subtle forms of discrimination in pricing, services, and opportunities. For example, one study found web searches involving black-identifying names (e.g., “Jermaine”) were more likely to display ads with the word “arrest” in them than searches with white-identifying names (e.g., “Geoffrey”). This research was not able to determine exactly why a racially biased result occurred, recognizing that ad display is algorithmically generated based on a number of variables and decision processes.17 But it’s clear that outcomes like these, by serving up different kinds of information to different groups, have the potential to cause real harm to individuals, whether they are pursuing a job, purchasing a home, or simply searching for information.

Another concern is that big data technology could assign people to ideologically or culturally segregated enclaves known as “filter bubbles” that effectively prevent them from encountering information that challenges their biases or assumptions.18 Extensive profiles about individuals and their preferences are being painstakingly developed by companies that acquire and process increasing amounts of data. Public awareness of the scope and scale of these activities is limited, however, and consumers have few opportunities to control the collection, use, and re-use of these data profiles.”

Relatedly, in the privacy sphere:

“As techniques like data fusion make big data analytics more powerful, the challenges to current expectations of privacy grow more serious. When data is initially linked to an individual or device, some privacy-protective technology seeks to remove this linkage, or “de-identify” personally identifiable information—but equally effective techniques exist to pull the pieces back together through “re-identification.” Similarly, integrating diverse data can lead to what some analysts call the “mosaic effect,” whereby personally identifiable information can be derived or inferred from datasets that do not even include personal identifiers, bringing into focus a picture of who an individual is and what he or she likes.

Many technologists are of the view that de-identification of data as a means of protecting individual privacy is, at best, a limited proposition.19”

And finally, in general:

“Recognizing that big data technologies are used far beyond the intelligence community, this report has taken a broad view of the issues implicated by big data. These new technologies do not only test individual privacy, whether defined as the right to be let alone, the right to control one’s identity, or some other variation. Some of the most profound challenges revealed during this review concern how big data analytics may lead to disparate inequitable treatment, particularly of disadvantaged groups, or create such an opaque decision-making environment that individual autonomy is lost in an impenetrable set of algorithms.

These are not unsolvable problems, but they merit deep and serious consideration. The historian Melvin Kranzberg’s First Law of Technology is important to keep in mind: “Technology is neither good nor bad; nor is it neutral.”22 Technology can be used for the public good, but so too can it be used for individual harm. Regardless of technological advances, the American public retains the power to structure the policies and laws that govern the use of new technologies in a way that protects foundational values.

Big data is changing the world. But it is not changing Americans’ belief in the value of protecting personal privacy, of ensuring fairness, or of preventing discrimination. This report aims to encourage the use of data to advance social good, particularly where markets and existing institutions do not otherwise support such progress, while at the same time supporting frameworks, structures, and research that help protect our core values.”

The whole report and its recommendations are here.


Attention Titans of Tech – This is What $ Billions and Billions are For

This post got delayed and there haven’t been quite as many multi-billion tech investments in the news this week.  But for any coming down the pike, I’m voting for more steel (or carbon fiber or whatever) in the ground.

Tesla’s announcement a few weeks back of its plans to invest $5 billion in a 10-million-square-foot “gigafactory” in the US to build batteries for its vehicle and solar businesses is worth underscoring as we debate the future of work, innovation, and what to do with billions and billions of dollars you may have on hand. Here are my top 5 reasons Tesla’s announcement is important:


Photo: Tesla from the company’s powerpoint on the project at: http://bit.ly/1dANXNY

  1.  It provides a vivid counter to the hand-wringing around jobless innovation that followed the Facebook/WhatsApp acquisition. Certainly some new technologies today (as in every era) may threaten today’s jobs, but we should be careful to separate the impacts of disruptive technologies from other trends that keep us from getting the benefits we should from innovation. Too much emphasis on big, quick payouts over longer-term productive investment, and on innovation alone over the things that embody that innovation, has been damaging job creation for years, and the gigafactory is a salutary counter-example. Turning innovation into jobs requires investment not just in ideas, but in building the products that embody those ideas. Not only does choosing to build cutting-edge products in the U.S. enhance the overall economic value of innovation to the American economy, but the feedback loops between technology innovation, manufacturing, system integration, and product improvement are critical to sustaining an innovation edge.
  2.  High tech manufacturing is a compelling path to job growth in the US. Tesla is projecting 6,500 jobs at the new factory. Is that as many direct jobs as a $5B investment would have created 30 years ago? Of course not. Today’s manufacturing is much more capital intensive. But it’s approximately 500 times as many jobs per billion dollars as the WhatsApp investment. Furthermore, the capital intensiveness of today’s advanced manufacturing is part of what makes the US newly competitive. Today even comparatively well-paid labor is a small fraction of total product cost, so quality, co-location of engineering with markets and the supply chain, scale (see below), and the relationship between management and workers (that’s another blog) can be much more important.
  3.  If Tesla’s stock price doesn’t already have you convinced, the EV market – and the energy storage market – is SO not dead. The gigafactory as described will have the capacity to build more batteries alone than total global production today. Tesla has the necessity, or the luxury, of being more single-minded, but virtually every automaker worldwide is investing – and some, like GM, are investing heavily – in vehicle electrification, in batteries, and in the connection of vehicles to the home, business, and clean energy generation. Numerous energy companies are similarly focused. This isn’t just because of the economic, environmental, and political attractions of leaving the gas station behind. It’s because electrification, energy storage, and machine-to-machine technology are on the cusp of delivering cars, homes, businesses, and energy systems with transformative functionality. Put differently, the gigafactory – and these other U.S. investments in advanced vehicles, energy storage, and energy management – leverage disruption in cars and power, both central to the American way of life and to preventing runaway climate change, and do so in ways that help ensure the American economy and American workers are part and parcel of this disruption, not left behind by it.
  4.  And speaking of American ingenuity… it’s payback time. Let’s take some time to heap scorn on those who naysayed the government’s overwhelmingly successful investment portfolio in clean energy and advanced vehicle technology. The DOE’s Advanced Technology Vehicle Manufacturing loan program provided Tesla a $465 million loan (that’s nearly half a billion dollars) at a time when it, like the rest of the industry, was in dire economic straits and when the financial sector was unwilling to take on risky automotive projects. Tesla used that loan to retool its factory in Fremont, California, to build the Model S (a factory that in another era was the site of an innovative partnership between GM and Toyota); it paid taxpayers back early; and now the US economy is seeing an additional payback at even greater scale.
  5.  Finally, it’s hard not to love/hate/envy a guy who gets a cameo in Iron Man.  But I have to credit Elon Musk for being what I hope is the leading edge of a cultural and strategic shift in Silicon Valley –  a hot IPO, corporate statements on Twitter, plus steel in the ground.

Why we all should be following high frequency trading (and how to make the most of the robot revolution)

High frequency trading keeps showing up in the news. We’ve seen flash crashes, system inefficiencies, and most recently, as Matthew O’Brien outlines in his recent piece on the Atlantic’s blog, data sources selling early information to these traders.

We should be concerned about wasteful speculation, and the specter of super-fast quant-designed algobots skimming profits from ‘ordinary’ investors naturally raises hot button questions in a climate of concern over financial markets.

But high frequency trading and the various proposals to address it are equally valuable as a case study in how we interact and govern in an increasingly digital, intelligent-machine-enabled world.


Speaking of ‘ordinary’ traders.  A sleek new design for the Bloomberg terminal from Colin P Kelly et al. photo: colinpkelly.com

O’Brien highlights recurring news of HFT firms buying corporate and economic information from third parties fractions of a second ahead of the public release, allowing them time to trade on the data before the rest of the market. I would tend to agree with O’Brien that this is different only in style and timescale (and current legality) from a company CFO selling earnings data to a trader the day before the public report. The fact that we can make an illicit insider gain in a fraction of a second that used to take several days is beside the point. (Business Wire announced Thursday it would stop selling its reports directly to HFT firms.) And technology cuts both ways. Some have argued that some traditional insider trading might be moot if we updated the way companies report information to a more continuous stream that better aligns with today’s information technology.

Using data sold a second ahead is just an easier-to-visualize version of what some types of HFT are designed to do routinely, however. Acting on a millisecond timescale, high frequency trading computer algorithms – what finance writer Felix Salmon has called ‘algobots’ – are designed to jump in between ‘regular’ buy and sell bids as they travel between exchanges and capture the gains (so-called “latency arbitrage”). While speed and automation can certainly improve the functioning of the market, several recent studies have confirmed that much HFT does not; instead, it reduces efficiency and profits in the market overall. In addition, the volume and speed of HFT algobots’ machine-to-machine decisions can amplify errors into ‘irrational’ feedback loops and brief but crazy market swings – so-called “flash crashes”.
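To make “latency arbitrage” concrete, here’s a toy numerical sketch – the function name, prices, sizes, and timings are all invented for illustration, not drawn from the studies. A bot that sees a quote update on one exchange before it propagates to another can buy at the stale price and immediately sell at the fresh one:

```python
# Hypothetical illustration of latency arbitrage; all numbers are invented.
def latency_arbitrage(stale_ask, fresh_bid, shares):
    """Profit from buying at a not-yet-updated ask and selling at the new bid."""
    if fresh_bid > stale_ask:
        return (fresh_bid - stale_ask) * shares
    return 0.0

# Exchange A reprices the stock at $10.02; for a few hundred microseconds
# Exchange B still quotes an ask of $10.00. A bot co-located at both venues
# buys 1,000 shares on B and sells them on A, capturing the spread:
profit = latency_arbitrage(stale_ask=10.00, fresh_bid=10.02, shares=1000)
print(f"arbitrage profit: ${profit:.2f}")
```

Multiplied across thousands of symbols and millions of quote updates a day, those captured spreads add up to the kind of market-wide loss the studies below set out to quantify.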

For all these reasons there are calls to regulate.  O’Brien calls for a financial transactions tax. But this is where things get interesting.

There are many good arguments to be made for a FTT.  Such a tax could be a fair and effective instrument to retard some speculative trading and/or to generate revenue to address the various public costs of unproductively short term, risky or speculative behavior in financial markets.  But it may be a blunt instrument to address predatory high frequency trading, per se.

By contrast, this past summer two sets of researchers from different disciplines – economics and computer science, respectively – separately demonstrated the market loss resulting from HFT, then proposed a technological fix for the exchanges instead of a tax: rather than trading continuously, the exchange system would move to a ‘discrete time market’ or ‘frequent batch auction’ in which buy and sell orders clear in discrete intervals of less than a second. This is plenty fast enough for market liquidity and better tracks “real” changes in price, but virtually eliminates the opportunity for latency arbitrage. These studies are a great read, and their technological fix is elegant. In October, IEX, an “upstart” stock exchange, launched using just this type of technology.
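To see why batching removes the speed advantage, here’s a minimal sketch of one clearing interval – my own illustration, not the actual researchers’ market design. The order book numbers are invented, and setting the clearing price at the midpoint of the marginal matched bid and ask is just one simple convention among several:

```python
def clear_batch(bids, asks):
    """Uniform-price clearing of one batch interval.

    bids/asks are (price, quantity) tuples collected during the interval.
    Returns (clearing_price, volume); all matched shares trade at one price,
    so arrival order within the interval confers no advantage.
    """
    bids = sorted(bids, reverse=True)          # best (highest) bid first
    asks = sorted(asks)                        # best (lowest) ask first
    bi = ai = volume = bid_left = ask_left = 0
    last_bid = last_ask = None
    while bi < len(bids) and ai < len(asks):
        if bid_left == 0:
            bid_price, bid_left = bids[bi]     # load next bid order
        if ask_left == 0:
            ask_price, ask_left = asks[ai]     # load next ask order
        if bid_price < ask_price:
            break                              # remaining orders don't cross
        traded = min(bid_left, ask_left)
        volume += traded
        bid_left -= traded
        ask_left -= traded
        last_bid, last_ask = bid_price, ask_price
        if bid_left == 0:
            bi += 1
        if ask_left == 0:
            ai += 1
    if volume == 0:
        return None, 0
    return (last_bid + last_ask) / 2, volume   # midpoint of marginal pair

# Orders accumulated during one (say) 100 ms interval:
price, qty = clear_batch(
    bids=[(10.05, 300), (10.02, 200)],
    asks=[(10.00, 250), (10.03, 400)],
)
```

In this made-up book, every crossing share prints at the same uniform price, so a bot that reached the exchange a microsecond before everyone else in the interval gets exactly the same fill – which is the point of the design.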

My guess is that some combination of tax and technology policy is probably ideal. But whether or not we’re concerned with financial markets, it’s worth reading the HFT studies and keeping an eye on the progress of IEX, because finance is also a great place to explore a world in which we’re all cyborgs of a sort already, and we’re grappling with the challenge that poses to work, ethics, and government.

For example, in assessing the impacts of HFT, the authors of these studies reach inside the technobabble, explore what the algorithms actually do, and quantify their effects in the non-digital world. Then they propose a tailored fix. We should take notice because it’s not just HFT algobots that are running about. Our lives are increasingly shaped by a host of algorithms, bots, and machine-to-machine communications whose specifics and impacts are completely nontransparent. Where bank managers once sat down with a computer and a set of statistics and prejudices to red-line ‘no loan’ districts, now companies and search engines routinely offer you and your neighbor different products at different prices – or not at all. Why, how, and what’s the effect? We won’t know unless we make this kind of systematic inquiry routine.

Second, and more philosophically, they propose a somewhat counter-intuitive solution to the critical current challenge of digital speed and information. After centuries in which decision-making was generally improved by more and faster data, we’ve reached an inflection point where we have so much data and speed that they can degrade the efficiency with which we understand and act. Learning to harness big data is one of the most exciting opportunities today, and our usual inclination is to build an additional tool or algorithm to help us understand, understand faster, or identify errors. What the discrete time markets proposal does is the opposite. It slows down action in the market to the speed (still very fast) at which real, information-rich decisions can be made (whether those decisions are made by humans or bots). Of course the “speed of reality” (as Michael Wellman discusses in his blog on the University of Michigan HFT research he led) – the speed at which we generate sound, actionable information – is different in every network and has different consequences. In the HFT situation, trading churn that adds nothing to price discovery likely costs the stock market billions every year. Actions taken too quickly on crowd-sourced data during the Boston bombing, for example, fingered innocent bystanders as criminals.

And finally, back to the robot revolution.  There’s hardly any trader left today, not even grandmoms and pops, who makes trading decisions without a computer.  And when we talk about “ordinary investors” above, what we mean are hedge funds and institutional investors whose trading desks are profoundly digitally connected and running analytical software rife with algorithms of their own. So when algobots talk to each other, execute high frequency trades without much adult supervision, and are prone to irrational swings, are they just acting like regular traders?  Who’s best to regulate, then? Human or machine? And does any of this impact the underlying direction and value of our investments?

Debating HFT reminds us that it is worth continuously asking whether our current frameworks of oversight are sufficient to capture the unique problems and opportunities that arise from machine-to-machine decision-making. At the same time, the larger challenge is ensuring that our activities and our regulatory approaches (whether carried out by humans or bots) accord with the societal objectives we wish to achieve. And that’s our task as citizens.


For fabulous further reading….

Felix Salmon puts the race for (and perhaps past) better market information in historical context: http://blogs.reuters.com/felix-salmon/2012/10/06/the-problem-with-high-frequency-trading/

Is HFT wasteful?  And how to fix it:

Latency Arbitrage, Market Fragmentation, and Efficiency: A Two-Market Model, from Michael Wellman and Elaine Wah of the University of Michigan’s Strategic Reasoning Group, and Wellman’s article on the same:

Trading faster than the speed of reality

The High-Frequency Trading Arms Race: Frequent Batch Auctions as a Market Design Response, from Eric Budish of the University of Chicago Booth School of Business et al., and the blog post on their study, An alternative to high frequency trading

Research firm NANEX’s now-famous clip of what happens in ½ second of trading in a single stock, here shown at original and slowed-down speeds; Matt Gongloff’s discussion of what’s happening in it (albeit with very colorful adjectives).


Swarm of flying workarounds to ’70s-era infrastructure?

Lots of good uses for drones, lots of good places for transportation innovation as a foundation for commerce, but ah…


…not a problem for UPS

Here’s my riff on the Amazon Drone:  Missing the forest for the trees, assuming, ah…
