Tuesday, December 29, 2009

Here's an interesting take from Wired via Futurismic:
Homeopapes: journalism by machine
Paul Raven @ 29-12-2009
Here’s an interesting piece at Wired UK that picks up the “OMG journalism is dying” ball and runs with it in the direction of automated machine-to-machine and machine-to-person news aggregation:

NewsScope is a machine-readable news service designed for financial institutions that make their money from automated, event-driven trading. Triggered by signals detected by algorithms within vast mountains of real-time data, trading of this kind now accounts for a significant proportion of turnover in the world’s financial centres.

Reuters’ algorithms parse news stories. Then they assign “sentiment scores” to words and phrases. The company argues that its systems are able to do this “faster and more consistently than human operators”.

Millisecond by millisecond, the aim is to calculate “prevailing sentiment” surrounding specific companies, sectors, indices and markets. Untouched by human hand, these measurements of sentiment feed into the pools of raw data that trigger trading strategies.
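
To make the idea of "sentiment scores" a little more concrete, here is a minimal sketch of word-level scoring rolled up into a per-company figure. It is not Reuters' method, which the article does not describe in detail; the lexicon, its weights, and the function names `score_headline` and `prevailing_sentiment` are all invented for illustration.

```python
# Hypothetical lexicon: words and weights are made up for illustration,
# not taken from NewsScope or any real sentiment data set.
LEXICON = {
    "beats": 1.0,
    "surge": 1.0,
    "upgrade": 0.5,
    "misses": -1.0,
    "plunge": -1.0,
    "lawsuit": -0.5,
}


def score_headline(text):
    """Sum the lexicon weights of the words in one headline."""
    total = 0.0
    for word in text.lower().split():
        # Strip simple punctuation so "surge," still matches "surge".
        total += LEXICON.get(word.strip(".,!?\"'"), 0.0)
    return total


def prevailing_sentiment(items):
    """Average headline score per company.

    `items` is an iterable of (company, headline) pairs.
    """
    totals = {}
    counts = {}
    for company, headline in items:
        totals[company] = totals.get(company, 0.0) + score_headline(headline)
        counts[company] = counts.get(company, 0) + 1
    return {company: totals[company] / counts[company] for company in totals}
```

A real system would presumably weight recency, handle negation ("fails to beat"), and score phrases rather than single words; this sketch only shows the basic shape of turning text into a number that a trading strategy could read.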

[...]

Here and there, interesting possibilities are emerging. Earlier this year, at Northwestern University in the US, a group of computer science and journalism students rigged up a programme called Stats Monkey that uses statistical data to generate news reports on baseball matches.

Stats Monkey relies upon two key metrics: Game Score (which allows a computer to figure out which team members are influencing the action most significantly) and Win Probability (which analyses the state of a game at any particular moment, and calculates which side is likely to win).

Combining the two, Stats Monkey identifies the players who change the course of games, alongside specific turning points in the action. The rest of the process involves on-the-fly assembly of templated “narrative arcs” to describe the action in a format recognisable as a news story.
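
The turning-point and template steps described above can be sketched roughly as follows. This is not Stats Monkey's code: the `Play` record, the `turning_point` and `write_story` functions, and the template wording are assumptions made for illustration, and the win probabilities are taken as given inputs rather than calculated. Game Score is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Play:
    """One event in a game (hypothetical record layout)."""
    inning: int
    player: str
    description: str
    win_prob_after: float  # eventual winner's win probability after this play


def turning_point(plays, start_prob=0.5):
    """Return the play that moved the win probability the most."""
    if not plays:
        raise ValueError("need at least one play")
    best = None
    best_swing = -1.0
    previous = start_prob
    for play in plays:
        swing = abs(play.win_prob_after - previous)
        if swing > best_swing:
            best, best_swing = play, swing
        previous = play.win_prob_after
    return best


# A single templated "narrative arc"; a fuller system would choose among many.
TEMPLATE = (
    "{winner} beat {loser} {winner_runs}-{loser_runs}. "
    "The game turned in inning {inning}, when {player} {description}."
)


def write_story(winner, loser, winner_runs, loser_runs, plays):
    """Fill the template using the biggest win-probability swing."""
    key = turning_point(plays)
    return TEMPLATE.format(
        winner=winner,
        loser=loser,
        winner_runs=winner_runs,
        loser_runs=loser_runs,
        inning=key.inning,
        player=key.player,
        description=key.description,
    )
```

Given a list of plays with their win probabilities, `write_story` picks the single largest swing and drops it into the sentence template, which is the "on-the-fly assembly" idea in miniature.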

The resulting news stories read surprisingly well. If we assume that the underlying data is accurate, there’s little to prevent newspapers from using similar techniques to report a wide range of sporting events.
