This is a piece that I wrote for The Media Briefing. It can be seen in situ (including links to source articles) here.
The number of journalists employed in the UK has declined by 6,000 (9%) since 2013 as publishers have had to cut costs. That means publishers are cutting content creators at the very time they are demanding more content. Some experienced writers have been replaced by younger, cheaper ‘digital natives’ but publishers will increasingly use robo-journalists instead.
If you think that’s far-fetched – they’re here already, learning fast.
The career path of the robo-journalist
Robots are employed by several publishers including The New York Times, LA Times and Forbes.
They’ve shown a natural aptitude for data so careers have started on sport and business desks but they’re moving into breaking news and investigative journalism.
Like all junior reporters they’re learning from their copy editors, although in this case the ‘subs’ are there to re-programme them literally, not metaphorically.
Like junior reporters they can learn from and draw on a back catalogue of great writing – but with more powerful memories and analytical techniques.
A few big publishers will understand their potential and let them shine. Other publishers will only ever give them mundane jobs.
They’ll open doors for the sort of nimble new companies that arrive during disruption, able to use technology for what it is suited to.
Silicon Valley is already immersed in the technology behind robo-journalism but will use it first in fields like healthcare. They’ll enter robo-journalism later, buying or eliminating the newcomers.
Take a step back. How good is robo-journalism?
It has taken ages to reach a point where people in tests can’t tell the difference between machine-written articles and similar articles by humans. A key feature of the ‘new machine age’ is that slow development quickly turns to accelerated gains.
A hard exercise has been getting journalists to verbalise what they’ve learnt to do instinctively. Once verbalised, those lessons are turned into algorithms. Machines can then trawl through wire stories, the Internet, press releases and data sets, finding and writing stories.
They don’t, of course, knock on doors, burn shoe leather or make contacts and phone calls. However, they can do the same tasks as the increasing proportion of journalists set to re-packaging news or making sense of the increasingly digitised data that informs the news.
After algorithm creation there’s slow fine-tuning. This is a labour-intensive human process of re-programming, but it is coming on apace.
The Associated Press once checked everything machines produced but now they put the majority of it on the wire directly.
Other companies will check everything, but decoupling machines from overzealous human chaperoning will be essential to take full advantage. It is what will make new entrant companies nimble; it is what will hold back established publishers. Machines produce more if checking doesn’t slow output.
How machines learn
Machines can learn language from large banks of expensively produced, comparable texts.
They translate between languages by comparing decades of EU and UN reports, expensively translated into multiple languages by humans. When asked to translate a sentence they scan these translations to find a close match or a few fragments they can add together.
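As a rough illustration of the "find a close match" step described above, here is a minimal Python sketch. It assumes a tiny, invented bank of English/French sentence pairs and uses simple character-level similarity from the standard library; real translation systems are far more sophisticated, and nothing here is taken from any publisher's or vendor's actual code.

```python
from difflib import SequenceMatcher

# Toy "translation memory": invented sentence pairs standing in for the
# decades of human-translated reports mentioned in the text.
MEMORY = [
    ("The committee approved the budget.", "Le comité a approuvé le budget."),
    ("The meeting is postponed.", "La réunion est reportée."),
    ("The report was published today.", "Le rapport a été publié aujourd'hui."),
]


def closest_match(sentence, memory=MEMORY):
    """Return (score, source, translation) for the most similar stored sentence."""
    scored = [
        (SequenceMatcher(None, sentence.lower(), src.lower()).ratio(), src, tgt)
        for src, tgt in memory
    ]
    # Pick the stored pair whose source sentence is most similar to the input.
    return max(scored, key=lambda item: item[0])


score, source, translation = closest_match("The committee approved the report.")
print(f"{score:.2f}  {source}  ->  {translation}")
```

The new sentence is matched to the stored sentence about the budget, whose human translation could then be adapted. Stitching together fragments from several partial matches, as the text describes, would need more machinery than this sketch shows.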
Similarly, because newsbrands publish to the open web, machines can compare how different publishers cover the same story. They learn alternative phrases, different approaches, narratives, tones and house styles.
It means they can be set to write with a particular skew: in support of a sports team or against a political party.
They can learn in an unsupervised way. They can absorb captions under hundreds of pictures and so describe what is in a new picture. They can test their understanding of stories against summaries like those CNN and MailOnline use in articles. They can learn in dynamic environments, reacting to events around them. They don’t always need a human to slowly feed them knowledge.
This rich back-catalogue of digitised articles is also a source of facts to draw on. Machines have powerful memories. Fact checking is fast.
Paul Pierotti, MD of Accenture Digital, says:
“This technology is being used in healthcare firstly because of its ability to digest vast amounts of textbook knowledge and new research; secondly because it can diagnose what it sees in pictures or in patient data; thirdly, it can use language to report the diagnosis along with supporting evidence and recommendations. The reasoning and language will evolve to feel human. If the healthcare industry can harness that potential so too can news companies.”
Machines are adept at investigating data sets. Publishers have set them to tax records, homicide data, meteorological reports and more – looking for patterns and describing them. They’re thorough, not prone to error and they’re fast.
The LA Times uses robo-journalism to break news about earthquakes because machines can analyse geological survey data faster than a human. It takes under five minutes to spot a story and get it online.
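To make the idea of turning survey data into a story concrete, here is a minimal Python sketch. It assumes a simple threshold-and-template approach; the field names, the magnitude threshold, the wording and the sample reading are all invented for illustration and do not describe the LA Times's actual system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quake:
    """One reading from a (hypothetical) geological survey feed."""
    magnitude: float
    place: str
    depth_km: float
    time_utc: str


MIN_MAGNITUDE = 3.0  # assumed newsworthiness threshold, not a real editorial rule


def describe(magnitude: float) -> str:
    """Choose an adjective for the quake; the bands are illustrative only."""
    if magnitude >= 6.0:
        return "strong"
    if magnitude >= 4.5:
        return "moderate"
    return "light"


def write_story(quake: Quake) -> Optional[str]:
    """Return a one-sentence story, or None if the quake is too small to report."""
    if quake.magnitude < MIN_MAGNITUDE:
        return None
    return (
        f"A {describe(quake.magnitude)} earthquake of magnitude "
        f"{quake.magnitude:.1f} struck {quake.place} at {quake.time_utc} UTC, "
        f"at a depth of {quake.depth_km:.1f} km, according to survey data."
    )


# Invented sample reading.
sample = Quake(magnitude=4.7, place="10 km north of Exampleville",
               depth_km=8.2, time_utc="06:25")
print(write_story(sample))
```

Because the whole step is a filter plus a fill-in-the-blanks sentence, it runs in a fraction of a second, which is consistent with the speed advantage described above.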
Robo-journalists are arriving at a time when the lack of data skills amongst journalists is starting to show. Peter Bale, MD of CNN, observed at a Reuters Institute Big Data event that traditional journalists who aired opinions based on very little proof were being embarrassed by people, often outside the industry, who could draw more solid conclusions from data. Machines will help publishers catch up.
Machines can produce multiple versions of an article to make it more ‘personal’ – to give it local flavour, for example.
Again, speed is important. Re-writes are produced in a fraction of the time it would take a human so stories are both current and personal. As well as location, personalisation might be based on the demographics or behaviours of groups of readers, as determined by their online activity. Publishers already target advertising like this.
Articles can also be re-written based on what an individual reader shows they know about a topic, for example through an interactive element in the article. The machine then describes the difference between the reader’s perception and reality.
We’ve always read between the lines to understand how we are personally affected or to see how reality differs from what we assume. In the future we will only need to read the lines themselves to understand those things.
Language translation is another form of personalisation. Publishers from De Correspondent to The Economist have ambitions to find new customers amongst different language speakers. Machines offer opportunity.
Finally, machines can write tirelessly. By covering more topics they’re more likely to write about your football team, your industry etc. News feels more personal.
Interest to advertisers
Robo-journalism will be of interest to the advertising department. They’ve built native advertising units to write copy for advertisers – and charge a premium. Personalisation means bigger premiums.
On the flip side, what if robo-journalism technology got into advertiser hands? They already plan to buy native at scale across many sites. What’s missing is the ability to speedily and cost-effectively re-write content to suit different publishers’ environments. If technology enables it, they can force prices down.
The slow part is over. The rate of development will accelerate. Costs of entry will drop, as expensive lessons learnt by machines will be cheaply replicated in others.
Publishers will battle nimble robo-publishers and advertisers seeking to drive down cost. They’ll need to fully embrace the twin opportunities of data interpretation and personalisation – and avoid chaperoning machines too closely.
Before you know it, these challenges will be upon us.