As computing gets better and the average quality of professional writing gets worse, or at least more formulaic, it seems inevitable that we will reach a crossover point where journalism is entirely or largely computer-generated and we can’t tell the difference.
In fact, most of what the average millennial spends their time reading has been arranged for them by a computer. Facebook feeds, Instagram feeds, and Google search results may be sharing human-generated content, but at a certain point, which we’re approaching, the editorial voice of the algorithm overwhelms the voice of even the most powerful human influencer.
This scares me a little, but it also intrigues me, which is why I’ve started working on a new side project I’m calling “Journalism Machine”. Journalism Machine takes small snippets of human-generated content and larger templates for modern writing, and attempts to turn them into a successful, automated publication, complete with social media interactions. Social media generates a custom feed for every individual’s tastes. Why can’t we do the same thing with journalism? Why can’t we, in the near future, have articles customized to what we know, what we believe, and what we’d share?
Right now the mechanics of my project are pretty simple:
- Build a database of content snippets.
- Build a set of social and article templates.
- Build a script for turning all of those database snippets into articles, posting them, and sharing them with the right audience.
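The first two steps and the rendering half of the third can be sketched in a few lines. This is only an illustration under stated assumptions: the snippet fields (`topic`, `quote`, `source`), the template wording, and the function names are all hypothetical, the "database" is just an in-memory list, and the posting and audience-targeting parts are left out entirely.

```python
from string import Template

# Hypothetical snippet "database": an in-memory list standing in for a real table.
SNIPPETS = [
    {"topic": "city council", "quote": "We expect the vote next week",
     "source": "a council member"},
    {"topic": "local transit", "quote": "Ridership is up this year",
     "source": "a transit spokesperson"},
]

# Hypothetical templates: one for the article body, one for the social post.
ARTICLE_TEMPLATE = Template(
    'What you need to know about $topic\n\n"$quote," said $source.'
)
SOCIAL_TEMPLATE = Template("New: what you need to know about $topic")


def render(snippet):
    """Fill both templates from a single snippet record."""
    return {
        "article": ARTICLE_TEMPLATE.substitute(snippet),
        "social": SOCIAL_TEMPLATE.substitute(snippet),
    }


def build_all(snippets):
    """Turn every snippet in the database into an article plus a social post."""
    return [render(s) for s in snippets]


if __name__ == "__main__":
    for post in build_all(SNIPPETS):
        print(post["social"])
        print(post["article"])
        print("---")
```

Using `Template.substitute` rather than `safe_substitute` means a snippet missing a field raises a `KeyError` instead of silently publishing an article with a hole in it.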
In the future, though, there are some even more interesting experimental features I’m thinking about adding.
First of all, I’m considering ways to use small, cheap human-interface tasks to generate more snippets and more articles. APIs like Mechanical Turk and Scale would bring the cost of generating sentences down to just a few cents apiece, compared with hiring a journalist to write an article for 10x that.
Second, I’m thinking about ways to make it less noticeable when content is re-used from article to article. Search engines don’t like duplicate content, and neither do readers.
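One common way to measure how much two articles overlap is to compare their sets of word n-grams ("shingles") with Jaccard similarity. The sketch below is an assumption about how such a check might look, not a description of how any search engine scores duplicates; the function names and the shingle size `k=3` are arbitrary choices for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) in the text."""
    words = text.lower().split()
    if len(words) < k:
        # Text shorter than k words: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=3):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    first = "the council will vote on the new budget next week"
    second = "the council will vote on the transit plan next week"
    print(round(jaccard(first, second), 2))
```

A generator could score each new article against those already published and reword or skip it when the similarity crosses a chosen threshold.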
Third, I’m thinking about how machine learning techniques could help produce 10x the content for each human-generated sentence. If journalism is dissolved into a series of patterns, what’s to stop me from using a pattern-matching algorithm to generate sentences without a human writer?
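As a toy illustration of pattern-based generation, a first-order Markov chain learns which word tends to follow which and then walks those transitions to produce new sentences. This is a deliberately simple stand-in rather than the technique the project would necessarily use, and the function names and parameters here are hypothetical.

```python
import random
from collections import defaultdict


def build_chain(sentences):
    """Map each word to the list of words observed directly after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, start, max_words=12, seed=None):
    """Walk the chain from `start`, picking a random observed successor each step."""
    rng = random.Random(seed)
    words = [start]
    # Stop at the length limit or when the last word has no known successor.
    while len(words) < max_words and words[-1] in chain:
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)


if __name__ == "__main__":
    corpus = [
        "The council will vote on the budget next week",
        "The mayor will speak on the transit plan",
        "The budget will fund the transit plan next year",
    ]
    print(generate(build_chain(corpus), "The", seed=1))
```

Every adjacent word pair in the output was seen in a human-written sentence, which is why the result can look plausible while saying nothing a person actually reported.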
I don’t want journalism to dissolve like this. I want human journalists chasing leads and writing stories for the public, not for one person. This is where the future is heading, though, so I’m determined to understand it.