The Algorithms are Hangry

On January 20, 2019, by Rachel, in content strategy, semantic web

Lots of articles (some with shiny infographics) will tell you about how much data we’re now creating, and how it’s increasing at a stunning rate every year. And yet, it’s still not enough data to make algorithms actually useful most of the time.

About a decade ago, when I first started talking with people in the content industry about what was happening with semantic technology, some of them wondered with concern whether things like natural language processing and artificial intelligence were going to make human content professionals obsolete.

My feeling at the time was “not any time soon.” These technologies seemed useful for assisting people, especially for managing data at scale, but they were always going to need to be guided and tweaked by people.

The basic expectation that most content professionals have is that algorithms will help us understand what people are interested in, and this information will be used to dynamically serve up more content that will be of interest. Some organizations may even use this information to guide content creation. Ideally, smart systems will even provide some level of assistance in producing that content.

The bots are coming!

There have been many examples reinforcing that this tech-driven intelligent content ecosystem is not quite there yet. Some are fascinating, artsy experiments, like Sunspring, a science-fiction movie written by an AI. Or paint colors created by a neural net. Or funny examples like image recognition APIs that can’t distinguish between blueberry muffins and chihuahuas. And most of us have probably played silly games with our phone’s autocomplete feature at some time or another.

“My gender is the main reason I thought you were going to send me a picture of the Fishermen.” (Rachel Lovinger, @rlovinger, December 29, 2018)

Then, there are less benign examples. Google Photos excluded “gorilla” from its possible tags after learning that its API was applying the term to photos of black people. Microsoft shut off a chatbot after the Internet taught her to be racist in less than 24 hours. A later iteration, designed to block conversations about potentially volatile topics, had its own set of shortcomings. YouTube purged a whole bunch of content and channels after James Bridle wrote about the vast number of creepy and alarming children’s videos that appeared to be both created and recommended based on loopholes and misuse of bots and algorithms.

But, I’m not here today to go down the rabbit hole of horrifying, pre-apocalyptic examples of AI gone wrong. I’m not even here to talk about the trashy link-bait promos that have infected most online journalism like a plague. I want to talk about how, even in their most mundane, limited functions, algorithms just aren’t hitting the mark as much as I’d have expected them to by now.

The bots are boring!

My complaint is with Google Now. My Android phone knows more about me than any technology really should. It knows what I search for, it knows where I go, it reads my email and knows (among other things) what movie tickets I bought. So, in theory, it should be able to show me some interesting things in the daily feed. I mean, specifically interesting to me, based on my actual interests.

Sometimes it works. It showed me an interview with Wim Wenders about the recently remastered “Wings of Desire” after I bought tickets to see the movie. That was a pretty cool article that I wouldn’t have even guessed was available. But generally the feed is roughly 80-90% things that are completely uninteresting to me.

To some degree, this is because I do a limited set of things on my phone, even within the larger realm of things I do online. For example, I always look up Fortnite hints and maps on my phone because my computer is too far away when I’m in the living room using the Xbox. So, my phone obviously thinks I’m a huge Fortnite fan and it now constantly shows me news updates, leaks, and articles about fan suggestions for the game.

I also wonder if there’s some kind of crossed-signal demographic effect going on (“people who like Fortnite also like XYZ”), because my feed recently included a string of stories about various football figures, even though I have never once shown any interest in football in anything I’ve done online. I had to manually mark a whole bunch of topics as “not interested.”
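To make that crossed-signal guess concrete, here is a minimal sketch of a co-occurrence recommender, the simplest form of “people who like X also like Y.” It is not a description of how Google's feed works; the profiles, topic names, and the functions co_occurrence and suggest are all invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical interest profiles; every user and topic here is made up.
profiles = {
    "user_a": {"fortnite", "football", "basketball"},
    "user_b": {"fortnite", "football"},
    "user_c": {"fortnite", "football", "wrestling"},
    "user_d": {"wim wenders", "semantic web"},
}

def co_occurrence(profiles):
    """Count how often each ordered pair of topics appears in the same profile."""
    counts = Counter()
    for topics in profiles.values():
        for a, b in combinations(sorted(topics), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts

def suggest(my_topics, profiles, top_n=3):
    """Rank topics I have never touched by how often they co-occur with mine."""
    counts = co_occurrence(profiles)
    scores = Counter()
    for (a, b), n in counts.items():
        if a in my_topics and b not in my_topics:
            scores[b] += n
    return [topic for topic, _ in scores.most_common(top_n)]

# A profile that contains only "fortnite" still gets football ranked first,
# purely because other Fortnite-heavy profiles happen to include it.
print(suggest({"fortnite"}, profiles))
```

In this toy data, one narrow signal is enough to pull in topics the reader never asked for, which is roughly the effect described above.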

However, the deeper source of failure really seems to be that there isn’t the volume of unique, high-quality content out there to meet the need that Google is trying to fill with this tailored feed. When I first started using Google Now, I noted that there were a lot of cases where I would read an article and then it would show me “similar” articles which were really just summaries that other sites had written of the original article.
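One way a feed could notice that a “similar” article is mostly a rehash is to measure how much wording two texts share. The sketch below uses Jaccard similarity over three-word shingles. It is an illustrative assumption rather than anything Google is known to do, the sample sentences and the 0.3 threshold are invented, and it only catches near-verbatim rewrites, not true paraphrases.

```python
import re

def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in a text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a, b):
    """Share of shingles two texts have in common, from 0.0 to 1.0."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def drop_rehashes(original, candidates, threshold=0.3):
    """Keep only candidates whose wording overlaps the original below the threshold."""
    return [c for c in candidates if jaccard(original, c) < threshold]

# Invented sample texts for illustration.
original = "Wim Wenders talks about the remastered Wings of Desire and why he returned to Berlin"
rehash = "Wim Wenders talks about the remastered Wings of Desire in a new interview"
fresh = "A neural net invented new paint colors with very strange names"

print(drop_rehashes(original, [rehash, fresh]))
```

With these samples the rehash is filtered out and only the unrelated article survives. A summary written in genuinely different words would slip through, which is part of why this problem is hard.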

Lately I’ve noticed a different trend, which I’m sure is also influenced by these algorithms and metrics. While Google has previously gone to great efforts to cut down on content farms, it has also created an appetite for nutritionless content. And there are plenty of sources ready to jump in and fill that hungry void.

For example, let’s take Avengers: Infinity War. I was very interested in seeing this movie, but I didn’t particularly read a lot about it. I probably looked up the release date at some point before it came out, watched the trailer when it was released, and then bought tickets to see it in a theater. After seeing it, I looked up the expected release date for the sequel. It’s possible (even likely) that I did all of these things on my phone.

Since the movie came out, last April, my phone has shown me content about it every single day. At first it was explanations of the ending, and analysis of the poster showing how it secretly contained spoilers for what happened in the movie. But it very quickly became a stream of non-stop speculation, fan theories, hints, spoilers, and occasionally legitimate news about the sequel (which comes out this coming April).

I cannot tell you how uninterested I am in almost all of this. I definitely didn’t want to read about Avengers for an entire year between movies. I have zero interest in fan theories that explain some speculative aspect of the sequel. Sure, it ended with a dramatic cliff-hanger, but I just want to quietly go about my business for a year and then go see part 2 when it’s ready and I can enjoy the culmination of 10+ years of Marvel Cinematic Universe storytelling. Sure, I could mark this topic as “not interested” but that’s not the case. I am interested in it. Just not to that degree, and not wild speculation and rumors.

Maybe if Google Now knew more about my other interests, the feed would be more balanced. But my guess is that this topic, being broadly popular, has a more steady stream of source material than the obscure “long tail” topics I’m interested in.

So, is the failing with the algorithms, or is the failing with the sources of content? Or is it some kind of dysfunctional way that they learn from and influence each other? In all of the examples described above, from spectacularly disturbing to humdrum disappointing, the problem seems to call for the capability to course correct, to monitor the algorithms and tune them to be more discerning. That gets into some very subjective areas that our AIs are not quite ready for. More likely, we will just keep feeding them whatever they demand and hope for the best.

Archive: Resources, October 2009

On October 18, 2012 (updated January 3, 2019), by Rachel, in content strategy, semantic web

I realized recently that I haven’t updated the Resources page in three years. Obviously, there are a lot more recent, more interesting resources out there now. So many, in fact, that I should probably entirely replace what was there. But I do want to retain that info, so here it is in a post. Refreshed Resources page, coming soon.

[As of 10/7/09]

I gathered the following resources to be a handout to go along with my “Content Gone Wild!” talk at the MIMA Summit 2009. These articles and sites support the Content Strategy practices discussed in the examples from the presentation. These are not the only resources, and they’re not necessarily the final word on these topics, but they should provide some good information and get you started with practical techniques and tips.

General Content Strategy Resources

Content Strategy Knol, by Jeffrey MacIntyre, Colleen Jones & Bob Maynard
Content Strategy Google Group
Scatter/Gather: A Razorfish Content Strategy blog
Search for #contentstrategy on Twitter
Content Strategy for the Web, by Kristina Halvorson

Research

“User Research for Personas and Other Audience Models,” by Steve Baty, UXmatters
“Setting Up Business Stakeholder Interviews, Part 1,” by Michael Beavers, Boxes & Arrows
“Setting Up Business Stakeholder Interviews, Part 2,” by Michael Beavers, Boxes & Arrows

Content Assessment

“Content Analysis: A Practical Approach,” by Colleen Jones, UXmatters
“Content Inventory,” Usability.gov

Writing for the Web

“Be Succinct! (Writing for the Web),” by Jakob Nielsen, Alertbox (An old classic)
“Writing for the Web,” by Sarah Horton, Web Teaching Guide
“Web content, writing for web sites,” by Shirley Kaiser, WebsiteTips.com (includes links to many writing resources)

Voice / Tone

“Tone in Business Writing,” by Victoria Kellough and Angela Laflen, The OWL at Purdue
“What’s all the fuss about tone of voice?,” by Barnaby Benson, Barnaby Benson Ltd
“Online tone of voice for business,” by Tom Albrighton, ABC Copywriting Blog

Taxonomy & Metadata

“Metadata: Defined,” by Rachel Lovinger, Razorfish White Paper
“Developing and Creatively Leveraging Hierarchical Metadata and Taxonomy,” by Christian Ricci, Boxes & Arrows
“Representing Taxonomies: What am I looking at here?,” by Rachel Lovinger, Presentation from Semantic Technology Conference 2007

Social Media Strategy

Social Media Governance, by Chris Boudreaux (Policies from several companies and organizations)
“Razorfish Social Influence Marketing Guidelines,” by David Deal, Razorfish

Corporate Blog Strategy

CEO Blog Watch
“Blogging Strategy 101: a Primer,” Scout
“Corporate Blog Design: Trends and Examples,” by Steven Snell, Smashing Magazine
“Corporate and political blogging — get rid of the fear, be yourself!,” by Robert Scoble, Scobleizer
“10 Harsh Truths About Corporate Blogging,” by Paul Boag, Smashing Magazine
“10 Features of Successful Blogs,” RSS Pieces

Globalization

Global By Design

Looking for Taxonomy & Metadata Resources?

Here are some resources I gathered about metadata, taxonomy and ontology data.

Glossaries

Taxonomy Glossary on CMS Wiki
Metadata? Thesauri? Taxonomies? Topic Maps! – A paper on Topic Maps that includes a good overview of these terms. By Lars Marius Garshol of Ontopia.

Making a Business Case

Dublin Core Metadata Initiative – includes business cases, ROI presentation, etc.
Taxonomy today, ROI tomorrow – By David Berlind, ZDNet

Working with Existing Data

Thousands of OWL documents are indexed in Google. Add “filetype:owl” to your search and see what comes up.
Piggy Bank – open source tool for scraping data from a website.

Prototyping – test it out

MindJet® MindManager® – commercial mind mapping software
FreeMind – open source mind mapping software
Bubbl.us – an online brainstorming tool
TopBraid Composer™ – a commercial tool for building ontologies and semantic web applications. TopBraid Ensemble™ adds a layer that makes it more user-friendly for content providers, and may also be useful in prototyping.
Protégé – an open source ontology editor
Knoodl.com – a semantic wiki, combining collaborative editing with ontology models

Shared Knowledge – join a community

Taxonomy Community of Practice
CM Pros – Content Management Community of Practice
Semantic Web Interest Group

Information Design

Announcing: Nimble!

On June 3, 2010, by Rachel, in conference, content strategy, semantic web

Since the beginning of the year I’ve been researching, writing, and editing a report called Nimble: A Razorfish report on publishing in the digital age. It launched this week, and so far the response has been really great. I’ve written about it over on Scatter/Gather and you can view or download the report itself at http://nimble.razorfish.com. There’s even a Twitter account for it (@NimbleRF).

In June I’ll be doing a presentation about the report at the Semantic Technology Conference in San Francisco. And there will be other presentations and developments in the coming months.

What else? I wrote a couple of other pieces for Scatter/Gather:

Content Strategy Stories from the Frontline (SXSW wrapup)
Busy Times for Content Strategy (CS Forum 2010 wrapup)

And I’m helping to organize two interesting events for Internet Week next week:

Razorfish Screening: We Live in Public (by invitation only)
Razorfish: Living in Public (open event, RSVP on Facebook)

Hope to see you there!

SXSW Panel Picker: Please Vote!

On August 23, 2009, by Rachel, in conference, content strategy, semantic web

This year I’m determined to present at SXSW. To that end, I’m involved in five (5!) proposals. Two of them are talks, and the rest are panels submitted by other people that, SXSW-gods willing, I will be participating in.

SXSW likes to have the community get involved in deciding what panels will be chosen for the conference, so they use this Panel Picker to let people indicate which ones are of greatest interest. It’s free and easy to register to vote, so please consider voting for these proposals:

Understanding Content: The Stuff We Design For – my solo presentation
Content & Semantics: The Wild, Wild Web of Data – a dual presentation I’ll give with Rahel Anne Bailie of Intentional Design Inc.
Why Your Content Sucks and How to Fix It – a panel I’ll be on, moderated by Kristina Halvorson of Brain Traffic
Let’s Talk About CS: Understanding Content Strategy – a panel I’ll be on, organized by Elena Melendy, an Independent Content Strategy Consultant
Tales from the Basement: Makers of Geek Documentaries – a panel in the SXSW Film conference which I hope to moderate, organized by Jason Scott of Bovine Ignition Systems

While you’re in there, here are some other really interesting panels by some of my friends and colleagues. Please consider voting for these as well!

Adventures in Text and Filmmaking: Making GET LAMP, Jason Scott, Bovine Ignition Systems
Web Re-Design Cage Match: Refreshing the Online Experience, Anh Dang, New York Times
Conserving The Web’s Social Ecology: Theory and Practice, Tim Hwang, The Web Ecology Project (plus a few other cool presentations by Tim)
Your Cell Phone is Your Sugar Daddy (Controlling Diabetes), John Pettengill, Razorfish
The UX of Mobile, Kyle Outlaw, Razorfish
Social Networking for Dealers: Cultivate Relationships First, Mary Butler, Razorfish
Context Strategy: Wrangling Content Gone Wild, Patrick Nichols, Razorfish
Greening Your Content: Reduce, Reuse, Recycle, Rahel Anne Bailie, Intentional Design Inc.
Content Strategy FTW, Kristina Halvorson, Brain Traffic
Oh, Go Service Yourself! Interactive Content for Humans, Colleen Jones, threebrick
Keep ‘Em Coming: Courting Customers with Content, Colleen Jones, threebrick

There are many others that will probably be amazing, and I haven’t even touched on all the ones about the Semantic Web (will have to write a separate post for that), so get started voting now – you only have until September 4th!

Semantic Web for Publishers

On July 22, 2009 (updated January 3, 2019), by Rachel, in conference, semantic web

When I got back from the Semantic Technology Conference last month, I helped my colleague, Domenic Venuto, write a piece for MinOnline about the things magazine publishers should know about the Semantic Web. I summed up some of the most relevant presentations at SemTech this year, and why I think these things should be important to publishers. Domenic put it all into the context of the work we do with our Media and Entertainment clients, and we worked together to try to express why they should really get moving on this stuff now!

After the article came out, Semantic Universe posted video from a lot of the talks that I mentioned. Very interesting, if you want more detail:

Tom Tague’s Keynote about OpenCalais’ first year
An executive round table on Semantic Search
The Publishers Panel
Closing Keynote with The New York Times announcement

Semantic Web takes root at the IA Summit

On March 31, 2009, by Rachel, in conference, semantic web

At the recent IA Summit, I was surprised and delighted to see how many talks there were about the Semantic Web. Before this emerging technology can really catch on, we will need more Information Architects and Interaction Designers who understand the potential and can design elegant solutions to real problems (both user problems and business problems). In some ways, I wish the conversation were further along, but I realize that it has to start somewhere. The fact that the subject exploded onto the scene in such a big way is a good indication that Web 3.0 is on a lot of people’s minds.

These are the talks I saw: Continue reading “Semantic Web takes root at the IA Summit” →

Semantic Web for Dummies

On March 24, 2009, by Rachel, in semantic web

Jeff Pollock has just released a book called Semantic Web for Dummies. Over at Semantic Universe you can download a free chapter (registration required), order the book, or read Jeff’s blog posts. I haven’t read the book yet, but Jeff is a really smart person with the ability to speak plainly and compellingly. This book is bound to be useful for people who are trying to understand the Semantic Web, or are still struggling with how to explain it to others. I just put my copy on order.

FEED: The Razorfish Consumer Experience Report 2008

On October 29, 2008, by Rachel, in article, content strategy, semantic web

Yes, I’ve been neglecting this blog, but it doesn’t mean I haven’t been productive. My employer (which has changed its name back to Razorfish) has published another book, which contains an article by me about semantic web and user-generated data. You can see the whole, beautifully designed document online. My article is on page 60. Congratulations to my colleagues who also contributed to the book.

SXSW Panels

On August 8, 2008 (updated August 14, 2008), by Rachel, in conference, content strategy, semantic web

The SXSW panel picker is live, and I proposed two panels. If you’re going, or you might go, or you just like voting for things, please have a look and consider voting for my proposals. You’ll have to create an account to vote, but it should be pretty painless. Here are the descriptions of the panels I proposed:

When the Semantic Web Meets User Generated Metadata

The Semantic Web promises to make the internet smarter, in part by adding structure and definition around the content on the web. Sounds great, but who’s going to do all the work? As User Generated Content gives rise to User Generated Metadata, turns out it’s going to be… YOU! (Click here to vote for it)

Content Content Revolution: The Rise of Content Strategy

What’s Content Strategy, you ask? Navigation, publishing guidelines, taxonomy, syndication, style guides, UGC strategy, the semantic web? All this and more! Come hear some of the leading content strategy professionals discuss where this emerging discipline came from, why it matters, and where it’s going. (Click here to vote for it)

On another note, I didn’t get a chance to post my fourth (and last) post about The Last HOPE before I went out of town for the weekend, and I forgot to bring my notes. So that will have to wait until I get back next week.

STC 2008 – Day 4

On May 23, 2008 (updated January 3, 2019), by Rachel, in conference, semantic web

The last day of the Semantic Technology Conference has a few morning panels, a closing keynote, and then some afternoon seminars. But the day is really about saying goodbye to everyone, finally introducing yourself to a few of the people you’ve been crossing paths with for the past week, and making that annual trip to Koo-ki Sushi. Well, that’s what it’s about for me, anyway.


Meaningful Data

Category: semantic web