This post continues on from a previous post: Where is Your Digital Hub/Home?
In a previous post, I talked about POSSE and PESOS, and publishing on your own site vs other platforms, syndicating content back and forth and content ownership. I mentioned that I’d opted for the PESOS approach, and that I was publishing content on other platforms, then syndicating it back to my own site. Let’s take a look at how that happens.
First of all, I’m running WordPress. Since I’ve been working with WordPress for years, and since my full-time job has me working with it as well, this made a lot of sense. Even without those motivators though, WordPress has a huge community, is open source, is a really solid publishing platform, is built from the ground up to be completely customizable through plugins, and has an incredibly powerful theming system (which basically allows you to do whatever you want).
One of the other things WordPress has going for it is a long history of providing data import and export tools. You’ve always been able to get data into and out of WordPress with relative ease, so it seemed like getting a bunch more data in there would be a reasonable goal. With the advent of Post Formats (in WP 3.1), WordPress also has a native way of hinting at how different types of data should be displayed. On top of that, Custom Post Types (since WP 3.0) mean that if you really want to get crazy, you can step completely outside of the normal “Post” model and get really custom.
One of the things that got me started down the road of actually getting control over my content was “The Great Twitter 3200 Tweet Debacle” (I made that name up). Because of technical constraints, Twitter only allows you to access your most recent 3200 tweets. I’ll give you a few seconds to let that sink in. Twitter. Only allows you to access. Your most recent 3200 tweets. Your own tweets. Has that hit home yet? Here you are producing all this stuff, thinking it’s yours, and Twitter actually decides what you can and can’t access. Before I hit that 3200 mark (I was up to around 3100 at the time), I vowed that I’d get something figured out to get a copy of all of my tweets stored somewhere that I controlled.
I roughed out a really basic importer for WordPress that would access my public timeline (previously available via unauthenticated requests) and download my tweets into WordPress. At the time I was importing them as a custom post type so that they wouldn’t show up in the rest of my blog, but that approach has changed now. I got my tweets and I was happy, so I left it running for a while and continued grabbing copies of tweets and stuffing them away in my WordPress, without displaying them publicly anywhere.
A while later, I started thinking about where else I might be producing content that I would be unhappy about losing access to. I pretty quickly came up with a shortlist of services where I was producing meaningful enough content that I wanted to ensure that I had a copy of it. This was my immediate list:
Since I’d already written one importer, I figured I’d just go ahead and write similar ones for everything else. That’s when I came up against the realization that most of these other services weren’t going to make it as easy as Twitter had, and in fact, Twitter was changing its policy as well, so I’d soon need to authenticate with my importer before I could pull down any data.
The prospect of re-implementing the OAuth stack (or something similar) in each importer, a bunch of times, was not something that excited me. That put me down a path that had me building Keyring, a generic authentication framework for WordPress that any plugin can tie into, using it to handle the authentication flow and then to make future authenticated requests to external services.
Once that was built, I converted my Twitter importer over to using it, then got started on the other services. That involved a lot of copy-paste, and once I’d worked with each service I had a good feel for what was being repeated in each one, so I decided to build a standardized base importer library which could handle a lot of that. I then back-ported all of the importers to use the base code, and now we have Keyring Social Importers. Each individual importer is only a few hundred lines of code, and I can slurp down content from Flickr, Delicious, Twitter, Foursquare and Instagram.
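To make the shape of that refactoring concrete, here is a minimal sketch of the base-importer idea. It is written in Python purely for illustration (the real plugins are WordPress PHP), and every name in it (`BaseImporter`, `fetch_page`, `to_post`, `FakeTwitterImporter`) is hypothetical rather than taken from Keyring Social Importers. The assumption is simply that the paging loop is the repeated part, and that each service only has to say how to fetch a page and how to turn one item into a post.

```python
from abc import ABC, abstractmethod


class BaseImporter(ABC):
    """Shared import loop; each service only supplies the two hooks below."""

    def __init__(self):
        self.posts = []  # stand-in for the WordPress posts table

    @abstractmethod
    def fetch_page(self, page):
        """Return a list of raw items for this page, or [] when done."""

    @abstractmethod
    def to_post(self, item):
        """Map one raw service item to a post-like dict."""

    def run(self):
        # The repeated part: page through the service until it runs dry.
        page = 0
        while True:
            items = self.fetch_page(page)
            if not items:
                break
            self.posts.extend(self.to_post(item) for item in items)
            page += 1
        return len(self.posts)


class FakeTwitterImporter(BaseImporter):
    """Hypothetical service importer backed by canned data, not a real API."""

    PAGES = [
        [{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}],
        [{"id": 3, "text": "again"}],
    ]

    def fetch_page(self, page):
        return self.PAGES[page] if page < len(self.PAGES) else []

    def to_post(self, item):
        return {"format": "status", "content": item["text"], "source_id": item["id"]}


importer = FakeTwitterImporter()
imported_count = importer.run()
```

With this split, adding another service means writing only the two small hook methods, which fits with each importer staying at a few hundred lines.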
I now have each of these importers running on an hourly WP Cron job to grab any new content that pops up on that service and download it into my WordPress. Each data type is handled differently, and as much valuable data as possible is downloaded (including original-sized images where available, geo data from services that support it, tags, links, etc). Everything goes into WordPress, where I have complete control over it. I also stuff a copy of the entire raw block of import data into a postmeta value in case I want to re-process it differently later.
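As a rough illustration of what one of those hourly runs does, here is a small Python sketch (again, not the actual PHP). It assumes each item carries a unique `id` from the service and that already-imported items are skipped by that ID; the post doesn't spell out how duplicates are avoided, so treat that part as a guess. The `store` dict stands in for WordPress posts plus postmeta, and the `raw_import_data` key is a made-up name.

```python
import json


def import_new_items(items, store):
    """Import only unseen items, keeping the raw payload next to each post.

    `store` maps a service item ID to the saved post; it stands in for
    WordPress posts plus their postmeta.
    """
    added = 0
    for item in items:
        if item["id"] in store:
            continue  # already imported on an earlier run
        store[item["id"]] = {
            "content": item["text"],
            "tags": item.get("tags", []),
            # Untouched copy of the source data, for re-processing later.
            "raw_import_data": json.dumps(item, sort_keys=True),
        }
        added += 1
    return added


store = {}
first_run = import_new_items(
    [{"id": 1, "text": "a"}, {"id": 2, "text": "b", "tags": ["x"]}], store
)
second_run = import_new_items(
    [{"id": 2, "text": "b", "tags": ["x"]}, {"id": 3, "text": "c"}], store
)
```

Because the raw payload is kept alongside the processed fields, a later change to how posts are built can be replayed from `raw_import_data` without going back to the service.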
With all of this new data though, I quickly came to realize that a “normal” WordPress theme, even one that supports different post formats, kind of breaks down under this sheer volume of data and the different ways that it makes sense to display it. I now have over 10,000 posts, with 4,700 different tags. In addition to Standard posts, I’m using the Aside, Status, Link, Gallery and Image post formats. On a normal day, I’m likely to have around 15 posts of some type or another. Using most themes, this becomes an endless flood of similar-looking content, which is about as exciting to look at as a raw printout of a dictionary.
What I needed was a completely custom theme to handle all of this data.
To be continued.