Computers & Internet

Tweets of Glory

There’s some great stuff on Twitter, but the tweets just keep coming, so there’s a fair chance you’ve missed some funny stuff, even from the people you follow. Anywho, time is short tonight, so it’s time for another installment of Tweets of Glory:

I have to admit, hatewatching The Newsroom has actually been pretty entertaining, but I’d much rather watch this proposed feline-themed show.

Yeah, so that one’s a little out of date, but for the uninitiated, Duncan Jones is David Bowie’s son.

(I love the internet)

Well, that happened. Stay tuned for some (hopefully) more fulfilling content on Sunday…

Web browsers I have known, 1996-2012

Jason Kottke recently recapped all of the browsers he used as his default for the past 18 years. It sounded like fun, so I’m going to shamelessly steal the idea and list out my default browsers for the past 16 years (prior to 1996, I was stuck in the dark ages of dialup AOL – but once I went away to college and discovered the joys of T1/T3 connections, my browsing career started in earnest, so that’s when I’m starting this list).

  • 1996 – Netscape Navigator 3 – This was pretty much the uncontested king of browsers at the time, but its reign would be short. I had a copy of IE3 (I think?) on my computer too, but I almost never used it…
  • 1997-1998 – Netscape Communicator 4 – Basically Netscape Navigator 4, but the Communicator was a whole suite of applications which appealed to me at the time. I used it for email and even to start playing with some HTML editing (though I would eventually abandon everything but the browser from this suite). IE4 did come out sometime in this timeframe and I used it occasionally, but I think I stuck with NN4 way longer than I probably should have.
  • 1999-2000 – Internet Explorer 5 – With the release of IE5 and the increasing issues surrounding NN4, I finally jumped ship to Microsoft. I was never particularly comfortable with IE though, and so I was constantly looking for alternatives and trying new things. I believe early builds of Mozilla were available, and I kept downloading the updates in the hopes that it would allow me to dispense with IE, but it was still early in the process for Mozilla. This was also my first exposure to Opera, which at the time wasn’t that remarkable (we’re talking version 3.5 – 4 here) except that, as usual, they were ahead of the curve on tabbed browsing (a mixed blessing, as monitor resolutions at the time weren’t great). Opera was also something you had to pay for at the time, and a lot of sites didn’t work in Opera. This would all change at the end of 2000, though, with the release of Opera 5.
  • 2001 – Opera 5 – This browser changed everything for me. It was the first “free” Opera browser available, although the free version was ad-supported (quite annoying, but it was easy enough to get rid of the ads). The thing that was revolutionary about this browser, though, was mouse gestures. It was such a useful feature, and Opera’s implementation was (and quite frankly, still is) the best, smoothest implementation of the functionality I’ve seen. At this point, I was working at a website, so for work, I was still using IE5 and IE6 as my primary browser (because at the time, they represented something like 85-90% of the traffic to our site). I was also still experimenting with the various Mozilla-based browsers at the time as well, but Opera was my default for personal browsing. Of course, no one codes for Opera, so there were plenty of sites that I’d have to fire up IE for (this has always been an issue with Opera).
  • 2002-2006 – Opera 6/7/8/9 – I pretty much kept rolling with Opera during this timeframe. Again, for my professional use, IE6/IE7 was still a must, but in 2004, Firefox 1.0 launched, so that added another variable to the mix. I wasn’t completely won over by the initial Firefox offerings, but it was the first new browser in a long time that I thought had a bright future. It also provided a credible alternative for when Opera crapped out on a weirdly coded page. However, as web standards started to actually be implemented, Opera’s issues became fewer as time went on…
  • 2007 – Firefox 2/Opera 9 – It was around this time that Firefox started to really assert itself in my personal and professional usage. I still used Opera a lot for personal usage, but for professional purposes, Firefox was a simple must. At the time, I was embroiled in a year-long site redesign project for my company, and I was doing a ton of HTML/CSS/JavaScript development… Firefox was an indispensable tool at the time, mostly due to extensions like Firebug and the Web-Developer Toolbar. I suppose I should note that Safari first came to my attention at this point, mostly for troubleshooting purposes. I freakin’ hate that browser.
  • 2008-2011 – Firefox/Opera – After 2007, there was a slow, inexorable drive towards Firefox. Opera kept things interesting with a feature they call Speed Dial (and quite frankly, I like that feature much better than what Chrome and recent versions of Firefox have implemented), but the robust and mature list of extensions for Firefox was really difficult to compete with, especially when I was trying to get stuff done. Chrome also started to gain popularity in this timeframe, but while I loved how well it handled Ajax and other JavaScript-heavy features, I could never really get comfortable with the interface. Firefox still afforded more control, and Opera’s experience was generally better.
  • 2012-Present – Firefox – Well, I think it’s pretty telling that I’m composing this post on Firefox. That being said, I still use Opera for simple browsing purposes semi-frequently. Indeed, I usually have both browsers open at all times on my personal computer. At work, I’m primarily using Firefox, but I’m still forced to use IE8, as our customers tend to still prefer IE (though the percentage is much less these days). I still avoid Safari like the plague (though I do sometimes need to troubleshoot and I suppose I do use Mobile Safari on my phone). I think I do need to give Chrome a closer look, as it’s definitely more attractive these days…

Well, there you have it. I do wonder if I’ll ever get over my stubborn love for Opera, a browser that almost no one but me uses. They really do manage to keep up with the times, and have even somewhat recently allowed Firefox and Chrome style extensions, though I think it’s a little too late for them. FF and Chrome just have a more robust community surrounding their development than Opera. I feel like it’s a browser fated to die at some point, but I’ll probably continue to use it until it does… So what browser do you use?

Tweets of Glory

One of the frustrating things about Twitter is that it’s impossible to find something once it’s gone past a few days. I’ve gotten into the habit of favoriting ones I find particularly funny or that I need to come back to, which is nice, as it allows me to publish a cheap Wednesday blog entry (incidentally, sorry for the cheapness of this entry) that will hopefully still be fun for folks to read. So here are some tweets of glory:

Note: This was Stephenson’s first tweet in a year and a half.

This one is obviously a variation on a million similar tweets (and, admit it, it’s a thought we’ve all had), but it was the first one I saw – or at least, the first one I favorited; I’m sure it’s far from the first time someone has made that observation.

Well, that happened. Stay tuned for some (hopefully) more fulfilling content on Sunday…

Kickstarted

When the whole Kickstarter thing started, I went through a number of phases. First, I thought it was a neat idea that leverages some of the stuff that makes the internet great. Second, as my systems analyst brain started chewing on it, I had some reservations… but that was short-lived because, third, some really interesting stuff started getting funded. Here are some of the ones I’m looking forward to:

  • Singularity & Co. – Save the SciFi! – Yeah, so you’ll be seeing a lot of my nerdy pursuits represented here, and this one is particularly interesting. This is a project dedicated to saving SF books that are out of print, out of circulation, and, ironically, unavailable in any sort of digital format. The Kickstarter is funding the technical solution for scanning the books as well as tracking down and securing copyright. Judging from the response (over $50,000), this is a venture that has found a huge base of support, and I’m really looking forward to discovering some of these books (some of which are from well known authors, like Arthur C. Clarke).
  • A Show With Ze Frank – One of the craziest things I’ve seen on the internet is Ze Frank’s The Show. Not just the content, which is indeed crazy, but the sheer magnitude of what he did – a video produced every weekday for an entire year. Ze Frank grew quite a following at the time, and in fact, half the fun was his interactions with the fans. Here’s to hoping that Sniff, hook, rub, power makes another appearance. And at $146 thousand, I have no idea what we’re in for. I always wondered how he kept himself going during the original show, but now at least he’ll be funded.
  • Oast House Hop Farm – And now we come to my newest obsession: beer. This is a New Jersey farm that’s seeking to convert a (very) small portion of their land into a Hop Farm. Hops in the US generally come from the west coast (Washington’s Yakima valley, in particular). In the past, that wasn’t the case, but some bad luck (blights and infestations) brought east coast hops down, then Prohibition put a nail in the coffin. The farm hopes to supply NJ brewers as well as homebrewers, so mayhaps I’ll be using some of their stuff in the future! So far, they’ve planted Cascade and Nugget hops, with Centennial and Newport coming next. I’m really curious to see how this turns out. My understanding is that it takes a few years for a hop farm to mature, and that each crop varies. I wonder how the East Coast environs will impact the hops…
  • American Beer Blogger – Despite the apparent failure of Discovery’s Brewmasters, there’s got to be room for some sort of beer television show, and famous beer blogger and author Lew Bryson wants to give it a shot. The Kickstarter is just for the pilot episode, but assuming things go well, there may be follow up efforts. I can only hope it turns out well. I enjoyed Brewmasters for what it was, but being centered on Dogfish Head limited it severely. Sam Calagione is a great, charismatic guy, but the show never really captured the amazing stuff going on in the US right now (which is amazing because it is so broad and local and a million other things Brewmasters couldn’t really highlight given its structure).

Well, there you have it. I… probably should have been linking to these before they were funded, but whatever, I’m really happy to see that all of these things will be coming. I’m still curious to see if this whole Kickstarter thing will remain sustainable, but I guess time will tell, and for now, I’m pretty happy with the stuff being funded. There are definitely a ton of other campaigns that I think are interesting, especially surrounding beer and video games, but I’m a little tight on time here, so I’ll leave it at that…

More Disgruntled, Freakish Reflections on ebooks and Readers

While I have some pet peeves with the Kindle, I’ve mostly found it to be a good experience. That being said, there are some things I’d love to see in the future. These aren’t really complaints, as some of this stuff isn’t yet available, but there are a few opportunities afforded by the electronic nature of eBooks that would make the whole process better.

  • The Display – The electronic ink display that the basic Kindles use is fantastic… for reading text. Once you get beyond simple text, things are a little less fantastic. Things like diagrams, artwork, and photography aren’t well represented in e-ink, and even in color readers (like the iPad or Kindle Fire), there are issues with resolution and formatting that often show up in eBooks. Much of this comes down to technology and cost, both of which are improving quickly. Once technologies like IMOD displays start to deliver on their promise (low power consumption, full color, readable in sunlight, easy on the eyes, capable of supporting video, etc…), we should see a new breed of reader.

    I’m not entirely sure how well this type of display will work, at least initially. For instance, how will it compare to the iPad 3’s display? What’s the resolution like? How much will it cost? And so on. Current implementations aren’t full color, and I suspect that future iterations will go through a phase where the tech isn’t quite there yet… but I think it will be good enough to move forward. I think Amazon will most certainly jump on this technology when it becomes feasible (both from a technical and cost perspective). I’m not sure if Apple would switch though. I feel like they’d want a much more robust and established display before they committed.

  • General Metrics and Metadata – While everyone would appreciate improvements in device displays, I’m not sure how broad the appeal of the rest of this list will be. Maybe it’s just me, but I’d love to see a lot more in the way of metadata and flexibility, both about the book and about device usage. With respect to the book itself, this gets to the whole page number issue I was whinging about in my previous post, but it’s more than that. I’d love to see a statistical analysis of what I’m reading, on both individual and collective levels.

    I’m not entirely sure what this looks like, but it doesn’t need to be rocket science. Simple Flesch-Kincaid grades seem like an easy enough place to start, and they would be pretty simple to implement (I’ve included a rough sketch of the calculation after this list). Calculating such things for my entire library (or a subset of my library), or ranking my library by grade (or similar sorting methods) would be interesting. I don’t know that this would provide a huge amount of value, but I would personally find it very illuminating and fun to play around with. Individual works wouldn’t even require any processing power on the reader; the score could simply come along with the download. Doing calculations across your collective library might be a little more complicated, but even that could probably be done in the cloud.

    Other metadata would also be interesting to view. For example, Goodreads will graph your recently read books by year of publication – a lot of analysis could be done about your collection (or a sub-grouping of your collection) of books along those lines. Groupings by decade or genre or reading level, all would be very interesting to know.

  • Personal Metrics and Metadata – Basically, I’d like to have a way to track my reading speed. For whatever reason, this is something I’m always trying to figure out for myself. I’ve never gone through the process of actually recording my reading habits and speeds because it would be tedious and manual and maybe not even all that accurate. But now that I’m reading books in an electronic format, there’s no reason why the reader couldn’t keep track of what I’m reading, when I’m reading, and how fast I’m reading. My anecdotal experience suggests that I read anywhere from 20-50 pages an hour, depending mostly on the book. As mentioned in the previous post, a lot of this has to do with the arbitrary nature of page numbers, so perhaps standardizing to a better metric (words per minute or something like that) would normalize my reading speed.

    Knowing my reading speed and graphing changes over time could be illuminating. I’ve played around a bit with speed reading software, and the results are interesting, but not drastic. In any case, one thing that would be really interesting to know when reading a book would be how much time you have left before you finish. Instead of having 200 pages left, maybe you have 8 hours of reading time left (see the second sketch after this list).

    Combining my personal data with the general data could also yield some interesting results. Maybe I read trashy SF written before 1970 much faster than more contemporary literary fiction. Maybe I read long books faster than short books. There are a lot of possibilities here.

    There are a few catches to this whole personal metrics thing though. You’d need a way to account for breaks and interruptions. I might spend three hours reading tonight, but I’m sure I’ll take a break to get a glass of water or answer a phone call, etc… There’s not really an easy way around this, though there could be mitigating factors like when the reader goes to sleep mode or something like that. Another problem is that one device can be used by multiple people, which would require some sort of profile system. That might be fine, but it also adds a layer of complexity to the interface that I’m sure most companies would like to avoid. The biggest and most concerning potential issue is that of privacy. I’d love to see this information about myself, but would I want Amazon to have access to it? On the other hand, being able to aggregate data from all Kindles might prove interesting in its own right. Things like average reading speed, number of books read in a year, and so on. All interesting and useful info.

    This would require an openness and flexibility that Amazon has not yet demonstrated. It’s encouraging that the Kindle Fire runs a flavor of Android (an open source OS), but on the other hand, it’s a forked version that I’m sure isn’t as free (as in speech) as I’d like (and from what I know, the Fire is partially limited by its hardware). Expecting comprehensive privacy controls from Amazon seems naive.

    I’d like to think that these metrics would be desirable to a large audience of readers, but I really have no idea what the mass market appeal would be. It’s something I’d actually like to see in a lot of other places too. Video games, for instance, provide a lot of opportunity for statistics, and some games provide a huge amount of data on your gaming habits (be it online or in a single player mode). Heck, half the fun of sports games (or sports in general) is tracking the progress of your players (particularly prospects). Other games are baffling in how little of this depth they provide. People should be playing meta-games like Fantasy Baseball, but with MLB The Show providing the data instead of real life.

  • The Gamification of Reading – Much of the above wanking about metrics could probably be summarized as a way to make reading a game. The metrics mentioned above readily lend themselves to point scores, social-app-like badges, and leaderboards. I don’t know that this would necessarily be a good thing, but it could make for an intriguing system. There’s an interesting psychology at work in systems like this, and I’d be curious to see if someone like Amazon could make reading more addictive. Assuming most people don’t try to abuse the system (though there will always be a cohort that will attempt to exploit stuff like this), it could ultimately lead to beneficial effects for individuals who “play” the game competitively with their friends. Again, this isn’t necessarily a good thing. Perhaps the gamification of reading will lead to a sacrifice of comprehension in the name of speed, or other unintended side effects. Still, it would be nice to see the “gamification of everything” used for something other than a way for companies to trick customers into buying their products.
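
Since I keep insisting the readability idea would be easy, here’s a minimal sketch of what I mean. This is just an illustration of the Flesch-Kincaid grade formula with a naive syllable heuristic – not anything Amazon actually ships, and a real implementation would want a proper syllable dictionary:

```
import re

def count_syllables(word):
    """Very rough English syllable estimate: count vowel groups,
    subtracting a trailing silent 'e'. Good enough for a ballpark score."""
    word = word.lower()
    groups = re.findall(r"[aeiouy]+", word)
    count = len(groups)
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_kincaid_grade(text):
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)

print(round(flesch_kincaid_grade("The quick brown fox jumps over the lazy dog."), 1))
```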

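The “hours left instead of pages left” idea is even simpler once the device is tracking a words-per-minute pace – it’s just arithmetic. A toy sketch, with entirely made-up numbers:

```
def estimated_time_remaining(words_remaining, words_per_minute):
    """Convert the words left in a book into hours and minutes of reading,
    given a reader's measured pace."""
    minutes = words_remaining / words_per_minute
    hours, mins = divmod(round(minutes), 60)
    return hours, mins

# Hypothetical numbers: a 100,000 word novel that's 60% finished,
# read at a measured pace of 250 words per minute.
words_left = 100_000 * 0.40
h, m = estimated_time_remaining(words_left, 250)
print(f"About {h}h {m:02d}m of reading left")
```
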
As previously mentioned, the need for improved displays is a given (and not just for ereaders). But assuming these nutty metrics (and the gamification of reading) are an appealing concept, I’d like to think that it would provide an opening for someone to challenge Amazon in the market. An open, flexible device using a non-DRMed format and tied to a common store would be very nice. Throw in some game elements, add a great display, and you’ve got something close to my ideal reader. Unfortunately, it doesn’t seem like we’re all that close just yet. Maybe in 5-10 years? Seems possible, but it’s probably more likely that Amazon will continue its dominance.

Zemanta

Last week, I looked at commonplace books and various implementation solutions. Ideally, I wanted something open and flexible that would also provide some degree of analysis in addition to the simple data aggregation most tools provide. I wanted something that would take into account a wide variety of sources in addition to my own writing (on this blog, for instance). Most tools provide a search capability of some kind, but I was hoping for something more advanced. Something that would make connections between data, or find similarities with something I’m currently writing.

At first glance, Zemanta seemed like a promising candidate. It’s a “content suggestion engine” specifically built for blogging and it comes pre-installed on a lot of blogging software (including Movable Type). I just had to activate it, which was pretty simple. Theoretically, it continually scans a post in progress (like this one) and provides content recommendations, ranging from simple text links defining key concepts (e.g. links to Wikipedia, IMDB, Amazon, etc…), to imagery (much of which seems to be integrated with Flickr and Wikipedia), to recommended blog posts from other folks’ blogs. One of the things I thought was really neat was that I could input my own blogs, which would then give me more personalized recommendations.

Unfortunately, results so far have been mixed. There are some things I really like about Zemanta, but it’s pretty clearly not the solution I’m looking for. Some assorted thoughts:

  • Zemanta will only work when using the WYSIWYG Rich Text editor, which turns out to be a huge pain in the arse.  I’m sure lots of people are probably fine with that, but I’ve been editing my blog posts in straight HTML for far too long. I suppose this is more of a hangup on my end than a problem with Zemanta, but it’s definitely something I find annoying.  When I write a post in WYSIWYG format, I invariably switch it back to no formatting and jump through a bunch of hoops getting the post to look like what I want.
  • The recommended posts haven’t been very useful so far. Some of the external choices are interesting, but so far, nothing has really helped me in writing my posts. I was really hoping that loading my blog into Zemanta would add a lot of value, but it turns out that Zemanta only really scanned my recent posts, and it sorta recommended most of them, which doesn’t help me that much. I know what I’ve written recently; what I was hoping for was that Zemanta would be able to point out some post I wrote in 2005 along similar lines. (In my previous post on Taxonomy Platforms, I specifically referenced the titles of some of my old blog posts, but since they were old, Zemanta didn’t find or recommend them. Even more annoying, when writing this post, the Taxonomy Platforms post wasn’t one of the recommended articles despite my specifically mentioning it. Update: It shows up now, but it didn’t seem to appear until after I’d already gone through the trouble of linking it…) It appears that Zemanta is basing all of this on my RSS feed, which makes sense, but I wish there was a way to upload my full archives, as that might make this tool a little more powerful…
  • The recommendations seem to be based on a relatively simplistic algorithm. A good search engine will index data and learn associations between individual words by tracking their frequency and how close they are to other words. Zemanta doesn’t seem to do that. In my previous post, I referenced famous beer author Michael Jackson. What did Zemanta recommend? Lots of pictures and articles about the musician, nothing about the beer journalist. I don’t know if I’m expecting too much out of the system, but it would be nice if the software would pick up on the fact that this guy’s name was showing up near lots of beer talk, with nary a reference to music (a crude sketch of the sort of co-occurrence scoring I have in mind appears after this list). It’s probably too much to hope that my specifically calling out that I was talking about “the beer critic, not the pop star” would influence the system (and indeed, my reference to “pop star” may have influenced the recommendations, despite the fact that I was trying to negate that).
  • The “In-Text Links”, on the other hand, seem to come in quite handy. I actually leveraged several of them in my past few posts, and they were very easy to use. Indeed, I particularly appreciated their integration with Amazon, where I could enter my associates ID, and the links that were inserted were automatically generated with my ID. This is normally a pretty intensive process involving multiple steps that has been simplified down to the press of a button.  Very well done, and most of the suggestions there were very relevant.
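
I have no idea how Zemanta actually works internally, so take this with a grain of salt, but the kind of context-awareness I’m asking for isn’t exotic. Here’s a crude sketch of the idea: look at the words that show up near the ambiguous name and score each candidate “sense” against a vocabulary (hand-picked here purely for illustration; a real system would learn these from a corpus):

```
from collections import Counter
import re

# Toy sense vocabularies -- in a real system these would be learned
# from a corpus, not hand-picked like this.
SENSES = {
    "Michael Jackson (beer writer)": {"beer", "ale", "stout", "brewery", "hops", "critic"},
    "Michael Jackson (musician)": {"album", "pop", "thriller", "music", "dance", "star"},
}

def disambiguate(text, window=10):
    """Score each sense by how many of its vocabulary words appear
    within `window` words of the name 'michael jackson'."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = Counter({sense: 0 for sense in SENSES})
    for i, w in enumerate(words):
        if w == "michael" and i + 1 < len(words) and words[i + 1] == "jackson":
            context = set(words[max(0, i - window): i + window + 2])
            for sense, vocab in SENSES.items():
                scores[sense] += len(vocab & context)
    return scores.most_common()

sample = ("Famous beer author Michael Jackson wrote about stout, "
          "ale, and obscure brewery tours.")
print(disambiguate(sample))
```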

I will probably continue to play with Zemanta, but I suspect it won’t last much longer. It provides some value, but it’s ultimately not as convenient as I’d like, and its analysis and recommendation functions aren’t as useful as I’d hoped.

I’ve also been playing around with Evernote more and more, and I feel like that could be a useful tool, despite the fact that it doesn’t really offer any sort of analysis (though it does have a simple search function). There’s at least one third party, though, that seems to be positioning itself as an analysis tool that will integrate with Evernote. That tool is called Topicmarks. Unfortunately, I seem to be having some issues integrating my Evernote data with that service. At this rate, I don’t know that I’ll find a great tool for what I want, but it’s an interesting subject, and I’m guessing it will be something that will become more and more important as time goes on. We’re living in the Information Age; it seems only fair that our aggregation and analysis tools get more sophisticated.


Commonplacing

During the Enlightenment, most intellectuals kept what’s called a Commonplace Book. Basically, folks like John Locke or Mark Twain would curate transcriptions of interesting quotes from their readings. It was a personalized record of interesting ideas that the author encountered. When I first heard about the concept, I immediately started thinking of how I could implement one… which is when I realized that I’ve actually been keeping one, more or less, for the past decade or so on this blog. It’s not very organized, though, and it’s something that’s been banging around in my head for the better part of the last year or so.

Locke was a big fan of Commonplace Books, and he spent years developing an intricate system for indexing his books’ content. It was, of course, a ridiculous and painstaking process, but it worked. Fortunately for us, this is exactly the sort of thing that computer systems excel at, right? The reason I’m writing this post is a small confluence of events that has led me to consider creating a more formal Commonplace Book. Despite my earlier musing on the subject, this blog doesn’t really count. It’s not really organized correctly, and I don’t publish all the interesting quotes that I find. Even if I did, it’s not really in a format that would do me much good. So I’d need to devise another plan.

Why do I need a plan at all? What’s the benefit of a commonplace book? Well, I’ve been reading Steven Johnson’s book Where Good Ideas Come From: The Natural History of Innovation and he mentions how he uses a computerized version of the commonplace book:

For more than a decade now, I have been curating a private digital archive of quotes that I’ve found intriguing, my twenty-first century version of the commonplace book. … I keep all these quotes in a database using a program called DEVONthink, where I also store my own writing: chapters, essays, blog posts, notes. By combining my own words with passages from other sources, the collection becomes something more than just a file storage system. It becomes a digital extension of my imperfect memory, an archive of all my old ideas, and the ideas that have influenced me.

This DEVONthink software certainly sounds useful. It’s apparently got this fancy AI that will generate semantic connections between quotes and what you’re writing. It’s advanced enough that many of those connections seem to be subtle and “lyrical”, finding connections you didn’t know you were looking for. It sounds perfect except for the fact that it only runs on Mac OSX. Drats. It’s worth keeping in mind in case I ever do make the transition from PC to Mac, but it seems like lunacy to do so just to use this application (which, for all I know, will be useless to me).

By sheer happenstance, I’ve also been playing around with Pinterest lately, and it occurs to me that it’s a sort of commonplace book, albeit one with more of a narrow focus on images and video (and recipes?) than quotes. There are actually quite a few sites like that. I’ve been curating a large selection of links on Delicious for years now (1600+ links on my account). Steven Johnson himself has recently contributed to a new web startup called Findings, which is primarily concerned with book quotes. All of this seems rather limiting, and quite frankly, I don’t want to be using 7 completely different tools to do the same thing, but for different types of media.

I also took a look at Tumblr again, this time evaluating it from a commonplacing perspective. There are some really nice things about the interface and the ease with which you can curate your collection of media. The problem, though, is that their archiving system is even more useless than most blog software. It’s not quite the hell that is Twitter archives, but that’s a pretty low bar. Also, as near as I can tell, the data is locked up on their server, which means that even if I could find some sort of indexing and analysis tool to run through my data, I won’t really be able to do so (Update: apparently Tumblr does have a backup tool, but only for use with OSX. Again!? What is it with you people? This is the internet, right? How hard is it to make this stuff open?)

Evernote shows a lot of promise and probably warrants further examination. It seems to be the go-to alternative for lots of researchers and writers. It’s got a nice cloud implementation with a robust desktop client and the ability to export data as I see fit. I’m not sure if its search will be as sophisticated as what I ultimately want, but it could be an interesting tool.

Ultimately, I’m not sure the tool I’m looking for exists. DEVONthink sounds pretty close, but it’s hard to tell how it will work without actually using the damn thing. The ideal would be a system where you can easily maintain a whole slew of data and metadata, to the point where I could be writing something (say a blog post or a requirements document for my job) and the tool would suggest relevant quotes/posts based on what I’m writing. This would probably be difficult to accomplish in real-time, but a “Find related content” feature would still be pretty awesome (something like the rough sketch below). Anyone know of any alternatives?
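
For what it’s worth, even a bare-bones “find related content” feature is mostly bookkeeping rather than rocket science: turn each note into a bag of weighted words and rank stored notes by cosine similarity against whatever you’re currently writing. This is just a generic TF-IDF sketch, not DEVONthink’s (or anyone else’s) actual algorithm:

```
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def tf_idf_vectors(docs):
    """Weight each word by its frequency in a doc and its rarity across the collection."""
    tokenized = [Counter(tokenize(d)) for d in docs]
    doc_freq = Counter()
    for counts in tokenized:
        doc_freq.update(counts.keys())
    n = len(docs)
    return [{w: tf * math.log(n / doc_freq[w]) for w, tf in counts.items()}
            for counts in tokenized]

def cosine(a, b):
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def find_related(draft, notes):
    """Rank stored notes by similarity to the draft being written."""
    vectors = tf_idf_vectors([draft] + notes)
    draft_vec, note_vecs = vectors[0], vectors[1:]
    scored = zip(notes, (cosine(draft_vec, v) for v in note_vecs))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

notes = ["Locke kept an elaborately indexed commonplace book.",
         "Notes on hop varieties for this year's homebrew.",
         "DEVONthink stores quotes alongside my own chapters and essays."]
for note, score in find_related("Commonplace books and indexing quotes", notes):
    print(f"{score:.2f}  {note}")
```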

Update: Zemanta! I completely forgot about this. It comes installed by default with my blogging software, but I had turned it off a while ago because it took forever to load and was never really that useful. It’s basically a content recommendation engine, pulling content from lots of internet sources (notably Wikipedia, Amazon, Flickr and IMDB). It’s also grown considerably in the time since I’d last used it, and it now features a truckload of customization options, including the ability to separate general content recommendations from your own, personally curated sources. So far, I’ve only connected my two blogs to the software, but it would be interesting if I could integrate Zemanta with Evernote, Delicious, etc… I have no idea how great the recommendations will be (or how far back it will look on my blogs), but this could be exactly what I was looking for. Even if integration with other services isn’t working, I could probably create myself another blog just for quotes, and then use that blog with Zemanta. I’ll have to play around with this some more, but I’m intrigued by the possibilities.

SOPA Blues

I was going to write the annual arbitrary movie awards tonight, but since the web has apparently gone on strike, I figured I’d spend a little time talking about that instead. Many sites, including the likes of Wikipedia and Reddit, have instituted a complete blackout as part of a protest against two ill-conceived pieces of censorship legislation currently being considered by the U.S. Congress (these bills are the Stop Online Piracy Act and the Protect Intellectual Property Act, henceforth to be referred to as SOPA and PIPA). I can’t even begin to pretend that blacking out my humble little site would accomplish anything, but since a lot of my personal and professional livelihood depends on the internet, I suppose I can’t ignore this either.

For the uninitiated, if the bills known as SOPA and PIPA become law, many websites could be taken offline involuntarily, without warning, and without due process of law, based on little more than an alleged copyright owner’s unproven and uncontested allegations of infringement [1]. The reason Wikipedia is blacked out today is that they depend solely on user-contributed content, which means they would be a ripe target for overzealous copyright holders. Sites like Google haven’t blacked themselves out, but have staged a bit of a protest as well, because under the provisions of the bill, even just linking to a site that infringes upon copyright is grounds for action (and thus search engines have a vested interest in defeating these bills). You could argue that these bills are well intentioned, and from what I can tell, their original purpose seemed to be more about foreign websites and DNS, but the road to hell is paved with good intentions and as written, these bills are completely absurd.

Lots of other sites have been registering their feelings on the matter. ArsTechnica has been posting up a storm. Shamus has a good post on the subject which is followed by a lively comment thread. But I think Aziz hits the nail on the head:

Looks like the DNS provisions in SOPA are getting pulled, and the House is delaying action on the bill until February, so it’s gratifying to see that the activism had an effect. However, that activism would have been put to better use to educate people about why DRM is harmful, why piracy should be fought not with law but with smarter pro-consumer marketing by content owners (lowered prices, more options for digital distribution, removal of DRM, fair use, and ubiquitous time-shifting). Look at the ridiculous limitations on Hulu Plus – even if you’re a paid subscriber, some shows won’t air episodes until the week after, old episodes are not always available, some episodes can only be watched on the computer and are restricted from mobile devices. These are utterly arbitrary limitations on watching content that just drive people into the pirates’ arms.

I may disagree with some of the other things in Aziz’s post, but the above paragraph is important, and for some reason, people aren’t talking about this aspect of the story. Sure, some folks are disputing the numbers, but few are pointing out the things that IP owners could be doing instead of legislation. For my money, the most important thing that IP owners have forgotten is convenience. Aziz points out Hulu, which is one of the worst services I’ve ever seen in terms of being convenient or even just intuitive to customers. I understand that piracy is frustrating for content owners and artists, but this is not the way to fight piracy. It might be disheartening to acknowledge that piracy will always exist, but it probably will, so we’re going to have to figure out a way to deal with it. The one thing we’ve seen work is convenience. Despite the fact that iTunes had DRM, it was loose enough and convenient enough that it became a massive success (it now doesn’t have DRM, which is even better). People want to spend money on this stuff, but more often than not, content owners are making it harder on the paying customer than on the pirate. SOPA/PIPA is just the latest example of this sort of thing.

I’ve already written about my thoughts on Intellectual Property, Copyright and DRM, so I encourage you to check that out. And if you’re so inclined, you can find out which senators and representatives are supporting these bills, and throw them out in November (or in a few years, if need be). I also try to support companies or individuals that put out DRM-free content (for example, Louis CK’s latest concert video has been made available, DRM free, and has apparently been a success).

Intellectual Property and Copyright is a big subject, and I have to be honest in that I don’t have all the answers. But the way it works right now just doesn’t seem right. A copyrighted work released just before I was born (i.e. Star Wars) probably won’t enter the public domain until after I’m dead (I’m generally an optimistic guy, so I won’t complain if I do make it to 2072, but still). Both protection and expiration are important parts of the way copyright works in the U.S. It’s a balancing act, to be sure, but I think the pendulum has swung too far in one direction. Maybe it’s time we swing it back. Now if you’ll excuse me, I’m going to participate in a different kind of blackout to protest SOPA.

[1] – Thanks to James for the concise description. There are lots of much longer and better-sourced descriptions of the shortcomings of this bill and the issues surrounding it, so I won’t belabor the point here.

Streaming and Netflix’s Woes

A few years ago, when I was still contemplating the purchase of a Blu-Ray player (which ended up being the PS3), there was a lot of huffing-and-puffing about how Blu-Ray would never catch on, physical media was dead, and that streaming was the future. My thoughts on that at the time were that streaming is indeed the future, but that it would take at least 10 years before it actually happened in an ideal form. The more I see, the more I’m convinced that I actually underestimated the time it would take to get a genuinely great streaming service running.

One of the leading examples of a streaming service is Netflix’s Watch Instantly service. As a long time Netflix member, I can say that it is indeed awesome, especially now that I can easily stream it to my television. However, there is one major flaw to their streaming service: the selection. Now, they have somewhere on the order of 20,000-30,000 titles available, which is certainly a huge selection… but it’s about 1/5th of what they have available on physical media. For some folks, I’m sure that’s enough, but for movie nerds like myself, I’m going to want to keep the physical option on my plan…

The reason Netflix’s selection is limited is the same reason I don’t think we’ll see an ideal streaming service anytime soon. The problems are not technological. It all comes down to intellectual property. Studios and distributors own the rights, and they often don’t want to allow streaming, especially for new releases. Indeed, several studios won’t even allow Netflix to rent physical media for the first month of release. In order for a streaming service to actually supplant physical media, it will have to feature a comprehensive selection. Netflix does have a vested interest in making that happen (the infrastructure needed for physical media rentals via mail is massive and costly, while streaming is, at least, more streamlined from a logistical point of view), but I don’t see this happening anytime soon.

Netflix has recently encountered some issues along these lines, and as a result, they’ve changed their pricing structure. It used to be that you could buy a plan that would allow you to rent 1, 2, 3, or 4 DVDs or BDs at a time. If you belonged to one of those plans, you also got free, unlimited streaming. Within the past year or so, they added another option for folks who only wanted streaming. And just a few weeks ago, they made streaming an altogether separate service. Instead of buying the physical media plan of your choice and getting streaming “for free”, you now also need to pay for streaming. I believe their most popular plan used to be 1 disc with unlimited streaming, which was $9.99; the equivalent combination of plans now runs $15.98.

As you might expect, this has resulted in a massive online shitstorm of infantile rage and fury. Their blog post announcing the change currently has 12,000+ comments from indignant users. There are even more comments on their Facebook page (somewhere on the order of 80,000 comments there), and of course, other social media sites like Twitter were filled with indignant posts on the subject.

So why did Netflix risk the ire of their customers? They’ve even acknowledged that they were expecting some outrage at the change. My guess is that the bill’s about to come due, and Netflix didn’t really have a choice in the matter.

Indeed, a few weeks ago, Netflix had to temporarily stop streaming all of its Sony movies (which are distributed through Starz). It turns out that there’s a contractual limit on the number of subscribers that Sony will allow, so now Netflix needs to renegotiate with Sony/Starz. The current cost to license Sony/Starz content for streaming is around $30 million annually. Details aren’t really public (and it’s probably not finalized yet), but it’s estimated that the new contract will cost Netflix somewhere on the order of $200-$350 million a year. And that’s just Sony/Starz. I imagine other studios will now be chomping at the bit. And of course, all these studios will continually up their rates as Netflix tries to expand their streaming selection.

So I think that all of the invective being thrown Netflix’s way is mostly unwarranted (or, rather, misplaced). All that rage should really be directed at the studios who are trying to squeeze every penny out of their IP. At least Netflix seems to be doing business in an honest and open way here, and yet everyone’s bitching about it. Other companies would do something sneaky. For instance, movie theaters (which also get a raw deal from studios) seem to be raising ticket prices by a quarter every few months. Any given increase is met with a bit of a meh, but added up over the past few years, ticket prices have risen considerably.

Ultimately, it’s quite possible that Netflix will take a big hit on this in the next few years. Internet nerd-rage notwithstanding, I’m doubting that their customer base will drop, but if their cost of doing business goes up the way it seems, I can see their profits dropping considerably. But if that happens, it won’t be Netflix that we should blame, it will be the studios… I don’t want to completely demonize the studios here – they do create and own the content, and are entitled to be compensated for that. However, I don’t think anyone believes they’re being fair about this. They’ve been trying to slow Netflix down for years, after all. Quite frankly, Netflix has been much more customer friendly than the studios.

Communication

About two years ago (has it really been that long!?), I wrote a post about Interrupts and Context Switching. As long and ponderous as that post was, it was actually meant to be part of a larger series of posts. This post is meant to be the continuation of that original post and hopefully, I’ll be able to get through the rest of the series in relatively short order (instead of dithering for another couple years). While I’m busy providing context, I should also note that this series was also planned for my internal work blog, but in the spirit of arranging my interests in parallel (and because I don’t have that much time at work dedicated to blogging on our intranet), I’ve decided to publish what I can here. Obviously, some of the specifics of my workplace have been removed from what follows, but it should still contain enough general value to be worthwhile.

In the previous post, I wrote about how computers and humans process information and in particular, how they handle switching between multiple different tasks. It turns out that computers are much better at switching tasks than humans are (for reasons belabored in that post). When humans want to do something that requires a lot of concentration and attention, such as computer programming or complex writing, they tend to work best when they have large amounts of uninterrupted time and can work in an environment that is quiet and free of distractions. Unfortunately, such environments can be difficult to find. As such, I thought it might be worth examining the source of most interruptions and distractions: communication.

Of course, this is a massive subject that can’t even be summarized in something as trivial as a blog post (even one as long and bloviated as this one is turning out to be). That being said, it’s worth examining in more detail because most interruptions we face are either directly or indirectly attributable to communication. In short, communication forces us to do context switching, which, as we’ve already established, is bad for getting things done.

Let’s say that you’re working on something large and complex. You’ve managed to get started and have reached a mental state that psychologists refer to as flow (also colloquially known as being “in the zone”). Flow is basically a condition of deep concentration and immersion. When you’re in this state, you feel energized and often don’t even recognize the passage of time. Seemingly difficult tasks no longer feel like they require much effort and the work just kinda… flows. Then someone stops by your desk to ask you an unrelated question. As a nice person and an accommodating coworker, you stop what you’re doing, listen to the question and hopefully provide a helpful answer. This isn’t necessarily a bad thing (we all enjoy helping other people out from time to time) but it also represents a series of context switches that would most likely break you out of your flow.

Not all work requires you to reach a state of flow in order to be productive, but for anyone involved in complex tasks like engineering, computer programming, design, or in-depth writing, flow is a necessity. Unfortunately, flow is somewhat fragile. It doesn’t happen instantaneously; it requires a transition period where you refamiliarize yourself with the task at hand and the myriad issues and variables you need to consider. When your colleague departs and you can turn your attention back to the task at hand, you’ll need to spend some time getting your brain back up to speed.

In isolation, the kind of interruption described above might still be alright every now and again, but imagine if the above scenario happened a couple dozen times in a day. If you’re supposed to be working on something complicated, such a series of distractions would be disastrous. Unfortunately, I work for a 24/7 retail company and the nature of our business sometimes requires frequent interruptions, and thus there are times when I am in a near constant state of context switching. None of this is to say I’m not part of the problem. I am certainly guilty of interrupting others, sometimes frequently, when I need some urgent information. This makes working on particularly complicated problems extremely difficult.

In the above example, there are only two people involved: you and the person asking you a question. However, in most workplace environments, that situation indirectly impacts the people around you as well. If they’re immersed in their work, an unrelated conversation two cubes down may still break them out of their flow and slow their progress. This isn’t nearly as bad as some workplaces that have a public address system – basically a way to interrupt hundreds or even thousands of people in order to reach one person – but it does still represent a challenge.

Now, the really insidious part about all this is that communication is really a good thing, a necessary thing. In a large scale organization, no one person can know everything, so communication is unavoidable. Meetings and phone calls can be indispensable sources of information and enablers of collaboration. The trick is to do this sort of thing in a way that interrupts as few people as possible. In some cases, this will be impossible. For example, urgency often forces disruptive communication (because you cannot afford to wait for an answer, you will need to be more intrusive). In other cases, there are ways to minimize the impact of frequent communication.

One way to minimize communication is to have frequently requested information documented in a common repository, so that if someone has a question, they can find it there instead of interrupting you (and potentially those around you). Naturally, this isn’t quite as effective as we’d like, mostly because documenting information is a difficult and time consuming task in itself and one that often gets left out due to busy schedules and tight timelines. It turns out that documentation is hard! A while ago, Shamus wrote a terrific rant about technical documentation:

The stereotype is that technical people are bad at writing documentation. Technical people are supposedly inept at organizing information, bad at translating technical concepts into plain English, and useless at intuiting what the audience needs to know. There is a reason for this stereotype. It’s completely true.

I don’t think it’s quite as bad as Shamus points out, mostly because I think that most people suffer from the same issues as technical people. Technology tends to be complex and difficult to explain in the first place, so it’s just more obvious there. Technology is also incredibly useful because it abstracts many difficult tasks, often through the use of metaphors. But when a user experiences the inevitable metaphor shear, they have to confront how the system really works, not the easy abstraction they’ve been using. This descent into technical details will almost always be a painful one, no matter how well documented something is, which is part of why documentation gets short shrift. I think the fact that there actually is documentation is usually a rather good sign. Then again, lots of things aren’t documented at all.

There are numerous challenges for a documentation system. It takes resources, time, and motivation to write. It can become stale and inaccurate (sometimes this can happen very quickly) and thus it requires a good amount of maintenance (this can involve numerous other topics, such as version histories, automated alert systems, etc…). It has to be stored somewhere, and thus people have to know where and how to find it. And finally, the system for building, storing, maintaining, and using documentation has to be easy to learn and easy to use. This sounds all well and good, but in practice, it’s a nonesuch beast. I don’t want to get too carried away talking about documentation, so I’ll leave it at that (if you’re still interested, that nonesuch beast article is quite good). Ultimately, documentation is a good thing, but it’s obviously not the only way to minimize communication strain.

I’ve previously mentioned that computer programming is one of those tasks that require a lot of concentration. As such, most programmers abhor interruptions. Interestingly, communication technology has been becoming more and more reliant on software. As such, it should be no surprise that a lot of new tools for communication are asynchronous, meaning that the exchange of information happens at each participant’s own convenience. Email, for example, is asynchronous. You send an email to me. I choose when I want to review my messages and I also choose when I want to respond. Theoretically, email does not interrupt me (unless I use automated alerts for new email, such as the default Outlook behavior) and thus I can continue to work, uninterrupted.

The aforementioned documentation system is also a form of asynchronous communication and indeed, most of the internet itself could be considered a form of documentation. Even the communication tools used on the web are mostly asynchronous. Twitter, Facebook, YouTube, Flickr, blogs, message boards/forums, RSS and aggregators are all reliant on asynchronous communication. Mobile phones are obviously very popular, but I bet that SMS texting (which is asynchronous) is used just as much as voice, if not more so (at least, for younger people). The only major communication tools invented in the past few decades that wouldn’t be asynchronous are instant messaging and chat clients. And even those systems are often used in a more asynchronous way than traditional speech or conversation. (I suppose web conferencing is a relatively new communication tool, though it’s really just an extension of conference calls.)

The benefit of asynchronous communication is, of course, that it doesn’t (or at least it shouldn’t) represent an interruption. If you’re immersed in a particular task, you don’t have to stop what you’re doing to respond to an incoming communication request. You can deal with it at your own convenience. Furthermore, such correspondence (even in a supposedly short-lived medium like email) is usually stored for later reference. Such records are certainly valuable resources. Unfortunately, asynchronous communication has its own set of difficulties as well.

Miscommunication is certainly a danger in any case, but it seems more prominent in the world of asynchronous communication. Since there is no easy back-and-forth in such a method, there is no room for clarification and one is often left only with their own interpretation. Miscommunication is doubly challenging because it creates an ongoing problem. What could have been a single conversation has now ballooned into several asynchronous touch-points and even the potential for wasted work.

One of my favorite quotations is from Anne Morrow Lindbergh:

To write or to speak is almost inevitably to lie a little. It is an attempt to clothe an intangible in a tangible form; to compress an immeasurable into a mold. And in the act of compression, how the Truth is mangled and torn!

It’s difficult to beat the endless nuance of face-to-face communication, and for some discussions, nothing else will do. But as Lindbergh notes, communication is, in itself, a difficult proposition. Difficult, but necessary. About the best we can do is to attempt to minimize the misunderstanding.

I suppose one way to mitigate the possibility of miscommunication is to formalize the language in which the discussion is happening. This is easier said than done, as our friends in the legal department would no doubt say. Take a close look at a formal legal contract and you can clearly see the flaws in formal language. They are ostensibly written in English, but they require a lot of effort to compose or to read. Even then, opportunities for miscommunication or loopholes exist. Such a process makes sense when dealing with two separate organizations that each have their own agenda. But for internal collaboration purposes, such a formalization of communication would be disastrous.

You could consider computer languages a form of formal communication, but for most practical purposes, this would also fall short of a meaningful method of communication. At least, with other humans. The point of a computer language is to convert human thought into computational instructions that can be carried out in an almost mechanical fashion. While such a language is indeed very formal, it is also tedious, unintuitive, and difficult to compose and read. Our brains just don’t work like that. Not to mention the fact that most of the communication efforts I’m talking about are the precursors to the writing of a computer program!

Despite all of this, a light formalization can be helpful and the fact that teams are required to produce important documentation practically requires a compromise between informal and formal methods of communication. In requirements specifications, for instance, I have found it quite beneficial to formally define various systems, acronyms, and other jargon that is referenced later in the document. This allows for a certain consistency within the document itself, and it also helps establish guidelines surrounding meaningful dialogue outside of the document. Of course, it wouldn’t quite be up to legal standards and it would certainly lack the rigid syntax of computer languages, but it can still be helpful.

I am not an expert in linguistics, but it seems to me that spoken language is much richer and more complex than written language. Spoken language features numerous intricacies and tonal subtleties such as inflections and pauses. Indeed, spoken language often contains its own set of grammatical patterns which can be different than written language. Furthermore, face-to-face communication also consists of body language and other signs that can influence the meaning of what is said depending on the context in which it is spoken. This sort of nuance just isn’t possible in written form.

This actually illustrates a wider problem. Again, I’m no linguist and haven’t spent a ton of time examining the origins of language, but it seems to me that language emerged as a more immediate form of communication than what we use it for today. In other words, language was meant to be ephemeral, but with the advent of written language and improved technological means for recording communication (which are, historically, relatively recent developments), we’re treating it differently. What was meant to be short-lived and transitory is now enduring and long-lived. As a result, we get things like the ever changing concept of political-correctness. Or, more relevant to this discussion, we get the aforementioned compromise between formal and informal language.

Another drawback to asynchronous communication is the propensity for over-communication. The CC field in an email can be a dangerous thing. It’s very easy to broadcast your work out to many people, but the more this happens, the more difficult it becomes to keep track of all the incoming stimuli. Also, the language used in such a communication may be optimized for one type of reader, while the audience may be more general. This applies to other asynchronous methods as well. Documentation in a wiki is infamously difficult to categorize and find later. When you have an army of volunteers (as Wikipedia does), it’s not as large a problem. But most organizations don’t have such luxuries. Indeed, we’re usually lucky if something is documented at all, let alone well organized and optimized.

The obvious question, which I’ve skipped over for most of this post (and, for that matter, the previous post), is: why communicate in the first place? If there are so many difficulties that arise out of communication, why not minimize such frivolities so that we can get something done?

Indeed, many of the greatest works in history were created by one mind. Sometimes, two. If I were to ask you to name the greatest inventor of all time, what would you say? Leonardo da Vinci or perhaps Thomas Edison. Both had workshops consisting of many helping hands, but their greatest ideas and conceptual integrity came from one man. Great works of literature? Shakespeare is the clear choice. Music? Bach, Mozart, Beethoven. Painting? da Vinci (again!), Rembrandt, Michelangelo. All individuals! There are collaborations as well, but usually only among two people. The Wright brothers, Gilbert and Sullivan, and so on.

So why has design and invention gone from solo efforts to group efforts? Why do we know the names of most of the inventors of 19th and early 20th century innovations, but not later achievements? For instance, who designed the Saturn V rocket? No one knows that, because it was a large team of people (and it was the culmination of numerous predecessors made by other teams of people). Why is that?

The biggest and most obvious answer is the increasing technological sophistication in nearly every area of engineering. Lazarus Long’s infamous adage that “specialization is for insects” notwithstanding, the amount of effort and specialization in various fields is astounding. Take a relatively obscure and narrow branch of mechanical engineering like Fluid Dynamics, and you’ll find people devoting most of their lives to the study of that field. Furthermore, the applications of that field go far beyond what we’d assume. Someone tinkering in their garage couldn’t make the Saturn V alone. They’d require too much expertise in a wide and disparate array of fields.

This isn’t to say that someone tinkering in their garage can’t create something wonderful. Indeed, that’s where the first personal computer came from! And we certainly know the names of many innovators today. Mark Zuckerberg and Larry Page/Sergey Brin immediately come to mind… but even their inventions spawned large companies with massive teams driving future innovation and optimization. It turns out that scaling a product up often takes more effort and more people than expected. (More information about the pros and cons of moving to a collaborative structure will have to wait for a separate post.)

And with more people comes more communication. It’s a necessity. You cannot collaborate without large amounts of communication. In Tom DeMarco and Timothy Lister’s book Peopleware, they call this the High-Tech Illusion:

…the widely held conviction among people who deal with any aspect of new technology (as who of us does not?) that they are in an intrinsically high-tech business. … The researchers who made fundamental breakthroughs in those areas are in a high-tech business. The rest of us are appliers of their work. We use computers and other new technology components to develop our products or to organize our affairs. Because we go about this work in teams and projects and other tightly knit working groups, we are mostly in the human communication business. Our successes stem from good human interactions by all participants in the effort, and our failures stem from poor human interactions.

(Emphasis mine.) That insight is part of what initially inspired this series of posts. It’s very astute, and most organizations work along those lines, and thus need to figure out ways to account for the additional costs of communication (this is particularly daunting, as such things are notoriously difficult to measure, but I’m getting ahead of myself). I suppose you could argue that both of these posts are somewhat inconclusive. Some of that is because they are part of a larger series, but also, as I’ve been known to say, human beings don’t so much solve problems as they do trade one set of problems for another (in the hopes that the new problems are preferable to the old). Recognizing and acknowledging the problems introduced by collaboration and communication is vital to working on any large project. As I mentioned towards the beginning of this post, this only really scratches the surface of the subject of communication, but for the purposes of this series, I think I’ve blathered on long enough. My next topic in this series will probably cover the various difficulties of providing estimates. I’m hoping the groundwork laid in these first two posts will mean that the next post won’t be quite so long, but you never know!