This is your brain on music

March 8th, 2010 by Rob Haitani

Previously on “Vitamin D Blog,” I said the brain processes streams of data hierarchically to form high-level concepts. If you’ve seen a musician improvise or play by ear, though, you’ve seen the process in reverse. They start with a high-level concept, and output individual notes. How does this work?

According to HTM theory, when you recognize a partial pattern you instinctively predict what comes next. This can be handy for patterns like “impending stock market collapse” or “psychopath with axe.” But with music, it’s about how your mind fills in the last note when you hear “twinkle, twinkle little…”

When you learn an instrument, you find that patterns of notes, chords and rhythms repeat in certain types of songs. You also find that you can swap these pieces. If you know the key, you can plug in a couple riffs to play a measure; plug in a couple measures to play a verse. Since music is hierarchical, once you’ve learned enough riffs and chords, then playing a new song is like pulling clothes out of your closet to get dressed (instead of sewing from scratch).

So why learn every note when you can use this:

  • Verse: Am  Fmaj7  G
  • Chorus: G  Fmaj7

That’s neocortical efficiency in action. Learn two combinations of three chords, and you’re playing “Under The Milky Way.”

And it gets better with the “twelve-bar blues.” This is a high-level pattern of blues/rock. Think of it as the world’s first industry standard. For generations, musicians who know this pattern have been able to jam with strangers.
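For the curious, here is a minimal Python sketch of that pattern. It assumes one common basic form of the twelve-bar blues (four bars of the I chord, two of IV, two of I, then V, IV, I, I); players vary it constantly with sevenths and turnarounds. The function name `twelve_bar_blues` and the sharps-only note list are illustrative choices, not anything from HTM theory.

```python
# The twelve note names, sharps only (a simplification for illustration).
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone distance of the I, IV and V chords above the key's root note.
DEGREES = {"I": 0, "IV": 5, "V": 7}

# One common basic form of the twelve-bar blues, one chord per bar.
TWELVE_BAR = ["I"] * 4 + ["IV"] * 2 + ["I"] * 2 + ["V", "IV", "I", "I"]


def twelve_bar_blues(key):
    """Spell out the twelve bars as chord roots in the given key."""
    root = NOTES.index(key)
    return [NOTES[(root + DEGREES[degree]) % 12] for degree in TWELVE_BAR]


print(twelve_bar_blues("E"))
```

Learn one twelve-item list and you can play it in any key just by changing the argument, which is more or less how strangers manage to jam together.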

As for improvisation, it’s just prediction in real time. You start with some random notes and then predict more notes that will complete a pattern. It’s like creating your own brain-teaser and solving it on the spot. The best solos are those that start with no pattern in mind but somehow jell into one. So next time you see a killer solo at a jazz club, try yelling out, “Dude, way to predict a higher-level cause from self-generated partial novel input.” (Musicians love that.)

My cousin Dean Haitani plays a mean blues guitar

Crowdsourcing terrorism

March 1st, 2010 by Rob Haitani

I suppose terrorism isn’t exactly on topic, but since I’ve talked about everything from Hume to Star Trek, it’s hard to define what “off topic” is.

I used to think of Al-Qaida as a bunch of fanatics in caves carrying AK-47s. Of course they did defeat the Soviet Army, but I didn’t say they were incompetent. I just didn’t think they were particularly tech-savvy.

So I’m fascinated by how Jarret Brachman describes Al-Qaida’s sophisticated Internet marketing and recruitment strategies. I suppose if you’re a shadowy, dispersed organization, what better place to distribute your message than the decentralized, anonymous Internet? But I figured advanced social media marketing was more our home-field advantage.

Brachman coined the term “jihobbyists” for the legions of wannabe terrorists who live in their mothers’ basements (his phrase). Al-Qaida feeds a network of websites (many targeting English speakers) with carefully staged videos, emblazoned with the official golden Al-Qaida media logo. (Media logo…? Yes, even on coffee mugs. Seriously.) The messages are crafted to appeal to a target audience of young disaffected Muslim men. There is an abundance of ideological material and online tutorials describing not only how to make bombs, but how to edit videos and create web pages (including links to pirated software). In sum, “al-Qaida has transformed from a terrorist organization that selectively leverages the media… into a media organization that selectively leverages terrorism….”

Brachman argues that the only way to defeat Al-Qaida is to beat them at their own game (which is of course our game, dammit). He doesn’t mean better marketing of America’s image. (I’m reminded of the “This American Life” episode where an ad agency tried to create a commercial to sell American values to the Muslim world. The same all-white account team had a debate for another commercial about whether “black kids swim.”  Let’s just say the focus groups didn’t go well.)

Instead, Brachman advocates an open, collaborative forum for exchanging ideas and information about Al-Qaida’s ideology and media. Crowd-sourcing a response to Al-Qaida—now that’s something we should be good at doing.

The coffee logo mug graphic was better but I didn't have copyrights....

Why doesn’t Moore’s Law apply to user interface?

February 22nd, 2010 by Rob Haitani

Products become smaller, faster, cheaper every year. My sports watch probably has more computing power than my first computer. But it’s harder to use than my last sports watch. Why doesn’t usability advance continuously? Why does it often regress?

Well, with hardware, many advances involve greater density. Pack more transistors on a chip and it gets faster and smaller. But interface design faces a bottleneck: the brain. A psychology paper called “The Magical Number Seven, Plus or Minus Two” describes how your brain can only process a handful of items at once. Too many features on one screen and you can’t process them (unless you memorize them). Divide the features into too many different places and you can’t remember how to find them. So as features increase, information density creates cognitive gridlock like 880 at rush hour.

If you can’t widen the cognitive pipe, however, you can optimize. Fortunately, most people use only a fraction of product features. One study found people spend 80% of their time on 3.6% of the features (the “80/3.6 rule”?). Consequently, putting frequently used features on screen and hiding the rest in menus can make complexity feel simple (a.k.a., “Zen of Palm”). It’s not gridlock if the cars you care about get through because they’re in front. The intelligent computing equivalent is “relevance.” Instead of hours of video, show only clips of people coming through a door. Or instead of random shopping options, show likely choices based on my preference patterns.
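To make the “frequently used features on screen” idea concrete, here is a minimal Python sketch. It assumes you have a count of how often each feature is used; the feature names, the numbers and the function `features_for_main_screen` are all hypothetical, and the 80% threshold simply echoes the study mentioned above.

```python
def features_for_main_screen(usage_counts, coverage=0.80):
    """Return the most-used features that together cover `coverage` of all use."""
    total = sum(usage_counts.values())
    # Rank features from most used to least used.
    ranked = sorted(usage_counts, key=usage_counts.get, reverse=True)
    chosen, covered = [], 0
    for feature in ranked:
        if covered >= coverage * total:
            break  # Everything else can hide in a menu.
        chosen.append(feature)
        covered += usage_counts[feature]
    return chosen


# Hypothetical usage log: how many times each feature was used.
usage = {
    "call": 520,
    "text": 300,
    "camera": 110,
    "calendar": 40,
    "calculator": 20,
    "settings": 10,
}
print(features_for_main_screen(usage))
```

With these made-up numbers, two of the six features account for more than 80% of use, so the other four can live in a menu without anyone feeling the gridlock.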

Moreover, if software had any idea what you wanted to do, you wouldn’t have to hunt for features in the first place. UI structures act like filing systems for features. Over time, smart natural language input will provide more direct access to features than menus and cryptic icons. Ultimately, it’s easier to ask a librarian than to wander the shelves figuring out the Dewey Decimal System. (It’s similar to Clay Shirky’s description of how Google’s direct search supplanted Yahoo’s structured lists.) That’s what I mean by the (tongue-in-cheek) statement that intelligent computing will make interface design obsolete.

The creativity of Dr. House, chimpanzees & Kepler

February 15th, 2010 by Rob Haitani

If HTM theory describes cognitive activity, I figure goofy topics like creativity are fair game to ponder. I don’t mean random creativity. If you show up at a meeting wearing a gopher suit and speaking Klingon, I suppose that’s “creative,” if not “helpful” or “sane.” I mean creative problem solving, like Dr. House. He’ll pick up a FedEx package, and suddenly see an analogous relationship to lungs that leads to a diagnosis. Jeff Hawkins describes this as “prediction by analogy,” where creativity means using a “higher level of abstraction” to make “uncommon predictions.” We find salient relationships between disparate concepts, which inspire answers. Delivering packages is like delivering oxygen.

Humans have a big lead over HTM because we know about FedEx and pulmonary systems (House does, anyway). So what’s the simplest level of abstraction we could test? Rudolf Arnheim describes how a chimpanzee can’t categorize triangular arrangements of dots as triangles. I’ve seen an HTM categorize four circles with 90-degree cut-outs as a square (right image below). Does that count?

But to be truly “creative,” you must find what Arnheim calls “good fits hidden by the primary appearance of the evidence, yet applicable through ingenious re-structuring.” (Hawkins uses similar terms.) Arnheim describes how star movement appeared erratic until our buddy Kepler adjusted for Earth’s orbital motion. This re-structuring seems to me like a pre-processing step for the data rather than a naturally discoverable pattern. (I could speculate further, but I’d just be making stuff up.)

And can you adjust “creativity” levels? I don’t want Vitamin D Video getting “creative” about recognizing people. Would “creative” networks become unwieldy if you didn’t prune out uncommon combinations? Humans don’t waste time re-evaluating every night the probability that the sun will rise the next day. (Hume used this example to assert the irrationality of causal reasoning—how do we know for sure that past results indicate future performance? Quantum physics states we don’t actually know the sun will rise—it’s just very, very likely.)

In the meantime, “Ka’plah!” fellow gophers.

shapes

(Apologies for the pretentious Magritte reference)

Vitamin D Video 1.0 is now available

February 8th, 2010 by Rob Haitani

I’m pleased to announce we have released version 1.0 of Vitamin D Video.  For those who missed my previous blog on pricing, we are offering three versions:

  • The Starter Edition is free.  It has all of the functionality of the Beta release, but is limited to one camera configured per computer.  If you switch from a beta version to the Starter Edition, all previously recorded video  will be saved, but moving forward you can only have one camera set up at a time.
  • The Basic Edition is $49, and supports two cameras per computer. In addition,  in this version you can save and view video in VGA (640 x 480) resolution.  This could be the best choice for older or low-powered machines (geekspeak translation: single-core processors probably won’t handle more than two cameras anyway).
  • The Pro Edition is $199 and gives you the same functionality as the Basic Edition but with no limit on cameras. Or more specifically, you can run as many cameras as your PC can handle. Roughly speaking, Vitamin D Video can run 2 cameras per core, or 1.5 cameras per core if you want to use other applications. For example, a quad-core machine should run eight cameras, but if you want to use it while the cameras are running, it probably caps out around six cameras.
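As a rough illustration of the per-core rule of thumb above, here is a small Python sketch. The function name `max_cameras` is made up for this example, and the result is only the estimate described in the list, not a guarantee for any particular machine.

```python
import math


def max_cameras(cores, shared_with_other_apps=False):
    """Estimate camera capacity: 2 per core, or 1.5 per core on a shared PC."""
    per_core = 1.5 if shared_with_other_apps else 2
    # Round down, since you can't run half a camera.
    return math.floor(cores * per_core)


print(max_cameras(4))                               # dedicated quad-core machine
print(max_cameras(4, shared_with_other_apps=True))  # quad-core you also work on
```

For a quad-core machine this gives eight cameras when dedicated and six when shared, matching the example in the Pro Edition description.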

If you are a residential customer you could save money depending on how many computers you have. For example, if you want to use four cameras and have two computers, you could buy two Basic versions for a total of $98.  Over time we will add more functionality to the Basic and Pro Editions, and plan to continue to provide the Starter Edition for free.
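The arithmetic behind that example can be sketched in a few lines of Python. This assumes the only thing you care about is the camera limit (it ignores the VGA difference between editions), and the names `EDITIONS`, `cheapest_edition` and `total_cost` are invented for illustration.

```python
# (name, cameras allowed per computer, price in USD); None means no limit.
# Listed from cheapest to most expensive.
EDITIONS = [("Starter", 1, 0), ("Basic", 2, 49), ("Pro", None, 199)]


def cheapest_edition(cameras):
    """Return (name, price) of the cheapest edition that allows this many cameras."""
    for name, limit, price in EDITIONS:
        if limit is None or cameras <= limit:
            return name, price


def total_cost(cameras_per_computer):
    """Total price when each computer gets the cheapest edition that fits."""
    return sum(cheapest_edition(n)[1] for n in cameras_per_computer)


print(total_cost([2, 2]))  # four cameras split across two computers
print(total_cost([4]))     # four cameras on a single computer
```

Splitting four cameras across two computers costs $98, while putting all four on one computer requires the $199 Pro Edition.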

Last but not least, I’m pleased to announce a contest to give away a free iPad!  We are looking for video clips of real-life cases where Vitamin D Video helped you find something valuable (or funny).  Vitamin D will select the finalists and you can vote for the winner.  Details can be found here.

Thank you all for your support and feedback during the Beta period!

Cool Clips Contest

Why smart people make bad products

February 1st, 2010 by Rob Haitani

I said I’d write about user interface, so let’s start by asking why products are hard to use. Well, that’s a long discussion. But after trying to explain the brain in 350 words, this can only be easier.

First, software interfaces are not constrained by the annoying laws of physics. It’s nice that you can jump anywhere, or things magically appear.  But the constraints of physics have an upside: predictability. Water wet. Fire bad. When traveling, you might wonder if you can drink the water, but you’re pretty sure you shouldn’t step into the campfire.

I’m reminded of a book called Einstein’s Dreams. Each chapter describes a world with different laws of physics—time flows backwards, or in circles. Trying new software is like entering different worlds where you have to figure out the rules. And some apps are like the world where cause and effect are erratic. If you can’t figure out or remember what action leads to what result (a “mental model”), you’re paralyzed.

Of course software is constrained by logic in the programmatic realm, where developers live. Your interface world is just a shadowy parallel existence. So why don’t we map UI to the programmatic world? We did once—it was called DOS. (Remember “dir/w…?”).

But aren’t there known UI conventions? Wouldn’t software be easy if designers followed consistent design conventions? Well, that’s like saying I’d understand molecular biology if you’d just use correct spelling and grammar. As lawyers say, “necessary but not sufficient.”  You must “edit” an interface, like you would prose, until it’s clear and concise. You also need  skill with metaphor. Suppress your understanding of how the technology works, and instead structure an interface around what your customer wants to do. I don’t want to launch phone.exe, query contacts DB, and call the telephony library. I want to tell my wife I’ll be late.

So that’s a high-level take on why smart people can make bad products. Later, I’ll talk about why I think design processes are sometimes misguided, and the curse of feature creep.

Come on in, the fire's fine!

Please don't use Vitamin D Video if…

January 24th, 2010 by Rob Haitani

Last week I mentioned a $1 billion government contract that has only delivered a prototype that “sorta works.” When introducing new technology, it’s important not to oversell. It might “demo well,” but you have to know how often it breaks in the real world. And that degree of error must not compromise usability.

In the video analytics field, companies made big promises about detecting any intruder and alerting you. Some systems met those claims, but weren’t necessarily usable.

Let me illustrate by analogy. Take a deck of cards, and guess the suit before you flip each card. Bet someone a beer you can guess 100% of the clubs in the deck. Then simply yell “CLUBS” confidently for every card. Congratulations, you got 100% of the clubs right, even if you got all the others wrong (“100% true positive rate,” if you want to rub the statistical term in their face while relishing your well-earned beer).
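If you want to check the beer bet before making it, here is a minimal Python sketch that scores the “always yell CLUBS” strategy against a standard 52-card deck. The variable names are just for illustration.

```python
SUITS = ["clubs", "diamonds", "hearts", "spades"]

# A standard deck has 13 cards of each suit; only the suit matters here.
deck = [suit for suit in SUITS for _ in range(13)]

# The strategy: confidently guess "clubs" for every single card.
guesses = ["clubs"] * len(deck)

pairs = list(zip(guesses, deck))
true_positives = sum(g == "clubs" and card == "clubs" for g, card in pairs)
false_positives = sum(g == "clubs" and card != "clubs" for g, card in pairs)

clubs = deck.count("clubs")
non_clubs = len(deck) - clubs

true_positive_rate = true_positives / clubs        # share of clubs you caught
false_positive_rate = false_positives / non_clubs  # share of non-clubs you called clubs
accuracy = sum(g == card for g, card in pairs) / len(deck)

print(true_positive_rate, false_positive_rate, accuracy)
```

The true positive rate comes out at 100%, but so does the false positive rate, and overall you are right on only a quarter of the cards.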

That’s a silly example, but there are similar concepts in recognition software.  You can expand your rules to increase the percent of people detected, but your broader net also tends to catch more false alarms (“false positive rate”).  And even a 1% false alarm rate can cripple usability in large-scale deployments.  For example, 2000 objects per day would trigger 20 false alarms—per camera. With 100 cameras, you’d have security guards scrambling all day. So if you’re looking for fail-safe notification to guard weapons-grade plutonium, please don’t use Vitamin D Video.
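The false alarm arithmetic is simple enough to sketch in Python. The function name `false_alarms_per_day` is made up for this example, and it assumes the false alarm rate applies independently to every object a camera sees.

```python
def false_alarms_per_day(objects_per_camera, false_alarm_rate, cameras=1):
    """Expected number of false alarms per day across all cameras."""
    return round(objects_per_camera * false_alarm_rate * cameras)


one_camera = false_alarms_per_day(2000, 0.01)
hundred_cameras = false_alarms_per_day(2000, 0.01, cameras=100)
print(one_camera, hundred_cameras)
```

At a 1% rate, one camera seeing 2000 objects produces 20 false alarms a day, and 100 such cameras produce 2000, which is why a rate that sounds tiny can still keep the guards running.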

So how accurate is Vitamin D Video? We found mid-90% accuracy in real-world tests, but raw accuracy numbers honestly don’t mean much. That’s why we made it easy to try our software yourself with a simple webcam. Your results will depend on your specific environment, camera setup and rules.

More importantly, we believe the greater value of the software is how recognition makes it easy to scan through recorded video.  100% accuracy isn’t required to do that, but high accuracy (and good UI) is. We’d love to hear about your experiences.  Please post a comment or send us a note.

How to guess cards with 100% accuracy

How to survive a robotic uprising

January 17th, 2010 by Rob Haitani

The NY Times reports that 4,000 military analysts watched the equivalent of 24 years of video last year from unmanned drones flying over Iraq and Afghanistan. To learn more, I read Wired for War by P.W. Singer, but found out about much more than drones. The book covers technical, ethical, social and legal issues regarding military robotics. There are chapters on military strategy, AI, the “Singularity,” and science fiction inspirations for inventors (I was even startled to see my name in the book!).

Maybe familiarity breeds contempt, but if the state of the art in your industry is telling people from cats, you don’t lose sleep worrying about cyborgs. But I was surprised to see how far robotics has advanced. Not that we have Cylons walking around, but apparently we have deployed remote-controlled robots that fire machine guns. The manufacturer’s website states, “Contrary to what you may have read on other web sites or seen on television, there actually have been NO instances of uncommanded or unexpected movements….” (Why don’t I find that reassuring?)

But that’s nothing compared to the chilling description of a different weapon’s “software glitch.” “There was nowhere to hide. The rogue gun began firing wildly, spraying high-explosive shells at a rate of 550 a minute, swinging around through 360 degrees like a high-pressure hose.”

And your QA test cases must include getting shot at, electronically hacked or jammed.  One project aims to create a single, impregnable satellite control system for all robots. Once again I’m thinking “not reassured,” like the skeptic who said, “They should just go ahead and call it Skynet.”

I’m not trying to be alarmist. Personally, I advocate cynical dystopian humor, like How To Survive a Robot Uprising. We’re a long way from autonomous robots, but plenty is happening in the meantime. Maybe the takeaway is that military robotics is a bad area to over-promise and under-deliver. I’m more alarmist as a taxpayer when I hear the 60 Minutes report on $1 billion spent on a 2,000-mile “virtual fence” that only covers 28 miles and “sorta works.” In a future blog, I’ll talk about perceptions and realities about how well intelligent computing systems work.

robot

So…which camera should I get?

January 10th, 2010 by Rob Haitani

I geek out on some technology, but couldn’t care less about others. I could tell you more about my friends’ cell phones than I could about their children, but don’t ask me what kind of feline predator OS my Mac is running (Bornean Clouded Leopard?). And I don’t want to invest much effort or money in something new until I’m confident I’ll stick with it. So when people ask me what type of camera to get for Vitamin D Video, I tend to start them off simply.

If you have a laptop with a webcam built in, you’re set. Granted, the first video you see will be your face, which isn’t too useful (unless you want to set up an alert saying, “Dude, get away from my computer.”). But you can set your laptop on your dresser or a chair by the window to see how the software works.

You can also get a cheap USB webcam starting around $20 at any computer store.  USB webcams are easy to set up (just install the software and plug them into a USB port), and so small you can mount them in odd places.  You can also get a 10-foot USB extension cable for about $10.  One tradeoff to keep in mind is that many webcams are optimized for video chat, so the software may not automatically adjust to very bright or dim outdoor lighting.

Another category of cameras called “IP” or “network” cameras (sometimes also called “webcams”) can stream video wirelessly to your PC (though you need to plug them into a power outlet). You can get one for under $100, or if you spend around $200 for a good Axis or Panasonic camera, you may find the video quality and even recognition is better. (The better cameras are usually sold on sites like Amazon rather than computer stores.)  Smaller network cameras can be placed on a window sill, or you can remove the stand and tape them to a window with clear packing tape (I do this so I can easily move them around).

With network cameras, you need to have a good wireless signal where the camera is (or a place to plug a network cable).  Setup is more involved, and is more like getting a printer on your network. To help you out, the Vitamin D Reference Guide has detailed instructions.

Of course, if you are hardcore about cameras, the sky is the limit. You can get cameras with night-vision, with higher resolution or that work outdoors. For more information, you can Google for camera review websites.  Happy monitoring!

Intelligent computing is kind of a big deal

January 3rd, 2010 by Rob Haitani

I’m always intrigued when I hear non-geeks mention my field. Yesterday I was watching CNN’s Your Money, when suddenly economist Diane Swonk says, “We are on a precipice of a new technology revolution…the move from the information to the knowledge age. Smart technologies…are the way of the future….”

Now, people in Silicon Valley are constantly talking up the next Big Deal, so, to paraphrase Sarah Vowell, you tend to get a headache from always rolling your eyes. Also, it’s easy to find support for any Big Deal if you ask people in that industry. But here was the chief economist of a mainstream financial firm, who had never bought into the dotcom hype of the 90s, saying smart technology is kind of a big deal. (Admittedly the “precipice” imagery was somewhat disconcerting, but it was a live interview.)

A few weeks earlier, a review of a book called The Fourth Paradigm showed up in my daily NY Times email. In it, Jim Gray argues for new tools to manage and analyze the flood of data that has transformed scientific research (“It’s the data, stupid”).  This was not an endorsement of intelligent computing per se, but it articulated a significant problem that HTM could address. To a product guy like me, that’s even better, since HTM theory is all about finding complex patterns in masses of data. HTM could be to data what the plow is to soil.  (Pardon the tortured analogy–on a related note, apparently the SAT dropped analogy questions years ago.)

These are of course just two blips on the radar screen, and it’s deceptively easy to hear what you want to hear. But there’s a big difference between the feedback loop of your buddies at the Singularity Conference (or whoever you consider your industry peers), and the needs or opinions expressed by disinterested parties. Keep a discerning ear to the ground, and good luck pursuing your Big Deal.

I'm kind of a big deal - t-shirt