Thursday, April 25, 2013

Learning About Learning

I started Andrew Ng's Coursera class on Machine Learning this week. It's fun so far, and I've learned some new terminology to help frame my goals. There are two domains in which I aim to use ML: 1) learning to associate colors with words through an expansive database, and 2) learning to recommend a color given a text prompt. Here's a tidy definition of ML offered by Ng, and how I think it can be applied to both of my domains:
"A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E." - Tom Mitchell, 1998
1) In the case of Neko learning from datasets:

   E is the collocation of colors and words in a database.
   T is the clustering and re-clustering of colors with words.
   P is the score of the clusters (how well-sorted they are).


An example of k-means clustering
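
To make Case 1 concrete for myself, here is a minimal sketch of k-means in plain Python. The word/color pairs are made up for illustration (they are not from any real dataset), and the function names `kmeans`, `sq_dist`, and `score` are my own. The score here is the within-cluster sum of squared distances, which is one possible choice for P; lower means tighter clusters.

```python
import random

# Hypothetical word -> RGB pairs standing in for the database (experience E).
samples = {
    "cherry": (250, 10, 10),
    "brick": (230, 30, 20),
    "rose": (240, 20, 40),
    "ocean": (10, 20, 250),
    "navy": (30, 10, 230),
    "sky": (20, 40, 240),
}

def sq_dist(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Task T: alternate between assigning points and moving centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the data points
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        centroids = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

def score(centroids, clusters):
    """Measure P: total squared distance from points to their centroid."""
    return sum(sq_dist(p, c) for c, ps in zip(centroids, clusters) for p in ps)

points = list(samples.values())
centroids, clusters = kmeans(points, k=2)

word_of = {rgb: word for word, rgb in samples.items()}
for cluster in clusters:
    print(sorted(word_of[p] for p in cluster))
print("score:", score(centroids, clusters))
```

With only six points and two obvious groups this is a toy, but the E/T/P roles line up with the list above.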

2) In the case of Neko learning from people:

   E is testing colors on different individuals.
   T is returning a color, given some text.
   P is the number of well-liked colors.


An example of a support vector machine
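
For Case 2, here is a rough sketch of a linear support vector machine in plain Python, trained by subgradient descent on the hinge loss. That is a simplification I chose so the example has no dependencies; it is not the kernelized solver a library would use. The data is invented: a hypothetical person who likes warm colors (+1) and dislikes cool ones (-1), with RGB scaled to [0, 1]. The names `train_linear_svm` and `predict` and the hyperparameter values are my own assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * |w|^2 + hinge loss with per-sample subgradient steps."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # A point violates the margin if y * (w.x + b) < 1.
            violated = yi * (dot(w, xi) + b) < 1
            # Regularization step: shrink the weights slightly.
            w = [wj * (1 - lr * lam) for wj in w]
            if violated:
                # Hinge-loss step: push the boundary away from this point.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    """Return +1 (predicted liked) or -1 (predicted disliked)."""
    return 1 if dot(w, x) + b >= 0 else -1

# Hypothetical experience E: colors shown to one person, with their reaction.
X = [
    (1.0, 0.2, 0.1), (0.9, 0.5, 0.1), (0.8, 0.3, 0.2),  # warm, liked
    (0.1, 0.2, 0.9), (0.2, 0.5, 1.0), (0.1, 0.4, 0.8),  # cool, disliked
]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)

# One way to read measure P: how many reactions the model gets right.
correct = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y))
print("correct on training colors:", correct, "of", len(X))
```

This only predicts liked versus disliked for a color; going from text to a color would need features for the text as well, which this sketch leaves out.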

In ML terms, Case 1 is unsupervised clustering and Case 2 is supervised classification. K-means is a likely algorithm for the former, and a support vector machine for the latter. Because order is meaningful (orange is closer to red than yellow is), color looks more like a regression problem with continuous-valued output. But there is a sense in which colors are discrete as well, so that's what I'm mulling over now.
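
Here is a small sketch of the two views side by side, using Python's standard `colorsys` module. Treating hue as a position on a circle gives the continuous view; snapping to the nearest named color gives the discrete one. The `NAMED` table and the helper names are my own, and measuring distance by hue alone is an assumption that ignores saturation and brightness.

```python
import colorsys

def hue(rgb):
    """Hue in [0, 1) for an RGB triple with components in [0, 1]."""
    return colorsys.rgb_to_hsv(*rgb)[0]

def hue_distance(a, b):
    """Continuous view: distance around the hue circle (wraps at 1.0)."""
    d = abs(hue(a) - hue(b))
    return min(d, 1.0 - d)

# Hypothetical palette of named colors for the discrete view.
NAMED = {
    "red": (1.0, 0.0, 0.0),
    "orange": (1.0, 0.5, 0.0),
    "yellow": (1.0, 1.0, 0.0),
    "green": (0.0, 1.0, 0.0),
    "blue": (0.0, 0.0, 1.0),
}

def nearest_name(rgb):
    """Discrete view: snap a color to the closest named hue."""
    return min(NAMED, key=lambda name: hue_distance(rgb, NAMED[name]))

red, orange, yellow = NAMED["red"], NAMED["orange"], NAMED["yellow"]
print(hue_distance(orange, red) < hue_distance(yellow, red))
print(nearest_name((1.0, 0.4, 0.0)))
```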
