Clustering and Classification are two common Machine Learning methods for recognizing patterns in data. Lucid Thoughts explains what they are and the differences between them.
So we meet again. Let’s be real, it’s really easy to think clustering and classification are the same thing. I mean, everybody talks about ’em like they are.
Trust me, they’re not.
Both clustering and classification are types of machine learning, but they work in very different ways. And each can have a big impact on your business.
Let’s start with classification.
Classification is a supervised form of learning, where you teach the computer to do something with data that’s already labeled by humans. This training set includes a fixed number of labels or categories for the computer to learn from. By spotting patterns in the training data, the machine can classify new data into those predetermined categories. So that’s classification.
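To make that concrete, here is a minimal sketch of one simple classification technique, a nearest-centroid classifier, in plain Python. It is an illustration only, not a method named in the video. The features (pages viewed, minutes on site), the label names, and the functions `train_centroids` and `classify` are all hypothetical.

```python
from math import dist


def train_centroids(samples, labels):
    """Average the feature vectors for each label, giving one centroid per label."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(centroids, features):
    """Return the label whose centroid is closest to the new data point."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training set labeled by a human: (pages viewed, minutes on site).
training_samples = [(2, 1), (3, 2), (1, 1), (12, 15), (10, 11), (14, 13)]
training_labels = [
    "browsing", "browsing", "browsing",
    "likely customer", "likely customer", "likely customer",
]

centroids = train_centroids(training_samples, training_labels)

# A new, unlabeled visitor gets sorted into one of the predetermined categories.
prediction = classify(centroids, (11, 12))
print(prediction)
```

The important part is that the labels come first: the machine only ever picks from the categories it was shown in the training set.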
Now, how is that different from clustering?
Clustering is a form of unsupervised learning. No training sets, no labels.
Think back to when you were a mere 12 years old with that new piece of graph paper in front of you, full of possibilities. Now picture your data plotted as points on that paper.
Depending on which data characteristics are important, some of these points will hang out in the same areas. These are clusters. They tell us that the pieces of data are similar based on the parameters we set for the computer. Now, imagine this on a much larger scale. Oh, wow!
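Here is a small sketch of how a machine might find those clusters with no labels at all, using k-means, one common clustering algorithm. The video does not name a specific algorithm, so treat this as an assumption. The points, the choice of two clusters, and the function `k_means` are hypothetical, and the starting centers are simply the first `k` points so the result is repeatable.

```python
from math import dist


def k_means(points, k, iterations=10):
    """Group points into k clusters by repeatedly moving each center to the
    average of the points closest to it."""
    # Assumption: start from the first k points instead of random centers.
    centers = [list(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assign every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(centers[i], p))
            clusters[nearest].append(p)
        # Move each center to the mean of the points assigned to it.
        for i, members in enumerate(clusters):
            if members:
                centers[i] = [sum(axis) / len(members) for axis in zip(*members)]
    return clusters, centers


# Hypothetical unlabeled points, like dots on a sheet of graph paper.
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]

clusters, centers = k_means(points, k=2)
for members in clusters:
    print(members)
```

Notice there is no training set and no labels here. The only things we set are the number of clusters and how distance is measured, and the groups fall out of the data.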
So these two methods are basically different ways to teach a machine to organize new data, and both have a place in business today.
In retail, for example, you can use data to see where customers have looked on your website and whether or not they made a purchase.
Using this information, you can label visitors as likely customers or just browsing customers. Boom! Now the machine knows how to classify customers versus window shoppers. Once you know who’s ready to drop some cash, you can make recommendations.
Start by letting the computer find customers with similar purchase patterns. Once this cluster is created, you can use it to suggest other products those customers are enjoying. Labels, categories, similarities… they’re how we humans organize all the things we see in a day. And they’re how machines are gonna change the way we see things in the future. So, till next time.