I'm trying to figure out the best way to recommend images based on past classifications using k-means clusters. So far I have mapped the RGB values of a set of images, performed a k-means cluster analysis on those RGB values, and attached a "rating" to each image. This has created Voronoi cells similar to this graph. I've stored the cluster centers and ratings in my "training set".
The next step is to take a new image and make a recommendation based on the previous images in the training set. I'm not sure how to proceed. Would I want to implement a collaborative filtering process? Or do I need to perform more processing on the data?
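To make the training step concrete, here is a minimal sketch of it in plain Python rather than Spark. The pixel values, the `kmeans` helper, and the `training_record` layout are all made up for illustration; the real project would presumably use Spark's own k-means on actual image data.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; a stand-in for Spark's k-means, not its API."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k distinct points as initial centers
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[idx].append(p)
        # Update step: move each center to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers

# Hypothetical pixels from one image: three reddish, three bluish.
pixels = [
    (250, 10, 10), (240, 20, 15), (245, 5, 20),
    (10, 10, 250), (20, 15, 240), (5, 20, 245),
]

centers = kmeans(pixels, k=2)

# One "training set" entry: the image's cluster centers plus its rating.
training_record = {"centers": centers, "rating": "yes"}
```

Each stored record then holds only the centers and the rating, which is what a new image would be compared against.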
Not sure if it matters but I'm using Apache Spark for the project. Thanks!
Edit: Collaborative filtering is probably not the best way to proceed, since the features being compared for products use more than just ratings. I need to compare the similarity of the images themselves. I'm guessing this would involve heavy matrix operations?
Edit2: Some feedback here would be awesome. What I'm thinking is training on two datasets (images rated "yes" vs. images rated "no") and then using Spark's computeCost function to develop a value for the variance/bias of an image being compared. The final step would then check whether the image is more similar to the "yes" dataset or the "no" dataset, and make a final recommendation. I'm new to machine learning, so I could be over-thinking it.
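A rough sketch of that "yes" vs. "no" comparison, again in plain Python rather than Spark. It assumes, as I understand it, that computeCost on a k-means model returns the sum of squared distances from each point to its nearest cluster center; the `compute_cost` function below only imitates that behaviour and is not Spark's API. The centers and the new image's pixels are invented for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def compute_cost(pixels, centers):
    """Sum of squared distances from each pixel to its nearest center.

    Imitates what computeCost is assumed to return; not the Spark call itself.
    """
    return sum(min(sq_dist(p, c) for c in centers) for p in pixels)

def recommend(pixels, yes_centers, no_centers):
    """Recommend "yes" if the image fits the "yes" clusters more tightly."""
    yes_cost = compute_cost(pixels, yes_centers)
    no_cost = compute_cost(pixels, no_centers)
    return "yes" if yes_cost < no_cost else "no"

# Hypothetical cluster centers learned from the two rated datasets.
yes_centers = [(245.0, 12.0, 15.0), (200.0, 60.0, 40.0)]
no_centers = [(12.0, 15.0, 245.0), (40.0, 60.0, 200.0)]

# Hypothetical pixels from a new, mostly red image.
new_image = [(230, 30, 25), (210, 50, 45), (250, 5, 5)]

result = recommend(new_image, yes_centers, no_centers)
```

A lower cost means the image's pixels sit closer to that dataset's centers, so the comparison reduces to picking the smaller of two numbers. Whether raw RGB distance is a good enough notion of similarity for a recommendation is a separate question.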