So far on my quest to create better restaurant metrics I’ve tackled two issues: better weighting of reviews by star count, and accounting for the confidence level of any rating. There’s a bigger issue than that though: Whose reviews should we trust, and how much should we trust them (i.e., how heavily should we weight them)?
This is a really confusing question, because the answer varies from person to person and region to region. As I’ve learned after more than a year in the midwest, midwesterners and northeasterners have different tastes. Even within this region there’s incredible variation: taste buds differ immensely between Hyde Park, Peoria, and the Loop. On top of that there’s a lot of review fraud, with restaurant owners buying good reviews for their own establishments and bad reviews for their competitors.
At first I thought of using a Flesch-Kincaid grade-level algorithm to give more weight to better-written reviews. That seems pretty elitist though, and I’m not sure there’s any real correlation, past a certain point, between a review’s quality and the grade level it’s written at.
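For reference, the Flesch-Kincaid grade level is just a weighted ratio of words per sentence and syllables per word. Here’s a rough sketch of how you might score a review with it; the syllable counter is a crude vowel-group heuristic I made up for illustration, not part of the official formula:

```python
import re

def count_syllables(word):
    # Crude heuristic: count vowel groups, with a silent-e adjustment.
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_kincaid_grade(text):
    # FKGL = 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)
```

Even with a perfect syllable counter, though, this only tells you how elaborately a review is written, not whether its opinion is worth anything.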
Lately I’ve come to the conclusion that this should probably be individualized, and I’ve been desperately trying to figure out how this could be done. I’m still far from a solution, but I have some ideas.
I’ve read a lot in recent weeks about Bayes filters (for email spam). These comb datasets of spam and non-spam emails to develop individual probabilities that particular words and phrases appear in a spam email. I feel like this could be applicable in our situation too, and for more than just fraud detection.
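The core of those filters is simpler than it sounds. A minimal sketch of the per-word statistic (a simplified version of the approach popularized for spam filtering, with my own hypothetical function names) looks like this:

```python
from collections import Counter

def word_spam_probabilities(spam_docs, ham_docs):
    """For each word, estimate P(spam | word) by comparing how often
    the word appears in the spam corpus vs. the non-spam corpus."""
    spam_counts = Counter(w for d in spam_docs for w in d.lower().split())
    ham_counts = Counter(w for d in ham_docs for w in d.lower().split())
    n_spam, n_ham = len(spam_docs), len(ham_docs)
    probs = {}
    for word in set(spam_counts) | set(ham_counts):
        s = spam_counts[word] / n_spam  # rate in spam emails
        h = ham_counts[word] / n_ham    # rate in non-spam emails
        probs[word] = s / (s + h)
    return probs
```

Swap “spam” and “non-spam” for “reviews our user would agree with” and “reviews they wouldn’t,” and the same machinery starts to look useful for taste-matching.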
What I’m envisioning is this:
- Users input seven restaurants they love, and seven they hate.
- Our code scrapes the reviews for these restaurants, and divides them into four datasets:
- 1-3 star reviews at places the user loves
- 4-5 star reviews at places the user hates
- 1-3 star reviews at places the user hates
- 4-5 star reviews at places the user loves
- Using the scraped reviews, we can create individual probabilities that certain words would be used in reviews the user would agree or disagree with. With these probabilities, we could infer for any restaurant (even outside our dataset) whether our user is more or less likely to like it than the average reviewer.
- We can weight individual reviews based on the likelihood that they have similar tastes to our user, and create individual probabilities that our user will like a certain restaurant.
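The steps above can be sketched in code. This is only a toy version under a lot of assumptions: the `(restaurant, stars, text)` review schema is hypothetical, and I’m using a basic naive Bayes word model with add-one smoothing in place of whatever filter variant would actually work best:

```python
import math
from collections import Counter

def build_corpora(reviews, loved, hated):
    """Split reviews into the buckets above: 'agree' holds reviews that
    match our user's opinion of a restaurant, 'disagree' holds the rest.
    `reviews` is a list of (restaurant, stars, text) tuples (hypothetical
    schema); `loved` and `hated` are sets of restaurant names."""
    agree, disagree = [], []
    for restaurant, stars, text in reviews:
        if restaurant in loved:
            (agree if stars >= 4 else disagree).append(text)
        elif restaurant in hated:
            (agree if stars <= 3 else disagree).append(text)
    return agree, disagree

def taste_score(review_text, agree, disagree):
    """Log-odds that a review was written by someone with tastes similar
    to our user's, via naive Bayes with add-one smoothing. Positive means
    similar tastes; we could use this score to weight the review."""
    a_counts = Counter(w for d in agree for w in d.lower().split())
    d_counts = Counter(w for d in disagree for w in d.lower().split())
    a_total, d_total = sum(a_counts.values()), sum(d_counts.values())
    vocab = len(set(a_counts) | set(d_counts))
    score = 0.0
    for w in review_text.lower().split():
        p_a = (a_counts[w] + 1) / (a_total + vocab)
        p_d = (d_counts[w] + 1) / (d_total + vocab)
        score += math.log(p_a / p_d)
    return score
```

The nice property is the last step: once every review at any restaurant gets a taste score, turning those into a weighted average rating (or a probability our user will like the place) is straightforward, even for restaurants nowhere near the original fourteen.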
A lot has been done using Bayes to filter out spam. I think it can do more than that though.
I still have a lot to learn about different filter types, but this seems like a really interesting way to use data to personalize restaurant reviews.