Let’s admit something: star ratings aren’t that helpful. We’ve all eaten at that terrible restaurant with four stars on Yelp and that delicious corner joint with two. Lately I’ve been very curious about more effective metrics, and I’ve been testing a few different ideas. Today is part one of a series where I’ll explore the potential of some of these ideas.
The end goal of this series is to define a system for finding restaurants where I’m most likely to have an enjoyable or very enjoyable meal in any given area.
When it comes to designing measurements, you always have to make choices, and I’ve decided that consistency should matter.
Consider the following two restaurants.
Restaurant 1 has two chefs who alternate days. Chef 1 is terrible. Expect food poisoning. Chef 2 is amazing. Unsurprisingly, this restaurant’s 10 Yelp reviews are evenly divided between 1-star and 5-star reviews.
Restaurant 2 has only one chef. He’s entirely adequate, and consistently so. His food will never wow you, but it will never make you sick. Every review of his restaurant gives it exactly 3 stars.
You have the option to go to either restaurant one night, but the only data you have available is their Yelp rating. Both restaurants are rated at a 3: (3*10)/10 for Restaurant 2 and ([1*5]+[5*5])/10 for Restaurant 1. You choose the restaurant where the top reviews read “amazing” and wind up with food poisoning.
Scenarios like this actually happen (OK, maybe a little less extreme than this).
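To make the tie concrete, here is a quick sketch of the simple average for both restaurants. The variable names are mine, and the review lists are just the ten ratings described above written out.

```python
# Ten reviews each, as described in the two-restaurant scenario.
restaurant_1 = [1] * 5 + [5] * 5   # half 1-star, half 5-star
restaurant_2 = [3] * 10            # every review is 3 stars

mean_1 = sum(restaurant_1) / len(restaurant_1)
mean_2 = sum(restaurant_2) / len(restaurant_2)

print(mean_1, mean_2)  # 3.0 3.0
```

The plain average can’t tell these two apart, even though a night at Restaurant 1 is a coin flip.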
So, we’re going to account for it with some middle-school-style math inspired by Reddit. Reddit’s ranking uses a logarithmic scale for votes to minimize the impact of outsized numbers on the system.
Rather than assign star ratings based on a simple average of star reviews, we’re going to assign each review the value of LOG(StarNumber + 1), using a base-10 log (I added the constant because even 1-star reviews can have redeeming qualities). For now we’ll take the average of these figures and multiply by five.
Under this new rating system, it’s easy to see which restaurant you should choose. Restaurant 1 comes in with a score of 2.698. Restaurant 2 comes in with a score of 3.01.
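If you want to check those numbers yourself, here is a minimal sketch of the scoring. I’m assuming LOG means a base-10 log, since that is the base that reproduces the two scores above; the function name `log_score` is just something I made up for illustration.

```python
import math

def log_score(stars):
    """Average log10(star + 1) over all reviews, then multiply by five."""
    return 5 * sum(math.log10(s + 1) for s in stars) / len(stars)

restaurant_1 = [1] * 5 + [5] * 5   # half 1-star, half 5-star
restaurant_2 = [3] * 10            # every review is 3 stars

print(round(log_score(restaurant_1), 3))  # 2.698
print(round(log_score(restaurant_2), 3))  # 3.01
```

Because the log curve flattens out as the star count rises, a 5-star review can’t fully make up for a 1-star review, so the inconsistent restaurant drops below the steady one.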
Let’s say Chef 1’s slightly-less-terrible sister works as the chef 2/3 of the time at another restaurant, Restaurant 3. Chef 2’s also-amazing sister works the other 1/3 of the time, and their nine Yelp ratings include 6 two-star reviews and 3 five-star reviews.
Under our system, Restaurant 3 achieves a score of 2.887, reflecting the slightly lower risk of getting poisoned compared to Restaurant 1.
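Here is the same calculation for this third restaurant, sketched from a star-count tally instead of a list of individual reviews. Again I’m assuming a base-10 log, and the helper name `log_score_from_counts` is hypothetical.

```python
import math

def log_score_from_counts(counts):
    """counts maps a star value to the number of reviews with that value."""
    total_reviews = sum(counts.values())
    weighted = sum(n * math.log10(stars + 1) for stars, n in counts.items())
    return 5 * weighted / total_reviews

restaurant_3 = {2: 6, 5: 3}  # 6 two-star reviews, 3 five-star reviews

simple_mean = sum(stars * n for stars, n in restaurant_3.items()) / sum(restaurant_3.values())

print(simple_mean)                                    # 3.0
print(round(log_score_from_counts(restaurant_3), 3))  # 2.887
```

Note that the simple average here is also exactly 3, so all three restaurants look identical on Yelp while the log-based score ranks them by how consistent they are.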
It’s not a perfect system, but it improves on the simple average by a lot (assuming that it’s better not to be poisoned 50% of the time).
