Wednesday, November 3rd, 2021 3:37 PM

API Improvement needed for de-skewing ratings

I want to develop an algorithm that de-skews IMDb rating histories. My hypothesis is that ratings for a series' early episodes are skewed downward by viewers who check out the show, decide it's not to their liking, and stop watching (see the sample show graph below). What remains for later episodes is the average rating of "loyal viewers." The problem is that the published per-episode averages mix these two groups, so if, say, one wants to determine the most-liked episode of a series among people who have watched most of it, this becomes impossible.

This would become possible if IMDb were to release the detailed ratings of a show, grouped by user and de-identified, of course. Conceptually, it could look something like this:

User 1: {S1E1: 5, S1E2: 6}
User 2: {S1E1: 8, S1E2: 9, S1E3: 9}


... and so on.
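To make the idea concrete, here is a rough Python sketch of what I would do with data in that shape. Everything in it is an assumption for illustration: the user keys, the third user, the `loyal_viewer_averages` name, and the 75% cutoff for counting someone as a "loyal viewer" are all made up, and no such IMDb API exists today.

```python
from statistics import mean

# Hypothetical de-identified data in the proposed shape:
# one {episode: rating} mapping per anonymous user.
ratings_by_user = {
    "user_1": {"S1E1": 5, "S1E2": 6},
    "user_2": {"S1E1": 8, "S1E2": 9, "S1E3": 9},
    "user_3": {"S1E1": 7, "S1E2": 8, "S1E3": 10},  # invented extra user
}


def loyal_viewer_averages(ratings_by_user, min_fraction=0.75):
    """Average each episode's rating over users who rated at least
    min_fraction of all episodes that appear in the data.

    min_fraction is an arbitrary cutoff chosen for this sketch.
    """
    # Every episode that any user has rated.
    all_episodes = {ep for user in ratings_by_user.values() for ep in user}
    threshold = min_fraction * len(all_episodes)

    # Keep only users who rated enough of the series.
    loyal = [user for user in ratings_by_user.values() if len(user) >= threshold]

    averages = {}
    for episode in all_episodes:
        scores = [user[episode] for user in loyal if episode in user]
        if scores:  # skip episodes no loyal viewer rated
            averages[episode] = mean(scores)
    return averages


loyal_only = loyal_viewer_averages(ratings_by_user)
everyone = loyal_viewer_averages(ratings_by_user, min_fraction=0.0)
```

Comparing `loyal_only` with `everyone` for each episode would show how much of an early episode's lower score comes from viewers who dropped out, which is exactly what cannot be computed from per-episode averages alone.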

A sample show that seems to exhibit this self-selecting bias phenomenon is "Peep Show" as graphed at ratingraph.com:
