147 Messages

 • 

1.9K Points

Monday, November 8th, 2021 3:03 AM

No Status

-1

Suggestion: Automatically remove keywords when two-thirds of their votes are "not-relevant"

The specifics of how this would work are up for debate, but currently the only way keywords get removed is by users manually putting them through a removal process. The problem is that very few people seem to do this, so many releases end up with a mass of inaccurate or duplicate keywords and limited oversight of their relevancy. @keyword_expert seems to methodically go through keywords, isolate duplicates and genre terms, and do a lot of legwork to remove inaccurate or otherwise flawed keywords, but it seems to me he's fighting an uphill battle.

A solution, as the title proposes, would be to simply have the system automatically remove keywords that get marked as "not relevant" by at least two-thirds of voters, with a minimum threshold of 3 votes (or maybe more). Keywords rejected in this way would effectively be unable to be re-added until they get voted up again. Rejected (or "not-relevant", to use IMDb's terminology) keywords on releases would no longer show up in the keyword search.

This, I believe, would make keywords broadly more accurate over time and encourage more audience participation. It would empower users to deal with duplicate and inaccurate keywords in a much more convenient way.

2.7K Messages

 • 

47K Points

3 years ago

I support this proposal, and to clarify, I believe there should be a minimum number of votes on relevance before the automatic deletion would kick in. I believe that minimum should be 3 votes where all 3 votes are for "not relevant," or a minimum of 4 votes where the votes are mixed (some votes for "not relevant," some votes for "relevant").

To give some examples, keywords would be automatically deleted when marked as "0 of 3 found this relevant," "1 of 4 found this relevant," "2 of 6 found this relevant," "3 of 9 found this relevant," etc.
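To make the rule above concrete, here is a minimal sketch of how I read it: at least two-thirds of the votes are "not relevant," with a minimum of 3 votes when they are unanimous and 4 when they are mixed. The function name and constants are made up for illustration; this is not anything IMDb actually runs.

```python
from fractions import Fraction

# Hypothetical constants, taken from the numbers proposed in this thread.
MIN_UNANIMOUS_VOTES = 3              # every vote is "not relevant"
MIN_MIXED_VOTES = 4                  # some "relevant", some "not relevant"
NOT_RELEVANT_SHARE = Fraction(2, 3)  # two-thirds cutoff from the title


def should_auto_delete(relevant: int, total: int) -> bool:
    """Return True if 'relevant of total found this relevant' meets the rule."""
    if relevant < 0 or total < relevant:
        raise ValueError("need 0 <= relevant <= total")

    # Mixed votes need one more vote than unanimous "not relevant" votes.
    minimum = MIN_UNANIMOUS_VOTES if relevant == 0 else MIN_MIXED_VOTES
    if total < minimum:
        return False

    not_relevant = total - relevant
    # Exact fractions avoid floating-point trouble right at the 2/3 boundary.
    return Fraction(not_relevant, total) >= NOT_RELEVANT_SHARE


if __name__ == "__main__":
    for relevant, total in [(0, 3), (1, 4), (2, 6), (3, 9), (1, 3), (2, 5)]:
        print(f"{relevant} of {total}: {should_auto_delete(relevant, total)}")
```

Under this reading, "1 of 3" stays put even though two-thirds of its votes are "not relevant," because mixed votes need at least 4 votes in total.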

I am far from the only one who cleans up keywords on IMDb, but even though there are others, it would be a good idea to institute some kind of automatic cleanup, like the one described in this post.

(edited)

Champion

 • 

14.4K Messages

 • 

329.9K Points

You have described how spammers use sock accounts to up- or downvote keywords. So is it really a good idea to give more weight to votes?

147 Messages

 • 

1.9K Points

There are ways IMDb could mitigate that issue, if they chose to, even under the current system.

2.7K Messages

 • 

47K Points

@Peter_pbn Sock accounts and spamming are separate problems that absolutely need to be dealt with, but I don't see them as uniquely related to this proposal. If someone wants to go to the trouble to downvote keywords using three different sock accounts, they could just as easily manually delete the keywords via the keyword editing function (using only a single account).

(edited)

Champion

 • 

14.4K Messages

 • 

329.9K Points

Instead of "automatically remove", perhaps "automatically submit for deletion", which (at least theoretically) gives contribution processing a chance to catch vandalism.

2.7K Messages

 • 

47K Points

@Peter_pbn I would support that idea as well. In 99% of cases, a keyword contribution in fact ends up being automatically processed. Your idea would allow for the 1% of cases where something special is going on, potentially including vandalism and bots.

Champion

 • 

14.4K Messages

 • 

329.9K Points

Where did you grab that number from?

2.7K Messages

 • 

47K Points

@Peter_pbn From my experience with keyword contributions. Only a very small number of keyword contributions are actually reviewed by humans. And this is as it should be -- it would be ridiculous for every single keyword edit to be manually reviewed. 

2.7K Messages

 • 

47K Points

@Peter_pbn Another way to corroborate that 99% of all keywords are automatically approved is to check the Data Processing Times webpage, especially on weekends. This webpage typically shows a keyword backlog of only a couple hundred keywords. Those are the keyword contributions that have been pre-flagged for human review, such as keywords that include certain sensitive words within them (e.g., "rape"), sometimes newly created keywords, and keyword editing done by new users. The number of pending keywords displayed on that webpage will grow on the weekends, because that's when IMDb staff are typically not working. And it's also on the weekends when the automated approval of 99% of keyword contributions is most readily observable -- again, because IMDb staff are not working, and yet the keyword contributions are nevertheless being "approved." 

Lately, it can take a lot longer for keywords to go through the automatic approval process -- it used to be about 20 minutes, but these days it can be more like 24 hours -- but the vast majority of keywords continue to be automatically approved (which, again, is as it should be).

(edited)

147 Messages

 • 

1.9K Points

3 years ago

I've added a note about a minimum threshold of votes. It could be 3. It could be 5. It's whatever. But obviously just 1 or 2 would be too small lol.

147 Messages

 • 

1.9K Points

3 years ago

Also, under this idea, keywords that go into the negative wouldn't necessarily be deleted from the page entirely. They just would no longer show up in keyword searches, and would be displayed at the bottom of the page. This would essentially mean that users could upvote downvoted keywords if they disagree.
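A rough sketch of that "demote, don't delete" variant, in case it helps the discussion. Everything here (the `Keyword` class, `is_demoted`, the 3-vote minimum and two-thirds cutoff) is a hypothetical illustration, not how IMDb stores or ranks keywords.

```python
from dataclasses import dataclass

MIN_VOTES = 3  # assumed minimum number of votes before demotion can apply


@dataclass
class Keyword:
    text: str
    relevant: int = 0
    not_relevant: int = 0

    @property
    def total(self) -> int:
        return self.relevant + self.not_relevant


def is_demoted(kw: Keyword) -> bool:
    """Demoted when at least two-thirds of MIN_VOTES or more votes are 'not relevant'."""
    return kw.total >= MIN_VOTES and 3 * kw.not_relevant >= 2 * kw.total


def searchable(keywords: list[Keyword]) -> list[str]:
    """Keywords that would still show up in keyword search."""
    return [kw.text for kw in keywords if not is_demoted(kw)]


def page_order(keywords: list[Keyword]) -> list[Keyword]:
    """Keep every keyword on the page, but list demoted ones at the bottom."""
    active = [kw for kw in keywords if not is_demoted(kw)]
    demoted = [kw for kw in keywords if is_demoted(kw)]
    return active + demoted
```

Because nothing is deleted, the state is recomputed from the current votes: if enough people upvote a demoted keyword, `is_demoted` goes back to false and it reappears in search without anyone having to re-add it.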

2.7K Messages

 • 

47K Points

@Skavau If a keyword has 3 of 3 not relevant votes (i.e., 0 of 3 relevant votes), I would be completely okay with that keyword being deleted from the database altogether.

147 Messages

 • 

1.9K Points

@keyword_expert Well, I suggested that to mitigate the fears people have about an accurate keyword being removed entirely on a small number of votes.

It also seems to me that if it's removed entirely, someone could just re-add it, and we'd get a cycle of people adding it and then it being downvoted and consequently deleted?

2.7K Messages

 • 

47K Points

@Skavau In my experience, it's pretty rare for a deleted keyword to be added back in. There are exceptions -- particularly trolling, strong-willed users, and situations where the deletions themselves occurred through trolling or sabotage -- but when a bad keyword is deleted, it typically doesn't show up again.