keyword_expert's profile

2.7K Messages

 • 

47K Points

Sunday, May 22nd, 2022 2:20 PM

Solved

Duplicate Keywords - List #24 (Proposals for Permanent Merger and Auto-Conversion) (real life people keywords)

Here is the next installment of my lists of proposed keywords for permanent merger and auto-conversion. 

I am posting this for fellow contributors to review first and raise any objections or questions. I will wait at least seven days before changing this post to a "problem" post and asking IMDb staff to make the proposed changes.

The mergers and auto-conversions should be made in the direction of the arrows.

Duplicate Keywords Proposed for Permanent Merging and Auto-Conversion

beethoven (84 titles) --> ludwig-van-beethoven (23 titles) --> reference-to-ludwig-von-beethoven (15 titles) --> beethoven-reference (3 titles) --> reference-to-ludwig-van-beethoven (323 titles)

churchill (6 titles) --> winston-churchill (65 titles) --> reference-to-churchill (8 titles) --> reference-to-winston-churchill (472 titles)

elvis (52 titles) --> elvis-presley (174 titles) --> reference-to-elvis (37 titles) --> reference-to-elvis-presley (684 titles)

franklin-d.-roosevelt (2 titles) --> reference-to-franklin-d-roosevelt (9 titles) --> reference-to-franklin.d.roosvelt (1 title) --> reference-to-franklin-delano-roosevelt (25 titles) --> reference-to-franklin-delano--roosevelt (7 titles) --> reference-to-president-frankin-d-roosevelt (4 titles) --> reference-to-fdr (2 titles) --> reference-to-franklin-d.-roosevelt (377 titles)

hitler (45 titles) --> adolph-hitler (8 titles) --> reference-to-adolph-hitler (23 titles) --> reference-to-adolf-hitler (2085 titles)

jfk (21 titles) -->  john-f.-kennedy (16 titles) -->  john-f-kennedy (3 titles) -->  reference-to-john-fitzgerald-kennedy (17 titles) --> reference-to-john-f.-kennedy (504 titles)

lenin (38 titles) -->  vladimir-lenin (5 titles) -->  reference-to-lenin (159 titles) --> reference-to-vladimir-lenin (94 titles)

lyndon-b.-johnson (4 titles) --> lyndon-johnson (3 titles) --> reference-to-lyndon-johnson (117 titles) --> reference-to-lyndon-b-johnson (12 titles) --> reference-to-lyndon-baines-johnson (2 titles) --> reference-to-lbj (1 title) --> reference-to-lyndon-b.-johnson (33 titles)

mussolini (22 titles) -->  benito-mussolini (22 titles) -->  reference-to-benito-mussolini (267 titles)

princess-diana (59 titles)  -->  reference-to-princess-diana (255 titles)

putin (34 titles) --> vladimir-putin (32 titles) --> reference-to-vladimir-vladimirovich-putin (8 titles) --> reference-to-vladimir-putin (544 titles)

saddam-hussein (137 titles) --> reference-to-saddam-hussein (209 titles)

stalin (84 titles) --> josef-stalin (78 titles) -->  joseph-stalin (13 titles) --> reference-to-josef-stalin (379 titles) --> reference-to-joseph-stalin (149 titles)

the-beatles (205 titles)  -->  beatles (42 titles) -->  reference-to-the-beatles (715 titles)

theodore-roosevelt (41 titles) --> reference-to-teddy-roosevelt (10 titles) --> reference-to-theodore-roosevelt (174 titles)

trump-presidency (82 titles) --> donald-trump-presidency (383 titles)
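
For anyone who wants to sanity-check a chain before it goes to staff, below is a minimal Python sketch of one way to read the arrow notation above. It assumes that every keyword in a chain is ultimately auto-converted to the last keyword on the line, and that the "(N titles)" counts are optional. The names `parse_chain` and `conversion_map` are hypothetical helpers invented for illustration; nothing here reflects how IMDb staff actually perform the merge.

```python
import re

# One chain entry: a keyword, optionally followed by "(N titles)" or "(1 title)".
ENTRY = re.compile(r"\s*(\S+)\s*(?:\((\d+) titles?\))?\s*")

def parse_chain(line):
    """Split 'a (3 titles) --> b (5 titles)' into [('a', 3), ('b', 5)].

    The count is None when the entry has no '(N titles)' suffix.
    """
    entries = []
    for part in line.split("-->"):
        match = ENTRY.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognized chain entry: {part!r}")
        keyword, count = match.group(1), match.group(2)
        entries.append((keyword, int(count) if count else None))
    return entries

def conversion_map(line):
    """Map every keyword except the last to the last one (assumed merge target)."""
    entries = parse_chain(line)
    target = entries[-1][0]
    return {keyword: target for keyword, _ in entries[:-1]}

# Example taken from the list above.
chain = ("churchill (6 titles) --> winston-churchill (65 titles) --> "
         "reference-to-churchill (8 titles) --> "
         "reference-to-winston-churchill (472 titles)")
for source, target in conversion_map(chain).items():
    print(f"{source} -> {target}")
```

Summing the counts in a chain only gives an upper bound on the merged total, since a single title can carry more than one of the listed keywords.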

Accepted Solution

Employee

 • 

5K Messages

 • 

53.3K Points

2 years ago

Hi keyword_expert-

 

All the keywords on the list have been merged and auto-converted.

 

Cheers!

2.7K Messages

 • 

47K Points

@Bethanny​ Thank you!!

1.3K Messages

 • 

23.1K Points

2 years ago

Standalone proper names are, of course, not intended to be keywords, but they must be audited carefully.  IF the person appears in archive footage, they should be added to the cast listing.  If they are just talked about or appear on-screen in a picture or drawing or text, the reference-to- keyword is appropriate.

Just because a standalone proper name now appears as an incorrect keyword, one should not assume that it is automatically a reference-to- situation.

P.S. Of late, I have been having difficulty getting the staff to accept a cast addition of someone who appears in archive footage. (And I am not talking about someone in a title that already exists on IMDb, which should be, according to guidelines, eliminated.) I don't know why there has been such hesitancy in adding people who appear in legitimate archive footage to the cast listing.

Just one title of many recent examples of these kinds of rejections:  (I have seen this film.  Guess IMDb hasn't.)

220412-155307-133000
2022-04-12 15:53:07 Red Rocket (2021)
Cast -  1 credit added
220412-155203-274000
2022-04-12 15:52:03 Red Rocket (2021)
Cast -  1 credit added

(edited)

2.7K Messages

 • 

47K Points

@bradley_kent​ Under all those scenarios, it would be perfectly appropriate to have "reference-to-" keywords in addition to the other edits or cast entries you mention.

In other words, whether these real persons are part of the cast, shown in archive footage, featured in the form of a character, spoofed, shown by name in text, shown in a photo, or spoken about by a character, it would be perfectly fine to add a "reference-to-" keyword for that person, in addition to any other keywords or cast changes deemed appropriate for the particular situation. 

Thus, the only "auditing" to do for any of these keywords would be to determine which additional edits might be made, not whether or not to merge these keywords into "reference-to-" keywords. 

(edited)

1.3K Messages

 • 

23.1K Points

This, again, would result in a lot of unnecessary duplication. In any Biography or History title, for example, characters listed in the cast (resulting in the "-character" addendum as a keyword) could potentially run into hundreds of "reference-to-" keywords. What a mess that would be.

"Yankee Doodle Dandy" (1942) george-m.-cohan-character is correct, but reference-to-george-m.-cohan is just duplicitous.

(edited)

2.7K Messages

 • 

47K Points

@bradley_kent​ But the keywords in your examples are not actually "duplicitous" (I would say "duplicative"), because "reference-to-george-m.-cohan" and "george-m.-cohan-character" are two different keywords with two different meanings. It is perfectly fine to add both of them to titles where both apply. This also facilitates keyword searches, so that when users include the more prevalent "reference-to-" keywords in their searches, they will still pick up the relevant titles.

As a matter of fact, names of "real life people" is one of the rare circumstances where the IMDb guidelines expressly advise contributors to use "reference-to-" prefixes (which should otherwise be applied sparingly). This guidance is in the "reference-to-marilyn-monroe" example in the guidelines linked below.

https://help.imdb.com/article/contribution/titles/keywords/GXQ22G5Y72TH8MJ5#unaccept

1.3K Messages

 • 

23.1K Points

But, if you open up this can of worms for these duplications, they will run into hundreds of thousands, even millions of "duplicative" keywords. A search for george-m-cohan would give you all the -character and reference-to- and all the other "relevant descriptive signifiers." Why repeat information that is already easily searchable? With this attitude, keywords just become an ocean of synonyms signifying everything, yet nothing.

I thought this was something that you were trying to "clear up" with your keyword mergers.  Guidelines are guidelines, and they have limitations.

P.S. I wager that if a name has a -character keyword, THAT same name is automatically referred to in the same title's content. The reference-to- keyword should signify people who DO NOT have the -character keyword.

(edited)

Champion

 • 

14.1K Messages

 • 

326.7K Points

@bradley_kent​ 

Whether the suggested edits are made or not, no one is stopping you from editing these keywords later and changing reference-to- to -character etc.

1.3K Messages

 • 

23.1K Points

Or, vice versa.  I am just trying to reduce the workload by using guidelines that are factual and sustainable and, in some cases, already well established.

2.7K Messages

 • 

47K Points

@Peter_pbn

"@bradley_kent Whether the suggested edits are made or not, no one is stopping you from editing these keywords later and changing reference-to- to -character etc."

I would hope Mr. Kent would add other keywords, rather than changing the "reference-to-" keywords to other keywords. As I have tried to explain in this thread, it is totally fine to use "reference-to-" keywords alongside "-character" keywords. If instead a "reference-to-" keyword is converted to a "-character" keyword, then the "reference-to-" keyword is lost, which is a bad thing.

(edited)

1.3K Messages

 • 

23.1K Points

It is not lost. If a proper name keyword has the -character addendum, the proper name is there, and does not also need the reference-to- keyword. The reference-to- keyword, then, is just redundant and unnecessary in that case. Enter the proper name as a keyword, and you get everything.

2.7K Messages

 • 

47K Points

@bradley_kent​  When a relevant keyword is deleted from a title, something is lost for sure, especially in terms of keyword-combination searches. In this case, "reference-to-" and "-character" keywords are different types of keywords with different meanings, and they both have independent utility and value. They are not duplicate sets of keywords, and it is not redundant to have both of them on the same title. 

1.3K Messages

 • 

23.1K Points

I look forward to you adding reference-to-proper name keywords for EVERY proper-name-character keyword in the database.  This repetition is totally unnecessary and borders on the ridiculous.

2.7K Messages

 • 

47K Points

@bradley_kent​ It's not worth my time. Nor is it worth your time to delete relevant keywords. 

Champion

 • 

14.1K Messages

 • 

326.7K Points

@keyword_expert​ 

"names of 'real life people' is one of the rare circumstances where the IMDb guidelines expressly advise contributors to use 'reference-to-' prefixes"

reference-to is suggested alongside several other options in a non-exhaustive list. Contributors can choose the most relevant option(s).

1.3K Messages

 • 

23.1K Points

Duh!  Of course I know this. BUT this does not address the situation of "real live people" who are ACTUALLY portrayed IN the title, when the -character suffix is required.  (By the way, a lot of them are "real dead people"!). To repeat the reference-to- keyword for those "people" who are (or should be) in the Cast listing as characters, and who have (or should have) the -character keyword suffix, is an unnecessary repetition, and will add many thousands (perhaps hundreds of thousands, possibly a million) unnecessary keywords.

(edited)

2.7K Messages

 • 

47K Points

@Peter_pbn​ 

"reference-to is suggested alongside several other options in a non-exhaustive list. Contributors can choose the most relevant option(s)."

That is exactly correct. The most important word in what you wrote is "option(s)," not "option." 

1.3K Messages

 • 

23.1K Points

Absolutely! reference-to- is just an option, NOT the only option.  The most relevant option should be the one selected FIRST.  And some "already existing" options may need to be removed IF they are not relevant.

(edited)

2.7K Messages

 • 

47K Points

@bradley_kent​ We are talking in circles. At this point, I will simply repeat (once again) three points:

(1) Using "reference-to-" keywords along with "-character" keywords is not duplication.

(2) Your approach hinders keyword searches by preventing users from finding relevant titles when they search for "reference-to-" keywords in combination with other keywords. 

(3) Your approach literally removes relevant keywords from the database.

1.3K Messages

 • 

23.1K Points

Perhaps we are both "talking in circles," although I prefer octagons.

(1) Yes, it is.

(2) No, it doesn't.

(3) No, it doesn't.

If you search a proper name keyword, ALL options appear.  

2.7K Messages

 • 

47K Points

@bradley_kent​ Can you explain how to do a multi-keyword combination search using "-character" and/or "reference-to-" keywords, in combination with other distinct keywords, all in the same search, and return all the "-character" and all the "reference-to-" keywords in a single search?

Hint: this can't be done. And that is exactly why your approach, by removing relevant keywords from the database, hinders multi-keyword combination searches, and hides relevant data from the public.

1.3K Messages

 • 

23.1K Points

It can (could) be done, but the problem resides, not in the keywords themselves, but in the "Refine" drop down menu when doing a keyword search.  The listed menu of secondary keywords rarely gets, alphabetically, to the "r's" (including reference-to-) keywords, and is further impaired because reference-to- keywords usually have so few titles that they are relegated to the bottom of the "Refine" drop down menu.

Is there a way to extend this drop down menu?  If so, the search would be much easier when combining keywords. 

2.7K Messages

 • 

47K Points

@bradley_kent​ 

With respect, you are misunderstanding both my question and how keyword searches work.

First, the dropdown menu you mentioned is ordered first by prevalence, and only secondarily by alphabetical order.

Second, the dropdown menu would not be a hindrance if the type of search that I am mentioning were allowed. The dropdown menu is just suggestions for what keywords to combine. These keywords can be modified manually by changing the URLs. For example, if you choose "fork" and "spoon," you can go to the resulting URL and change out "spoon" for "knife" if you want. In this way, the dropdown menu is not ultimately limiting in the types of searches allowed.
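
The keyword swap described above can be sketched in a few lines of Python. The thread does not give the actual search URL, so the address and the comma-separated `keywords` query parameter below are assumptions made purely for illustration, and `swap_keyword` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

def swap_keyword(url, old, new, param="keywords"):
    """Return url with keyword `old` replaced by `new` in a comma-separated list.

    Assumes the selected keywords live in one query parameter, e.g.
    '?keywords=fork,spoon'. That layout is an assumption, not a documented format.
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    keywords = query.get(param, [""])[0].split(",")
    query[param] = [",".join(new if kw == old else kw for kw in keywords)]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))

# Hypothetical URL; only the shape of the query string matters here.
original = "https://example.invalid/search/keyword?keywords=fork,spoon"
print(swap_keyword(original, "spoon", "knife"))
```

The point is only that the dropdown suggests combinations; the combination itself is carried in the URL and can be edited by hand.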

But all of this is beside the point. My point is that keyword searches on IMDb only allow multiple keywords to be combined in a way that requires all of the chosen keywords to be included in the titles. It is impossible to combine keywords and obtain search results that display titles that include any of the chosen keywords.

To illustrate the point, you can search for titles that contain "reference-to-batman" AND "reference-to-superman" together, and this only shows titles that contain both keywords. You can also search for "batman-character" and "reference-to-superman" together. But what you can't do is search for titles that contain "reference-to-superman" plus "reference-to-batman" AND/OR "batman-character." 

In other words, keywords can only be combined to show titles that include all selected keywords, not any of the selected keywords (or any of a subset of keywords).

Ideally, any title where Batman is relevant should have the "reference-to-batman" keyword. Deleting this keyword from titles where it is relevant, simply because a title includes Batman as a character, literally removes relevant keywords, disrupts and frustrates keyword searches, and ultimately hides relevant information from the public.
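
To make the "all" versus "any" distinction concrete, here is a small Python sketch using sets. The four titles and their keywords are invented for illustration, and `search_all` and `search_with_any` are hypothetical names. Only the first function models the kind of search described above; the second is the search that is not available.

```python
# Invented sample data: title -> set of keywords on that title.
titles = {
    "Title A": {"reference-to-superman", "reference-to-batman"},
    "Title B": {"reference-to-superman", "batman-character"},
    "Title C": {"reference-to-superman", "batman-character", "reference-to-batman"},
    "Title D": {"reference-to-batman"},
}

def search_all(index, *keywords):
    """Titles that carry ALL of the given keywords (the search that is allowed)."""
    wanted = set(keywords)
    return sorted(title for title, kws in index.items() if wanted <= kws)

def search_with_any(index, required, alternatives):
    """Titles with all `required` keywords plus AT LEAST ONE of `alternatives`.

    This is the combined search described above as impossible on the site.
    """
    return sorted(
        title for title, kws in index.items()
        if set(required) <= kws and kws & set(alternatives)
    )

# An "all" search for the two reference-to- keywords misses Title B,
# which only has batman-character.
print(search_all(titles, "reference-to-superman", "reference-to-batman"))

# The "any" search would pick up Title B as well.
print(search_with_any(
    titles,
    required={"reference-to-superman"},
    alternatives={"reference-to-batman", "batman-character"},
))
```

In this toy data, Title B is only reachable through the "all" search if it also carries "reference-to-batman", which is the argument for keeping both keywords on a title.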

(edited)

1.3K Messages

 • 

23.1K Points

2 years ago

https://community-imdb.sprinklr.com/conversations/data-issues-policy-discussions/standalone-names-shouldnt-be-keywords/5f4a7c488815453dbafe9de9

I have been working on this listing for years, renting and borrowing and searching online for actual copies of titles to clarify standalone "name" keywords. Instead of just merging them (again, an expedient, scorched earth policy), why not join in the research so they can have the "correct" keyword?

(edited)

2.7K Messages

 • 

47K Points

@bradley_kent​ If the mergers and auto-conversions had been performed by IMDb staff three years ago back when @Marco first posted some of these standalone name keywords, then we wouldn't still be discussing the problems with these keywords.

Manual editing of large numbers of keywords by contributors on a private, temporary basis (as opposed to mass keyword edits by IMDb staff following a public discussion) can in fact prolong these types of problems indefinitely by giving both staff and other contributors a false sense of the true extent of the problem. Meanwhile, the underlying problem has not been resolved, and the manual editing is doomed to be repeated like a Sisyphean task when the improper keywords build up again. 

In this way, private manual editing (or "auditing" as you like to call it) can actually exacerbate the problem. It's like putting a bandage on a large wound that requires surgery and calling it good.

Thanks for the link to that other post, though. It gives me a few more ideas for future requests for mass mergers and auto-conversions. In the meantime, I would ask that you please not "audit" any of the other keywords on that list, because that could give everyone else a false sense of security that the problems with these keywords are not as big as they truly are.

2.7K Messages

 • 

82.3K Points

@bradley_kent Once again, thanks for all the work you've done regarding these keywords. It's a pity IMDb wasn't willing to block new additions of these keywords; that would've made your job a whole lot easier.

2.7K Messages

 • 

47K Points

2 years ago

I added a few more keywords to this post, taken from the most prevalent and problematic of @Marco's keyword list here.

2.7K Messages

 • 

82.3K Points

@keyword_expert​ Thanks for adding some keywords to your list. That'll decrease the problem quite a bit.

Champion

 • 

6.6K Messages

 • 

116.5K Points

2 years ago

FYC:

the-rolling-stones (18 titles) --> reference-to-the-rolling-stones (263 titles)

2.7K Messages

 • 

47K Points

@Pencho15​ I appreciate the thought, but I like to reserve my public lists for keywords (or sets of multiple keywords) that total 50 titles or more.

That way, we get the biggest "bang for our buck" asking IMDb staff to do the changes on a mass scale, rather than making the edits manually ourselves.

"the-rolling-stones" only has 18 titles currently.

2.7K Messages

 • 

47K Points

@Pencho15​ 

I have now included the keyword "the-rolling-stones" on a new list.

2.7K Messages

 • 

47K Points

2 years ago

Now that the seven-day comment period has passed, I am changing this to a problem post and the keyword list is ready for action by IMDb staff. Below is the full list with the numbers of titles removed.

Duplicate Keywords Proposed for Permanent Merging and Auto-Conversion

beethoven  --> ludwig-van-beethoven --> reference-to-ludwig-von-beethoven  --> beethoven-reference  --> reference-to-ludwig-van-beethoven 

churchill --> winston-churchill --> reference-to-churchill --> reference-to-winston-churchill 

elvis  --> elvis-presley --> reference-to-elvis --> reference-to-elvis-presley 

franklin-d.-roosevelt  --> reference-to-franklin-d-roosevelt  --> reference-to-franklin.d.roosvelt --> reference-to-franklin-delano-roosevelt --> reference-to-franklin-delano--roosevelt  --> reference-to-president-frankin-d-roosevelt  --> reference-to-fdr  --> reference-to-franklin-d.-roosevelt 

hitler  --> adolph-hitler  --> reference-to-adolph-hitler  --> reference-to-adolf-hitler 

jfk  -->  john-f.-kennedy  -->  john-f-kennedy  -->  reference-to-john-fitzgerald-kennedy  --> reference-to-john-f.-kennedy 

lenin  -->  vladimir-lenin  -->  reference-to-lenin  --> reference-to-vladimir-lenin 

lyndon-b.-johnson --> lyndon-johnson --> reference-to-lyndon-johnson  --> reference-to-lyndon-b-johnson  --> reference-to-lyndon-baines-johnson  --> reference-to-lbj  --> reference-to-lyndon-b.-johnson 

mussolini -->  benito-mussolini -->  reference-to-benito-mussolini 

princess-diana   -->  reference-to-princess-diana 

putin  --> vladimir-putin  --> reference-to-vladimir-vladimirovich-putin  --> reference-to-vladimir-putin 

saddam-hussein  --> reference-to-saddam-hussein 

stalin  --> josef-stalin  -->  joseph-stalin  --> reference-to-josef-stalin  --> reference-to-joseph-stalin 

the-beatles   -->  beatles  -->  reference-to-the-beatles 

theodore-roosevelt  --> reference-to-teddy-roosevelt  --> reference-to-theodore-roosevelt 

trump-presidency  --> donald-trump-presidency