2.7K Messages


47K Points

Wednesday, July 14th, 2021 2:19 AM


Keywords Proposed for Merging

I have been compiling lists of keywords that should be merged by IMDb staff. I am now ready to start posting many of these keywords, and I will focus this first list on the least controversial proposals: in other words, the keywords where I expect no disagreement about whether to merge them or in which direction.

In the list below, the keywords should be merged in the direction of the arrows. So, for example, all instances of "shape-shifter" would be merged (or, if you will pardon the pun, shifted) into "shapeshifter," and "shape-shifter" would thereafter be abandoned as an IMDb keyword.

The numbers below are the current number of titles for each keyword. As you can see, I am focusing on keywords with high numbers of titles, which would take a very long time for contributors to edit manually.

For spelling and punctuation conventions, my general practice here is to go with the American style. According to this traffic ranking website, the USA easily ranks #1 in terms of IMDb's traffic. Therefore I believe American spellings and punctuation should be preferred for IMDb keywords, except for words that are not used as much in America or that have different meanings based on different spellings.

In most cases the merging should be in the direction of the keyword with the higher existing number of titles. However, that is not always the case. For example, the preferred American spelling is "nighttime" (without any hyphen or space), so that should be the spelling for the IMDb keyword even though it currently has fewer titles.
Keywords Proposed for Merging

shape-shifter (293 titles) ---> shapeshifter (253 titles)
shape-shifting (65 titles) ---> shapeshifting (417 titles) *
time-travelling (31 titles) ---> time-travel (2893 titles)
travelling (131 titles) ---> traveling (334 titles) ---> travel (4660 titles)
soldiers (48 titles) ---> soldier (6924 titles)
crime-investigation (302 titles) ---> criminal-investigation (1165 titles)
dead-body-in-car (51 titles) ---> dead-body-in-a-car (56 titles)
hook-for-hand (87 titles) ---> hook-for-a-hand (50 titles)
woman-with-gun (23 titles) ---> woman-with-a-gun (313 titles)
psychotic-killer (49 titles) ---> psychopathic-killer (321 titles) ---> psycho-killer (346 titles)
serial-murderer (163 titles) ---> serial-killer (4103 titles)
serial-murderer-as-protagonist (32 titles) ---> serial-killer-as-protagonist (59 titles)
supernatural-serial-murderer (179 titles) ---> supernatural-serial-killer (23 titles)
female-serial-murderer (47 titles) ---> female-serial-killer (244 titles)
woman-screaming (76 titles) ---> screaming-woman (811 titles)
man-screaming (33 titles) ---> screaming-man (503 titles)
woman-sitting-on-a-toilet (80 titles) ---> woman-on-toilet (52 titles) ---> woman-sits-on-a-toilet (120 titles)
face-deformity (46 titles) ---> facial-deformity (35 titles)
moustache (534 titles) ---> mustache (893 titles)
moustached-man (293 titles) ---> mustached-man (4 titles)
moustache-twirling-villain (29 titles) ---> mustache-twirling-villain (3 titles)
fake-moustache (393 titles) ---> fake-mustache (121 titles)
mini-skirt (900 titles) ---> miniskirt (553 titles)
girl-wearing-a-mini-skirt (28 titles) ---> girl-wearing-a-miniskirt (70 titles) ---> girl-wears-a-miniskirt (202 titles)
mini-dress (278 titles) ---> minidress (155 titles)
school-girl (178 titles) ---> schoolgirl (1554 titles)
japanese-school-girl (95 titles) ---> japanese-schoolgirl (548 titles)
brunette-female (403 titles) ---> brunette (12659 titles)
based-off-manga (21 titles) ---> based-on-manga (1400 titles)
night-time (789 titles) ---> nighttime (35 titles)
crawl-space (27 titles) ---> crawlspace (59 titles)
environmental-issues (328 titles) ---> environmental-issue (374 titles)
counter-culture (208 titles) ---> counterculture (36 titles)

* Apparently the "shape-shifting" and "shapeshifting" keywords were supposed to be merged two years ago, but based on the continued parallel existence of both keywords today, it appears the merger never actually happened.

My ever-growing list of potential keyword mergers is much longer than this, but I will stop the list here, since as I said before my goal is to limit this initial post to the least controversial proposals.

As for why these keywords should be merged, that should be obvious, but the simplest explanation is that it will help immensely with searches for titles on IMDb. For example, if I want to find all movies with both "shapeshifting" and "time travel" in the same movie, I should be able to do that via a single search, without having to account for different spellings and variations in the keywords. And when more than two keywords are involved, it becomes increasingly complex and difficult to do these searches if the keywords are not uniform.

My proposal is for the community to be given a week to comment on this list, after which IMDb staff would make the requested mergers (barring any objections).
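To illustrate the search problem described above, here is a minimal Python sketch. It is only an illustration, not how IMDb actually stores or merges keywords: the `MERGES` table copies a few rows from the list above, while the function names `canonical` and `titles_with_all` and the three titles in `TITLES` are made up for the example. The idea is that each keyword is followed along its arrows to the final keyword, so a chain such as travelling ---> traveling ---> travel ends at "travel," and a single search then finds every title regardless of which variant it was tagged with.

```python
# Hypothetical merge table: a few rows copied from the proposed list.
# Each key is a keyword to retire; each value is the keyword it merges into.
MERGES = {
    "shape-shifter": "shapeshifter",
    "shape-shifting": "shapeshifting",
    "time-travelling": "time-travel",
    "travelling": "traveling",
    "traveling": "travel",
    "night-time": "nighttime",
}


def canonical(keyword, merges=MERGES):
    """Follow the merge arrows until a keyword with no outgoing arrow is reached."""
    seen = set()
    while keyword in merges:
        if keyword in seen:
            # Guard against a table that accidentally merges in a circle.
            raise ValueError(f"merge cycle at {keyword!r}")
        seen.add(keyword)
        keyword = merges[keyword]
    return keyword


def titles_with_all(titles, wanted, merges=MERGES):
    """Return the names of titles whose keywords, once merged, include every wanted keyword."""
    wanted = {canonical(k, merges) for k in wanted}
    return sorted(
        name
        for name, keywords in titles.items()
        if wanted <= {canonical(k, merges) for k in keywords}
    )


# Made-up titles, for illustration only.
TITLES = {
    "Title A": {"shape-shifting", "time-travel"},
    "Title B": {"shapeshifting", "time-travelling"},
    "Title C": {"shapeshifting"},
}

if __name__ == "__main__":
    print(canonical("travelling"))
    print(titles_with_all(TITLES, ["shapeshifting", "time-travel"]))
```

Without the merge step, finding both "Title A" and "Title B" would take four separate searches (two spellings of each keyword), and the number of combinations doubles with every additional bifurcated keyword in the search.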

Accepted Solution

2.7K Messages


47K Points

3 years ago

This keyword list was subsequently resolved in this post. 

1.4K Messages


24.3K Points

4 years ago

I could add many others to this list (I have an extensive Keyword Cross Reference Index with hundreds of examples), but let me just comment on a few that are listed:

travelling and traveling should be merged into travel (4648 titles).

I have been back-and-forth between miniskirt/minidress and mini-skirt/mini-dress for years. Titles in the database with these words are very inconsistent.

It seems that murder should refer to human beings, while killing should refer to animals. (Yes, there are exceptions.)

And, as I previously proposed, most "actions" should take precedence over the "actor." shapeshifter should be merged into shapeshifting, murderer and murder-victim into murder, abductor and abductee into abduction, kidnapper and kidnap-victim into kidnapping, gambler into gambling, thief into theft, etc. There are certainly exceptions, but, if possible, it seems like this should be a general preference. (To quote a song lyric, "You can't have one without the other!")

Merging soldiers into soldier and environmental-issues into environmental-issue is just a case of merging a plural into a singular, already an IMDb guideline. (There are hundreds of similar examples.)

And, perhaps female-serial-murderer and female-serial-killer should be merged into serial-murderess.

I could go on, but am interested in hearing from others.


2.7K Messages


47K Points

@bradley_kent

> travelling and traveling should be merged into travel (4648 titles)

Good point. I will edit my post accordingly.

> And, as I previously proposed, most "actions" should take precedence over the "actor." shapeshifter should be merged into shapeshifting, murderer and murder-victim into murder, abductor and abductee into abduction, kidnapper and kidnap-victim into kidnapping, gambler into gambling, thief into theft, etc. There are certainly exceptions, but, if possible, it seems like this should be a general precedent. (To quote a song lyric, "You can't have one without the other!")

I continue to disagree with that point, which we have debated several times in the past. In real life, you can't have one without the other, but in a movie or show, you absolutely can have one without the other. I have previously brought up examples to illustrate the point.

1.4K Messages


24.3K Points

brunette-woman should probably remain as a keyword.  I doubt if most people know that brunette is feminine, while brunet is masculine, and blond is masculine, while blonde is feminine.

2.7K Messages


47K Points

@bradley_kent I agree that "brunette-woman" should remain its own keyword, but for a different reason, and that is to keep the distinction between "brunette-woman" and "brunette-girl." If you check my original post, I am proposing getting rid of "brunette-female." That is a bad keyword for a number of reasons: it treats "female" as a noun, it is tautological, it is grammatically incorrect, and it needlessly bifurcates what could be a single keyword (the #1 reason for merging keywords). Edit: How could someone not know that "ette" means feminine? Yet, apparently there are enough people who don't know this to keep typing in this ridiculous keyword "brunette-female."


2.7K Messages


47K Points

@bradley_kent By the way, I bet you can guess who is primarily responsible for most of the instances of the ridiculous "brunette-female" keyword. It is none other than the creamy-legs/slut/jug/rack/almond-eyes/showing-skin/pig/bacon/canine/man's-best-friend spammer. As best I can tell, he has been adding the "brunette-female" keyword to titles for years. It is even possible that he is responsible for all instances of its usage, or at least the vast majority. 

2.7K Messages


47K Points

4 years ago

The one-week comment period on my post has now expired. During that week I received some helpful feedback from @bradley_kent regarding the "traveling" keyword, and I edited my list accordingly. I also removed two proposed keyword merges, which I will re-include in a future list, where they will fit better.

Now that the comment period has passed with no objections, I would like to officially invite IMDb staff to merge these keywords. I will repost the list below. The keywords should be merged in the direction of the arrows.

@Michelle, can you please do the honors?

Keywords Proposed for Merging

shape-shifter (293 titles) ---> shapeshifter (253 titles)
shape-shifting (65 titles) ---> shapeshifting (417 titles)
time-travelling (31 titles) ---> time-travel (2893 titles)
travelling (131 titles) ---> traveling (334 titles) ---> travel (4660 titles)
soldiers (48 titles) ---> soldier (6924 titles)
crime-investigation (302 titles) ---> criminal-investigation (1165 titles)
dead-body-in-car (51 titles) ---> dead-body-in-a-car (56 titles)
hook-for-hand (87 titles) ---> hook-for-a-hand (50 titles)
woman-with-gun (23 titles) ---> woman-with-a-gun (313 titles)
psychotic-killer (49 titles) ---> psychopathic-killer (321 titles) ---> psycho-killer (346 titles)
serial-murderer (163 titles) ---> serial-killer (4103 titles)
serial-murderer-as-protagonist (32 titles) ---> serial-killer-as-protagonist (59 titles)
supernatural-serial-murderer (179 titles) ---> supernatural-serial-killer (23 titles)
female-serial-murderer (47 titles) ---> female-serial-killer (244 titles)
woman-screaming (76 titles) ---> screaming-woman (811 titles)
man-screaming (33 titles) ---> screaming-man (503 titles)
woman-sitting-on-a-toilet (80 titles) ---> woman-on-toilet (52 titles) ---> woman-sits-on-a-toilet (120 titles)
face-deformity (46 titles) ---> facial-deformity (35 titles)
moustache (534 titles) ---> mustache (893 titles)
moustached-man (293 titles) ---> mustached-man (4 titles)
moustache-twirling-villain (29 titles) ---> mustache-twirling-villain (3 titles)
fake-moustache (393 titles) ---> fake-mustache (121 titles)
mini-skirt (900 titles) ---> miniskirt (553 titles)
girl-wearing-a-mini-skirt (28 titles) ---> girl-wearing-a-miniskirt (70 titles) ---> girl-wears-a-miniskirt (202 titles)
mini-dress (278 titles) ---> minidress (155 titles)
school-girl (178 titles) ---> schoolgirl (1554 titles)
japanese-school-girl (95 titles) ---> japanese-schoolgirl (548 titles)
brunette-female (403 titles) ---> brunette (12659 titles)
based-off-manga (21 titles) ---> based-on-manga (1400 titles)
night-time (789 titles) ---> nighttime (35 titles)
crawl-space (27 titles) ---> crawlspace (59 titles)
environmental-issues (328 titles) ---> environmental-issue (374 titles)
counter-culture (208 titles) ---> counterculture (36 titles)

1.4K Messages


24.3K Points

@keyword_expert Recheck the numbers. I have submitted several of these mergers, and they have been accepted, and I am also working on several others.

Sometimes, I think that we ask too much of the staff when it comes to keywords. First, keywords have, and always have had, a low priority. Second, I suspect that the staff is not as large as some of us might think. Especially with the pandemic, it looks like there has been a lessening of advertisement revenue, which may have resulted in staff reductions.

As I have done in the past, and continue to do, I encourage other contributors to undertake needed keyword mergers and audits. (There are always surprises and special situations, so mass merging is not the way to go. Each title must be "looked at" individually.)

This list is just a small sample of such needed mergers. The process is time-consuming, and usually entails just one keyword at a time, but it is rewarding to see them "cleaned up," even though the "clean up" may not last for long.

C'mon, other contributors, join me in this enormous project, and don't just "pass it on" to the staff.


2.7K Messages


47K Points

@bradley_kent Thank you for the manual cleanup, but as you point out, this cleanup may not last long for some of these keywords, because your cleanup work did not result in true mergers. One of the reasons why I asked for true mergers of these keywords was to proactively prevent their ongoing bifurcation in the future. The numbers captured in my initial list show that these keywords involve "problem" bifurcations that will likely persist into the future unless they are truly merged.

I have done my own manual cleanup of keywords, resulting in tens of thousands of contributions. I agree with you that fellow contributors should help us as well.

The purpose of this particular list was to focus on keyword bifurcations with relatively higher numbers. The higher the numbers for any particular keyword, the greater the "bang for the buck" in asking staff to merge them. For keywords where the numbers are high enough, there should be a community discussion, and then staff should simply merge the keywords. It can't take that much work to do so. This discussion board is filled with examples where contributors have asked the staff to merge two particular keywords (many with very small numbers), and the staff agreed to do so.

Another purpose of my post was to provide for at least a one-week comment period on the proposed mergers, before the mergers would take effect, instead of making too quick a decision or leaving the mergers to the whims of individual contributors. Private manual edits by one or two contributors may give the false impression that the keywords have been "fixed." As you point out, that status can ultimately prove temporary, unless true mergers occur.

Another problem with one or two individual contributors manually (and privately) editing these keywords is that it results in that contributor making his or her own decisions about which keywords should or should not continue to exist, rather than having a public discussion and ultimate decision by staff. This has resulted in some inappropriate mass editing, such as the mass purging of the "telephone-conversation" and "cigarette" keywords (which fortunately seems to have stopped now that there has been a public discussion about these particular keywords). And after such mass edits by a single contributor, the disparate numbers in favor of one keyword over the other can be misleadingly cited as "proof" that one keyword is better than the other, which becomes a self-fulfilling prophecy.

As you point out, the list in my post is just a small sample of such needed mergers. I have continued to compile my own personal list and will post further updates in the future.

As you suggest, I will recheck the numbers and will repost an updated list in this thread.

2.7K Messages


47K Points

4 years ago

@Michelle Can these keywords please be formally merged?

Keywords Proposed for Merging - Updated List
(title numbers for each keyword may be slightly inaccurate based on recent edits)

shape-shifter (293 titles) ---> shapeshifter (253 titles)
shape-shifting (65 titles) ---> shapeshifting (417 titles)
crime-investigation (166 titles) ---> criminal-investigation (1165 titles)
dead-body-in-car (51 titles) ---> dead-body-in-a-car (56 titles)
hook-for-hand (87 titles) ---> hook-for-a-hand (50 titles)
psychotic-killer (49 titles) ---> psychopathic-killer (321 titles) ---> psycho-killer (346 titles)
serial-murderer (163 titles) ---> serial-killer (4103 titles)
serial-murderer-as-protagonist (32 titles) ---> serial-killer-as-protagonist (59 titles)
supernatural-serial-murderer (179 titles) ---> supernatural-serial-killer (23 titles)
female-serial-murderer (47 titles) ---> female-serial-killer (244 titles)
face-deformity (46 titles) ---> facial-deformity (35 titles)
moustache (534 titles) ---> mustache (893 titles)
moustached-man (293 titles) ---> man-with-a-mustache (178 titles) ---> mustached-man (4 titles)
moustache-twirling-villain (29 titles) ---> mustache-twirling-villain (3 titles)
fake-moustache (393 titles) ---> fake-mustache (121 titles)
mini-skirt (869 titles) ---> miniskirt (483 titles)
girl-wearing-a-mini-skirt (23 titles) ---> girl-wearing-a-miniskirt (70 titles) ---> girl-wears-a-miniskirt (202 titles)
mini-dress (250 titles) ---> minidress (142 titles)
school-girl (176 titles) ---> schoolgirl (1556 titles)
japanese-school-girl (95 titles) ---> japanese-schoolgirl (548 titles)
brunette-female (377 titles) ---> brunette (12659 titles)
night-time (789 titles) ---> nighttime (35 titles)
crawl-space (27 titles) ---> crawlspace (59 titles)
environmental-issues (328 titles) ---> environmental-issue (374 titles)
counter-culture (208 titles) ---> counterculture (36 titles)


1.4K Messages


24.3K Points

4 years ago

Again, it seems that "killer" should usually be changed to "murderer" unless the "killer" is JUST killing animals and not humans. Ending the life of a human being is murder, while ending the life of an animal is killing. (There are, of course, some keyword exceptions, e.g. "cop-killer.")

"mustached-man" seems archaic, as if the "mustache" was somehow applied to or imprinted upon a man. My preference is "man-with-a-mustache," although that only has 177 titles. (Part of my thinking on this is to have some kind of uniformity. The worst example is "bespectacled-man," which has been merged into "man-wears-eyeglasses." Similarly, "bearded-man" -- as if it was innate or inborn -- should be "man-with-a-beard," and "tattooed-man" should be "man-with-a-tattoo," unless it is a "tattooed-man" in a circus sideshow, or a man with tattoos covering most of his body.)

And, I wonder if you even need "mustache" or "beard" or "eyeglasses," etc. as keywords if someone has been designated as having them on his or her person.


2.7K Messages


47K Points

@bradley_kent Are you not familiar with the term "serial killer"? That term has 64.4 million hits on a Google search. (Compare that to fewer than one million hits for "serial murderer.") It is irrelevant whether animals or humans are being killed. Like it or not, "serial killer" is a widely used term to describe someone who kills multiple people.

I prefer "bearded-man" over "man-with-a-beard." The former is simpler and conveys the same thing. The same with "tattooed-man," which I prefer over "man-with-a-tattoo." There is nothing about the phrases "bearded man" or "tattooed man" that implies the person was born with the beard or tattoo(s), just as the phrase "blonde woman" doesn't imply the person was born with blonde hair.

I do agree with you about "bespectacled-man," "bespectacled-male," "bespectacled-woman," etc. These are terrible keywords.


1.4K Messages


24.3K Points

Of course, I am familiar with the term "serial-killer." That may be another exception.

The point with "beard," "mustache," "tattoo," is to suggest some sort of general uniformity, although there will always be exceptions. What are we to anticipate: "mutton-chopped-man," "acned-teenager," "caned-man," etc., instead of "man-with-mutton-chops," "teenager-with-acne," "man-uses-a-cane," etc.?

I've never understood what determines "long-hair" versus "short-hair," as in "long-haired-man," "short-haired-woman," etc. And there is someone who has subjectively determined that "long-hair" is "feminine" and "short-hair" is "masculine." (Tell Samson and Sinead O'Connor that!) I've even seen "shoulder-length-hair" as a keyword! (I guess that is neither "long" nor "short"!) This is all a subjective judgment.

Some of this concern is compounded by the use of the prepositions "with" and "in." "with" implies "alongside," while "in" implies "inside." "woman-with-glasses" implies that the glasses are "with" or "alongside" her, perhaps on a table or in a handbag. "man-in-underwear" can imply that the man might be literally "inside" the underwear! ("man-wears-underwear" is, of course, a better keyword.) And any physically handicapped person who is in need of one will tell you that he/she "uses" a wheelchair, and is not "in" the wheelchair.

We can strive for consistency, but there will always be exceptions. My point is to have keywords be as objective and factual as possible.


2.7K Messages


47K Points

@bradley_kent The person responsible for the "feminine" versus "masculine" hair keywords, as well as most of the "long" versus "short" and "shoulder-length" hair keywords, is none other than the leggy/bacon/pig/canine/miniskirted-beauty/eye-candy/etc./etc. keyword spammer whom we have discussed at length recently. I have deleted many of his "feminine" versus "masculine" hair keywords when I have come across them. He should not be the arbiter of whether hair is sufficiently feminine or masculine.

Similarly, the "long" versus "short" hair keywords are just terrible keywords in general -- almost every single title on IMDb includes characters with short and/or long hair. These keywords are meaningless.

The same keyword spammer may also be responsible for many of the "bespectacled" keywords. I know for sure he is responsible for all of the "miniskirted" keywords.


1.4K Messages


24.3K Points

I tend to be against mass mergers without an audit because this can lead to a lot of other keyword problems, especially with keywords assigned to episodes of a tv series.   As we know, keywords listed at the series level SHOULD NOT also be listed at the episode level, EXCEPT for anthology series.

2.7K Messages


47K Points

@bradley_kent

> As we know, keywords listed at the series level SHOULD NOT also be listed at the episode level, EXCEPT for anthology series.

No, we don't know that. Quite the opposite, in fact.



17.7K Messages


316K Points

3 years ago

Hi keyword_expert - Following up on these old Keyword requests, I can confirm that the merges and conversions have all been applied, except for:

psychotic-killer (49 titles) ---> psychopathic-killer (321 titles) ---> psycho-killer (346 titles)

as technically there are slight differences between 'psychotic' and 'psychopathic'. Cheers!