2.7K Messages

 • 

81.5K Points

Friday, May 7th, 2021 4:58 PM

Closed

Colo(u)r keywords

Most of the keywords with the word colo(u)r in them are spelled the American way: https://www.imdb.com/find?s=kw&q=color . However, there are also quite a few keywords where the word is spelled the British way: https://www.imdb.com/find?s=kw&q=colour . Is it possible for a staffer to merge them and create a block for either color or colour so this problem won't get any bigger in the future?

Employee

 • 

16.7K Messages

 • 

305.4K Points

3 years ago

Hi Marco -

 

The existing keywords "Colour" have now been merged into "Color"; the change should be live on the site shortly.

 

I have also taken further steps to block the British spelling in the contribution form for future submissions (no offense, our site preference is American English).

 

Cheers!

Champion

 • 

3K Messages

 • 

72.3K Points

Can we do so for the editorial department occupation "colourist"?

10.6K Messages

 • 

223.9K Points

I see no merit in doing anything special with "colorist" and "colourist", as the idea is to make sure IMDb's data matches what appears on screen.

2.7K Messages

 • 

81.5K Points

@daniel_francis_gardecki  "people like Marco, Bradley and Adrian trying to force the minority Americanism language on the rest of the English speaking world."

This is a pretty bold statement. Do you have any evidence to back up your statement? I have stated in my OP "create a block for either color or colour". Where am I trying to force American English on the rest of the English speaking world? Also, you overestimate me quite a bit if you think I can influence "the rest of the English speaking world".

Don't get me wrong, I appreciate your passion when it comes to language (I wish more people were passionate about language), but it doesn't help your case if you state things about specific people that simply aren't true.

10.6K Messages

 • 

223.9K Points

I interpreted "trying to force" to mean "standing by and letting the influence proceed uninhibited", as it is clear to me that the extreme (and exaggerated) wording reflects a tantrum against those who are largely indifferent. Haha.

For those who may be interested, there is actually a set of rather involved origin stories concerning how there came to be two standards for the spelling of English words. One part of particular interest to me is the fact that the Constitution of the United States most certainly conforms to British spelling, but outside of that document in its original form, just about all government documents in the United States are written with American spelling. Several constitutional amendments have such spelling. In some ways, it feels like the push toward American spelling was just to spite Alexander Hamilton or something, perhaps the Constitution, but how can that make sense? Maybe it was just more convenient at the time and thereafter. To be honest, the United States have drifted very far from their founding principles, basically ever since George Washington retired from federal public office. For one thing, initially there was a Department of War. The originating statute titled it honestly, but then roughly a century and a half afterward, it was retitled into the more euphemistic "Department of Defense" (not even "Department of Defence") when it was merged with the Department of the Navy.

2.7K Messages

 • 

81.5K Points

@Michelle "The existing keywords "Colour" have now been merged into "Color", the change should be live on the site shortly."

Thanks for merging 'colour' into 'color', but I was actually hoping that all keywords with 'colour'/'color' could be merged. Is this possible please?

(edited)

478 Messages

 • 

8.4K Points

@Michelle Sort of related question... the keywords under 'cause' and 'attribute' etc... can duplicates be combined? For example liver cirrhosis and Liver cirrhosis? Liver Disease and liver disease? There are a TON like this. Plus misspellings as well. I'd love to help clean this up. If I start a list and periodically send it to someone would that be helpful? I know it's not top priority. Also... is there a way to keyword search for Covid deaths? I notice the number associated with that term is rising at an unfortunately high rate. Just curious.

Examples:

Place

 
New York, USA (400) 
New York City, New York, USA (1,283)
New York City, New York (3)
New York City, NY, USA (1)
New York City, USA (5)
New York, New York, USA (301)
New York City, New York, USA (1)
La Spezia (1)
La Spezia, Italy (2)
La Spezia, Liguria, Italy (8)
Leningrad, RSFSR, USSR (76)
Leningrad, RSFSR, USSR {now St. Petersburg, Russia] (62)
Leningrad, Russia (1)
Leningrad, Russia, USSR (13)
Leningrad, Russia, SFSR, USSR (12)
Leningrad, Russian, SFSR, USSR [now Russia] (1)
Leningrad, Russian, SFSR, USSR [now St. Petersburg, Russia] (9)
Leningrad, Soviet Union (2)
Leningrad, Soviet Union (now Saint Petersburg, Russia) (4)
Leningrad, Soviet Union (now St. Petersburg, Russia) (2)
Leningrad, USSR (39)
Leningrad, USSR [now St. Petersburg, Russia] (7)
Leningrad, USSR [now Saint Petersburg, Russia] (2)
Saint Petersburg, Russia (18)
Saint Petersburg, Russia Empire (5)
Saint Petersburg, Russia Empire [now Russia] (8)
Philadelphia, PA, USA (1)
Philadelphia, USA (3)
Philadelphia, Pennsylvania USA (2)
Philadelphia, Pennsylvania, USA (484)
Philadelphia, Philadelphia County, Pennsylvania, USA (1)
Lincoln, England, UK (1)
Lincoln, Lincolnshire, England, UK (16)
London (35)
London, UK (18)
London City, England, UK (2)
London, England (15)
London, England, Great Britain, UK (1)
London, England, UK (1,162)
London, England, United Kingdom (5)
London, Greater London, England, UK (10)
London, Greater England, UK (1)
UK (345) and England, UK (417)  
Long Beach, New York, USA (1)
Long Beach,Long Island, New York, USA (1)
Long Branch, New Jersey (1)
Long Branch, New Jersey. USA (9)
Los ANgeles (1)
Los Angeles Ca (1)
Los Angeles County, California, USA (245)
Los Angeles, CA, USA (6)
Los Angeles, California, USA (2,066)
los angeles, California, USA (1)
Los Angeles, Califoórnia, EUA (1)
Los Angeles, USA (1)
California (18)
California, USA (492)
California, USA (undisclosed) (2)
Chilba, Japan (52)
Chilba. Japan (1)
Chicago (4)
chicago ILL (4)
Chicago Illinois (1)
Chicago Illinois, USA (2)
Chicago, Illinois, USA (842)
Chicago; Illinois, USA (1)
Chicago, Illinois, USA, (1)
Chicago, USA (6)
U.S.A. (4)
USA (1,985)
US (1)
Budapest, Hungary (1,388)
Budapest (3)
Budapest, Austria-Hungary [now Hungary] (163)
Paris (15)
Paris France (2)
Paris, France (1,116)
Cause
homicide (760)
murder (72)
Murdered (1)
murdered (150)
murder victim of the Sandy Hook Elementary School massacre (2)
murder victim of the SandyHook Elementary School massacre (1)
murder suicide (3)
murder-suicide (1)
murdered by beating (2)
murder by beating (2)
murder by beating (2)
murder by gunshot (13)
murdered by gunshot (41)
murder by stabbing (3)
murdered by stabbing (17)
Myeloma (1)
myeloma (6)
motorbike accident (8)
Motorcycle accident (1)
motorcycle accident (123)
Motorcycle crash (1)
motorcycle crash (11)
motorcycle accident, hit by a truck (1)
Myocradial infarct (2)
myocardial infarction (80)
snake bite (3)
snakebite (1)
Sepsis (1)
sepsis (39)
septic infection (2)
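
A rough way to surface the case and punctuation variants in lists like the one above is to group labels by a normalized key. The sketch below is only illustrative: the `normalize` and `group_variants` helpers are hypothetical names, the counts are copied from the examples above, and nothing here reflects how IMDb actually stores or merges these attributes.

```python
import re
from collections import defaultdict


def normalize(label: str) -> str:
    """Case-fold, treat hyphens as spaces, drop other punctuation, collapse whitespace."""
    text = label.casefold().replace("-", " ")
    text = re.sub(r"[^\w\s]", "", text)
    return " ".join(text.split())


def group_variants(counts: dict[str, int]) -> dict[str, dict[str, int]]:
    """Return only those normalized keys that have more than one spelling."""
    groups = defaultdict(dict)
    for label, n in counts.items():
        groups[normalize(label)][label] = n
    return {key: members for key, members in groups.items() if len(members) > 1}


# Sample counts taken from the "Cause" examples in this post.
causes = {
    "Sepsis": 1, "sepsis": 39,
    "Myeloma": 1, "myeloma": 6,
    "murder suicide": 3, "murder-suicide": 1,
    "snake bite": 3, "snakebite": 1,
}
duplicates = group_variants(causes)
```

With this sample, `duplicates` holds groups for sepsis, myeloma and murder suicide. Pairs such as "snake bite" / "snakebite" and misspellings such as "Myocradial infarct" are not caught by simple normalization, so those would still need fuzzy matching or a hand-maintained list.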

(edited)

10.6K Messages

 • 

223.9K Points

That bit about item attributes is another problem that definitely needs to be addressed, and it is even harder to deal with than keywords because contributors are not provided a way to find out which specific title pages or specific name pages bear the data items having a parenthetical note formed by a common word, term, phrase or string. Perhaps a new thread ought to be created about that problem, as we're otherwise focused on a specific keyword synonym here.

2.7K Messages

 • 

47K Points

@Marco I see that @Michelle has not yet responded to this follow-up post of yours:

Thanks for merging 'colour' into 'color', but I was actually hoping that all keywords with 'colour'/'color' could be merged. Is this possible please?

Does anyone know if it's possible for IMDb staff to permanently merge certain words and phrases contained within all existing and future keywords in favor of other words and phrases?

As applied here, is it possible to simply permanently merge all occurrences of "colour" contained within all existing and future keywords in favor of "color?"

If so, that would certainly solve existing problems, and prevent future problems, like this:

color-in-episode-title (817 titles)
colour-in-episode-title (542 titles)

However, it may not make sense to do this with "colour" specifically, because the word "colour" can be part of longer proper names, such as the bands Living Colour and Ocean Colour Scene.

But if it is possible to do this type of "perma-merging," it might make sense to use that method to solve other keyword problems involving systemic keyword discrepancies (e.g., "mustache" vs. "moustache," "sulfur" vs. "sulphur," "airplane" vs. "aeroplane," "dialogue" vs. "dialog"). Each one of those words is used within multiple other keywords on IMDb.

Mainly I am just curious to know if the "perma-merging" method is even possible. Does anyone know?

Even if it wouldn't work for any British vs. American spelling discrepancies, it might make sense to use this method in other contexts, such as commonly misspelled words (e.g., people frequently misspell "athlete" as "athelete," "camouflage" as "camoflage," etc.).
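
To make the "perma-merging" idea concrete, here is a minimal sketch of a word-level conversion that leaves proper names alone. Everything in it is an assumption for illustration: the `AUTO_CONVERT` and `PROTECTED_PHRASES` tables and the `convert_keyword` function are hypothetical, and I have no idea whether IMDb's system works anything like this.

```python
# Hypothetical tables; neither reflects IMDb's actual rules.
AUTO_CONVERT = {
    "colour": "color",
    "athelete": "athlete",
    "camoflage": "camouflage",
}
# Proper names whose spelling should survive conversion.
PROTECTED_PHRASES = [
    ("living", "colour"),
    ("ocean", "colour", "scene"),
]


def convert_keyword(keyword: str) -> str:
    """Rewrite each hyphen-separated word unless it sits inside a protected phrase."""
    tokens = keyword.split("-")
    protected = set()
    for phrase in PROTECTED_PHRASES:
        n = len(phrase)
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == phrase:
                protected.update(range(i, i + n))
    return "-".join(
        tok if i in protected else AUTO_CONVERT.get(tok, tok)
        for i, tok in enumerate(tokens)
    )
```

Under these assumptions, `convert_keyword("colour-in-episode-title")` gives `color-in-episode-title`, while `reference-to-living-colour` comes back unchanged. The cost of this approach is that someone has to maintain the protected list by hand.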

2.7K Messages

 • 

47K Points

p.s. I noticed this part of @Michelle's post:

I have also taken further steps to block the British spelling in the contribution form for future submissions

At least according to that statement, it is possible to block certain words and phrases from "future submissions" (which may or may not include new keywords).

With that said, despite what Michelle said about "colour," that spelling does not appear to be currently blocked. I tested this and was able to create a new keyword containing the word "colour."

Either a block was placed on "colour" and it was later reversed, or no block was ever made in the first place.

Finally, there is no need to broadly block "colour" anyway, because as I mentioned earlier, keywords should be allowed to contain the spelling "colour" in certain circumstances, such as if "colour" is part of a proper name.

2.7K Messages

 • 

81.5K Points

@keyword_expert

At least according to that statement, it is possible to block certain words and phrases from "future submissions" (which may or may not include new keywords).

This is absolutely possible. Try adding donald-trump, william-shakespeare or beautiful as a keyword to a title and you'll see that they're blocked.

Either a block was placed on "colour" and it was later reversed, or no block was ever made in the first place.

I just tried entering 'colour' as a keyword and it automatically changed to 'color' after I clicked on "check these updates" with a notification saying "'colour' has been replaced with 'color' in accordance with IMDb rules."

2.7K Messages

 • 

47K Points

Marco, you are misunderstanding my question. I am not asking whether it is possible to block a specific keyword. I already know that is possible. I am asking two other questions.

First, is it possible to block the use of a specific word or phrase within all future keywords? Using one of your examples, blocking the word "beautiful" would also prevent the future keyword "beautiful-wife." Is this possible?

Second, assuming the answer to the first question is yes, is it also possible to permanently merge a specific word or phrase within all existing and future keywords in favor of another word or phrase? For example, since people often misspell "athlete" as "athelete," could the misspelled "athelete" within all existing and future keywords be merged into "athlete?" If that is possible, then if someone tries to create a new keyword "professional-athelete," it would be automatically converted to "professional-athlete."

Do those clarifications make sense?

I still would like to know if these types of universal blocks and mergers are possible.

@Michelle   implied in this thread that the answer to the first question is yes, that it is possible to block a specific word or phrase within all keywords. Here is what she said about "colour":

I have also taken further steps to block the British spelling in the contribution form for future submissions

However, I tested what @Michelle said, and I was able to easily create a new keyword called "colour-test" (which I later deleted). I can't tell whether the block on the British spelling of "colour" was ever in fact made, or was made and later reversed.

And sorry to repeat myself, but in this particular case I believe it would not be advisable to permanently block the British spelling of "colour" within all keywords, because there will be proper names that should include that spelling (e.g., the band Living Colour).

It occurs to me that I may be misinterpreting Michelle's above-quoted statement. She was probably just intending to say that the specific keyword "colour" has now been blocked. But even if that is the case, I would still like to know whether it is possible to universally block and merge a specific word or phrase within all keywords.
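
The two readings of "block" that I am distinguishing can be sketched like this. It is purely illustrative: the `BLOCKED` set and both function names are made up, and I am only assuming that keywords are split into words on hyphens.

```python
# Hypothetical block list, using words mentioned in this thread.
BLOCKED = {"beautiful", "colour"}


def blocked_as_whole_keyword(keyword: str) -> bool:
    """Reading 1: only the exact keyword is blocked."""
    return keyword in BLOCKED


def blocked_as_word(keyword: str) -> bool:
    """Reading 2: any keyword containing the word is blocked."""
    return any(word in BLOCKED for word in keyword.split("-"))
```

Under reading 1, "colour" is rejected but "colour-test" goes through, which matches what I saw when I tested. Under reading 2, "colour-test" and "beautiful-wife" would both be rejected, and so would "reference-to-living-colour" unless exceptions were added.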

(edited)

2.7K Messages

 • 

81.5K Points

@keyword_expert I see what you mean now and I hope a staffer will respond to answer your questions.

2.7K Messages

 • 

47K Points

@Marco 

I recently learned from this post by @Will that when a keyword is permanently blocked and merged in favor of another keyword, IMDb staff refer to that as "an auto-convert." I assume "auto-conversion" would also be appropriate terminology.

So to clarify for IMDb staff, I am asking these two questions:

(1) Can a specific word or phrase used within all existing keywords be merged in favor of a different word or phrase?

(2) Can a specific word or phrase used within all future keywords be set up for auto-conversion to a different word or phrase?

I do know for a fact that when certain words or phrases are used within keywords (whether in a keyword that already exists or in a newly created keyword), this can trigger a manual review of the keyword by IMDb staff, instead of triggering automatic approval or rejection of the keyword. One example of such a word or phrase is "rape." Any time that particular word is used within a keyword addition, the keyword addition will be manually reviewed by IMDb staff.

Since we know based on that example that it is possible to trigger manual review for a specific word or phrase, then it would also seem theoretically possible to trigger auto-conversion of a word or phrase to a different word or phrase, or at least to block certain words or phrases within all future keywords. But the system may not be set up that way to allow for this yet.
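
Here is how I picture the routing, as a hedged sketch only. The `Action` enum, the two lookup tables and the `triage` function are all hypothetical; the manual-review word is the one example I know of, and the conversion entry is just the case discussed in this thread. The real system may be organized quite differently.

```python
from enum import Enum


class Action(Enum):
    MANUAL_REVIEW = "manual review"
    AUTO_CONVERT = "auto-convert"
    ACCEPT = "accept"


# Hypothetical rule tables, keyed by individual words within a keyword.
REVIEW_WORDS = {"rape"}
CONVERT_WORDS = {"colour": "color"}


def triage(keyword: str) -> tuple[Action, str]:
    """Decide what happens to a submitted keyword; manual review takes precedence."""
    words = keyword.split("-")
    if any(w in REVIEW_WORDS for w in words):
        return Action.MANUAL_REVIEW, keyword
    converted = "-".join(CONVERT_WORDS.get(w, w) for w in words)
    if converted != keyword:
        return Action.AUTO_CONVERT, converted
    return Action.ACCEPT, keyword
```

In this sketch the review check runs first, so a keyword that matches both tables is held for staff rather than silently rewritten. That ordering is my own choice, not something staff have confirmed.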

(edited)

2.7K Messages

 • 

47K Points

3 years ago

Hello Marco and all,

I just want to share that I have manually cleaned up the "colour" keywords by converting them to "color." Some other contributors may have also helped with this cleanup even before I pitched in. And most of the keywords I converted were on 2020 and 2021 titles, so I suspect this cleanup work has been done in years prior.

I kept the spelling "colour" in a few keywords:

- colour-in-episode-title (539 titles) - For this one, there are way too many titles to change manually, so I included this keyword in my latest list for proposed future mergers and auto-conversions.

- colour-sergeant (1 title) - This is a Canadian term and should remain with the "colour" spelling.

- trooping-the-colour (16 titles) - This is a British term and should remain with the "colour" spelling.

- reference-to-living-colour (3 titles) - This is a band which is listed on IMDb (thus also requiring the "reference-to" prefix in the keyword).

- reference-to-ocean-colour-scene (2 titles) - This is a band which is also listed on IMDb.

(edited)

10.6K Messages

 • 

223.9K Points

Aye, excellent work.

2.7K Messages

 • 

81.5K Points

@keyword_expert Splendid work!