Giancarlo_Cairella's profile
Employee

 • 

500 Messages

 • 

42.2K Points

Thursday, February 4th, 2021 7:52 PM

New rule regarding the display of Chinese and some other Asian names

Until recently, our name formatting rules included a paragraph outlining a requirement to submit Chinese names in Western format (i.e. “family name, first name”). For example, actor Chow Yun-Fat was listed on the site as Yun-Fat Chow (because "Yun-Fat" is the family name).

We have now removed this requirement. Effective immediately, when adding a new name or correcting an existing one, please enter it in the same format it’s supposed to be shown (e.g. “Chow Yun-Fat”) and/or in the way preferred by the person to whom it refers (if known) and our submission system will change it so that it is properly stored and displayed.




 

10.7K Messages

 • 

225.4K Points

4 years ago

Unless I'm mistaken, I'm a little disappointed that a change to the formatting rules has been undertaken instead of the system being upgraded in such a way that ensures every person's family name would be displayed before his or her given name, in cases where it would be appropriate, even though the names are stored with the family name followed by a comma followed by the given name. As I understand it, this would require an additional data field (i.e. a column) in the filmography item data type, or an equivalent hack. Why not simply provide a way for contributors to override the aspect of the submission form related to the bit about "The following fixes have been applied automatically" (which happens whenever a comma is omitted from the name field)? On a side note, why not provide a way for contributors to input names that contain single-letter names?

10.7K Messages

 • 

225.4K Points

Exactly! We know that IMDb has a team that can implement these kinds of things.

36 Messages

 • 

486 Points

4 years ago

Most Asian, Arabic and Eastern European people do not have an official English translation of their names, so many people are listed on IMDb more than once under different spellings, or cannot be found under the correct name. There should be a text field for names that allows contributors to add the original name signs/letters, like an a.k.a. section for those names and their pronunciations in different languages, as has been done with the movie titles (if there is no official translation from the person or movie title itself). With the original signs/letters, a name or a movie title would be found more easily, or sometimes found at all.
For example: Chow Yun-Fat's original name is 周潤發. Spoken, it would be Zhōu Rùn Fā, so not quite the displayed name. But the media, agents or the person himself have established Chow Yun-Fat as the international spelling. That would be his IMDb display name.
Some people have the same original name like 周, but some are spelled Chow, some are spelled Zhou, so it is very difficult to find those people in the database. That's why I propose to establish a db-text field for the original name signs/letters.

10.7K Messages

 • 

225.4K Points

For the longest time, IMDb has had problems storing text in a charset other than 7-bit (128-character) ASCII, and apparently the problems still partially exist, but there is an increasing amount of support for other scripts, interestingly via UTF-8.
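As a rough illustration of the limitation (a sketch only; it assumes nothing about how IMDb actually stores text), the Chinese name quoted earlier in the thread cannot be encoded as ASCII at all, while UTF-8 handles it and leaves plain ASCII text unchanged:

```python
def fits_in_ascii(text: str) -> bool:
    """Return True if every character is one of the 128 ASCII code points."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


latin_name = "Chow Yun-Fat"
original_name = "周潤發"  # the original-script name quoted earlier in the thread

print(fits_in_ascii(latin_name))     # True
print(fits_in_ascii(original_name))  # False

# UTF-8 encodes each of these three characters as three bytes.
utf8_bytes = original_name.encode("utf-8")
print(len(utf8_bytes))                             # 9
print(utf8_bytes.decode("utf-8") == original_name)  # True

# ASCII text is byte-for-byte identical in UTF-8.
print(latin_name.encode("utf-8") == latin_name.encode("ascii"))  # True
```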

36 Messages

 • 

486 Points

I don't know since when it has been possible to save Asian characters, but I like being able to search for the originally written title, because it's often impossible to search for, e.g., a Russian movie: the title spelled in Latin letters either never turns up anywhere on the net or turns up in different spellings.

UTF-8 is by far the better approach because of its increasing usage on the web. It's like the CD compared to audio tape. IMDb coders should put their main effort into functionality rather than looks. I still use the classic layout because I miss information at the head of the page and I can't select/copy the names anymore.

78 Messages

 • 

1.5K Points

4 years ago

The “/Asian” is extremely vague as to what comes under this rule and what doesn’t.

What about Japanese names? English Wikipedia and The Movie Database put these by default with the family name last, except with pre-Meiji names and art names of performers of traditional Japanese arts. But the Japanese government is pushing for them to be romanised in the order they are pronounced in Japanese (in which the family name is usually first) and some bodies with links to the government like the NHK, UniJapan (including with their JFDB, which is a good resource for finding out the order names are pronounced in Japanese) and various countries’ branches of the Japan Foundation follow this rule.

What about Hungarian names, in which the family name comes first, but which are not Asian?

What about mixed Asian/Western names, like Kang Daniel and Jackie Chan? These are most often written with the family name last, but not always: Kang Daniel is always romanised officially with the family name (Kang) first.

What about people with completely East Asian names but who live or lived primarily in countries where family names normally come last, like Sessue Hayakawa, Gok Wan, Kazuo Ishiguro or Yoko Ono?


How should the second syllable of a two-syllable name separated by a hyphen be capitalised? Both English Wikipedia and The Movie Database specify that the second syllable should by default not have a capital letter (e.g. Chow Yun-fat), the only exceptions being when it can be proved that the capitalised form is preferred by the person, or is otherwise official or the form conventionally used in English. The example suggests that capitalising both syllables is preferred, but there is no clear statement either way on this issue.

And what about character names – does the change affect them at all?

I am really glad that IMDb is doing something about this, though my enthusiasm might not show from what I’ve written above, as it will eliminate all those “(as …)” attributes which only differ in name order being added and mean that cinemas which copy information from IMDb will be getting these names in the correct order. But such a drastic change in policy as this needs a detailed explanation of what it affects and does not to accompany it.

(edited)

10.7K Messages

 • 

225.4K Points

I find it problematic that the IMDb company didn't bother to open up a discussion on these matters prior to amending the policy concerning them. I've complained before about them doing things like this. What really needs to be done is an overhaul whereby the proper display can be customized, or something halfway to that effect, in such a way that it is still clear which name part is the family name and which name part is the given name. Alphabetical ordering of people by full name factors into why one part of a name is distinguished from the other, but glyph-based languages are not alphabetic in nature. This "quick fix" breaks the ordering system whereby family name takes precedence over given name.

Champion

 • 

7.4K Messages

 • 

276K Points

Jeorj: I agree. Why not post a proposal for the rule change here, let people comment on it, then implement the rule? That would at least show that these issues had been taken into account.

Employee

 • 

500 Messages

 • 

42.2K Points

This change is limited in scope and is meant to address a longstanding issue with some names, which were sometimes stored and displayed in a format that often differed from how the person typically represented themselves. The original reason for that strict formatting was based on technical requirements (mainly sorting) that are no longer prevalent, so we are no longer asking submitters to adhere to it. It doesn't affect character names (which are always meant to be entered and displayed as they appear on-screen in cast listings) and it only affects those names that were previously stored in that particular format despite not being the same as the most common or preferred one. So, Gong Li can now be listed as "Gong Li" (the most commonly used and preferred variant of the name) instead of "Li Gong", but Jackie Chan will remain listed as Jackie Chan as it's always been.

The announcement specifically singled out Chinese names because they were the names that were most commonly affected by the previous restriction, but any name that is currently stored on IMDb in a format inconsistent with the way the person to whom it refers typically uses it can and should be corrected accordingly -- there are likely several Korean names that fall in that category, and possibly others (mostly, but not exclusively, Asian).
  

10.7K Messages

 • 

225.4K Points

One of my main hangups has to do with the fact that the submission form always forces the presence of a comma in form fields that refer to people's names, reversing the parts of the full name around, which makes the process confusing in regards to names transliterated from Chinese and Korean, or the names of people in societies wherein everybody's full name in the language corresponding to the society is formed by the expression of the person's family name followed by the person's given name. I've observed the submission interface since this announcement, and I can tell there is absolutely no mechanism in place that detects full names transliterated from Chinese or Korean. For names like that, there shouldn't even be a comma in the stored rendition of the name! The problem would be solved.

Employee

 • 

500 Messages

 • 

42.2K Points

It's true that the legacy 'name1, name2' format separated with a comma has several limitations, but for now we have to stick with it.

We will eventually address that and possibly move to a different format, but that's a much larger project that we can't embark on at this time.

78 Messages

 • 

1.5K Points

I appreciate that Giancarlo_Cairella has taken the time to reply to some of the concerns other contributors and I have voiced, but “possibly others” is still an inadequate definition of which people are and are not affected by this change.

China is the most populated country in the world currently, but Asia also includes many of the world’s other most populated and most prolific film-and-TV-making countries.

In particular, Japan was the most prolific country after the USA for film production through a large chunk of the 20th century, but there has been no specific guidance so far on whether this change applies to Japanese names or not.

And what kinds of Korean names fall into this category, and which do not? South Korean cinema and TV has been attracting a lot of attention even in mainstream Western media in the last few years, so how much the policy affects the credits on them is also important to define.

“The way the person to whom it refers typically uses it” is a good start, but it is not yet a detailed enough rule for determining the correct order in many cases, because many people are credited by or give their name in different orders and with different spacing or hyphenation depending on the context.

For example, there is a popular actor and musician whose name is written in Japanese as 星野源, which is pronounced as HOSHINO Gen. On the cover images of his music releases from a few years ago or more, his name is written in Roman characters as HOSHINO Gen. But, on the cover images of his most recent music releases, it is written as Gen HOSHINO. JFDB orders it as HOSHINO Gen. On the front page of his official website, it is currently written as Gén HOSHINO.

Hungarian names are also conventionally switched into family-name-last order when used in English-language contexts. E.g., the site of animation studio Kecskemétfilm refers to its best-known film director as JANKOVICS Marcell on its Hungarian version but Marcell JANKOVICS on its English version.

So, to begin with, are we to use the order in which the name is pronounced in the context of the person’s native language? Or the order it is written in when it is used in an English-language context?

If it is the latter, how do we decide which order and which spelling to use when this differs across different times and different contexts?

And what do we do if we have no way of telling which order the person prefers? This is the case with most people who only worked before the Internet age (except for the few names that broke through in the West and some specific regions like Hong Kong where romanisation was more common); it is still the case with most credits other than cast, directors and the lead staff that make it into billing blocks.

There needs to be, at the minimum, a default order, romanisation system and format for spacing multi-syllable names for each culture affected to fall back to when the preferences of the person/their management/their estate cannot be determined.

Again, I do not object to this change in policy in itself (far from it; I think it’s come about 30 years too late), but I feel that the rules of it need to be much more fleshed-out before it’s ready to put into practice.

(edited)

78 Messages

 • 

1.5K Points

There are, certainly, a lot more ways of formatting names in the world than <personal name> <family name> or <family name> <personal name>, across Asia and other parts of the world.

Another one is Icelandic names, which take the format <personal name> <patronymic> (or, rarely, <personal name> <matronymic>), and when they are listed alphabetically they're sorted by the personal name. Because IMDb assumes that everyone in the world uses the <personal name> <family name> format, it mistakes the patronymics for family names and incorrectly sorts names by them rather than the personal name.

Back when IMDb was started, around 1990, China was known for having more than the rest of the world's population combined, so, at least then, there were definitely more people alive with a family name that came first than with a family name that came last (even more so when one factors in Japan and Korea). If the founders were planning on it becoming a global database that included all people with a credit on professional moving-image media anywhere in the world back then (which, to be fair, I don't imagine they were), then it would have made logical sense to work in a feature then that allowed the majority of the world's names to be both displayed correctly and alphabetised correctly, and it would have saved the current staff from the difficulty of starting to undo this oversight over 30 years later. But, well, that wasn't to be.

However, regarding finding out people's real names: what someone's real, legal name is does not determine what their displayed name on IMDb should be.

If someone's legal name has been made public, then it should be mentioned in their biography. As should their birth name, if it differs, and that should also be in the field specifically for those.

However, the guide for names states:

The primary name for a person in IMDb is the one by which they are most often credited.

So knowing a person's real, legal name is not needed for this. It goes on how their name is currently most commonly written in on-screen credits. When someone's name has been given as inconsistently in on-screen credits as daniel_francis_gardecki is describing, it is difficult to decide what is recent enough to be considered current, and there are no more detailed guidelines to help with these cases. I would suggest trying to find out what the person prefers themselves (from what is used on their or their agency's or production company's website and/or public social media accounts). If that is consistent, and is also a name they are credited by on screen in at least a few instances, go with that, and provide links to those official sites and accounts and stills of credits on screen under that name with any submission to change the primary name.

But what their legal name is does not, as a rule, affect what their primary displayed name should be.

(edited)

5 Messages

 • 

104 Points

4 years ago

Nice consideration. It will now be better and convenient for the Chinese/Asian members. Hope it won't affect any functionality in any way.

78 Messages

 • 

1.5K Points

@RoberMiller With the way it has been implemented, it affects functionality in one significant way: it prevents names from being sorted alphabetically correctly.

Now names with the family name last will still be sorted by the family name, but names with the family name first will be sorted by the personal name.

It also makes names confusing to enter. To enter, say, Zhang Yimou so that it is displayed correctly (with the family name, Zhang, first), one needs to enter it as “Yimou, Zhang”, which is in itself incorrect.

On the plus side, names with the family name first will now be displayed correctly, which is arguably worth the trade-off in functionality. But it’s possible to imagine a way in which the change could be implemented that would take a lot more work on IMDb’s part but would preserve correct alphabetical sorting and name entry, by people having an attribute that would decide which order their names are displayed in (that could also be used to make family names capitalised, like they are on JFDB, which would make which order the name is in always clear).
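To make the trade-off concrete, here is a minimal sketch of the two approaches. It assumes the legacy storage is 'name1, name2', displayed as 'name2 name1' and sorted by name1, as described in this thread. The `StoredName` class and its `family_first` attribute are hypothetical names for the display-order attribute imagined above, not anything IMDb has.

```python
from dataclasses import dataclass


def legacy_display(stored: str) -> str:
    """Turn legacy 'name1, name2' storage into the displayed 'name2 name1'."""
    first, _, second = stored.partition(", ")
    return f"{second} {first}" if second else first


def legacy_sort_key(stored: str) -> str:
    """Legacy sorting uses whatever sits before the comma."""
    return stored.partition(", ")[0].casefold()


@dataclass
class StoredName:
    """Hypothetical record: family and given name kept apart, plus a display flag."""
    family: str
    given: str
    family_first: bool = False  # assumed attribute, set per person

    def display(self) -> str:
        if self.family_first:
            return f"{self.family} {self.given}"
        return f"{self.given} {self.family}"

    def sort_key(self) -> tuple:
        # Always sort by family name, whatever the display order is.
        return (self.family.casefold(), self.given.casefold())


# Current workaround: the family name has to go in the "given name" slot.
print(legacy_display("Yimou, Zhang"))   # Zhang Yimou
print(legacy_sort_key("Yimou, Zhang"))  # yimou (sorted under Y, not Z)

# With the hypothetical attribute, entry, display and sorting all stay correct.
zhang = StoredName(family="Zhang", given="Yimou", family_first=True)
chan = StoredName(family="Chan", given="Jackie")
print(zhang.display(), zhang.sort_key())  # Zhang Yimou ('zhang', 'yimou')
print(chan.display(), chan.sort_key())    # Jackie Chan ('chan', 'jackie')
```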

(edited)

36 Messages

 • 

486 Points

I think, if you add e.g. cast to a movie, the automatic db-search-function when you enter the name should work. There should be (as with the movies) a db-section with alternate names. If the name isn't found or new, there should be a field to enter the original written name (as credited without the comma) to create the new entry and another field to select the regional origin to format the name for the db.
So if you add 濱口竜介, there should be a selection for "Japanese" to write the name as is. And if the name has an entry there should be a field for alternate names like Ryûsuke Hamaguchi and/or Ryusuke Hamaguchi, and a field to format this name, in this case "Latin" (or do it the other way around).

But with the amount of names that would have to be checked out (and often merged because of the different spelled/written name entries) it seems to be an enormous undertaking. But it should be done. And if you then would search for 濱口竜介 there should be the director's entry of Ryûsuke Hamaguchi popping up/down.
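A minimal sketch of the lookup being proposed, assuming each person has one display name plus any number of alternate spellings. The `NameIndex` class and the person id `"p1"` are made up for illustration and do not reflect IMDb's real data model.

```python
import unicodedata
from collections import defaultdict


class NameIndex:
    """Hypothetical index mapping every known spelling to person ids."""

    def __init__(self):
        self._by_spelling = defaultdict(set)

    @staticmethod
    def _normalise(text: str) -> str:
        # Unicode-normalise and ignore case so equivalent inputs match.
        return unicodedata.normalize("NFKC", text).casefold().strip()

    def add(self, person_id: str, display_name: str, alternates=()):
        for spelling in (display_name, *alternates):
            self._by_spelling[self._normalise(spelling)].add(person_id)

    def search(self, query: str) -> list:
        return sorted(self._by_spelling.get(self._normalise(query), set()))


index = NameIndex()
index.add("p1", "Ryûsuke Hamaguchi", ["Ryusuke Hamaguchi", "濱口竜介"])

print(index.search("濱口竜介"))           # ['p1']
print(index.search("ryusuke hamaguchi"))  # ['p1']
print(index.search("Hamaguchi Ryusuke"))  # [] (this order was never registered)
```

The last search shows why every spelling and every name order would still have to be entered by someone, which is the enormous undertaking mentioned above.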

78 Messages

 • 

1.5K Points

I'm afraid that if you're hoping for more Unicode support any time soon, you're setting yourself up for disappointment. It took about 30 years since the first version of Unicode was released for support for Japanese titles to be added. Considering that, I can't see IMDb supporting all writing systems for both titles and names within my lifetime – or, even if others come a lot quicker than that, not in the next few years, anyway.

I don't mean that as a complaint; I'm not a programmer, and I doubt that I could have ever done what they have so far in working in what support there is in any amount of years.

There are other, more-recently started databases of movies with full support for Unicode, which is much easier for them as they have the great advantage of Unicode already being around when they started, and so they were able to use it from the beginning. IMDb came a bit too early; converting it over has been an ongoing project in some form for most of its history – which is understandable in that it cannot be easy to convert something that is constantly having data added to and changed on it, but, were the staff to block changes to it, the database would risk losing its relevance.

(edited)

10.7K Messages

 • 

225.4K Points

Support for Han glyphs and Kanji glyphs recently came into existence for movie alternate title items, so now it is just a question of doing the same thing with names of people.

78 Messages

 • 

1.5K Points

It's not, unfortunately, because people and companies don't have akas in the same way movies do, to begin with.

People only have akas as a result of the "(as …)" attribute being used with their name. The only way to add non-Roman akas would therefore be to make attribute fields Unicode.

78 Messages

 • 

1.5K Points

4 years ago

After thinking about this for a while (I do mostly correct and add data on Chinese, Japanese and Korean content, so it has an enormous impact on my contributing), I'm wondering from how Giancarlo_Cairella's posts have worded things but not stated it explicitly, if IMDb is preparing to drop any division between family and personal names, storing each name as a single string of text which can only be sorted by the very first letter, as there's no indication in it of which part it should be sorted by?

This is how The Movie Database deals with names, and I can see the logic in it.

It breaks sorting by family name when the family name comes last, but the only part of IMDb that I use which alphabetical sorting majorly affects is the lists of credits in each staff section, which are currently ordered by the people's names. If these were ordered alphabetically by the name of the role, not the person, that would (a) make a lot more sense, in my opinion, and make them easier to read through and (b) mean what part of their names people are sorted by would be less important, as there are typically only a relative few people with exactly the same job name on a production.

Casts are automatically sorted alphabetically when too few of them have order numbers, but that in most cases is only a temporary measure until order numbers are added.

If that change happens, I'm sure many will be outraged. Adding in the function of custom display orders for names might be better, but removing the division might be easier to implement and would still be a big improvement over the current situation, in my opinion.

It would make it far easier to deal not only with different cultures' formats for names but also the names of bands like "Guns n' Roses" (IMDb would consider Roses the family name and sort it under R), people known by titles like "Queen Elizabeth II" (IMDb would consider II the family name and sort it under I), and people who use descriptive stage names like "Meat Loaf" (IMDb would consider Loaf the family name and sort it under L). We humans know that those interpretations are wrong, but non-AI programming cannot tell that.

The same happens to names of Korean people who go by a personal name with a space between the syllables, most famously BTS member Jung Kook. IMDb would consider Kook the family name and sort it under K. Which is like considering Lett to be Scarlett Johansson's family name and sorting her under L. I know that interpretation is wrong, but not only can unintelligent programming not tell that, people who don't know anything about Korean names can also not tell that.
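The behaviour described above amounts to a "last word is the family name" rule. Here is a small sketch of that rule next to plain first-letter sorting. Both functions are illustrative guesses, not IMDb's actual code.

```python
def naive_sort_letter(display_name: str) -> str:
    """Assume the last space-separated word is the family name."""
    return display_name.split()[-1][0].upper()


def first_letter_sort(display_name: str) -> str:
    """Sort by the very first letter of the name as displayed."""
    return display_name[0].upper()


for name in ["Guns n' Roses", "Queen Elizabeth II", "Meat Loaf", "Jung Kook"]:
    print(f"{name}: naive={naive_sort_letter(name)}, first-letter={first_letter_sort(name)}")
# Guns n' Roses: naive=R, first-letter=G
# Queen Elizabeth II: naive=I, first-letter=Q
# Meat Loaf: naive=L, first-letter=M
# Jung Kook: naive=K, first-letter=J
```

First-letter sorting needs no knowledge of the name's structure, which is its appeal. The cost is that a family-name-last name such as Jackie Chan would then be filed under J rather than C.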

Currently, these "special" names can only be made to be sorted correctly by contacting IMDb staff, which is a hassle for everyone involved. For a non-anglophone name like Jung Kook it requires understanding how names can be formatted in different cultures, which IMDb do not have staff that are experts in for all the cultures of the world, and that means people requesting the names to be corrected have to explain the cultural context each time.

Sorting all names by the very first letter of them is not traditional in most of Europe or America, but the great advantage of doing that over any other conceivable solution is it doesn't require specialist knowledge either on the part of IMDb or contributors.

I'm old enough to remember that IMDb used to require leading articles of English-language titles to go at the end, after a comma: the primary title of The Magnificent Seven would be displayed that way but stored as "Magnificent Seven, The". I suspect this practice was ended because (a) IMDb staff didn't have specialist knowledge of sorting conventions for all the world's languages (which can be complex: French traditionally treats titles beginning with definite and with indefinite articles differently) and (b) the system was getting confused by titles with commas in them, and the number of them needing to be specially corrected by staff was too much for the staff to handle.
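To show how commas inside titles could trip up such a scheme, here is a hedged sketch. I don't know how IMDb implemented it; the two functions and the second title are my own illustration. Splitting at the first comma mangles a title that contains a comma of its own, while splitting at the last comma and checking for a known article does not.

```python
ARTICLES = ("The", "A", "An")  # English only; other languages need their own rules


def naive_display_title(stored: str) -> str:
    """Split at the FIRST comma and swap the two halves."""
    head, sep, tail = stored.partition(", ")
    return f"{tail} {head}" if sep else stored


def careful_display_title(stored: str) -> str:
    """Split at the LAST comma, and only swap if the tail is an article."""
    head, sep, tail = stored.rpartition(", ")
    if sep and tail in ARTICLES:
        return f"{tail} {head}"
    return stored


print(naive_display_title("Magnificent Seven, The"))
# The Magnificent Seven
print(naive_display_title("Good, the Bad and the Ugly, The"))
# the Bad and the Ugly, The Good
print(careful_display_title("Good, the Bad and the Ugly, The"))
# The Good, the Bad and the Ugly
```

Even the careful version depends on a per-language list of articles, which is the specialist knowledge problem in point (a).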

That was a break from tradition which it appears everyone is now used to, as I see no one complaining about it and asking it to be retracted in these forums. Maybe simplifying how names are sorted may end up being considered just as normal after a similar amount of time?

111 Messages

 • 

1.5K Points

3 years ago

I notice nobody mentioned Arabic names in particular... probably because they get so LONG, with 20 or more names in the string, including several REPEATED, particularly prefixes (or whatever you call it, like "von" or "de" in Germanic names, which can open up another naming problem like how do you alphabetize by family name if it's preceded by de or von or van - I often see the name under D for "de" or V for "von" or "van", not to mention that those prefixes are not capitalized usually...)

Where was I? This is Soooo confusing. Okay, so the Arabic prefix I see commonly is "AL" or "EL", but other names, could they be given? middle? family? are sometimes repeated in the full name.

The long Arabic names occur because the full name describes the person's ancestors and linked families. (Reminds me of the Spain convention, carried over into some early latino-american personages, of the family name always including both the father's and mother's (maiden) names separated by "y" (Spanish for "and"). Enrique Granados y Campiña, always listed without the name after the y (I can't remember which is father, which is mother, I think mother is first), so file under G. But what about those who are known more by the second name? Using western comma conventions, the family name given first could be the one AFTER the "y"...)

For another digression, there's the confusing (to me) Slavic practice of adding a suffix to a woman's name, like "a" or "ova" or "aya", sometimes even modifying the final letter to add an "a" BEFORE it, as well as a suffix after it (Tarkovsky -> Tarkovskaya) that in some societies means she's married, in others it's added at birth.

At least with Korean names, you can always tell which is the given name because it's the hyphenated one. Sorta like the first and middle names, hyphenated together. Then there are Chinese names with hyphenated family names, who act in Korean movies...

Even without a hyphen, though, how does one tell which name is family and which is given? Zhang Dong vs Dong Zhang. I can see previous editors have been confused because some of these are the same people, need to be merged.

Oh, and there doesn't even seem to be any common convention when it comes to gender. Zhang or Dong can be male or female. Maybe only the middle name will tell you: Young-jin Jo (actor), Young-ji Jo (actress).

Anyway, how would one figure out what part of a long Arabic name is the family name, when the full name actually contains SEVERAL family names? Check out Star Trek Deep Space Nine's Alexander Siddig https://www.imdb.com/name/nm0796502/bio credited originally as Siddig El Fadil, with "Siddig" technically (?) being his (or one of his) family name, later changing his credit to Alexander Siddig, by western conventions, even though "Alexander" doesn't even feature in his full name (he just made it up), which is (get ready) Siddig El Tahir El Fadil El Siddig Abderrahman Mohammed Ahmed Abdel Karim El Mahdi (I love it). And his nickname is Sid, so uh maybe Siddig is a given name not a family name... Or maybe it's BOTH?? It occurs once at the beginning, and again later with an EL prefix, so I dunno... Good thing he doesn't insist on being listed with his full name, or the IMDb computers would have the CPU equivalent of an electron heart attack.

Better not think about it too much. Just list names any old way. That seems to be the reasoning behind IMDb lifting the restrictions on Asian names. I agree though that the comma should now be considered obsolete, and the editing form shouldn't automatically reformat the name you entered without a comma by switching the names around and adding it.

However, it would be nice to know what a person's family name was. I feel a bit weird referring to Sion Sono (or Sono Sion) as just "Sono" or "Sion" if one of those is his given name (I can never remember which!); it's a bit too forward and disrespectful, especially considering Japanese cultural politeness.

Sometimes it's not confusing at all. For instance, I've never seen Akira Kurosawa's name in conventional Asian format; "Kurosawa Akira" looks just odd to me, and I'd never speak it that way, even if it's more correct.

(edited)

10.7K Messages

 • 

225.4K Points

For sure, the announcement represents something of a quick fix. The technical underpinnings of the submission form itself are unchanged.

278 Messages

 • 

6K Points

1 year ago

I apologize for bringing up this topic, but actor Chow Yun-fat's family name is Chow, not Yun-fat as written in the opening post.

(because "Yun-Fat" is the family name)

10.7K Messages

 • 

225.4K Points

You're right, plur62. I didn't notice that before, but I understood what was meant.

278 Messages

 • 

6K Points

1 year ago

By the way, there are no rules for Korean names on the site, so there are thousands where the family name comes first (as it should be) and also thousands where the given name comes first. It's a complete mess.

I'm trying to fix it, but it's a lot of work, and my contributions are often declined for no reason.

(edited)

10.7K Messages

 • 

225.4K Points

I kind of wish the submission form had two separate fields for family name and given name, but that's just a wish versus the reality of how tough it is for IMDb developers to amend the fundamental paradigms upon which the database is built (and thereby add things like new genres among other things). Maybe someday, it will all be fixed.

78 Messages

 • 

1.5K Points

At the least, it doesn't seem impossible that there could be a way to flip around names en masse in light of this policy change (if you think it’s a lot of work to individually correct Korean names affected by it, you don’t want to start thinking about how many more Chinese names are affected), considering the family and personal names are all separated by commas in IMDb’s system.

(edited)

10.7K Messages

 • 

225.4K Points

I'm worried that it might be tricky to discern which names have or haven't already been flipped around. I mean, this would be easy enough for those who have a two-syllable given name, but for the few who have a one-syllable given name and a one-syllable family name, it won't be so easy, and of course this is in reference to Chinese names and Korean names, as Japanese names are a whole other ball game.
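A rough sketch of that check, assuming 'name1, name2' storage and assuming a hyphenated part is always the two-syllable given name. That assumption is a heuristic only; hyphenated family names were mentioned earlier in the thread and would break it. The function name is made up for illustration.

```python
from typing import Optional


def already_flipped(stored: str) -> Optional[bool]:
    """Guess whether a 'name1, name2' record already displays family-name-first.

    Returns True if the hyphenated (assumed given) name sits before the comma,
    False if it sits after the comma, and None when the record is ambiguous.
    """
    first, sep, second = stored.partition(", ")
    if not sep:
        return None
    first_hyphenated = "-" in first
    second_hyphenated = "-" in second
    if first_hyphenated == second_hyphenated:
        return None  # both or neither hyphenated: cannot tell
    return first_hyphenated


print(already_flipped("Chow, Yun-Fat"))  # False (would display as Yun-Fat Chow)
print(already_flipped("Yun-Fat, Chow"))  # True  (would display as Chow Yun-Fat)
print(already_flipped("Li, Gong"))       # None  (one syllable each, ambiguous)
print(already_flipped("Gong, Li"))       # None
```

The `None` cases are exactly the ones that would still need a person to look at them.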