taewong's profile

6 Messages

 • 

370 Points

Mon, Mar 25, 2013 12:37 PM

Support for Unicode.

Unicode is not fully supported on IMDb. For example, in Polish, you could search for every reference to “milosc” and change it to “miłość”. Jiří Hnídek is written without the r-háček (ř) in their first name. The same applies to the ILM person Coşku Özdemır, a Turkish person listed on Cinefex.

Responses

1 Message

 • 

100 Points

8 y ago

I agree. This affects titles, names, characters, discussion, and probably more.

Where I run into the problem most is in the discussion forums. If you paste non-ASCII characters copied from somewhere else (for example, to show a symbol that was in the film, or indeed to show the native-language title of the film), they just get turned into what appears to be the HTML text code for those characters, instead of the symbol itself.
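The symptom described above resembles HTML numeric character references being shown literally instead of being decoded. Here is a minimal Python sketch of that round trip. It assumes (this is a guess, not something IMDb has confirmed) that the forum software converts non-ASCII input into such references; the helper names are made up for illustration.

```python
import html


def to_ascii_with_references(text: str) -> str:
    """Replace each non-ASCII character with an HTML numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_references(text: str) -> str:
    """Decode HTML character references back into the characters they stand for."""
    return html.unescape(text)


escaped = to_ascii_with_references("miłość")
print(escaped)                   # mi&#322;o&#347;&#263;
print(from_references(escaped))  # miłość
```

If a page stores the escaped form and then escapes it again on output, the reader sees the raw `&#322;` text rather than the letter.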

It's 2013. This shouldn't be happening.

Champion

 • 

14.3K Messages

 • 

421.7K Points

Since this is the International Movie Data Base, it is truly surprising that they do not support Unicode.

Champion

 • 

4.6K Messages

 • 

236.3K Points

Except that it's "Internet Movie Database," not "International." ;)

Champion

 • 

14.3K Messages

 • 

421.7K Points

This must be a Freudian slip. [wink]
Reminder to self: Don't post when tired.

Champion

 • 

4.6K Messages

 • 

236.3K Points

LOL. To be honest, though, it's almost like they want it to be known that way. Most mentions of the spelled-out name are gone. Kind of like Kentucky Fried Chicken is only KFC now. New visitors seem to be having a hard time figuring out what the site is...video streaming, file sharing?

Champion

 • 

1.9K Messages

 • 

146.1K Points

Or after taking some random prescription medication you found lying in the meep.

6 Messages

 • 

370 Points

Yeah. Do not post nonsense. You have accidentally removed a comment (you need to dispute this removal).

6 Messages

 • 

370 Points

8 y ago

Since MobyGames supports Unicode, macrons in Japanese are OK for long vowels. Note that the title ends with a punctuation mark (full stop). Hungarian, Czech, Polish, Romanian, Slovak etc. require a bunch of accented letters.

Champion

 • 

14.3K Messages

 • 

421.7K Points

8 y ago

It is almost like Randall Munroe has been reading this forum.
http://xkcd.com/1209/

6 Messages

 • 

370 Points

8 y ago

You quote the comic: “The Skywriter we hired has terrible Unicode support.”

After correcting Miroslav Kure's surname to Miroslav Kuře (to match Czech support: the Danish/Faroese/Norwegian ø should be ř, r-caron) in Battle for Wesnoth 1.11.1 contribution community, you have many problems with the Internet Archive Wayback Machine this time. First the connection is too slow to load, and you get the error message “The machine that serves this file is down. We're working on it.” twice. Unicode in their own forum affects subjects (titles) and more. Note that the thread has nonsense!
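One possible reading of the ø/ř remark is a code-page mix-up: the byte 0xF8 is ř in ISO-8859-2 but ø in ISO-8859-1. That reading is an assumption, since the post does not say which encodings were involved. A short Python sketch of the mismatch, with hypothetical helper names:

```python
def misread_latin2_as_latin1(text: str) -> str:
    """Encode text as ISO-8859-2, then wrongly decode the bytes as ISO-8859-1."""
    return text.encode("iso-8859-2").decode("iso-8859-1")


def repair(text: str) -> str:
    """Undo the mix-up by reversing the two steps."""
    return text.encode("iso-8859-1").decode("iso-8859-2")


garbled = misread_latin2_as_latin1("Kuře")
print(garbled)          # Kuøe
print(repair(garbled))  # Kuře
```

Storing the name as Unicode avoids the problem, because the character no longer depends on which single-byte table the reader happens to use.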

Champion

 • 

1.9K Messages

 • 

92.6K Points

8 y ago

This has been mentioned many times over the past few years. A bit of history may help here.

When IMDb first started, it was updated by an automated email system. This was at a time when some email routers still handled only 7-bit ASCII, and special encoding was needed to ensure that 8-bit codes would not be trashed. Moreover, some characters (e.g. |, the 'pipe') were used internally (and in the email) as controls/delimiters. This is why you may sometimes see older contributors indicate a credit update as:

John Doe | 2nd Pirate | 22
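To illustrate why a pipe-delimited, 7-bit format is awkward to extend, here is a rough Python sketch. The field names (`name`, `role`, `order`) and both helpers are hypothetical: the post does not say what each column means or how IMDb actually parsed these lines.

```python
from typing import NamedTuple


class Credit(NamedTuple):
    name: str   # hypothetical field name
    role: str   # hypothetical field name
    order: str  # hypothetical field name


def parse_credit(line: str) -> Credit:
    """Split a 'name | role | order' line on the pipe delimiter."""
    parts = [part.strip() for part in line.split("|")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}")
    return Credit(*parts)


def is_7bit_safe(line: str) -> bool:
    """True if every character fits in 7-bit ASCII (code point below 128)."""
    return all(ord(ch) < 128 for ch in line)


print(parse_credit("John Doe | 2nd Pirate | 22"))
print(is_7bit_safe("John Doe | 2nd Pirate | 22"))     # True
print(is_7bit_safe("Jiří Hnídek | 2nd Pirate | 22"))  # False
```

In a scheme like this, any new character has to be checked against every place that treats particular characters as delimiters or assumes 7-bit input.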

By the time Unicode became standard, the system had grown quite complex. Before Unicode can be implemented, every part of the system needs to be checked and potentially modified to ensure that it will not be broken by any of the Unicode codes.

IMDb is currently in the process of moving the various lists (sections) to new internal systems. I hope and expect that they are designing these systems so that they will be able to support Unicode.

Once the moves have been completed, we may see support for Unicode, but don't expect it any time soon.

6 Messages

 • 

370 Points

8 y ago

You will need an answer. You have removed the first reply by accident. Where a name includes a suffix, we use a comma to separate it from the name. On game credits and indexes it is not treated as an integral part of the surname. Examples are:

Hernandez, Jonathan, Jr
Rowe, William A., Jr.
Tibbetts, Richard S., III

I think that the Get Satisfaction software uses Unicode. It supports different accented characters for Eastern European languages.

Champion

 • 

4.6K Messages

 • 

236.3K Points

The change log says you removed it...??? What the..??

6 Messages

 • 

370 Points

This reply was removed on 2013-03-25.

Champion

 • 

4.6K Messages

 • 

236.3K Points

Yep. And:


3 months ago
taewong, the poster:
Removed a reply in this topic
Reason: removed by the poster

1 Message

 • 

60 Points

8 y ago

Actually, it seems that after the message-board makeover, Unicode support is even worse! At least with the old boards you could enter most extended ASCII glyphs (assuming the proper code page was set). But now anything above 127 doesn’t work.
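To make the "above 127" point concrete: ASCII covers code points 0 to 127, and the meaning of a single byte above 127 depends on the code page. A small Python sketch, with sample strings chosen only for illustration:

```python
from typing import List


def non_ascii_chars(text: str) -> List[str]:
    """Return the characters whose code point is above 127."""
    return [ch for ch in text if ord(ch) > 127]


print(non_ascii_chars("miłość"))  # ['ł', 'ś', 'ć']

# The same byte maps to different glyphs under different code pages.
byte = bytes([0xB3])
print(byte.decode("cp1250"))  # ł
print(byte.decode("cp1252"))  # ³
```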

1 Message

 • 

82 Points

7 y ago

It's 2014 and some Czech characters are still not supported.

8 Messages

 • 

160 Points

7 y ago

It's almost 2015 and Greek characters aren't supported AT ALL.

5 Messages

 • 

116 Points

7 y ago

This reply was created from a merged topic originally titled “How many years will it take you to understand UNICODE?”


In 2009, in Contact #3034383 (http://www.imdb.com/helpdesk/thread?tid=3034383), you, the owner of IMDB, promised professional usage of UNICODE "in a little while". It is now five and a half years later and your web site is still crippled with no UNICODE implementation. Five and a half years??? Don't you fill embarrassed with your "professionalism"? Shall we wait another 5 years for IMDB to understand the word "international"?

(This post is addressed solely and specifically to IMDb staff.)

5 Messages

 • 

116 Points

Correction: "don't you FEEL"