2.7K Messages
•
83K Points
Add a process to clean-up empty name pages on IMDb
I just came across the name page for a Grace Anne Cochran (https://www.imdb.com/name/nm5507911/). It's not a new name page, there are no credits listed, there is no other information listed and a check via the Edit Page also didn't show any credits. Therefore, I think this page should be deleted, but maybe there's more than meets my eye.
dan_dassow
Champion
•
19.4K Messages
•
477.1K Points
6 years ago
It looks like this name page has been on IMDb since at least 2014. There is one Internet Archive snapshot, which also shows that the page is blank.
https://web.archive.org/web/20170218081242/https://www.imdb.com/name/nm5507911/
Col_Needham
Employee
•
7.3K Messages
•
179.2K Points
6 years ago
This type of clean-up is best handled by an automated process which will fix the issue once and for all.
If there are other cases, please report them here and it will add weight to the case for creating this process. Unfortunately, it is simply not scalable to take one-off requests for every empty name, sorry. For a sense of scale: in 5+ years, 36,080 threads have been posted to Get Satisfaction, and we estimate that out of over 8 million names on IMDb, more than 36K are empty (< 0.5%, BTW).
Marco
2.7K Messages
•
83K Points
6 years ago
The same goes for the (incorrectly formatted) name page of Thomas Van der Vorst (https://www.imdb.com/name/nm6027204/).
dan_dassow
Champion
•
19.4K Messages
•
477.1K Points
6 years ago
name.basics.tsv.gz – Contains the following information for names:
- nconst (string) – alphanumeric unique identifier of the name/person
- primaryName (string) – name by which the person is most often credited
- birthYear – in YYYY format
- deathYear – in YYYY format if applicable, else ‘\N’
- primaryProfession (array of strings) – the top-3 professions of the person
- knownForTitles (array of tconsts) – titles the person is known for
Any person who does not have any knownForTitles would be a candidate for removal. The list of name pages without credits would still need to be curated to ensure that valid pages are not removed.
I suspect that ljdoncel (Champion) could readily produce this report, since he has access to SAS (Statistical Analysis System) and computing facilities that can handle the size of the data set.
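For anyone without SAS, the same filter could probably be done with a short Python script that streams the file row by row. This is only a rough sketch, not how IMDb would do it: the function names are made up for illustration, and it assumes that an empty knownForTitles is written as \N, the same marker the field list gives for deathYear. The result is a list of candidates only and would still need manual curation.

```python
import csv
import gzip
from typing import Dict, Iterable, Iterator

# Assumption: missing values are written as \N, as documented for deathYear.
MISSING = r"\N"


def candidates_without_titles(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield rows whose knownForTitles field is empty or marked as missing."""
    # QUOTE_NONE stops the csv module from treating quote characters in
    # names as field delimiters.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        known = row.get("knownForTitles") or MISSING
        if known == MISSING:
            yield row


def scan_file(path: str) -> Iterator[Dict[str, str]]:
    """Stream a gzip-compressed TSV so the whole file never sits in memory."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from candidates_without_titles(handle)
```

Something like `for row in scan_file("name.basics.tsv.gz"): print(row["nconst"], row["primaryName"])` would then print the candidate list for review.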
Marco
2.7K Messages
•
83K Points
6 years ago