IMDB Datasets no longer including some movies?
Sometime over the last two to three weeks (between the files I downloaded on 2022-07-10 and 2022-07-24), the IMDB datasets available from https://datasets.imdbws.com/ seem to have stopped including some movies.
Download https://datasets.imdbws.com/title.basics.tsv.gz, for instance, and try to find the following IMDB IDs, which were present on July 10 but not on July 24:
tt0044502 Clash by Night (1952)
tt0047573 Them! (1954)
tt0048977 The Bad Seed (1956)
tt0050539 The Incredible Shrinking Man (1957)
tt0053290 Solomon and Sheba (1959)
tt0056700 The Wonderful World of the Brothers Grimm (1962)
tt0057449 The Raven (1963)
tt0060980 The Silencers (1966)
tt0065421 The AristoCats (1970)
The same IMDB-ids seem to have disappeared from https://datasets.imdbws.com/title.ratings.tsv.gz as well.
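For anyone who wants to reproduce the check, here is a minimal sketch in Python. It assumes the files are gzip-compressed, tab-separated text with the ID (tconst) in the first column; the function names and the local file path are only illustrative.

```python
import gzip


def find_missing_ids(lines, wanted_ids):
    """Return the wanted IDs that never appear in the first (tconst) column."""
    remaining = set(wanted_ids)
    for line in lines:
        # Assumption: columns are tab-separated and the ID comes first.
        tconst = line.split("\t", 1)[0].rstrip("\n")
        remaining.discard(tconst)
        if not remaining:
            break  # every wanted ID has been seen, no need to read further
    return sorted(remaining)


def find_missing_in_file(path, wanted_ids):
    """Run the same check against a locally downloaded .tsv.gz file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return find_missing_ids(handle, wanted_ids)
```

Calling find_missing_in_file("title.basics.tsv.gz", ["tt0044502", "tt0047573", "tt0065421"]) on each download lists the IDs that are absent from that file. The same helper should work for title.ratings.tsv.gz, since its first column is also the ID.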
I re-downloaded the files on July 25 and the same entries were still missing.
What could explain this?
johnny_m
1 year ago
The dataset is broken. It now only includes 3,477,496 titles. It should have almost three times that number.
The data is corrupted after the title "Kneeling for Justice: A San Francisco Memorial for George Floyd". The value in tconst for that entry is "ial for George Floyd".
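A rough way to locate this kind of corruption is to count the rows and flag any tconst value that does not look like an ID. The sketch below assumes the first line is a header, that valid IDs are "tt" followed by digits, and that fields are split on tabs with no quote handling; the function name is only illustrative.

```python
import re

# Assumption: a valid title ID is "tt" followed by one or more digits.
TCONST_PATTERN = re.compile(r"tt\d+")


def scan_tconst_column(lines):
    """Count data rows and collect (line_number, value) for malformed IDs."""
    total = 0
    malformed = []
    for line_number, line in enumerate(lines, start=1):
        if line_number == 1:
            continue  # skip the header row
        # Split on the first tab only; quote characters are left untouched.
        tconst = line.rstrip("\n").split("\t", 1)[0]
        total += 1
        if not TCONST_PATTERN.fullmatch(tconst):
            malformed.append((line_number, tconst))
    return total, malformed
```

Passing it a handle from gzip.open("title.basics.tsv.gz", "rt", encoding="utf-8") returns the row count plus the line numbers where the first column is not a well-formed ID, which makes it easier to report exactly where the file goes wrong.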
Could someone at IMDb please correct this?
Thank you!