Since the dawn of mankind, one question has always been asked: “What can I do to ease my boredom?” People have tried various answers, such as drawing, painting, literature, sports, and games. One of today’s most common answers is to play a video game. One of the earliest known video games was Tennis for Two, created in 1958 to simulate tennis; Pong, released by Atari in 1972, brought a similar idea to a mass audience. As technology improved, new types of video games were created, along with devices to run them. Today, video games are a multi-billion dollar industry with a wide variety of games.
Video game genres appeal to different types of people and include strategy, action, adventure, racing, role-playing, sports, shooter, puzzle, simulation, platform, and fighting. Strategy games emphasize skillful thinking and planning to achieve victory. Racing games have the player participate in a racing competition. Role-playing games have players assume the roles of characters in a fictional setting. Sports games simulate the practice of sports. Shooter games focus almost entirely on defeating the character's enemies using the weapons given to the player, usually firearms. Puzzle games emphasize puzzle-solving. Simulation games are typically designed to closely simulate real-world activities. Platform games have the player move a character between points in an environment. Fighting games involve combat between pairs of fighters.
People can now play on handheld consoles, home consoles, PCs, smartphones, and other devices. Some iconic video games include Super Mario Bros., Pac-Man, Pokemon, Tetris, and Pong. Every day, video games become more commonplace. According to Statista, there are roughly 3.24 billion gamers across the globe. With a multi-billion dollar industry behind them, one might say that video games are quickly becoming people's preferred pastime. A lot of money is spent on developing games, and by analyzing factors such as sales, genre, and region, we will be able to pick up on trends that help maximize potential profit.
The goal of this assignment is to analyze sales of video games across the years, different genres, and regions to answer questions such as “Which is the most popular genre?”, “What genre sells best by region?”, and many more. For people unfamiliar with video games, we hope to provide an adequate understanding.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
The Video Games dataset below has 16 columns, listed in the header of the table.
df = pd.read_csv("Video_Games.csv")
df
Index | Name | Platform | Year_of_Release | Genre | Publisher | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales | Critic_Score | Critic_Count | User_Score | User_Count | Developer | Rating |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Wii Sports | Wii | 2006.0 | Sports | Nintendo | 41.36 | 28.96 | 3.77 | 8.45 | 82.53 | 76.0 | 51.0 | 8 | 322.0 | Nintendo | E |
1 | Super Mario Bros. | NES | 1985.0 | Platform | Nintendo | 29.08 | 3.58 | 6.81 | 0.77 | 40.24 | NaN | NaN | NaN | NaN | NaN | NaN |
2 | Mario Kart Wii | Wii | 2008.0 | Racing | Nintendo | 15.68 | 12.76 | 3.79 | 3.29 | 35.52 | 82.0 | 73.0 | 8.3 | 709.0 | Nintendo | E |
3 | Wii Sports Resort | Wii | 2009.0 | Sports | Nintendo | 15.61 | 10.93 | 3.28 | 2.95 | 32.77 | 80.0 | 73.0 | 8 | 192.0 | Nintendo | E |
4 | Pokemon Red/Pokemon Blue | GB | 1996.0 | Role-Playing | Nintendo | 11.27 | 8.89 | 10.22 | 1.00 | 31.37 | NaN | NaN | NaN | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16714 | Samurai Warriors: Sanada Maru | PS3 | 2016.0 | Action | Tecmo Koei | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 | NaN | NaN | NaN | NaN | NaN | NaN |
16715 | LMA Manager 2007 | X360 | 2006.0 | Sports | Codemasters | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 | NaN | NaN | NaN | NaN | NaN | NaN |
16716 | Haitaka no Psychedelica | PSV | 2016.0 | Adventure | Idea Factory | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 | NaN | NaN | NaN | NaN | NaN | NaN |
16717 | Spirits & Spells | GBA | 2003.0 | Platform | Wanadoo | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 | NaN | NaN | NaN | NaN | NaN | NaN |
16718 | Winning Post 8 2016 | PSV | 2016.0 | Simulation | Tecmo Koei | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 | NaN | NaN | NaN | NaN | NaN | NaN |
16719 rows × 16 columns
We decided to drop the columns "Publisher", "Critic_Score", "Critic_Count", "User_Score", "User_Count", "Developer", and "Rating" because these columns will not be relevant in our analysis, and many of the entries in the dataset were missing data in these columns. Due to those missing entries, it would be impossible for us to accurately use information based on those columns.
df = df.drop(columns=['Publisher', 'Critic_Score', 'Critic_Count', "User_Score", "User_Count", "Developer", "Rating"])
df
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
0 | Wii Sports | Wii | 2006.0 | Sports | 41.36 | 28.96 | 3.77 | 8.45 | 82.53 |
1 | Super Mario Bros. | NES | 1985.0 | Platform | 29.08 | 3.58 | 6.81 | 0.77 | 40.24 |
2 | Mario Kart Wii | Wii | 2008.0 | Racing | 15.68 | 12.76 | 3.79 | 3.29 | 35.52 |
3 | Wii Sports Resort | Wii | 2009.0 | Sports | 15.61 | 10.93 | 3.28 | 2.95 | 32.77 |
4 | Pokemon Red/Pokemon Blue | GB | 1996.0 | Role-Playing | 11.27 | 8.89 | 10.22 | 1.00 | 31.37 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16714 | Samurai Warriors: Sanada Maru | PS3 | 2016.0 | Action | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16715 | LMA Manager 2007 | X360 | 2006.0 | Sports | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
16716 | Haitaka no Psychedelica | PSV | 2016.0 | Adventure | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16717 | Spirits & Spells | GBA | 2003.0 | Platform | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16718 | Winning Post 8 2016 | PSV | 2016.0 | Simulation | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16719 rows × 9 columns
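As a rough way to back up the claim that the dropped columns are mostly empty, the share of missing values per column can be computed directly. The sketch below is only an illustration: `sample` is a hypothetical three-row frame built from rows shown in the first table, and `missing_share` is a name introduced here; on the real data the same expression would be applied to the full `df` before dropping columns.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: three rows copied from the first table above,
# restricted to a few columns. The real analysis would use the full df.
sample = pd.DataFrame({
    "Name": ["Wii Sports", "Super Mario Bros.", "Mario Kart Wii"],
    "Global_Sales": [82.53, 40.24, 35.52],
    "Critic_Score": [76.0, np.nan, 82.0],
    "Rating": ["E", np.nan, "E"],
})

# isna() marks missing cells; the column-wise mean of those booleans
# is the fraction of missing entries in each column.
missing_share = sample.isna().mean().sort_values(ascending=False)
print(missing_share)
```

Columns with a high missing share are the natural candidates to drop or to treat separately.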
We also dropped games released before 2000, to keep the data recent, and games released after 2016, since there are only 4 of them. We dropped games whose platform is DC (Dreamcast) or WS (WonderSwan), since these platforms had little data and we had not heard of them before looking at this dataset, and games with a missing release year. Finally, we converted the years from float to int to help with visualization and later operations.
# Keep rows whose release year is known and between 2000 and 2016 (inclusive);
# a missing year compares as False, so those rows are dropped as well.
year_ok = df["Year_of_Release"].between(2000, 2016)
# Exclude the sparsely populated Dreamcast (DC) and WonderSwan (WS) platforms.
platform_ok = ~df["Platform"].isin(["DC", "WS"])
df = df[year_ok & platform_ok].astype({"Year_of_Release": int})
df
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
0 | Wii Sports | Wii | 2006 | Sports | 41.36 | 28.96 | 3.77 | 8.45 | 82.53 |
2 | Mario Kart Wii | Wii | 2008 | Racing | 15.68 | 12.76 | 3.79 | 3.29 | 35.52 |
3 | Wii Sports Resort | Wii | 2009 | Sports | 15.61 | 10.93 | 3.28 | 2.95 | 32.77 |
6 | New Super Mario Bros. | DS | 2006 | Platform | 11.28 | 9.14 | 6.50 | 2.88 | 29.80 |
7 | Wii Play | Wii | 2006 | Misc | 13.96 | 9.18 | 2.93 | 2.84 | 28.92 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16714 | Samurai Warriors: Sanada Maru | PS3 | 2016 | Action | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16715 | LMA Manager 2007 | X360 | 2006 | Sports | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
16716 | Haitaka no Psychedelica | PSV | 2016 | Adventure | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16717 | Spirits & Spells | GBA | 2003 | Platform | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16718 | Winning Post 8 2016 | PSV | 2016 | Simulation | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
14435 rows × 9 columns
One of the first questions we want to ask is “Which genre sells best by platform?”
First, we divided the data by platform, creating one table for each platform.
# Select the 3DS rows with a boolean mask instead of dropping rows one by one
threeds = df[df["Platform"] == "3DS"].copy()
threeds.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
33 | Pokemon X/Pokemon Y | 3DS | 2013 | Role-Playing | 5.28 | 4.19 | 4.35 | 0.78 | 14.60 |
40 | Mario Kart 7 | 3DS | 2011 | Racing | 5.03 | 4.02 | 2.69 | 0.91 | 12.66 |
47 | Pokemon Omega Ruby/Pokemon Alpha Sapphire | 3DS | 2014 | Role-Playing | 4.35 | 3.49 | 3.10 | 0.74 | 11.68 |
53 | Super Mario 3D Land | 3DS | 2011 | Platform | 4.89 | 3.00 | 2.14 | 0.78 | 10.81 |
62 | New Super Mario Bros. 2 | 3DS | 2012 | Platform | 3.66 | 3.14 | 2.47 | 0.63 | 9.90 |
# Select the DS rows with a boolean mask
ds = df[df["Platform"] == "DS"].copy()
ds.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
6 | New Super Mario Bros. | DS | 2006 | Platform | 11.28 | 9.14 | 6.50 | 2.88 | 29.80 |
10 | Nintendogs | DS | 2005 | Simulation | 9.05 | 10.95 | 1.93 | 2.74 | 24.67 |
11 | Mario Kart DS | DS | 2005 | Racing | 9.71 | 7.47 | 4.13 | 1.90 | 23.21 |
19 | Brain Age: Train Your Brain in Minutes a Day | DS | 2005 | Misc | 4.74 | 9.20 | 4.16 | 2.04 | 20.15 |
20 | Pokemon Diamond/Pokemon Pearl | DS | 2006 | Role-Playing | 6.38 | 4.46 | 6.04 | 1.36 | 18.25 |
# Select the GB (Game Boy) rows with a boolean mask
gb = df[df["Platform"] == "GB"].copy()
gb.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
133 | Pokémon Crystal Version | GB | 2000 | Role-Playing | 2.55 | 1.56 | 1.29 | 0.99 | 6.39 |
741 | Wario Land 3 | GB | 2000 | Platform | 1.11 | 0.51 | 0.34 | 0.23 | 2.20 |
748 | Donkey Kong Country | GB | 2000 | Platform | 1.04 | 0.72 | 0.30 | 0.13 | 2.19 |
752 | Yu-Gi-Oh: Duel Monsters 4 | GB | 2000 | Role-Playing | 0.00 | 0.00 | 2.17 | 0.01 | 2.18 |
897 | The Legend of Zelda: Oracle of Ages | GB | 2001 | Action | 0.92 | 0.53 | 0.41 | 0.06 | 1.92 |
# Select the GBA rows with a boolean mask
gba = df[df["Platform"] == "GBA"].copy()
gba.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
25 | Pokemon Ruby/Pokemon Sapphire | GBA | 2002 | Role-Playing | 6.06 | 3.90 | 5.38 | 0.50 | 15.85 |
58 | Pokemon FireRed/Pokemon LeafGreen | GBA | 2004 | Role-Playing | 4.34 | 2.65 | 3.15 | 0.35 | 10.49 |
131 | Pokémon Emerald Version | GBA | 2004 | Role-Playing | 2.57 | 1.58 | 2.06 | 0.21 | 6.41 |
162 | Super Mario Advance | GBA | 2001 | Platform | 3.14 | 1.24 | 0.91 | 0.20 | 5.49 |
166 | Mario Kart: Super Circuit | GBA | 2001 | Racing | 2.62 | 1.64 | 0.99 | 0.23 | 5.47 |
# Select the GC (GameCube) rows with a boolean mask
gc = df[df["Platform"] == "GC"].copy()
gc.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
111 | Super Smash Bros. Melee | GC | 2001 | Fighting | 4.41 | 1.04 | 1.39 | 0.22 | 7.07 |
112 | Mario Kart: Double Dash!! | GC | 2003 | Racing | 4.12 | 1.77 | 0.87 | 0.19 | 6.95 |
136 | Super Mario Sunshine | GC | 2002 | Platform | 4.01 | 1.26 | 0.87 | 0.17 | 6.31 |
233 | The Legend of Zelda: The Wind Waker | GC | 2002 | Action | 2.60 | 0.99 | 0.89 | 0.13 | 4.60 |
356 | Luigi's Mansion | GC | 2001 | Action | 2.38 | 0.67 | 0.46 | 0.10 | 3.60 |
# Select the N64 rows with a boolean mask
n64 = df[df["Platform"] == "N64"].copy()
n64.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
400 | The Legend of Zelda: Majora's Mask | N64 | 2000 | Action | 1.90 | 0.67 | 0.73 | 0.06 | 3.36 |
549 | Pokémon Stadium 2 | N64 | 2000 | Strategy | 1.02 | 0.36 | 1.13 | 0.23 | 2.73 |
613 | Perfect Dark | N64 | 2000 | Action | 1.55 | 0.75 | 0.16 | 0.06 | 2.52 |
683 | Mario Tennis | N64 | 2000 | Sports | 0.78 | 0.40 | 1.06 | 0.07 | 2.32 |
781 | Tony Hawk's Pro Skater | N64 | 2000 | Sports | 1.68 | 0.40 | 0.00 | 0.03 | 2.11 |
# Select the PC rows with a boolean mask
pc = df[df["Platform"] == "PC"].copy()
pc.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
85 | The Sims 3 | PC | 2009 | Simulation | 0.99 | 6.42 | 0.0 | 0.60 | 8.01 |
138 | World of Warcraft | PC | 2004 | Role-Playing | 0.08 | 6.21 | 0.0 | 0.00 | 6.29 |
192 | Diablo III | PC | 2012 | Role-Playing | 2.44 | 2.16 | 0.0 | 0.54 | 5.14 |
218 | StarCraft II: Wings of Liberty | PC | 2010 | Strategy | 2.57 | 1.68 | 0.0 | 0.58 | 4.84 |
288 | World of Warcraft: The Burning Crusade | PC | 2007 | Role-Playing | 2.57 | 1.52 | 0.0 | 0.00 | 4.09 |
# Select the PS (original PlayStation) rows with a boolean mask
ps = df[df["Platform"] == "PS"].copy()
ps.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
175 | Final Fantasy IX | PS | 2000 | Role-Playing | 1.62 | 0.77 | 2.78 | 0.14 | 5.30 |
223 | Driver 2 | PS | 2000 | Action | 2.36 | 2.10 | 0.02 | 0.25 | 4.73 |
227 | Tony Hawk's Pro Skater 2 | PS | 2000 | Sports | 3.05 | 1.41 | 0.02 | 0.20 | 4.68 |
244 | Dragon Quest VII: Warriors of Eden | PS | 2000 | Role-Playing | 0.20 | 0.14 | 4.10 | 0.02 | 4.47 |
332 | Harry Potter and the Sorcerer's Stone | PS | 2001 | Action | 1.37 | 2.00 | 0.14 | 0.22 | 3.73 |
# Select the PS2 rows with a boolean mask
ps2 = df[df["Platform"] == "PS2"].copy()
ps2.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
17 | Grand Theft Auto: San Andreas | PS2 | 2004 | Action | 9.43 | 0.40 | 0.41 | 10.57 | 20.81 |
24 | Grand Theft Auto: Vice City | PS2 | 2002 | Action | 8.41 | 5.49 | 0.47 | 1.78 | 16.15 |
28 | Gran Turismo 3: A-Spec | PS2 | 2001 | Racing | 6.85 | 5.09 | 1.87 | 1.16 | 14.98 |
38 | Grand Theft Auto III | PS2 | 2001 | Action | 6.99 | 4.51 | 0.30 | 1.30 | 13.10 |
48 | Gran Turismo 4 | PS2 | 2004 | Racing | 3.01 | 0.01 | 1.10 | 7.53 | 11.66 |
# Select the PS3 rows with a boolean mask
ps3 = df[df["Platform"] == "PS3"].copy()
ps3.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
16 | Grand Theft Auto V | PS3 | 2013 | Action | 7.02 | 9.09 | 0.98 | 3.96 | 21.04 |
34 | Call of Duty: Black Ops II | PS3 | 2012 | Shooter | 4.99 | 5.73 | 0.65 | 2.42 | 13.79 |
37 | Call of Duty: Modern Warfare 3 | PS3 | 2011 | Shooter | 5.54 | 5.73 | 0.49 | 1.57 | 13.32 |
41 | Call of Duty: Black Ops | PS3 | 2010 | Shooter | 5.99 | 4.37 | 0.48 | 1.79 | 12.63 |
54 | Gran Turismo 5 | PS3 | 2010 | Racing | 2.96 | 4.82 | 0.81 | 2.11 | 10.70 |
# Select the PS4 rows with a boolean mask
ps4 = df[df["Platform"] == "PS4"].copy()
ps4.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
31 | Call of Duty: Black Ops 3 | PS4 | 2015 | Shooter | 6.03 | 5.86 | 0.36 | 2.38 | 14.63 |
42 | Grand Theft Auto V | PS4 | 2014 | Action | 3.96 | 6.31 | 0.38 | 1.97 | 12.61 |
77 | FIFA 16 | PS4 | 2015 | Sports | 1.12 | 6.12 | 0.06 | 1.28 | 8.57 |
87 | Star Wars Battlefront (2015) | PS4 | 2015 | Shooter | 2.99 | 3.49 | 0.22 | 1.28 | 7.98 |
92 | Call of Duty: Advanced Warfare | PS4 | 2014 | Shooter | 2.81 | 3.48 | 0.14 | 1.23 | 7.66 |
# Select the PSP rows with a boolean mask
psp = df[df["Platform"] == "PSP"].copy()
psp.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
91 | Grand Theft Auto: Liberty City Stories | PSP | 2005 | Action | 2.90 | 2.81 | 0.24 | 1.73 | 7.69 |
163 | Monster Hunter Freedom Unite | PSP | 2008 | Role-Playing | 0.47 | 0.55 | 4.13 | 0.34 | 5.48 |
204 | Grand Theft Auto: Vice City Stories | PSP | 2006 | Action | 1.70 | 1.99 | 0.16 | 1.18 | 5.03 |
215 | Monster Hunter Freedom 3 | PSP | 2010 | Role-Playing | 0.00 | 0.00 | 4.87 | 0.00 | 4.87 |
272 | Daxter | PSP | 2006 | Platform | 2.45 | 1.01 | 0.00 | 0.75 | 4.21 |
# Select the PSV (PlayStation Vita) rows with a boolean mask
psv = df[df["Platform"] == "PSV"].copy()
psv.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
868 | Minecraft | PSV | 2014 | Misc | 0.18 | 0.64 | 0.90 | 0.24 | 1.96 |
1219 | Uncharted: Golden Abyss | PSV | 2011 | Shooter | 0.53 | 0.66 | 0.13 | 0.22 | 1.53 |
1294 | Call of Duty Black Ops: Declassified | PSV | 2012 | Action | 0.71 | 0.43 | 0.07 | 0.26 | 1.47 |
1485 | Assassin's Creed III: Liberation | PSV | 2012 | Action | 0.53 | 0.48 | 0.06 | 0.24 | 1.32 |
1595 | LittleBigPlanet PS Vita | PSV | 2012 | Platform | 0.35 | 0.61 | 0.02 | 0.27 | 1.25 |
# Select the Wii rows with a boolean mask
wii = df[df["Platform"] == "Wii"].copy()
wii.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
0 | Wii Sports | Wii | 2006 | Sports | 41.36 | 28.96 | 3.77 | 8.45 | 82.53 |
2 | Mario Kart Wii | Wii | 2008 | Racing | 15.68 | 12.76 | 3.79 | 3.29 | 35.52 |
3 | Wii Sports Resort | Wii | 2009 | Sports | 15.61 | 10.93 | 3.28 | 2.95 | 32.77 |
7 | Wii Play | Wii | 2006 | Misc | 13.96 | 9.18 | 2.93 | 2.84 | 28.92 |
8 | New Super Mario Bros. Wii | Wii | 2009 | Platform | 14.44 | 6.94 | 4.70 | 2.24 | 28.32 |
# Select the WiiU rows with a boolean mask
wiiu = df[df["Platform"] == "WiiU"].copy()
wiiu.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
110 | Mario Kart 8 | WiiU | 2014 | Racing | 3.15 | 2.15 | 1.28 | 0.51 | 7.09 |
185 | New Super Mario Bros. U | WiiU | 2012 | Platform | 2.30 | 1.34 | 1.27 | 0.32 | 5.22 |
216 | Super Smash Bros. for Wii U and 3DS | WiiU | 2014 | Fighting | 2.60 | 1.08 | 0.81 | 0.38 | 4.87 |
247 | Splatoon | WiiU | 2015 | Shooter | 1.54 | 1.18 | 1.46 | 0.26 | 4.43 |
248 | Nintendo Land | WiiU | 2012 | Misc | 2.52 | 1.11 | 0.46 | 0.33 | 4.42 |
# Select the X360 (Xbox 360) rows with a boolean mask
x360 = df[df["Platform"] == "X360"].copy()
x360.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
14 | Kinect Adventures! | X360 | 2010 | Misc | 15.00 | 4.89 | 0.24 | 1.69 | 21.81 |
23 | Grand Theft Auto V | X360 | 2013 | Action | 9.66 | 5.14 | 0.06 | 1.41 | 16.27 |
29 | Call of Duty: Modern Warfare 3 | X360 | 2011 | Shooter | 9.04 | 4.24 | 0.13 | 1.32 | 14.73 |
32 | Call of Duty: Black Ops | X360 | 2010 | Shooter | 9.70 | 3.68 | 0.11 | 1.13 | 14.61 |
35 | Call of Duty: Black Ops II | X360 | 2012 | Shooter | 8.25 | 4.24 | 0.07 | 1.12 | 13.67 |
# Select the XB (original Xbox) rows with a boolean mask
xb = df[df["Platform"] == "XB"].copy()
xb.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
78 | Halo 2 | XB | 2004 | Shooter | 6.82 | 1.53 | 0.05 | 0.08 | 8.49 |
129 | Halo: Combat Evolved | XB | 2001 | Shooter | 4.98 | 1.30 | 0.08 | 0.07 | 6.43 |
466 | Tom Clancy's Splinter Cell | XB | 2002 | Action | 1.85 | 1.04 | 0.00 | 0.13 | 3.02 |
508 | The Elder Scrolls III: Morrowind | XB | 2002 | Role-Playing | 2.09 | 0.63 | 0.03 | 0.11 | 2.86 |
569 | Fable | XB | 2004 | Role-Playing | 1.99 | 0.58 | 0.00 | 0.09 | 2.66 |
# Select the XOne (Xbox One) rows with a boolean mask
x1 = df[df["Platform"] == "XOne"].copy()
x1.head()
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
99 | Call of Duty: Black Ops 3 | XOne | 2015 | Shooter | 4.59 | 2.11 | 0.01 | 0.68 | 7.39 |
165 | Grand Theft Auto V | XOne | 2014 | Action | 2.81 | 2.19 | 0.00 | 0.47 | 5.48 |
179 | Call of Duty: Advanced Warfare | XOne | 2014 | Shooter | 3.22 | 1.55 | 0.01 | 0.48 | 5.27 |
242 | Halo 5: Guardians | XOne | 2015 | Shooter | 2.78 | 1.27 | 0.03 | 0.41 | 4.48 |
270 | Fallout 4 | XOne | 2015 | Role-Playing | 2.51 | 1.32 | 0.01 | 0.38 | 4.22 |
Just from these tables, we can see which game sold the most on each platform. Based on this data, the number of titles listed for each platform from 2000 to 2016 is: 3DS 512, DS 2120, GB 27, GBA 786, GC 542, N64 70, PC 912, PS 274, PS2 2127, PS3 1306, PS4 392, PSP 1193, PSV 427, Wii 1286, WiiU 147, X360 1232, XB 803, and XOne 247. Note that these are counts of titles, not units sold; the sales figures are in the sales columns. We will also use these tables to analyze the data further later on.
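The per-platform counts and best sellers can also be read off the combined table without building one table per platform. The following is a minimal sketch under stated assumptions: `sample` is a hypothetical five-row frame built from rows shown in the tables above, and `titles_per_platform` and `top_titles` are names introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample built from rows shown in the tables above.
sample = pd.DataFrame({
    "Name": ["Wii Sports", "Mario Kart Wii", "New Super Mario Bros.",
             "Nintendogs", "Halo 2"],
    "Platform": ["Wii", "Wii", "DS", "DS", "XB"],
    "Global_Sales": [82.53, 35.52, 29.80, 24.67, 8.49],
})

# Number of titles listed per platform (row counts, not units sold).
titles_per_platform = sample["Platform"].value_counts()

# Best-selling title per platform: find the row label with the highest
# Global_Sales inside each platform group, then look those rows up.
top_rows = sample.loc[sample.groupby("Platform")["Global_Sales"].idxmax()]
top_titles = top_rows.set_index("Platform")["Name"]

print(titles_per_platform)
print(top_titles)
```

Applied to the filtered `df`, the same two expressions would give one count and one top title for every platform at once.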
Pokémon and Mario are exclusive to Nintendo platforms, so it makes sense that those games have high sales on those devices. The same holds for Halo, which is exclusive to Xbox, and Final Fantasy, which is typically exclusive to PlayStation. For Xbox, Japanese sales tend to be very low because Japanese consumers prefer Sony's PlayStation and Nintendo's consoles over Microsoft's Xbox.
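One way to check the observation about Japanese sales is to compare Japan's share of global sales across platforms. This is only a sketch: `sample` is a hypothetical frame holding four rows taken from the XB and PS tables above, and `jp_share` is a name introduced here; the shares it prints describe those four rows, not the full dataset.

```python
import pandas as pd

# Hypothetical sample: two XB rows and two PS rows from the tables above.
sample = pd.DataFrame({
    "Name": ["Halo 2", "Halo: Combat Evolved", "Final Fantasy IX",
             "Dragon Quest VII: Warriors of Eden"],
    "Platform": ["XB", "XB", "PS", "PS"],
    "JP_Sales": [0.05, 0.08, 2.78, 4.10],
    "Global_Sales": [8.49, 6.43, 5.30, 4.47],
})

# Sum sales per platform, then divide Japanese sales by global sales
# to get the share of each platform's sales that came from Japan.
totals = sample.groupby("Platform")[["JP_Sales", "Global_Sales"]].sum()
jp_share = (totals["JP_Sales"] / totals["Global_Sales"]).sort_values()
print(jp_share)
```

Running the same computation on the full `df` would show whether the pattern holds beyond these few titles.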
Then we create two separate tables displaying the total sales for each platform. We separated the tables by type of console, since it makes more sense to compare a handheld console against other handheld consoles.
handheld_data = [['3DS', threeds.Global_Sales.sum()], ['DS', ds.Global_Sales.sum()], ['GB', gb.Global_Sales.sum()],
['GBA', gba.Global_Sales.sum()], ['PSP', psp.Global_Sales.sum()], ['PSV', psv.Global_Sales.sum()]]
handheld_platforms = pd.DataFrame(handheld_data, columns = ['Platform', 'Global_Sales'])
handheld_platforms = handheld_platforms.sort_values(by=["Global_Sales"], ascending=False)
handheld_platforms
Index | Platform | Global_Sales |
---|---|---|
1 | DS | 803.42 |
3 | GBA | 313.56 |
4 | PSP | 289.79 |
0 | 3DS | 257.92 |
5 | PSV | 53.83 |
2 | GB | 29.00 |
at_home_data = [['GC', gc.Global_Sales.sum()], ['N64', n64.Global_Sales.sum()], ['PC', pc.Global_Sales.sum()],
['PS', ps.Global_Sales.sum()], ['PS2', ps2.Global_Sales.sum()], ['PS3', ps3.Global_Sales.sum()],
['PS4', ps4.Global_Sales.sum()], ['Wii', wii.Global_Sales.sum()], ['WiiU', wiiu.Global_Sales.sum()],
['X360', x360.Global_Sales.sum()], ['XB', xb.Global_Sales.sum()], ['XOne', x1.Global_Sales.sum()]]
at_home_platforms = pd.DataFrame(at_home_data, columns = ['Platform', 'Global_Sales'])
at_home_platforms = at_home_platforms.sort_values(by=["Global_Sales"], ascending=False)
at_home_platforms
Index | Platform | Global_Sales |
---|---|---|
4 | PS2 | 1233.46 |
9 | X360 | 961.39 |
5 | PS3 | 931.15 |
7 | Wii | 891.74 |
6 | PS4 | 314.19 |
10 | XB | 252.09 |
2 | PC | 206.54 |
0 | GC | 197.14 |
11 | XOne | 159.44 |
3 | PS | 140.56 |
8 | WiiU | 82.16 |
1 | N64 | 37.35 |
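The two totals tables above could also be produced with a single `groupby`, which avoids listing every platform variable by hand. The sketch below is an assumption-laden illustration, not the method used above: `sample` is a hypothetical frame built from rows shown earlier, and `HANDHELD`, `handheld_totals`, and `home_totals` are names introduced here. The handheld list mirrors the platforms in the first totals table.

```python
import pandas as pd

# Handheld platforms, mirroring the first totals table above.
HANDHELD = ["3DS", "DS", "GB", "GBA", "PSP", "PSV"]

# Hypothetical sample built from rows shown in the tables above.
sample = pd.DataFrame({
    "Platform": ["DS", "DS", "GBA", "PS2", "PS2", "Wii"],
    "Global_Sales": [29.80, 24.67, 15.85, 20.81, 16.15, 82.53],
})

# Total global sales per platform, largest first.
totals = (
    sample.groupby("Platform", as_index=False)["Global_Sales"]
    .sum()
    .sort_values("Global_Sales", ascending=False)
)

# Split the totals into handheld and at-home platforms.
is_handheld = totals["Platform"].isin(HANDHELD)
handheld_totals = totals[is_handheld]
home_totals = totals[~is_handheld]

print(handheld_totals)
print(home_totals)
```

The row labels will differ from the tables above, since `groupby` orders platforms alphabetically before the sort.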
We also want to ask, “Which genre sells best in each region?” To answer this, we divided the data by genre.
# Select the Action rows with a boolean mask instead of dropping rows one by one
action = df[df["Genre"] == "Action"].copy()
action
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
16 | Grand Theft Auto V | PS3 | 2013 | Action | 7.02 | 9.09 | 0.98 | 3.96 | 21.04 |
17 | Grand Theft Auto: San Andreas | PS2 | 2004 | Action | 9.43 | 0.40 | 0.41 | 10.57 | 20.81 |
23 | Grand Theft Auto V | X360 | 2013 | Action | 9.66 | 5.14 | 0.06 | 1.41 | 16.27 |
24 | Grand Theft Auto: Vice City | PS2 | 2002 | Action | 8.41 | 5.49 | 0.47 | 1.78 | 16.15 |
38 | Grand Theft Auto III | PS2 | 2001 | Action | 6.99 | 4.51 | 0.30 | 1.30 | 13.10 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16695 | Dynasty Warriors: Eiketsuden | PS3 | 2016 | Action | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16696 | Metal Gear Solid V: Ground Zeroes | PC | 2014 | Action | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
16699 | Planet Monsters | GBA | 2001 | Action | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16703 | The Longest 5 Minutes | PSV | 2016 | Action | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16714 | Samurai Warriors: Sanada Maru | PS3 | 2016 | Action | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
3077 rows × 9 columns
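As an alternative to one table per genre, the regional totals for every genre can be computed in one step and the best-selling genre per region read off with `idxmax`. This is a hedged sketch: `sample` is a hypothetical five-row frame built from rows shown in the tables above, and `REGIONS`, `sales_by_genre`, and `best_genre` are names introduced here. The result describes only these five rows.

```python
import pandas as pd

# Regional sales columns from the dataset.
REGIONS = ["NA_Sales", "EU_Sales", "JP_Sales", "Other_Sales"]

# Hypothetical sample: Grand Theft Auto V (PS3), Grand Theft Auto: San Andreas,
# Pokemon Diamond/Pearl, Pokemon Ruby/Sapphire, and Halo 2, from the tables above.
sample = pd.DataFrame({
    "Genre": ["Action", "Action", "Role-Playing", "Role-Playing", "Shooter"],
    "NA_Sales": [7.02, 9.43, 6.38, 6.06, 6.82],
    "EU_Sales": [9.09, 0.40, 4.46, 3.90, 1.53],
    "JP_Sales": [0.98, 0.41, 6.04, 5.38, 0.05],
    "Other_Sales": [3.96, 10.57, 1.36, 0.50, 0.08],
})

# Total sales per genre in each region (rows: genres, columns: regions).
sales_by_genre = sample.groupby("Genre")[REGIONS].sum()

# For each region (column), the genre (row label) with the highest total.
best_genre = sales_by_genre.idxmax()
print(sales_by_genre)
print(best_genre)
```

On the full `df`, `sales_by_genre` would have one row per genre, which is also a convenient shape for plotting.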
# Select the Adventure rows with a boolean mask
adventure = df[df["Genre"] == "Adventure"].copy()
adventure
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
160 | Assassin's Creed | X360 | 2007 | Adventure | 3.28 | 1.64 | 0.07 | 0.56 | 5.54 |
219 | Assassin's Creed | PS3 | 2007 | Adventure | 1.91 | 2.00 | 0.09 | 0.82 | 4.82 |
433 | L.A. Noire | PS3 | 2011 | Adventure | 1.27 | 1.29 | 0.12 | 0.49 | 3.17 |
437 | Club Penguin: Elite Penguin Force | DS | 2008 | Adventure | 1.87 | 0.97 | 0.00 | 0.30 | 3.14 |
463 | Heavy Rain | PS3 | 2010 | Adventure | 1.29 | 1.21 | 0.06 | 0.47 | 3.03 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16691 | Neo Angelique Special | PSP | 2008 | Adventure | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16693 | Real Rode | PS2 | 2008 | Adventure | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16707 | Strawberry Nauts | PSV | 2016 | Adventure | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16709 | 15 Days | PC | 2009 | Adventure | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
16716 | Haitaka no Psychedelica | PSV | 2016 | Adventure | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
1186 rows × 9 columns
# Select the Fighting rows with a boolean mask
fighting = df[df["Genre"] == "Fighting"].copy()
fighting
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
39 | Super Smash Bros. Brawl | Wii | 2008 | Fighting | 6.62 | 2.55 | 2.66 | 1.01 | 12.84 |
96 | Super Smash Bros. for Wii U and 3DS | 3DS | 2014 | Fighting | 3.27 | 1.37 | 2.43 | 0.48 | 7.55 |
111 | Super Smash Bros. Melee | GC | 2001 | Fighting | 4.41 | 1.04 | 1.39 | 0.22 | 7.07 |
216 | Super Smash Bros. for Wii U and 3DS | WiiU | 2014 | Fighting | 2.60 | 1.08 | 0.81 | 0.38 | 4.87 |
280 | Street Fighter IV | PS3 | 2009 | Fighting | 2.03 | 1.04 | 0.58 | 0.52 | 4.16 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16511 | World Heroes Anthology | PS2 | 2007 | Fighting | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16560 | Shijyou Saikyou no Deshi Kenichi: Gekitou! Rag... | PS2 | 2007 | Fighting | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16617 | Dragon Ball Z for Kinect | X360 | 2012 | Fighting | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16668 | Mahou Sensei Negima!? Neo-Pactio Fight!! | Wii | 2007 | Fighting | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16677 | Mortal Kombat: Deadly Alliance | GBA | 2002 | Fighting | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
633 rows × 9 columns
# Select the Misc rows with a boolean mask
misc = df[df["Genre"] == "Misc"].copy()
misc
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
7 | Wii Play | Wii | 2006 | Misc | 13.96 | 9.18 | 2.93 | 2.84 | 28.92 |
14 | Kinect Adventures! | X360 | 2010 | Misc | 15.00 | 4.89 | 0.24 | 1.69 | 21.81 |
19 | Brain Age: Train Your Brain in Minutes a Day | DS | 2005 | Misc | 4.74 | 9.20 | 4.16 | 2.04 | 20.15 |
61 | Just Dance 3 | Wii | 2011 | Misc | 5.95 | 3.11 | 0.00 | 1.06 | 10.12 |
68 | Just Dance 2 | Wii | 2010 | Misc | 5.80 | 2.85 | 0.01 | 0.78 | 9.44 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16625 | DJ Max Technika Tune | PSV | 2012 | Misc | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16630 | The Ultimate Battle of the Sexes | Wii | 2010 | Misc | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16636 | Style Book: Cinnamoroll | DS | 2006 | Misc | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16640 | Deal or No Deal | PC | 2006 | Misc | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
16711 | Aiyoku no Eustia | PSV | 2014 | Misc | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
1597 rows × 9 columns
# Select the Platform-genre rows with a boolean mask
platform = df[df["Genre"] == "Platform"].copy()
platform
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
6 | New Super Mario Bros. | DS | 2006 | Platform | 11.28 | 9.14 | 6.50 | 2.88 | 29.80 |
8 | New Super Mario Bros. Wii | Wii | 2009 | Platform | 14.44 | 6.94 | 4.70 | 2.24 | 28.32 |
49 | Super Mario Galaxy | Wii | 2007 | Platform | 6.06 | 3.35 | 1.20 | 0.74 | 11.35 |
53 | Super Mario 3D Land | 3DS | 2011 | Platform | 4.89 | 3.00 | 2.14 | 0.78 | 10.81 |
59 | Super Mario 64 | DS | 2004 | Platform | 5.01 | 3.07 | 1.25 | 0.97 | 10.30 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16358 | Strider (2014) | PS3 | 2014 | Platform | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16360 | Goku Makaimura Kai | PSP | 2007 | Platform | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16605 | The Land Before Time: Into the Mysterious Beyond | GBA | 2006 | Platform | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16712 | Woody Woodpecker in Crazy Castle 5 | GBA | 2002 | Platform | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16717 | Spirits & Spells | GBA | 2003 | Platform | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
719 rows × 9 columns
# Select the Puzzle rows with a boolean mask
puzzle = df[df["Genre"] == "Puzzle"].copy()
puzzle
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
26 | Brain Age 2: More Training in Minutes a Day | DS | 2005 | Puzzle | 3.43 | 5.35 | 5.32 | 1.18 | 15.29 |
188 | Professor Layton and the Curious Village | DS | 2007 | Puzzle | 1.21 | 2.43 | 1.03 | 0.52 | 5.19 |
308 | Professor Layton and the Diabolical Box | DS | 2007 | Puzzle | 0.90 | 1.76 | 0.92 | 0.37 | 3.94 |
415 | Professor Layton and the Unwound Future | DS | 2008 | Puzzle | 0.60 | 1.57 | 0.82 | 0.27 | 3.26 |
489 | Pac-Man Collection | GBA | 2001 | Puzzle | 2.07 | 0.77 | 0.05 | 0.05 | 2.94 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16653 | Real Crimes: The Unicorn Killer | DS | 2011 | Puzzle | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
16665 | Bookworm Deluxe | PC | 2006 | Puzzle | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16683 | XI Coliseum | PSP | 2006 | Puzzle | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16701 | Bust-A-Move 3000 | GC | 2003 | Puzzle | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16702 | Mega Brain Boost | DS | 2008 | Puzzle | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
479 rows × 9 columns
# Select the Racing rows with a boolean mask
racing = df[df["Genre"] == "Racing"].copy()
racing
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
2 | Mario Kart Wii | Wii | 2008 | Racing | 15.68 | 12.76 | 3.79 | 3.29 | 35.52 |
11 | Mario Kart DS | DS | 2005 | Racing | 9.71 | 7.47 | 4.13 | 1.90 | 23.21 |
28 | Gran Turismo 3: A-Spec | PS2 | 2001 | Racing | 6.85 | 5.09 | 1.87 | 1.16 | 14.98 |
40 | Mario Kart 7 | 3DS | 2011 | Racing | 5.03 | 4.02 | 2.69 | 0.91 | 12.66 |
48 | Gran Turismo 4 | PS2 | 2004 | Racing | 3.01 | 0.01 | 1.10 | 7.53 | 11.66 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16634 | Sébastien Loeb Rally Evo | XOne | 2016 | Racing | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
16637 | SBK Superbike World Championship | PSP | 2008 | Racing | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16662 | Driving Simulator 2011 | PC | 2011 | Racing | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
16690 | Yattaman Wii: BikkuriDokkiri Machine de Mou Ra... | Wii | 2008 | Racing | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16713 | SCORE International Baja 1000: The Official Game | PS2 | 2008 | Racing | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 |
1032 rows × 9 columns
# Keep only the Role-Playing rows; a boolean mask replaces the row-by-row drop loop.
role_playing = df[df["Genre"] == "Role-Playing"].copy()
role_playing
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
20 | Pokemon Diamond/Pokemon Pearl | DS | 2006 | Role-Playing | 6.38 | 4.46 | 6.04 | 1.36 | 18.25 |
25 | Pokemon Ruby/Pokemon Sapphire | GBA | 2002 | Role-Playing | 6.06 | 3.90 | 5.38 | 0.50 | 15.85 |
27 | Pokemon Black/Pokemon White | DS | 2010 | Role-Playing | 5.51 | 3.17 | 5.65 | 0.80 | 15.14 |
33 | Pokemon X/Pokemon Y | 3DS | 2013 | Role-Playing | 5.28 | 4.19 | 4.35 | 0.78 | 14.60 |
47 | Pokemon Omega Ruby/Pokemon Alpha Sapphire | 3DS | 2014 | Role-Playing | 4.35 | 3.49 | 3.10 | 0.74 | 11.68 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16629 | Tengai Makyo: Dai Yon no Mokushiroku | PSP | 2006 | Role-Playing | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16639 | Blazer Drive | DS | 2008 | Role-Playing | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16655 | The Rise of the Argonauts | PC | 2008 | Role-Playing | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
16673 | Super Robot Taisen: Original Generation | GBA | 2002 | Role-Playing | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16705 | Eiyuu Densetsu: Sora no Kiseki Material Collec... | PSP | 2007 | Role-Playing | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
1294 rows × 9 columns
# Keep only the Shooter rows; a boolean mask replaces the row-by-row drop loop.
shooter = df[df["Genre"] == "Shooter"].copy()
shooter
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
29 | Call of Duty: Modern Warfare 3 | X360 | 2011 | Shooter | 9.04 | 4.24 | 0.13 | 1.32 | 14.73 |
31 | Call of Duty: Black Ops 3 | PS4 | 2015 | Shooter | 6.03 | 5.86 | 0.36 | 2.38 | 14.63 |
32 | Call of Duty: Black Ops | X360 | 2010 | Shooter | 9.70 | 3.68 | 0.11 | 1.13 | 14.61 |
34 | Call of Duty: Black Ops II | PS3 | 2012 | Shooter | 4.99 | 5.73 | 0.65 | 2.42 | 13.79 |
35 | Call of Duty: Black Ops II | X360 | 2012 | Shooter | 8.25 | 4.24 | 0.07 | 1.12 | 13.67 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16580 | DoDonPachi Daifukkatsu: Black Label | X360 | 2011 | Shooter | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16581 | Space Raiders | GC | 2003 | Shooter | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16606 | Transformers: War for Cybertron (XBox 360, PS3... | PC | 2010 | Shooter | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16700 | Breach | PC | 2011 | Shooter | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16710 | Men in Black II: Alien Escape | GC | 2003 | Shooter | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
1127 rows × 9 columns
# Keep only the Simulation rows; a boolean mask replaces the row-by-row drop loop.
simulation = df[df["Genre"] == "Simulation"].copy()
simulation
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
10 | Nintendogs | DS | 2005 | Simulation | 9.05 | 10.95 | 1.93 | 2.74 | 24.67 |
43 | Animal Crossing: Wild World | DS | 2005 | Simulation | 2.50 | 3.45 | 5.33 | 0.86 | 12.13 |
73 | Animal Crossing: New Leaf | 3DS | 2012 | Simulation | 2.03 | 2.36 | 4.39 | 0.39 | 9.16 |
85 | The Sims 3 | PC | 2009 | Simulation | 0.99 | 6.42 | 0.00 | 0.60 | 8.01 |
156 | Cooking Mama | DS | 2006 | Simulation | 3.07 | 1.91 | 0.07 | 0.57 | 5.63 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16620 | National Geographic Panda (JP sales) | DS | 2008 | Simulation | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16689 | Help Wanted: 50 Wacky Jobs (jp sales) | Wii | 2008 | Simulation | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16694 | Pony Friends 2 | PC | 2009 | Simulation | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
16708 | Plushees | DS | 2008 | Simulation | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16718 | Winning Post 8 2016 | PSV | 2016 | Simulation | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
768 rows × 9 columns
# Keep only the Sports rows; a boolean mask replaces the row-by-row drop loop.
sports = df[df["Genre"] == "Sports"].copy()
sports
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
0 | Wii Sports | Wii | 2006 | Sports | 41.36 | 28.96 | 3.77 | 8.45 | 82.53 |
3 | Wii Sports Resort | Wii | 2009 | Sports | 15.61 | 10.93 | 3.28 | 2.95 | 32.77 |
13 | Wii Fit | Wii | 2007 | Sports | 8.92 | 8.03 | 3.60 | 2.15 | 22.70 |
15 | Wii Fit Plus | Wii | 2009 | Sports | 9.01 | 8.49 | 2.53 | 1.77 | 21.79 |
77 | FIFA 16 | PS4 | 2015 | Sports | 1.12 | 6.12 | 0.06 | 1.28 | 8.57 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16680 | G1 Jockey 4 2008 | PS3 | 2008 | Sports | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 |
16692 | Outdoors Unleashed: Africa 3D | 3DS | 2011 | Sports | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16697 | PGA European Tour | N64 | 2000 | Sports | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16704 | Mezase!! Tsuri Master DS | DS | 2009 | Sports | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16715 | LMA Manager 2007 | X360 | 2006 | Sports | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
1975 rows × 9 columns
# Keep only the Strategy rows; a boolean mask replaces the row-by-row drop loop.
strategy = df[df["Genre"] == "Strategy"].copy()
strategy
Index | Name | Platform | Year_of_Release | Genre | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
---|---|---|---|---|---|---|---|---|---|
218 | StarCraft II: Wings of Liberty | PC | 2010 | Strategy | 2.57 | 1.68 | 0.00 | 0.58 | 4.84 |
549 | Pokémon Stadium 2 | N64 | 2000 | Strategy | 1.02 | 0.36 | 1.13 | 0.23 | 2.73 |
582 | Halo Wars | X360 | 2009 | Strategy | 1.54 | 0.80 | 0.04 | 0.24 | 2.62 |
815 | Yu-Gi-Oh! The Eternal Duelist Soul | GBA | 2001 | Strategy | 1.64 | 0.36 | 0.00 | 0.07 | 2.07 |
1078 | Sid Meier's Civilization V | PC | 2010 | Strategy | 0.98 | 0.52 | 0.00 | 0.17 | 1.68 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16615 | Palais de Reine | PS2 | 2007 | Strategy | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 |
16621 | Codename: Panzers Complete Collection | PC | 2016 | Strategy | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
16644 | Hospital Tycoon | PC | 2007 | Strategy | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
16682 | End of Nations | PC | 2012 | Strategy | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 |
16706 | STORM: Frontline Nation | PC | 2011 | Strategy | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 |
548 rows × 9 columns
Just from these tables we can see which game sold the most in each genre (the first row of each table) and how many games each genre contains (the row count under each table). We will also use these tables to analyze the data further later on.
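Both facts can also be read off programmatically instead of by eye. The sketch below assumes a frame shaped like `df`; the `sample` rows are copied from the tables above and stand in for the full dataset, and the names `titles_per_genre` and `top_sellers` are only illustrative.

```python
import pandas as pd

# A handful of rows copied from the tables above; the real `df` has thousands.
sample = pd.DataFrame(
    {
        "Name": ["Mario Kart Wii", "Mario Kart DS", "Wii Sports",
                 "Wii Sports Resort", "Nintendogs"],
        "Genre": ["Racing", "Racing", "Sports", "Sports", "Simulation"],
        "Global_Sales": [35.52, 23.21, 82.53, 32.77, 24.67],
    }
)

# Number of titles per genre (the "rows" count printed under each table).
titles_per_genre = sample["Genre"].value_counts()

# Best-selling title per genre: find the row label of the largest
# Global_Sales value inside each genre, then look those rows up.
top_rows = sample.loc[sample.groupby("Genre")["Global_Sales"].idxmax()]
top_sellers = top_rows.set_index("Genre")["Name"]

print(titles_per_genre)
print(top_sellers)
```

Running the same two statements on the full `df` should reproduce the row counts and first rows shown above without building twelve separate frames.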
Next, we build a summary table with each genre's total global sales and its average sales per game in each region.
genre_data = [['Action',action.NA_Sales.mean(),action.EU_Sales.mean(),action.JP_Sales.mean(),action.Other_Sales.mean(),
action.Global_Sales.mean(),action.Global_Sales.sum()],
['Adventure',adventure.NA_Sales.mean(),adventure.EU_Sales.mean(),adventure.JP_Sales.mean(),
adventure.Other_Sales.mean(),adventure.Global_Sales.mean(),adventure.Global_Sales.sum()],
['Fighting',fighting.NA_Sales.mean(),fighting.EU_Sales.mean(),fighting.JP_Sales.mean(),
fighting.Other_Sales.mean(),fighting.Global_Sales.mean(),fighting.Global_Sales.sum()],
['Misc',misc.NA_Sales.mean(),misc.EU_Sales.mean(),misc.JP_Sales.mean(),misc.Other_Sales.mean(),
misc.Global_Sales.mean(),misc.Global_Sales.sum()],
['Platform',platform.NA_Sales.mean(),platform.EU_Sales.mean(),platform.JP_Sales.mean(),
platform.Other_Sales.mean(),platform.Global_Sales.mean(),platform.Global_Sales.sum()],
['Puzzle',puzzle.NA_Sales.mean(),puzzle.EU_Sales.mean(),puzzle.JP_Sales.mean(),puzzle.Other_Sales.mean(),
puzzle.Global_Sales.mean(),puzzle.Global_Sales.sum()],
['Racing',racing.NA_Sales.mean(),racing.EU_Sales.mean(),racing.JP_Sales.mean(),racing.Other_Sales.mean(),
racing.Global_Sales.mean(),racing.Global_Sales.sum()],
['Role-Playing',role_playing.NA_Sales.mean(),role_playing.EU_Sales.mean(),role_playing.JP_Sales.mean(),
role_playing.Other_Sales.mean(),role_playing.Global_Sales.mean(),role_playing.Global_Sales.sum()],
['Shooter',shooter.NA_Sales.mean(),shooter.EU_Sales.mean(),shooter.JP_Sales.mean(),shooter.Other_Sales.mean(),
shooter.Global_Sales.mean(),shooter.Global_Sales.sum()],
['Simulation',simulation.NA_Sales.mean(),simulation.EU_Sales.mean(),simulation.JP_Sales.mean(),
simulation.Other_Sales.mean(),simulation.Global_Sales.mean(),simulation.Global_Sales.sum()],
['Sports',sports.NA_Sales.mean(),sports.EU_Sales.mean(),sports.JP_Sales.mean(),sports.Other_Sales.mean(),
sports.Global_Sales.mean(),sports.Global_Sales.sum()],
['Strategy',strategy.NA_Sales.mean(),strategy.EU_Sales.mean(),strategy.JP_Sales.mean(),
strategy.Other_Sales.mean(),strategy.Global_Sales.mean(),strategy.Global_Sales.sum()]]
genres = pd.DataFrame(genre_data, columns = ['Genre','Mean_NA_Sales','Mean_EU_Sales','Mean_JP_Sales',
'Mean_Other_Sales','Mean_Global_Sales','Total_Global_Sales'])
genres = genres.sort_values(by=["Total_Global_Sales"], ascending=False)
genres
Index | Genre | Mean_NA_Sales | Mean_EU_Sales | Mean_JP_Sales | Mean_Other_Sales | Mean_Global_Sales | Total_Global_Sales |
---|---|---|---|---|---|---|---|
0 | Action | 0.244348 | 0.151560 | 0.042808 | 0.056890 | 0.495931 | 1525.98 |
10 | Sports | 0.297909 | 0.171914 | 0.038258 | 0.063823 | 0.572218 | 1130.13 |
8 | Shooter | 0.439423 | 0.260887 | 0.020594 | 0.088882 | 0.810133 | 913.02 |
7 | Role-Playing | 0.218161 | 0.120317 | 0.185680 | 0.039815 | 0.563872 | 729.65 |
3 | Misc | 0.228723 | 0.123175 | 0.053494 | 0.044421 | 0.450188 | 718.95 |
6 | Racing | 0.262800 | 0.185572 | 0.026570 | 0.066773 | 0.541851 | 559.19 |
4 | Platform | 0.358206 | 0.194715 | 0.076704 | 0.057858 | 0.687844 | 494.56 |
9 | Simulation | 0.211159 | 0.136328 | 0.052292 | 0.036953 | 0.436914 | 335.55 |
2 | Fighting | 0.253981 | 0.117709 | 0.069684 | 0.050126 | 0.491548 | 311.15 |
1 | Adventure | 0.069250 | 0.041737 | 0.030506 | 0.012454 | 0.154056 | 182.71 |
5 | Puzzle | 0.133925 | 0.082985 | 0.050480 | 0.021127 | 0.289374 | 138.61 |
11 | Strategy | 0.084069 | 0.059398 | 0.050201 | 0.015949 | 0.210274 | 115.23 |
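As an aside, a summary of this shape can probably be produced more compactly with a single `groupby` and named aggregation. This is a minimal sketch rather than the code used above: the `sample` frame holds four rows copied from the earlier tables, only some of the regional columns are included, and `genre_summary` is an illustrative name.

```python
import pandas as pd

# Four rows copied from the Racing and Sports tables above.
sample = pd.DataFrame(
    {
        "Genre": ["Racing", "Racing", "Sports", "Sports"],
        "NA_Sales": [15.68, 9.71, 41.36, 15.61],
        "JP_Sales": [3.79, 4.13, 3.77, 3.28],
        "Global_Sales": [35.52, 23.21, 82.53, 32.77],
    }
)

# One row per genre: regional means plus the global mean and total.
genre_summary = (
    sample.groupby("Genre")
    .agg(
        Mean_NA_Sales=("NA_Sales", "mean"),
        Mean_JP_Sales=("JP_Sales", "mean"),
        Mean_Global_Sales=("Global_Sales", "mean"),
        Total_Global_Sales=("Global_Sales", "sum"),
    )
    .sort_values("Total_Global_Sales", ascending=False)
    .reset_index()
)

print(genre_summary)
```

The trade-off is that the index is renumbered from 0 after `reset_index()`, so it will not match the index column printed in the table above.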
We chose a custom color palette because the default colors were too similar to one another, which made the graphs hard to read.
colors = ["yellow", "black","olive","red","orange","pink","green","grey","purple","blue", "cyan","lime"]
new_palette = sns.xkcd_palette(colors)
fig, ax = plt.subplots()
scatterplot = sns.scatterplot(x='Year_of_Release',y='Global_Sales',data=df, ax=ax, hue = 'Genre', palette=new_palette)
plt.legend(loc=9, bbox_to_anchor=(0.5, -0.2), ncol=5)
plt.title('Global Sales vs Year of Release for each Game')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
This scatter plot shows how much each individual game sold, by release year, with the color indicating its genre. Most games stayed under the 20 million mark, but a few, such as Wii Sports (2006) and Mario Kart Wii (2008), reached far higher figures of 82.53 and 35.52 million. There does not seem to be a trend in the data, just a few peaks for games that did exceptionally well.
sns.set(rc = {'figure.figsize':(12,4)})
fig, ax = plt.subplots()
handheld_platforms = handheld_platforms.sort_values(by=["Platform"], ascending=True)
barplot = sns.barplot(x='Platform',y='Global_Sales',data=handheld_platforms, ax=ax)
plt.title('Total Global Sales for all games in each Handheld Platform from 2000 to 2016')
plt.ylabel('Global Sales(in millions)')
This bar graph shows the total global sales for all games on each handheld platform. DS is at the top with about 800 million total sales, and GB is at the bottom with less than 50 million.
fig, ax = plt.subplots()
at_home_platforms = at_home_platforms.sort_values(by=["Platform"], ascending=True)
barplot = sns.barplot(x='Platform',y='Global_Sales',data=at_home_platforms, ax=ax)
plt.title('Total Global Sales for all games in each At Home Platform from 2000 to 2016')
plt.ylabel('Global Sales(in millions)')
This bar graph shows the total global sales for all games on each at-home platform. PS2 is at the top with over 1,200 million total sales, and N64 is at the bottom with less than 100 million.
fig, ax = plt.subplots()
genres = genres.sort_values(by=["Genre"], ascending=True)
barplot = sns.barplot(x='Genre',y='Total_Global_Sales',data=genres, ax=ax, palette=new_palette)
plt.title('Total Global Sales for all games in each Genre from 2000 to 2016')
plt.ylabel('Global Sales(in millions)')
This bar plot shows the total global sales for each genre. Action is at the top with over 1,500 million sales, and Strategy is at the bottom with about 115 million.
fig, ax = plt.subplots()
barplot = sns.barplot(x='Genre',y='Mean_Global_Sales',data=genres, ax=ax, palette=new_palette)
plt.title('Average Global Sales for all games in each Genre from 2000 to 2016')
plt.ylabel('Global Sales(in millions)')
This bar plot shows the average global sales per game for each genre. Shooter is at the top with about 0.8 million sales per game, and Adventure is at the bottom with about 0.15 million.
fig, ax = plt.subplots()
barplot = sns.barplot(x='Genre',y='Mean_NA_Sales',data=genres, ax=ax, palette=new_palette)
plt.title('Average North American Sales for all games in each Genre from 2000 to 2016')
plt.ylabel('NA Sales(in millions)')
This bar plot shows the average North American sales per game for each genre. Shooter is at the top with about 0.44 million sales per game, and Adventure is at the bottom with about 0.07 million. The main difference from the global plot is that Racing games averaged more than Role-Playing games in North America (about 0.26 versus 0.22 million), whereas Role-Playing is slightly ahead globally.
fig, ax = plt.subplots()
barplot = sns.barplot(x='Genre',y='Mean_EU_Sales',data=genres, ax=ax, palette=new_palette)
plt.title('Average European Sales for all games in each Genre from 2000 to 2016')
plt.ylabel('EU Sales(in millions)')
This bar plot shows the average European sales per game for each genre. Shooter is at the top with just over 0.25 million sales per game, and Adventure is at the bottom with just under 0.05 million. As in North America, Racing games averaged more sales per game than Role-Playing games in Europe.
fig, ax = plt.subplots()
barplot = sns.barplot(x='Genre',y='Mean_JP_Sales',data=genres, ax=ax, palette=new_palette)
plt.title('Average Japanese Sales for all games in each Genre from 2000 to 2016')
plt.ylabel('JP Sales(in millions)')
This bar plot shows the average Japanese sales per game for each genre. Role-Playing is easily at the top with just over 0.175 million sales per game, and Shooter is at the bottom with just under 0.025 million. This graph is drastically different from the first three. First, Role-Playing is far ahead, whereas in the other graphs it was only average. Second, all the other genres sit at about the same level. Third, and most surprising, Shooter games are at the bottom in Japan, even though they were at the top in the first three graphs.
fig, ax = plt.subplots()
barplot = sns.barplot(x='Genre',y='Mean_Other_Sales',data=genres, ax=ax, palette=new_palette)
plt.title('Average Other Regional Sales for all games in each Genre from 2000 to 2016')
plt.ylabel('Other Region Sales(in millions)')
This bar plot shows the average sales per game in other regions for each genre. Shooter is at the top with over 0.08 million sales per game, and Adventure is at the bottom with just over 0.01 million. The graph for other regions resembles those for North America and Europe, which makes Japan look like the outlier among the regions.
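The regional comparison can be made explicit by ranking genres within each region. This sketch uses four rows copied from the `genres` table above, so the ranks are relative to those four genres only; `regional_means`, `ranks`, and `leaders` are illustrative names, not part of the original notebook.

```python
import pandas as pd

# Per-genre regional means copied from the `genres` table (four genres only).
regional_means = pd.DataFrame(
    {
        "Mean_NA_Sales": [0.439423, 0.218161, 0.069250, 0.262800],
        "Mean_EU_Sales": [0.260887, 0.120317, 0.041737, 0.185572],
        "Mean_JP_Sales": [0.020594, 0.185680, 0.030506, 0.026570],
        "Mean_Other_Sales": [0.088882, 0.039815, 0.012454, 0.066773],
    },
    index=["Shooter", "Role-Playing", "Adventure", "Racing"],
)

# Rank genres within each region (1 = highest average sales per game).
ranks = regional_means.rank(ascending=False).astype(int)

# Genre with the highest average in each region.
leaders = regional_means.idxmax()

print(ranks)
print(leaders)
```

In this subset Shooter ranks first everywhere except Japan, where it ranks last and Role-Playing leads, which matches what the bar plots show.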
df = df.sort_values(by=["Genre"], ascending=True)
sns.pointplot(x="Year_of_Release", y="Global_Sales", hue="Genre", data=df, legend=False, ci=0, palette=new_palette)
plt.legend(loc=9, bbox_to_anchor=(0.5, -0.2), ncol=5)
plt.title('Average Global Sales for all games in each Genre from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
This graph shows how each genre performed on average in each year. Shooter stands out the most and seems to have an overall positive trend. Platform also seems to have a positive trend until 2013.
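The numbers behind a plot like this are the mean of `Global_Sales` for every year and genre pair. A minimal sketch of that table, assuming the same column names as `df`: the five `sample` rows are copied from the Racing and Shooter tables above, and `yearly_means` is an illustrative name.

```python
import pandas as pd

# Five rows copied from the Racing and Shooter tables above.
sample = pd.DataFrame(
    {
        "Year_of_Release": [2008, 2008, 2011, 2011, 2011],
        "Genre": ["Racing", "Racing", "Racing", "Shooter", "Shooter"],
        "Global_Sales": [35.52, 0.01, 12.66, 14.73, 0.01],
    }
)

# Rows are years, columns are genres, cells are average sales per game.
yearly_means = sample.pivot_table(
    index="Year_of_Release",
    columns="Genre",
    values="Global_Sales",
    aggfunc="mean",
)

print(yearly_means)
```

A year with no releases in a genre shows up as `NaN`, which is why some genres have gaps in the plotted lines.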
We focus on the more recent platforms, since new games are rarely released for the older ones.
fig, ax = plt.subplots()
threeds = threeds.sort_values(by=["Genre"], ascending=True)
barplot = sns.barplot(x='Genre',y='Global_Sales',data=threeds, ax=ax,ci=0, palette=new_palette)
plt.title('Average Global Sales for all games in each Genre for 3DS from 2000 to 2016')
plt.ylabel('Global Sales(in millions)')
This bar graph shows the average global sales for each genre on the 3DS. Racing is at the top at 1.5 million sales per game, and Adventure is at the bottom with about 0.1 million.
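The per-platform averages behind bar graphs like this one can be computed for every platform at once by grouping on both `Platform` and `Genre`. The sketch below is an assumption about how one might do it, not the notebook's own code; the `sample` rows are copied from the tables above and `platform_genre_means` is an illustrative name.

```python
import pandas as pd

# Six rows copied from the tables above (3DS and PS4 titles).
sample = pd.DataFrame(
    {
        "Platform": ["3DS", "3DS", "3DS", "3DS", "PS4", "PS4"],
        "Genre": ["Racing", "Role-Playing", "Role-Playing",
                  "Simulation", "Shooter", "Sports"],
        "Global_Sales": [12.66, 14.60, 11.68, 9.16, 14.63, 8.57],
    }
)

# Mean sales per (platform, genre); unstack turns genres into columns.
platform_genre_means = (
    sample.groupby(["Platform", "Genre"])["Global_Sales"].mean().unstack()
)

# One platform's genres, highest average first; genres with no titles are dropped.
threeds_means = platform_genre_means.loc["3DS"].dropna().sort_values(ascending=False)

print(threeds_means)
```

Because the sample contains only best sellers, its averages are far higher than those in the graph; on the full `df` the same grouping should give the plotted values.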
fig, ax = plt.subplots()
sns.pointplot(x="Year_of_Release", y="Global_Sales", hue="Genre", data=threeds, legend=False, ci=0, palette=new_palette)
plt.legend(loc=9, bbox_to_anchor=(0.5, -0.2), ncol=5)
plt.title('Average Global Sales for all games in each Genre for 3DS from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
This graph shows the average global sales for each genre on the 3DS in each year. In 2011, Platform and Racing were at the top. In 2012, it was Platform and Simulation, and in 2013, Role-Playing and Simulation. In 2014, Fighting was far ahead, most likely because that is when Super Smash Bros. was released. In 2015 it was Simulation, and in 2016 it was Role-Playing.
fig, ax = plt.subplots()
pc = pc.sort_values(by=["Genre"], ascending=True)
barplot = sns.barplot(x='Genre',y='Global_Sales',data=pc, ax=ax, ci=0, palette=new_palette)
plt.title('Average Global Sales for all games in each Genre for PC from 2000 to 2016')
plt.ylabel('Global Sales(in millions)')
This bar graph shows the average global sales for each genre on the PC. Role-Playing is at the top at 0.5 million sales per game, and both Fighting and Puzzle are at the bottom with about 0.05 million.
fig, ax = plt.subplots()
sns.pointplot(x="Year_of_Release", y="Global_Sales", hue="Genre", data=pc, legend=False, ci=0, palette=new_palette)
plt.legend(loc=9, bbox_to_anchor=(0.5, -0.2), ncol=5)
plt.title('Average Global Sales for all games in each Genre for PC from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
This graph shows the average global sales for each genre on the PC in each year. Misc is at the top in 2000, but this genre does not have data for every year on the PC. Simulation has the most obvious peaks, in 2002, 2009, and 2014. Role-Playing had the highest peak in 2004 and three much smaller peaks in 2007, 2012, and 2015.
fig, ax = plt.subplots()
ps4 = ps4.sort_values(by=["Genre"], ascending=True)
barplot = sns.barplot(x='Genre',y='Global_Sales',data=ps4, ax=ax, ci=0, palette=new_palette)
plt.title('Average Global Sales for all games in each Genre for PS4 from 2000 to 2016')
plt.ylabel('Global Sales(in millions)')
This bar graph shows the average global sales for each genre on the PS4. Shooter is easily at the top at over 2 million sales per game, most likely because Call of Duty is such a popular series on the PS4. Puzzle is at the bottom with a value so small that it is effectively 0 million sales per game.
fig, ax = plt.subplots()
sns.pointplot(x="Year_of_Release", y="Global_Sales", hue="Genre", data=ps4, legend=False, ci=0, palette=new_palette)
plt.legend(loc=9, bbox_to_anchor=(0.5, -0.2), ncol=5)
plt.title('Average Global Sales for all games in each Genre for PS4 from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
This graph shows the average global sales for each genre for the PS4 for each year. Shooter is consistently at the top for all 4 years which makes sense when looking at its corresponding bar graph.
fig, ax = plt.subplots()
wiiu = wiiu.sort_values(by=["Genre"], ascending=True)
barplot = sns.barplot(x='Genre',y='Global_Sales',data=wiiu, ax=ax, ci=0, palette=new_palette)
plt.title('Average Global Sales for all games in each Genre for WiiU from 2000 to 2016')
plt.ylabel('Global Sales(in millions)')
This bar graph shows the average global sales for each genre on the WiiU. Racing is easily at the top at over 2.5 million sales per game, and Adventure is at the bottom with what appears to be less than 0.1 million.
fig, ax = plt.subplots()
sns.pointplot(x="Year_of_Release", y="Global_Sales", hue="Genre", data=wiiu, legend=False, ci=0, palette=new_palette)
plt.legend(loc=9, bbox_to_anchor=(0.5, -0.2), ncol=5)
plt.title('Average Global Sales for all games in each Genre for WiiU from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
This graph shows the average global sales for each genre on the WiiU in each year. Platform begins at the top in 2012, most likely due to a Mario game. In 2014, Racing and Fighting peak, probably due to Mario Kart 8 and Super Smash Bros. Shooter peaked in 2015, likely due to Splatoon's release.
fig, ax = plt.subplots()
x1 = x1.sort_values(by=["Genre"], ascending=True)
barplot = sns.barplot(x='Genre',y='Global_Sales',data=x1, ax=ax, ci=0, palette=new_palette)
plt.title('Average Global Sales for all games in each Genre for Xbox One from 2000 to 2016')
plt.ylabel('Global Sales(in millions)')
This bar graph shows the average global sales for each genre on the Xbox One. Shooter is easily at the top at 1.6 million sales per game, most likely because Call of Duty is such a popular series on the Xbox One, and Strategy is at the bottom with about 0.1 million. Also note that there are no Puzzle games listed for the Xbox One.
fig, ax = plt.subplots()
sns.pointplot(x="Year_of_Release", y="Global_Sales", hue="Genre", data=x1, legend=False, ci=0, palette=new_palette)
plt.legend(loc=9, bbox_to_anchor=(0.5, -0.2), ncol=5)
plt.title('Average Global Sales for all games in each Genre for Xbox One from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
This graph shows the average global sales for each genre for the Xbox One for each year. Shooter is consistently at the top for all 4 years which makes sense when looking at its corresponding bar graph.
In this last section we estimate the trend for each genre using linear regression. This gives a rough indication of how future sales are likely to develop: a negative slope suggests a decrease in global sales per game, whereas a positive slope suggests an increase. This could be useful for companies deciding which new games to develop, and for investors deciding which companies to invest in.
sns.regplot(x="Year_of_Release", y="Global_Sales", data=action,x_estimator=np.mean, ci=0, color = 'yellow')
plt.title('Average Global Sales for all games in Action from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
The fitted line suggests a negative trend for Action games. There is also an outlier in 2013.
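The plot shows the direction of the line but not its slope. One way to get a number is an ordinary least-squares fit with `numpy.polyfit`, sketched below. The helper `trend_slope` and the `toy` frame are hypothetical; the toy values are invented for illustration and are not taken from the dataset. The fit here uses the individual rows, which we believe is also what `regplot` fits even when `x_estimator` is set.

```python
import numpy as np
import pandas as pd


def trend_slope(frame: pd.DataFrame) -> float:
    """Least-squares slope of Global_Sales against Year_of_Release.

    The result is in millions of sales per year.
    """
    years = frame["Year_of_Release"].to_numpy(dtype=float)
    sales = frame["Global_Sales"].to_numpy(dtype=float)
    # Centering the years keeps the fit well conditioned; the slope is unchanged.
    slope, _intercept = np.polyfit(years - years.mean(), sales, deg=1)
    return float(slope)


# Invented values (not real data): sales fall by 0.1 million each year.
toy = pd.DataFrame(
    {
        "Year_of_Release": [2000, 2001, 2002, 2003, 2004],
        "Global_Sales": [1.0, 0.9, 0.8, 0.7, 0.6],
    }
)

slope = trend_slope(toy)
print(f"{slope:+.3f} million per year")
```

Calling `trend_slope(action)` on the real Action frame would give the slope of the line drawn above, assuming the frame has no missing years or sales.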
sns.regplot(x="Year_of_Release", y="Global_Sales", data=adventure,x_estimator=np.mean, ci=0, color = 'black')
plt.title('Average Global Sales for all games in Adventure from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
The fitted line suggests a negative trend for Adventure games.
sns.regplot(x="Year_of_Release", y="Global_Sales", data=fighting,x_estimator=np.mean, ci=0, color = 'olive')
plt.title('Average Global Sales for all games in Fighting from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
The fitted line suggests a negative trend for Fighting games.
sns.regplot(x="Year_of_Release", y="Global_Sales", data=misc,x_estimator=np.mean, ci=0, color = 'red')
plt.title('Average Global Sales for all games in Misc from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
The fitted line suggests a negative trend for Misc games.
sns.regplot(x="Year_of_Release", y="Global_Sales", data=platform,x_estimator=np.mean, ci=0, color = 'orange')
plt.title('Average Global Sales for all games in Platform from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
The fitted line suggests a slight positive trend for Platform games. This is also the first graph so far to have a positive trend.
sns.regplot(x="Year_of_Release", y="Global_Sales", data=puzzle,x_estimator=np.mean, ci=0, color = 'pink')
plt.title('Average Global Sales for all games in Puzzle from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
The fitted line suggests a negative trend for Puzzle games.
sns.regplot(x="Year_of_Release", y="Global_Sales", data=racing,x_estimator=np.mean, ci=0, color = 'green')
plt.title('Average Global Sales for all games in Racing from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
The fitted line shows a very slight negative trend for Racing games.
sns.regplot(x="Year_of_Release", y="Global_Sales", data=role_playing,x_estimator=np.mean, ci=0, color = 'grey')
plt.title('Average Global Sales for all games in Role-Playing from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
The fitted line shows an obvious negative trend for Role-Playing games.
sns.regplot(x="Year_of_Release", y="Global_Sales", data=shooter,x_estimator=np.mean, ci=0, color='purple')
plt.title('Average Global Sales for all games in Shooter from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
This graph shows a very obvious positive trend for Shooter games, likely helped by series such as Call of Duty and Halo that sell very well.
sns.regplot(x="Year_of_Release", y="Global_Sales", data=simulation,x_estimator=np.mean, ci=0, color = 'blue')
plt.title('Average Global Sales for all games in Simulation from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
There is a negative trend for Simulation games, although 2005 is something of an outlier.
sns.regplot(x="Year_of_Release", y="Global_Sales", data=sports,x_estimator=np.mean, ci=0, color = 'cyan')
plt.title('Average Global Sales for all games in Sports from 2000 to 2016')
plt.xlabel('Year of Release')
plt.ylabel('Global Sales(in millions)')
There is a positive trend for Sports games, likely helped by series such as FIFA, Madden, and 2K that usually sell very well.
sns.regplot(x="Year_of_Release", y="Global_Sales", data=strategy,x_estimator=np.mean, ci=0, color = 'lime')
Last but not least, Strategy games also show a negative trend.
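To compare all genres side by side, the slope can be computed per genre in one pass using the closed-form least-squares formula, slope = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2). This is only a sketch: `ols_slope`, `toy`, and its values are hypothetical and invented for illustration, not taken from the dataset.

```python
import pandas as pd

# Invented values (not real data): one rising genre and one falling genre.
toy = pd.DataFrame(
    {
        "Genre": ["Shooter"] * 3 + ["Puzzle"] * 3,
        "Year_of_Release": [2000, 2008, 2016] * 2,
        "Global_Sales": [0.4, 0.8, 1.2, 0.5, 0.3, 0.1],
    }
)


def ols_slope(group: pd.DataFrame) -> float:
    """Closed-form least-squares slope of sales against year."""
    x = group["Year_of_Release"] - group["Year_of_Release"].mean()
    y = group["Global_Sales"] - group["Global_Sales"].mean()
    return float((x * y).sum() / (x ** 2).sum())


# One slope per genre, largest first, plus a readable direction label.
slopes = {genre: ols_slope(group) for genre, group in toy.groupby("Genre")}
trend = pd.Series(slopes).sort_values(ascending=False)
labels = trend.apply(lambda s: "positive" if s > 0 else "negative")

print(trend)
print(labels)
```

Applied to the full `df`, a table like `trend` would summarize the twelve regression plots above in a single view.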
With an ever-growing industry, video games are becoming commonplace in society. This project shows the value that data analysts bring when working through large amounts of data. Initially, we thought that Action would be the best-performing genre globally. Action did have the highest total global sales, but Shooter had the highest average sales per game. We also saw differences between regions. For example, in Japan the genre with the highest average sales per game is Role-Playing, whereas in North America it is Shooter.
If you are interested and have more recent data covering newer consoles such as the Nintendo Switch, PS5, and Xbox Series X, you could extend this analysis to other factors such as microtransactions, season passes, and how COVID-19 affected sales.
In conclusion, we modified the dataframe to remove the columns and rows that were not needed for our analysis, and we produced graphs showing the global sales of video games by genre, region, and platform. This information can be useful to a developer who wants to figure out which types of games are likely to lead to the most profit. We have shown that we can answer questions such as “Would this genre sell well on this video game platform?”, “Which genre is best for maximizing sales globally?”, or “Is it worth it to develop a game in a particular genre?” Hopefully, this tutorial provided some insight into what game developers and analysts have to take into consideration when planning a new game.