Scores
Introduction:
This time you will create the data.
This exercise is based on Chris Albon's work; the credits belong to him.
Step 1. Import the necessary libraries
In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
Step 2. Create the DataFrame that should look like the one below.
In [2]:
raw_data = {'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
            'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
            'female': [0, 1, 1, 0, 1],
            'age': [42, 52, 36, 24, 73],
            'preTestScore': [4, 24, 31, 2, 3],
            'postTestScore': [25, 94, 57, 62, 70]}
df = pd.DataFrame(raw_data)
df
| | first_name | last_name | female | age | preTestScore | postTestScore |
|---|---|---|---|---|---|---|
| 0 | Jason | Miller | 0 | 42 | 4 | 25 |
| 1 | Molly | Jacobson | 1 | 52 | 24 | 94 |
| 2 | Tina | Ali | 1 | 36 | 31 | 57 |
| 3 | Jake | Milner | 0 | 24 | 2 | 62 |
| 4 | Amy | Cooze | 1 | 73 | 3 | 70 |
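A quick way to confirm that the DataFrame matches the table above is to check its shape and column order. This is only a sketch that rebuilds the same `raw_data` dictionary so it can run on its own; the checks themselves are not part of the original exercise.

```python
import pandas as pd

# Same data as in Step 2, repeated so the snippet is self-contained
raw_data = {'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
            'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
            'female': [0, 1, 1, 0, 1],
            'age': [42, 52, 36, 24, 73],
            'preTestScore': [4, 24, 31, 2, 3],
            'postTestScore': [25, 94, 57, 62, 70]}
df = pd.DataFrame(raw_data)

# Five rows (people) and six columns, in dictionary insertion order
print(df.shape)          # (5, 6)
print(list(df.columns))
```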
Step 3. Create a scatter plot of preTestScore and postTestScore, with the size of each point determined by age
Hint: Don’t forget to add the axis labels
In [3]:
df.plot.scatter(x='preTestScore', y='postTestScore', s=df['age'].values)
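`DataFrame.plot.scatter` takes its axis labels from the column names. If you prefer to set the labels yourself, as the hint suggests, one possible approach is to draw on a matplotlib `Axes` directly. The sketch below rebuilds only the columns it needs, and the title text is an illustrative choice rather than something the exercise prescribes.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Only the columns used in this plot, repeated so the snippet runs on its own
df = pd.DataFrame({'age': [42, 52, 36, 24, 73],
                   'preTestScore': [4, 24, 31, 2, 3],
                   'postTestScore': [25, 94, 57, 62, 70]})

fig, ax = plt.subplots()

# Marker area (s) is driven by each person's age
ax.scatter(df['preTestScore'], df['postTestScore'], s=df['age'])

# Explicit labels; the title text is an assumption made for illustration
ax.set_title('preTestScore x postTestScore')
ax.set_xlabel('preTestScore')
ax.set_ylabel('postTestScore')
```

In a notebook with `%matplotlib inline`, the figure renders when the cell finishes; in a plain script you would call `plt.show()` afterwards.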
Step 4. Create a scatter plot of preTestScore and postTestScore.
This time the size should be 4.5 times the postTestScore and the color determined by sex (the female column)
In [4]:
df.plot.scatter(x='preTestScore', y='postTestScore', s=df['postTestScore'] * 4.5, c='female', colormap='viridis')
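Because `female` holds only 0 and 1, the colormap version above shows a continuous colorbar for what is really a two-category variable. An alternative, sketched here under the assumption that you want one fixed color per category, is to map the values to named colors first. The color names and the `colors` and `sizes` variables are illustrative choices, not part of the original exercise.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Only the columns used in this plot, repeated so the snippet runs on its own
df = pd.DataFrame({'female': [0, 1, 1, 0, 1],
                   'preTestScore': [4, 24, 31, 2, 3],
                   'postTestScore': [25, 94, 57, 62, 70]})

# Marker area is 4.5 times the post-test score
sizes = (df['postTestScore'] * 4.5).tolist()

# One named color per category (hypothetical color choices)
colors = df['female'].map({0: 'tab:blue', 1: 'tab:orange'}).tolist()

fig, ax = plt.subplots()
ax.scatter(df['preTestScore'], df['postTestScore'], s=sizes, c=colors)
ax.set_xlabel('preTestScore')
ax.set_ylabel('postTestScore')
```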