'Labore modi labore voluptatem non ipsum dolor. Ut neque tempora voluptatem porro numquam dolor adipisci. Consectetur velit eius est. Modi ipsum ut aliquam tempora quaerat consectetur eius. Neque labore porro porro labore. Sed consectetur velit velit sed sed porro. Etincidunt quiquia dolorem est dolor magnam. Magnam tempora modi etincidunt consectetur etincidunt.'
LAB: Fake report
Motivation
Reproducible data gathering
Data and metadata should be retrieved programmatically.
Here is an example from the OECD API documentation. Other providers, such as Eurostat, have their own APIs.
url = "https://sdmx.oecd.org/public/rest/data/OECD.SDD.STES,DSD_STES@DF_CLI/.M.LI...AA...H?startPeriod=2023-02&dimensionAtObservation=AllDimensions&format=csvfilewithlabels"
# df = pd.read_csv(url)
# df.info()
Note that in your report, data-gathering chunks should go to the appendix.
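One way to keep such a query readable and reproducible is to assemble the URL from its parts instead of pasting it whole. The sketch below is only illustrative: the helper name `build_sdmx_url` and the split into base, dataflow, key and query parameters are assumptions made here, not something taken from the OECD documentation. It simply rebuilds the URL shown above.

```python
from urllib.parse import urlencode


def build_sdmx_url(base, dataflow, key, **params):
    """Hypothetical helper: join base, dataflow and key, then append the query string."""
    query = urlencode(params)  # keeps the keyword arguments in the order given
    return f"{base}/{dataflow}/{key}?{query}"


url = build_sdmx_url(
    "https://sdmx.oecd.org/public/rest/data",
    "OECD.SDD.STES,DSD_STES@DF_CLI",
    ".M.LI...AA...H",
    startPeriod="2023-02",
    dimensionAtObservation="AllDimensions",
    format="csvfilewithlabels",
)
print(url)

# Downloading needs network access, so it is left commented out here:
# import pandas as pd
# df = pd.read_csv(url)
```

Keeping the start period and the output format as named arguments makes it easier to rerun the same query later with a different time window.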
200 countries, 50 years, 20 lines of code
You should build your report and your presentation around plots and tables.
Display tables with style (no kitsch please).
| | Continent | level_1 | Country | Year | Life Expectancy | Population (Millions Inhab.) | GDP per Capita |
|---|---|---|---|---|---|---|---|
| 0 | Africa | 1262 | Reunion | 1962 | 57.7 | 0.4 | $3,174 |
| 1 | Africa | 1348 | Sierra Leone | 1972 | 35.4 | 2.9 | $1,354 |
| 2 | Americas | 1115 | Nicaragua | 2007 | 72.9 | 5.7 | $2,749 |
| 3 | Americas | 1610 | United States | 1962 | 70.2 | 186.5 | $16,173 |
| 4 | Asia | 702 | India | 1982 | 56.6 | 708.0 | $856 |
| 5 | Asia | 1213 | Philippines | 1957 | 51.3 | 26.1 | $1,548 |
| 6 | Europe | 1148 | Norway | 1992 | 77.3 | 4.3 | $33,966 |
| 7 | Europe | 150 | Bosnia and Herzegovina | 1982 | 70.7 | 4.2 | $4,127 |
| 8 | Oceania | 68 | Australia | 1992 | 77.6 | 17.5 | $23,425 |
| 9 | Oceania | 1096 | New Zealand | 1972 | 71.9 | 2.9 | $16,046 |
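The number formats in this table are ordinary Python format specifications, so they can be previewed without building a styled table. The following is a minimal sketch: the format strings are the ones used in the appendix code, while the sample values are made up for illustration and are not taken from the data.

```python
# Format specifications per column, as used for the table above
formats = {
    "Population (Millions Inhab.)": "{:.1f}",  # one decimal place
    "Life Expectancy": "{:.1f}",               # one decimal place
    "GDP per Capita": "${:,.0f}",              # dollar sign, thousands separator, no decimals
}

# Illustrative values only (hypothetical, not read from gapminder)
sample_row = {
    "Population (Millions Inhab.)": 186.538,
    "Life Expectancy": 70.21,
    "GDP per Capita": 16173.15,
}

# Apply each column's format to its value
formatted = {col: formats[col].format(value) for col, value in sample_row.items()}

for col, text in formatted.items():
    print(f"{col}: {text}")
```

Trying the specifications on a few values first is a cheap way to check units and rounding before they reach the report.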
Visualization matters
Wizardry
Appendix
The appendix should contain the code that helped you create the document.
# Load the data
#
import pandas as pd
import plotly
import plotly.express as px
pd.options.plotting.backend = "plotly"
from gapminder import gapminder
#
# Sample 2 rows per continent and format for display
#
df = (
    gapminder
    .groupby("continent", group_keys=True)
    .apply(lambda g: g.sample(n=2, random_state=42), include_groups=False)
    .reset_index()
    .assign(pop=lambda x: x["pop"] / 1e6)  # population in millions
)
df = df.rename(columns={
    "country": "Country",
    "continent": "Continent",
    "year": "Year",
    "lifeExp": "Life Expectancy",
    "pop": "Population (Millions Inhab.)",
    "gdpPercap": "GDP per Capita",
})
df.style.format(
    {
        "Population (Millions Inhab.)": "{:.1f}",
        "Life Expectancy": "{:.1f}",
        "GDP per Capita": "${:,.0f}",
    }
)
#
#
# A neater color scale
#
neat_color_scale = {
    "Africa": "#01d4e5",
    "Americas": "#7dea01",
    "Asia": "#fc5173",
    "Europe": "#fde803",
    "Oceania": "#536227",
}
#
# Use plotly backend for dataframes
#
pd.options.plotting.backend = "plotly"
#
# Scatter plot for a single, randomly chosen year
#
a_year = gapminder["year"].sample(1, random_state=42).iloc[0]
df = gapminder[gapminder["year"] == a_year].copy()
p = px.scatter(
    data_frame=df,
    x="gdpPercap",
    y="lifeExp",
    size="pop",
    color="continent",
    hover_data={
        "country": True,
        "pop": ":,",
        "gdpPercap": ":,.0f",
        "lifeExp": ":.1f",
    },
    log_x=True,
    size_max=15,
    color_discrete_map=neat_color_scale,
    opacity=0.5,
    title=f"Gapminder {a_year}",
    labels={
        "gdpPercap": "Yearly Income per Capita",
        "lifeExp": "Life Expectancy",
    },
)
p.update_layout(
    xaxis_title="Yearly Income per Capita",
    yaxis_title="Life Expectancy",
    annotations=[
        dict(
            text="From sick and poor (bottom left) to healthy and rich (top right)",
            xref="paper",
            yref="paper",
            x=0.5,
            y=-0.12,
            showarrow=False,
            font=dict(size=10),
        )
    ],
)
p
#
# Animated scatter on the whole gapminder dataframe, year as frame
#
p_animated = px.scatter(
    data_frame=gapminder,
    x="gdpPercap",
    y="lifeExp",
    size="pop",
    color="continent",
    animation_frame="year",
    hover_data={
        "country": True,
        "pop": ":,",
        "gdpPercap": ":,.0f",
        "lifeExp": ":.1f",
    },
    log_x=True,
    size_max=15,
    color_discrete_map=neat_color_scale,
    opacity=0.5,
    title="Gapminder",
    labels={
        "gdpPercap": "Yearly Income per Capita",
        "lifeExp": "Life Expectancy",
    },
)
p_animated.update_layout(
    height=500,
    width=750,
    xaxis_title="Yearly Income per Capita",
    yaxis_title="Life Expectancy",
    showlegend=True,
)
p_animated