Session 2: Python & R Fundamentals

GDP Growth Nowcasting Workshop — Jamaica

Diego A. Guerrero

2026-05-01

Learning Objectives

Become familiar with the basic concepts of Python and R
Identify and manipulate data structures
Run Python / R scripts
Perform basic data loading and cleaning
Compute summary statistics and visualize time series

Introduction to Python

Why Python?

Speed, reproducibility, flexibility, and a vast ecosystem of libraries
Integrates with R, SQL, Excel, Stata, and APIs
Scales from a small dataset to cloud-scale Big Data
Free and open source, with no licensing costs for government use

Key Libraries

pandas — data manipulation
numpy — numerical computing
matplotlib — static plots
seaborn — statistical graphics

scikit-learn — machine learning
statsmodels — econometric models
openpyxl — Excel I/O
jupyter — interactive notebooks

First Python Program

print("Hello, Jamaica!")
print(2 + 5)

Hello, Jamaica!
7

Python Syntax

Variables and Types

country = "Jamaica"
gdp = 20_500      # USD millions (approximate)
growth_rate = 0.035
island = True

print(country, gdp)
print(growth_rate, island)

Jamaica 20500
0.035 True

Variables are stored in RAM — keep datasets manageable.
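
To make the RAM point concrete, the sketch below measures how much memory a small table occupies. The data are random numbers generated only for illustration, and the names toy, bytes_per_column and total_kb are hypothetical; memory_usage is a standard pandas method.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 1,000 rows of random numbers (not workshop data)
rng = np.random.default_rng(seed=0)
toy = pd.DataFrame({
    "x": rng.normal(size=1_000),             # float64 column
    "y": rng.integers(0, 100, size=1_000),   # integer column
})

# Bytes used by the index and by each column, then the total in kilobytes
bytes_per_column = toy.memory_usage(deep=True)
total_kb = bytes_per_column.sum() / 1024

print(bytes_per_column)
print(f"Total: {total_kb:.1f} KB")
```

Each float64 value takes 8 bytes, so memory grows in proportion to rows times columns.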

Data Types

a = 10           # integer
b = 3.14         # float
c = "Economics"  # string
d = True         # boolean

print(type(a), type(b), type(c), type(d))

<class 'int'> <class 'float'> <class 'str'> <class 'bool'>

Common types:

int — whole numbers
float — decimals
str — text
bool — True/False

String Formatting

country = "Jamaica"
gdp = 20500 
growth_rate = 0.035

print(f"The country {country} has a GDP of USD {gdp}M "
      f"and grew at {growth_rate*100:.1f}%.")

The country Jamaica has a GDP of USD 20500M and grew at 3.5%.

Arithmetic Operations

gdp_1 = 19890
gdp_2 = 20500

# Growth rate
growth = gdp_2 / gdp_1 - 1

print(f"GDP grew by {round(growth*100, 2)}%.")

GDP grew by 3.07%.

Conditional Statements

Python's comparison operators (>, <, >=, <=, ==, !=) and membership tests (in, not in) return booleans (True/False); the logical operators and, or, and not combine or negate them.

growth = 0.035

if growth > 0:
    print("Economy expanding")
elif growth == 0:
    print("No growth")
else:
    print("Economy contracting")

Economy expanding
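
The operators mentioned above can also be evaluated on their own, outside an if statement. The following is a minimal sketch with illustrative variable names; each expression yields True or False.

```python
growth = 0.035
countries = ["Jamaica", "Barbados", "Guyana"]   # illustrative list

is_positive = growth > 0                  # comparison
is_member = "Jamaica" in countries        # membership test
is_not_member = "Cuba" not in countries   # negated membership test
both = is_positive and is_member          # logical combination

print(is_positive, is_member, is_not_member, both)
```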

Conditional Example: GDP

gdp_growth = -1.5   # percent

if gdp_growth >= 3:
    print("Strong growth")
elif gdp_growth >= 1:
    print("Moderate growth")
elif gdp_growth >= 0:
    print("Stagnation")
else:
    print("Recession")

print("Program finished.")

Recession
Program finished.

Data Structures (Collections)

Structure	Example	Key Features
List	`[2020, 2021, 2022]`	Ordered, mutable
Tuple	`(2020, 2021, 2022)`	Ordered, immutable
Dictionary	`{"year": 2023, "gdp": 3.5}`	Key–value pairs
Set	`{2020, 2021, 2022}`	Unordered, unique
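
The four structures in the table can be tried out directly. This is a small sketch with made-up values, showing one typical operation for each.

```python
years = [2020, 2021, 2022]             # list: ordered, mutable
years.append(2023)                     # lists can grow in place

period = (2020, 2022)                  # tuple: ordered, immutable
start, end = period                    # tuple unpacking

record = {"year": 2023, "gdp": 3.5}    # dictionary: key-value pairs
record["country"] = "Jamaica"          # add a new key

unique_years = set([2020, 2021, 2021, 2022])   # set: duplicates are dropped

print(years)
print(start, end)
print(record["gdp"])
print(len(unique_years))
```

Trying period[0] = 2019 would raise a TypeError, which is what "immutable" means in practice.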

Loops

for — iterate over a sequence:

years = [2020, 2021, 2022, 2023, 2024]
for year in years:
    print(year)

while — repeat while a condition is true:

count = 0
while count < 4:
    print(f"Quarter {count + 1}")
    count += 1

Quarter 1
Quarter 2
Quarter 3
Quarter 4

Loops and Conditionals

gdp_series = [2.1, -3.5, 1.8, 0.4, 3.2, -0.2, 2.9]

for i, g in enumerate(gdp_series):
    if g < 0:
        print(f"Q{i+1}: recession ({g}%)")
    else:
        print(f"Q{i+1}: growth ({g}%)")

Q1: growth (2.1%)
Q2: recession (-3.5%)
Q3: growth (1.8%)
Q4: growth (0.4%)
Q5: growth (3.2%)
Q6: recession (-0.2%)
Q7: growth (2.9%)

Functions

A function is a reusable block of code that performs a specific task.

def annualized_growth(q_growth):
    """Convert quarterly growth to annualized rate."""
    return ((1 + q_growth / 100) ** 4 - 1) * 100

print(f"Annualized: {annualized_growth(0.875):.2f}%")

Annualized: 3.55%

return sends a value back to the caller.
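
Functions can also take optional arguments with default values. The sketch below generalizes the annualization formula above to any number of periods per year; the name annualize and the parameter periods_per_year are hypothetical, chosen only for this illustration.

```python
def annualize(rate_pct, periods_per_year=4):
    """Compound a per-period growth rate (in percent) to an annual rate."""
    return ((1 + rate_pct / 100) ** periods_per_year - 1) * 100

quarterly = annualize(0.875)                   # default: 4 quarters per year
monthly = annualize(0.3, periods_per_year=12)  # 12 months per year

print(f"From quarterly: {quarterly:.2f}%")
print(f"From monthly:   {monthly:.2f}%")
```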

Functions and Tables

def make_growth_table(values):
    """Return a list of (quarter, growth) rows."""
    table = []
    for i, v in enumerate(values):
        table.append((f"Q{i+1}", round(v, 2)))
    return table

gdp_q = [0.8, -0.9, 1.1, 0.7]
result = make_growth_table(gdp_q)

print("Quarter | Growth%")
print("--------+--------")
for row in result:
    print(f"{row[0]:7} | {row[1]:6}")

Quarter | Growth%
--------+--------
Q1      |    0.8
Q2      |   -0.9
Q3      |    1.1
Q4      |    0.7

DataFrames

Table-like structure from the pandas library:

import pandas as pd

data = {
    "Country": ["Jamaica", "Trinidad", "Barbados", "Guyana"],
    "GDP_USD_B": [18.5, 28.1, 5.7, 26.8],
    "Population_M": [2.8, 1.4, 0.28, 0.79]
}
df = pd.DataFrame(data)
df

	Country	GDP_USD_B	Population_M
0	Jamaica	18.5	2.80
1	Trinidad	28.1	1.40
2	Barbados	5.7	0.28
3	Guyana	26.8	0.79

Data in Python

Importing Data

import pandas as pd

# Load Jamaica data
df = pd.read_csv("../data/data.csv", parse_dates=["date"], index_col="date")
df.head(3)

	RGDP0000	UMBL0000	MLIA1001	MLIA1002	MLIA1003	MLIA1004	MLIA1000	MLIA0003	MLIA0002	MLIA0004	...	UGTR0021	UGTR0006	UGTR0001	REXC0001	REXC0002	XEMP0003	XIMP0003	XEMP0004	XIMP0004	UGTR0000
date
2015-02-01	NaN	54304.080413	31890.278	135559.643	249451.351	116008.373	532909.645	52036.644	1501.182	14318.718	...	10.0	16.0	9.0	NaN	NaN	NaN	NaN	137.776543	25.294407	27.600
2015-03-01	376071.0	55526.300906	29976.701	127010.461	245889.630	112828.497	515705.289	56535.456	588.502	13734.467	...	9.0	19.0	15.0	NaN	NaN	NaN	NaN	143.741899	24.350571	27.050
2015-04-01	NaN	59615.749986	29459.217	130600.536	249975.925	124345.188	534380.866	45095.208	1278.866	14176.166	...	11.0	21.0	10.0	NaN	NaN	NaN	NaN	149.343548	27.563384	28.325

3 rows × 992 columns

Importing Various Formats

import pandas as pd

# CSV
df = pd.read_csv("data.csv", parse_dates=["date"], index_col="date")

# Excel
df = pd.read_excel("data.xlsx", sheet_name="Sheet1")

# Text file
df = pd.read_table("data.txt", sep="\t")

Exporting

df.to_csv("output/data_clean.csv", index=True)
df.to_excel("output/data_clean.xlsx", index=False)
df.to_json("output/data_clean.json", orient="records")

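JSON is a widely used plain-text exchange format. With orient="records", each row is written as an object of column-value pairs, much like a Python dictionary; note that this orientation does not include the index.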
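
To see what the records layout looks like, the sketch below converts a hypothetical two-row table to JSON text and reads it back with the standard json module. The table and its values are made up for illustration.

```python
import json

import pandas as pd

# Hypothetical two-row table (values are illustrative)
toy = pd.DataFrame({"country": ["Jamaica", "Barbados"], "growth": [1.5, 2.0]})

# With no path argument, to_json returns the JSON text as a string
json_text = toy.to_json(orient="records")
print(json_text)

# Parsing the text gives a list of dictionaries, one per row
rows = json.loads(json_text)
print(rows[0]["country"])
```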

Data Exploration

import pandas as pd
df = pd.read_csv("../data/data.csv", parse_dates=["date"], index_col="date")
df[["RGDP0000", "UGTR0000", "UMBL0000"]].tail(6)

	RGDP0000	UGTR0000	UMBL0000
date
2025-11-01	NaN	50.525	52926.440532
2025-12-01	406765.0	49.650	74554.652345
2026-01-01	NaN	46.125	NaN
2026-02-01	NaN	NaN	NaN
2026-03-01	NaN	NaN	NaN
2026-04-01	NaN	NaN	NaN

DataFrame Summary

import pandas as pd
df = pd.read_csv("../data/data.csv", parse_dates=["date"], index_col="date")

print("=== Info ===")
df.info()                  # prints dtypes and non-null counts itself (returns None)

print("=== Summary stats ===")
print(df.describe())       # numeric summary statistics

print("=== Shape ===")
print(df.shape)            # (rows, columns)

print("=== Column names ===")
print(df.columns.tolist()[:10])   # first 10 column names

print("=== Data types ===")
print(df.dtypes)
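
A common next step after these summaries is to count missing values per column. The sketch below does this on a hypothetical toy table rather than the workshop dataset; the column names and values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy table with gaps (not the workshop dataset)
toy = pd.DataFrame(
    {"gdp": [np.nan, 100.0, np.nan, 102.0],
     "trends": [50.0, 52.0, 51.0, np.nan]},
    index=pd.date_range("2025-01-01", periods=4, freq="MS"),
)

missing_per_column = toy.isna().sum()     # number of NaN values in each column
share_missing = toy.isna().mean() * 100   # percent missing in each column
complete_rows = toy.dropna()              # keep only rows with no missing values

print(missing_per_column)
print(share_missing)
print(complete_rows)
```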

GDP Series

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("../data/data.csv", parse_dates=["date"], index_col="date")
gdp = df["RGDP0000"].dropna()

gdp.plot(color="#005BAC", linewidth=2, figsize=(7, 4))
plt.title("Jamaica Real GDP", fontsize=14, weight="bold")
plt.ylabel("JMD Millions")
plt.xlabel("")
plt.tight_layout()
plt.show()

Selecting and Filtering

import pandas as pd
df = pd.read_csv("../data/data.csv", parse_dates=["date"], index_col="date")

# Select one column
gdp = df["RGDP0000"].dropna()

# Filter to 2018 onward (2018 itself is included)
recent = gdp[gdp.index >= "2018-01-01"]

# Describe
recent.describe()

count        32.000000
mean     410715.406250
std       22563.746281
min      338323.000000
25%      400882.000000
50%      408963.000000
75%      428480.000000
max      438665.000000
Name: RGDP0000, dtype: float64

Complex Filtering

import pandas as pd
df = pd.read_csv("../data/data.csv", parse_dates=["date"], index_col="date")
gdp = df["RGDP0000"].dropna()

# Keep only quarters with negative quarter-on-quarter growth
gdp_growth = gdp.pct_change() * 100
recession = gdp_growth[gdp_growth < 0]
print(recession)

date
2015-06-01    -1.270239
2015-09-01    -0.988166
2020-03-01    -0.591655
2020-06-01   -16.689321
2022-12-01    -0.033343
2023-06-01    -0.066758
2023-12-01    -0.778219
2024-03-01    -0.054704
2024-06-01    -0.997453
2024-09-01    -1.055556
2025-12-01    -7.272064
Name: RGDP0000, dtype: float64
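
Filters can combine several conditions. The following sketch uses a hypothetical growth series (illustrative values, not the workshop data) to show boolean masks joined with & and |, plus date-range slicing with .loc.

```python
import pandas as pd

# Hypothetical quarterly growth rates in percent (illustrative values)
growth = pd.Series(
    [2.1, -3.5, 1.8, 0.4, 3.2, -0.2],
    index=pd.date_range("2023-01-01", periods=6, freq="QS"),
)

# Combine conditions with & (and) and | (or); each condition needs parentheses
mild_contraction = growth[(growth < 0) & (growth > -1)]
extreme = growth[(growth < -3) | (growth > 3)]

# Restrict to a date range by slicing the DatetimeIndex with .loc
in_2023 = growth.loc["2023-01-01":"2023-12-31"]

print(mild_contraction)
print(extreme)
print(in_2023)
```

The parentheses matter because & and | bind more tightly than the comparison operators.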

Combining Datasets: concat, merge, join

Method	Use	Based On	Example	Notes
`pd.concat()`	Stack rows or columns	Axis	`pd.concat([df1, df2])`	Simple append
`pd.merge()`	SQL-style JOIN on key columns	Key columns	`pd.merge(df1, df2, on="date")`	Most flexible
`df.join()`	Combine on index	Index	`df1.join(df2)`	Convenient for time series
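
The three methods in the table can be compared side by side on small tables. This is a minimal sketch with hypothetical toy data; the table names gdp_a, gdp_b and trends are made up for illustration.

```python
import pandas as pd

# Hypothetical toy tables (values are illustrative)
gdp_a = pd.DataFrame({"date": ["2024-03-01", "2024-06-01"], "gdp": [100.0, 101.0]})
gdp_b = pd.DataFrame({"date": ["2024-09-01"], "gdp": [102.0]})
trends = pd.DataFrame({"date": ["2024-06-01", "2024-09-01"], "trends": [48.0, 50.0]})

# concat: stack the two GDP tables on top of each other
gdp_all = pd.concat([gdp_a, gdp_b], ignore_index=True)

# merge: SQL-style inner join on the shared "date" column
merged = pd.merge(gdp_all, trends, on="date", how="inner")

# join: same idea, but matching on the index instead of a column
joined = gdp_all.set_index("date").join(trends.set_index("date"), how="left")

print(gdp_all)
print(merged)
print(joined)
```

The inner merge keeps only dates present in both tables, while the left join keeps every GDP date and fills unmatched rows with NaN.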

Merge Example

import pandas as pd

df = pd.read_csv("../data/data.csv", parse_dates=["date"], index_col="date")

# Select GDP and Google Trends index
gdp   = df[["RGDP0000"]].dropna()
gtrends = df[["UGTR0000"]].dropna()

# Merge on date index
merged = gdp.join(gtrends, how="inner")
merged.tail(6)

	RGDP0000	UGTR0000
date
2024-09-01	424159.0	47.950
2024-12-01	431334.0	48.275
2025-03-01	436021.0	47.875
2025-06-01	437133.0	45.775
2025-09-01	438665.0	50.175
2025-12-01	406765.0	49.650

Reshaping

wide → long: `melt()`

# "date" is the index of df, so turn it into a column before melting
long_df = df.reset_index().melt(
    id_vars="date",
    value_vars=["RGDP0000", "UGTR0000"],
    var_name="variable",
    value_name="value")
long_df.head()

long → wide: `pivot()`

wide_df = long_df.pivot(index="date", columns="variable", values="value")
wide_df.head()
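
A self-contained round trip makes the two shapes easier to compare. The sketch below uses a hypothetical two-row table in which the date is an ordinary column; the names wide, long_toy and wide_again are illustrative.

```python
import pandas as pd

# Hypothetical wide table with the date as an ordinary column
wide = pd.DataFrame({
    "date": pd.to_datetime(["2025-03-01", "2025-06-01"]),
    "gdp": [100.0, 101.0],
    "trends": [48.0, 46.0],
})

# Wide -> long: one row per (date, variable) pair
long_toy = wide.melt(id_vars="date", var_name="variable", value_name="value")

# Long -> wide: variables become columns again, indexed by date
wide_again = long_toy.pivot(index="date", columns="variable", values="value")

print(long_toy)
print(wide_again)
```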

Resample

Transforms data to a different frequency (e.g. monthly → quarterly):

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("../data/data.csv", parse_dates=["date"], index_col="date")

# Google Trends — monthly
gt_monthly = df["UGTR0000"].dropna()

# Resample to quarterly mean
gt_quarterly = gt_monthly.resample("QS").mean()

plt.figure(figsize=(8, 4))
plt.plot(gt_monthly.index, gt_monthly, alpha=0.4, label="Monthly")
plt.plot(gt_quarterly.index, gt_quarterly, color="red",
         linewidth=2, label="Quarterly")
plt.legend(frameon=False)
plt.title("Google Trends Index — Monthly vs Quarterly")
plt.show()

Resample Frequencies

Code	Frequency	Example Date
`"D"`	Daily	2026-01-01
`"W"`	Week-end (Sunday by default)	2026-01-04
`"MS"`	Month-start	2026-01-01
`"QS"`	Quarter-start	2026-01-01
`"YS"`	Year-start (`"AS"` is deprecated since pandas 2.2)	2026-01-01
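
The frequency codes can be tried on a synthetic series. The sketch below builds a hypothetical daily series (the values 1 to 90 over the first 90 days of 2026) and aggregates it at three frequencies; the choice of mean, sum or last is an assumption that depends on the variable.

```python
import pandas as pd

# Hypothetical daily series: the values 1..90 over the first 90 days of 2026
daily = pd.Series(
    range(1, 91),
    index=pd.date_range("2026-01-01", periods=90, freq="D"),
)

monthly_mean = daily.resample("MS").mean()   # one value per month, labeled at month start
quarterly_sum = daily.resample("QS").sum()   # one value per quarter, labeled at quarter start
weekly_last = daily.resample("W").last()     # last observation of each week (weeks end on Sunday)

print(monthly_mean)
print(quarterly_sum)
print(weekly_last.head(3))
```

Averages suit indexes and rates, sums suit flows such as monthly sales, and the last value suits end-of-period stocks.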

Activity

Loading and Visualizing Jamaica Data

Open workshop_code/s2_basic.ipynb and:

Load two datasets: gdp.csv and sp500.csv
Perform basic cleaning and merge
Visualize
Compute growth rates for each variable

Appendix — Visualization

Line Plot

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("../data/data.csv", parse_dates=["date"], index_col="date")
gdp = df["RGDP0000"].dropna()
gdp_g = gdp.pct_change() * 100

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(gdp_g.index, gdp_g, color="#005BAC", linewidth=2)
ax.axhline(0, color="grey", linestyle="--", linewidth=0.8)
ax.set_title("Jamaica Real GDP Growth (QoQ%)", fontsize=14, weight="bold")
ax.set_ylabel("Percent")
plt.tight_layout()
plt.show()

Bar Chart

import matplotlib.pyplot as plt

sectors = ["Agriculture", "Manufacturing", "Tourism", "Finance", "Other"]
shares  = [6, 9, 30, 18, 37]

fig, ax = plt.subplots(figsize=(7, 4))
ax.bar(sectors, shares,
       color=["#1f77b4", "#ff7f0e", "#2ca02c", "#9467bd", "#d62728"])
ax.set_title("Jamaica GDP by Sector (approx.)", fontsize=13, weight="bold")
ax.set_ylabel("Percent of GDP")
ax.grid(axis="y", alpha=0.3)
plt.tight_layout()
plt.show()

Scatter Plot

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("../data/data.csv", parse_dates=["date"], index_col="date")
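# Quarterly averages; drop quarters where either series is missing
# before computing growth, so trailing gaps are not forward-filled
df_q = df[["RGDP0000", "UGTR0000"]].resample("QS").mean().dropna()
df_q = df_q.pct_change().dropna() * 100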

fig, ax = plt.subplots(figsize=(7, 5))
ax.scatter(df_q["UGTR0000"], df_q["RGDP0000"],
           s=70, color="#005BAC", alpha=0.7, edgecolors="white")
ax.set_title("Google Trends vs GDP Growth", fontsize=13, weight="bold")
ax.set_xlabel("Google Trends Growth (%)")
ax.set_ylabel("GDP Growth (%)")
ax.grid(alpha=0.3)
plt.tight_layout()
plt.show()
