Topics covered: Installation, variables, data types, operators, control flow, functions, Git
Learning objectives: By the end of this week you will be able to apply Python fundamentals to real datasets, write executable Python code for each technique, and complete both graded assignments independently.
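Python is the dominant language in data science. Install Anaconda Distribution (anaconda.com), which bundles Python 3.11, Jupyter Notebook, and 250+ scientific packages. A variable is a name that refers to a stored value; you create one by assigning to it with =. Python has four basic scalar types: int (whole numbers), float (decimal numbers), str (text), and bool (True/False). The built-in type() function reports the type of a value, and int(), float(), str(), and bool() convert between types.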
student_name = 'Amara Okafor'
age = 28
height_m = 1.72
is_enrolled = True
print(type(student_name)) # <class 'str'>
print(type(age)) # <class 'int'>
# Type conversion
year_str = '2024'
year_int = int(year_str)
print(year_int + 1) # 2025
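Type conversion has a few behaviours that tend to surprise newcomers. The following is a minimal sketch using made-up values (price_str and the other variable names are illustrative, not part of the course materials). It shows that int() truncates a float, that any non-empty string converts to True, and that int() raises ValueError on a decimal string.

```python
# Converting between the four scalar types (all values are illustrative).
price_str = '19.99'
price = float(price_str)        # str -> float: 19.99
whole_units = int(price)        # float -> int truncates toward zero: 19
label = str(whole_units)        # int -> str: '19'

# Any non-empty string is truthy, including the text 'False'.
flag_from_text = bool('False')  # True
flag_from_empty = bool('')      # False

# int() cannot parse a decimal string directly; it raises ValueError.
try:
    int(price_str)
    parsed_directly = True
except ValueError:
    parsed_directly = False

print(price, whole_units, label)        # 19.99 19 19
print(flag_from_text, flag_from_empty)  # True False
print(parsed_directly)                  # False
```

To turn a decimal string into a whole number, convert it to float first and then to int, as the sketch does.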
Python supports 7 arithmetic operators (+, -, *, /, //, %, **). Comparison operators return booleans (==, !=, <, >, <=, >=). Logical operators combine boolean expressions (and, or, not). Conditional execution uses if/elif/else, with each branch body indented (4 spaces by convention). For loops iterate over sequences, either by index using range() or directly over the items of a list; enumerate() yields each item together with a counter.
# Conditional logic - loan risk classification
credit_score = 680
income = 350000 # NGN annual
if credit_score >= 750 and income >= 500000:
    risk_tier = 'Prime'
elif credit_score >= 650 and income >= 250000:
    risk_tier = 'Standard'
elif credit_score >= 550:
    risk_tier = 'Sub-prime'
else:
    risk_tier = 'Declined'
print(f'Loan decision: {risk_tier}')
# For loop with enumerate
datasets = ['titanic', 'iris', 'boston_housing']
for i, name in enumerate(datasets, start=1):
    print(f'{i}. Loading: {name}.csv')
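The arithmetic, comparison, and logical operators listed earlier in this section can be tried out with a short sketch. The values and variable names (total_marks, num_questions, and so on) are invented for illustration only.

```python
# Illustrative values, not taken from a real dataset.
total_marks = 17
num_questions = 5

# The seven arithmetic operators.
print(total_marks + num_questions)   # 22   addition
print(total_marks - num_questions)   # 12   subtraction
print(total_marks * num_questions)   # 85   multiplication
print(total_marks / num_questions)   # 3.4  true division, always a float
print(total_marks // num_questions)  # 3    floor division
print(total_marks % num_questions)   # 2    remainder (modulo)
print(total_marks ** 2)              # 289  exponentiation

# Comparison operators return booleans.
passed = total_marks >= 15           # True
perfect = total_marks == 25          # False

# Logical operators combine boolean expressions.
needs_review = passed and not perfect
print(needs_review)                  # True
```

Note the difference between / and //: the first always produces a float, while the second rounds the quotient down to a whole number.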
A function is a reusable named block of code. Define one with def, list its parameters in parentheses, and use return to send a value back to the caller. Default parameters allow callers to omit arguments. Git tracks changes to your project files: git add stages files, git commit saves a snapshot, and git push uploads your commits to GitHub. Every data science project must be version-controlled from day 1.
def calculate_bmi(weight_kg, height_m):
    """Calculate BMI and return value with WHO classification."""
    bmi = weight_kg / (height_m ** 2)
    if bmi < 18.5:
        category = 'Underweight'
    elif bmi < 25.0:
        category = 'Normal weight'
    elif bmi < 30.0:
        category = 'Overweight'
    else:
        category = 'Obese'
    return {'bmi': round(bmi, 2), 'category': category}
result = calculate_bmi(70, 1.75)
print(result) # {'bmi': 22.86, 'category': 'Normal weight'}
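Default parameters are mentioned above but not shown in calculate_bmi. The sketch below illustrates them with a hypothetical helper, format_amount; the function name, its parameters, and the default values are assumptions made for this example rather than part of the course code.

```python
def format_amount(amount, currency='NGN', decimals=2):
    """Return the amount as text with thousands separators and a currency code.

    currency and decimals have defaults, so callers may omit them.
    """
    return f'{currency} {amount:,.{decimals}f}'


# Only the required argument: both defaults apply.
print(format_amount(350000))                 # NGN 350,000.00

# Override the first default positionally.
print(format_amount(350000, 'USD'))          # USD 350,000.00

# Override one default by keyword and keep the other.
print(format_amount(1234.567, decimals=1))   # NGN 1,234.6
```

Parameters with defaults must come after parameters without them in the def line, which is why amount is listed first.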
Submit completed notebooks to your GitHub repository before the next session. Feedback will be provided within 48 hours.
Build a BMI calculator that validates its inputs (weight 20-300 kg, height 0.5-2.5 m) and returns the BMI value and WHO classification. Test it with 5 different inputs, then push the code to a new GitHub repository.
Write celsius_to_fahrenheit(temp) and fahrenheit_to_celsius(temp) functions, then print a conversion table from 0 °C to 100 °C in steps of 10 °C.