IFEBY310 Syllabus
Organization
We will have one weekly lecture. Each lecture is organized around slides and notebooks. We will switch from blackboard to laptop and back, so you are invited to bring your laptop to the lectures.
| | Day | Hour | Room | Dates |
|---|---|---|---|---|
| Lecture | Monday | 10:45 - 12:45 | Halle aux Farines 166 E | 2026-01-12–2026-04-13 |
We will not attempt to complete the notebooks during the sessions. You are expected to complete the notebooks on your own time. Solutions (at least partial ones) are available on the course website.
You can fork the course repository and post issues, comments, and corrections.
Objectives
During this course, you shall learn to:
- Handle medium-sized data using the Python data stack: NumPy/SciPy/pandas
- Scale up and down with Dask
- Handle Big Data with Spark (PySpark)
- Manage and store data using dedicated columnar formats (Parquet, ORC, Avro, Arrow)
Communication
Course material: s-v-b.github.io/IFEBY310. Fork the repo and use GitHub issues to send feedback (no email, please).
Announcements are posted on Moodle.
Register on the Moodle portal to receive updates.
References
- Pandas Book
- Python Data Science Handbook
- Dask
- Spark
- Data pipelines
- Alice
- PostgreSQL documentation
- Next Generation Databases: NoSQL and Big Data, Guy Harrison
- Guy Harrison's blog
- Database Trends and Applications
- Upcoming book “Principles of Databases”, by Marcelo Arenas, Pablo Barcelo, Leonid Libkin, Wim Martens, and Andreas Pieris.
Evaluation
Two homeworks/projects
Grading
Tips
- Have a look at the slides before each lecture
- Don’t jump straight to the solutions
- Use online help (StackOverflow, ChatGPT, Copilot, …)
- Read error messages
Code of conduct
TL;DR: No cheating!
Save the dates!
- January 12: Course kick-off
- March 2: Winter holidays
- April 6: Easter Monday
- April 13: Last session