IFEBY310 Syllabus

Organization

We will have one weekly lecture. Each lecture is organized around Slides and Notebooks. We will switch from blackboard to laptop and back. You are invited to bring your laptop to the lectures.

         Day     Hour           Room                Start
Lecture  Friday  15:45 - 17:46  Sophie Germain 014  2025-01-17

We will not attempt to complete the notebooks during the sessions. You are expected to complete the notebooks on your own time. Solutions (at least partial ones) are available on the course website.

You can fork the course repository and post issues, comments, and corrections.

Objectives

During this course, you shall learn to:

  • Handle medium-sized data using the Python data stack: NumPy/SciPy/pandas
  • Scale up and down with Dask
  • Handle Big Data with Spark (PySpark)
  • Manage and store data using dedicated columnar formats (Parquet, ORC, Avro, Arrow)
Communication

Course material: s-v-b.github.io/IFEBY310. Fork the repo and use GitHub issues to send feedback (no email, please).

Announcements are posted on Moodle.

Register on the Moodle portal to receive updates.

We use the PostgreSQL server from UFR de Mathématiques. To obtain an ENT account, follow the procedure on Moodle.

You may install and use

Courses

Slides

Notebooks (html and ipynb)

Evaluation
  • Two homework assignments/projects

  • Grading

Tips
  • Have a look at slides before the course
  • Don’t jump straight to the solutions
  • Use online help (Stack Overflow, ChatGPT, Copilot, …)
  • Read error messages
Code of conduct

TL;DR: No cheating!

Save the dates!
  • January 17: Course kick-off
  • February 14: No session
  • February 28: Winter Holidays
  • March 7: Session 6
  • April 4: Room 2017
  • April 18: Easter Holidays
  • April 25: Easter Holidays
  • May 2: Session 12

Université Paris Cité calendar

M1 MIDS calendar