
tulip-lab/privacy-aware-data-science




Privacy-aware Data Science

  • Also known as Differential Privacy or Data Privacy, this course (unit) was originally designed (since 2018) for research students at several top Asia-Pacific universities, including Hunan University and the University of Chinese Academy of Sciences.

  • Materials in this course include resources collected from various open-source online repositories. You are free to use, change and distribute this package.

  • If you find any issue or bug on this site, please submit an issue at tulip-lab/privacy-aware-data-science: GitHub issues

  • Pull requests are welcome: GitHub pull requests

  • Preliminary unit 👉 : GitHub watchers

  • Point of Contact 👉 : Prof. Gang Li

Prepared by TULIP Lab


💡 Content

This course (aka unit) offers a focused study of Differential Privacy, tailored for data science and computer science professionals. It starts with an overview of data privacy concerns, then moves to the core concepts of differential privacy, including ε-differential privacy, its (ε, δ)-approximate relaxation, and noise-addition mechanisms such as the Laplace and Exponential mechanisms.
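
As an illustration of the noise-addition idea mentioned above, here is a minimal sketch of the Laplace mechanism applied to a counting query. It is not taken from the course materials: the function names `laplace_noise` and `private_count` and the toy `ages` list are hypothetical. The sketch assumes the add-or-remove-one-record notion of neighbouring datasets, under which a count has L1 sensitivity 1, so noise drawn from Laplace(0, 1/ε) gives ε-differential privacy.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from Laplace(0, scale).

    The difference of two independent Exponential(rate=1/scale) draws
    follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a counting query with Laplace noise of scale sensitivity / epsilon.

    Assumption: neighbouring datasets differ by adding or removing one
    record, so a count changes by at most 1 (L1 sensitivity = 1).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    sensitivity = 1.0
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical toy data: ages of ten people.
ages = [23, 35, 41, 29, 52, 67, 38, 45, 31, 58]
rng = random.Random(0)  # fixed seed only to make the demo repeatable
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(f"noisy count of people aged 40+: {noisy:.2f}")
```

A smaller ε means a larger noise scale and therefore stronger privacy at the cost of accuracy. This sketch uses ordinary floating-point arithmetic and a non-cryptographic random generator, so it is suitable for study only, not for production releases.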

Key components include applying differential privacy to statistical analysis and machine learning, adapting conventional techniques to uphold privacy standards. Advanced topics cover federated learning and decentralized systems, emphasizing the field's evolving nature. Discussions on ethical and legal aspects of data privacy are included, preparing students to implement privacy-preserving solutions in various professional settings. The course aims to equip participants with essential skills in designing and managing privacy-conscious data analysis projects.

📒 Sessions

Students will have access to a comprehensive range of subject materials, comprising slide handouts, assessment documents, and relevant readings. Students are advised to begin each session by thoroughly reviewing the relevant slide handouts and readings.

Additionally, students are encouraged to supplement their knowledge by conducting independent research, utilizing online resources or referring to textbooks that cover relevant information related to the topics under study.

🗓️ Session Plan

The proposed unit is structured to encompass a total of 100 class hours. This allocation includes 80 hours dedicated to instruction and teaching, complemented by 20 hours set aside for student presentations and discussions.

For optimal integration into university curricula, it is suggested that this unit be divided into two distinct segments (or two consecutive units). This approach is more aligned with typical academic scheduling and facilitates a more manageable and effective learning experience.

Privacy-aware Data Science (I)

The unit plan is as below:

| 🔬 Session | 🏷️ Category | 📒 Topic | 🎯 ULOs | 👨‍🏫 Activity |
|---|---|---|---|---|
| 0️⃣ | Preliminary | 📖 Induction | ULO1 | GitHub watchers |
| 1️⃣ | Preliminary | 📖 Theoretical Foundations | ULO1 | |
| 2️⃣ | Core | 📖 Data Privacy | ULO1 | |
| 3️⃣ | Core | 📖 Privacy Attacks | ULO1 ULO2 | |
| 4️⃣ | Core | 📖 Differential Privacy | ULO1 ULO2 | |
| 5️⃣ | Core | 📖 Composition of Differential Privacy | ULO1 ULO2 | |
| 6️⃣ | Core | 📖 Sparse Vector Technique | ULO1 ULO2 | |
| 7️⃣ | Core | 📖 Query Release and The Net Mechanism | ULO1 ULO2 | |
| 🅰️ | Student Work | 📖 Selected Topics in DP | ULO3 | GitHub watchers |
| 8️⃣ | Core | 📖 DUA: Database Update Algorithm | ULO1 ULO2 | |
| 9️⃣ | Core | 📖 PTR Mechanism and S&A Mechanism | ULO1 ULO2 ULO3 | |
| 🔟 | Core | 📖 Fundamental Law of Information Reconstruction | ULO1 ULO2 ULO3 | |
| 🅱️ | Student Work | 📖 Selected Topics in DP | ULO3 | GitHub watchers |

Privacy-aware Data Science (II)

The unit plan is as below:

| 🔬 Session | 🏷️ Category | 📒 Topic | 🎯 ULOs | 👨‍🏫 Activity |
|---|---|---|---|---|
| 1️⃣ | Advanced | 📖 PATE | ULO1 | |
| 2️⃣ | Advanced | 📖 M-DP and Local-DP | ULO1 | |
| 3️⃣ | Advanced | 📖 DP Learning | ULO1 ULO2 | |
| 4️⃣ | Advanced | 📖 DP SGD | ULO1 ULO2 | |
| 5️⃣ | Advanced | 📖 DP Clustering | ULO1 ULO2 | |
| 6️⃣ | Advanced | 📖 Renyi-DP and zCDP | ULO1 ULO2 | |
| 7️⃣ | Advanced | 📖 Privacy Amplification | ULO1 ULO2 | |
| 🅰️ | Student Work | 📖 Selected Topics in Advanced DP | ULO3 | GitHub watchers |
| 8️⃣ | Advanced | 📖 TBA | ULO1 ULO2 | |
| 9️⃣ | Advanced | 📖 TBA | ULO1 ULO2 ULO3 | |
| 🔟 | Advanced | 📖 TBA | ULO1 ULO2 ULO3 | |
| 🅱️ | Student Work | 📖 Selected Topics in Advanced DP | ULO3 | GitHub watchers |
| 🏆 | Advanced | 📖 [Invited Talk and Discussions] | ULO1 ULO2 | GitHub watchers |

🈡 Assessment

Each cohort may be assessed differently, depending on the specific requirements of its university.

The assessment mainly measures students' achievement of the unit learning outcomes (ULOs, a.k.a. objectives) and checks their mastery of the theory and methods covered in the unit.

📖 Assessment Plan

The detailed assessment specification and marking rubrics can be found at: S00D-Assessment. The relationship between each assessment task and the ULOs is shown below:

🔬 Task 👨‍🏫 Category 🎯 ULO1 🎯 ULO2 🎯 ULO3 Percentage
1️⃣ Presentation 50% 25% 25% 25%
2️⃣ Project 30% 70% 50%
2️⃣ Report Presentation 20% 40% 40% 25%

🗓️ Submission Due Dates

  • SRM 2024 - The due date for final assessment file submissions is 🗓️ Saturday, 18/05/2024 (tentative). All tasks are individual work (group of one member only).

You are expected to submit each assessment component on time. You should not leave everything to the last moment, because we will provide feedback that you are expected to use in later assessments.

㊙️

If you find that you are having trouble meeting your deadlines, contact the Unit Chair.

📚 References

This course uses several key references or textbooks, together with relevant publications from TULIP Lab:

👉 Contributors

Thanks go to these wonderful people 🌷

Made with contributors-img.