Allocation of units for batch7 #381

Open
cimendes opened this issue May 22, 2023 · 17 comments
Labels
Batch 7 priority:high QA AOR Falls under the responsibility of the Quality Assurance (QA) AOR. Teaching AOR Falls under the responsibility of the Teaching AOR.

Comments

@cimendes
Member

cimendes commented May 22, 2023

This is a work in progress! Template: #330

Description

The goal of this issue is to assess interest and have a pre-allocation of batch7's
teaching work and QA.

The units, overall work needed and release and delivery dates are listed
below.

Units

Admissions

  • Unit
    • Adjustment to the new Python and Pandas versions.
    • Learning notebooks - minimum change: review
    • Example notebooks - minimum change: review
    • Exercise notebook - minimum change: review, new datasets
  • To be released on 23 October 2023
  • To be ready on: soon
SLU Name Last year instructor Batch 7 instructor Last year QA
SLU01 Pandas 101 @majkah0 @majkah0 @Jujulian3
SLU02 Subsetting Data in Pandas @jgomes959 @jgomes959 @jgerebelo
SLU03 Visualization with Pandas & Matplotlib @Gustavo-SF @kagglekim @SaraOGomes
Test @fabiocruz @danizao @majkah0 @minhhoang1023 @Gustavo-SF
Test on 23 October 2023

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case there are major changes (exercise or material updates).
Both QA people do the test verification.

Specialization 1 + Bootcamp

  • Project manager: José Rebelo @jgerebelo
  • Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
    • Learning notebook
    • Example notebook
    • Exercise notebook
  • To be released on:
    • SLU04 - SLU10 learning notebooks: 19 November 2023
    • SLU04 - SLU10 exercise notebooks: 26 November 2023
    • SLU11 - SLU19 learning and exercise notebooks: 26 November 2023
  • To be ready in November
SLU Name Last year instructor Batch 7 instructor Last year QA Batch 7 QA
SLU04 Basic Stats with Pandas @SaraOGomes @cmm79 @jgomes959 @BG2602
SLU05 Covariance & Correlation @kagglekim @cmm79 @anaritarc @BG2602
SLU06 Dealing with Data Problems @majkah0 @TeignmouthElectron @SaraOGomes @BG2602
SLU07 Regression with Linear Regression @jgerebelo @joaogilsa @carlacotas @Mohamedgaber9
SLU08 Metrics for Regression @marianahenriques1 @joaogilsa @cd702 @Mohamedgaber9
SLU09 Classification with Logistic Regression @majkah0 @majkah0 @carlacotas @CaitlinHulse
SLU10 Metrics for Classification @phgui @majkah0 @majkah0 @CaitlinHulse
SLU11 Tree-Based Models @anaritarc @margaridantunes @carlacotas @Mohamedgaber9
SLU12 Feature Engineering (aka Real World Data) @danizao João Nobre @anaritarc @Mohamedgaber9
SLU13 Bias-Variance tradeoff & Model Selection @jgerebelo @rodrigomverissimo @anaritarc @BG2602
SLU14 Model complexity & Overfitting @Gustavo-SF @Gustavo-SF @Jujulian3 @BG2602
SLU15 Hyperparameter Tuning @jgomes959 @jgomes959 @SaraOGomes @BG2602
SLU16 Workflow @cimendes @fabiocruz @TeignmouthElectron
SLU17 Ethics & Fairness @hershaw @majkah0 @Gustavo-SF @TeignmouthElectron
SLU18 Support Vector Machines (SVM) (optional unit) @cimendes @majkah0 @Jujulian3
SLU19 k-Nearest Neighbors (kNN) (optional unit) @cimendes @majkah0 @Jujulian3
Group | SLUs | Batch 7 QA lead* | Batch 7 backup QA**
QA1 | SLU04, SLU05, SLU06 | @BG2602 | Caitlin Hulse
QA2 | SLU07, SLU08 | @Mohamedgaber9 | Cora
QA3 | SLU09, SLU10 | Caitlin Hulse |
QA4 | SLU11, SLU12 | @Mohamedgaber9 |
QA5 | SLU13, SLU14, SLU15 | @BG2602 |
QA6 | SLU16, SLU17 | @TeignmouthElectron |

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case there are major changes (exercise or material updates).

Bootcamp presentations

Bootcamp presentations will be split into two parts. Presentations will be given by senior instructors. This is what is expected from each instructor:

  • The presentation should be <= 60 min including student questions. The presentation should be on concepts and insights for the given topic, not the technical implementation in Python.
  • If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
  • If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
    Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.

Bootcamp part 1, Sunday 26 November 2023

  • Instructor 1: LTPlabs
    • ~ 30 min: Intro to data science, SLU04 - Basic Stats with Pandas, SLU05 - Covariance and Correlation
    • ~ 30 min: SLU06 - Dealing with Data Problems
  • Instructor 2: José Rebelo, EDP
    • 30 - 60 min: SLU07 - Regression with Linear Regression, SLU08 - Metrics for Regression
  • Instructor 3: LTPlabs
    • 30 - 60 min: SLU09 - Classification with Logistic Regression, SLU10 - Metrics for Classification

Bootcamp part 2, Sunday 3 December 2023

  • Instructor 4: João Ascensão, Stratio - TBC
    • 45-60 min: SLU11 - Tree-Based Models, SLU12 - Feature Engineering

  • Instructor 5: Maria Cristina Dominguez

    • ~60 min: SLU13 - Bias-Variance tradeoff & Model Selection, SLU14 - Model complexity and Overfitting, SLU15 - Hyperparameter Tuning
  • Instructor 6: Sam Hopkins, DareData

    • 30-60 min: SLU16 - Workflow, SLU17 - Ethics and Fairness
  • Hackathon 1

    • Come up with new problem for hackathon
    • Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., that they improve over a dummy baseline)
    • Create baseline instructor solution
    • Evaluation guidelines doc
    • Overall guidelines for instructors to help out in hackathon
  • To be released on 17 December 2023

  • To be ready in November

Work unit Name Last year instructor Batch 7 instructor Last year QA Batch 7 QA
Hackathon 1 Binary Classification @wilsonramos1 @jgomes959 ? .
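
The "improve over a dummy baseline" check mentioned in the hackathon tasks can be sketched as below. This is only an illustration under stated assumptions: the data, the helper names, and the toy threshold model are all hypothetical and are not part of the hackathon materials. In practice a library baseline such as scikit-learn's `DummyClassifier` would likely take the place of `majority_baseline`.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_train, n_predictions):
    """Dummy baseline: always predict the most frequent training label."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return [most_common_label] * n_predictions


def threshold_model(x_train, y_train, x_test):
    """Toy one-feature classifier: predict 1 above the midpoint of the class means."""
    pos = [x for x, y in zip(x_train, y_train) if y == 1]
    neg = [x for x, y in zip(x_train, y_train) if y == 0]
    cutoff = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return [1 if x > cutoff else 0 for x in x_test]


# Hypothetical train/test split, invented purely for illustration.
x_train = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5, 0.3, 1.1]
y_train = [0, 0, 0, 1, 1, 1, 0, 0]
x_test = [0.2, 0.6, 1.0, 1.4]
y_test = [0, 0, 1, 1]

baseline_acc = accuracy(y_test, majority_baseline(y_train, len(y_test)))
model_acc = accuracy(y_test, threshold_model(x_train, y_train, x_test))

# The split is only a reasonable hackathon candidate if a simple,
# common approach clearly beats the dummy baseline on held-out data.
split_is_usable = model_acc > baseline_acc
```

If a simple model cannot beat the baseline on a candidate split, that is a signal to iterate on the dataset or the split before releasing the hackathon.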

Specialization 2, 8 January 2024 - 4 February 2024

  • Project manager: Kim Pronk @kagglekim

  • Senior instructor:

    • 1 hour AMA session
    • If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
    • If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
  • Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.

  • Junior instructors

    • Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
    • In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
  • To be released on 8 January (BLU01), 15 January (BLU02), 22 January (BLU03)

  • To be ready in December 2023

  • Hackathon 2

    • Come up with new problem for hackathon
    • Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., that they improve over a dummy baseline)
    • Create baseline instructor solution
    • Evaluation guidelines doc
    • Overall guidelines for instructors to help out in hackathon
  • To be released on 4 February (Hackathon 02)

  • To be ready in mid January

Work unit Name Last year instructor Batch 7 instructor(s) Last year QA
- Spec lead @martinb-bb
BLU01 Messy Data @JerBouma @majkah0
BLU02 Advanced Wrangling @minhhoang1023 @cd702
BLU03 Data Sources @jmaslek @anaritarc
Hackathon 2 Data Wrangling @martinb-bb @JerBouma @minhhoang1023 @DidierRLopes
  • Batch 7 QA Lead BLU01/BLU02/BLU03: @AhmedEmad2525
  • Batch 7 backup QA BLU01/BLU02/BLU03:

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case there are major changes (exercise or material updates).
Both QA people do the Hackathon verification.
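
The release dates listed for this specialization follow a weekly pattern: one BLU per week from the start date, and the hackathon on the final Sunday. The sketch below reproduces those dates. The pattern is inferred from the dates in this issue rather than stated as a rule, and the function name and the 27-day offset are assumptions for illustration.

```python
from datetime import date, timedelta


def specialization_schedule(start, n_units=3, hackathon_offset_days=27):
    """Return weekly unit release dates and the hackathon date.

    Assumes (hypothetically) one unit per week starting on `start`, and a
    hackathon on the last day of a four-week specialization.
    """
    unit_releases = [start + timedelta(weeks=i) for i in range(n_units)]
    hackathon = start + timedelta(days=hackathon_offset_days)
    return unit_releases, hackathon


# Specialization 2 starts on 8 January 2024.
units, hackathon = specialization_schedule(date(2024, 1, 8))
```

Calling the same function with the start dates of the later specializations (5 February, 4 March, 1 April 2024) yields the release and hackathon dates listed in their sections.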

Specialization 3, 5 February - 3 March 2024

  • Project manager: Mária Hanulová @majkah0

  • Senior instructor: Telmo Felgueira, Loka / JungleAI

    • 1 hour AMA session
    • If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
    • If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
  • Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.

  • Junior instructors

    • Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
    • In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
  • To be released on 5 February (BLU04), 12 February (BLU05), 19 February (BLU06)

  • To be ready in January

  • Hackathon 3

    • Come up with new problem for hackathon
    • Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., that they improve over a dummy baseline)
    • Create baseline instructor solution
    • Evaluation guidelines doc
    • Overall guidelines for instructors to help out in hackathon
  • To be released on 3 March (Hackathon 03)

  • To be ready in mid February

Work unit Name Last year instructor Batch 7 instructor(s) Last year QA
- Spec lead @TSFelg @TSFelg
BLU04 Time Series Concepts @PedroRibeiro80 @Sonia-se
BLU05 Classical Time Series Models @jgerebelo @carlacotas
BLU06 Machine Learning for Time Series @jdpsc @TeignmouthElectron @SaraOGomes
Hackathon 3 Timeseries @TSFelg @Gustavo-SF
  • Batch 7 QA Lead BLU04/BLU05/BLU06: @Mohamedgaber9
  • Batch 7 backup QA BLU04/BLU05/BLU06:

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case there are major changes (exercise or material updates).
Both QA people do the Hackathon verification.

Specialization 4, 4 March - 31 March 2024

  • Project manager:

  • Senior instructor:

    • 1 hour AMA session
    • If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
    • If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
  • Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.

  • Junior instructors

    • Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
    • In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
  • To be released on 4 March (BLU07), 11 March (BLU08), 18 March (BLU09)

  • To be ready in February

  • Hackathon 4

    • Come up with new problem for hackathon
    • Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., that they improve over a dummy baseline)
    • Create baseline instructor solution
    • Evaluation guidelines doc
    • Overall guidelines for instructors to help out in hackathon
  • To be released on 31 March (Hackathon 04)

  • To be ready in mid March

Work unit Name Last year instructor Batch 7 instructor(s) Last year QA
- Spec lead @CatarinaSilva
BLU07 Feature Extraction @CatarinaSilva @cd702
BLU08 Dimensionality Reduction @CatarinaSilva @majkah0
BLU09 Information Extraction @CatarinaSilva @carlacotas
Hackathon 4 NLP BancoBPI

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case there are major changes (exercise or material updates).
Both QA people do the Hackathon verification.

Specialization 5 - this will be an optional specialization

  • Project manager:
  • Junior instructors
    • Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
  • To be released in March/April
  • To be ready in March
Work unit Name Last year instructor Batch 7 instructor(s) Last year QA
- Spec lead
BLU10 Non-personalised Recommender @majkah0 @anaritarc
BLU11 Personalized Recommenders @majkah0 @anaritarc
BLU12 Workflow @majkah0 @anaritarc
Hackathon 5 Recommender Systems

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case there are major changes (exercise or material updates).
Both QA people do the Hackathon verification.

Specialization 6, 1 April - 28 April 2024

  • Project manager:

  • Senior instructor: Gustavo Fonseca, LDSA

    • 1 hour AMA session
    • If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
    • If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
  • Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.

  • Junior instructors

    • Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
    • In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
  • To be released on 1 April (BLU13), 8 April (BLU14), 15 April (BLU15)

  • To be ready in March

  • Hackathon 6

    • Come up with new problem for hackathon
    • Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., that they improve over a dummy baseline)
    • Create baseline instructor solution
    • Evaluation guidelines doc
    • Overall guidelines for instructors to help out in hackathon
  • To be released on 28 April (Hackathon 06)

  • To be ready in mid April

Extra session about Venture Capital: Armilar

Work unit Name Last year instructor Batch 7 instructor(s) Last year QA
- Spec lead @cimendes @Gustavo-SF
BLU13 Basic model Deployment @cimendes @carlacotas
BLU14 Deployment in the real world @cimendes
BLU15 Model CSI @cimendes @carlacotas
Hackathon 6 Data science in real world @CatarinaSilva @cimendes @InesPessoa
  • Batch 7 QA Lead BLU13/BLU14/BLU15:
  • Batch 7 backup QA BLU13/BLU14/BLU15:

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case there are major changes (exercise or material updates).
Both QA people do the Hackathon verification.

Capstone, 29 April - 15 July 2024

  • Preparing a strong dataset and problem
  • Helping to build documents, forms, etc.
  • Replying to students' Q&A
  • Beta-testing/QAing
  • Grading capstone
  • To be released on 29 April
  • To be ready in mid April
Work unit Name Last year instructor (s) Batch 7 instructor(s)
- Capstone @minhhoang1023 @cimendes @fabiocruz @Gustavo-SF @anaritarc @majkah0

Other possible extra sessions:

  • NOS (LLM, Data Science in Real World);
  • AICEP (Classification, Data Science in Real World);
  • BPI (Classification, Data Science in Real World)
@cimendes added the priority:high, Teaching AOR, QA AOR, and Batch 7 labels on May 22, 2023
@VascoMano

I am available for QA for both the SLU2 and the admission test.

Note: I don't have access to this repository

@majkah0
Contributor

majkah0 commented May 31, 2023

I'm in for SLU09 logistic regression and something else, maybe the optional SLUs. Also for the exam QA.
I would also be in for an optional hackathon training SLU, number 10.5 and I'd like to have Rita for QA for that 🙂 .
What will be the schedule for batch 7? If I could have one student success wish, it would be to change the bootcamp structure - 1 week longer, with lectures divided in two and 2 office hours. 😁 We have discussed this a bit during this year, but then no conclusions were made.

@cd702

cd702 commented Jun 28, 2023

Hello, I would also like to know if there is already an update on the schedule?

@cd702

cd702 commented Aug 25, 2023

I would be interested in QA for SLU06 or SLU08 as well as 12 or 13, but it is so quiet here I do not know if this is the right place...

@majkah0
Contributor

majkah0 commented Aug 25, 2023

Hi @cd702 thank you, this is the right place! In fact, I was thinking about contacting you today :) I'm starting to bring in some life now. Would you also be in as instructor? We are doing just minimal maintenance this year, fixing the errors.

@cd702

cd702 commented Aug 25, 2023

I will have another look at the units and let you know. Are the dates in the instructors repo confirmed?

@majkah0
Contributor

majkah0 commented Sep 14, 2023

@cd702 Hi Cora, sorry for the late reply, the dates will be confirmed this weekend.

@jgomes959

@majkah0 Hi Mária, I can still be responsible for SLU02 this year :)

@majkah0
Contributor

majkah0 commented Sep 15, 2023

Perfect, thank you @jgomes959 . Just to warn you - Pandas has changed row subsetting, so there might be a lot of corrections to do.

@anaritarc

anaritarc commented Sep 23, 2023

Hi, doesn't Vasco already have a list of who will be responsible for each teaching?
Also, maybe we should update it, per Spec, even for the bootcamp, and highlight the fact that the process has changed.

@majkah0
Contributor

majkah0 commented Sep 24, 2023

@cd702 Hi Cora, this is finally moving :) You were interested in QA for SLU 06, 08, 12 or 13. Can I sign you up for some of those? These units will go out at the end of November/beginning of December.
We are also looking for QA for the admissions units and test which will go out in mid October.
This year, there will be minimal unit development. The QA should happen first, then the instructor will correct all the issues. So you can already start, the instructor repo is set up.

@cd702

cd702 commented Sep 25, 2023

Hi Maria, that's great news :) in this case you can sign me up as QA for units 6 and 8 as well as 12. If you need me there I can also have a look over SLU 1 and try out the admissions test once it's ready :)

@majkah0
Contributor

majkah0 commented Sep 25, 2023

Thank you very much Cora!

@Gustavo-SF
Contributor

Added myself to QA of Admissions test and SLU14 Instructor.

@mafaldavs

I am available for QA for SLU11, SLU12, and SLU14.

@Gustavo-SF
Contributor

Hey @mafaldavs, welcome! SLU12 is already taken. Is there any other you would like to take?

@anaritarc

anaritarc commented Oct 2, 2023

Hi,

Sorry @mafaldavs, @Gustavo-SF and @cd702, this is not the structure for QA this year. I still have to update it, but basically each person will be responsible for 3 SLUs.

I'll update it. This template is not meant to be used for QA; I'm thinking about the best way to incorporate it here.

8 participants