Assignment 0.0
vigji committed Mar 13, 2024
1 parent d3f91ad commit 05fecfa
Showing 6 changed files with 701 additions and 137 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -12,7 +12,8 @@ This course will start covering **the basics of Python usage** and build up from

## Lecture recordings
- [Lecture 0.1](https://youtu.be/TMss3OOHrLE): Data structures (list, dict, tuple, set)
-- [Lecture 0.2](https://youtu.be/34A9iWaIqvM): Flow controls (if/else, for)
+- [Lecture 0.2](https://youtu.be/34A9iWaIqvM): Flow controls (if/else, for loops)
+- [Lecture 0.3](https://youtu.be/ZlN6qyjW488): while loops; functions

---

351 changes: 351 additions & 0 deletions assignments/Assignments_0.ipynb
@@ -0,0 +1,351 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "f54dd50c",
"metadata": {},
"source": [
"# Assignments - module 0\n",
"\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vigji/python-cimec/blob/main/assignments/Assignments_0.ipynb)\n",
"\n",
"This notebook contains the assignments to complete for credits for the first module. \n",
"\n",
"**Submission**: Once you're happy with your solutions, send them to me in any form (email the file, share it through Colab/Google Drive, send me a link to your GitHub repo...).\n",
"\n",
"**Deadline**: 15th of July 2024\n",
"\n",
"**Evaluation**: There is no grade, but I will pass assignments that showcase a reasonable degree of understanding of the covered topics. Do your best, and feel free to ask for help if you are struggling!\n",
"\n",
"(Also, try to keep in mind not only the goal of the exercise, but also all the coding best practices we have been considering in the lectures.)"
]
},
{
"cell_type": "markdown",
"id": "9ce5d72f",
"metadata": {},
"source": [
"# 0. Cryptography\n",
"\n",
"The [Caesar cipher](https://en.wikipedia.org/wiki/Caesar_cipher) is a simple encryption technique where each letter in the plaintext is shifted a certain number of places down or up the alphabet. For example, with a shift of 1, 'A' becomes 'B', 'B' becomes 'C', etc., and 'Z' would wrap around to 'A'."
]
},
{
"cell_type": "markdown",
"id": "e2cb8832",
"metadata": {},
"source": [
"#### Exercise 0.0\n",
"\n",
"The first thing that we will need to work with this cipher is a way to shift a given letter up or down the alphabet.\n",
"This could be done in two ways:\n",
"1. Start from a list (or a string) of all letters of the alphabet; find the index of the letter to encrypt; add the given shift to the index, and use this new index to find the new letter\n",
"2. Rely on the [ASCII](https://en.wikipedia.org/wiki/ASCII) encoding of characters: map each letter to its integer ASCII code with the `ord()` function, add the shift to the integer, and convert it back with the `chr()` function\n",
"\n",
"Make sure that your code works with any possible positive or negative shift over the English alphabet (hint: remember the modulo operation...)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "148bebac",
"metadata": {},
"outputs": [],
"source": [
"letter_to_convert = \"a\" # try different letters\n",
"shift = 4 # play with different values\n",
"\n",
"...\n",
"\n",
"encoded_letter = ..."
]
},
{
"cell_type": "markdown",
"id": "0f2c0ea9",
"metadata": {},
"source": [
"#### Exercise 0.1\n",
"\n",
"Now, let's use the code above to encode/decode full words! (Note that the same code can either encode a plaintext message or decode an encrypted one, using positive and negative shifts.)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "96be39c2",
"metadata": {},
"outputs": [],
"source": [
"# Write a function that takes a word and a shift as input, and returns the shifted text as output\n",
"# (e.g., word=\"hello\", shift=1 should produce \"ifmmp\")\n",
"\n",
"def shift_word(word, shift):\n",
" ...\n"
]
},
{
"cell_type": "markdown",
"id": "12c57f2b",
"metadata": {},
"source": [
"#### Exercise 0.2\n",
"\n",
"Longer texts will have uppercase words and punctuation. Starting from the function of exercise 0.1, write a new function that can handle more complex text, and return its encrypted version."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "343aa483",
"metadata": {},
"outputs": [],
"source": [
"# Write a function that takes a text and a shift as input, and returns the shifted text as output\n",
"# (e.g., text=\"Hello, World!\", shift=1 should produce \"Ifmmp, Xpsme!\")\n",
"# Hint: you can check out if a character is a letter with the .isalpha() method\n",
"\n",
"def shift_text(text, shift):\n",
" ..."
]
},
{
"cell_type": "markdown",
"id": "d3916d37",
"metadata": {},
"source": [
"#### [optional] Exercise 0.3\n",
"Sometimes, we do not know the register shift in advance ([they certainly did not know it at Bletchley Park](https://en.wikipedia.org/wiki/Bombe)). Still, we could leverage a brute force attack to test all possible shifts (there are only 26 of them!) and check each solution against an English dictionary (assuming that the original message is in English), to see which shift gives the best match."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "52bf9351",
"metadata": {},
"outputs": [],
"source": [
"# Here's a function that will give you a reasonable list of English words:\n",
"import requests\n",
"\n",
"\n",
"def get_english_words_list():\n",
" \"\"\"Download a reasonably complete English dictionary.\"\"\"\n",
" resp = requests.get(\"https://www.mit.edu/~ecprice/wordlist.10000\")\n",
" return resp.text.split(\"\\n\")\n",
"\n",
"english_words = get_english_words_list() # here's how to get a list of English words"
]
},
{
"cell_type": "markdown",
"id": "b525f519",
"metadata": {},
"source": [
"## 1. Spotted UniTn"
]
},
{
"cell_type": "markdown",
"id": "db12042e",
"metadata": {},
"source": [
"In this exercise, we'll be doing some stats on a dataset of all the people employed at UniTn scraped from the UniTn website.\n",
"\n",
"**Note**: We have not learned yet how to use arrays, matrices, and dataframes. Some of the analyses in this exercise will inevitably look a bit cumbersome, because they are, with the tools we have now. They'll become a piece of cake with `pandas`!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "56ff6ba0",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import requests\n",
"\n",
"\n",
"def get_unitn_hr_dataset():\n",
" \"\"\"Download all data about UniTn employees from their website.\n",
"\n",
" !!!Note: all information we are using here is made openly available from\n",
" the university. However, please do appreciate the power of similar data\n",
" scraping through any of the online platforms we're giving our data to,\n",
" were there some security holes!\n",
" This is no endorsement toward trying anything like that yourself, hacking\n",
" is bad. No seriously, it is. Also, copyright is good.\n",
"\n",
" Returns:\n",
"\n",
" list : A list of uni employees.\n",
"\n",
" \"\"\"\n",
"\n",
" # This string contains the address at which we'll find the dataset:\n",
" UNITN_PEOPLE_URL = \"https://dati.unitn.it/du/Person/en\"\n",
"\n",
" # Get page response:\n",
" response = requests.get(UNITN_PEOPLE_URL)\n",
"\n",
" # Parse a json from the page:\n",
" json_data = json.loads(response.text)\n",
"\n",
" # Get actual data and return:\n",
" return json_data[\"value\"][\"data\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ae543d19",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "669c931b",
"metadata": {},
"source": [
"#### Exercise 1.0\n",
"\n",
"Call the function and try to have a look at the result. How many people are employed at the university? How many in each department? Which is the department with the most professors?\n",
"\n",
"Make a nice `print` of all those results! (You'll see a lot of different departments. You can filter results for the ones with at least 10 people)\n",
"\n",
"- If people have multiple affiliations, count them in each one. E.g., if someone is listed under both `\"Center for Mind/Brain Sciences - CIMeC\"` and `\"CeRiN - Center for Neurocognitive Rehabilitation\"`, put the person in the count for both departments.\n",
"- If a person is listed with two different roles at the same department (e.g., as both `\"Graduate student\"` and `\"Research intern\"`) count that person only once for that department."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "60e39898",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "db624bbe",
"metadata": {},
"source": [
"#### Exercise 1.1\n",
"\n",
"Imagine you want to call-bomb the `\"Department of Economics and Management\"` for a prank. You'll first need a list of all the phone numbers you can find in that department. Create that list!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2f560dd1",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "759cf40d",
"metadata": {},
"source": [
"#### Exercise 1.2\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "999a6f38",
"metadata": {},
"source": [
"Use the function below to get a dictionary of Italian names divided by gender. \n",
"\n",
"Then, print out the gender ratio (how many women, how many men) for all the position roles that you can find in the dataset (filter out positions with fewer than 10 people). Then, jump to conclusions!\n",
"\n",
"- If a person has multiple roles, count them for each of the roles they have\n",
"- Yes, it can be erroneous to infer gender just from the name; here we assume potential errors will average out in the large numbers.\n",
"- Yes, this will consider only Italian employees. You can print out how many names were left out (and which ones) and, if you want, try to improve the function by including international names in the list as well!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cf71c38b",
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"\n",
"\n",
"def get_names():\n",
" \"\"\"Download a list of Italian names, divided by gender.\n",
"\n",
" Returns:\n",
"\n",
" dict : A dictionary of masculine and feminine names.\n",
"\n",
" \"\"\"\n",
"\n",
" # This string contains the address at which we'll find the names:\n",
" FIRST_NAMES_URL = \"https://gist.githubusercontent.com/metalelf0/a2ab283d0d5fd9b4b8a10d6427630627/raw/b848ffee70464fd39714a1a621f3a2eba6c3812e/italian_names.md\"\n",
"\n",
" # Get page response:\n",
" response = requests.get(FIRST_NAMES_URL)\n",
"\n",
" # read the response as string:\n",
" raw_content = response.text\n",
"\n",
" # split lines and exclude the first header (# Male names):\n",
" full_names_list = raw_content.split(\"\\n\")[1:]\n",
"\n",
" # Look for the header \"# Female names\":\n",
" female_header_idx = full_names_list.index(\"# Female names\")\n",
"\n",
" # Names before header are male, after are female:\n",
" return dict(\n",
" male=full_names_list[:female_header_idx],\n",
" female=full_names_list[female_header_idx + 1 :],\n",
" )"
]
},
{
"cell_type": "markdown",
"id": "d7b0a566",
"metadata": {},
"source": [
"## 2. Classes\n",
"\n",
"Pending..."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5209c7cd",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "course-env",
"language": "python",
"name": "course-env"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
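
The `ord()`/`chr()` approach described in Exercise 0.0 of the notebook, together with its modulo hint, can be sketched as below. This is only an illustrative sketch placed outside the notebook JSON, not the course's reference solution: the helper name `shift_letter` is hypothetical, and it assumes the input is a single lowercase English letter.

```python
import string


def shift_letter(letter, shift):
    """Shift one lowercase ASCII letter by `shift` places, wrapping around.

    Hypothetical helper for illustration; assumes `letter` is in a-z.
    """
    alphabet_size = len(string.ascii_lowercase)  # 26 letters
    # Position of the letter in the alphabet, from 0 ("a") to 25 ("z"):
    offset = ord(letter) - ord("a")
    # Python's % returns a non-negative result for a positive modulus,
    # so negative shifts and shifts larger than 26 wrap around correctly:
    new_offset = (offset + shift) % alphabet_size
    return chr(ord("a") + new_offset)


print(shift_letter("a", 4))   # e
print(shift_letter("z", 1))   # a
print(shift_letter("a", -1))  # z
print(shift_letter("c", 30))  # g
```

Working with the offset from `"a"` rather than the raw ASCII code is what lets the modulo keep the result inside the alphabet; uppercase letters and punctuation (Exercise 0.2) would need additional handling.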