devil-mice-labs/learn-ml-on-gcp

Machine Learning on Google Cloud

An opinionated collection of resources for getting started with ML on Google Cloud.

Compiled by your friendly independent consultant Oliver Frolovs at Devil Mice Labs 😈

Staying up-to-date and exploring possibilities

For hardcore and very curious techies:

  • GitHub repo with common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product, but it contains many interesting real-world examples of how Google Cloud services can be used.

Must-know Google Cloud concepts

The concepts and services introduced in this section should be learned by every Google Cloud practitioner without exception, as they are crucial for access and cost controls and directly or indirectly impact every action taken in Google Cloud.

Caution

A single security incident of the right kind can wipe out your career, if not the entire business you work for. An unexpected cloud bill can turn a profitable year into a year of losses. Don't let it happen to you — study this section at the earliest opportunity in your cloud journey.

The remainder of this chapter is split into sections with each section introducing a new core concept or service.

Resource Hierarchy

You can think of Google Cloud as a collection of services (APIs) such as Kubernetes Engine (API) or Google Cloud Storage (API). You make use of these services (APIs) to provision various resources such as Kubernetes clusters and storage buckets. Once a resource is provisioned and configured, you can access and use it.

All those resources exist in a tree-like resource hierarchy. You should understand the hierarchy and its constituent parts: the Organisation, Folder, and Project resources described in the referenced docs page.

The Resource Manager, which is part of the IAM section of the Google Cloud Console, provides a convenient way to visualise and hierarchically manage resources by project, folder, and organisation. You can find more information about the Resource Manager in its documentation.

The resource hierarchy also provides attach points and inheritance for access management and organisation policies.

Identity and Access Management

The Identity and Access Management (IAM) service lets administrators authorize who can take action on specific resources, giving you full control and visibility to manage Google Cloud resources centrally. Read the IAM overview page to learn how IAM works. The IAM concepts you should know are: Principals, Resources, Permissions, IAM Roles (basic and predefined), IAM Allow Policy, IAM Deny Policy, and IAM Conditions.

Now that you know about the resource hierarchy and IAM policies, read the page that explains how to use the resource hierarchy for access control. Remember that the levels of the resource hierarchy (organisations, folders, and projects) provide attach points where access management policies can be configured.

You should also read the service accounts overview.
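To make the inheritance idea concrete, here is a minimal toy model in Python. It is a sketch under simplifying assumptions, not how IAM is implemented: the `Node` class, the resource names, and the e-mail addresses are all made up for illustration, and deny policies and IAM Conditions are ignored. The only behaviour it demonstrates is that the effective allow policy on a resource is the union of the bindings set on that resource and on its ancestors.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Node:
    """Hypothetical stand-in for an organisation, folder, or project."""
    name: str
    parent: Optional["Node"] = None
    # Allow-policy bindings set directly on this node: role -> members.
    bindings: Dict[str, Set[str]] = field(default_factory=dict)


def effective_bindings(node: Node) -> Dict[str, Set[str]]:
    """Union the bindings set on the node itself and on every ancestor."""
    merged: Dict[str, Set[str]] = {}
    current: Optional[Node] = node
    while current is not None:
        for role, members in current.bindings.items():
            merged.setdefault(role, set()).update(members)
        current = current.parent
    return merged


# Made-up hierarchy: organisation -> folder -> project.
org = Node("organizations/example",
           bindings={"roles/viewer": {"group:auditors@example.com"}})
folder = Node("folders/ml", parent=org)
project = Node("projects/ml-dev", parent=folder,
               bindings={"roles/storage.objectViewer": {"user:ada@example.com"}})

policy = effective_bindings(project)
for role in sorted(policy):
    print(role, sorted(policy[role]))
```

The auditors group was granted a role only at the organisation level, yet it shows up in the project's effective policy. That is why a broad grant high in the tree deserves extra care.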

You might also want to review IAM best practices.

Organisation Policies

TODO Organization Policy Service

Audit Logs

Google Cloud services write audit logs that record administrative activities and accesses within your Google Cloud resources. Audit logs help you answer "who did what, where, and when?" within your Google Cloud resources. You should read the Cloud Audit Logs overview to learn what kinds of Audit Logs are available, which ones are enabled by default, and how to enable the rest.

Having learned about the different kinds of Audit Logs, you might want to enable some of the optional Audit Logs for your Google Cloud environment. The documentation can tell you what is logged under each kind of Audit Log for different Google Cloud APIs.

Note that the Audit Logs configuration is part of the IAM policy. As such, Audit Logs can be configured at the same levels of the Google Cloud resource hierarchy as the IAM policies introduced in the previous section. Inheritance also works for the Audit Logs configuration.
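As a rough illustration of what "part of the IAM policy" means, the sketch below builds the audit configuration fragment of a policy as a plain Python dictionary and prints it as JSON. The field names (`auditConfigs`, `auditLogConfigs`, `logType`, `exemptedMembers`) follow the policy format as I understand it from the Data Access audit logs documentation; the exempted e-mail address is made up. Check the current docs before applying anything like this to a real policy.

```python
import json

# Fragment of an IAM policy that turns on the optional Data Access audit logs.
# "allServices" applies the setting to every service; the address is made up.
audit_configs = [
    {
        "service": "allServices",
        "auditLogConfigs": [
            {"logType": "ADMIN_READ"},
            {"logType": "DATA_READ", "exemptedMembers": ["user:ada@example.com"]},
            {"logType": "DATA_WRITE"},
        ],
    }
]

# In a real policy this key sits next to "bindings" at the top level.
policy_fragment = {"auditConfigs": audit_configs}
print(json.dumps(policy_fragment, indent=2))
```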

So hopefully now you can see the connection between the resource hierarchy, the IAM policies, and the Audit Logs configuration.

TODO Access Transparency

Quotas and Limits

Quotas restrict how much of a particular shared Google Cloud resource you can use. Learn to view and manage quotas.

Many services also have limits that are unrelated to the quota system. Limits are fixed constraints, such as maximum file sizes or database schema limitations, which cannot be increased or decreased.

Billing budgets and exports

Learn to create Cloud Billing budgets and alerts to monitor your spending on Google Cloud.
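The arithmetic behind budget alerts is simple enough to sketch. The function below is a hypothetical helper, not part of any Google Cloud library: given the spend so far, the budget amount, and a list of threshold fractions, it returns the thresholds that have been reached. Keep in mind that a budget alert only notifies you; on its own it does not stop the spending.

```python
from typing import List


def crossed_thresholds(spend: float, budget: float, thresholds: List[float]) -> List[float]:
    """Return the threshold fractions (0.5 means 50%) that the spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spend / budget
    return [t for t in sorted(thresholds) if ratio >= t]


# Made-up numbers: 460 spent against a budget of 500, alerts at 50%, 90%, 100%.
alerts = crossed_thresholds(spend=460.0, budget=500.0, thresholds=[0.5, 0.9, 1.0])
print(alerts)  # [0.5, 0.9]
```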

BigQuery

Learn to create custom cost controls for BigQuery at user and project level.
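The idea of user-level and project-level controls can be pictured with a small toy model. This is only a sketch of the concept: the `DailyQueryQuota` class, the user names, and the limits are invented for illustration, and the real service may account for and enforce usage differently. The model rejects a query if it would push either the user's or the project's daily total of processed bytes over its limit.

```python
from collections import defaultdict
from typing import DefaultDict

TIB = 1024 ** 4  # bytes in one tebibyte


class DailyQueryQuota:
    """Toy model of a per-user and a per-project daily limit on bytes processed."""

    def __init__(self, project_limit_bytes: int, user_limit_bytes: int) -> None:
        self.project_limit_bytes = project_limit_bytes
        self.user_limit_bytes = user_limit_bytes
        self.project_used = 0
        self.user_used: DefaultDict[str, int] = defaultdict(int)

    def try_run(self, user: str, query_bytes: int) -> bool:
        """Record the query and return True if both limits hold, else reject it."""
        if self.user_used[user] + query_bytes > self.user_limit_bytes:
            return False
        if self.project_used + query_bytes > self.project_limit_bytes:
            return False
        self.user_used[user] += query_bytes
        self.project_used += query_bytes
        return True


quota = DailyQueryQuota(project_limit_bytes=2 * TIB, user_limit_bytes=1 * TIB)
results = [
    quota.try_run("ada", TIB // 2),  # accepted
    quota.try_run("ada", TIB),       # rejected: ada would exceed her 1 TiB
    quota.try_run("bob", TIB),       # accepted
    quota.try_run("eve", TIB),       # rejected: the project would exceed 2 TiB
]
print(results)
```

Note how the last query is rejected even though that user has not used anything yet: the project-level limit acts as the backstop when many users each stay within their own allowance.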

Books

Code samples

Samples showcasing Google Cloud for machine learning and data applications.

github.com / GoogleCloudPlatform / generative-ai

Sample code and notebooks for Generative AI on Google Cloud, with Gemini on Vertex AI.

github.com / GoogleCloudPlatform / applied-ai-engineering-samples

This repository contains code samples and notebooks demonstrating how to use Generative AI on Google Cloud Vertex AI.

github.com / GoogleCloudPlatform / asl-ml-immersion

This repository contains Jupyter notebooks meant to be run on Vertex AI. It is maintained by Google Cloud's Advanced Solutions Lab (ASL) team.

In particular, the notebooks in this repository cover a wide range of model architectures targeting different data modalities implemented mainly in Tensorflow and Keras, as well as the tools on Google Cloud's Vertex AI for operationalizing Tensorflow, Scikit-learn and PyTorch models at scale (e.g. Vertex training, tuning, and serving, TFX and Kubeflow pipelines).

github.com / GoogleCloudPlatform / vertex-ai-samples

This repository contains notebooks, code samples, sample apps, and other resources that demonstrate how to use, develop and manage machine learning and generative AI workflows using Google Cloud Vertex AI.

github.com / GoogleCloudPlatform / specialized-training-content

This repository contains files that are used for instructor-led training courses as part of the Cloud Learning Services Specialized Training program. These files are used in classroom demos, hands-on lab activities, and for supplemental discussions around course topics. Files are organized by course.

github.com / GoogleCloudPlatform / training-data-analyst

Labs and demos for courses for GCP Training.

github.com / GoogleCloudPlatform / python-docs-samples

Python samples for Google Cloud Platform products.

Datasets

The public data sources for your machine learning experiments.

Sandboxes

Environments for writing and running code in your browser.

Open Colab · New Notebook

Colab is a hosted Jupyter Notebook service that requires no setup to use and provides free access to computing resources, including GPUs and TPUs. Colab is especially well suited to machine learning, data science, and education.

Machine Learning

Machine Learning Glossary by Google defines many general machine learning terms, plus terms specific to TensorFlow.

Google’s Machine Learning Crash Course is an introduction to (the theory of) machine learning. Ignore the subtitle: there is no need to understand or learn TensorFlow to benefit from this course. All important mathematical concepts are at least referenced here, and many are explained quite well.

Other machine learning courses by Google are quite interesting too! 😈

DeepLearning.ai offers a great set of free courses.

The "End-to-end user journey for each model" page in the BigQuery ML docs will alert you to what's possible with BigQuery ML.

Generative AI learning path is one of the great resources available on Google Cloud Skills Boost.

Security

In addition to the resources listed in the "Must-know Google Cloud concepts" section:

Protecting confidential data in Vertex AI Workbench user-managed notebooks is a blueprint solution by Google to protect your user-managed Vertex AI notebooks.

VPC Service Controls can be configured to enforce a security perimeter around your Google Cloud resources, reducing the risk of data exfiltration or data breach.

Organisation Policy Service allows you to set constraints at various levels of your Google Cloud resource hierarchy, restricting the ways in which these resources can be created and used, thus protecting yourself from security incidents and overspend.

Access Transparency logs are great for keeping an eye on Google personnel accessing your organisation's data 😈 This service is not enabled by default, and you need a paid support package to enable it. As of the time of writing, basic support goes for $29 a month, which is a steal, IMO, and under the current rules it qualifies you for enabling this service.

Python

There are too many books, online resources, and courses available on this. I learned a lot from this one: Real Python. The free version is good, but I think the subscription is worth it. You get access to even more content, and they do Office Hours on Slack, where there is a very lively learner community.

Certification

If you approach it right, Google Cloud Certification offers a fantastic opportunity both to learn and practice new skills, and to validate your ability to design and build with Google Cloud technology.

Warning

Don't use "exam dumps" or past exams in any shape or form to prepare for your exam. Disregarding the ethical aspect, by skipping study you miss a valuable opportunity to actually learn a useful skill and grow as an individual. Don't do a disservice to yourself.

Associate level certification: Cloud Engineer

At the apex of the hierarchy are the Professional certifications.

You can opt in to have your earned certification credentials made public on the Google Cloud Skills Directory, powered by Credly.

For the TensorFlow ML library, there is a TensorFlow Developer Certificate.

And for infrastructure automation, Terraform Associate is the way to go. The official study guide is quite good.

Renamed products

Some products have been renamed. You might still see the old names in some places, for example in videos.

  • Vertex AI Search and Conversation is now Vertex AI Agent Builder 🔗 (April 24, 2024)

    • Vertex AI Search kept its name
    • Vertex AI Conversation became Vertex AI Agents
  • Generative AI App Builder is now Vertex AI Search and Conversation 🔗 (October 09, 2023)

    • Enterprise Search became Vertex AI Search
    • Conversational AI became Vertex AI Conversation
  • Bard is now Gemini

  • Duet AI is now Gemini and Gemini for Workspace

  • Vertex AI was AI Platform until 🔗 May 18, 2021. Some APIs and endpoint domain names still use the old name.
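One place where the old name is still visible is the API hostname. As far as I know, Vertex AI regional endpoints are served from an `aiplatform` domain rather than one with "vertex" in it. The helper below is a hypothetical convenience function, not part of any SDK, and simply builds such a hostname from a region name.

```python
def vertex_ai_endpoint(region: str) -> str:
    """Build the regional API hostname, which still carries the AI Platform name."""
    return f"{region}-aiplatform.googleapis.com"


endpoint = vertex_ai_endpoint("us-central1")
print(endpoint)  # us-central1-aiplatform.googleapis.com
```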