GCP Fundamentals: Core Infrastructure

Content

What is Cloud Computing

The US National Institute of Standards and Technology created a definition. It has five equally important traits:

  1. Computing resources are on-demand and self-service. All you have to do is use a simple interface and you get the processing power, storage, and network you need, with no human intervention required.
  2. You access these resources over the net from anywhere you want.
  3. The provider of those resources has a big pool of them and allocates them to customers out of that pool. That allows the provider to get economies of scale by buying in bulk and pass the savings on to the customers. Customers don't have to know or care about the exact physical location of those resources.
  4. The resources are elastic. If you need more resources you can get more, rapidly. If you need less, you can scale back.
  5. Customers pay only for what they use or reserve as they go. If they stop using resources, they stop paying.

IaaS vs PaaS vs SaaS

Virtualized data centers brought you Infrastructure as a Service, IaaS, and Platform as a Service, PaaS offerings.

  • IaaS offerings provide raw compute, storage, and network organized in ways that are familiar from data centers.
  • PaaS offerings, on the other hand, bind application code you write to libraries that give access to the infrastructure your application needs. That way, you can just focus on your application logic.

In the IaaS model, you pay for what you allocate.

In the PaaS model, you pay for what you use.

Both sure beat the old way where you bought everything in advance based on lots of risky forecasting. As Cloud Computing has evolved, the momentum has shifted towards managed infrastructure and managed services. GCP offers many services in which you need not worry about any resource provisioning at all. We'll discuss many in this course. They're easy to build into your applications and you pay per use.

What about SaaS? Google's popular applications like Google Search, Gmail, Google Docs, and Google Drive are Software as a Service applications in that they're consumed directly over the internet by end users (e.g. G Suite).

GCP_IaaS_PaaS_FaaS.png

article: https://cloud.google.com/blog/products/gcp/time-to-hello-world-vms-vs-containers-vs-paas-vs-faas

GCP Multi-regions, Regions & Zones

GCP_regions_zones.png

You can run your resources:

  • in several zones of the same region for fault tolerance,
  • in several regions around the world for better performance.
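
As a quick way to see which regions and zones are available to your project, you can list them with the gcloud CLI (a minimal sketch; output columns may vary by SDK version):

    # List all regions, and all zones with their region and status
    gcloud compute regions list
    gcloud compute zones list

    # Show only the zones of a single region (e.g. us-central1)
    gcloud compute zones list --filter="region:us-central1"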

Pricing innovations

  • Per-second billing: Google was the first major cloud provider to bill by the second, rather than rounding up to bigger units of time, for its virtual-machines-as-a-service offering. This may not sound like a big deal, but the charges for rounding can really add up for customers who create and run lots of virtual machines. Per-second billing is offered for virtual machine use through Compute Engine and for several other services we'll also look at in this course: Kubernetes Engine (Container Infrastructure as a Service), Cloud Dataproc (the open-source big data system Hadoop as a Service), and App Engine's Flexible Environment (a Platform as a Service).
  • Discounts for sustained use: Compute Engine offers automatically applied sustained-use discounts for running a virtual machine for a significant portion of the billing month. When you run an instance for more than 25 percent of a month, Compute Engine automatically gives you a discount for every incremental minute you use it.
  • Custom virtual machine types: Later in this course, you'll learn how virtual machines are configured; among other things, you specify how much memory and how many virtual CPUs they should have. Normally, you pick a virtual machine type from a standard set of values, but Compute Engine also offers custom machine types, so you can fine-tune the sizes of the virtual machines you use and tailor your pricing to your workloads (see the gcloud sketch below).
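
For illustration, here is roughly how a custom machine type could be requested with gcloud (a hedged sketch; the VM name, zone, and sizes are placeholders, and the allowed sizes depend on the machine family):

    # Create a VM sized to the workload: 4 vCPUs and 5 GB of RAM,
    # instead of picking the nearest predefined machine type
    gcloud compute instances create my-custom-vm \
        --zone us-central1-a \
        --custom-cpu 4 \
        --custom-memory 5GB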

Multi-layered security approach

video

GCP_regions_zones.png

Starting with Google Cloud

To run your workloads in GCP:

  • you use projects to organize them,
  • you use Google Cloud Identity and Access Management (IAM) to control who can do what,
  • and you use your choice of several interfaces to connect.

Projects

Projects are the main way you organize the resources you use in GCP. Use them to group together related resources, usually because they have a common business objective.
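
A hypothetical example of creating and selecting a project with gcloud (project IDs are globally unique, so the ID below is only a placeholder):

    # Create a new project to group related resources
    gcloud projects create my-sample-project-id-12345 --name="My Sample Project"

    # Make it the default project for subsequent gcloud commands
    gcloud config set project my-sample-project-id-12345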

IAM

The principle of least privilege is very important in managing any kind of compute infrastructure, whether it's in the Cloud or on-premises. This principle says that each user should have only those privileges needed to do their jobs. In a least-privilege environment, people are protected from an entire class of errors. GCP customers use IAM to implement least privilege, and it makes everybody happier. There are four ways to interact with GCP's management layer:

  • through the web-based console,
  • through the SDK and its command-line tools,
  • through the APIs,
  • and through a mobile app.

Google_Security.png

When you build an application on your on-premises infrastructure, you're responsible for the security of the entire stack: from the physical security of the hardware and the premises in which it's housed, through the encryption of data on disk and the integrity of your network, all the way up to securing the content stored in those applications. When you move an application to Google Cloud Platform, Google handles many of the lower layers of security. Because of its scale, Google can deliver a higher level of security at these layers than most of its customers could afford to do on their own. The upper layers of the security stack remain the customers' responsibility. Google provides tools such as IAM to help customers implement the policies they choose at these layers.

GCP resource hierarchy

video #1

video #2

GCP_organization_heirarchy.png

GCP_resource_hierarchy.png Projects_in_a_Folder.png

Policies are inherited downward in the hierarchy. GCP_project_identifiers.png

Example_without_folders.png

Identity and Access Management (IAM)

video

Who (Account/Identity)/Doing what (Roles)/On which resources?

IAM lets administrators authorize who can take action on specific resources. An IAM policy has:

  • a "who" part: The "who" part names the user or users you're talking about. The "who" part of an IAM policy can be defined either by:
    • a Google account,
    • a Google group,
    • a Service account,
    • an entire G Suite,
    • or a Cloud Identity domain.
  • a "can do what" part: The "can do what" part is defined by an IAM role. An IAM role is a collection of permissions. Most of the time, to do any meaningful operations, you need more than one permission. For example, to manage instances in a project, you need to create, delete, start, stop, and change an instance. So the permissions are grouped together into a role that makes them easier to manage.
  • and an "on which resource" part:

roles_in_IAM.png

There are three kinds of roles in Cloud IAM:

  • Primitive roles are broad. You apply them to a GCP project and they affect all resources in that project. These are the owner, editor, and viewer roles. If you're a viewer on a given resource, you can examine it but not change its state. If you're an editor, you can do everything a viewer can do, plus change its state. And if you're an owner, you can do everything an editor can do, plus manage roles and permissions on the resource.
  • The owner role on a project also lets you do one more thing: set up billing. Often, companies want someone to be able to control the billing for a project without the right to change the resources in the project. That's why you can grant someone the billing administrator role.

Predefined roles

Be careful: if you have several people working together on a project that contains sensitive data, primitive roles are probably too coarse. Fortunately, GCP IAM provides finer-grained types of roles. GCP services offer their own sets of predefined roles, and they define where those roles can be applied. For example, later in this course we'll talk about Compute Engine, which offers virtual machines as a service. Compute Engine offers a set of predefined roles, and you can apply them to Compute Engine resources in a given project, a given folder, or an entire organization. Another example: consider Cloud Bigtable, which is a managed database service. Cloud Bigtable offers roles that can apply across an entire organization, to a particular project, or even to individual Bigtable database instances.

IAM provides more fine-grained predefined roles on particular services (video).

A lot of companies have a least-privileged model in which each person in your organization has the minimum amount of privilege needed to do his or her job.
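
As an illustration of least privilege, a role can be granted to a single user on a single project with gcloud (the project ID, user, and roles below are placeholders):

    # Let alice view, but not modify, resources in the project (primitive viewer role)
    gcloud projects add-iam-policy-binding my-sample-project-id-12345 \
        --member="user:alice@example.com" \
        --role="roles/viewer"

    # Or grant a finer-grained predefined role scoped to Compute Engine instances only
    gcloud projects add-iam-policy-binding my-sample-project-id-12345 \
        --member="user:alice@example.com" \
        --role="roles/compute.instanceAdmin.v1"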

Service Accounts

mid of video

What if you want to give permissions to a Compute Engine virtual machine, rather than to a person? Then you would use a service account.

service_accounts.png
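
A sketch of how a service account might be created and attached to a VM, so that the VM rather than a person carries the permissions (all names here are placeholders):

    # Create the service account
    gcloud iam service-accounts create my-vm-sa --display-name="VM service account"

    # Give it only the permissions it needs, e.g. read access to Cloud Storage objects
    gcloud projects add-iam-policy-binding my-sample-project-id-12345 \
        --member="serviceAccount:my-vm-sa@my-sample-project-id-12345.iam.gserviceaccount.com" \
        --role="roles/storage.objectViewer"

    # Launch a VM that runs as that service account
    gcloud compute instances create worker-vm \
        --zone us-central1-a \
        --service-account my-vm-sa@my-sample-project-id-12345.iam.gserviceaccount.com \
        --scopes cloud-platform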

Interacting with GCP

video

GCP_4_ways_interactions.png
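
For the SDK/command-line path, a typical first session looks roughly like this (a sketch, not an exhaustive setup guide; the project ID is a placeholder):

    # Authenticate and pick a default project
    gcloud auth login
    gcloud config set project my-sample-project-id-12345

    # Inspect the current configuration
    gcloud config list

    # Example read-only call through the management layer
    gcloud compute instances list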

APIs Explorer

Cloud Marketplace (formerly Cloud Launcher)

console.cloud.google.com/marketplace

video

A quick way to get access to "solutions" with minimal effort.

e.g.:

  • deploying a LAMP stack on GCP (LAMP stands for Linux, Apache, MySQL, PHP) > video demo.

Virtual Machines on GCP

Compute Engine lets you run virtual machines on Google's global infrastructure.

Virtual Private Cloud (VPC) Network

Your VPC networks connect your GCP resources to each other and to the internet:

  • you can segment your networks,
  • you can use firewall rules to restrict access to instances, and
  • you can create static routes to forward traffic to specific destinations.

The way a lot of people get started with GCP is:

  • to define their own Virtual Private Cloud inside their first GCP project,
  • or they can simply choose the default VPC and get started with that.

VPC_regions_subnets.png

The Virtual Private Cloud networks that you define have global scope.

They can have subnets in any GCP region worldwide and subnets can span the zones that make up a region. "In Google Cloud VPCs, subnets have regional scope."

This architecture makes it easy for you to define your own network layout with global scope:

  • You can also have resources in different zones on the same subnet.
  • You can dynamically increase the size of a subnet in a custom network by expanding the range of IP addresses allocated to it. Doing that doesn't affect already configured VMs. In this example, your VPC has one network with, so far, one subnet defined in the us-east1 region. Notice that it has two Compute Engine VMs attached to it. They're neighbors on the same subnet even though they are in different zones. You can use this capability to build solutions that are resilient but still have simple network layouts.
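
A hedged sketch of two of these capabilities with gcloud: the firewall rules mentioned above and on-the-fly subnet expansion (network, subnet, region, and range values are placeholders):

    # Firewall rule: allow inbound HTTP to instances on the network
    gcloud compute firewall-rules create allow-http \
        --network my-vpc \
        --allow tcp:80 \
        --source-ranges 0.0.0.0/0

    # Expand an existing subnet's IP range (e.g. to a /20) without touching running VMs
    gcloud compute networks subnets expand-ip-range my-subnet \
        --region us-east1 \
        --prefix-length 20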

Compute Engine

video

  • OS, type of disk storage, software pre-installed with startup scripts, ...
  • snapshots of disks
  • preemptible VMs for jobs that can be stopped and restarted.
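
For example, a preemptible VM for a restartable batch job could be requested like this (a sketch; the name and zone are placeholders):

    # Preemptible VMs cost much less but can be stopped by Compute Engine at any time,
    # so they suit jobs that can checkpoint and restart
    gcloud compute instances create batch-worker-1 \
        --zone us-central1-a \
        --preemptible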

Important VPC capabilities

video

  • provided router
  • provided firewall to control traffic going to and from instances
  • VPC Peering to interconnect two VPCs on GCP, even across projects; Shared VPC additionally provides IAM controls to specify who can access what.
  • Cloud Load Balancing: Google Cloud Load Balancing allows you to balance HTTP-based traffic across multiple Compute Engine regions.

Cloud_Load_balancing.png

  • Cloud DNS: programmable with REST API
  • Cloud CDN: CDN stands for Content Delivery Network, a global system of edge caches that caches content close to your users: better user experience, fewer requests to your backends. Just enable it with a checkbox!

Create a VM from Console

Create a VM with gcloud in Cloud Shell

Create the new VM in the same region, but in another zone, setting us-central1-c as the new default zone: VM_gcloud.png
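
The commands behind that screenshot look roughly like this (a sketch; VM name, machine type, and image family are illustrative rather than copied from the lab):

    # Make us-central1-c the default zone for new resources
    gcloud config set compute/zone us-central1-c

    # Create the second VM in that zone
    gcloud compute instances create my-vm-2 \
        --machine-type n1-standard-1 \
        --image-project debian-cloud \
        --image-family debian-9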

SSH the new VM: VM_ping_ssh.png

Install a simple webserver and edit its homepage on the new VM: VM_install_simple_webserver.png

VM_edit_webserver_homepage.png

Check that the running webserver serves the homepage locally on the new VM:

VM_check_webser_serves_local_homepage.png

Check that the running webserver serves the homepage from the other VM on the same VPC:

VM1_seeing_homepage_of_VM2.png
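
The steps in the screenshots above can be reproduced with commands roughly like these (run on the new VM over SSH; the homepage path assumes the Debian nginx package, and the VM name is the placeholder used earlier):

    # On my-vm-2: install a simple web server
    sudo apt-get update
    sudo apt-get install -y nginx

    # Replace the default homepage
    echo 'Hi from my-vm-2' | sudo tee /var/www/html/index.nginx-debian.html

    # Check it locally
    curl http://localhost

    # From the other VM on the same VPC, the instance name typically resolves
    # through GCP's internal DNS
    curl http://my-vm-2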

Lab about Compute Engine

notes

GCP Storage Options

video

Storage_COMPARISON.png

GCP has other storage options to meet your needs for:

  • structured,
  • unstructured,
  • transactional
  • and relational data.

Its core storage options:

  • Cloud Storage
  • Cloud Bigtable (NoSQL)
  • Cloud SQL (RDBMS)
  • Cloud Spanner (RDBMS)
  • Cloud Datastore (NoSQL)

Cloud Storage & Cloud Storage interactions

Cloud Storage objects are immutable

4 classes:

  • "multi-regional" & "regional" are high-performance object classes
  • "nearline" & "coldline" are backup/archivable storage

Cloud_Storage_interactions_4classes.png

Storage_CloudStorage_bring_data_in.png

Storage_CloudStorage_interactions_with_other_GCP_services.png

Storage_CloudStorage_buckets.png
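
A minimal sketch of interacting with Cloud Storage from the command line using gsutil (bucket names must be globally unique, so the one below is a placeholder):

    # Create a regional bucket in us-central1
    gsutil mb -l us-central1 gs://my-unique-bucket-name-12345/

    # Copy an object in, list the bucket, and copy the object back out
    gsutil cp my-file.txt gs://my-unique-bucket-name-12345/
    gsutil ls gs://my-unique-bucket-name-12345/
    gsutil cp gs://my-unique-bucket-name-12345/my-file.txt ./my-file-copy.txt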

Google Cloud Bigtable

video

Definition: Cloud Bigtable is Google Cloud's fully managed NoSQL, big data database service, designed for applications handling terabytes up to petabytes of data.

Storage_BigTable_interactions.png

Google Cloud Datastore

video

Definition: Google Cloud Datastore is a managed, horizontally scalable NoSQL database.

Google Cloud SQL

video

Cloud SQL is a managed Relational Database Management System (RDBMS) service, based on either MySQL or PostgreSQL (beta).

It manages "database transactions".

Google Cloud Spanner

video, together with Cloud SQL

Definition: Cloud Spanner is a horizontally scalable Relational Database Management System (RDBMS).

Lab: Cloud Storage / Cloud SQL

video

notes

Containers, Kubernetes, and Kubernetes Engine

video

Containers

containers_IaaS.png

containers_containers.png

It scales like PaaS, but gives you nearly the same flexibility as IaaS.

With this abstraction, your code is ultra portable, and you can treat the OS and the hardware as a black box. You can go from your laptop to the cloud without changing or rebuilding anything.

containers_python_app.png containers_python_requirements.png containers_dockerfile.png
containers_build_run_container.png
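
The screenshots above walk through packaging a small Python web app; a condensed, hypothetical equivalent of the build-and-run step looks like this (image name and port are illustrative):

    # Assuming the Dockerfile from the screenshot sits in the current directory
    # alongside app.py and requirements.txt:
    docker build -t py-web-app .

    # Run the container, mapping the app's port to the host
    docker run -d -p 8080:8080 py-web-app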

Kubernetes

video

Lets you orchestrate many containers on many hosts, scale them as microservices, and deploy rollouts and rollbacks.

Kubernetes is a set of APIs that you can use to deploy containers on a set of nodes called a cluster.

GKE is "hosted Kubernetes" by Google!

GKE clusters:

  • can be customised,
  • can be deployed with one gcloud command: gcloud container clusters create k1
  • check status in the Admin Console
  • Then, you deploy containers on nodes using a wrapper around one or more containers called a pod. A Pod is the smallest unit in Kubernetes that you create or deploy.
containers_k8s_cluster.png containers_k8s_configure_with_GKE.png containers_k8s_POD_containers.png

One way to run a container in a POD in Kubernetes is to use kubectl:

containers_k8s_kubectl.png containers_k8s_kubectl_networking.png containers_k8s_kubectl_simple_loadbalancer.png

To get the IP of the load balancer:

containers_k8s_kubectl_get_IPs.png

To scale the deployment:

containers_k8s_kubectl_scale_deployment.png containers_k8s_kubectl_autoscale_deployment.png
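
A condensed sketch of the imperative flow in the screenshots above (the image tag and names are illustrative; newer kubectl versions use create deployment where older ones used run):

    # Run a container image as a Deployment (one Pod to start)
    kubectl create deployment nginx --image=nginx:1.15.7

    # Expose it to the internet through a load balancer Service
    kubectl expose deployment nginx --port 80 --type LoadBalancer

    # Find the Service's external (load balancer) IP
    kubectl get services

    # Scale out to three replicas, or let Kubernetes autoscale on CPU load
    kubectl scale deployment nginx --replicas 3
    kubectl autoscale deployment nginx --min 10 --max 15 --cpu-percent 80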

the real strength of Kubernetes comes when you work in a declarative way. Instead of issuing commands, you provide a configuration file that tells Kubernetes what you want your desired state to look like, and Kubernetes figures out how to do it.

containers_k8s_kubectl_configuration_file.png containers_k8s_kubectl_configuration_file_app_nginx.png containers_k8s_kubectl_configuration_file_app_nginx_scale_3_to_5.png containers_k8s_kubectl_run_configuration_file.png

Watch the PODs come online:

containers_k8s_kubectl_watch_PODs_come_onine.png

Which ones are deployed?

containers_k8s_kubectl_watch_PODs_deployed.png

Find out the external IP of the service(s):

containers_k8s_kubectl_watch_PODs_get_IPs.png

And hit a public IP from a client:

containers_k8s_kubectl_watch_PODs_hit_IPs_from_client.png
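
Put together, the declarative workflow from the screenshots looks roughly like this (assuming a file nginx-deployment.yaml along the lines of the one shown, with the label app: nginx and three replicas):

    # Tell Kubernetes the desired state described in the config file
    kubectl apply -f nginx-deployment.yaml

    # Watch the Pods come online
    kubectl get pods -l app=nginx --watch

    # See what's deployed, and find the Service's external IP
    kubectl get deployments
    kubectl get services

    # Hit the public IP from any client (substitute the EXTERNAL-IP value)
    curl http://EXTERNAL_IP/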

What happens when you want to upload a new version of your app?

It might be too risky to roll out all of your services all at once!

Use kubectl rollout ... or change your deployment configuration file and apply the changes using kubectl apply:

New PODs will get created according to your update strategy:

Here's an example configuration that will create a new version of your pods one-by-one, and wait for a new pod to be available before destroying one of the old pods.

containers_k8s_kubectl_update_code_rollout.png
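
Concretely, the update could be rolled out in either of these ways (the deployment and file names reuse the hypothetical ones above):

    # Imperative: swap the image and watch the rolling update progress
    kubectl set image deployment/nginx-deployment nginx=nginx:1.16.0
    kubectl rollout status deployment/nginx-deployment

    # Declarative: edit nginx-deployment.yaml (new image tag, rollingUpdate strategy)
    # and re-apply it
    kubectl apply -f nginx-deployment.yaml

    # If the new version misbehaves, roll back to the previous one
    kubectl rollout undo deployment/nginx-deployment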

Lab: Containers / Kubernetes / GKE

video: k8s on GKE is much more...

There are a lot of features in Kubernetes and GKE we haven't even touched on, such as configuring health checks, setting session affinity, managing different rollout strategies, and deploying pods across regions for high availability. But for now, that's enough. In this module, you've learned how to build and run containerized applications, orchestrate and scale them on a cluster, and deploy them using rollouts. Now you'll see how to do it in a demo and practice it in a lab exercise.

App Engine

  • video #1

  • video #2

  • Compute infrastructure (IaaS): Compute Engine & Kubernetes Engine

  • Platform-as-a-Service (PaaS) and "focus on your code": App Engine

App Engine:

  • scales automatically
  • you pay for what you use
  • 2 environments: Standard & Flexible

Standard Environment

  • Simpler model: your app runs in a sandbox (which is why it can scale so quickly)
  • Low levels of usage at no charge (free daily quota)
  • has constraints, e.g. a 60-second request timeout (if that's not suitable, move to the Flexible Environment)
  • Runtimes for Java, Python, PHP, and Go

AppEngine_Standard_example_workflow.png
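
A minimal sketch of that workflow from the command line (assumes a Python app with an app.yaml next to it; the first deployment in a project may also require gcloud app create):

    # From the directory containing app.yaml and the application code
    gcloud app deploy

    # Open the deployed application in a browser
    gcloud app browse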

Flexible Environment

  • runs your app in containers (which run on VMs)
  • lets you configure your containers
  • supports the standard runtimes as well as custom runtimes (via Docker)

Comparison Standard vs Flexible

AppEngine_Comparison_standard_flexible.png

Comparison App Engine vs Kubernetes Engine

AppEngine_Comparison_app_engine_vs_Kubernetes_engine.png

Google Cloud Endpoints and Apigee Edge

video

  • What's an API? A clean, well-defined interface that abstracts away needless details of the service's implementation.

Google provides two API-related offerings:

  • Cloud Endpoints: a managed API proxy
  • Apigee Edge: also a managed API proxy, but business-oriented (rate limiting, quotas, analytics), aimed at providing a software service to OTHER companies.
API_Cloud_Endpoints.png API_Cloud_Endpoints_managed_Proxy.png
API_Cloud_Endpoints_supported_platforms.png

API_ApigeeEdge.png

Many users of Apigee Edge are providing a software service to other companies and those features come in handy.

Because the backend services for Apigee Edge need not be in GCP, engineers often use it when they are "taking apart" a legacy application. Instead of replacing a monolithic application in one risky move, they can use Apigee Edge to peel off its services one by one, standing up microservices to implement each in turn, until the legacy application can finally be retired.

Lab: Getting Started with App Engine

Development in the cloud

video

  • Git: hosted private Git repositories > Cloud Source Repositories
  • Managed, single-purpose functions that can be triggered by events (in Cloud Storage, in Cloud Pub/Sub, or an HTTP call): Cloud Functions (beta)
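
A hypothetical deployment of an HTTP-triggered function (the function name and runtime are placeholders; Cloud Storage and Pub/Sub triggers use --trigger-bucket and --trigger-topic instead):

    # Deploy a function that runs in response to an HTTP call;
    # the entry point in the source defaults to the function name
    gcloud functions deploy hello_http \
        --runtime python37 \
        --trigger-http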

Deployment: Infrastructure as code

video

It's often more efficient to use a template to set up your GCP environment.

That means a specification of what the environment should look like. It's declarative rather than imperative, using either:

  • a YAML template
  • or Python

Then you give the template to Deployment Manager, which creates the environment for you. You can version-control your deployment templates in Git repositories (e.g. Cloud Source Repositories).
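
For example, given a config.yaml that declares the desired resources (say, a single compute.v1.instance), the Deployment Manager workflow is roughly this (deployment and file names are placeholders):

    # Create the deployment from the declarative template
    gcloud deployment-manager deployments create my-deployment --config config.yaml

    # Inspect what was created
    gcloud deployment-manager deployments describe my-deployment

    # Edit config.yaml later and apply the change to the same deployment
    gcloud deployment-manager deployments update my-deployment --config config.yaml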

Monitoring: Proactive instrumentation (Stackdriver)

video

You can't run an application stably without monitoring. Monitoring lets you figure out whether the changes you made were good or bad. It lets you respond with information rather than with panic when one of your end users complains that your application is down.

Stackdriver is GCP's tool for monitoring, logging and diagnostics.

Monitoring_StackDriver.png

Monitoring_StackDriver_services.png

lab: Getting Started with Deployment Manager and Stackdriver

Big Data and Machine Learning

BigData_ML_tools.png

Dataproc

BigData_ML_Dataproc.png

  • Spark/SparkSQL for data mining
  • MLlib (Spark ML libraries)

Dataflow

BigData_ML_Dataflow.png BigData_ML_Dataflow_pipelines.png BigData_ML_Dataflow_ETL_tool.png BigData_ML_Dataflow_orchestration.png

BigData_ML_Dataflow_pipeline_example.png

BigQuery

BigData_ML_BigQuery.png BigData_ML_BigQuery_fully_managed.png BigData_ML_BigQuery_global_service.png

Cloud Pub/Sub and Cloud Datalab

BigData_ML_PubSub.png BigData_ML_PubSub_usage.png
BigData_ML_Datalab.png BigData_ML_Datalab_interactions.png

Google Cloud Machine Learning Platform

BigData_ML_ML_platform.png

Machine learning APIs

BigData_ML_API_Vision.png BigData_ML_API_Vision_aplications.png
BigData_ML_API_Natural_Language.png BigData_ML_API_Natural_Language_applications.png

BigData_ML_API_Translation.png

BigData_ML_API_VideoBeta.png

lab: Getting Started with BigQuery

Summary

video

  • GCP: a continuum of services from "managed infrastructure" to "dynamic infrastructure"

summary_GCP_continuum_managed_to_dynamic_infrastructure.png

  • GCP offers a variety of Load balancers:

various_load_balancing_services.png

  • GCP offers various ways to interconnect other networks to it:

Interconnect_other_network_to_GCP.png

  • GCP offers various types of storage:

various_storages.png

various_storages_classes.png

Resources/Articles