Logging, Monitoring, and Observability in Google Cloud

Course Overview

Learn how to monitor, troubleshoot, and improve the performance of your infrastructure and applications. Based on the principles of Site Reliability Engineering (SRE), this course combines lectures, demonstrations, hands-on labs, and real-world case studies. You will practice full-stack monitoring, real-time log management and analysis, debugging code in production, and profiling CPU and memory usage.

You earn a badge when you finish this course. Visit your profile page to see all the badges you have earned, and showcase what you have learned to raise the profile of your cloud career.

Pre-requisites

To get the most out of this course, participants should meet the following prerequisites:

  • Google Cloud Fundamentals: Core Infrastructure
  • Basic knowledge of coding or scripting
  • Expertise with Linux operating systems and command-line tools

Audience Profile

This course is intended for cloud architects, administrators, SysOps personnel, cloud developers, and DevOps personnel.

Learning Objectives

  • Plan and implement a well-architected logging and monitoring infrastructure.
  • Define service level indicators (SLIs) and service level objectives (SLOs).
  • Build effective monitoring dashboards and alerts.
  • Monitor, troubleshoot, and improve Google Cloud infrastructure.

Content Outline

Welcome to Logging, Monitoring, and Observability in Google Cloud! Learn about the topics covered in this course, how to access the course materials, and how to give feedback.

This session gives a high-level overview of the products that make up Google Cloud's logging, monitoring, and observability suite.

This session covers a number of Site Reliability Engineering (SRE) concepts and how to apply them to reduce customer pain. In this context, a customer is anyone who consumes a cloud-based system.
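As an illustration of the SRE concepts mentioned above, the sketch below computes a request-based SLI, compares it with an SLO target, and derives the remaining error budget. It is a minimal example under stated assumptions: the function name `error_budget`, its parameters, and the sample numbers are hypothetical and are not taken from the course material.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize an availability SLI against an SLO target.

    slo_target is a fraction such as 0.999 (hypothetical example value).
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")

    # SLI: the fraction of requests that succeeded.
    sli = (total_requests - failed_requests) / total_requests
    # Error budget: the number of failures the SLO tolerates in this window.
    allowed_failures = (1 - slo_target) * total_requests
    return {
        "sli": sli,
        "allowed_failures": allowed_failures,
        "remaining_failures": allowed_failures - failed_requests,
        "slo_met": sli >= slo_target,
    }


if __name__ == "__main__":
    # Hypothetical window: one million requests, 400 of which failed.
    print(error_budget(0.999, 1_000_000, 400))
```

A negative `remaining_failures` value means the error budget for the window has been spent, which is typically the signal to prioritize reliability work over new releases.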

Alerting gives you timely awareness of problems in your cloud applications so you can resolve them quickly. In this module, you will learn how to develop alerting strategies, define alerting policies, add notification channels, identify types of alerts and common uses for each, construct and alert on resource groups, and manage alerting policies programmatically.

Monitoring is about keeping track of exactly what is happening with the resources we have spun up in Google Cloud. In this module, we look at options and best practices for architecting monitoring across projects. We differentiate the core Cloud IAM roles that determine who can do what with monitoring; like architecture, this is a crucial early step. We examine some of the default dashboards that Google provides and see how to use them appropriately. We create charts and use them to build custom dashboards that show resource consumption and application load. Finally, we define uptime checks to track liveness and latency.

Continuing our discussion of metrics, this module covers configuring Google Cloud services for observability. You will learn how to integrate the logging and monitoring agents into Compute Engine VMs and images, enable and use Kubernetes monitoring, extend Kubernetes monitoring with Prometheus, and expose custom metrics from code with the help of OpenCensus.

This module examines some of Google Cloud's advanced logging and analysis capabilities. Specifically, you will learn to identify and choose among resource tagging approaches, define log sinks, create monitoring metrics based on log entries, link application errors to Logging and other operations tools using Error Reporting, and export logs to BigQuery for long-term storage and SQL-based analysis.
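To make the idea of a metric based on log entries concrete, the sketch below counts the log entries that match a simple filter and groups the count by a label, which is roughly what a counter-type log-based metric does. It is plain Python over in-memory dictionaries and does not call the Cloud Logging API; the function name `count_matching`, the entry layout, and the sample entries are assumptions made for illustration.

```python
from collections import Counter


def count_matching(entries, severity="ERROR", label="service"):
    """Count entries with the given severity, grouped by one label value.

    Each entry is assumed to be a dict with a "severity" string and an
    optional "labels" dict (a hypothetical, simplified log entry layout).
    """
    counts = Counter()
    for entry in entries:
        # The "filter": keep only entries at the requested severity.
        if entry.get("severity") == severity:
            # The "label extractor": group by the chosen label value.
            key = entry.get("labels", {}).get(label, "unknown")
            counts[key] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical sample entries.
    sample = [
        {"severity": "ERROR", "labels": {"service": "checkout"}},
        {"severity": "INFO", "labels": {"service": "checkout"}},
        {"severity": "ERROR", "labels": {"service": "cart"}},
        {"severity": "ERROR", "labels": {"service": "checkout"}},
        {"severity": "ERROR"},
    ]
    print(count_matching(sample))
```

In a real project the filter and labels would be defined on the log-based metric itself, and the resulting time series could then be charted or used in an alerting policy.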

This session focuses on Cloud Audit Logs and on monitoring as it relates to the VPC network. You will learn how to enable Packet Mirroring, explain the functions of the Network Intelligence Center, collect and analyze VPC Flow, Firewall Rule, and Cloud NAT logs, and use Cloud Audit Logs to find out who did what, and when. Best practices for audit logging are also covered.

Up to this point in the course, we have primarily focused on ways to inspect and monitor the status of our systems running in Google Cloud. But no matter how solid your planning, design, architecture, and preventive maintenance strategies are, things will go wrong. How you manage incidents when they occur significantly affects how users perceive your service. In this module, you will learn how to handle incidents systematically.

The Application Performance Management products (Cloud Trace, Cloud Debugger, and Cloud Profiler) give insight into how your code and services are operating and help you troubleshoot applications deployed to Google Cloud.

In our final module, we discuss optimizing Google Cloud's operations suite costs. Specifically, you will learn to analyze resource utilization costs for operations-related components within Google Cloud and implement best practices for controlling the cost of operations within Google Cloud.

FAQs

A: Observability and monitoring are often mentioned together in conversations about software development and IT operations (DevOps) strategies. Both play a crucial part in keeping your systems, data, and security perimeter safe, but they are complementary capabilities, not the same thing. To understand how observability and monitoring support your IT goals and needs, we must define each term before exploring the differences.

A: The difference between observability and monitoring comes down to whether the data pulled from an IT system is predetermined. Monitoring collects and analyzes predetermined data pulled from individual systems. Observability aggregates all the data produced by all IT systems.

A: The best observability tools provide the end-to-end visibility, monitoring, and telemetry data needed across a distributed IT infrastructure. For many organizations, that includes cloud-native applications and cloud environments. For example, observability and monitoring in AWS are essential for many businesses. Still, many tools cannot handle the complexity involved in providing observability within a cloud environment.

A: To attend the training session, you need a working desktop or laptop that meets the required specifications and a good internet connection to access the labs.

A: We recommend attending the live session so that you can practice, clear up doubts immediately, and get more value from your investment. However, if you have to miss a class because of some contingency, Radiant Techlearning will provide the recorded session for that day. Recorded sessions are meant only for personal use and NOT for distribution or any commercial use.

A: Radiant Techlearning has a data center that hosts a virtual training environment for participants' hands-on practice.

Participants can easily access these labs over Cloud with the help of a remote desktop connection. 

Radiant virtual labs allow you to learn from anywhere and in any time zone. 

A: Learners work on real-world, industry-oriented projects during the training program. These projects will improve your skills and knowledge, give you practical experience, and help you in your future tasks and assignments.
