Cloud Datalab or Jupyter Notebook


Cloud Datalab

An interactive tool for data exploration, analysis, visualization, and machine learning.


What is Cloud Datalab?

  • As a GCP service, it is essentially a preexisting technology wrapped in some GCP conveniences.
  • That preexisting technology is called Jupyter Notebooks.
    • Jupyter Notebooks are interactive web pages that look a bit like a web-based text editor.
    • Jupyter Notebooks can contain documentation, code, and, most importantly, elements that are the results of executed code.
  • Datalab/Jupyter elements can include graphs, visualizations, or just the results of some mathematical calculation.


So I can say that a Datalab notebook is a simple notebook in which I can add some text to explain what the notebook is all about.

  • I can also add some Python code right inside the notebook, and then I can actually run that code and the results will be displayed inside my notebook.
  • When you open a Datalab notebook, a kernel process is launched on the VM hosting the notebook.
  • The Datalab kernel process can execute code within the notebook and access GCP services like BigQuery or ML Engine.
  • Everything you do while working on your notebook exists within a session run on this kernel.
  • Outputs from one section of code can be used in the next section of code.
  • Because Jupyter Notebooks/Datalab support mixed media, they are a fantastic way to share and collaborate on information.
  • Datalab/Jupyter Notebooks have built-in support for a variety of Python graphing and plotting libraries.
  • You can easily share notebooks with other people, allowing them to run calculations for themselves with their own instance of the notebook.
  • Cloud Datalab configures a Google Cloud Source Repositories repo for your notebooks, which is automatically cloned onto the persistent disk attached to your Datalab instance.
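
The point about session state (outputs from one section of code being usable in the next) can be sketched with two notebook cells. This is only an illustration: the sample numbers and the variable names are made up, and the "Cell" comments stand in for separate cells run against the same kernel.

```python
# Cell 1: compute a value; it stays in the kernel's memory for the session.
import statistics

readings = [12.5, 14.0, 13.2, 15.8, 14.4]  # made-up sample data
mean_reading = statistics.mean(readings)

# Cell 2: a later cell can reuse names defined by an earlier cell.
deviations = [round(r - mean_reading, 2) for r in readings]
print(mean_reading)
print(deviations)
```

If the kernel is restarted, the session is gone, so Cell 1 has to be run again before Cell 2 will work.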



If Jupyter Notebooks are so great on their own, why do we need Cloud Datalab?

 It's all about convenience.

  • Google's Datalab command line tool manages the lifecycle of a Datalab instance, which hosts your notebooks and kernel processes.
  • It can create a Datalab VM in seconds, without you having to worry about downloading, installing, and configuring the correct software.
  • Datalab creates the Git repo for your notebooks in Google Cloud Source Repositories, and the repo is cloned onto the persistent disk of the Datalab instance.
  • Changes to notebooks can then be committed back to the repo. You can also safely delete a Datalab instance while retaining its persistent disk.
  • You can then use the existing disk when you create a new Datalab instance.

To create a Google Cloud Datalab instance, follow the instructions below:


  • Enable the Cloud Source Repositories API (if your repository has not been created, this option is disabled).
  • Open Google Cloud Shell and run the commands below to create the Datalab VM:
    • gcloud components update
    • gcloud components install datalab
    • datalab create datalab-demo
  • Answer yes when prompted about the SSH key, then copy the output URL to open the Datalab instance.
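
If you would rather script these steps than type them, a minimal Python sketch could look like the one below. The three commands are exactly the ones listed above; the helper name run_setup and its dry_run flag are my own assumptions for illustration, not part of the gcloud or datalab tools. By default it only prints the commands, so nothing is changed until you opt in.

```python
import subprocess

# Commands taken from the steps above; "datalab-demo" is the instance name used there.
SETUP_COMMANDS = [
    ["gcloud", "components", "update"],
    ["gcloud", "components", "install", "datalab"],
    ["datalab", "create", "datalab-demo"],
]


def run_setup(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    printed = []
    for cmd in commands:
        line = " ".join(cmd)
        print(line)
        printed.append(line)
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)
    return printed


planned = run_setup(SETUP_COMMANDS)  # dry run: nothing is executed
```

Calling run_setup(SETUP_COMMANDS, dry_run=False) from a terminal would run the commands for real. The create step still asks its questions interactively (such as the SSH key prompt), so it is not a fully unattended install.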




