Using Markdown In Jupyter Notebook

  

Jupyter notebook tutorial on how to install, run, and use Jupyter for interactive matplotlib plotting, data analysis, and publishing code


New to Plotly?

Plotly is a free and open-source graphing library for Python. We recommend you read our Getting Started guide for the latest installation or upgrade instructions, then move on to our Plotly Fundamentals tutorials or dive straight in to some Basic Charts tutorials.


Introduction¶

Jupyter has a beautiful notebook that lets you write and execute code, analyze data, embed content, and share reproducible work. Jupyter Notebook (previously referred to as IPython Notebook) allows you to easily share your code, data, plots, and explanation in a single notebook. Publishing is flexible: PDF, HTML, ipynb, dashboards, slides, and more. Code cells follow an input and output format: each cell takes code as input and displays the result as output directly below it.
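As a minimal sketch of the input and output format (the variable names are illustrative and not taken from the original tutorial), a code cell might contain the following. In a notebook, the value of the last expression in a cell is echoed as the cell's output; print() is used here so the sketch behaves the same way as a plain script.

```python
# Input: the code typed into a single notebook cell.
squares = [n * n for n in range(5)]
total = sum(squares)

# Output: shown directly below the cell when it is run.
print(total)  # prints 30
```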

Installation¶

There are a few ways to use a Jupyter Notebook:

  • Install with pip. Open a terminal and type: $ pip install jupyter.
  • Windows users can install with setuptools.
  • Anaconda and Enthought allow you to download a desktop version of Jupyter Notebook.
  • nteract allows users to work in a notebook environment via a desktop application.
  • Microsoft Azure provides hosted access to Jupyter Notebooks.
  • Domino Data Lab offers web-based Notebooks.
  • tmpnb launches a temporary online Notebook for individual users.

Getting Started¶

Once you've installed the Notebook, start it from your terminal by calling $ jupyter notebook (Windows users need to open their Command Prompt). This opens a browser at the local URL of your Notebooks, by default http://127.0.0.1:8888. You'll see a dashboard listing all your Notebooks, and you can launch them from there. The Notebook has the advantage of looking the same when you're coding and publishing; while it is running, you additionally have all the options to move code, run cells, change kernels, and use Markdown.

Helpful Commands¶

- Tab Completion: Jupyter supports tab completion! You can type object_name.<TAB> to view an object’s attributes. For tips on cell magics, running Notebooks, and exploring objects, check out the Jupyter docs.
- Help: running ? provides an introduction and overview of features.

- Quick Reference: open the quick reference by running %quickref.

- Keyboard Shortcuts: Shift-Enter will run a cell, Ctrl-Enter will run a cell in-place, Alt-Enter will run a cell and insert another below. See more shortcuts here.
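The exploration commands above are IPython features, so they only work inside a notebook or IPython shell. As a rough sketch of what they surface, the plain-Python calls below list an object's public attributes (approximately what tab completion offers) and fetch a docstring (approximately what appending ? to a name shows). The variable names are illustrative.

```python
import inspect

text = "jupyter notebook"

# Roughly what tab completion lists after typing `text.` and pressing <TAB>:
# the object's public attributes.
public_attributes = [name for name in dir(text) if not name.startswith("_")]
print(public_attributes[:5])

# Roughly what `text.upper?` shows in IPython: the docstring of the object.
doc = inspect.getdoc(text.upper)
print(doc)

# IPython-only commands (uncomment inside a notebook cell):
# ?            # introduction and overview of IPython's features
# %quickref    # quick reference card
```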

Languages¶

The bulk of this tutorial discusses executing Python code in Jupyter notebooks. You can also use Jupyter notebooks to execute R code. Skip down to the [R section] for more information on using IRkernel with Jupyter notebooks and graphing examples.

Package Management¶

When installing packages in Jupyter, you either need to install the package in your actual shell, or run the command from a cell with the ! prefix, e.g. !pip install packagename.
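One thing to watch with the ! prefix is that the pip on your shell's PATH may belong to a different Python than the one running the notebook kernel. A hedged sketch of one way around this is to build the install command from sys.executable, so the package lands in the kernel's own environment. The helper name and "packagename" are placeholders, and the command is only printed here, not run.

```python
import sys


def pip_install_command(package):
    """Build the command that installs `package` into the current interpreter."""
    return [sys.executable, "-m", "pip", "install", package]


# "packagename" is a placeholder, not a real package.
command = pip_install_command("packagename")
print(" ".join(command))

# To actually run it from Python instead of the shell:
# import subprocess
# subprocess.check_call(command)
```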

You may want to reload submodules if you've edited the code in one. IPython comes with automatic reloading magic. You can reload all changed modules before executing a new line.
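In IPython the reloading magic is enabled with %load_ext autoreload followed by %autoreload 2. The sketch below shows the underlying idea in plain Python with importlib.reload, using a throwaway module whose name (scratch_helpers) is hypothetical. It is an illustration of reloading in general, not of how the autoreload extension is implemented.

```python
import importlib
import pathlib
import sys
import tempfile

# Skip bytecode caching so the rewritten file is always recompiled from source.
sys.dont_write_bytecode = True

# Create a throwaway module (hypothetical name) in a temporary directory.
module_dir = tempfile.mkdtemp()
module_path = pathlib.Path(module_dir) / "scratch_helpers.py"
module_path.write_text("def greeting():\n    return 'first version'\n")
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

import scratch_helpers

before = scratch_helpers.greeting()

# Edit the module on disk, then reload it without restarting the interpreter.
module_path.write_text("def greeting():\n    return 'second version, edited'\n")
importlib.reload(scratch_helpers)
after = scratch_helpers.greeting()

print(before, "->", after)
```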

Some useful packages that we'll use in this tutorial include:

  • Pandas: import data via a url and create a dataframe to easily handle data for analysis and graphing. See examples of using Pandas here: https://plotly.com/pandas/.
  • NumPy: a package for scientific computing with tools for algebra, random number generation, integrating with databases, and managing data. See examples of using NumPy here: https://plotly.com/numpy/.
  • SciPy: a Python-based ecosystem of packages for math, science, and engineering.
  • Plotly: a graphing library for making interactive, publication-quality graphs. See examples of statistical, scientific, 3D charts, and more here: https://plotly.com/python.

Import Data¶

You can use the pandas read_csv() function to import data, for example a CSV file hosted on GitHub, and then display the result as a table using Plotly.

Use dataframe.column_title to select a single column of the dataframe.

Most pandas functions also work on an entire dataframe. For example, calling std() calculates the standard deviation for each column.
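The three steps above (importing a CSV, selecting a column, and calling std() on the whole dataframe) can be sketched as follows. To keep the example self-contained, an inline string stands in for the hosted CSV file. The column names and values are made up for illustration, and numeric_only=True is passed so the text column is skipped.

```python
import io

import pandas as pd

# Inline stand-in for a hosted CSV file; the values are made up for illustration.
csv_text = """city,visits,rating
Lyon,10,4.0
Oslo,20,3.0
Lima,30,5.0
"""

# read_csv accepts a URL, a file path, or a file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Attribute access selects one column as a Series.
visits = df.visits

# std() on the whole frame returns one sample standard deviation per numeric column.
spread = df.std(numeric_only=True)

print(df)
print(spread)
```

Passing a URL string to read_csv() in place of the StringIO object reads the file over the network instead.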

Plotting Inline¶

You can use Plotly's Python API to plot inside your Jupyter Notebook by calling plotly.plotly.iplot(), or plotly.offline.iplot() if working offline (in Plotly 4 and later, the online plotly.plotly module lives in the separate chart_studio package as chart_studio.plotly). Plotting in the notebook gives you the advantage of keeping your data analysis and plots in one place. Head to the Plotly getting started page to learn how to set your credentials. Calling the plot with iplot automatically generates an interactive version of the plot inside the Notebook in an iframe.

Plotting multiple traces and styling the chart with custom colors and titles is simple with Plotly syntax. Additionally, you can control the privacy with sharing set to public, private, or secret.

Now we have interactive charts displayed in our notebook. Hover on the chart to see the values for each bar, click and drag to zoom into a specific section or click on the legend to hide/show a trace.

Plotting Interactive Maps¶

Plotly is now integrated with Mapbox. In this example we'll plot latitude and longitude data of nuclear waste sites. To plot on Mapbox maps with Plotly you'll need a Mapbox account and a Mapbox Access Token which you can add to your Plotly settings.

3D Plotting¶

Using Numpy and Plotly, we can make interactive 3D plots in the Notebook as well.

Animated Plots¶

Check out Plotly's animation documentation to see how to create animated plots inline in Jupyter notebooks, such as the Gapminder plot.

Plot Controls & IPython widgets¶

Add sliders, buttons, and dropdowns to your inline chart:

Additionally, IPython widgets allow you to add sliders, widgets, search boxes, and more to your Notebook. See the widget docs for more information. For others to be able to access your work, they'll need IPython. Or, you can use a cloud-based NB option so others can run your work.

Executing R Code¶

IRkernel, an R kernel for Jupyter, allows you to write and execute R code in a Jupyter notebook. Check out the IRkernel documentation for some simple installation instructions. Once IRkernel is installed, open a Jupyter Notebook by calling $ jupyter notebook and use the New dropdown to select an R notebook.

See a full R example Jupyter Notebook here: https://plotly.com/~chelsea_lyn/14069


Additional Embed Features¶

We've seen how to embed Plotly tables and charts as iframes in the notebook. With IPython.display we can embed additional features, such as videos, for example from YouTube.

LaTeX¶

We can embed LaTeX inside a Notebook by putting $$ around our math and then running the cell as a Markdown cell. For example, the source of the cell below is $$c = \sqrt{a^2 + b^2}$$, but the Notebook renders the expression.

Or, you can display output from Python, as seen here.

$\displaystyle F(k) = \int_{-\infty}^{\infty} f(x) e^{2\pi i k} dx$

Exporting & Publishing Notebooks¶

We can export the Notebook as an HTML, PDF, .py, .ipynb, Markdown, or reST file. You can also turn your NB into a slideshow. You can publish Jupyter Notebooks on Plotly: visit plot.ly and select the + Create button in the upper right-hand corner, then select Notebook and upload your Jupyter notebook (.ipynb) file. The notebooks that you upload will be stored in your Plotly organize folder and hosted at a unique link to make sharing quick and easy.

Publishing Dashboards¶

Users publishing interactive graphs can also use Plotly's dashboarding tool to arrange plots with a drag and drop interface. These dashboards can be published, embedded, and shared.

Publishing Dash Apps¶

For users looking to ship and productionize Python apps, Dash is an assemblage of Flask, Socketio, Jinja, Plotly and boilerplate CSS and JS for easily creating data visualization web apps with your Python data analysis backend.

Jupyter Gallery¶

For more Jupyter tutorials, check out Plotly's Python documentation: all documentation is written in Jupyter notebooks that you can download and run yourself. You can also check out these user-submitted examples!


February 2018

Volume 33 Number 2

[Artificially Intelligent]

By Frank La Vigne | February 2018

The Jupyter Notebook is an open source, browser-based tool that allows users to create and share documents that contain live code, visualizations and text. Jupyter Notebooks are not application development environments, per se. Rather, they provide an interactive “scratch pad” where data can be explored and experimented with. They offer a browser-based, interactive shell for various programming languages, such as Python or R, and provide data scientists and engineers a way to quickly experiment with data by providing a platform to share code, observations and visualizations. Notebooks can be run locally on a PC or in the cloud through various services, and are a great way to explore data sets and get a feel for a new programming language. For developers accustomed to a more traditional IDE, however, they can be bewildering at first.

Structure of a Jupyter Notebook

Jupyter Notebooks consist of a series of “cells” arranged in a linear sequence. Each cell can be either text or code. Text cells are formatted in Markdown, a lightweight markup language with plain text formatting syntax. Code cells contain code in the language associated with the particular notebook. Python notebooks can execute only Python code and not R, while an R notebook can execute R and not Python.
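To make the cell structure concrete, here is a minimal sketch of a notebook document built as a Python dictionary and serialized to the JSON text stored in an .ipynb file. It assumes the nbformat 4 layout. The cell contents and kernel names are illustrative, and real notebooks saved by Jupyter carry additional metadata.

```python
import json

# A minimal notebook document following the nbformat 4 layout:
# an ordered list of cells, each tagged as Markdown text or code.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {
        # The kernel determines which language the code cells are run in.
        "kernelspec": {"name": "python3", "display_name": "Python 3", "language": "python"}
    },
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Analysis notes\n", "Text cells are written in *Markdown*."],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print('Hello from a code cell')"],
        },
    ],
}

# Serialize to the JSON text that would be saved in an .ipynb file.
ipynb_text = json.dumps(notebook, indent=1)
cell_types = [cell["cell_type"] for cell in json.loads(ipynb_text)["cells"]]
print(cell_types)  # prints ['markdown', 'code']
```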

Figure 1 shows the IntroToJupyterPython notebook available as a sample file on the Data Science Virtual Machine. Note the language indicator for the notebook in the upper right corner of the browser window, showing that the notebook is attached to the Python 2 runtime. The circle to the right of “Python 2” indicates the current state of the kernel. A filled circle indicates that the kernel is in use and a program is executing. A hollow circle indicates the kernel is idle. Also take note that the main body of the notebook contains text as well as code and a graphed plot.


Figure 1 Tutorial Notebook Introducing the Core Features of a Jupyter Notebook

Creating a Jupyter Notebook in Azure Notebooks

There are several options to run a Jupyter notebook. However, the fastest way to get started is by using the Azure Notebook service, which is in preview mode at the time of this writing. Browse over to notebooks.azure.com and sign in with your Microsoft ID credentials. If prompted, grant the application the permissions it asks for. For first time users, the site will prompt you for a public user ID. This will create a URL to host your profile and to share notebooks. If you do not wish to set this up at this time, click “No Thanks.” Otherwise, enter a value and click Save.

The screen will now show your profile page. The Azure Notebook service stores Jupyter Notebooks in Libraries. In order to create a notebook, first you must create a library. Under Libraries, there’s a button to add a library. Click on it to create a new library. In the dialog box that follows, enter a name for the Library and an ID for the URL to share it. If you wish to make this library public, check the box next to “Public Library.” Checking the box next to “Create a README.md” will automatically insert a README.md file for documentation purposes. Click Create to create a new library.

Now, your profile page will have one library listed. Click on it to bring up the contents of the library. Right now, the only item is the README.md file. Click on the New button to add a new item. In the ensuing dialog, enter a name for the new item and choose Python 3.6 Notebook from the drop down list next to Item Type and click New.

Once the item is created, it will appear in the library with a .IPYNB file extension. Click on it to launch an instance of the Jupyter server in Azure. Note that a new browser tab or window will open and that the interface looks more like the screen in Figure 1. Click inside the text box and enter a line of Python code that prints “Hello World!”.
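A minimal sketch of such a first cell is shown below. The exact code from the original listing is not preserved here, so the variable name is an assumption.

```python
# A first code cell: store the message, then print it.
greeting = "Hello World!"
print(greeting)
```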

Choose Run Cells from the Cell menu at the top of the page. The screen should look like Figure 2.


Figure 2 Hello World! in a Jupyter Notebook

Click inside the blank cell that Jupyter added and choose Cell > Cell Type > Markdown from the menu bar. Then add the following text.

Click the save icon and close the browser tab. Back in the Library window, click on the notebook file again. The page will reload and the markdown formatting will take effect.

Next, add another code cell, by clicking in the markdown cell and choosing Insert Cell Below from the Insert menu. Previously, I stated that only Python code could be executed in a Python notebook. That is not entirely true, as you can use the “!” command to issue shell commands. Enter a shell command that lists the contents of the current directory, prefixed with “!”, into this new cell.
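The “!” prefix only works inside the notebook. As a hedged sketch, the notebook form and a plain-Python equivalent of a directory listing might look like this, assuming a POSIX system where the ls command is available.

```python
import subprocess

# Notebook form (IPython only):  !ls -l
# Plain Python equivalent, assuming a POSIX system where `ls` is available:
result = subprocess.run(["ls", "-l"], capture_output=True, text=True, check=True)
listing = result.stdout
print(listing)
```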

Choose Run Cells from the Run menu or click the icon with the Play/Pause symbol on it. The command returns a listing of the contents of the directory, which contains the notebook file and the README.md file. Once again, Jupyter added a blank cell after the response. Type the following code into the blank cell:
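A sketch of a cell matching the description in the next two paragraphs is shown below. The variable names are assumptions. The magic command is written as a comment so the block also runs as a plain script; uncomment it when working inside a notebook.

```python
# %matplotlib inline   # IPython magic: uncomment when running inside a notebook
import numpy as np
import matplotlib.pyplot as plt

# Two arrays of 100 random numbers each, drawn uniformly from [0, 1).
x = np.random.rand(100)
y = np.random.rand(100)

# Plot the pairs as a scatter plot, then display the graph.
plt.scatter(x, y)
plt.show()
```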

Run the cell and after a moment a scatter plot will appear in the results. For those familiar with Python, the first line of code may look unfamiliar, as it is part of the IPython kernel which executes Python code in a Jupyter notebook. The command %matplotlib inline instructs the IPython runtime to display graphs generated by matplotlib in-line with the results. This type of command, known as a “magic” command, starts with “%”. A full exploration of magic commands warrants its own article. For further information on magic commands, refer to the IPython documentation at bit.ly/2CfiMvh.

For those not familiar with Python, the previous code segment imports two libraries, NumPy and Matplotlib. NumPy is a Python package for scientific computing (numpy.org) and Matplotlib is a popular 2D graph plotting library for Python (matplotlib.org). The code then generates two arrays of 100 random numbers and plots the results as a scatter plot. The final line of code displays the graph. As the numbers are randomly generated, the graph will change slightly each time the cell is executed.

Notebooks in ML Workbench

So far, I have demonstrated running Jupyter notebooks as part of the Azure Notebook service in the cloud. However, Jupyter notebooks can be run locally, as well. In fact, Jupyter notebooks are integrated into the Azure Machine Learning Workbench product. In my previous article, I demonstrated the sample Iris Classification project. While the article did not mention it, there is a notebook included with the project that details all the steps needed to create a model, as well as a 3D plot of the iris dataset.

To view it, open the Iris Classifier sample project from last month’s column, “Creating Models in Azure ML Workbench” (msdn.com/magazine/mt814992). If you did not create the project, follow the directions in the article to create a project from the template. Inside the project, shown in Figure 3, click on the third icon (1) from the top of the vertical toolbar on the left-hand side of the window, then click on iris (2) in the file list.


Figure 3 Viewing Notebooks in an Azure Machine Learning Workbench Project

The notebook file loads, but the notebook server is not running—the results shown are cached from a previous execution. To make the notebook interactive, click on the Start Notebook Server (3) button to activate the local notebook server.

Scroll down to the empty cell immediately following the 3D graph and enter the following code to view the first five records in the iris data set:

Choose Insert Cell Below from the Insert menu and enter the following code into the empty cell to display the correlation matrix:
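The two cells described above can be sketched with pandas as follows. The variable name iris and the column names are assumptions, since the project's own listing is not preserved here. To keep the block self-contained, a six-row stand-in taken from the start of the classic iris data set replaces the project's data, so the coefficients it prints differ from those computed on the full data set.

```python
import pandas as pd

# Tiny stand-in for the project's data: the first six rows of the classic iris
# data set. The variable and column names are assumptions for this sketch.
iris = pd.DataFrame(
    {
        "Sepal Length": [5.1, 4.9, 4.7, 4.6, 5.0, 5.4],
        "Sepal Width": [3.5, 3.0, 3.2, 3.1, 3.6, 3.9],
        "Petal Length": [1.4, 1.4, 1.3, 1.5, 1.4, 1.7],
        "Petal Width": [0.2, 0.2, 0.2, 0.2, 0.2, 0.4],
    }
)

# First cell: view the first five records.
first_five = iris.head(5)
print(first_five)

# Second cell: pairwise correlation coefficients between the numeric columns.
correlations = iris.corr()
print(correlations)
```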

The output should look like Figure 4.


Figure 4 Correlation Matrix for the Iris Data Set

A correlation matrix displays the correlation coefficient between various fields in a data set. A correlation coefficient measures the linear dependence between two variables, with values closer to 1 indicating a positive correlation and values closer to -1 indicating a negative correlation. Values closer to 0 indicate a lack of correlation between the two fields. For example, there’s a strong correlation between Petal Width and Petal Length with a value of 0.962757. On the other hand, the correlation between Sepal Width and Sepal Length is much weaker with a value of -0.109369. Naturally, each field has a 1.0 correlation with itself.
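To show where these coefficients come from, here is a small worked sketch of the Pearson correlation coefficient in plain Python. The input values are made up to produce the three cases described above (close to 1, close to -1, and close to 0); they are not taken from the iris data set.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Made-up values chosen to show the two extremes and a value in between.
base = [1.0, 2.0, 3.0, 4.0]
rising = [2.0, 4.0, 6.0, 8.0]    # moves exactly with base
falling = [8.0, 6.0, 4.0, 2.0]   # moves exactly against base
mixed = [1.0, -1.0, -1.0, 1.0]   # no linear relationship with base

print(pearson(base, rising))
print(pearson(base, falling))
print(pearson(base, mixed))
```

The three printed values are approximately 1, -1 and 0 respectively, up to floating-point rounding.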

Anacondas

Thus far, I’ve only used Jupyter notebooks as part of either a Microsoft cloud service or locally using Microsoft software. However, Jupyter is open source and can run independent of the Microsoft ecosystem. One popular toolset is Anaconda (anaconda.com/download), an open source distribution of Python and R for Windows, Mac and Linux. Jupyter ships as part of this install. Running Jupyter locally initializes a Web server locally on port 8888. Note that, on my system, I can only create a Python 3 notebook as that is the only kernel I have installed on my PC.

Data Science Virtual Machines

Running a Jupyter notebook server locally is ideal for scenarios where Internet access isn’t reliable or guaranteed. For more compute-intensive tasks, it may be wiser to create a virtual machine and run Jupyter on more powerful hardware. To make this task easier, Azure offers the Data Science Virtual Machine image for both Windows and Linux, with the most popular data science tools already installed.

Creating a VM from this image is fast and simple. From the Azure Portal, click on the New icon and search for “Data Science Virtual Machine.” There are several options available. However, I’ve found that the Ubuntu image is the most feature-packed. Choose the Data Science Virtual Machine for Linux (Ubuntu) image and create a virtual machine by following the steps in the wizard. Once the machine is up and running, configure the VM for remote desktop access. Refer to documentation on how to connect to a Linux VM at bit.ly/2qgHOZo.

When connected to the machine, double-click on the Jupyter icon on the desktop. A terminal window will open, followed by a browser window a moment later. When clicking on the New button to create a new notebook, you have quite a few more choices of environments and languages, as demonstrated in Figure 5.


Figure 5 Runtimes Available for the Data Science Virtual Machine for Ubuntu


Along with the various runtime environments, the Data Science Virtual Machine for Ubuntu includes numerous sample notebooks. These notebooks provide guidance on everything from the basics of Azure ML to more advanced topics like CNTK and TensorFlow.

Wrapping Up

Jupyter notebooks are an essential tool for data science work, but they tend to confuse many developers because the platform lacks the basic features needed to develop software. This is by design. Jupyter notebooks are not intended for that task.

What notebooks do is provide a collaborative mechanism where data scientists can explore data sets, experiment with different hypotheses and share observations with colleagues. Jupyter notebooks can run locally on a PC, Mac or Linux. Azure ML Workbench even includes a notebook server embedded into the product for easier experimentation with data. Notebooks can also be run in the cloud as part of a service, such as Azure Notebooks, or on a VM with more capable hardware.

Frank La Vigne leads the Data & Analytics practice at Wintellect and co-hosts the DataDriven podcast. He blogs regularly at FranksWorld.com and you can watch him on his YouTube channel, “Frank’s World TV” (FranksWorld.TV).


Thanks to the following technical experts for reviewing this article: Andy Leonard