Skip to content

Working with data

Working with Data

Jupyter notebooks are famous for their convenience to explore, experiment and visualize. All these steps include working with data, including data manipulations and looking on data, tables and charts.

Workspace has several popular python data packages including numpy, pandas, and matplotlib. To start with, copy tutorial notebooks into the project forlder.

cp -r /home/examples/tutorials /home/project

Pandas

Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.

Workspace contains basic pandas tutorial notebook. Open in Jupyter tutorials/pandas.ipynb and follow the steps

Pandas tutorial

Explore entire pandas documentation

Pandas SQL

Pandas is great for complex data transformations, but some queries on data frames can be much easier with the standard SQL.

Workspace has Fugue SQL which allows querying and transforming pandas data frames.

FugueSQL integrates with Python by creating extensions and applying them in the %%fsql cells. Open in Jupyter tutorials/sql-pandas.ipynb and run with examples cells one-byone, following the instructions

Pandas fugue sql

Charts

Plotting charts - is one of the most typical steps in experiments, analytics and data science.

Workspace has several popular plotting packages installed, which you can use to visualise trends and patterns in your data:

Workspace includes tutorial notebook tutorials/charts.ipynb, open and follow the instructions

Pandas charts