Data and Visual Analytics — Course Introduction

Rezana Dowra
Dec 30, 2022

This article serves as my personal notes for the course CSE 6242 Data and Visual Analytics, taken at the Georgia Institute of Technology (Georgia Tech) during Spring 2023.

This course will introduce you to broad classes of techniques and tools for analysing and visualising data at scale.

Its emphasis is on how to complement computation and visualisation to perform effective analysis. We will cover methods from each side, and hybrid ones that combine the best of both worlds.

Course Introduction

This course will give you the tools to analyse and represent data.

Today we have access to large sets of data. However, human beings can only hold approximately 7 ± 2 items in their working memory. Our goal is to condense these large data sets into the valuable, relevant and important things that people can hold in memory.

We need to summarise data in order to understand it.
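To make the idea of condensing data concrete, here is a minimal sketch of my own (not taken from the course material). It reduces a large list of numbers to a five-number summary, which fits comfortably within the 7 ± 2 limit. The data is simulated, and the function name `five_number_summary` is just a name I picked for illustration.

```python
import random
import statistics


def five_number_summary(values):
    """Condense a list of numbers into five values: min, Q1, median, Q3, max."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


# Hypothetical data: 100,000 simulated measurements (an assumption, not course data).
random.seed(0)
data = [random.gauss(50, 10) for _ in range(100_000)]

summary = five_number_summary(data)
for name, value in summary.items():
    print(f"{name:>6}: {value:.1f}")
```

A hundred thousand values become five, which is something a person can actually reason about. Real analysis would of course go further than this, but the principle is the same.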

We achieve this by transforming data into insights, drawing on techniques from two fields: data mining and human-computer interaction (HCI).

Combining data mining and HCI gives us data and visual analytics.

Data mining focuses on automatic techniques, which include clustering and classification.
Since they are automatic, they can easily scale to millions of items. Human-computer interaction helps us understand data in an intuitive way. It focuses on interaction and visualisation techniques.
This course combines these two areas of focus: computation and human intuition.

Why data visual analytics?

  1. The best way to start answering this question is to ask “What is data and visual analytics?” It is an interdisciplinary science that combines computational techniques and interactive visualisation to transform data in ways that help people make important decisions or discoveries. The motivation, then, is the ability to make informed decisions and to discover information in the data.

There are a few challenges worth considering when doing data and visual analytics. These include how to store and retrieve data efficiently, how to scale algorithms and, when working with distributed systems, how to perform testing, visualisation and so on.

  2. More data is being created every day, and this data needs to be processed, especially in fields like medicine, sports, finance and marketing. There is also a demand for people in these careers.

Course goals and expectations

  • Learn visual and computational techniques and use them in a complementary way
  • Gain a breadth of knowledge
  • Learn practical know-how by working on real data and problems.

Course schedule

The course schedule is made up of multiple parts. The parts in green cover data collection, cleaning and integration. A blue section then covers data analytics and visualisation, followed finally by presentation and dissemination.

These are building blocks as opposed to rigid steps. They can be revisited, and some can be skipped, depending on the data and your goals.

The course topics

  1. Course Introduction
  2. Analytics Building Blocks
  3. Data Science Buzzwords
  4. Data Collection
  5. SQLite
  6. Data Cleaning
  7. Code Back-up & Version Control
  8. Data Integration
  9. Data Analytics, Concepts and Tasks
  10. Visualisation 101
  11. Fixing Common Visualisation Issues
  12. Data Visualisation for Web (D3)
  13. Scalable Computing: Hadoop
  14. Scalable Computing: Pig
  15. Scalable Computing: Hive
  16. Scalable Computing: Spark
  17. Scalable Computing: HBase
  18. Classification
  19. Visualisation for Classification
  20. Introduction to Clustering
  21. Graph Analytics
  22. Ensemble Method
  23. Scaling up Algorithms with Virtual Memory
  24. Text Analytics

I will post notes on the other topics as I work through the course, so the list above should gradually become a list of links.

Hope you learned something.

-R
