Data Visualization: A Complete Free Course (17 Hours of Content)

Why Data Visualization Matters

In a world drowning in data, the ability to present information visually is one of the most valuable skills across every industry. A table of 10,000 sales records tells you little at a glance. A line chart of monthly revenue with a trend line tells an immediate story.

Effective data visualization:

  • Reveals patterns and outliers that tables hide
  • Communicates findings to non-technical stakeholders
  • Supports faster, better decision-making
  • Makes presentations memorable and persuasive

This course takes you from the foundational principles through practical implementation in multiple tools and programming libraries.

Part 1: Principles of Effective Data Visualization

The Grammar of Graphics

Leland Wilkinson's "The Grammar of Graphics" (1999) is the conceptual foundation of most modern visualization tools (ggplot2, Vega-Lite, D3.js). It decomposes a visualization into components that can be combined independently:

  • Data: The underlying dataset
  • Aesthetics: How data attributes map to visual properties (position, color, size, shape)
  • Geometries: The visual marks (bars, lines, points, areas)
  • Statistics: Transformations applied to data (binning, smoothing, summarizing)
  • Scales: How data values map to visual values (linear, logarithmic, ordinal)
  • Coordinate system: Cartesian, polar, geographic
  • Facets: Small multiples (the same plot for different subsets)

Understanding this grammar lets you reason about visualizations systematically rather than memorizing chart types.
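
To make the components concrete, the sketch below writes out a single chart as a Vega-Lite-style specification built from a plain Python dictionary. The records, the field names (region, quarter, sales) and the choice of a bar chart are invented for illustration; only the overall layout follows Vega-Lite's JSON format. The coordinate system is left implicit, on the assumption that the default Cartesian system is wanted.

```python
import json

# Hypothetical sales records; field names and values are made up for illustration.
records = [
    {"region": "North", "quarter": "Q1", "sales": 120},
    {"region": "North", "quarter": "Q2", "sales": 150},
    {"region": "South", "quarter": "Q1", "sales": 90},
    {"region": "South", "quarter": "Q2", "sales": 110},
]

spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": records},                  # Data
    "mark": "bar",                                # Geometry
    "encoding": {                                 # Aesthetics
        "x": {"field": "quarter", "type": "ordinal"},
        "y": {
            "field": "sales",
            "type": "quantitative",
            "aggregate": "sum",                   # Statistics
            "scale": {"type": "linear"},          # Scale
        },
        "color": {"field": "region", "type": "nominal"},
        "column": {"field": "region", "type": "nominal"},  # Facets
    },
}

# Serialize to JSON so the spec can be pasted into a Vega-Lite editor.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Swapping "bar" for "line", or the linear scale for a logarithmic one, changes one entry and leaves the rest of the description intact, which is the practical benefit of thinking in components.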

Choosing the Right Chart Type

The choice of chart type should be driven by the relationship you want to show:

Comparison (How do things compare?)

  • Bar chart: Comparing categorical values (sales by region)
  • Grouped bar: Comparing categories across groups
  • Bullet chart: Comparing against a target

Trend over time (How does something change?)

  • Line chart: Continuous data over time (stock price, temperature)
  • Area chart: Magnitude over time; stacked areas show how parts add up to a total
  • Calendar heatmap: Daily values over months/years

Part-to-whole (What is the composition?)

  • Pie/donut chart: A few categories (use sparingly — humans are bad at comparing angles)
  • Treemap: Many hierarchical categories
  • Stacked bar: Composition that also shows comparison

Distribution (How is data spread?)

  • Histogram: Distribution of a continuous variable
  • Box plot: Median, quartiles, outliers
  • Violin plot: Distribution shape more detailed than box plot
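
As a rough illustration of what a box plot summarizes, the sketch below computes the quartiles and flags outliers using only the standard library. The 1.5 × IQR rule for outliers is a common convention assumed here, not something this course specifies, and quartile values can differ slightly between libraries because several interpolation methods exist. The sample data is made up.

```python
import statistics

def box_plot_stats(values):
    """Return the numbers a box plot draws: quartiles and outliers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}

# Made-up sample with one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
stats = box_plot_stats(sample)
print(stats)
```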

Correlation (Is there a relationship?)

  • Scatter plot: Relationship between two continuous variables
  • Bubble chart: Three continuous variables
  • Heat map (correlation matrix): Many pairwise relationships

Geospatial (Where does something occur?)

  • Choropleth map: Values by geographic region
  • Dot map: Individual data points on a map
  • Flow map: Movement between locations
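
The mapping above can be kept at hand as a small lookup table. The sketch below is only a convenience wrapper around the lists in this section; the function name suggest_charts and the dictionary keys are hypothetical.

```python
# Relationship -> candidate chart types, copied from the lists above.
CHARTS_BY_RELATIONSHIP = {
    "comparison": ["bar chart", "grouped bar", "bullet chart"],
    "trend": ["line chart", "area chart", "calendar heatmap"],
    "part-to-whole": ["pie/donut chart", "treemap", "stacked bar"],
    "distribution": ["histogram", "box plot", "violin plot"],
    "correlation": ["scatter plot", "bubble chart", "heat map"],
    "geospatial": ["choropleth map", "dot map", "flow map"],
}

def suggest_charts(relationship):
    """Hypothetical helper: return candidate chart types for a relationship."""
    key = relationship.strip().lower()
    if key not in CHARTS_BY_RELATIONSHIP:
        raise ValueError(f"Unknown relationship: {relationship!r}")
    return CHARTS_BY_RELATIONSHIP[key]

print(suggest_charts("Distribution"))
```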

Design Principles

Data-ink ratio (Edward Tufte): Maximize the ratio of ink that represents data vs. total ink. Remove gridlines, borders, 3D effects, and decorative elements that don't add information.

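Lie factor (Edward Tufte): The visual change in a chart should be proportional to the actual change in the data. Bar charts should start at zero, because a truncated axis exaggerates differences. Scaling both the width and the height of a shape to show a 2× change makes its area 4× larger, so the change looks far bigger than it is.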
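
Tufte defines the lie factor as the size of the effect shown in the graphic divided by the size of the effect in the data, where each effect is a relative change. The sketch below applies that definition to three cases; the numbers are invented, and the function names are hypothetical.

```python
def effect_size(first, second):
    """Relative change going from first to second."""
    return (second - first) / first

def lie_factor(data_first, data_second, shown_first, shown_second):
    """Effect shown in the graphic divided by the effect in the data."""
    return effect_size(shown_first, shown_second) / effect_size(data_first, data_second)

# Data doubles (10 -> 20) and a square icon has its side doubled (area 1 -> 4).
icon = lie_factor(10, 20, shown_first=1, shown_second=4)

# Same data drawn as bars from a zero baseline (height 10 -> 20).
honest_bars = lie_factor(10, 20, shown_first=10, shown_second=20)

# Data rises 10 -> 12, but the axis starts at 8, so bar heights go 2 -> 4.
truncated_axis = lie_factor(10, 12, shown_first=2, shown_second=4)

print(icon, honest_bars, truncated_axis)
```

A value of 1 means the graphic is proportional to the data; values well above 1 mean the graphic overstates the change.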

Pre-attentive attributes: Our visual system instantly processes certain attributes before conscious attention: color hue, color saturation, size, shape, orientation, and position. Use these strategically to direct attention.

Gestalt principles explain how we group visual elements:

  • Proximity: Elements close together appear related
  • Similarity: Elements that look alike appear related
  • Continuity: We follow lines and curves
  • Closure: We complete incomplete shapes

Part 2: Color in Data Visualization

Color is powerful but frequently misused.

Sequential vs. Diverging vs. Categorical Palettes

Sequential: One hue, or a few neighboring hues, running from light to dark, for ordered data (temperatures, income levels):

Light yellow → Orange → Dark brown

Diverging: Two hues from a neutral midpoint — for data with a meaningful center (profit/loss, temperature above/below freezing):

Blue → White/Gray → Red

Categorical/Qualitative: Distinct hues — for unordered categories (countries, product lines):

Blue, Orange, Green, Red, Purple, Brown, Pink, Gray

The ColorBrewer palette system (colorbrewer2.org) provides research-backed palettes optimized for maps and charts.
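
The structural difference between a sequential and a diverging palette can be sketched with simple color interpolation. This is a simplification: straight-line interpolation in RGB is not perceptually uniform, so curated palettes such as ColorBrewer's remain the better choice in practice. The hex values and function names below are illustrative assumptions.

```python
def hex_to_rgb(hex_color):
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

def rgb_to_hex(rgb):
    return "#{:02x}{:02x}{:02x}".format(*rgb)

def interpolate(start, end, n):
    """Return n colors evenly spaced between start and end (n >= 2)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    colors = []
    for i in range(n):
        t = i / (n - 1)
        colors.append(rgb_to_hex(tuple(round(x + (y - x) * t) for x, y in zip(a, b))))
    return colors

def diverging(low, mid, high, n_per_side):
    """Two ramps that meet at a shared neutral midpoint."""
    left = interpolate(low, mid, n_per_side + 1)
    right = interpolate(mid, high, n_per_side + 1)
    return left + right[1:]  # drop the duplicated midpoint

# Illustrative endpoints: light yellow to dark brown, and blue / near-white / red.
sequential_palette = interpolate("#ffffcc", "#662506", 5)
diverging_palette = diverging("#2166ac", "#f7f7f7", "#b2182b", 3)

print(sequential_palette)
print(diverging_palette)
```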

Accessibility

Roughly 8% of men and about 0.5% of women have some form of color vision deficiency. Always:

  • Choose palettes whose colors remain distinguishable in grayscale
  • Don't rely on color alone to encode information; add labels, patterns, or icons
  • Test with colorblindness simulators (Coblis, Colour Contrast Analyser)
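
One way to approximate the grayscale check is to compare the relative luminance of each pair of palette colors. The sketch below uses the WCAG 2.x luminance and contrast-ratio formulas; the minimum ratio of 1.5 is an arbitrary threshold assumed for illustration, and the helper names and example palette are hypothetical. It does not replace a proper colorblindness simulator.

```python
from itertools import combinations

def relative_luminance(hex_color):
    """WCAG 2.x relative luminance of an sRGB color given as '#rrggbb'."""
    h = hex_color.lstrip("#")
    channels = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio, from 1 (identical luminance) to 21 (black on white)."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)

def hard_to_tell_apart(palette, min_ratio=1.5):
    """Pairs whose luminance contrast falls below min_ratio (assumed threshold)."""
    return [
        (a, b)
        for a, b in combinations(palette, 2)
        if contrast_ratio(a, b) < min_ratio
    ]

# Illustrative palette; replace with the colors used in your chart.
palette = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]
for pair in hard_to_tell_apart(palette):
    print("Similar in grayscale:", pair)
```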

Part 3: Chart.js — Web Charts Made Easy

Chart.js is a beginner-friendly JavaScript library that produces beautiful, responsive charts with minimal code:

<canvas id="myChart"></canvas>
<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
<script>
const ctx = document.getElementById('myChart');
new Chart(ctx, {
  type: 'bar',
  data: {
    labels: ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'],
    datasets: [{
      label: 'Monthly Revenue ($K)',
      data: [65, 78, 52, 91, 84, 103],
      backgroundColor: 'rgba(99, 102, 241, 0.8)',
      borderColor: 'rgb(99, 102, 241)',
      borderWidth: 1,
    }],
  },
  options: {
    responsive: true,
    plugins: {
      legend: { position: 'top' },
      title: { display: true, text: '2024 Revenue by Month' },
    },
    scales: {
      y: { beginAtZero: true },
    },
  },
});
</script>

For line charts with multiple series (render this on its own canvas, since a canvas can hold only one Chart.js chart at a time):

new Chart(ctx, {
  type: 'line',
  data: {
    labels: ['Q1', 'Q2', 'Q3', 'Q4'],
    datasets: [
      {
        label: 'Product A',
        data: [120, 190, 150, 210],
        borderColor: '#6366F1',
        fill: false,
        tension: 0.4,  // Smooth curves
      },
      {
        label: 'Product B',
        data: [85, 110, 130, 160],
        borderColor: '#F59E0B',
        fill: false,
        tension: 0.4,
      },
    ],
  },
});

Part 4: D3.js — The Power Tool

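D3 (Data-Driven Documents) is a low-level JavaScript visualization library. It binds data to DOM elements and provides tools to transform and visualize that data. The learning curve is steep, but the low-level control lets you build highly customized visualizations that chart-type libraries do not offer out of the box. The example below assumes D3 is already loaded and the page contains an element with id "chart":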

// Classic bar chart with D3
const data = [{ label: 'A', value: 30 }, { label: 'B', value: 80 },
              { label: 'C', value: 45 }, { label: 'D', value: 60 }];

const svg = d3.select('#chart')
  .append('svg')
  .attr('width', 500)
  .attr('height', 300);

const x = d3.scaleBand()
  .domain(data.map(d => d.label))
  .range([40, 480])
  .padding(0.2);

const y = d3.scaleLinear()
  .domain([0, d3.max(data, d => d.value)])
  .range([260, 20]);

// Draw bars
svg.selectAll('.bar')
  .data(data)
  .join('rect')
    .attr('class', 'bar')
    .attr('x', d => x(d.label))
    .attr('y', d => y(d.value))
    .attr('width', x.bandwidth())
    .attr('height', d => 260 - y(d.value))
    .attr('fill', '#6366F1');

// Add axes
svg.append('g')
  .attr('transform', 'translate(0, 260)')
  .call(d3.axisBottom(x));

svg.append('g')
  .attr('transform', 'translate(40, 0)')
  .call(d3.axisLeft(y));
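
The y scale in the example above is the part that most often confuses newcomers, because its range runs from 260 down to 20. The sketch below mirrors what a linear scale computes, written in Python for consistency with the later sections; linear_scale is a hypothetical helper, not part of D3, and the domain maximum of 80 is taken from the sample data.

```python
def linear_scale(domain, output_range):
    """Return a function mapping values from domain to output_range linearly."""
    d0, d1 = domain
    r0, r1 = output_range

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    return scale

# Same numbers as the D3 example: data 0..80 maps to pixels 260..20.
y = linear_scale(domain=(0, 80), output_range=(260, 20))

print(y(0), y(40), y(80))

# Bar height is measured from the baseline at pixel 260.
tallest_bar_height = 260 - y(80)
print(tallest_bar_height)
```

The range is inverted because SVG's y coordinate grows downward: the largest data value has to map to the smallest pixel value so that tall bars reach toward the top of the chart.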

Part 5: Python Data Visualization

Python offers multiple visualization libraries:

Matplotlib — the foundational library:

import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Bar chart
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
revenue = [65, 78, 52, 91, 84, 103]
axes[0].bar(months, revenue, color='#6366F1', alpha=0.8)
axes[0].set_title('Monthly Revenue')
axes[0].set_ylabel('Revenue ($K)')

# Scatter plot (seeded so the figure is reproducible)
rng = np.random.default_rng(42)
x = rng.standard_normal(100)
y = 2 * x + rng.standard_normal(100) * 0.5
axes[1].scatter(x, y, alpha=0.5, color='#F59E0B')
axes[1].set_title('Correlation Plot')

plt.tight_layout()
plt.savefig('charts.png', dpi=150, bbox_inches='tight')
plt.show()

Seaborn — statistical visualization built on Matplotlib:

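import matplotlib.pyplot as plt
import seaborn as sns

# Load sample data (fetched from the seaborn example datasets on first use)
tips = sns.load_dataset('tips')

# One axes per chart; without explicit axes, all three plots
# would be drawn on top of each other in the same figure
fig, axes = plt.subplots(1, 3, figsize=(18, 5))

# Distribution
sns.histplot(data=tips, x='total_bill', hue='day', kde=True, ax=axes[0])

# Box plot
sns.boxplot(data=tips, x='day', y='total_bill', hue='sex', ax=axes[1])

# Correlation heatmap
corr = tips.select_dtypes(include='number').corr()
sns.heatmap(corr, annot=True, cmap='RdBu_r', center=0, ax=axes[2])

plt.tight_layout()
plt.show()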

Plotly — interactive visualizations for web and Jupyter:

import plotly.express as px

df = px.data.gapminder()

fig = px.scatter(
    df.query("year==2007"),
    x="gdpPercap", y="lifeExp",
    size="pop", color="continent",
    hover_name="country",
    log_x=True,
    title="GDP vs Life Expectancy (2007)",
)
fig.show()  # Interactive in browser

Building an Interactive Dashboard

A complete dashboard combines multiple charts with filters:

  • Filter controls: Date pickers, dropdowns, sliders
  • Summary KPIs: Big number cards at the top
  • Trend charts: Time series showing the main metrics
  • Breakdown charts: Bar or treemap showing composition
  • Detail table: Filterable grid for drill-down
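
Behind the charts, a dashboard mostly filters records and recomputes a few aggregates whenever a control changes. The sketch below shows that data layer with the standard library only; the Sale record, the sample rows and the function names are all hypothetical, and a real dashboard would connect these functions to the widgets of whichever framework it uses.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Sale:
    day: date
    region: str
    amount: float

# Made-up records for illustration.
SALES = [
    Sale(date(2024, 1, 5), "North", 120.0),
    Sale(date(2024, 1, 20), "South", 80.0),
    Sale(date(2024, 2, 3), "North", 200.0),
    Sale(date(2024, 2, 14), "South", 50.0),
]

def apply_filters(sales, start, end, region=None):
    """Filter controls: keep rows in the date range and, optionally, one region."""
    return [
        s for s in sales
        if start <= s.day <= end and (region is None or s.region == region)
    ]

def kpis(sales):
    """Summary KPIs: the big numbers shown at the top."""
    total = sum(s.amount for s in sales)
    average = total / len(sales) if sales else 0.0
    return {"total": total, "orders": len(sales), "average": average}

def breakdown_by_region(sales):
    """Breakdown chart data: total amount per region."""
    totals = {}
    for s in sales:
        totals[s.region] = totals.get(s.region, 0.0) + s.amount
    return totals

# Simulate the user picking January in a date picker.
selected = apply_filters(SALES, date(2024, 1, 1), date(2024, 1, 31))
january_kpis = kpis(selected)
january_breakdown = breakdown_by_region(selected)
print(january_kpis)
print(january_breakdown)
```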

Tools for production dashboards:

  • Plotly Dash (Python): Full web app framework built on Plotly
  • Streamlit (Python): Turns a Python script into a shareable web app with very little code
  • Observable (JavaScript/D3): Notebook environment for web-native visualizations
  • Metabase / Superset: Open-source BI tools for non-programmers