This article is an extension of the article Hexagonal Binning in which this method of data aggregation is shown. It starts showing how a dataset is represented in scatterplot and concludes showing it with a visual representation using hexagonal bins. The charts shown in the article are all generated using the D3 JavaScript library.

In this article, we will see how to apply various analyzes to a dataset (in CSV format) using only the D3 library. By way of example, we will use two dataset contained in two different CVS files. First we will see how it is possible plotting them as **scatterplots**, and then, once we have performed a **hexagonal** **binning **on them, we will show them in a **hexagonal heatmap **chart.

(In another article the **rectangular binning** is discussed).

For those who are not familiar with the development of charts using JavaScript libraries I suggest to read the book *Beginning JavaScript Charts with jqPlot, D3 and Highcharts*. This book contains many examples (over 250 examples) in which it is explained, step by step, how to develop the most common types of charts using various JavaScript libraries.

These are the two datasets contained in CSV files.

- scatterplot01 : it is a “sparse” dataset
- scatterplot02 : it is a dataset producing a linear trend

To better understand how the two datasets are distributed in a XY plane, let’s implement a web page that is able to produce a scatterplot chart. The code of the web page is below. Once copied into a text editor, save it as ** scatterplot01.html**.

This page reads the content of the ** scatterplot01.csv **file. In order to do the same thing with the second file, replace the file name with the new one, as shown here:

1 2 |
d3.csv("scatterplot02.csv", function(error, data) { |

and save the web page as * scatterplot02.html*.

This is the scatterplot produced with the dataset contained in the * scatterplot02.csv *file.

and this is the scatterplot produced with the dataset contained in the * scatterplot01.csv *file.

As we can see, it is readily apparent that the **scatterplot02** dataset follows a linear trend, while the **scatterplot01** dataset does not show any particular trend or cluster. So, while in the first case a scatterplot representation may be sufficient to perform an analysis on the dataset, in the second case it is necessary to apply a different method of analysis, which also takes account of the density distribution of the points in the space represented by the XY plane. A very efficient method of aggregation to be applied to the dataset “spread” is precisely the **hexagonal binning**.

The D3 library provides us with a plugin which is specialized for this type of analysis: * d3.hexbin.js*. This plugin contains a

*d3.layout*(

*) which is able, once a dataset is provided, to perform a hexagonal binning on it, generating a corresponding data structure. This data structure contains the bins in which the XY plane is divided, each bin is indexed by the*

**d3.hexbin****and**

*i***values, the points that each bin contains, the coordinate of the center of the hexagon and finally the counting of points enclosed in it. Thanks to the values contained in this data structure it is up to the D3 library to convert these data into graphics.**

*j*This is the code of the web page which allows us to perform the hexagonal binning on datasets.

As we can see in Fig.3, where before we had a uniform distribution, it is evident a linear trend marked by a set of dark hexagons that cut diagonally across the XY plane.

If we apply the hexagonal binning to the other dataset, we obtain the following chart:

As you can see, it is almost superfluous to apply this kind of analysis to this type of dataset.