Capybara Team

Riccardo Rorato, Alessandro Arata, Gagandeep Singh, Alessandro Penco, Gabriele Caletti, Alessandro Drago

This lab is about visualizing distributions via histograms and scatter/bubble plots.

Histogram of tree heights

With this histogram we visualize the distribution of tree heights across all species and ages. Given the great variability in tree ages and species, we expected a less regular distribution; instead we see a Gaussian-like shape (though slightly skewed), as if we were looking at trees of a single species and age. One tree is very tall, which forces the upper bound of the x axis up to 60.
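As a rough illustration of how a single very tall tree stretches the axis, the sketch below counts heights into fixed-width bins over the range 0 to 60. The function name `histogram_counts`, the bin width of 5, and the sample heights are assumptions made for this example; they are not taken from the lab's dataset or code.

```python
def histogram_counts(values, lower, upper, bin_width):
    """Count values into equal-width bins covering [lower, upper]."""
    n_bins = int((upper - lower) / bin_width)
    counts = [0] * n_bins
    for v in values:
        if lower <= v <= upper:
            # A value equal to `upper` is placed in the last bin.
            idx = min(int((v - lower) / bin_width), n_bins - 1)
            counts[idx] += 1
    return counts


# Hypothetical heights: most trees are short, one is very tall.
heights = [3.0, 7.5, 8.0, 9.0, 11.0, 12.0, 14.5, 58.0]
counts = histogram_counts(heights, 0, 60, 5)
print(counts)
```

With data like this, most bins between the bulk of the trees and the tallest one stay empty, which is the effect described above.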

Boxplot of tree heights

A boxplot is the right visualization to show summary statistics of the distribution: for example the median, selected percentiles, and possibly the outliers. The whiskers extend by 3/2 of the box length (the interquartile range) beyond the box minimum and maximum, following the convention described on Wikipedia.
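The whisker rule can be sketched in a few lines. This is a minimal example, not the lab's actual code: the function name `boxplot_stats`, the sample heights, and the choice of the "inclusive" quantile method are assumptions. It also follows the common variant in which each whisker is drawn at the most extreme data point that still lies within 1.5 box lengths of the box, with anything beyond treated as an outlier.

```python
import statistics


def boxplot_stats(values, k=1.5):
    """Return quartiles, whisker ends and outliers for a list of numbers."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # the "box length"
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical heights with one very tall tree.
heights = [4.0, 6.5, 7.0, 8.0, 9.5, 10.0, 11.0, 12.5, 14.0, 60.0]
stats = boxplot_stats(heights)
print(stats)
```

Different quantile definitions give slightly different box edges, so the exact whisker positions depend on the plotting library used.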

Scatter plot

We chose to show CO2 absorption on the X axis against tree height on the Y axis. We can see an intuitive correlation between the two (taller/bigger trees absorb more CO2), but it is less pronounced for some species (encoded by color).

Multiple scatter plots

We show the same data as in the plot above, separated by species for the 6 most frequent species. We also plot the regression line between the two variables: Tilia europaea appears to be the most "efficient" at absorbing CO2 relative to its height.
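A per-species regression line of this kind can be computed with ordinary least squares. The sketch below is only illustrative: the helper names `fit_line` and `fit_by_species`, the species labels, and the numbers are all made up, and the lab's plotting library may fit its lines differently. It keeps the same axes as the plots (CO2 absorption as x, height as y).

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_by_species(rows):
    """Group (species, co2, height) rows and fit one line per species."""
    groups = defaultdict(lambda: ([], []))
    for species, co2, height in rows:
        groups[species][0].append(co2)
        groups[species][1].append(height)
    return {s: fit_line(xs, ys) for s, (xs, ys) in groups.items()}


# Hypothetical rows: (species, CO2 absorption, height).
rows = [
    ("species_a", 10.0, 5.0), ("species_a", 20.0, 10.0), ("species_a", 30.0, 15.0),
    ("species_b", 10.0, 8.0), ("species_b", 20.0, 16.0), ("species_b", 30.0, 24.0),
]
lines = fit_by_species(rows)
for species, (slope, intercept) in lines.items():
    print(species, slope, intercept)
```

Comparing the fitted slopes across species is one way to quantify what the plots show visually.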

Bubble chart

Again a scatter plot, where we show one more variable (the canopy cover of each tree) by encoding it as the bubble size. Platanus hispanica has the largest canopies (and they also increase the most with height).
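One common way to encode a value as bubble size is to make the bubble area, rather than its radius, proportional to the value, so that large values are not visually exaggerated. The sketch below assumes that convention; the text does not say which scaling the chart uses, and the function name `bubble_radius`, the maximum radius of 20, and the canopy values are hypothetical.

```python
import math


def bubble_radius(value, max_value, max_radius=20.0):
    """Radius such that the bubble area is proportional to value."""
    if value <= 0 or max_value <= 0:
        return 0.0
    return max_radius * math.sqrt(value / max_value)


# Hypothetical canopy cover values.
canopies = [25.0, 50.0, 100.0]
radii = [bubble_radius(c, max(canopies)) for c in canopies]
print(radii)
```

With this scaling, a tree with a quarter of the largest canopy gets half the largest radius, which corresponds to a quarter of the largest area.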