May 15, 2018 (Vol. 38, No. 10)
URL:
http://distill.pub/2016/misread-tsne
Rating:
Strong Points: Nice interactive graphics, simple design
Weak Points: None
Summary:
Biological research articles, especially those reporting high-dimensional single-cell RNA-sequencing datasets, increasingly include “t-SNE plots,” which convert multidimensional datasets into lower-dimensional representations based on similarities in the data. For example, data points corresponding to different “cell types” are often depicted as distinct clusters in a 2D plot. While these plots can be very useful, they come with many caveats that are often unknown to the casual observer. Enter the article “How to Use t-SNE Effectively” on Distill.pub (a website that publishes clear explanations of machine learning concepts). The article uses interactive simulations to teach site visitors how altering different parameters (e.g., points per cluster, dimensions, and “perplexity”) changes the output of the t-SNE dimensionality-reduction algorithm. A clear text explanation and embedded illustrations drive home the main points.
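
For readers curious about what “perplexity” measures, here is a minimal sketch in plain Python. It is not taken from the Distill article; the function names and the toy data are assumptions made for illustration. It relies on the standard definition of perplexity as 2 raised to the entropy of a point's Gaussian neighbor distribution, which can be read loosely as the effective number of neighbors that point "sees." In t-SNE itself the user picks the perplexity and the algorithm searches for a matching bandwidth for each point; the sketch runs the relationship the other way, from bandwidth to perplexity.

```python
import math


def conditional_probs(points, i, sigma):
    """Gaussian neighbor probabilities p(j|i) of every other point relative to point i.

    `sigma` is the bandwidth of the Gaussian centered on point i.
    """
    xi = points[i]
    weights = []
    for j, xj in enumerate(points):
        if j == i:
            continue  # a point is not its own neighbor
        sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    return [w / total for w in weights]


def perplexity(probs):
    """Perplexity = 2 ** (Shannon entropy in bits) of a probability distribution."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2.0 ** entropy


# Hypothetical toy data: two tight groups of five points on a line.
points = [(0.0,), (0.1,), (0.2,), (0.3,), (0.4,),
          (10.0,), (10.1,), (10.2,), (10.3,), (10.4,)]

# A wider bandwidth lets point 0 "see" more neighbors, so perplexity rises.
for sigma in (0.05, 0.5, 50.0):
    probs = conditional_probs(points, 0, sigma)
    print(f"sigma={sigma}: perplexity={perplexity(probs):.2f}")
```

With a very narrow bandwidth the first point effectively has a single neighbor, so its perplexity is close to 1. With a moderate bandwidth it sees roughly the four other members of its own group. With a very wide bandwidth it weights all nine other points almost equally. This is one way to see why the same dataset can look so different at different perplexity settings.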