High-dimensional data in phylogenetics

Last updated on Oct 19, 2020

What is high-dimensional data?

High-dimensionality describes data sets with a very large number of variables, often more variables than there are samples.

For example, this happens when the number of genes affecting a phenotype is higher than the number of people the genetic data was drawn from.

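This is a statistical problem: with more variables than samples, there are too few observations to estimate the effect of every variable.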
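One way to see the problem is through the rank of the data matrix. The sketch below is only an illustration, not a method from any of the linked sources: it builds a random matrix with more variables than samples (the sizes 5 and 12 are made up) and checks that its rank cannot exceed the number of samples. The cross-product matrix X^T X, which ordinary least squares would need to invert, is then singular.

```python
from fractions import Fraction
import random


def matrix_rank(rows):
    """Rank of a matrix via exact Gaussian elimination on Fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every row below the pivot row.
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            if factor != 0:
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == n_rows:
            break
    return rank


def gram_matrix(rows):
    """Return X^T X for a data matrix X given as a list of rows."""
    cols = list(zip(*rows))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


random.seed(1)
n_samples, n_variables = 5, 12  # hypothetical sizes with more variables than samples
X = [[random.randint(-9, 9) for _ in range(n_variables)] for _ in range(n_samples)]

rank_X = matrix_rank(X)
rank_gram = matrix_rank(gram_matrix(X))  # X^T X is n_variables x n_variables
print(rank_X, rank_gram, n_variables)
```

Because the rank is bounded by the number of samples, the 12 x 12 cross-product matrix here can never reach full rank, whatever values the data take.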

Check these: https://www.statisticshowto.com/dimensionality/#:~:text=High%20Dimensional%20means%20that%20the,tens%20of%20hundreds%20of%20samples.

From the previous post: «Each added variable results in an exponential decrease in predictive power.»

Book: https://www.springer.com/gp/book/9783642201912

When is high-dimensionality a problem in phylogenetics?

In phylogenetics, the sample size is usually taken to be the number of tips on the phylogeny. Should we also be using the number of nodes?
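To make the two candidate sample sizes concrete, here is a minimal sketch that counts tips and internal nodes. It assumes a tree written as a Newick string whose labels contain no commas or parentheses; the example tree and the function name are hypothetical. For a rooted, fully bifurcating tree with n tips there are n - 1 internal nodes, so counting nodes as well would roughly double the number.

```python
def count_tips_and_internal_nodes(newick):
    """Count tips and internal nodes in a Newick string.

    Assumes labels contain no commas or parentheses (no quoted labels).
    """
    n_internal = newick.count("(")  # every "(" opens one internal node
    n_tips = newick.count(",") + 1  # a node with k children adds k - 1 commas
    return n_tips, n_internal


# Hypothetical rooted, fully bifurcating tree with five tips.
tree = "((A,B),(C,(D,E)));"
n_tips, n_internal = count_tips_and_internal_nodes(tree)
print(n_tips, n_internal)  # 5 4
```

Whether internal nodes should count toward the sample size remains the open question; the sketch only shows how the two numbers relate.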

Luna L. Sánchez Reyes

Postdoctoral Research Scholar
University of California, Merced
