
Stratified sampling is an old and important technique:

  • Donald Knuth (high priest of computer science) uses it for evaluating the work of his PhD students, and for teaching his deeply and sincerely held religious beliefs. (link)
  • Royal Society article from 1934 on the topic. (link)

The R interface to h2o.ai has a function for splitting frames, "h2o.splitFrame". Is there a way to make a stratified split along the distinct elements of another column?

Here are R packages that do not do this in h2o:

EngrStudent

1 Answer


You don't need to apply stratified sampling before model training, since h2o.ai offers several values for the fold_assignment parameter, including "Stratified". With that setting, stratification is applied when the cross-validation folds are built during training, so you only need to set fold_assignment (together with nfolds). Alternatively, fold_column lets you supply your own fold ids in a column of the frame. You can find the details in the link below. http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/fold_assignment.html?highlight=stratified#example
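To make the idea concrete, here is a minimal sketch in plain Python of how stratified fold ids can be built from the distinct values of a column. It does not use any h2o call: the function name stratified_fold_ids and its arguments are hypothetical, chosen only for illustration, and h2o's own "Stratified" implementation may well differ.

```python
import random
from collections import defaultdict


def stratified_fold_ids(strata, nfolds, seed=0):
    """Return one fold id (0..nfolds-1) per row, balanced within each stratum.

    `strata` is the list of values from the column to stratify on.
    This helper is hypothetical and is not part of h2o.
    """
    rng = random.Random(seed)

    # Group row indices by the distinct values of the stratifying column.
    rows_by_stratum = defaultdict(list)
    for row_index, label in enumerate(strata):
        rows_by_stratum[label].append(row_index)

    fold_ids = [None] * len(strata)
    for label in sorted(rows_by_stratum, key=str):
        rows = rows_by_stratum[label]
        rng.shuffle(rows)
        # Deal the shuffled rows of this stratum round-robin across the folds,
        # so each fold gets a near-equal share of every stratum.
        for position, row_index in enumerate(rows):
            fold_ids[row_index] = position % nfolds
    return fold_ids


# Example: six rows of stratum "a" and three rows of stratum "b", three folds.
strata = ["a"] * 6 + ["b"] * 3
folds = stratified_fold_ids(strata, nfolds=3)
```

A list like this could in principle be added to the frame as its own column and named in fold_column, which would give stratification along a column other than the response. Treat that as an assumption to check against the h2o documentation rather than a confirmed recipe.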

dogankadriye