Add common preprocessing functions

sklearn and pandas have some preprocessing functions which we use a lot. However, these preprocessors aren't aware of the semantic types of each variable. We want to apply preprocessing only where it makes sense, and we can do this within Dataset since each column is tagged with a type. By default, each preprocessing function should only apply to a certain type of feature, but we should allow for flexibility if people want to ignore the feature types.
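A minimal sketch of the type-aware default with an explicit override, assuming the type tags can be read as a plain column-to-type mapping. `FeatureType`, `apply_to_type`, and the `columns` argument are hypothetical names for illustration, not the existing Dataset API.

```python
from enum import Enum

import pandas as pd


class FeatureType(Enum):
    # Hypothetical tags; the real Dataset may name or store these differently.
    CATEGORICAL = "categorical"
    NUMERICAL = "numerical"


def apply_to_type(df, types, wanted, func, columns=None):
    """Apply func to every column tagged `wanted`.

    Passing `columns` explicitly bypasses the type check, which is the
    flexibility described above for callers who want to ignore feature types.
    """
    if columns is None:
        columns = [c for c in df.columns if types.get(c) == wanted]
    out = df.copy()
    for col in columns:
        out[col] = func(out[col])
    return out


df = pd.DataFrame({"age": [20.0, 30.0, 40.0], "city": ["a", "b", "a"]})
types = {"age": FeatureType.NUMERICAL, "city": FeatureType.CATEGORICAL}

# Default: only the NUMERICAL column is touched.
centered = apply_to_type(df, types, FeatureType.NUMERICAL, lambda s: s - s.mean())

# Override: the caller names the columns and the tags are ignored.
upper = apply_to_type(
    df, types, FeatureType.NUMERICAL, lambda s: s.str.upper(), columns=["city"]
)
```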

Some must-haves to implement first (names can be changed):

make_one_hot - converts CATEGORICAL features to binary indicators. If a categorical feature has k categories, this should be converted to k or k-1 binary indicators. This should gracefully handle missing data. For features with a large k, we might want to include only the top 10 or 100 categories, which is useful for long-tailed features.
impute_{mean,median} - converts missing data in NUMERICAL features to the mean or median, and optionally adds an extra binary column to mark where missing values were imputed. There is no need to impute non-numerical features, as these can just be treated as a separate category.
center_and_rescale - rescales INTERVAL data to have mean 0 (centering) and variance 1 (rescaling). For RATIO data, only performs the scaling step.
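The three must-haves could be sketched roughly as below, operating on single pandas columns. The signatures, the `top_k`, `drop_first`, `add_indicator`, and `center` parameters, the `__other__` placeholder level, and the `_was_missing` column suffix are all assumptions made for illustration. The type check itself is left out and would live in Dataset.

```python
import pandas as pd


def make_one_hot(series, top_k=None, drop_first=False):
    """One-hot encode a categorical column.

    Missing values get their own indicator column (dummy_na=True) instead of
    being dropped. With top_k set, less frequent categories are collapsed
    into a single "__other__" level.
    """
    s = series.astype(object)
    if top_k is not None:
        keep = s.value_counts().index[:top_k]  # value_counts ignores NaN
        s = s.where(s.isin(keep) | s.isna(), "__other__")
    return pd.get_dummies(s, prefix=series.name, dummy_na=True, drop_first=drop_first)


def impute(series, strategy="mean", add_indicator=False):
    """Fill missing numerical values with the column mean or median."""
    if strategy not in ("mean", "median"):
        raise ValueError("strategy must be 'mean' or 'median'")
    fill = series.mean() if strategy == "mean" else series.median()
    out = series.fillna(fill).to_frame()
    if add_indicator:
        # Marks the rows whose value was imputed.
        out[f"{series.name}_was_missing"] = series.isna()
    return out


def center_and_rescale(series, center=True):
    """Scale to unit variance; subtract the mean first when center=True.

    Use center=True for INTERVAL data and center=False for RATIO data.
    Constant columns (zero variance) are not handled in this sketch.
    """
    std = series.std(ddof=0)
    shifted = series - series.mean() if center else series
    return shifted / std


city = pd.Series(["a", "a", "a", "b", "b", "c", None], name="city")
hot = make_one_hot(city, top_k=2)

x = pd.Series([1.0, None, 3.0], name="x")
filled = impute(x, strategy="mean", add_indicator=True)

z = center_and_rescale(pd.Series([1.0, 2.0, 3.0]))
scaled_only = center_and_rescale(pd.Series([1.0, 2.0, 3.0]), center=False)
```

Passing `drop_first=True` gives the k-1 encoding. Keeping the missing-value indicator separate from `__other__` is a design choice in this sketch, so that "unknown" and "rare" stay distinguishable.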

Some other things that will be useful soon:

clip - squish values to be in [-1, 1] or some other set range. Good for clipping outliers.
log - convert RATIO features to log space
random_polynomial - randomly generate polynomial combinations of NUMERICAL features.
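A rough sketch of these three follow-ups, again on plain pandas objects. The names `log_transform` and the `n_terms`, `max_degree`, and `seed` parameters are hypothetical, and the issue does not say how the random polynomial terms should be drawn. Here each term is assumed to be a product of columns sampled with replacement.

```python
import numpy as np
import pandas as pd


def clip(series, lower=-1.0, upper=1.0):
    """Clamp values into [lower, upper]."""
    return series.clip(lower=lower, upper=upper)


def log_transform(series):
    """Natural log of a RATIO feature; zero or negative values are rejected."""
    if (series <= 0).any():
        raise ValueError("log requires strictly positive values")
    return np.log(series)


def random_polynomial(df, n_terms=3, max_degree=2, seed=0):
    """Build n_terms monomials from randomly chosen NUMERICAL columns.

    Columns are sampled with replacement, so a column can appear twice in
    one term (giving a square when max_degree is 2).
    """
    rng = np.random.default_rng(seed)
    cols = list(df.columns)
    terms = {}
    for i in range(n_terms):
        degree = int(rng.integers(1, max_degree + 1))
        picked = [cols[j] for j in rng.integers(0, len(cols), size=degree)]
        name = f"poly{i}_" + "*".join(str(c) for c in picked)
        terms[name] = df[picked].prod(axis=1)
    return pd.DataFrame(terms, index=df.index)


clipped = clip(pd.Series([-5.0, 0.5, 9.0]))
logged = log_transform(pd.Series([1.0, np.e]))

numeric = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
poly = random_polynomial(numeric, n_terms=4, max_degree=2, seed=1)
```

Fixing `seed` keeps the generated features reproducible between runs, which matters if the same transform has to be applied to training and test data.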
