[R] Scaling - does it get any better results than not scaling?

Alex Zarebski
Tue Jul 17 09:13:09 CEST 2018


Hey,

Nice question, I'm interested to see what others have to say on this.
I'd like to point out a couple of algorithmic points:

- If you are using regularisation, the scaling /will/ lead to different
results: the penalty acts on the size of the coefficients, and that size
depends on the units of the predictors.
- If you are using an iterative method to estimate something (yes, very
vague, but you get the gist), it can be very useful to know the data are
scaled in a particular way; for example, it can inform an initial guess
for the iterative method.
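
A minimal sketch of the first point, in base R with simulated data. The
data, the value of lambda, and the helper ridge_fit are all made up for
illustration (ridge_fit is not a function from any package); it uses the
closed-form ridge estimate rather than a fitted model from a real analysis.

```r
# Simulated data: two predictors measured on very different scales.
set.seed(1)
n  <- 100
x1 <- rnorm(n)              # roughly unit scale
x2 <- rnorm(n, sd = 1000)   # same kind of signal, much larger numeric scale
y  <- 2 * x1 + 0.002 * x2 + rnorm(n)

X_raw    <- cbind(x1, x2)
X_scaled <- scale(X_raw)    # centre each column, divide by its standard deviation
y_c      <- y - mean(y)     # centre the response so no intercept is needed

# Hypothetical helper: ridge estimate in closed form,
# solving (X'X + lambda * I) beta = X'y, then returning fitted values.
ridge_fit <- function(X, y, lambda) {
  X    <- scale(X, center = TRUE, scale = FALSE)  # centre only; keep the units
  beta <- solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y))
  drop(X %*% beta)                                # fitted values (centred)
}

# Ordinary least squares: fitted values agree whether or not we scale.
ols_raw    <- fitted(lm(y ~ X_raw))
ols_scaled <- fitted(lm(y ~ X_scaled))
ols_gap    <- max(abs(ols_raw - ols_scaled))

# Ridge with the same (arbitrary) lambda: fitted values differ.
lambda       <- 50
ridge_raw    <- ridge_fit(X_raw, y_c, lambda)
ridge_scaled <- ridge_fit(X_scaled, y_c, lambda)
ridge_gap    <- max(abs(ridge_raw - ridge_scaled))

c(ols_gap = ols_gap, ridge_gap = ridge_gap)
```

The OLS gap should be zero up to rounding error, while the ridge gap is
not: on the raw data the penalty barely touches the coefficient of x2
(which is tiny in those units), whereas after scaling both coefficients
are shrunk by a similar amount.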

On a pedagogical note, it might be interesting to point out to your
students that the act of choosing a scaling/transformation/preprocessing
can be useful as a way of understanding your data better.

Cheers,
Alex

On Tue, Jul 17, 2018 at 4:58 PM Michael Thompson <
michael.thompson using manukau.ac.nz> wrote:

> Hi,
> I seem to remember from classes that one effect of scaling / standardising
> data was to get better results in any analysis. But what I'm seeing when I
> study various explanations on scaling is that we get exactly the same
> results, just that when we look at standardised data it's easier to see
> proportionate effects.
> This is all very well for the data scientist to further investigate, but
> from a practical point of view, (especially IF it doesn't improve the
> accuracy of the result) surely it adds complication to 'telling the story'
> of the model to non-DS people?
> So, is scaling a technique for the DS to use to find effects, while
> eventually delivering a non-scaled version to the users?
> I'd like to be able to give the true story to my students, not some fairy
> story based on my misunderstanding. Hope you can help with this.
> Michael