Does regularization penalize models that are simpler than needed?


Yes, regularization penalizes models that are more complex than needed. But does it also penalize models that are simpler than needed?

machine-learning · predictive-models · modeling · regularization

asked 2 days ago by alienflow (275)

Comment – usεr11852 (2 days ago): Given we use an appropriate testing procedure to select our regularisation parameter strength, it should not penalise any models unnecessarily. (+1)
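The comment does not name a concrete testing procedure, so here is a minimal sketch of what such a selection step might look like, using scikit-learn's RidgeCV on synthetic data (the library, the candidate grid, and the data are all assumptions for illustration; scikit-learn calls λ "alpha"):

    # Choosing the regularisation strength lambda by cross-validation:
    # the held-out error, not a fixed preference for simplicity,
    # decides how strong the penalty should be.
    import numpy as np
    from sklearn.linear_model import RidgeCV

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    theta_true = np.array([3.0, -2.0, 1.5] + [0.0] * 7)  # made-up ground truth
    y = X @ theta_true + rng.normal(scale=0.5, size=200)

    # Score a grid of lambda values on 5 held-out folds and keep the best.
    model = RidgeCV(alphas=np.logspace(-3, 3, 13), cv=5).fit(X, y)
    print("selected lambda:", model.alpha_)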














1 Answer
Answer (score 5) – Esmailian, answered 2 days ago, last edited 2 days ago
For regularization terms similar to $\left\|\theta\right\|_2^2$, no, they don't: they only push the model toward simplicity, i.e. parameters closer to zero.

Error terms such as $\sum_i \left\|y_i - f_\theta(x_i)\right\|_2^2$ are responsible for pushing back toward complexity (penalizing over-simplification), since the simplest model, $\theta = 0$, leads to a high error.

We balance these two forces with a regularization parameter $\lambda$ in an objective such as
$$\frac{1}{N}\sum_{i=1}^{N} \left\|y_i - f_\theta(x_i)\right\|_2^2 + \lambda \left\|\theta\right\|_2^2,$$
where a higher $\lambda$ pushes the model toward more simplicity.
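A small numerical illustration of these two opposing forces (a sketch with made-up data; the closed-form ridge solution $\hat{\theta} = (X^\top X + \lambda I)^{-1} X^\top y$ is standard, everything else here is assumed): as $\lambda$ grows, the coefficient norm shrinks toward zero while the training error, the term fighting for complexity, rises.

    # As lambda grows, the penalty term wins: coefficients shrink toward
    # zero and the data-fit (error) term grows. Data are synthetic.
    import numpy as np

    rng = np.random.default_rng(42)
    N, d = 100, 5
    X = rng.normal(size=(N, d))
    theta_true = rng.normal(size=d)
    y = X @ theta_true + rng.normal(scale=0.3, size=N)

    for lam in [0.0, 1.0, 100.0, 10_000.0]:
        # Closed-form ridge estimate: (X'X + lambda*I)^(-1) X'y
        theta_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
        mse = np.mean((y - X @ theta_hat) ** 2)
        print(f"lambda={lam:>9}: ||theta_hat||={np.linalg.norm(theta_hat):.3f}, "
              f"train MSE={mse:.3f}")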






Comments:

– alienflow (2 days ago): So, regularization like L2, L1 correspond to the first case, right?

– Esmailian (2 days ago): @alienflow Yes, they all force parameters toward zero (the most simple model).
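Both penalties pull parameters toward zero, but they differ in how: an L1 (lasso) penalty tends to set weak coefficients exactly to zero, while an L2 (ridge) penalty only shrinks them. A brief sketch of that difference (the library choice and data are assumptions, not part of the thread):

    # Comparing L1 and L2 shrinkage: lasso zeroes out weak coefficients,
    # ridge shrinks all of them but leaves them nonzero.
    import numpy as np
    from sklearn.linear_model import Lasso, Ridge

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 8))
    theta_true = np.array([4.0, 0.0, -3.0, 0.0, 2.0, 0.0, 0.0, 0.0])
    y = X @ theta_true + rng.normal(scale=0.5, size=200)

    ridge = Ridge(alpha=10.0).fit(X, y)  # L2 penalty
    lasso = Lasso(alpha=0.5).fit(X, y)   # L1 penalty
    print("ridge:", np.round(ridge.coef_, 3))  # all shrunk, none exactly zero
    print("lasso:", np.round(lasso.coef_, 3))  # several exactly zero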










