
Mathlib.InformationTheory.KullbackLeibler.KLFun

The real function `fun x ↦ x * log x + 1 - x`

We define klFun x = x * log x + 1 - x. That function is notable because the Kullback-Leibler divergence is an f-divergence for klFun. That is, the Kullback-Leibler divergence is an integral of klFun composed with a Radon-Nikodym derivative.

For probability measures, any function f that differs from klFun by an affine function of the form x ↦ a * (x - 1) gives the same value for the integral ∫ x, f (μ.rnDeriv ν x).toReal ∂ν. Among those functions, klFun is the particular choice that satisfies klFun 1 = 0 and deriv klFun 1 = 0. This ensures that desirable properties of the Kullback-Leibler divergence extend to other finite measures: the divergence is nonnegative, and it is zero iff the two measures are equal.
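The two normalization conditions mentioned above can be checked directly in Lean. The following is a minimal sketch, assuming a Mathlib version that provides this module under the name shown at the top of the page; the second example combines `deriv_klFun` with Mathlib's `Real.log_one`.

```lean
import Mathlib.InformationTheory.KullbackLeibler.KLFun

open InformationTheory

-- Value at 1: klFun 1 = 1 * log 1 + 1 - 1 = 0.
example : klFun 1 = 0 := klFun_one

-- Derivative at 1: `deriv klFun = Real.log`, and `Real.log 1 = 0`.
example : deriv klFun 1 = 0 :=
  (congrFun deriv_klFun 1).trans Real.log_one
```

Adding a * (x - 1) to klFun would keep the value at 1 equal to 0 but change the derivative at 1 to a, so the second condition is what singles out klFun.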

Main definitions

klFun: the function fun x : ℝ ↦ x * log x + 1 - x.

This is a continuous, nonnegative, strictly convex function on [0,∞), with minimum value 0 at 1.

Main statements

integrable_klFun_rnDeriv_iff: For two finite measures μ ≪ ν, the function x ↦ klFun (μ.rnDeriv ν x).toReal is integrable with respect to ν iff the log-likelihood ratio llr μ ν is integrable with respect to μ.
integral_klFun_rnDeriv: For two finite measures μ ≪ ν such that llr μ ν is integrable with respect to μ, ∫ x, klFun (μ.rnDeriv ν x).toReal ∂ν = ∫ x, llr μ ν x ∂μ + (ν univ).toReal - (μ univ).toReal.

noncomputable def InformationTheory.klFun (x : ℝ) :

ℝ

The function x : ℝ ↦ x * log x + 1 - x. The Kullback-Leibler divergence is an f-divergence for this function.

Equations

InformationTheory.klFun x = x * Real.log x + 1 - x

theorem InformationTheory.klFun_apply (x : ℝ) :

klFun x = x * Real.log x + 1 - x

theorem InformationTheory.klFun_zero :

klFun 0 = 1

theorem InformationTheory.klFun_one :

klFun 1 = 0

theorem InformationTheory.strictConvexOn_klFun :

StrictConvexOn ℝ (Set.Ici 0) klFun

klFun is strictly convex on [0,∞).

theorem InformationTheory.convexOn_klFun :

ConvexOn ℝ (Set.Ici 0) klFun

klFun is convex on [0,∞).

theorem InformationTheory.convexOn_Ioi_klFun :

ConvexOn ℝ (Set.Ioi 0) klFun

klFun is convex on (0,∞). This is a frequently useful consequence of convexOn_klFun, which states convexity on [0,∞).
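As an illustration of how these convexity statements are typically used, here is a sketch that unpacks `convexOn_klFun` into the usual convex-combination inequality and then restricts it to (0,∞). It assumes the general Mathlib lemmas `ConvexOn.subset`, `Set.Ioi_subset_Ici_self` and `convex_Ioi`; the restriction shown is one possible derivation and not necessarily the proof used in the library.

```lean
import Mathlib.InformationTheory.KullbackLeibler.KLFun

open InformationTheory

-- The second component of `ConvexOn` is the convex-combination inequality.
example {x y a b : ℝ} (hx : 0 ≤ x) (hy : 0 ≤ y)
    (ha : 0 ≤ a) (hb : 0 ≤ b) (hab : a + b = 1) :
    klFun (a • x + b • y) ≤ a • klFun x + b • klFun y :=
  convexOn_klFun.2 (Set.mem_Ici.mpr hx) (Set.mem_Ici.mpr hy) ha hb hab

-- Convexity on (0,∞) follows by restricting to a convex subset of [0,∞).
example : ConvexOn ℝ (Set.Ioi 0) klFun :=
  convexOn_klFun.subset Set.Ioi_subset_Ici_self (convex_Ioi 0)
```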

theorem InformationTheory.continuous_klFun :

Continuous klFun

klFun is continuous.

theorem InformationTheory.measurable_klFun :

Measurable klFun

klFun is measurable.

theorem InformationTheory.stronglyMeasurable_klFun :

MeasureTheory.StronglyMeasurable klFun

klFun is strongly measurable.

theorem InformationTheory.hasDerivAt_klFun {x : ℝ} (hx : x ≠ 0) :

HasDerivAt klFun (Real.log x) x

The derivative of klFun at x ≠ 0 is log x.
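A short sketch of how `hasDerivAt_klFun` might be applied, first at a concrete point and then to obtain differentiability at any positive point. The choice of the point 2 is only for illustration.

```lean
import Mathlib.InformationTheory.KullbackLeibler.KLFun

open InformationTheory

-- At x = 2 the derivative is log 2; the side condition 2 ≠ 0 is numeric.
example : HasDerivAt klFun (Real.log 2) 2 :=
  hasDerivAt_klFun (by norm_num)

-- Differentiability at every positive point follows from the derivative.
example {x : ℝ} (hx : 0 < x) : DifferentiableAt ℝ klFun x :=
  (hasDerivAt_klFun hx.ne').differentiableAt
```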

theorem InformationTheory.not_differentiableAt_klFun_zero :

¬DifferentiableAt ℝ klFun 0

@[simp]

theorem InformationTheory.deriv_klFun :

deriv klFun = Real.log

The derivative of klFun is log x. This also holds at x = 0, even though klFun is not differentiable there: in that case deriv takes its default value 0, which agrees with Real.log 0 = 0.
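The behaviour at 0 can be made concrete with the following sketch. It relies on Mathlib's convention `Real.log_zero : Real.log 0 = 0` together with the two statements listed on this page.

```lean
import Mathlib.InformationTheory.KullbackLeibler.KLFun

open InformationTheory

-- Pointwise form of `deriv_klFun`.
example (x : ℝ) : deriv klFun x = Real.log x :=
  congrFun deriv_klFun x

-- At 0 both sides are the junk value 0.
example : deriv klFun 0 = 0 :=
  (congrFun deriv_klFun 0).trans Real.log_zero

-- The equality at 0 does not mean klFun is differentiable there.
example : ¬DifferentiableAt ℝ klFun 0 :=
  not_differentiableAt_klFun_zero
```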

theorem InformationTheory.not_differentiableWithinAt_klFun_Ioi_zero :

¬DifferentiableWithinAt ℝ klFun (Set.Ioi 0) 0

theorem InformationTheory.not_differentiableWithinAt_klFun_Iio_zero :

¬DifferentiableWithinAt ℝ klFun (Set.Iio 0) 0

@[simp]

theorem InformationTheory.rightDeriv_klFun {x : ℝ} :

derivWithin klFun (Set.Ioi x) x = Real.log x

The right derivative of klFun is log x. This also holds at x = 0, even though klFun is not differentiable there: in that case derivWithin takes its default value 0, which agrees with Real.log 0 = 0.

@[simp]

theorem InformationTheory.leftDeriv_klFun {x : ℝ} :

derivWithin klFun (Set.Iio x) x = Real.log x

The left derivative of klFun is log x. This also holds at x = 0, even though klFun is not differentiable there: in that case derivWithin takes its default value 0, which agrees with Real.log 0 = 0.

theorem InformationTheory.rightDeriv_klFun_one :

derivWithin klFun (Set.Ioi 1) 1 = 0

theorem InformationTheory.leftDeriv_klFun_one :

derivWithin klFun (Set.Iio 1) 1 = 0

theorem InformationTheory.tendsto_rightDeriv_klFun_atTop :

Filter.Tendsto (fun (x : ℝ) => derivWithin klFun (Set.Ioi x) x) Filter.atTop Filter.atTop

theorem InformationTheory.isMinOn_klFun :

IsMinOn klFun (Set.Ici 0) 1

theorem InformationTheory.klFun_nonneg {x : ℝ} (hx : 0 ≤ x) :

0 ≤ klFun x

The function klFun is nonnegative on [0,∞).

theorem InformationTheory.klFun_eq_zero_iff {x : ℝ} (hx : 0 ≤ x) :

klFun x = 0 ↔ x = 1
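Taken together, nonnegativity and the characterization of the zero give strict positivity away from 1. The sketch below assumes that `klFun_nonneg hx` has type `0 ≤ klFun x`, as its docstring indicates; `lt_of_le_of_ne` is a general Mathlib order lemma.

```lean
import Mathlib.InformationTheory.KullbackLeibler.KLFun

open InformationTheory

-- Nonnegativity at a concrete point.
example : 0 ≤ klFun 2 := klFun_nonneg (by norm_num)

-- A zero of klFun on [0,∞) must be the point 1.
example {x : ℝ} (hx : 0 ≤ x) (h : klFun x = 0) : x = 1 :=
  (klFun_eq_zero_iff hx).mp h

-- Hence klFun is strictly positive on [0,∞) away from 1.
example {x : ℝ} (hx : 0 ≤ x) (hx1 : x ≠ 1) : 0 < klFun x :=
  lt_of_le_of_ne (klFun_nonneg hx)
    (fun h => hx1 ((klFun_eq_zero_iff hx).mp h.symm))
```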

theorem InformationTheory.tendsto_klFun_atTop :

Filter.Tendsto klFun Filter.atTop Filter.atTop

theorem InformationTheory.integrable_klFun_rnDeriv_iff {α : Type u_1} {mα : MeasurableSpace α} {μ ν : MeasureTheory.Measure α} [MeasureTheory.IsFiniteMeasure μ] [MeasureTheory.IsFiniteMeasure ν] (hμν : μ.AbsolutelyContinuous ν) :

MeasureTheory.Integrable (fun (x : α) => klFun (μ.rnDeriv ν x).toReal) ν ↔ MeasureTheory.Integrable (MeasureTheory.llr μ ν) μ

For two finite measures μ ≪ ν, the function x ↦ klFun (μ.rnDeriv ν x).toReal is integrable with respect to ν iff llr μ ν is integrable with respect to μ.

theorem InformationTheory.integral_klFun_rnDeriv {α : Type u_1} {mα : MeasurableSpace α} {μ ν : MeasureTheory.Measure α} [MeasureTheory.IsFiniteMeasure μ] [MeasureTheory.IsFiniteMeasure ν] (hμν : μ.AbsolutelyContinuous ν) (h_int : MeasureTheory.Integrable (MeasureTheory.llr μ ν) μ) :

∫ (x : α), klFun (μ.rnDeriv ν x).toReal ∂ν = ∫ (x : α), MeasureTheory.llr μ ν x ∂μ + (ν Set.univ).toReal - (μ Set.univ).toReal
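For probability measures the two mass terms cancel, so the integral of klFun composed with the Radon-Nikodym derivative reduces to the integral of the log-likelihood ratio. The following is a sketch of that specialization, not a statement taken from this page. It assumes the theorem is stated exactly as above, with `(ν Set.univ).toReal`, and that `simp` can close the remaining arithmetic goal using `measure_univ`; with a different Mathlib version the final step may need adjusting.

```lean
import Mathlib.InformationTheory.KullbackLeibler.KLFun

open MeasureTheory InformationTheory

example {α : Type*} {mα : MeasurableSpace α} {μ ν : Measure α}
    [IsProbabilityMeasure μ] [IsProbabilityMeasure ν]
    (hμν : μ ≪ ν) (h_int : Integrable (llr μ ν) μ) :
    ∫ x, klFun (μ.rnDeriv ν x).toReal ∂ν = ∫ x, llr μ ν x ∂μ := by
  -- Rewrite with the general formula for finite measures.
  rw [integral_klFun_rnDeriv hμν h_int]
  -- Both total masses equal 1, so `+ 1 - 1` cancels.
  simp
```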