HAOYUAN WU
HomeBlogPhotography
arrow_backBack to Blog

MATH3904 Notes

30 November 2024

Note: These are the notes for the class MATH3904 Introduction to Optimization at the University of Hong Kong. These notes are produced in class and may contain errors or missing content. Please use the note at your own discretion.

Chapter 1

Minimize
s.t.

: objective function
: feasible region
: feasible solution

Ingredients

  • Insight, Intuitive, Imaginative

  • Theorem

  • Algorithm (Qualitative – correctness, Quantitative – efficiency)

  • Computation

Example: Min. s.t.

Suppose ,

Equality holds if

Principle Axis Theorem: symmetric with eigenvalues , exists orthogonal s.t.

Example: Suppose is any square matrix. Min s.t. .

Let , is symmetric.

Answer: min eigenvalue of .

Takeaway: symmetric matrix diagonal matrix (with diagonalization).

Definition: A point is a local minimum point if there exists s.t. for all with .

in Min s.t. is a global minimum point.

Taylor’s theorem with Lagrange Remainder:

where the remainder for some between and .

Example: Assume a continuous function on has a unique local extremum at . Is it true that is a global minimum point or a global maximum point?

Assume is local min but not global min. s.t. . Consider interval , on which achieves maximum at an interior point , which is a local maximum.

A function is differentiable at if

  • exists at for all

Theorem:

if is differentiable at .

Proof: Assume .

If exists in a neighborhood of and continuous at for all is differentiable at .

is times differentiable

: use single variable calculus along a direction of the objective function.

Definition: Let . A vector is called a feasible direction at if there exists such that for any .

For an interior point, every non-zero vector is a feasible direction.

Stationary point:

for all

Proof: Let

If is positive definite, there exist p.d. s.t. .

Proof:

where is orthogonal and .

Let

Then .

It is not true that is p.s.d. if and only if all principal minors are non-negative. Counterexample: .

. is p.s.d. iff , , and

Max s.t. Min s.t.

Definition: is negative definite is positive definite. is negative semidefinite is positive semidefinite.

Proof of Theorem 1.7: is p.d. whenever is sufficiently small Positive definiteness implies continuity if . Explanation

is continuous because and is a linear combination of .

Example 1.8: To prove is p.d.

Equality holds only if (not feasible).

It can be further derived from Example 1.8 that is an optimal solution: By Taylor’s theorem, , where and .

As is p.d., .

To find local minimum:

(1) Find to make

Under (1),

If is p.d., is local min.

If is p.s.d., we need additional check.

By using the necessary conditions, we can usually identify a finite number of candidates of local minimum points, thereby reducing the original infinite problem to a finite one.

By using the sufficient conditions, we can ensure that a certain candidate is a local minimum point.

Definition: A set is convex if for any and , .

Definition: A function defined on a convex set is convex if for any and , .

2D: bowl shaped bullshit

Theorem: Convex if and only if for any integer , any , and any with ,

Proof: Induction on .

Theorem: Let be a convex function defined on a convex set . Then is continuous at every interior point of . (Convexity is stronger than continuity.)

The theorem does not hold at boundary points:

Theorem: (Convex functions can be combined) Let and be convex functions defined on a convex set . Then

  1. is convex on .
  2. is convex for any constant .
  3. is a convex set for any .

Examples of convex functions:

  1. if

Non-examples:

  1. : ,
  2. :

The set of points satisfying:

where is convex , defines a convex set. The constraint is often defined as

is concave if and only if is convex.

Theorem: Let . Then is convex if and only if for all .

Remark: is strictly convex if and only if for all with .

Remark: Let be convex over . If is differentiable at a fixed point . Then

Let . Then is convex over a convex set containing an interior point if and only if is p.s.d. .

No longer holds if not containing interior point:

Converse of Collary 1.12.1 is not true: , but .

is local min point (1st order necessary conditions)

Definition: is a function in if we can use rather than its entries to express it.

Tip: To find and of a function in , we may simply treat as a single variable function , first find and , and transform them back to and . Transpose to unify the dimensions of vectors or matrices.

Example 1.9:

Linear Regression: Suppose are points in the -plane and we wish to fit a straight line for these points in such a way that

is minimized.

Since , matrix is positive definite, so

Fact: Let . Then the following statements hold:

(a)

(b) Both and are positive semidefinite.

(c) is positive definite if . is positive definite if .

Proof of (a):

Denote ,

.

From Exercise 5: . does not have a local minimum point. (Set and )

Suppose is convex on , max on is either or .

Suppose is a convex function on and has a global max. Then is a constant function.

Proof: Let be the max point. .

Remark on limit superior

does not exist, because . However,

If a sequence converges over order , then it converges over order for .

Proof:

Remark: the higher order of convergence, the faster the convergence.

Chapter 2

Algebraical methods: (without using derivatives)

Example: Min where .

Min s.t. where is unimodal.

Unimodal functions are not necessarily continuous or differentiable.

Convex functions are not necessarily unimodal. Counterexample: .

Strictly convex functions are unimodal.

Remark: Since we do not know and , the optimal strategy is to guard against the worst possible outcome, i.e., to minimize

This can be achieved by placing and symmetrically w.r.t. the mid point of .

Remark: Golden section method converges linearly with conversion ratio .

Why is GSM more commonly used than FSM?

GSM is simpler and performance of GSM and FSM are similar.

Line search using derivatives

Newton’s method:

To solve the equation , since does not depend on and if , we set . Newton’s method takes the form

Fact of Theorem 2.1: if is continuous at , then

  • constant , such that sufficiently close to
  • If , then constant , such that sufficiently close to

Induction base: ,

A vector with is called a direction of steepest descent of at if it minimizes .

How to find in steepest descent method?:

(i) Observation: . If , then (in the example) except possibly one point where , so is strictly convex and hence unimodal.

(ii) Find , , , , . Find an interval of by checking the sign of .

Fact: Let be a symmetric matrix and let be the smallest eigenvalue of . Then ,

Remark 1: If , then SDM reaches the optimal solution in a single step, regardless of .

Remark 2: SDM converges linearly with a convergence ratio .

Proof of Lemma 2.2:

Convex polygon generated by is

is in the convex polygon.

so

Gram-Schmidt orthogonalization: Let be linearly independent vectors in . Define , , so are Q-orthogonal.

Two-phase algorithm for quadratic problem: (1) Find -orthogonal vectors . (2) Apply conjugate direction theorem.

where

Remark of Lemma 4.1: tangent plane at .

Theorem 4.4: If is negative definite, then is a local maximum point.

Theorem 4.5: for all .

There exists such that : If for all , , causing a contradiction.

The in Theorem 4.7 is a subset of the in Theorem 4.5.

In Example 5.2:

Fact: is p.d., as , with equality if and only if , which implies because has full column rank.

From (5.4), is non-singular.

Another proof: show that if and only if .

From (1),

From (2),

Solving the closed form formula of :

Let be a sequence of real nmbers. A constant is called a limit/clutter point of the sequence if there exists a subsequence such that .

Example:

or .

Robust sets contain no isolated points or lines.

Example 7.2: The Hessian of is p.d. for all interior points.

Primal (P): Min s.t.

Dual (D): Max s.t.

(1) If one of (P) and (D) has an optimal solution, then the other has an optimal solution and the optimal values are equal.

(2) The dual of (D) is (P)

(3) (Complementary slackness) a pair respectively feasible in a primal – dual pair is optimal if and only if

Last updated: March 20, 2026