Kac-Moody Lie Algebras
Chapter III: Representation theory

Arun Ram
Department of Mathematics and Statistics
University of Melbourne
Parkville, VIC 3010 Australia

Last update: 10 September 2012

This is a typed version of I.G. Macdonald's lecture notes on Kac-Moody Lie algebras from 1983.

To begin with, let $A=(a_{ij})$ be any $n\times n$ matrix over the field $k$. Eventually $A$ will have to be a symmetrizable Cartan matrix, but we shall bring in that assumption only when it becomes necessary.

Recall that

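$$\mathfrak{g}=\mathfrak{g}(A)=\mathfrak{h}+\sum_{\alpha\in R}\mathfrak{g}_\alpha\quad\text{(direct sum)}$$

and that each root space $\mathfrak{g}_\alpha$ is finite-dimensional (1.7).

Let $M$ be a $\mathfrak{g}$-module, i.e. a $k$-vector space on which $\mathfrak{g}$ acts, so that we are given a Lie algebra homomorphism $\pi:\mathfrak{g}\to\mathfrak{gl}(M)$, which extends to $\pi:U(\mathfrak{g})\to\mathrm{End}(M)$, i.e. $M$ is a $U(\mathfrak{g})$-module. Notation: $(M,\pi)$ when I want to be pedantic. More often than not I shall suppress $\pi$ and write $x.v$ or $xv$ for $\pi(x)v$ ($x\in\mathfrak{g}$, $v\in M$).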


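For any $\mathfrak{g}$-module $M$ and any $\lambda\in\mathfrak{h}^*$ we define

$$M_\lambda=\{v\in M:\ h.v=\lambda(h)v\ \text{for all}\ h\in\mathfrak{h}\}.$$

If $M_\lambda\ne0$ we say that $\lambda$ is a weight of $M$, that $M_\lambda$ is the weight space and that the elements of $M_\lambda$ are the weight vectors for the weight $\lambda$. We have

$$M_\lambda\cong\mathrm{Hom}_{U(\mathfrak{h})}(E_\lambda,M)\qquad(*)$$

where $E_\lambda$ is the 1-dimensional $\mathfrak{h}$-module defined by $\lambda$, that is to say $E_\lambda=ke_\lambda$ where $h.e_\lambda=\lambda(h)e_\lambda$ for all $h\in\mathfrak{h}$. The isomorphism $(*)$ associates to each $v\in M_\lambda$ the homomorphism $E_\lambda\to M$ which takes $e_\lambda$ to $v$. From $(*)$ it follows that (for a fixed $\lambda\in\mathfrak{h}^*$) $M\mapsto M_\lambda$ is a left exact functor (from $\mathfrak{g}$-modules to $\mathfrak{h}$-modules).

Example: $(M,\pi)=(\mathfrak{g},\mathrm{ad})$. The weight spaces are $\mathfrak{h}$ and the $\mathfrak{g}_\alpha$, and the set of weights is $R\cup\{0\}$.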


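(3.1)

  1. For $\alpha\in R\cup\{0\}$ and $\lambda\in\mathfrak{h}^*$ we have

    $$\mathfrak{g}_\alpha.M_\lambda\subset M_{\lambda+\alpha}.$$

  2. The sum $M'=\sum_{\lambda\in\mathfrak{h}^*}M_\lambda$ is direct, and $M'$ is a $\mathfrak{g}$-submodule of $M$.

  3. If $\varphi:M\to N$ is a $\mathfrak{g}$-module homomorphism, then $\varphi(M_\lambda)\subset N_\lambda$ for all $\lambda$.

Proof.

  1. Let $x\in\mathfrak{g}_\alpha$, $v\in M_\lambda$, $h\in\mathfrak{h}$. Then we calculate

    $$h.(x.v)=x.h.v+[h,x].v=\lambda(h)x.v+\alpha(h)x.v=(\lambda+\alpha)(h)x.v$$

    so that $x.v\in M_{\lambda+\alpha}$.

  2. If the sum $\sum M_\lambda$ is not direct, there will be nontrivial relations of the form

    $$\sum_{i=1}^m v_{\lambda_i}=0\qquad(1)$$

    where $v_{\lambda_i}\in M_{\lambda_i}$, $v_{\lambda_i}\ne0$ and $\lambda_1,\dots,\lambda_m\in\mathfrak{h}^*$ are all distinct. Choose such a relation with $m$ ($\ge2$) as small as possible. By operating on (1) with an element $h\in\mathfrak{h}$, we obtain

    $$\sum_{i=1}^m\lambda_i(h)v_{\lambda_i}=0\qquad(2)$$

    Choose $h\in\mathfrak{h}$ such that $\lambda_1(h)\ne\lambda_2(h)$, multiply (1) by $\lambda_1(h)$ and subtract from (2). This produces a nontrivial relation of length $<m$: contradiction.

    Also it is clear from (i) that $\mathfrak{g}_\alpha.M'\subset M'$ for each $\alpha\in R\cup\{0\}$, whence $\mathfrak{g}.M'\subset M'$.

  3. Obvious.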

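If $M$ is any $\mathfrak{g}$-module, let $P(M)\subset\mathfrak{h}^*$ denote the set of weights of $M$. (It might be empty.) Also, for each $\lambda\in\mathfrak{h}^*$, let

$$D(\lambda)=\lambda-Q_+$$

and for any subset $F$ of $\mathfrak{h}^*$ let

$$D(F)=\bigcup_{\lambda\in F}D(\lambda).$$

We shall use this notation only for finite subsets $F$ of $\mathfrak{h}^*$.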

Let 𝒪 denote the category of 𝔤–modules M which satisfy the following two conditions:

  1. $M$ is $\mathfrak{h}$-diagonalizable with finite-dimensional weight spaces, i.e.

    $$M=\sum_{\mu\in P(M)}M_\mu$$

    (direct sum, by (3.1)), with each $M_\mu$ finite-dimensional;

  2. $P(M)\subset D(F)$ for some finite $F\subset\mathfrak{h}^*$.

The morphisms in $\mathcal{O}$ are $\mathfrak{g}$-module homomorphisms.

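(3.2) Let $(E)$: $0\to M'\xrightarrow{f}M\xrightarrow{g}M''\to0$ be a short exact sequence of $\mathfrak{g}$-modules, with $M\in\mathcal{O}$. Then

  1. $M',M''\in\mathcal{O}$;

  2. For each $\lambda\in\mathfrak{h}^*$ the sequence $(E_\lambda)$: $0\to M'_\lambda\xrightarrow{f}M_\lambda\xrightarrow{g}M''_\lambda\to0$ is exact;

  3. $P(M)=P(M')\cup P(M'')$.

Proof.

Since $M$ is $\mathfrak{h}$-diagonalizable we have $M'=\sum_\lambda M'_\lambda$ by (1.5), and $f(M'_\lambda)\subset M_\lambda$ by (3.1), so that $M'_\lambda$ is finite-dimensional and $P(M')\subset P(M)\subset D(F)$. Hence $M'\in\mathcal{O}$.

Next, we have $g(M_\lambda)\subset M''_\lambda$ by (3.1), for all $\lambda\in\mathfrak{h}^*$, hence

$$M''=g(M)=\sum_\lambda g(M_\lambda)\subset\sum_\lambda M''_\lambda\subset M'';$$

consequently we have equality throughout, whence $g(M_\lambda)=M''_\lambda$ for each $\lambda\in\mathfrak{h}^*$ and the sequence $(E_\lambda)$ is therefore exact. Finally, $M''_\lambda$ is finite-dimensional and $P(M'')\subset P(M)$, so that $M''\in\mathcal{O}$, and (iii) is now obvious.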

Recall the partial order $\lambda\ge\mu$ on $\mathfrak{h}^*$: $\lambda\ge\mu$ iff $\lambda-\mu=\sum_1^n u_i\alpha_i$ with each $u_i\ge0$.

(3.3) Each module $M\in\mathcal{O}$ has at least one maximal weight.

Proof.

Suppose $M$ has no maximal weight. Then $P(M)$ contains an infinite strictly increasing sequence $\mu_1<\mu_2<\cdots$. For each $\lambda\in F$, the $\mu_i\in D(\lambda)$ form a subsequence. Since $F$ is finite, at least one of these subsequences is infinite, say $\nu_1<\nu_2<\cdots$ in $D(\lambda)$. Each $\nu_i\in D(\lambda)=\lambda-Q_+$, hence $\mathrm{ht}(\lambda-\nu_i)$ is a nonnegative integer. It follows that the sequence $(\mathrm{ht}(\lambda-\nu_i))_{i\ge1}$ is an infinite strictly decreasing sequence of integers $\ge0$, which is absurd.

If $M$ has a unique maximal weight $\lambda$, then $\lambda$ is called the highest weight of $M$.

Highest weight 𝔤–modules

We shall say that a 𝔤–module M is a highest weight (h.w.) module if

  1. M has a highest weight, say λ;

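  2. $M$ is generated (as $U(\mathfrak{g})$-module) by some $v_\lambda\in M_\lambda$.

(3.4) Let $M$ be a h.w. $\mathfrak{g}$-module, with highest weight $\lambda$. Then

  1. $M\in\mathcal{O}$;

  2. $\dim M_\lambda=1$;

  3. $P(M)\subset D(\lambda)$;

  4. $M$ has a unique maximal submodule, hence a unique simple quotient;

  5. If $M'$ is a nonzero homomorphic image of $M$, then $M'$ is h.w. with h.w. $\lambda$.

Proof.

  1. We have $\lambda+\alpha_i\notin P(M)$, hence by (3.1) $e_i.v_\lambda=0$ ($1\le i\le n$). It follows that $\mathfrak{n}_+.v_\lambda=0$, i.e. $U(\mathfrak{n}_+).v_\lambda=kv_\lambda$. Since $U(\mathfrak{g})=U(\mathfrak{n}_-)\otimes U(\mathfrak{h})\otimes U(\mathfrak{n}_+)$, we have $M=U(\mathfrak{g})v_\lambda=U(\mathfrak{n}_-)v_\lambda$.

    Let $y_1,y_2,\dots$ be a $k$-basis of $\mathfrak{n}_-$ consisting of root vectors. By Poincaré-Birkhoff-Witt, the monomials $y_1^{r_1}y_2^{r_2}\cdots$ form a $k$-basis of $U(\mathfrak{n}_-)$, hence the vectors $y_1^{r_1}y_2^{r_2}\cdots v_\lambda$ span $M$ (as a $k$-vector space). But each such vector is a weight vector, for if $y_i\in\mathfrak{g}_{-\beta_i}$ then $y_1^{r_1}y_2^{r_2}\cdots v_\lambda\in M_{\lambda-r_1\beta_1-r_2\beta_2-\cdots}$. It follows that $M$ is the sum of its weight spaces and that each weight space $M_\mu$ is finite-dimensional, for there are only finitely many solutions of the equation $\mu=\lambda-\sum r_i\beta_i$ in non-negative integers $r_i$. Moreover each such $\mu\in D(\lambda)$, and in particular $M_\lambda$ is 1-dimensional, generated by $v_\lambda$. So we have proved (i) – (iii).

  2. Let $M'$ be a proper submodule of $M$. Then $M'\in\mathcal{O}$ (3.2), hence $M'=\sum M'_\mu$ where $M'_\mu=M'\cap M_\mu$. But $M'_\lambda=0$, otherwise by (ii) $M'_\lambda$ would contain $v_\lambda$ and hence $M'=M$. It follows that

    $$M'\subset M_+=\sum_{\mu<\lambda}M_\mu$$

    and hence the sum of all proper submodules of $M$ is contained in $M_+$, hence is a proper submodule. This proves (iv), and (v) is clear.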

We shall now show how to construct all h.w. 𝔤–modules.

Verma modules

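Let $\lambda\in\mathfrak{h}^*$ and let $E_\lambda$ as before denote the 1-dimensional $\mathfrak{h}$-module corresponding to $\lambda$: $E_\lambda=ku_\lambda$ where $h.u_\lambda=\lambda(h)u_\lambda$ for all $h\in\mathfrak{h}$.

Let $\mathfrak{b}=\mathfrak{h}+\mathfrak{n}_+$ be the subalgebra of $\mathfrak{g}$ generated by $\mathfrak{h}$ and $e_1,\dots,e_n$. The subalgebra $\mathfrak{b}$ is a semidirect product $\mathfrak{n}_+\rtimes\mathfrak{h}$, because $\mathfrak{n}_+$ is an ideal in $\mathfrak{b}$ and $\mathfrak{b}/\mathfrak{n}_+=\mathfrak{h}$. We may regard $E_\lambda$ as a $\mathfrak{b}$-module by making $\mathfrak{n}_+$ act trivially, i.e. $\mathfrak{n}_+.u_\lambda=0$. The Verma module $V(\lambda)$ is defined to be the induced $\mathfrak{g}$-module

$$V(\lambda)=\mathrm{ind}_{\mathfrak{b}}^{\mathfrak{g}}(E_\lambda)=U(\mathfrak{g})\otimes_{U(\mathfrak{b})}E_\lambda.$$

Let $v_\lambda=1\otimes u_\lambda\in V(\lambda)$. Clearly $v_\lambda$ generates $V(\lambda)$, and since $U(\mathfrak{g})=U(\mathfrak{n}_-)\otimes U(\mathfrak{b})$ we have $V(\lambda)=U(\mathfrak{n}_-).v_\lambda$, showing that $V(\lambda)$ is a h.w. $\mathfrak{g}$-module with highest weight $\lambda$, and that it is free of rank 1 as a $U(\mathfrak{n}_-)$-module.

Alternative description of $V(\lambda)$: let $J(\lambda)$ denote the left ideal in $U(\mathfrak{g})$ generated by $\mathfrak{n}_+$ and all $h-\lambda(h)$, $h\in\mathfrak{h}$. Then

$$V(\lambda)\cong U(\mathfrak{g})/J(\lambda).$$

For if $\pi$ is the representation of $U(\mathfrak{b})$ on $E_\lambda$, then $\pi:U(\mathfrak{b})\to k$ is such that $\pi(e_i)=0$ ($1\le i\le n$) and $\pi(h)=\lambda(h)$, all $h\in\mathfrak{h}$; hence $K=\mathrm{Ker}(\pi)$ is the left ideal of codimension 1 in $U(\mathfrak{b})$ generated by $\mathfrak{n}_+$ and all $h-\lambda(h)$; tensoring the exact sequence (of left $U(\mathfrak{b})$-modules)

$$0\to K\to U(\mathfrak{b})\to E_\lambda\to0$$

with $U(\mathfrak{g})$ (over $U(\mathfrak{b})$) gives

$$U(\mathfrak{g})\otimes_{U(\mathfrak{b})}K\to U(\mathfrak{g})\to V(\lambda)\to0$$

and the image of $U(\mathfrak{g})\otimes_{U(\mathfrak{b})}K$ in $U(\mathfrak{g})$ is $J(\lambda)$.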

The Verma modules are the "universal" h.w. 𝔤–modules:


(3.5)

  1. $V(\lambda)$ is a h.w. $\mathfrak{g}$-module with highest weight $\lambda$.

  2. Every h.w. $\mathfrak{g}$-module with highest weight $\lambda$ is a homomorphic image of $V(\lambda)$.

Proof.

  1. Already observed above.
  2. Let $M$ be a h.w. module with generator $x\in M_\lambda$. Then the ideal $J(\lambda)$ kills $x$, hence $M$ is a homomorphic image of $U(\mathfrak{g})/J(\lambda)=V(\lambda)$.

By (3.4)(iv) it follows that $V(\lambda)$ has a unique simple quotient $L(\lambda)$: by (3.2) and (3.4)(i), we have $L(\lambda)\in\mathcal{O}$. Moreover, the $L(\lambda)$ are precisely the simple objects in the category $\mathcal{O}$:

(3.6) If $M\in\mathcal{O}$ is simple, then $M\cong L(\lambda)$ for a unique $\lambda\in\mathfrak{h}^*$.

Proof.

By (3.3), $M$ has at least one maximal weight, say $\lambda$. Let $x\in M_\lambda$, $x\ne0$. Then $\mathfrak{n}_+.x=0$ (because $\lambda+\alpha_i\notin P(M)$, $1\le i\le n$), hence $x$ is killed by $J(\lambda)$, and therefore the submodule $U(\mathfrak{g}).x=M'$ generated by $x$ is a quotient of $V(\lambda)$. Since $M'\ne0$ and $M$ is simple, we have $M'=M$; hence $M$ is a simple quotient of $V(\lambda)$, hence $M\cong L(\lambda)$.

Suppose also that $M\cong L(\mu)$. Then we have a $\mathfrak{g}$-isomorphism $L(\lambda)\cong L(\mu)$, under which weight spaces correspond (3.1). Hence $\lambda$ is a weight of $L(\mu)$, whence $\lambda\le\mu$; similarly $\mu\le\lambda$ and therefore $\lambda=\mu$.

(3.7) Example. When $\lambda=0$, $E_\lambda=k$ with trivial $\mathfrak{h}$-action ($h.1=0$), and $V(0)=U(\mathfrak{n}_-)$. The maximal submodule of $V(0)$ is the augmentation ideal of $U(\mathfrak{n}_-)$, hence $L(0)$ is the trivial 1-dimensional $\mathfrak{g}$-module.

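(3.8) Let $M$ be a h.w. module. Then $\mathrm{End}_{\mathfrak{g}}(M)=k$.

Proof.

Let $v_\lambda\in M_\lambda$ be a h.w. vector which generates $M$. If $\varphi:M\to M$ is a $\mathfrak{g}$-module homomorphism we have $\varphi(v_\lambda)\in M_\lambda$, hence $\varphi(v_\lambda)=av_\lambda$ for some $a\in k$ (because $\dim M_\lambda=1$ (3.4)). The kernel of $\varphi-a.1$ is a submodule of $M$ which contains $v_\lambda$, hence is the whole of $M$, i.e. $\varphi-a.1=0$.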


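Let $\varepsilon$ be the set of all functions $f:\mathfrak{h}^*\to\mathbb{Z}$ such that $\mathrm{Supp}(f)\subset D(F)$ for some finite subset $F$ of $\mathfrak{h}^*$: i.e. $f(\mu)=0$ unless $\lambda-\mu\in Q_+$ for some $\lambda\in F$. Clearly $\varepsilon$ is closed under addition and subtraction of functions; define multiplication by convolution:

$$(fg)(\nu)=\sum_{\lambda+\mu=\nu}f(\lambda)g(\mu)\quad\text{(finite sum)}\qquad(1)$$

If $\mathrm{Supp}(f)\subset D(F)$, $\mathrm{Supp}(g)\subset D(G)$, then $\mathrm{Supp}(fg)\subset D(F+G)$. Thus $\varepsilon$ is a commutative ring.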
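The convolution rule (1) can be tried out on finitely supported functions. The following is a minimal sketch, assuming weights are encoded as tuples of integers and functions as dictionaries; the names `convolve` and `e` are illustrative only and do not come from the notes.

```python
from collections import defaultdict

def convolve(f, g):
    """Convolution (fg)(nu) = sum over lambda + mu = nu of f(lambda) * g(mu).

    f and g are dicts mapping weights (tuples of ints) to integers.
    Only finitely supported functions are handled in this sketch.
    """
    out = defaultdict(int)
    for lam, a in f.items():
        for mu, b in g.items():
            nu = tuple(x + y for x, y in zip(lam, mu))
            out[nu] += a * b
    # drop zero coefficients so that equal functions compare equal
    return {nu: c for nu, c in out.items() if c != 0}

def e(lam):
    """Characteristic function e^lambda of the single weight lam."""
    return {tuple(lam): 1}

f = {(1, 0): 2, (0, 0): -1}
g = {(0, 1): 3, (-1, 0): 1}
print(convolve(e((1, 2)), e((0, -5))))    # the product of two characteristic functions
print(convolve(f, g) == convolve(g, f))   # commutativity on this example
```

Elements of the ring with infinite support would need a truncation (for instance by height below the finitely many top weights), which this sketch does not attempt.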

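A family $(f_j)_{j\in J}$ of functions in $\varepsilon$ is summable if there exists a finite $F\subset\mathfrak{h}^*$ such that

  1. $\mathrm{Supp}(f_j)\subset D(F)$ for all $j\in J$;

  2. for each $\lambda\in\mathfrak{h}^*$ we have $f_j(\lambda)=0$ for almost all $j\in J$.

In that case the function $f$ defined by

$$f(\lambda)=\sum_{j\in J}f_j(\lambda)$$

is well defined and belongs to $\varepsilon$, and we write $f=\sum f_j$.

For each $\lambda\in\mathfrak{h}^*$ let $e^\lambda$ denote the characteristic function of $\lambda$. Then $e^\lambda e^\mu=e^{\lambda+\mu}$ from the rule (1) defining multiplication. For any $f\in\varepsilon$, the family $(f(\lambda)e^\lambda)_{\lambda\in\mathfrak{h}^*}$ is summable, and we have

$$f=\sum_\lambda f(\lambda)e^\lambda\in\sum_{\mu\in F}e^\mu\,\mathbb{Z}[[e^{-\alpha_1},\dots,e^{-\alpha_n}]]\qquad(2)$$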

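Now let $M\in\mathcal{O}$ and define the formal character of $M$ to be the function $\mathrm{ch}(M)$ defined by

$$\mathrm{ch}(M)(\lambda)=\dim M_\lambda;$$

thus $\mathrm{Supp}(\mathrm{ch}(M))=P(M)\subset D(F)$ for some finite $F\subset\mathfrak{h}^*$, so that $\mathrm{ch}(M)\in\varepsilon$ and by (2) we have

$$\mathrm{ch}(M)=\sum_{\lambda\in P(M)}\dim M_\lambda.e^\lambda$$

Thus $\mathrm{ch}(M)$ is nothing but the generating function for the multiplicities of the weights $\lambda\in P(M)$.

(3.9) $\mathrm{ch}$ is an additive function on the category $\mathcal{O}$, i.e. if $0\to M'\to M\to M''\to0$ is an exact sequence in $\mathcal{O}$, then

$$\mathrm{ch}(M)=\mathrm{ch}(M')+\mathrm{ch}(M'').$$

Proof.

This follows from (3.2)(ii) by counting dimensions.

More generally, if $0\to M_0\to M_1\to\cdots\to M_r\to0$ is an exact sequence in $\mathcal{O}$, we have

$$\sum_{i=0}^r(-1)^i\,\mathrm{ch}(M_i)=0$$

by breaking up the exact sequence into short exact sequences and applying (3.9).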

We shall first compute the character of a Verma module V(λ):

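(3.10) Let $\lambda\in\mathfrak{h}^*$. Then

$$\mathrm{ch}(V(\lambda))=e^\lambda\Big/\prod_{\alpha\in R_+}(1-e^{-\alpha})^{m_\alpha}=e^\lambda/\Pi\quad\text{say}$$

where as usual $m_\alpha=\dim\mathfrak{g}_\alpha=\dim\mathfrak{g}_{-\alpha}$. (The product on the right is a unit in $\varepsilon$.)

Proof.

We saw earlier that $V(\lambda)$ is a free $U(\mathfrak{n}_-)$-module of rank 1. As before, let $y_1,y_2,\dots$ be a $k$-basis of $\mathfrak{n}_-$ consisting of root vectors, say $y_i\in\mathfrak{g}_{-\beta_i}$. Then $V(\lambda)$ has a $k$-basis consisting of weight vectors $y_1^{r_1}y_2^{r_2}\cdots v_\lambda$, where each $r_i\ge0$ (and $\sum r_i<\infty$), the vector just written being of weight $\lambda-r_1\beta_1-r_2\beta_2-\cdots$. Hence $\mathrm{ch}(V(\lambda))=e^\lambda\prod_i(1+e^{-\beta_i}+e^{-2\beta_i}+\cdots)=e^\lambda\prod_i(1-e^{-\beta_i})^{-1}$, and each $\alpha\in R_+$ occurs exactly $m_\alpha$ times among the $\beta_i$.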
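The weight multiplicities of a Verma module are the coefficients of $1/\prod(1-e^{-\alpha})^{m_\alpha}$, i.e. the number of ways of writing $\lambda-\mu$ as a sum of positive roots. A small sketch, assuming a finite list of positive roots given in simple-root coordinates (the function name `verma_multiplicities` and the encoding are illustrative, not from the notes); for an infinite root system only the roots of height at most `max_height` would need to be supplied.

```python
from itertools import product

def verma_multiplicities(positive_roots, max_height):
    """Coefficients of prod over beta of 1/(1 - e^{-beta}), up to max_height.

    positive_roots: tuples in simple-root coordinates, each root repeated
    as many times as its multiplicity.
    Returns a dict nu -> dim V(lambda)_{lambda - nu}.
    """
    rank = len(positive_roots[0])
    # all exponents of height <= max_height, sorted by increasing height
    exps = sorted((nu for nu in product(range(max_height + 1), repeat=rank)
                   if sum(nu) <= max_height), key=sum)
    counts = {nu: 0 for nu in exps}
    counts[(0,) * rank] = 1
    for beta in positive_roots:
        # multiply by 1 + e^{-beta} + e^{-2 beta} + ... (in place, by height)
        for nu in exps:
            prev = tuple(a - b for a, b in zip(nu, beta))
            if prev in counts:
                counts[nu] += counts[prev]
    return counts

# Type A2: positive roots alpha1, alpha2, alpha1 + alpha2, all of multiplicity 1
a2 = verma_multiplicities([(1, 0), (0, 1), (1, 1)], 6)
print(a2[(1, 1)], a2[(2, 3)])
```

For type $A_2$ the count at $c_1\alpha_1+c_2\alpha_2$ is $\min(c_1,c_2)+1$, which the sketch reproduces.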

Invariant bilinear form

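As before, $A=(a_{ij})$ is any $n\times n$ matrix over $k$, and $\mathfrak{g}=\mathfrak{g}(A)$. Suppose that there exists a symmetric ($k$-valued) bilinear form $\langle x,y\rangle$ on $\mathfrak{g}$ such that

  1. $\langle[x,y],z\rangle=\langle x,[y,z]\rangle$ for all $x,y,z\in\mathfrak{g}$ (invariance);

  2. the restriction of $\langle\,,\rangle$ to $\mathfrak{h}$ is nondegenerate.

Then for any $h\in\mathfrak{h}$ we have

$$\langle h,h_j\rangle=\langle h,[e_j,f_j]\rangle=\langle[h,e_j],f_j\rangle=\alpha_j(h)\langle e_j,f_j\rangle$$

Let $\varepsilon_j=\langle e_j,f_j\rangle\in k$. Condition (ii) ensures that $\varepsilon_j\ne0$; taking $h=h_i$ in the calculation above we have

$$\langle h_i,h_j\rangle=a_{ij}\varepsilon_j$$

and therefore the matrix $AE=(a_{ij}\varepsilon_j)$ is symmetric, i.e. $A$ is symmetrizable ($E$ is a nonsingular diagonal matrix).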
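Whether a given integer matrix is symmetrizable in this sense can be tested mechanically. The sketch below looks for nonzero scalars $\varepsilon_j$ with $a_{ij}\varepsilon_j=a_{ji}\varepsilon_i$ by propagating along nonzero entries and then checking every pair; `symmetrizer` is a hypothetical helper name, and the normalization $\varepsilon=1$ on one index per connected component is an arbitrary choice.

```python
from fractions import Fraction

def symmetrizer(A):
    """Return nonzero eps with A[i][j] * eps[j] == A[j][i] * eps[i], or None.

    A is a square matrix of integers. One eps per connected component is
    normalized to 1; the others are forced by the symmetry condition.
    """
    n = len(A)
    eps = [None] * n
    for start in range(n):
        if eps[start] is not None:
            continue
        eps[start] = Fraction(1)
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if A[i][j] != 0 and A[j][i] != 0 and eps[j] is None:
                    # a_ij * eps_j = a_ji * eps_i determines eps_j
                    eps[j] = Fraction(A[j][i], A[i][j]) * eps[i]
                    stack.append(j)
    ok = all(A[i][j] * eps[j] == A[j][i] * eps[i]
             for i in range(n) for j in range(n))
    return eps if ok else None

print(symmetrizer([[2, -1], [-3, 2]]))                           # type G2
print(symmetrizer([[2, -1, -1], [-1, 2, -1], [-2, -1, 2]]))      # not symmetrizable
```

The second matrix fails because $a_{12}a_{23}a_{31}\ne a_{21}a_{32}a_{13}$, so no consistent choice of the $\varepsilon_j$ exists.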

This proves the first part of

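(3.11) Let $\langle x,y\rangle$ be an invariant symmetric bilinear form on $\mathfrak{g}$, whose restriction to $\mathfrak{h}$ is nondegenerate. Then

  1. the matrix $A$ is symmetrizable;

  2. $\langle x,y\rangle$ is nondegenerate on $\mathfrak{g}$;

  3. $\langle x,y\rangle$ restricted to $\mathfrak{g}_\alpha\times\mathfrak{g}_\beta$ (where $\alpha,\beta\in R\cup\{0\}$) is

    1. zero if $\alpha+\beta\ne0$;

    2. nondegenerate if $\alpha+\beta=0$;

  4. If $\alpha\in R$ and $x\in\mathfrak{g}_\alpha$, $y\in\mathfrak{g}_{-\alpha}$, then

    $$[x,y]=\langle x,y\rangle h_\alpha$$

    where $h_\alpha\in\mathfrak{h}$ is defined by $\langle h,h_\alpha\rangle=\alpha(h)$ for all $h\in\mathfrak{h}$.

Proof. Part (i) was proved above; the three items below prove (ii), (iii) and (iv) in turn.

  1. Let $\mathfrak{a}=\{x\in\mathfrak{g}:\langle x,\mathfrak{g}\rangle=0\}$. Since the form is invariant, $\mathfrak{a}$ is an ideal in $\mathfrak{g}$: for if $x\in\mathfrak{a}$ and $y,z\in\mathfrak{g}$ we have

    $$\langle[x,y],z\rangle=\langle x,[y,z]\rangle=0$$

    whence $[x,y]\in\mathfrak{a}$.

    Now all ideals in $\mathfrak{g}$ are graded (1.7), hence $\mathfrak{a}=\sum\mathfrak{a}_\alpha$, where $\mathfrak{a}_\alpha=\mathfrak{a}\cap\mathfrak{g}_\alpha$ (and $\mathfrak{a}_0=\mathfrak{a}\cap\mathfrak{h}$). But $\mathfrak{a}_0=0$, because if $h\in\mathfrak{a}_0$ then certainly $\langle h,\mathfrak{h}\rangle=0$, whence $h=0$. But $\mathfrak{g}$ has no nontrivial ideals with trivial $\mathfrak{h}$-component, hence $\mathfrak{a}=0$.

  2. Let $x\in\mathfrak{g}_\alpha$, $y\in\mathfrak{g}_\beta$, $h\in\mathfrak{h}$. Then $\langle[x,h],y\rangle=\langle x,[h,y]\rangle$ and thus

    $$-\alpha(h)\langle x,y\rangle=\beta(h)\langle x,y\rangle$$

    If $\alpha+\beta\ne0$, choose $h$ such that $\alpha(h)+\beta(h)\ne0$. It follows that $\langle x,y\rangle=0$, which proves (a).

    Next, suppose $x\in\mathfrak{g}_\alpha$ is such that $\langle x,\mathfrak{g}_{-\alpha}\rangle=0$. Then $\langle x,\mathfrak{g}\rangle=0$ by (a), whence $x=0$ by (ii).

  3. We have

    $$\langle h,[x,y]\rangle=\langle[h,x],y\rangle=\alpha(h)\langle x,y\rangle=\langle h,h_\alpha\rangle\langle x,y\rangle$$

    whence the result, by nondegeneracy.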

Proposition (3.11) has a converse:

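(3.12) Suppose that the matrix $A$ is symmetrizable. Then there exists a nondegenerate symmetric invariant bilinear form $\langle x,y\rangle$ on $\mathfrak{g}$ (which therefore has the properties listed in (3.11)).

Proof.

By assumption, there exist non-zero scalars $\varepsilon_j$ such that $a_{ij}\varepsilon_j=a_{ji}\varepsilon_i$. We shall first construct the form on $\mathfrak{h}$ (as in (2.23)) and then extend it to $\mathfrak{g}$.

Choose a vector space complement $\mathfrak{h}''$ of $\mathfrak{h}'$ in $\mathfrak{h}$ (where as usual $\mathfrak{h}'=\sum_1^nkh_i$) and define $\langle x,y\rangle$ on $\mathfrak{h}\times\mathfrak{h}$ by

$$\langle x,h_i\rangle=\langle h_i,x\rangle=\varepsilon_i\alpha_i(x)\ (x\in\mathfrak{h}),\qquad\langle y,z\rangle=0\ (y,z\in\mathfrak{h}'')$$

To see that this form is nondegenerate, suppose that $h\in\mathfrak{h}$ is such that $\langle h,\mathfrak{h}\rangle=0$. Then in particular we have $\varepsilon_i\alpha_i(h)=\langle h,h_i\rangle=0$, whence $h\in\bigcap_1^n\mathrm{Ker}\,\alpha_i=\mathfrak{c}\subset\mathfrak{h}'$; thus $h=\sum\lambda_ih_i$ say, and then

$$\sum_1^n\lambda_i\varepsilon_i\alpha_i(x)=\langle h,x\rangle=0$$

for all $x\in\mathfrak{h}$, so that $\sum\lambda_i\varepsilon_i\alpha_i=0$ in $\mathfrak{h}^*$, hence $\lambda_1=\cdots=\lambda_n=0$ and so $h=0$.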

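Recall the principal $\mathbb{Z}$-grading of $\mathfrak{g}$:

$$\mathfrak{g}_r=\sum_{\mathrm{ht}\,\alpha=r}\mathfrak{g}_\alpha;\qquad\mathfrak{g}=\bigoplus_{r\in\mathbb{Z}}\mathfrak{g}_r;\qquad\mathfrak{g}_0=\mathfrak{h}$$

Let $G_n=\sum_{|r|\le n}\mathfrak{g}_r$ for $n\ge0$.

The extension of $\langle x,y\rangle$ to $G_1$ is unique, for by (3.11)(iii) we must have $\langle\mathfrak{g}_\alpha,\mathfrak{g}_\beta\rangle=0$ if $\alpha+\beta\ne0$, and also $\langle e_j,f_j\rangle=\varepsilon_j$. It is straightforward to verify that

$$\langle[x,y],z\rangle=\langle x,[y,z]\rangle\qquad(1)$$

whenever all 5 terms $x$, $y$, $z$, $[x,y]$, $[y,z]$ lie in $G_1$.

We shall now extend $\langle\,,\rangle$ to a symmetric bilinear form on $G_n$ ($n\ge2$) by induction on $n$, such that (1) holds whenever all 5 terms are in $G_n$, and such that

$$\langle\mathfrak{g}_i,\mathfrak{g}_j\rangle=0\quad\text{if}\ i+j\ne0\qquad(2)$$

whenever $|i|\le n$ and $|j|\le n$.

So assume $n\ge2$ and $\langle\,,\rangle$ defined on $G_{n-1}$, satisfying (1) and (2). To extend the form to $G_n$ we have, in view of (2), only to define $\langle x,y\rangle$ on $\mathfrak{g}_n\times\mathfrak{g}_{-n}$. Write

$$x=\sum_i[s_i,t_i]\in\mathfrak{g}_n\quad(3)\qquad y=\sum_j[u_j,v_j]\in\mathfrak{g}_{-n}\quad(4)$$

where $s_i,t_i$ (resp. $u_j,v_j$) are homogeneous of positive (resp. negative) degree, hence lie in $G_{n-1}$. Define now

$$\langle y,x\rangle=\langle x,y\rangle=\sum_j\langle[x,u_j],v_j\rangle.\qquad(5)$$

The whole point is to show that this is well defined, i.e. that it does not depend on the expression (4) for $y$. For this purpose we make the following calculation: dropping the suffixes,

$$\begin{aligned}\langle[[s,t],u],v\rangle&=\langle[s,[t,u]],v\rangle-\langle[t,[s,u]],v\rangle&&\text{(Jacobi)}\\&=-\langle[t,u],[s,v]\rangle+\langle[s,u],[t,v]\rangle&&\text{(invariance)}\\&=-\langle[s,v],[t,u]\rangle-\langle[s,u],[v,t]\rangle&&\text{(symmetry)}\\&=-\langle s,[v,[t,u]]\rangle-\langle s,[u,[v,t]]\rangle&&\text{(invariance)}\\&=\langle s,[t,[u,v]]\rangle&&\text{(Jacobi)}\end{aligned}$$

i.e. we have

$$\langle[[s,t],u],v\rangle=\langle s,[t,[u,v]]\rangle\qquad(6)$$

From (3), (4) and (6) it follows that

$$\sum_j\langle[x,u_j],v_j\rangle=\sum_{i,j}\langle[[s_i,t_i],u_j],v_j\rangle=\sum_{i,j}\langle s_i,[t_i,[u_j,v_j]]\rangle=\sum_i\langle s_i,[t_i,y]\rangle$$

and therefore $\langle x,y\rangle$ (as defined by (5)) is well-defined, and satisfies the invariance condition (1) by our definition (5).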

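Notation. In $\mathfrak{h}$ we have $\langle h_i,x\rangle=\varepsilon_i\alpha_i(x)$, in particular

$$\langle h_i,h_j\rangle=a_{ij}\varepsilon_j=a_{ji}\varepsilon_i$$

and an isomorphism $\theta:\mathfrak{h}\to\mathfrak{h}^*$ defined by

$$\theta(x)(y)=\langle x,y\rangle$$

so that

$$\theta(h_i)(x)=\langle h_i,x\rangle=\varepsilon_i\alpha_i(x)$$

for all $x\in\mathfrak{h}$, whence

$$\theta(h_i)=\varepsilon_i\alpha_i,\qquad\theta^{-1}(\alpha_i)=\varepsilon_i^{-1}h_i=h_i'$$

We use $\theta$ to transport the scalar product from $\mathfrak{h}$ to $\mathfrak{h}^*$: thus

$$\langle\alpha_i,\alpha_j\rangle=\langle\varepsilon_i^{-1}h_i,\varepsilon_j^{-1}h_j\rangle=\varepsilon_i^{-1}a_{ij}=\varepsilon_j^{-1}a_{ji}$$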

Casimir operator

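In the classical situation, where $\mathfrak{g}$ is finite-dimensional, the Casimir operator plays an important role in representation theory. The invariant bilinear form may be regarded as an element $B\in(\mathfrak{g}\otimes\mathfrak{g})^*=\mathfrak{g}^*\otimes\mathfrak{g}^*$; since it is non-degenerate it induces an isomorphism of $\mathfrak{g}^*$ with $\mathfrak{g}$, hence determines an element of $\mathfrak{g}\otimes\mathfrak{g}$. The image of this in $U(\mathfrak{g})$ (which is a quotient of the tensor algebra $T(\mathfrak{g})$) is the Casimir element $\omega$. Since $B$ is invariant it follows that $\omega$ is in the centre of $U(\mathfrak{g})$, hence acts as a scalar on any simple $\mathfrak{g}$-module. Explicitly, if $x_1,\dots,x_n$ is any $k$-basis of $\mathfrak{g}$, let $y_1,\dots,y_n$ be the dual basis (so that $\langle x_i,y_j\rangle=\delta_{ij}$); then $\omega=\sum_1^ny_ix_i$.

In the present situation, where $A$ is any symmetrizable matrix, we proceed as follows. Let $\alpha\in R_+\cup\{0\}$; by (3.11), the bilinear form $\langle x,y\rangle$ is nondegenerate on $\mathfrak{g}_\alpha\times\mathfrak{g}_{-\alpha}$; choose a basis $x_1,\dots,x_m$ of $\mathfrak{g}_\alpha$ ($m=m_\alpha$); let $y_1,\dots,y_m$ be the dual basis of $\mathfrak{g}_{-\alpha}$, and define

$$u_\alpha=\sum_{i=1}^my_ix_i\in U(\mathfrak{g}).$$

Then $u_\alpha$ is independent of the choice of dual bases, for if $x_1',\dots,x_m'$; $y_1',\dots,y_m'$ is another pair of dual bases, we have

$$y_i'=\sum_j\langle x_j,y_i'\rangle y_j,\qquad x_j=\sum_i\langle x_j,y_i'\rangle x_i'$$

and therefore

$$\sum_iy_i'x_i'=\sum_{i,j}\langle x_j,y_i'\rangle y_jx_i'=\sum_jy_jx_j.$$

If $\alpha\in Q_+$ is not a root (or zero) we define $u_\alpha=0$ (the sum is empty).

Example: We have $\langle e_i,f_i\rangle=\varepsilon_i$, hence

$$u_{\alpha_i}=\varepsilon_i^{-1}f_ie_i.\qquad(1)$$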

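Let $x\in\mathfrak{g}_\beta$ ($\beta\in R\cup\{0\}$), then we have

$$[u_\alpha,x]=\sum_{i=1}^m(y_ix_ix-xy_ix_i)=\sum_i[y_i,x]x_i-\sum_iy_i[x,x_i]=v'_{\alpha,x}-v''_{\alpha,x}\quad\text{say}$$

where, for the same reason as before, $v'_{\alpha,x}$ and $v''_{\alpha,x}$ are independent of the choice of dual bases. Since $x\in\mathfrak{g}_\beta$ and $x_i\in\mathfrak{g}_\alpha$ we have $[x,x_i]\in\mathfrak{g}_{\alpha+\beta}$. Let $(x_j')$ be a basis of $\mathfrak{g}_{\alpha+\beta}$, $(y_j')$ the dual basis of $\mathfrak{g}_{-(\alpha+\beta)}$. Then

$$[x,x_i]=\sum_j\langle y_j',[x,x_i]\rangle x_j'=\sum_j\langle[y_j',x],x_i\rangle x_j'\quad\text{(invariance)}$$

and therefore

$$v''_{\alpha,x}=\sum_iy_i[x,x_i]=\sum_{i,j}\langle[y_j',x],x_i\rangle y_ix_j'=\sum_j[y_j',x]x_j'=v'_{\alpha+\beta,x}$$

i.e. we have the formula

$$v''_{\alpha,x}=v'_{\alpha+\beta,x}\quad(x\in\mathfrak{g}_\beta)\qquad(2)$$

or equivalently

$$v'_{\alpha,x}=v''_{\alpha-\beta,x}\quad(x\in\mathfrak{g}_\beta)\qquad(3)$$

In particular, $v''_{\alpha,x}=0$ unless both $\alpha$ and $\alpha+\beta$ are positive roots (or 0); and likewise $v'_{\alpha,x}=0$ unless both $\alpha,\alpha-\beta\in R_+\cup\{0\}$.

Now let

$$u=\sum_{\alpha\in R_+}u_\alpha$$

(in some completion of $U(\mathfrak{g})$...)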

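(3.13) We have

$$[u,e_i]=-h_i'e_i,\qquad[u,f_i]=f_ih_i',\qquad[u,h]=0\ (h\in\mathfrak{h})$$

where $h_i'$ is the image of $\alpha_i$ under the isomorphism $\mathfrak{h}^*\to\mathfrak{h}$ induced by the bilinear form (i.e. $\langle h_i',h\rangle=\alpha_i(h)$, so that $h_i'=\varepsilon_i^{-1}h_i$).

Proof.

We compute:

$$[u,e_i]=\sum_{\alpha\in R_+}[u_\alpha,e_i]=\sum_{\alpha\in R_+}v'_{\alpha,e_i}-\sum_{\alpha\in R_+}v''_{\alpha,e_i}=\sum_{\alpha\in R_+}v'_{\alpha,e_i}-\sum_{\alpha\in R_+}v'_{\alpha+\alpha_i,e_i}\quad\text{by (2)}$$

But $v'_{\alpha,e_i}=0$ unless $\alpha-\alpha_i\in R_+\cup\{0\}$, and therefore

$$[u,e_i]=v'_{\alpha_i,e_i}=\varepsilon_i^{-1}[f_i,e_i]e_i=-\varepsilon_i^{-1}h_ie_i=-h_i'e_i.$$

Similarly we have

$$[u,f_i]=\sum_\alpha v'_{\alpha,f_i}-\sum_{\alpha\in R_+}v''_{\alpha,f_i}=\sum_\alpha v''_{\alpha+\alpha_i,f_i}-\sum_\alpha v''_{\alpha,f_i}=-v''_{\alpha_i,f_i}=-\varepsilon_i^{-1}f_i[f_i,e_i]=\varepsilon_i^{-1}f_ih_i=f_ih_i'.$$

Finally $[u,h]=\sum_\alpha(v'_{\alpha,h}-v''_{\alpha,h})=0$ by (2).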

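Choose an element $\rho\in\mathfrak{h}^*$ such that

$$\rho(h_i)=\tfrac12a_{ii}\quad(1\le i\le n)$$

(thus $\rho(h_i)=1$ if $A$ is a Cartan matrix). Then we have

$$\langle\rho,\alpha_i\rangle=\rho(h_i')=\varepsilon_i^{-1}\rho(h_i)=\tfrac12\varepsilon_i^{-1}a_{ii}=\tfrac12\langle\alpha_i,\alpha_i\rangle$$

i.e.

$$2\langle\rho,\alpha_i\rangle=\langle\alpha_i,\alpha_i\rangle.$$

Now let $M\in\mathcal{O}$ and define a $k$-linear map

$$\Omega=\Omega_M:M\to M$$

as follows: if $v_\lambda\in M_\lambda$ ($\lambda\in P(M)$) then

$$\Omega(v_\lambda)=|\lambda+\rho|^2v_\lambda+2u.v_\lambda$$

where $|\lambda+\rho|^2=\langle\lambda+\rho,\lambda+\rho\rangle$ and

$$u.v_\lambda=\sum_{\alpha\in R_+}u_\alpha.v_\lambda$$

is a finite sum, because $\mathfrak{g}_\alpha.v_\lambda=0$ for almost all $\alpha\in R_+$.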

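(3.14) $\Omega_M$ is a $\mathfrak{g}$-module homomorphism.

Proof.

Since $\mathfrak{g}=\mathfrak{g}(A)$ is generated by the $e_i$, the $f_i$ and $\mathfrak{h}$, it is enough to verify that $\Omega$ commutes with the action of each of these elements. So we calculate:

$$\begin{aligned}\Omega(e_i.v_\lambda)-e_i\Omega(v_\lambda)&=(|\lambda+\alpha_i+\rho|^2-|\lambda+\rho|^2)e_i.v_\lambda+2[u,e_i].v_\lambda\\&=\langle\alpha_i,2\lambda+2\rho+\alpha_i\rangle e_i.v_\lambda-2h_i'.e_i.v_\lambda\quad\text{by (3.13)}\\&=(\langle\alpha_i,2\lambda+2\rho+\alpha_i\rangle-2\langle\alpha_i,\lambda+\alpha_i\rangle)e_i.v_\lambda\\&=\langle\alpha_i,2\rho-\alpha_i\rangle e_i.v_\lambda=0.\end{aligned}$$

Next,

$$\begin{aligned}\Omega(f_i.v_\lambda)-f_i.\Omega(v_\lambda)&=(|\lambda-\alpha_i+\rho|^2-|\lambda+\rho|^2)f_i.v_\lambda+2[u,f_i].v_\lambda\\&=-\langle\alpha_i,2\lambda+2\rho-\alpha_i\rangle f_i.v_\lambda+2f_ih_i'.v_\lambda\quad\text{by (3.13)}\\&=(-\langle\alpha_i,2\lambda+2\rho-\alpha_i\rangle+2\langle\alpha_i,\lambda\rangle)f_i.v_\lambda\\&=-\langle\alpha_i,2\rho-\alpha_i\rangle f_i.v_\lambda=0.\end{aligned}$$

Finally,

$$\Omega(h.v_\lambda)-h.\Omega(v_\lambda)=(|\lambda+\rho|^2-|\lambda+\rho|^2)h.v_\lambda+2[u,h].v_\lambda=0\quad\text{by (3.13) again.}$$

Remark: $\Omega$ is functorial, i.e. if $f:M\to N$ is a $\mathfrak{g}$-module homomorphism (with $M,N\in\mathcal{O}$) then the square formed by $f$, $\Omega_M$ and $\Omega_N$ commutes:

$$\Omega_N\circ f=f\circ\Omega_M.$$

For $f$ commutes with the action of $u$, and preserves weight spaces.

(3.15) Example. Let $M$ be a h.w. module with h. wt. $\lambda$. If $v_\lambda\in M_\lambda$ is a generator of $M$ (3.4), we have $\mathfrak{g}_\alpha.v_\lambda=0$ for all $\alpha\in R_+$, hence $u.v_\lambda=0$ and therefore $\Omega_M.v_\lambda=|\lambda+\rho|^2v_\lambda$. Hence by (3.14) (since $v_\lambda$ generates $M$)

$$\Omega_M=|\lambda+\rho|^2.1_M.$$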

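(3.16) Let $M\in\mathcal{O}$ be such that $\Omega_M=a.1_M$ for some scalar $a$. Let $F$ be a finite subset of $\mathfrak{h}^*$ such that $P(M)\subset D(F)$, and let

$$S=\{\lambda\in D(F):|\lambda+\rho|^2=a\}.$$

Then there exist integers $d_\lambda$, $\lambda\in S$, such that

$$\mathrm{ch}(M)=\Pi^{-1}\sum_{\lambda\in S}d_\lambda e^\lambda$$

where $\Pi=\prod_{\alpha\in R_+}(1-e^{-\alpha})^{m_\alpha}$.

Proof.

If $\mu\in D(F)$ we have $\lambda-\mu\in Q_+$ for some $\lambda\in F$, hence $\mathrm{ht}(\lambda-\mu)$ is an integer $\ge0$. Define the depth of $\mu$ (relative to $F$) to be

$$\delta(\mu)=\max\{\mathrm{ht}(\lambda-\mu):\lambda\in F,\ \mu\in D(\lambda)\}$$

so that $\delta(\mu)\in\mathbb{N}$; also define

$$\delta(M)=\min\{\delta(\mu):\mu\in P(M)\}.$$

Since $F$ is finite there are only finitely many $\mu\in D(F)$ of given depth; in particular, $M$ has only finitely many weights $\mu$ of least depth $\delta(M)$, and they are all maximal weights. Call them $\mu_1,\dots,\mu_r$.

We shall kill the weight spaces $M_{\mu_i}$ ($1\le i\le r$). Let $d_i=\dim M_{\mu_i}$, and let

$$V=\bigoplus_{i=1}^rV(\mu_i)^{d_i}.$$

Choose a $k$-basis of each $M_{\mu_i}$ and let $\varphi:V\to M$ be the $\mathfrak{g}$-homomorphism which maps the generators of the summands of $V$ to the chosen basis elements of the $M_{\mu_i}$. Let $M',M''$ be the kernel and cokernel of $\varphi$, so that we have an exact sequence

$$0\to M'\to V\xrightarrow{\varphi}M\to M''\to0.$$

Then $M'\in\mathcal{O}$ because it is a submodule of $V$, and $M''\in\mathcal{O}$ because it is a quotient of $M$. Now $\Omega$ acts as scalar multiplication by $|\mu_i+\rho|^2$ on $V(\mu_i)^{d_i}$ (3.15), and hence also on the image $\varphi(V(\mu_i)^{d_i})$, which is a nonzero submodule of $M$. Since by hypothesis $\Omega$ acts as scalar multiplication by $a$ on $M$, it follows that $|\mu_i+\rho|^2=a$, i.e. $\mu_i\in S$ ($1\le i\le r$). Hence $\Omega$ acts as $a.1$ on $V$, and hence on $M'$; also on $M''$. By construction we have $\delta(M')>\delta(M)$ and $\delta(M'')>\delta(M)$, and by additivity of $\mathrm{ch}$ (3.9)

$$\mathrm{ch}(M)=\mathrm{ch}(V)+\mathrm{ch}(M'')-\mathrm{ch}(M')=\sum_{i=1}^rd_i\,\mathrm{ch}V(\mu_i)+\mathrm{ch}(M'')-\mathrm{ch}(M').$$

Now repeat the same procedure on $M'$ and $M''$. After we have done it $m$ times we shall have say

$$\mathrm{ch}(M)=\sum_{\mu\in S_m}d_\mu\,\mathrm{ch}V(\mu)+f_m$$

where $S_m$ is some finite subset of $S$, and $\delta(\nu)>m$ for all $\nu\in\mathrm{Supp}(f_m)$. Now let $m\to\infty$ and we have

$$\mathrm{ch}(M)=\sum_{\mu\in S}d_\mu\,\mathrm{ch}V(\mu)=\Pi^{-1}\sum_{\mu\in S}d_\mu e^\mu\quad\text{by (3.10).}$$

Remark. Suppose in particular that $M$ is a h.w. module, with highest weight $\lambda$. Then

$$\mathrm{ch}(M)=\Pi^{-1}\sum_{\substack{\mu\in D(\lambda)\\|\mu+\rho|^2=|\lambda+\rho|^2}}d_\mu e^\mu$$

with $d_\mu\in\mathbb{Z}$, and in particular $d_\lambda=1$.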

The Weyl-Kac character formula

From now on, A is a Cartan matrix.

Let $(M,\pi)$ be a h.w. module, with highest weight $\lambda\in\mathfrak{h}^*$. Then each $\pi(e_i)$ is a locally nilpotent endomorphism of $M$. For if $\mu\in P(M)$, say $\mu=\lambda-\sum_{i=1}^nm_i\alpha_i$, and if $x\in M_\mu$, then $\pi(e_i)^mx\in M_{\mu+m\alpha_i}=0$ if $m>m_i$.

If also each $\pi(f_i)$ is locally nilpotent on $M$, we shall say that $M$ is a quasi-simple $\mathfrak{g}$-module. (Later we shall see that quasi-simple $\Rightarrow$ simple.)

(3.17) Let $(M,\pi)$ be a h.w. module with highest weight $\lambda$, and generator $x\in M_\lambda$. If there exists $k\ge1$ such that $\pi(f_i)^kx=0$ for $1\le i\le n$, then $M$ is quasi-simple.

Proof.

Recall the formula (1.16)

$$x^Ny=\sum_{r=0}^N\binom Nr\big((\mathrm{ad}\,x)^ry\big)x^{N-r}$$

($x,y\in$ associative ring $R$). Let $v\in M$, so that $v=\pi(u)x$ for some $u\in U(\mathfrak{g})$, and apply (1.16) with $x=\pi(f_i)$, $y=\pi(u)$:

$$\pi(f_i)^Nv=\pi(f_i)^N\pi(u)x=\sum_{r=0}^N\binom Nr\pi\big((\mathrm{ad}\,f_i)^ru\big)\pi(f_i)^{N-r}x.$$

Now $\mathrm{ad}\,f_i$ is locally nilpotent on $\mathfrak{g}$ (1.19), hence also on $U(\mathfrak{g})$, so that $(\mathrm{ad}\,f_i)^mu=0$ for some $m\ge1$. Hence if $N$ is large enough ($N=k+m-1$ would do) either $r\ge m$ or $N-r\ge k$ for each $r\in[0,N]$, and so $\pi(f_i)^Nv=0$.
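Formula (1.16) holds in any associative ring, so it can be sanity-checked on small integer matrices. The sketch below does this for one arbitrary pair of $3\times3$ matrices; the helper names (`mul`, `power`, `ad`, `rhs`) are ad hoc, and a single numerical instance is of course an illustration, not a proof.

```python
from math import comb

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * p for p in row] for row in a]

def power(a, m):
    out = [[int(i == j) for j in range(len(a))] for i in range(len(a))]
    for _ in range(m):
        out = mul(out, a)
    return out

def ad(x, y):
    """(ad x) y = xy - yx."""
    return add(mul(x, y), scale(-1, mul(y, x)))

def rhs(x, y, N):
    """sum over r of binom(N, r) * ((ad x)^r y) * x^(N - r)."""
    n = len(x)
    total = [[0] * n for _ in range(n)]
    term = y  # holds (ad x)^r y, starting with r = 0
    for r in range(N + 1):
        total = add(total, scale(comb(N, r), mul(term, power(x, N - r))))
        term = ad(x, term)
    return total

x = [[1, 2, 0], [0, -1, 3], [4, 0, 2]]
y = [[0, 1, 1], [2, 0, -1], [1, 5, 0]]
print(mul(power(x, 4), y) == rhs(x, y, 4))
```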

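(3.18) Let $(M,\pi)$ be a quasi-simple $\mathfrak{g}$-module with highest weight $\lambda$. Then:

  1. $\mathrm{ch}(M)$ is $W$-invariant (as a function on $\mathfrak{h}^*$).

  2. If $\mu\in P(M)$ and $1\le i\le n$, then the set of integers $r$ such that $\mu+r\alpha_i\in P(M)$ is a finite interval $[-p,q]$ in $\mathbb{Z}$, where $p,q\ge0$ and $p-q=\mu(h_i)$.

  3. If $\mu\in P(M)$, then $\mu(h_i)\in\mathbb{Z}$ ($1\le i\le n$).

Proof.

We shall make use of the following formula:

$$e^{\mathrm{ad}\,x}y=e^xye^{-x}$$

for elements $x,y$ of an associative $\mathbb{Q}$-algebra, with $x$ nilpotent (so that $e^x$ is defined). The proof is very simple: we have $\mathrm{ad}\,x=\lambda_x-\rho_x$ (left and right multiplication by $x$), and $\lambda_x,\rho_x$ commute, hence

$$e^{\mathrm{ad}\,x}y=e^{\lambda_x-\rho_x}y=e^{\lambda_x}e^{-\rho_x}y=\lambda_{e^x}\rho_{e^{-x}}y=e^xye^{-x}.$$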
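The identity $e^{\mathrm{ad}\,x}y=e^xye^{-x}$ can be checked exactly for a nilpotent matrix, since all the exponential series terminate. A minimal sketch with rational arithmetic follows; the matrices and helper names (`exp_series` and so on) are arbitrary choices made for illustration.

```python
from fractions import Fraction

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * p for p in row] for row in a]

def exp_series(op, start, steps):
    """sum over k < steps of op^k(start) / k!, for a nilpotent linear map op."""
    total, term = start, start
    for k in range(1, steps):
        term = scale(Fraction(1, k), op(term))   # now op^k(start) / k!
        total = add(total, term)
    return total

I3 = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
x = [[0, 1, 2], [0, 0, 3], [0, 0, 0]]            # strictly upper triangular, x^3 = 0
y = [[1, 0, 2], [3, 1, 0], [4, 5, 6]]
neg_x = scale(-1, x)

exp_x = exp_series(lambda m: mul(x, m), I3, 3)           # e^x
exp_neg_x = exp_series(lambda m: mul(neg_x, m), I3, 3)   # e^{-x}
# (ad x)^5 = 0 because x^3 = 0, so five terms of the series suffice
lhs = exp_series(lambda m: add(mul(x, m), scale(-1, mul(m, x))), y, 5)
rhs = mul(mul(exp_x, y), exp_neg_x)
print(lhs == rhs)
```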
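  1. Let $x\in\mathfrak{g}$. Since $\mathrm{ad}\,e_i$ and $\pi(e_i)$ are locally nilpotent, we have

    $$\pi(e^{\mathrm{ad}\,e_i}x)=e^{\mathrm{ad}\,\pi(e_i)}.\pi(x)=e^{\pi(e_i)}\pi(x)e^{-\pi(e_i)}$$

    by the formula above. Similarly with $e_i$ replaced by $f_i$. Hence if (as in Ch. II) we write

    $$\tilde w_i=e^{\mathrm{ad}\,e_i}e^{-\mathrm{ad}\,f_i}e^{\mathrm{ad}\,e_i}$$

    then we have

    $$\pi(\tilde w_ix)=\theta_i\pi(x)\theta_i^{-1}\qquad(1)$$

    where

    $$\theta_i=e^{\pi(e_i)}e^{-\pi(f_i)}e^{\pi(e_i)}\in GL(M)$$

    Now recall (2.3) that $\tilde w_ih=w_ih$ for $h\in\mathfrak{h}$. It follows from (1) that

    $$\pi(w_ih)=\theta_i\pi(h)\theta_i^{-1}.\qquad(2)$$

    Now let $\mu\in P(M)$, $v\in M_\mu$, $v\ne0$. Then $\pi(h)v=\mu(h)v$ ($h\in\mathfrak{h}$) and therefore

    $$\pi(h)(\theta_i^{-1}v)=\theta_i^{-1}\pi(w_ih)v\ \text{(by (2))}=\theta_i^{-1}(\mu(w_ih)v)=(w_i\mu)(h)\theta_i^{-1}v$$

    (since $\theta_i^{-1}$ is $k$-linear). This calculation shows that $\theta_i^{-1}v\in M_{w_i\mu}$, and hence that $w_i\mu\in P(M)$. Consequently $P(M)$ is $W$-stable; also $\theta_i^{-1}$ takes $M_\mu$ into $M_{w_i\mu}$, so that $\dim M_\mu\le\dim M_{w_i\mu}$; replacing $\mu$ by $w_i\mu$ we get the opposite inequality, hence $\mathrm{ch}(M)$ is $W$-invariant.

  2. Same proof as (2.31) (root strings).

  3. Follows from (b).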

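A linear form $\lambda\in\mathfrak{h}^*$ is integral if $\lambda(h_i)\in\mathbb{Z}$ ($1\le i\le n$); dominant integral if $\lambda(h_i)\in\mathbb{N}$ for $1\le i\le n$.

Let $P$ (resp. $P_+$) denote the set of all integral (resp. dominant integral) $\lambda\in\mathfrak{h}^*$. Notice that each $\alpha_j\in P$, because $\alpha_j(h_i)=a_{ij}\in\mathbb{Z}$: thus $Q\subset P$. (Warning: $Q_+\not\subset P_+$.) Clearly $P_+=P\cap C$ ($C$ the dual fundamental chamber).

(3.19) Let $M$ be a quasi-simple $\mathfrak{g}$-module with highest weight $\lambda$. Then $\lambda\in P_+$. Conversely, if $\lambda\in P_+$ then $L(\lambda)$ is quasi-simple.

Proof.

Recall (1.17)

$$e_if_i^{N+1}=f_i^{N+1}e_i+(N+1)f_i^N(h_i-N).$$

Let $v_\lambda\in M_\lambda$ be a generator of $M$. Since $M$ is quasi-simple, there exists $N\ge0$ such that $f_i^N.v_\lambda\ne0$, $f_i^{N+1}.v_\lambda=0$; also $e_i.v_\lambda=0$, whence

$$0=e_if_i^{N+1}.v_\lambda=(N+1)f_i^N(\lambda(h_i)-N)v_\lambda\qquad(1)$$

and therefore $\lambda(h_i)=N\ge0$. Thus $\lambda\in P_+$. (Notice that this gives another proof of (3.18)(c), namely that $P(M)\subset P$: for if $\mu\in P(M)$, then $\mu\in\lambda-Q_+\subset P$.)

For the second part, let $v_\lambda$ be the generator of $L(\lambda)$ and let $x_i=f_i^{\lambda(h_i)+1}v_\lambda$. I claim that $x_i=0$. For now we have from (1) that $e_if_i^{N+1}v_\lambda=0$ if $N=\lambda(h_i)$, i.e. $e_ix_i=0$; also $e_j.x_i=f_i^{\lambda(h_i)+1}e_j.v_\lambda=0$ if $j\ne i$ (because $e_j,f_i$ then commute). Hence $x_i$ generates a proper submodule of $L(\lambda)$. Since $L(\lambda)$ is simple, we must have $x_i=0$. By (3.17), it follows that $L(\lambda)$ is quasi-simple.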
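In rank 1 the argument can be made completely explicit. By (1.17), in a Verma module with $\lambda(h)=\ell$ one has $e.f^kv_\lambda=k(\ell-k+1)f^{k-1}v_\lambda$ (take $N=k-1$), so $f^kv_\lambda$ is killed by $e$ exactly when $k=\ell+1$. The sketch below, with hypothetical helper names and the basis $v_k=f^kv_\lambda$ assumed, lists these values of $k$ and checks the relation $[e,f]=h$ on the quotient by $f^{\ell+1}v_\lambda$.

```python
def e_on_f_power(lam, k):
    """Coefficient c with e.(f^k v) = c * f^(k-1) v, where h.v = lam * v and e.v = 0."""
    return k * (lam - (k - 1))

def singular_powers(lam, max_k):
    """All k in 1..max_k for which f^k v is killed by e."""
    return [k for k in range(1, max_k + 1) if e_on_f_power(lam, k) == 0]

def check_sl2_relation(lam):
    """Check [e, f] = h on the basis v_k = f^k v (0 <= k <= lam) of the quotient.

    Assumes lam is a non-negative integer; f.v_lam is set to 0 in the quotient.
    """
    for k in range(lam + 1):
        ef = e_on_f_power(lam, k + 1) if k < lam else 0   # coefficient of v_k in e.f.v_k
        fe = e_on_f_power(lam, k)                          # coefficient of v_k in f.e.v_k
        if ef - fe != lam - 2 * k:                         # h.v_k = (lam - 2k) v_k
            return False
    return True

print(singular_powers(3, 10))    # only k = lam + 1 appears for dominant integral lam
print(singular_powers(-2, 10))   # no such k when lam is not dominant
print(check_sl2_relation(3))
```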

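Recall that $\rho\in\mathfrak{h}^*$ was chosen such that $\rho(h_i)=\tfrac12a_{ii}$ ($1\le i\le n$). Since $A$ is now a Cartan matrix, this condition now becomes

$$\rho(h_i)=1\quad(1\le i\le n).$$

Thus $\rho\in P_+$.

For $w\in W$, let $\varepsilon(w)=\det(w)=(-1)^{l(w)}$ (sign character of $W$).

(3.20) $e^\rho\Pi$ is $W$-skew, where $\Pi=\prod_{\alpha\in R_+}(1-e^{-\alpha})^{m_\alpha}$, i.e.

$$w(e^\rho\Pi)=\varepsilon(w).e^\rho\Pi$$

for all $w\in W$.

Proof.

It is enough to verify this when $w=w_i$ is a generator of $W$. We have

$$w_i\rho=\rho-\rho(h_i)\alpha_i=\rho-\alpha_i.$$

On the other hand (2.6), $w_i$ sends $\alpha_i$ to $-\alpha_i$ and permutes the positive roots $\ne\alpha_i$. Thus

$$w_i\Big(e^\rho\prod_{\alpha\in R_+}(1-e^{-\alpha})^{m_\alpha}\Big)=e^{\rho-\alpha_i}(1-e^{\alpha_i})\prod_{\substack{\alpha\in R_+\\\alpha\ne\alpha_i}}(1-e^{-\alpha})^{m_\alpha}\ (\text{since}\ \alpha_i\ \text{has multiplicity 1})=-e^\rho\Pi.$$

Now assume that the Cartan matrix $A$ is symmetrizable. Then the scalar product on $\mathfrak{h}$ and $\mathfrak{h}^*$ is $W$-invariant (2.23), and we have

$$\langle\rho,\alpha_i\rangle=\varepsilon_i^{-1}\rho(h_i)=\varepsilon_i^{-1}>0\quad(1\le i\le n)\qquad(1)$$

For the same reason, if $\lambda\in P_+$ we have

$$\langle\lambda,\alpha_i\rangle=\varepsilon_i^{-1}\lambda(h_i)\ge0.\qquad(2)$$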

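(3.21) Theorem (V. Kac) Let $A$ be a symmetrizable Cartan matrix and let $M$ be a quasi-simple $\mathfrak{g}(A)$-module with highest weight $\lambda$. Then

$$\mathrm{ch}(M)=\Big(\sum_{w\in W}\varepsilon(w)e^{w(\lambda+\rho)}\Big)\Big/e^\rho\prod_{\alpha\in R_+}(1-e^{-\alpha})^{m_\alpha}.$$

Proof.

Write $\Pi=\prod_{\alpha\in R_+}(1-e^{-\alpha})^{m_\alpha}$. From (3.16) we have, writing $d_\mu=c_{\mu+\rho}$,

$$e^\rho\Pi.\mathrm{ch}(M)=\sum_\mu c_{\mu+\rho}e^{\mu+\rho}$$

summed over $\mu\in D(\lambda)$ such that $|\mu+\rho|^2=|\lambda+\rho|^2$, with coefficients $c_{\mu+\rho}\in\mathbb{Z}$ and, in particular, $c_{\lambda+\rho}=1$.

Now $\mathrm{ch}(M)$ is $W$-invariant (3.18) and $e^\rho\Pi$ is $W$-skew (3.20). Hence their product is $W$-skew, and therefore for each $w\in W$ we have

$$\sum_\mu c_{\mu+\rho}e^{\mu+\rho}=\sum_\mu\varepsilon(w)c_{\mu+\rho}e^{w(\mu+\rho)}$$

so that $c_{w(\mu+\rho)}=\varepsilon(w)c_{\mu+\rho}$. Hence if $c_{\mu+\rho}\ne0$ we have $w(\mu+\rho)\le\lambda+\rho$ for all $w\in W$; choose $w$ so that $\mathrm{ht}(\lambda+\rho-w(\mu+\rho))$ is minimal and put $\nu=w(\mu+\rho)$. Then $\mathrm{ht}(\lambda+\rho-w_i\nu)\ge\mathrm{ht}(\lambda+\rho-\nu)$, i.e. $\mathrm{ht}(\nu-w_i\nu)\ge0$ and therefore $\nu(h_i)\ge0$, or equivalently $\langle\nu,\alpha_i\rangle\ge0$.

Thus $\nu$ satisfies

  1. $\langle\nu,\alpha_i\rangle\ge0$ ($1\le i\le n$);

  2. $\nu\le\lambda+\rho$, i.e. $\lambda+\rho=\nu+\sum_1^nm_i\alpha_i$ with coefficients $m_i\ge0$;

  3. $|\nu|^2=|w(\mu+\rho)|^2=|\mu+\rho|^2=|\lambda+\rho|^2$.

These three conditions force $\nu=\lambda+\rho$; for we have

$$0=|\lambda+\rho|^2-|\nu|^2=\langle\lambda+\rho+\nu,\lambda+\rho-\nu\rangle=\Big\langle\lambda+\rho+\nu,\sum m_i\alpha_i\Big\rangle=\sum m_i\langle\lambda+\rho+\nu,\alpha_i\rangle$$

But $\langle\lambda,\alpha_i\rangle\ge0$ by (2) because $\lambda\in P_+$ (3.19); $\langle\rho,\alpha_i\rangle>0$ by (1); and $\langle\nu,\alpha_i\rangle\ge0$ ((i) above). Hence $\langle\lambda+\rho+\nu,\alpha_i\rangle>0$ ($1\le i\le n$), and therefore all coefficients $m_i$ are 0, hence $\nu=\lambda+\rho$ and therefore (since $c_{\lambda+\rho}=1$)

$$\sum_\mu c_{\mu+\rho}e^{\mu+\rho}=\sum_{w\in W}\varepsilon(w)e^{w(\lambda+\rho)}.$$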

Recall (3.19) that $L(\lambda)$ is quasi-simple if $\lambda\in P_+$. The character formula (3.21) shows that if $M$ is a quasi-simple $\mathfrak{g}$-module with highest weight $\lambda$ ($\in P_+$, by (3.19)) then $\mathrm{ch}(M)$ depends only on $\lambda$. It follows that

$$\mathrm{ch}(M)=\mathrm{ch}\,L(\lambda)$$

i.e. $\dim M_\mu=\dim L(\lambda)_\mu$ for all $\mu$. But $L(\lambda)$ is in any case a homomorphic image of $M$, and so we conclude that $M=L(\lambda)$:

(3.22) Every quasi-simple $\mathfrak{g}$-module is simple.

Another corollary of (3.21) is the "denominator formula":

(3.23) For any symmetrizable Cartan matrix we have

$$\sum_{w\in W}\varepsilon(w)e^{w\rho-\rho}=\prod_{\alpha\in R_+}(1-e^{-\alpha})^{m_\alpha}$$

Proof.

Take $\lambda=0$ in (3.21) and observe that $L(0)$ is the trivial 1-dimensional $\mathfrak{g}$-module (3.7), so that $\mathrm{ch}\,L(0)=1$.
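For a Cartan matrix of finite type both sides of (3.23) are finite and can be expanded by machine. The sketch below works in simple-root coordinates with the convention $\alpha_j(h_i)=a_{ij}$, enumerates $W$ through the vectors $\rho-w\rho$ using $\rho-w_iw\rho=\alpha_i+w_i(\rho-w\rho)$ (a consequence of $\rho(h_i)=1$), and compares with the product over positive roots, all of multiplicity 1 in finite type. The function names are illustrative, and the loops terminate only in the finite-type case.

```python
def reflect(A, i, v):
    """w_i(v) = v - v(h_i) * alpha_i, for v in simple-root coordinates."""
    pairing = sum(A[i][j] * v[j] for j in range(len(v)))
    return tuple(c - pairing * (j == i) for j, c in enumerate(v))

def weyl_side(A):
    """Dict d -> eps(w), where d = rho - w.rho in simple-root coordinates."""
    n = len(A)
    start = (0,) * n
    sign = {start: 1}
    frontier = [start]
    while frontier:
        nxt = []
        for d in frontier:
            for i in range(n):
                r = reflect(A, i, d)
                new = tuple(c + (j == i) for j, c in enumerate(r))  # alpha_i + w_i(d)
                if new not in sign:
                    sign[new] = -sign[d]
                    nxt.append(new)
        frontier = nxt
    return sign

def positive_roots(A):
    """Positive roots as the W-orbit of the simple roots (finite type only)."""
    n = len(A)
    simple = [tuple(int(j == i) for j in range(n)) for i in range(n)]
    roots, frontier = set(simple), list(simple)
    while frontier:
        nxt = []
        for v in frontier:
            for i in range(n):
                r = reflect(A, i, v)
                if r not in roots:
                    roots.add(r)
                    nxt.append(r)
        frontier = nxt
    return [v for v in roots if all(c >= 0 for c in v)]

def product_side(A):
    """Expansion of the product of (1 - x^alpha) over positive roots alpha."""
    poly = {(0,) * len(A): 1}
    for beta in positive_roots(A):
        new = dict(poly)
        for nu, c in poly.items():
            shifted = tuple(a + b for a, b in zip(nu, beta))
            new[shifted] = new.get(shifted, 0) - c
        poly = {nu: c for nu, c in new.items() if c != 0}
    return poly

A2 = [[2, -1], [-1, 2]]
B2 = [[2, -1], [-2, 2]]
G2 = [[2, -1], [-3, 2]]
print(all(weyl_side(A) == product_side(A) for A in (A2, B2, G2)))
```

Here a key $(c_1,\dots,c_n)$ stands for the monomial $e^{-(c_1\alpha_1+\cdots+c_n\alpha_n)}$.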

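We can write (3.23) in another form, as follows. Recall that for $w\in W$,

$$S(w)=\{\alpha\in R_+:w^{-1}\alpha\in R_-\}=R_+\cap wR_-$$

is a finite set (2.10). Define

$$s(w)=\sum_{\alpha\in S(w)}\alpha$$

a finite sum of positive roots. We then have the formula (for any Cartan matrix $A$)

(3.24) $s(w)=\rho-w\rho$.

Proof.

If $w=w_{i_1}\cdots w_{i_r}$ is a reduced word for $w$ ($r=l(w)$) then (2.9)

$$S(w)=\{\alpha_{i_1},w_{i_1}\alpha_{i_2},\dots,w_{i_1}\cdots w_{i_{r-1}}\alpha_{i_r}\}$$

from which it follows that if $w'=w_{i_2}\cdots w_{i_r}$

$$S(w)=\{\alpha_{i_1}\}\cup w_{i_1}S(w')$$

i.e. if $w=w_iw'$ with $l(w')=l(w)-1$, then

$$S(w)=\{\alpha_i\}\cup w_iS(w')$$

and therefore

$$s(w)=\alpha_i+w_is(w').\qquad(1)$$

To prove (3.24), we proceed by induction on $l(w)$. The result is clearly true when $l(w)=0$, for then $w=1$, $s(w)=0$. Assume $l(w)>0$ and write $w=w_iw'$ as above, then

$$s(w)=\alpha_i+w_i(\rho-w'\rho)\ \text{(by (1) and the ind. hyp.)}=\alpha_i+(\rho-\rho(h_i)\alpha_i)-w\rho=\rho-w\rho$$

since $\rho(h_i)=1$.

By virtue of (3.24) we can rewrite (3.23) in the form

(3.23')

$$\sum_{w\in W}\varepsilon(w)e^{-s(w)}=\prod_{\alpha\in R_+}(1-e^{-\alpha})^{m_\alpha}.$$

Also from (3.23) we can rewrite the character formula (3.21) in the form

(3.21') Let $\lambda\in P_+$, then

$$\mathrm{ch}\,L(\lambda)=\sum_{w\in W}\varepsilon(w)e^{w(\lambda+\rho)}\Big/\sum_{w\in W}\varepsilon(w)e^{w\rho}.$$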
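Combining (3.21) with (3.10) gives $\mathrm{ch}\,L(\lambda)=\sum_w\varepsilon(w)\,\mathrm{ch}\,V(w(\lambda+\rho)-\rho)$, so each weight multiplicity of $L(\lambda)$ is an alternating sum of Verma multiplicities. The sketch below carries this out for type $A_2$ only. It assumes weights are written in coordinates $(\lambda(h_1),\lambda(h_2))$, lists the six elements of $W$ by reduced words, and uses the closed form $\min(c_1,c_2)+1$ for the number of ways of writing $c_1\alpha_1+c_2\alpha_2$ as a sum of positive roots of $A_2$; all names are illustrative.

```python
from fractions import Fraction

A = [[2, -1], [-1, 2]]     # Cartan matrix of type A2
RHO = (1, 1)               # rho(h_i) = 1
WEYL_WORDS = [(), (0,), (1,), (0, 1), (1, 0), (0, 1, 0)]   # reduced words for W = S3

def reflect(i, lam):
    """w_i(lam) = lam - lam(h_i) * alpha_i, where alpha_i(h_j) = A[j][i]."""
    return tuple(lam[j] - lam[i] * A[j][i] for j in range(2))

def act(word, lam):
    for i in reversed(word):          # the rightmost reflection acts first
        lam = reflect(i, lam)
    return lam

def partitions_into_roots(beta):
    """Number of ways to write beta as a sum of positive roots of A2."""
    # convert beta = c1*alpha1 + c2*alpha2 to the coefficients (c1, c2)
    c1 = Fraction(2 * beta[0] + beta[1], 3)
    c2 = Fraction(beta[0] + 2 * beta[1], 3)
    if c1.denominator != 1 or c2.denominator != 1 or c1 < 0 or c2 < 0:
        return 0
    return int(min(c1, c2)) + 1

def multiplicity(lam, mu):
    """dim L(lam)_mu for dominant integral lam, as an alternating sum over W."""
    shifted = tuple(l + r for l, r in zip(lam, RHO))
    total = 0
    for word in WEYL_WORDS:
        top = act(word, shifted)
        beta = tuple(t - m - r for t, m, r in zip(top, mu, RHO))
        total += (-1) ** len(word) * partitions_into_roots(beta)
    return total

# The adjoint representation has highest weight alpha1 + alpha2 = (1, 1)
print(multiplicity((1, 1), (0, 0)))
print(sum(multiplicity((1, 1), (a, b)) for a in range(-3, 4) for b in range(-3, 4)))
```

For the adjoint representation this recovers the two-dimensional zero weight space and the total dimension 8.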

(3.23) is a statement about the root system R and the Weyl group W; it may be formally inverted to give a formula for the multiplicities mα (recall that all real roots have multiplicity 1; the imaginary roots may have multiplicities mα>1).


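Examples.

  1. Suppose $A$ is of finite type, so that $\mathfrak{g}(A)$ is finite-dimensional and $R$ is finite. In that case

    $$\rho=\tfrac12\sum_{\alpha\in R_+}\alpha\qquad(1)$$

    For if $\delta$ is $\tfrac12$ the sum of the positive roots then by (2.6)

    $$w_i\delta=\tfrac12\sum_{\alpha\in R_+}w_i\alpha=\tfrac12\Big(-\alpha_i+\sum_{\substack{\alpha\in R_+\\\alpha\ne\alpha_i}}\alpha\Big)=\delta-\alpha_i;$$

    but on the other hand $w_i\delta=\delta-\delta(h_i)\alpha_i$, so that $\delta(h_i)=1$ for all $i$. Since $A$ is nonsingular, $\mathfrak{h}$ is spanned by $h_1,\dots,h_n$ and therefore $\delta=\rho$.

    The denominator formula (3.23) now reads

    $$\sum_{w\in W}\varepsilon(w)e^{w\rho}=e^\rho\prod_{\alpha\in R_+}(1-e^{-\alpha})=\prod_{\alpha\in R_+}(e^{\alpha/2}-e^{-\alpha/2})$$

    by virtue of (1). It is a polynomial identity in the group ring $\mathbb{Z}[\tfrac12Q]$.

    For a specific example, take $A$ of type $A_{n-1}$. Let $u_1,\dots,u_n$ be the standard basis of $\mathbb{R}^n$, then the roots may be taken to be $u_i-u_j$ ($i\ne j$) and the positive roots $u_i-u_j$ ($i<j$). Thus

    $$\rho=\tfrac12\sum_{i<j}(u_i-u_j)=\tfrac12\sum_{i=1}^n(n+1-2i)u_i$$

    Put $x_i=e^{u_i}$; the Weyl group $W$ is here the symmetric group $S_n$ acting by permuting the $u_i$ (or, equivalently, the $x_i$). We have

    $$e^\rho=x_1^{\frac{n-1}2}x_2^{\frac{n-3}2}\cdots x_n^{\frac{1-n}2}=(x_1\cdots x_n)^{\frac{1-n}2}x_1^{n-1}x_2^{n-2}\cdots x_{n-1}$$

    and therefore

    $$\sum_{w\in W}\varepsilon(w)e^{w\rho}=(x_1\cdots x_n)^{\frac{1-n}2}\Delta(x_1,\dots,x_n)$$

    where $\Delta(x_1,\dots,x_n)=\det(x_i^{n-j})$ is the Vandermonde determinant. On the other side,

    $$e^\rho\prod_{\alpha\in R_+}(1-e^{-\alpha})=e^\rho\prod_{i<j}(1-x_i^{-1}x_j)=(x_1\cdots x_n)^{\frac{1-n}2}\prod_{i<j}(x_i-x_j)$$

    and therefore the "denominator formula" in this case reduces to the familiar factorization of the Vandermonde determinant:

    $$\Delta(x_1,\dots,x_n)=\prod_{i<j}(x_i-x_j).$$

    So it is an essentially trivial polynomial identity in this case.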

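  2. Let $A=\begin{pmatrix}2&-2\\-2&2\end{pmatrix}$, of affine type. The Weyl group $W$ is infinite dihedral, generated by reflections $w_1,w_2$. So its elements are 1 and

    $$w_1w_2w_1\cdots\ (\text{to}\ r\ \text{terms, all}\ r\ge1),\qquad w_2w_1w_2\cdots\ (\text{to}\ r\ \text{terms, all}\ r\ge1)$$

    We have $w_1(\alpha_1)=-\alpha_1$, $w_2(\alpha_1)=\alpha_1-\alpha_1(h_2)\alpha_2=\alpha_1+2\alpha_2$ and likewise $w_1(\alpha_2)=2\alpha_1+\alpha_2$, $w_2(\alpha_2)=-\alpha_2$. So we get the following picture:

    [Figure: the positive real roots $\alpha_1,\alpha_2,w_1\alpha_2,w_2\alpha_1,w_2w_1\alpha_2,w_1w_2\alpha_1,\dots$ together with the imaginary roots $\delta,2\delta,3\delta,\dots$]

    Now if $w=w_1w_2w_1\cdots$ to $r$ terms then

    $$s(w)=\alpha_1+w_1\alpha_2+w_1w_2\alpha_1+\cdots\ (\text{to}\ r\ \text{terms, by (2.8)})=\alpha_1+(2\alpha_1+\alpha_2)+(3\alpha_1+2\alpha_2)+\cdots=\tfrac12r(r+1)\alpha_1+\tfrac12r(r-1)\alpha_2$$

    so that if we put $x_1=e^{-\alpha_1}$, $x_2=e^{-\alpha_2}$ we have

    $$\sum_{w\in W}\varepsilon(w)e^{-s(w)}=1+\sum_{r=1}^\infty(-1)^r\Big(x_1^{\frac12r(r+1)}x_2^{\frac12r(r-1)}+x_1^{\frac12r(r-1)}x_2^{\frac12r(r+1)}\Big)=\sum_{r\in\mathbb{Z}}(-1)^rx_1^{\frac12r(r+1)}x_2^{\frac12r(r-1)}$$

    On the other hand, the positive roots are

    $$r\alpha_1+(r-1)\alpha_2,\quad r\alpha_1+r\alpha_2,\quad(r-1)\alpha_1+r\alpha_2\quad(r\ge1)$$

    Here $r\alpha_1+r\alpha_2=r\delta$ is an imaginary root; in fact (as we shall see later) it has multiplicity 1. So we obtain the identity

    $$\sum_{r\in\mathbb{Z}}(-1)^rx_1^{\frac12r(r+1)}x_2^{\frac12r(r-1)}=\prod_{r=1}^\infty(1-x_1^rx_2^{r-1})(1-x_1^{r-1}x_2^r)(1-x_1^rx_2^r)$$

    in $\mathbb{Z}[[x_1,x_2]]$. If we put $x_1x_2=t$, $x_1=x$ it takes the form

    $$\sum_{r\in\mathbb{Z}}(-1)^rx^rt^{\frac12r(r-1)}=\prod_{r=1}^\infty(1-xt^{r-1})(1-x^{-1}t^r)(1-t^r)$$

    and is due to Jacobi (and earlier, unpublished, to Gauss): it is called Jacobi's triple product identity and it can be specialized in various ways, at least two of which are worth notice:

    1. Replace $(x,t)$ by $(x,x^3)$ and we get

      $$\prod_{n=1}^\infty(1-x^n)=\sum_{r\in\mathbb{Z}}(-1)^rx^{\frac12r(3r-1)}$$

      Euler's pentagonal number theorem.

    2. Divide both sides by $1-x$ and then set $x=1$. On the product side we get $\prod_{r=1}^\infty(1-t^r)^3$, and on the sum side

      $$\sum_{r\ge1}(-1)^rt^{\frac12r(r-1)}\frac{x^r-x^{1-r}}{1-x}\ \longrightarrow\ \sum_{r\ge1}(-1)^{r+1}(2r-1)t^{\frac12r(r-1)}=\sum_{r\ge0}(-1)^r(2r+1)t^{\frac12r(r+1)}$$

      Thus we obtain another famous identity due to Jacobi:

      $$\prod_{r=1}^\infty(1-t^r)^3=1-3t+5t^3-7t^6+9t^{10}-11t^{15}+\cdots$$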
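  3. Finally, if $A$ is symmetrizable and of indefinite type, the denominator formula can be used to compute the multiplicities of the imaginary roots. For a simple example take $A=\begin{pmatrix}2&-3\\-3&2\end{pmatrix}$. The Weyl group $W$ is infinite dihedral:

    $$w_1(\alpha_1)=-\alpha_1,\quad w_2(\alpha_1)=\alpha_1+3\alpha_2,\quad w_1(\alpha_2)=\alpha_2+3\alpha_1,\quad w_2(\alpha_2)=-\alpha_2$$

    So the real roots are $\alpha_1,\alpha_2,\alpha_1+3\alpha_2,\alpha_2+3\alpha_1,\dots$

    $$w_1(\alpha_1+3\alpha_2)=-\alpha_1+3(\alpha_2+3\alpha_1)=8\alpha_1+3\alpha_2,\qquad w_1(3\alpha_1+8\alpha_2)=-3\alpha_1+8(\alpha_2+3\alpha_1)=21\alpha_1+8\alpha_2$$

    the coefficients of which we recognise as Fibonacci numbers: the positive real roots are

    $$f_{2k}\alpha_1+f_{2k-2}\alpha_2;\quad f_{2k-2}\alpha_1+f_{2k}\alpha_2\quad(k\ge1)$$

    ($f_0=0$, $f_1=1$, $f_r=f_{r-1}+f_{r-2}$).

    From this we calculate easily

    $$s(\underbrace{w_1w_2\cdots}_{k\ \text{terms}})=(f_{2k+1}-1)\alpha_1+(f_{2k-1}-1)\alpha_2$$

    so that the series $\sum_{w\in W}\varepsilon(w)e^{-s(w)}$ is

    $$1-x_1-x_2+x_1^4x_2+x_1x_2^4-x_1^{12}x_2^4-x_1^4x_2^{12}+x_1^{33}x_2^{12}+x_1^{12}x_2^{33}-\cdots$$

    ($x_i=e^{-\alpha_i}$), i.e. it is a 'sparse' power series in $x_1,x_2$. By factorizing it as a product of factors $(1-x_1^{k_1}x_2^{k_2})^{m_{k_1,k_2}}$ we compute the multiplicities of the roots:

    $$\sum_{w\in W}\varepsilon(w)e^{-s(w)}=(1-x_1)(1-x_2)(1-x_1x_2)\cdots$$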
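The factorization described in example 3 can be carried out mechanically up to a chosen total degree. The sketch below builds the series $\sum\varepsilon(w)e^{-s(w)}$ from the recursion $s(w_iw')=\alpha_i+w_is(w')$ for an infinite dihedral Weyl group, then peels off factors $(1-x_1^{a}x_2^{b})^{m}$ in order of increasing degree. The function names, the truncation by total degree, and the restriction to rank 2 with $a_{12}a_{21}\ge4$ are assumptions of this illustration.

```python
def weyl_series(a12, a21, max_deg):
    """Terms of sum over w of eps(w) * x1^c1 * x2^c2 with s(w) = c1*alpha1 + c2*alpha2.

    Assumes W is infinite dihedral (a12 * a21 >= 4), so that every alternating
    word in w1, w2 is reduced and s(w_i w') = alpha_i + w_i s(w').
    """
    def reflect(i, v):
        c1, c2 = v
        if i == 1:
            return (-c1 - a12 * c2, c2)   # w1: alpha2 -> alpha2 - a12 * alpha1
        return (c1, -c2 - a21 * c1)       # w2: alpha1 -> alpha1 - a21 * alpha2

    series = {(0, 0): 1}
    for first in (1, 2):
        s, i, sign = (0, 0), first, 1
        while True:
            r = reflect(i, s)
            s = (r[0] + (i == 1), r[1] + (i == 2))   # add alpha_i
            sign = -sign
            if sum(s) > max_deg:
                break
            series[s] = sign
            i = 3 - i                                 # alternate between w1 and w2
    return series

def root_multiplicities(series, max_deg):
    """Exponents m with series = product of (1 - x^e)^m, up to degree max_deg."""
    exps = sorted(((a, b) for a in range(max_deg + 1) for b in range(max_deg + 1)
                   if 0 < a + b <= max_deg), key=lambda e: (e[0] + e[1], e))
    rest = {e: series.get(e, 0) for e in exps}
    rest[(0, 0)] = 1
    mult = {}
    for e in exps:
        m = -rest[e]      # lowest remaining coefficient is minus the multiplicity
        if m == 0:
            continue
        mult[e] = m
        for _ in range(m):
            # divide by (1 - x^e), working upwards in degree
            for f in exps:
                prev = (f[0] - e[0], f[1] - e[1])
                if prev in rest:
                    rest[f] += rest[prev]
    return mult

hyperbolic = root_multiplicities(weyl_series(-3, -3, 6), 6)
affine = root_multiplicities(weyl_series(-2, -2, 6), 6)
print(hyperbolic[(2, 3)], hyperbolic[(3, 3)])
print(sorted(affine))
```

With the matrix of example 2 the same functions return multiplicity 1 for every positive root of total degree at most 6, in agreement with the triple product identity.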


I.G. Macdonald
Isaac Newton Institute for the Mathematical Sciences
20 Clarkson Road
Cambridge CB3 0EH U.K.

Version: October 30, 2001
