Linear Discriminant Analysis (LDA) - C-Classes
■ Linear Discriminant Analysis (LDA) - C-Classes
In the previous post we looked at LDA for discriminating between two classes. How, then, can we discriminate among multiple (C) classes? The approach is similar to the two-class LDA method.
If we have n feature vectors, the projection can be written as Y = W^T X, where Y is the output (projected) vector, X is the input vector, and W is the transformation matrix.
- That is, applying LDA with C classes to an m-by-n input matrix (m features, n samples) produces a (C-1)-by-n output array, as the quick dimension check below illustrates. The important point is that the optimal transformation matrix must be computed for each class (see the class-dependent variant described later).
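As a quick dimension check, here is a minimal MATLAB sketch (the data and variable names are purely illustrative and are not part of the example below) showing that projecting an m-by-n input with C = 3 classes yields a (C-1)-by-n output:

%illustrative only: m = 2 features, n = 1000 samples, C = 3 classes
X = randn(2, 1000);   %m-by-n input matrix
W = randn(2, 2);      %m-by-(C-1) transformation matrix
Y = W' * X;           %(C-1)-by-n projected output
size(Y)               %returns [2 1000]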
The steps for LDA analysis of input vectors belonging to C classes are as follows.
1. Compute statistics in the original data space
step 1: compute the within-class scatter
step 2: compute the between-class scatter
2. Compute statistics in the projected data space
step 1: compute the mean vectors
step 2: compute the within-class scatter
step 3: compute the between-class scatter
3. Find the optimal transformation matrix via the objective function
W* = argmax_W |W^T S_B W| / |W^T S_W W| is the optimal transformation matrix (Fisher's criterion, where S_W is the within-class scatter matrix and S_B is the between-class scatter matrix)
- The optimal transformation matrix is formed from the eigenvectors corresponding to the largest eigenvalues obtained by solving the generalized eigenvalue problem S_W^{-1} S_B w = lambda w
- With C classes there are at most C-1 meaningful projection vectors (the rank of S_B is at most C-1)
4. Dimensionality reduction
In addition, the LDA approach comes in two variants (a generic MATLAB sketch of steps 1-4 follows this list):
- Class-dependent: a separate transformation matrix is computed for each class
- Class-independent: a single transformation matrix is computed for all classes
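Before the concrete three-class example, here is a minimal sketch of the steps above for an arbitrary number of classes, written for the class-independent variant. It assumes the data is supplied as a cell array named classes, each cell holding one class as a (features x samples) matrix; the names classes, ClassMu, ProjMu, and Y are illustrative only, not from any toolbox.

%classes{i}: d-by-Ni matrix holding the samples of class i (illustrative name)
C = numel(classes);
d = size(classes{1}, 1);
Sw = zeros(d, d);
SB = zeros(d, d);
Mu = zeros(d, 1);
N  = 0;

%step 1: per-class means and within-class scatter in the original space
for i = 1:C
    ClassMu{i} = mean(classes{i}, 2);
    Sw = Sw + cov(classes{i}');
    Mu = Mu + sum(classes{i}, 2);
    N  = N + size(classes{i}, 2);
end
Mu = Mu ./ N;   %overall mean

%step 1 (continued): between-class scatter
for i = 1:C
    Ni = size(classes{i}, 2);
    SB = SB + Ni .* (ClassMu{i} - Mu) * (ClassMu{i} - Mu)';
end

%step 3: solve the generalized eigenvalue problem and sort by eigenvalue
[V, D] = eig(Sw \ SB);
[~, order] = sort(diag(D), 'descend');
k = min(C-1, d);                %at most C-1 useful projection vectors
W = V(:, order(1:k));

%step 4: dimensionality reduction - project every class onto the new axes
for i = 1:C
    Y{i} = W' * classes{i};     %(C-1)-by-Ni projected samples
end

%step 2: statistics in the projected space (e.g. projected class means)
for i = 1:C
    ProjMu{i} = mean(Y{i}, 2);
end

%(the class-dependent variant would instead solve eig(Si \ SB) separately
% for each class, using that class's own within-class scatter Si)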
5. Example
Generate samples for the three classes to be used for LDA:
%clear
clear

%dataset Generation
%let the center of all classes be
Mu = [5;5];

%%for the first class
Mu1 = [Mu(1)-3; Mu(2)+7];
CovM1 = [5 -1; -3 3];

%%for the second class
Mu2 = [Mu(1)-2.5; Mu(2)-3.5];
CovM2 = [4 0; 0 4];

%%for the third class
Mu3 = [Mu(1)+7; Mu(2)+5];
CovM3 = [3.5 1; 3 2.5];

%generating feature vectors using the Box-Muller approach
%Generate a random variable following uniform(0,1) having two features and
%1000 feature vectors
U = rand(2,1000);

%Extracting two independent uniform random variables from the generated
%uniform random variable
u1 = U(:,1:2:end);
u2 = U(:,2:2:end);

%Using u1 and u2, apply the Box-Muller method to generate feature
%vectors following the standard normal distribution
X = sqrt((-2).*log(u1)).*(cos(2*pi.*u2));
clear u1 u2 U;

%now ... manipulate the generated N(0,1) features to follow a certain
%mean and covariance other than the standard normal
%First we change the variance, then we change the mean
%Getting the eigenvectors and eigenvalues of the covariance matrix
[V,D] = eig(CovM1); %D is the eigenvalue matrix and V is the eigenvector matrix
newX = X;
for j=1:size(X,2)
    newX(:,j) = V*sqrt(D)*X(:,j);
end
%changing its mean
newX = newX + repmat(Mu1, 1, size(newX,2));
%now our dataset matrix for the first class will be
X1 = newX; %each column is a feature vector, each row is a single feature

%...do the same for the other two classes with different means and covariance matrices
[V,D] = eig(CovM2);
newX = X;
for j=1:size(X,2)
    newX(:,j) = V*sqrt(D)*X(:,j);
end
newX = newX + repmat(Mu2, 1, size(newX,2));
X2 = newX;

[V,D] = eig(CovM3);
newX = X;
for j=1:size(X,2)
    newX(:,j) = V*sqrt(D)*X(:,j);
end
newX = newX + repmat(Mu3, 1, size(newX,2));
X3 = newX;

%draw 2d scatter plot
figure;
hold on;
plot(X1(1,:), X1(2,:), 'ro', 'markersize',10, 'linewidth', 3);
plot(X2(1,:), X2(2,:), 'go', 'markersize',10, 'linewidth', 3);
plot(X3(1,:), X3(2,:), 'bo', 'markersize',10, 'linewidth', 3);
Running the code above produces the output shown below (a scatter plot of the three generated classes).
The LDA MATLAB code is as follows.
%% computing the LDA
%class means
Mu1 = mean(X1')';
Mu2 = mean(X2')';
Mu3 = mean(X3')';

%overall mean
Mu = (Mu1+Mu2+Mu3)./3;

%class covariance matrices
S1 = cov(X1');
S2 = cov(X2');
S3 = cov(X3');

%within-class scatter matrix
Sw = S1+S2+S3;

%number of samples of each class
N1 = size(X1, 2);
N2 = size(X2, 2);
N3 = size(X3, 2);

%between-class scatter matrix
SB1 = N1.*(Mu1-Mu)*(Mu1-Mu)';
SB2 = N2.*(Mu2-Mu)*(Mu2-Mu)';
SB3 = N3.*(Mu3-Mu)*(Mu3-Mu)';
SB = SB1+SB2+SB3;

%computing the LDA projection
invSw = inv(Sw);
invSw_by_SB = invSw*SB;

%getting the projection vectors
%[V,D]=eig(X) produces a diagonal matrix D of eigenvalues and a
%full matrix V whose columns are the corresponding eigenvectors
[V,D] = eig(invSw_by_SB);

%eig does not sort the eigenvalues, so reorder the eigenvectors
%by descending eigenvalue: the first column then corresponds to LD1
[~, order] = sort(diag(D), 'descend');
V = V(:, order);
D = D(order, order);

%the projection vectors - we will have at most C-1 projection vectors,
%from which we can choose the most important ones ranked by their
%corresponding eigenvalues ... let's investigate the two projection vectors
W1 = V(:,1);
W2 = V(:,2);

%%let's visualize them...
%we will plot the scatter plot to better visualize the features
hfig = figure;
axes1 = axes('Parent',hfig,'FontWeight','bold','FontSize',12);
hold('all');

%Create xlabel
xlabel('X_1 - the first feature', 'FontWeight','bold', 'FontSize',12, 'FontName','Garamond');
%Create ylabel
ylabel('X_2 - the second feature', 'FontWeight','bold', 'FontSize',12, 'FontName','Garamond');

%the first class
scatter(X1(1,:), X1(2,:), 'r', 'LineWidth',2, 'Parent',axes1);
hold on
%the second class
scatter(X2(1,:), X2(2,:), 'g', 'LineWidth',2, 'Parent',axes1);
hold on
%the third class
scatter(X3(1,:), X3(2,:), 'b', 'LineWidth',2, 'Parent',axes1);
hold on

%drawing the projection vectors
%the first vector
t = -10:25;
line_x1 = t.*W1(1);
line_y1 = t.*W1(2);
%the second vector
t = -5:20;
line_x2 = t.*W2(1);
line_y2 = t.*W2(2);

plot(line_x1, line_y1, 'k-', 'LineWidth',3);
hold on
plot(line_x2, line_y2, 'm-', 'LineWidth',3);
hold on

%projecting the data samples along the projection axes
%the first projection vector
y1_w1 = W1'*X1;
y2_w1 = W1'*X2;
y3_w1 = W1'*X3;

%projection limits
minY = min([min(y1_w1), min(y2_w1), min(y3_w1)]);
maxY = max([max(y1_w1), max(y2_w1), max(y3_w1)]);
y_w1 = minY:0.05:maxY;

%for visualization, compute the probability density function of each
%class after projection, evaluated on the common grid y_w1
%(mvnpdf expects a covariance, i.e. the variance in the 1-D case)
%the first class
y1_w1_Mu = mean(y1_w1);
y1_w1_sigma = std(y1_w1);
y1_w1_pdf = mvnpdf(y_w1', y1_w1_Mu, y1_w1_sigma^2);
%the second class
y2_w1_Mu = mean(y2_w1);
y2_w1_sigma = std(y2_w1);
y2_w1_pdf = mvnpdf(y_w1', y2_w1_Mu, y2_w1_sigma^2);
%the third class
y3_w1_Mu = mean(y3_w1);
y3_w1_sigma = std(y3_w1);
y3_w1_pdf = mvnpdf(y_w1', y3_w1_Mu, y3_w1_sigma^2);
The black line is the LD1 axis, determined by the eigenvector with the largest eigenvalue, and the magenta line is the LD2 axis corresponding to the next eigenvalue. The listing also computes the PDFs of the classes after the dimensionality reduction; they should be plotted as well, and one way to do that is sketched below.
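A minimal sketch for plotting those densities, assuming the listing above where y1_w1_pdf, y2_w1_pdf, and y3_w1_pdf are evaluated on the common grid y_w1:

%plot the class-conditional densities of the data projected onto W1
figure;
hold on;
plot(y_w1, y1_w1_pdf, 'r-', 'LineWidth', 2);   %first class
plot(y_w1, y2_w1_pdf, 'g-', 'LineWidth', 2);   %second class
plot(y_w1, y3_w1_pdf, 'b-', 'LineWidth', 2);   %third class
xlabel('y = W_1^T x  (projection onto LD1)');
ylabel('estimated pdf');
legend('class 1', 'class 2', 'class 3');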
I now have a better sense of how to approach LDA. The complete MATLAB source is simply the two listings above (data generation followed by the LDA computation) run in sequence.