[Text Analysis] TensorFlow
Imports
import tensorflow as tf
from tensorflow.keras.layers import Dense  # only Dense is used in this section
Model definition
model = tf.keras.Sequential([
    Dense(1, activation='sigmoid')  # a single sigmoid unit, i.e. logistic regression
])
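A single `Dense(1, activation='sigmoid')` unit takes a weighted sum of the input word counts, adds a bias, and squashes the result into a probability. The pure-Python sketch below illustrates that computation; the weights, bias, and input vector are made-up values for illustration, not taken from the trained model.

```python
import math

def sigmoid(z):
    """Map any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_sigmoid(x, weights, bias):
    """Forward pass of one sigmoid unit: sigmoid(w . x + b)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

# Hypothetical 3-word vocabulary and bag-of-words counts (illustrative only).
weights = [0.25, -0.20, 0.0]
bias = 0.0
x = [1, 0, 2]  # word 0 appears once, word 2 appears twice

p = dense_sigmoid(x, weights, bias)
print(round(p, 4))
```

A positive weight pushes the probability above 0.5 when its word is present, and a negative weight pushes it below.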
Training configuration
model.compile(loss='binary_crossentropy', metrics=['accuracy'])
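To make the `binary_crossentropy` loss concrete, here is a minimal hand-written version. It is a sketch rather than Keras's own implementation: the `eps` clipping is a guard chosen for this example, and the labels and probabilities are invented. A model that outputs 0.5 for every example scores ln 2 (about 0.693), which is roughly where an untrained model starts.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Invented labels and predictions, for illustration only.
y_true = [1, 0, 1, 0]
uninformed = [0.5, 0.5, 0.5, 0.5]
confident = [0.9, 0.1, 0.8, 0.2]

print(binary_crossentropy(y_true, uninformed))  # equals ln 2, about 0.693
print(binary_crossentropy(y_true, confident))   # lower, because predictions match labels
```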
Training
model.fit(x_train.toarray(), y_train, epochs=10)  # densify the sparse matrix (.A is deprecated in recent SciPy)
Epoch 1/10
25/25 [==============================] - 0s 1ms/step - loss: 0.6970 - accuracy: 0.4725
Epoch 2/10
25/25 [==============================] - 0s 792us/step - loss: 0.6878 - accuracy: 0.5337
Epoch 3/10
25/25 [==============================] - 0s 834us/step - loss: 0.6794 - accuracy: 0.6125
Epoch 4/10
25/25 [==============================] - 0s 917us/step - loss: 0.6713 - accuracy: 0.6712
Epoch 5/10
25/25 [==============================] - 0s 795us/step - loss: 0.6635 - accuracy: 0.7175
Epoch 6/10
25/25 [==============================] - 0s 792us/step - loss: 0.6557 - accuracy: 0.7700
Epoch 7/10
25/25 [==============================] - 0s 1ms/step - loss: 0.6479 - accuracy: 0.8062
Epoch 8/10
25/25 [==============================] - 0s 709us/step - loss: 0.6403 - accuracy: 0.8363
Epoch 9/10
25/25 [==============================] - 0s 856us/step - loss: 0.6329 - accuracy: 0.8537
Epoch 10/10
25/25 [==============================] - 0s 813us/step - loss: 0.6257 - accuracy: 0.8600
<keras.src.callbacks.History at 0x12297826800>
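The `25/25` in each log line is the number of mini-batches per epoch. As a rough sketch, assuming Keras's default `batch_size` of 32 and a split of 800 training and 200 test examples (the sizes are inferred from the logs, not stated in the text), the counts work out as follows.

```python
import math

def steps_per_epoch(n_samples, batch_size=32):
    """Number of mini-batches needed to cover the data once."""
    return math.ceil(n_samples / batch_size)

# Assumed dataset sizes (not stated in the source).
print(steps_per_epoch(800))  # 25
print(steps_per_epoch(200))  # 7
```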
Evaluation
model.evaluate(x_test.toarray(), y_test)  # returns [loss, accuracy]
7/7 [==============================] - 0s 1000us/step - loss: 0.6529 - accuracy: 0.7300
[0.6529121994972229, 0.7300000190734863]
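Weights
weight, bias = model.trainable_weights  # kernel of shape (vocab_size, 1) and bias of shape (1,)
word_weight = pd.DataFrame({
    '단어': cv.get_feature_names_out(),  # column "word": vocabulary of the fitted vectorizer cv
    '가중치': weight.numpy().ravel()      # column "weight": one learned weight per word
})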
Positive words
word_weight.sort_values('가중치', ascending=False).head(10)
index | 단어 (word) | 가중치 (weight) |
---|---|---|
153 | delicious | 0.253083 |
365 | loved | 0.167681 |
268 | great | 0.166258 |
12 | amazing | 0.164991 |
221 | fantastic | 0.163819 |
364 | love | 0.161932 |
24 | atmosphere | 0.158757 |
312 | ice | 0.155323 |
102 | chef | 0.151462 |
56 | best | 0.139563 |
Negative words
word_weight.sort_values('가중치').head(10)
index | 단어 (word) | 가중치 (weight) |
---|---|---|
37 | bad | -0.211356 |
65 | bland | -0.182218 |
977 | worst | -0.163735 |
266 | got | -0.156488 |
937 | wanted | -0.147896 |
928 | waited | -0.147509 |
964 | won | -0.146980 |
546 | probably | -0.145420 |
825 | think | -0.144671 |
929 | waiter | -0.143003 |
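Because the model is a single sigmoid unit, these weights can be read as additive evidence: each occurrence of a word shifts the pre-activation score by that word's weight. The sketch below scores two made-up reviews using a handful of weights copied from the tables above. The trained bias is not shown in the text, so it is assumed to be 0.0 here; words outside the small dictionary are ignored, and the whitespace tokenizer is a simplification of what the vectorizer does.

```python
import math

# Weights copied from the tables above.
word_weights = {
    'delicious': 0.253083, 'great': 0.166258, 'amazing': 0.164991,
    'bad': -0.211356, 'bland': -0.182218, 'worst': -0.163735,
}
BIAS = 0.0  # assumption: the trained bias is not reported in the text

def score(review):
    """Return the estimated probability that a review is positive."""
    tokens = review.lower().split()  # simplified tokenizer (assumption)
    z = BIAS + sum(word_weights.get(tok, 0.0) for tok in tokens)
    return 1.0 / (1.0 + math.exp(-z))

print(score('great food and amazing delicious dessert'))  # above 0.5
print(score('bland food and the worst service'))          # below 0.5
```

Words such as "got" or "probably" carry negative weight in the table even though they are not negative in themselves, which suggests the model is picking up on co-occurrence patterns in a small training set.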
Prediction
prob = model.predict(x_test.toarray())  # probability of the positive class, shape (n_samples, 1)
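`model.predict` returns one probability per review rather than a class label. A common way to get labels is to apply a 0.5 threshold, which is also the default used by Keras's binary accuracy metric. The following sketch uses invented probabilities and labels to show the conversion and the resulting accuracy; `to_labels` and `accuracy` are helper names made up for this example.

```python
def to_labels(probs, threshold=0.5):
    """Convert rows of [probability] into 0/1 labels."""
    return [1 if row[0] > threshold else 0 for row in probs]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Invented values mimicking the (n_samples, 1) shape of model.predict output.
prob_demo = [[0.62], [0.48], [0.55], [0.31]]
y_demo = [1, 0, 0, 0]

labels = to_labels(prob_demo)
print(labels)                    # [1, 0, 1, 0]
print(accuracy(y_demo, labels))  # 0.75
```

With the real NumPy array from the notebook, the same conversion is usually written as `(prob > 0.5).astype(int).ravel()`.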
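Multilayer neural network
model = tf.keras.Sequential([
    Dense(32, activation='relu'),   # hidden layer with 32 ReLU units
    Dense(1, activation='sigmoid')  # output: probability of the positive class
])
model.compile(loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x_train.toarray(), y_train, epochs=10, verbose=0)  # verbose=0 suppresses the per-epoch log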
<keras.src.callbacks.History at 0x1229a19fbb0>
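The hidden layer adds a nonlinear step between the word counts and the output: each hidden unit computes a ReLU of its own weighted sum, and the output unit then combines those activations. A toy forward pass might look like the sketch below, which uses two hidden units instead of 32 and made-up parameters throughout.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(x, weights, biases, activation):
    """One fully connected layer; weights[j] holds the weights of unit j.
    (Keras stores its kernel the other way round, as (inputs, units).)"""
    return [activation(sum(w * xi for w, xi in zip(w_j, x)) + b_j)
            for w_j, b_j in zip(weights, biases)]

# Hypothetical parameters: 3 inputs -> 2 hidden units -> 1 output.
W1 = [[0.5, -0.4, 0.1], [-0.3, 0.6, 0.0]]
b1 = [0.0, 0.1]
W2 = [[0.8, -0.7]]
b2 = [0.0]

x = [1, 0, 2]                            # bag-of-words counts
hidden = dense(x, W1, b1, relu)          # second unit is clipped to 0 by ReLU
output = dense(hidden, W2, b2, sigmoid)  # single probability
print(hidden, output)
```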
model.evaluate(x_test.toarray(), y_test)  # returns [loss, accuracy]
7/7 [==============================] - 0s 8ms/step - loss: 0.5000 - accuracy: 0.7750
[0.5000040531158447, 0.7749999761581421]
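One way to put the two results in context is to compare model sizes. A dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The vocabulary size is not stated in this section, so the sketch below assumes 1,000 features purely for illustration.

```python
def dense_params(n_in, n_out):
    """Trainable parameters of a dense layer: weights plus biases."""
    return n_in * n_out + n_out

VOCAB_SIZE = 1000  # assumption: not stated in the source

single_layer = dense_params(VOCAB_SIZE, 1)
multi_layer = dense_params(VOCAB_SIZE, 32) + dense_params(32, 1)

print(single_layer)  # 1001
print(multi_layer)   # 32065
```

On the actual models, `model.count_params()` reports the same quantity without any assumption about the vocabulary size.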