[Text Analysis] TensorFlow

Imports

import pandas as pd
import tensorflow as tf
from tensorflow.keras.layers import Dense

Model definition

model = tf.keras.Sequential([
    Dense(1, activation='sigmoid')
])
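
A single `Dense(1, activation='sigmoid')` unit amounts to logistic regression: it takes a weighted sum of the inputs, adds a bias, and passes the result through the sigmoid function. The following is a minimal pure-Python sketch of that computation for one input vector. The weights, bias, and word counts are made-up values for illustration, not the trained parameters.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_sigmoid(x, weights, bias):
    """Output of one sigmoid unit for a single input vector."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

# Hypothetical word counts for a 3-word vocabulary and made-up parameters.
x = [1, 0, 2]
weights = [0.5, -0.3, 0.1]
bias = 0.0

p = dense_sigmoid(x, weights, bias)
print(p)  # probability of the positive class
```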

Training configuration

model.compile(loss='binary_crossentropy', metrics=['accuracy'])
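
For reference, the loss and metric named in `compile` can be computed by hand. The sketch below is a simplified pure-Python version, not the Keras implementation; the helper names and the clipping constant `eps` are assumptions made for illustration.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    hits = sum(1 for y, p in zip(y_true, y_prob) if (p > threshold) == (y == 1))
    return hits / len(y_true)

# A model that always outputs 0.5 has a loss of ln(2), about 0.693.
uninformed_loss = binary_crossentropy([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])

# Hypothetical predictions: three of the four are on the correct side of 0.5.
example_accuracy = accuracy([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.1])
```

A loss close to ln(2) therefore indicates predictions that are barely better than a constant 0.5.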

Training

model.fit(x_train.toarray(), y_train, epochs=10)
Epoch 1/10
25/25 [==============================] - 0s 1ms/step - loss: 0.6970 - accuracy: 0.4725
Epoch 2/10
25/25 [==============================] - 0s 792us/step - loss: 0.6878 - accuracy: 0.5337
Epoch 3/10
25/25 [==============================] - 0s 834us/step - loss: 0.6794 - accuracy: 0.6125
Epoch 4/10
25/25 [==============================] - 0s 917us/step - loss: 0.6713 - accuracy: 0.6712
Epoch 5/10
25/25 [==============================] - 0s 795us/step - loss: 0.6635 - accuracy: 0.7175
Epoch 6/10
25/25 [==============================] - 0s 792us/step - loss: 0.6557 - accuracy: 0.7700
Epoch 7/10
25/25 [==============================] - 0s 1ms/step - loss: 0.6479 - accuracy: 0.8062
Epoch 8/10
25/25 [==============================] - 0s 709us/step - loss: 0.6403 - accuracy: 0.8363
Epoch 9/10
25/25 [==============================] - 0s 856us/step - loss: 0.6329 - accuracy: 0.8537
Epoch 10/10
25/25 [==============================] - 0s 813us/step - loss: 0.6257 - accuracy: 0.8600

<keras.src.callbacks.History at 0x12297826800>

Evaluation

model.evaluate(x_test.toarray(), y_test)
7/7 [==============================] - 0s 1000us/step - loss: 0.6529 - accuracy: 0.7300

[0.6529121994972229, 0.7300000190734863]

Weights

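# Kernel (one weight per vocabulary word) and bias of the single Dense layer.
weight, bias = model.trainable_weights
# Column names: '단어' = word, '가중치' = weight.
word_weight = pd.DataFrame({
    '단어': cv.get_feature_names_out(),
    '가중치': weight.numpy().ravel()
})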

Positive words

word_weight.sort_values('가중치', ascending=False).head(10)
     단어          가중치
153  delicious   0.253083
365  loved       0.167681
268  great       0.166258
12   amazing     0.164991
221  fantastic   0.163819
364  love        0.161932
24   atmosphere  0.158757
312  ice         0.155323
102  chef        0.151462
56   best        0.139563

Negative words

word_weight.sort_values('가중치').head(10)
     단어         가중치
37   bad       -0.211356
65   bland     -0.182218
977  worst     -0.163735
266  got       -0.156488
937  wanted    -0.147896
928  waited    -0.147509
964  won       -0.146980
546  probably  -0.145420
825  think     -0.144671
929  waiter    -0.143003
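
To see how these weights drive a prediction, the sketch below sums the weights of the words in a review and applies the sigmoid. The four weights are copied from the tables above. The bias is not shown in this document, so `assumed_bias = 0.0` is a placeholder, and treating all other words as weight 0 is a simplification (the real model has a weight for every vocabulary word).

```python
import math

# Weights copied from the positive and negative word tables.
word_weights = {
    "delicious": 0.253083,
    "great": 0.166258,
    "bad": -0.211356,
    "bland": -0.182218,
}
assumed_bias = 0.0  # placeholder: the trained bias is not shown in the document

def score(words):
    """Sum the weight of each word occurrence and apply the sigmoid."""
    z = assumed_bias + sum(word_weights.get(w, 0.0) for w in words)
    return 1.0 / (1.0 + math.exp(-z))

p_pos = score(["delicious", "great"])  # pushed above 0.5 by positive weights
p_neg = score(["bad", "bland"])        # pushed below 0.5 by negative weights
```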

Prediction

prob = model.predict(x_test.toarray())
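
The predicted values are probabilities of the positive class, so a class label requires a cutoff. A minimal sketch, assuming the conventional threshold of 0.5; `to_labels` and the example probabilities are hypothetical stand-ins, not values from `prob`.

```python
def to_labels(probabilities, threshold=0.5):
    """Return 1 where the probability exceeds the threshold, else 0."""
    return [1 if p > threshold else 0 for p in probabilities]

# Hypothetical probabilities standing in for the model output.
example_prob = [0.91, 0.35, 0.50, 0.62]
labels = to_labels(example_prob)
```

With the NumPy array returned by `model.predict`, the elementwise comparison `prob > 0.5` gives the same result as a boolean array.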

Multilayer neural network

model = tf.keras.Sequential([
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])

model.compile(loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x_train.toarray(), y_train, epochs=10, verbose=0)
<keras.src.callbacks.History at 0x1229a19fbb0>
model.evaluate(x_test.toarray(), y_test)
7/7 [==============================] - 0s 8ms/step - loss: 0.5000 - accuracy: 0.7750

[0.5000040531158447, 0.7749999761581421]
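
The multilayer model reaches a higher test accuracy here (0.775 versus 0.73) at the cost of many more parameters. The rough count below assumes a hypothetical vocabulary of 1000 words, since the actual vocabulary size is not stated in this document; `dense_params` is a helper introduced for illustration.

```python
def dense_params(n_inputs, n_units):
    """Parameters in a fully connected layer: weights plus one bias per unit."""
    return n_inputs * n_units + n_units

vocab_size = 1000  # hypothetical; the real size is len(cv.get_feature_names_out())

# Single sigmoid unit on top of the word counts.
single_layer = dense_params(vocab_size, 1)

# Hidden layer of 32 ReLU units followed by the sigmoid output unit.
multi_layer = dense_params(vocab_size, 32) + dense_params(32, 1)

print(single_layer, multi_layer)
```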