Policy Gradient using Keras
- DQN_v1.ipynb +22 -0
- Policy_gradiant_cartpole_v1.ipynb +483 -0
- Test_custom_loss.ipynb +804 -0
- fin_rl_dqn_v1.ipynb +5 -0
- fin_rl_dqn_v2.ipynb +0 -0
- fin_rl_policy_gradiant_v1.ipynb +0 -0
- fin_rl_qlearning_v1.ipynb +1 -1
DQN_v1.ipynb
CHANGED
@@ -11,6 +11,28 @@
|
|
11 |
"#### This version implements DQN with Keras\n"
|
12 |
]
|
13 |
},
14 |
{
|
15 |
"cell_type": "code",
|
16 |
"execution_count": 1,
|
|
|
11 |
"#### This version implements DQN with Keras\n"
|
12 |
]
|
13 |
},
|
14 |
+
{
|
15 |
+
"cell_type": "markdown",
|
16 |
+
"metadata": {},
|
17 |
+
"source": [
|
18 |
+
"Hi everybody, I just finished coding a DQN from scratch that can solve CartPole-v1. https://huggingface.co/bonadio/rl-fin/blob/main/DQN_v1.ipynb \n",
|
19 |
+
"It takes 6000 steps; here is the result: https://huggingface.co/bonadio/rl-fin/blob/main/DQN_v1_result.mp4\n",
|
20 |
+
"\n",
|
21 |
+
"I can say that coding an RL algorithm is really challenging. The biggest difficulty is that there are a lot of parameters to tune, and it is hard to know when you are on the right track. \n",
|
22 |
+
"For me, the main points for the DQN were: \n",
|
23 |
+
"1- Number of layers of the NN: I started with only 2 layers of 64 units, but had to grow to 3 layers with 512, 256 and 128 units.\n",
|
24 |
+
"\n",
|
25 |
+
"2- NN update frequency, i.e. how often you copy the weights from one NN to the other: I started with every 50 steps but ended up with every 5. In my runs, the more often you update, the better.\n",
|
26 |
+
"\n",
|
27 |
+
"3- Batch size, the number of samples that you take from the replay memory: it seems that the bigger, the better. I started with 10 and ended up with 100. \n",
|
28 |
+
"\n",
|
29 |
+
"4- Epsilon decay: it seems that the network only really starts to learn after it stops taking random actions. A very small final value is good; I used 0.001.\n",
|
30 |
+
"\n",
|
31 |
+
"5- The size of the replay memory does not make a big difference: I used 10,000, but I think it could be smaller, like 5,000.\n",
|
32 |
+
"\n",
|
33 |
+
"Now that I have a working DQN I will adapt it to trade Ethereum. I will convert my code from Q-learning to DQN. "
|
34 |
+
]
|
35 |
+
},
|
36 |
{
|
37 |
"cell_type": "code",
|
38 |
"execution_count": 1,
|
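The tuning notes in the markdown cell above can be made concrete with a small sketch. This is not the code from DQN_v1.ipynb. The batch size (100), replay memory size (10,000), weight-copy interval (5 steps) and final epsilon (0.001) are the values quoted in the text, while the exponential decay form, `EPSILON_START`, `EPSILON_DECAY`, and every function and class name below are assumptions made for illustration.

```python
import random
from collections import deque

# Values quoted in the notebook text.
BATCH_SIZE = 100
MEMORY_SIZE = 10_000
TARGET_SYNC_EVERY = 5
EPSILON_END = 0.001

# Assumed values, not taken from the notebook.
EPSILON_START = 1.0
EPSILON_DECAY = 0.995


def decayed_epsilon(step):
    """Exponentially decay epsilon per step, never going below EPSILON_END."""
    return max(EPSILON_END, EPSILON_START * EPSILON_DECAY ** step)


class ReplayMemory:
    """Fixed-size buffer; the oldest transitions are dropped first."""

    def __init__(self, capacity=MEMORY_SIZE):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size=BATCH_SIZE):
        # Sample without replacement; return fewer items while the buffer fills up.
        items = list(self.buffer)
        return random.sample(items, min(batch_size, len(items)))


def should_sync_target(step):
    """True on the steps where the online weights would be copied to the target NN."""
    return step > 0 and step % TARGET_SYNC_EVERY == 0
```

In a training loop, `decayed_epsilon(step)` would drive the random-versus-greedy choice, `ReplayMemory.sample()` would provide each training batch, and `should_sync_target(step)` would gate the weight copy between the two networks.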
Policy_gradiant_cartpole_v1.ipynb
ADDED
@@ -0,0 +1,483 @@
1 |
+
{
|
2 |
+
"cells": [
|
3 |
+
{
|
4 |
+
"cell_type": "markdown",
|
5 |
+
"metadata": {
|
6 |
+
"id": "nwaAZRu1NTiI"
|
7 |
+
},
|
8 |
+
"source": [
|
9 |
+
"# Policy Gradient\n",
|
10 |
+
"\n",
|
11 |
+
"#### This version implements Policy Gradient with Keras to solve CartPole\n"
|
12 |
+
]
|
13 |
+
},
|
14 |
+
{
|
15 |
+
"cell_type": "code",
|
16 |
+
"execution_count": 13,
|
17 |
+
"metadata": {
|
18 |
+
"id": "Nm5rvpUZNxDp"
|
19 |
+
},
|
20 |
+
"outputs": [],
|
21 |
+
"source": [
|
22 |
+
"# %%capture\n",
|
23 |
+
"# !pip install gym==0.22\n",
|
24 |
+
"# !pip install pygame\n",
|
25 |
+
"# !apt install python-opengl\n",
|
26 |
+
"# !apt install ffmpeg\n",
|
27 |
+
"# !apt install xvfb\n",
|
28 |
+
"# !pip install pyvirtualdisplay\n",
|
29 |
+
"# !pip install pyglet==1.5.1"
|
30 |
+
]
|
31 |
+
},
|
32 |
+
{
|
33 |
+
"cell_type": "code",
|
34 |
+
"execution_count": 14,
|
35 |
+
"metadata": {
|
36 |
+
"colab": {
|
37 |
+
"base_uri": "https://localhost:8080/"
|
38 |
+
},
|
39 |
+
"id": "LNXxxKojNTiL",
|
40 |
+
"outputId": "c48489ab-d67f-448e-9362-746a4e6bcba2"
|
41 |
+
},
|
42 |
+
"outputs": [
|
43 |
+
{
|
44 |
+
"name": "stdout",
|
45 |
+
"output_type": "stream",
|
46 |
+
"text": [
|
47 |
+
"2.9.2\n"
|
48 |
+
]
|
49 |
+
},
|
50 |
+
{
|
51 |
+
"data": {
|
52 |
+
"text/plain": [
|
53 |
+
"<pyvirtualdisplay.display.Display at 0x7fbeaf7ea190>"
|
54 |
+
]
|
55 |
+
},
|
56 |
+
"execution_count": 14,
|
57 |
+
"metadata": {},
|
58 |
+
"output_type": "execute_result"
|
59 |
+
}
|
60 |
+
],
|
61 |
+
"source": [
|
62 |
+
"import tensorflow as tf\n",
|
63 |
+
"from tensorflow.keras import layers, Model, Input\n",
|
64 |
+
"from tensorflow.keras.utils import to_categorical\n",
|
65 |
+
"import tensorflow.keras.backend as K\n",
|
66 |
+
"\n",
|
67 |
+
"import gym\n",
|
68 |
+
"from gym import spaces\n",
|
69 |
+
"from gym.utils import seeding\n",
|
70 |
+
"from gym import wrappers\n",
|
71 |
+
"\n",
|
72 |
+
"from tqdm.notebook import tqdm\n",
|
73 |
+
"from collections import deque\n",
|
74 |
+
"import numpy as np\n",
|
75 |
+
"import random\n",
|
76 |
+
"from matplotlib import pyplot as plt\n",
|
77 |
+
"from sklearn.preprocessing import MinMaxScaler\n",
|
78 |
+
"\n",
|
79 |
+
"import io\n",
|
80 |
+
"import base64\n",
|
81 |
+
"from IPython.display import HTML, Video\n",
|
82 |
+
"print(tf.__version__)\n",
|
83 |
+
"\n",
|
84 |
+
"# Virtual display\n",
|
85 |
+
"from pyvirtualdisplay import Display\n",
|
86 |
+
"\n",
|
87 |
+
"virtual_display = Display(visible=0, size=(1400, 900))\n",
|
88 |
+
"virtual_display.start()"
|
89 |
+
]
|
90 |
+
},
|
91 |
+
{
|
92 |
+
"cell_type": "code",
|
93 |
+
"execution_count": 17,
|
94 |
+
"metadata": {
|
95 |
+
"id": "c84LoGsXNnJo"
|
96 |
+
},
|
97 |
+
"outputs": [],
|
98 |
+
"source": [
|
99 |
+
"# custom model to be able to run a custom loss with parameters\n",
|
100 |
+
"class CustomModel(tf.keras.Model):\n",
|
101 |
+
" def custom_loss(self,y, y_pred, d_returns):\n",
|
102 |
+
" # print(\"y\", y.shape)\n",
|
103 |
+
" # K.print_tensor(y)\n",
|
104 |
+
" # print(\"y Pred\", y_pred.shape) \n",
|
105 |
+
" # K.print_tensor(y_pred)\n",
|
106 |
+
" # print(\"d_retur\", d_returns.shape) \n",
|
107 |
+
" # K.print_tensor(d_returns)\n",
|
108 |
+
" # crossentropy \n",
|
109 |
+
" log_like = y * K.log(K.clip(y_pred, K.epsilon(), 1.0))\n",
|
110 |
+
" # print(\"-log_like\", log_like.shape) \n",
|
111 |
+
" # K.print_tensor(log_like)\n",
|
112 |
+
" # print(\"-Log_lik * d_returns\")\n",
|
113 |
+
" # K.print_tensor(-log_like * d_returns)\n",
|
114 |
+
" # print(\"k_sum\")\n",
|
115 |
+
" # K.print_tensor(K.sum(-log_like * d_returns ))\n",
|
116 |
+
" return K.sum(-log_like * d_returns )\n",
|
117 |
+
" \n",
|
118 |
+
" def train_step(self, data):\n",
|
119 |
+
" # Unpack the data. Its structure depends on your model and\n",
|
120 |
+
" # on what you pass to `fit()`.\n",
|
121 |
+
" if len(data) == 3:\n",
|
122 |
+
" x, y, sample_weight = data\n",
|
123 |
+
" else:\n",
|
124 |
+
" sample_weight = None\n",
|
125 |
+
" x, y = data\n",
|
126 |
+
"\n",
|
127 |
+
" # check if we passed the d_return\n",
|
128 |
+
" if isinstance(x, tuple):\n",
|
129 |
+
" x, d_return = x\n",
|
130 |
+
"\n",
|
131 |
+
" with tf.GradientTape() as tape:\n",
|
132 |
+
" y_pred = self(x, training=True) # Forward pass\n",
|
133 |
+
" # Compute the loss value.\n",
|
134 |
+
" y = tf.cast(y, tf.float32)\n",
|
135 |
+
" loss = self.custom_loss(y, y_pred, d_return)\n",
|
136 |
+
"\n",
|
137 |
+
" # Compute gradients\n",
|
138 |
+
" trainable_vars = self.trainable_variables\n",
|
139 |
+
" gradients = tape.gradient(loss, trainable_vars)\n",
|
140 |
+
"\n",
|
141 |
+
" # Update weights\n",
|
142 |
+
" self.optimizer.apply_gradients(zip(gradients, trainable_vars))\n",
|
143 |
+
"\n",
|
144 |
+
" # Update the metrics.\n",
|
145 |
+
" # Metrics are configured in `compile()`.\n",
|
146 |
+
" self.compiled_metrics.update_state(y, y_pred, sample_weight=sample_weight)\n",
|
147 |
+
"\n",
|
148 |
+
" # Return a dict mapping metric names to current value.\n",
|
149 |
+
" # Note that it will include the loss (tracked in self.metrics).\n",
|
150 |
+
" return {m.name: m.result() for m in self.metrics}"
|
151 |
+
]
|
152 |
+
},
|
153 |
+
{
|
154 |
+
"cell_type": "code",
|
155 |
+
"execution_count": 18,
|
156 |
+
"metadata": {
|
157 |
+
"id": "sF8L5d-GNnJp"
|
158 |
+
},
|
159 |
+
"outputs": [],
|
160 |
+
"source": [
|
161 |
+
"class Policy:\n",
|
162 |
+
" def __init__(self, env=None, action_size=2):\n",
|
163 |
+
"\n",
|
164 |
+
" self.action_size = action_size\n",
|
165 |
+
"\n",
|
166 |
+
" # Hyperparameters\n",
|
167 |
+
" self.gamma = 0.95 # Discount rate\n",
|
168 |
+
"\n",
|
169 |
+
" self.learning_rate = 1e-2\n",
|
170 |
+
" \n",
|
171 |
+
" # Construct the policy network\n",
|
172 |
+
" self.env = env\n",
|
173 |
+
" self.action_size = action_size\n",
|
174 |
+
" self.action_space = [i for i in range(action_size)]\n",
|
175 |
+
" print(\"action space\",self.action_space)\n",
|
176 |
+
" # self.saved_log_probs = None\n",
|
177 |
+
" self.model= self._build_model()\n",
|
178 |
+
" self.model.summary()\n",
|
179 |
+
"\n",
|
180 |
+
" def _build_model(self):\n",
|
181 |
+
" \n",
|
182 |
+
" x = Input(shape=(4,), name='x_input')\n",
|
183 |
+
" # y_true = Input( shape=(2,), name='y_true' )\n",
|
184 |
+
" d_returns = Input(shape=[1], name='d_returns')\n",
|
185 |
+
"\n",
|
186 |
+
" l = layers.Dense(16, activation = 'relu')(x)\n",
|
187 |
+
" l = layers.Dense(16, activation = 'relu')(l)\n",
|
188 |
+
" y_pred = layers.Dense(self.action_size, activation = 'softmax', name='y_pred')(l)\n",
|
189 |
+
" \n",
|
190 |
+
" optimizer = tf.keras.optimizers.Adam(learning_rate=self.learning_rate)\n",
|
191 |
+
"\n",
|
192 |
+
" # model_train = Model( inputs=[x], outputs=[y_pred], name='train_only' )\n",
|
193 |
+
" model_train = CustomModel( inputs=x, outputs=y_pred, name='train_only' )\n",
|
194 |
+
" # model_predict = Model( inputs=x, outputs=y_pred, name='predict_only' )\n",
|
195 |
+
" model_train.compile(loss=None, optimizer=optimizer, metrics = ['accuracy'])\n",
|
196 |
+
" # model_train.compile(loss=None, optimizer=optimizer, metrics = ['accuracy'], run_eagerly = True)\n",
|
197 |
+
"\n",
|
198 |
+
" return model_train\n",
|
199 |
+
"\n",
|
200 |
+
"\n",
|
201 |
+
" def act(self, state):\n",
|
202 |
+
" # print(\"Act state\",state)\n",
|
203 |
+
" probs = self.model.predict(np.array([state]), verbose=0)[0]\n",
|
204 |
+
" # print(\"probs\",probs)\n",
|
205 |
+
" action = np.random.choice(self.action_space, p=probs)\n",
|
206 |
+
" # print(\"Action\",action)\n",
|
207 |
+
" # return the action and the log of the probability \n",
|
208 |
+
" # return action, np.log(probs[action])\n",
|
209 |
+
" return action\n",
|
210 |
+
"\n",
|
211 |
+
"\n",
|
212 |
+
" # This implements the REINFORCE algorithm\n",
|
213 |
+
" def learn(self, n_training_episodes=None, max_t=None, print_every=100):\n",
|
214 |
+
" # Help us to calculate the score during the training\n",
|
215 |
+
" scores_deque = deque(maxlen=100)\n",
|
216 |
+
" scores = []\n",
|
217 |
+
" # Line 3 of pseudocode\n",
|
218 |
+
" for i_episode in range(1, n_training_episodes+1):\n",
|
219 |
+
" # saved_log_probs = []\n",
|
220 |
+
" saved_actions = []\n",
|
221 |
+
" saved_state = []\n",
|
222 |
+
" rewards = []\n",
|
223 |
+
" state = self.env.reset()\n",
|
224 |
+
" # Line 4 of pseudocode\n",
|
225 |
+
" for t in range(max_t):\n",
|
226 |
+
" saved_state.append(state)\n",
|
227 |
+
" action = self.act(state)\n",
|
228 |
+
" # action, log_prob = self.act(state)\n",
|
229 |
+
" # saved_log_probs.append(log_prob)\n",
|
230 |
+
" saved_actions.append(action)\n",
|
231 |
+
" state, reward, done, _ = self.env.step(action)\n",
|
232 |
+
" rewards.append(reward)\n",
|
233 |
+
" if done:\n",
|
234 |
+
" break \n",
|
235 |
+
" scores_deque.append(sum(rewards))\n",
|
236 |
+
" scores.append(sum(rewards))\n",
|
237 |
+
" \n",
|
238 |
+
" # Line 6 of pseudocode: calculate the return\n",
|
239 |
+
" returns = deque(maxlen=max_t) \n",
|
240 |
+
" n_steps = len(rewards) \n",
|
241 |
+
" # Compute the discounted returns at each timestep,\n",
|
242 |
+
" # as \n",
|
243 |
+
" # the sum of the gamma-discounted return at time t (G_t) + the reward at time t\n",
|
244 |
+
" #\n",
|
245 |
+
" # In O(N) time, where N is the number of time steps\n",
|
246 |
+
" # (this definition of the discounted return G_t follows the definition of this quantity \n",
|
247 |
+
" # shown at page 44 of Sutton&Barto 2017 2nd draft)\n",
|
248 |
+
" # G_t = r_(t+1) + r_(t+2) + ...\n",
|
249 |
+
" \n",
|
250 |
+
" # Given this formulation, the returns at each timestep t can be computed \n",
|
251 |
+
" # by re-using the computed future returns G_(t+1) to compute the current return G_t\n",
|
252 |
+
" # G_t = r_(t+1) + gamma*G_(t+1)\n",
|
253 |
+
" # G_(t-1) = r_t + gamma* G_t\n",
|
254 |
+
" # (this follows a dynamic programming approach, with which we memorize solutions in order \n",
|
255 |
+
" # to avoid computing them multiple times)\n",
|
256 |
+
" \n",
|
257 |
+
" # This is correct since the above is equivalent to (see also page 46 of Sutton&Barto 2017 2nd draft)\n",
|
258 |
+
" # G_(t-1) = r_t + gamma*r_(t+1) + gamma*gamma*r_(t+2) + ...\n",
|
259 |
+
" \n",
|
260 |
+
" \n",
|
261 |
+
" ## Given the above, we calculate the returns at timestep t as: \n",
|
262 |
+
" # returns[t] = rewards[t] + gamma * returns[t+1]\n",
|
263 |
+
" #\n",
|
264 |
+
" ## We compute this starting from the last timestep to the first, in order\n",
|
265 |
+
" ## to employ the formula presented above and avoid redundant computations that would be needed \n",
|
266 |
+
" ## if we were to do it from first to last.\n",
|
267 |
+
" \n",
|
268 |
+
" ## Hence, the queue \"returns\" will hold the returns in chronological order, from t=0 to t=n_steps\n",
|
269 |
+
" ## thanks to the appendleft() function which allows to append to the position 0 in constant time O(1)\n",
|
270 |
+
" ## a normal python list would instead require O(N) to do this.\n",
|
271 |
+
" for t in range(n_steps)[::-1]:\n",
|
272 |
+
" disc_return_t = (returns[0] if len(returns)>0 else 0)\n",
|
273 |
+
" returns.appendleft( self.gamma*disc_return_t + rewards[t] ) \n",
|
274 |
+
" \n",
|
275 |
+
" ## standardization of the returns is employed to make training more stable\n",
|
276 |
+
" eps = np.finfo(np.float32).eps.item()\n",
|
277 |
+
" ## eps is the float32 machine epsilon (the gap between 1.0 and the next larger float), which is \n",
|
278 |
+
" # added to the standard deviation of the returns to avoid numerical instabilities \n",
|
279 |
+
" returns = np.array(returns)\n",
|
280 |
+
" returns = (returns - returns.mean()) / (returns.std() + eps)\n",
|
281 |
+
" # self.saved_log_probs = saved_log_probs\n",
|
282 |
+
" \n",
|
283 |
+
" # Line 7:\n",
|
284 |
+
" saved_state = np.array(saved_state)\n",
|
285 |
+
" # print(\"Saved state\", saved_state, saved_state.shape)\n",
|
286 |
+
" saved_actions = np.array(to_categorical(saved_actions, num_classes=self.action_size))\n",
|
287 |
+
" # print(\"Saved actions\", saved_actions, saved_actions.shape)\n",
|
288 |
+
" returns = returns.reshape(-1,1)\n",
|
289 |
+
" # print(\"Returns\", returns, returns.shape)\n",
|
290 |
+
" # this is the tricky part: we send a tuple so the CustomModel is able to split the x and use \n",
|
291 |
+
" # the returns inside to calculate the custom loss\n",
|
292 |
+
" self.model.train_on_batch(x=(saved_state,returns), y=saved_actions)\n",
|
293 |
+
"\n",
|
294 |
+
" # policy_loss = []\n",
|
295 |
+
" # for action, log_prob, disc_return in zip(saved_actions, saved_log_probs, returns):\n",
|
296 |
+
" # policy_loss.append(-log_prob * disc_return)\n",
|
297 |
+
" # policy_loss = torch.cat(policy_loss).sum()\n",
|
298 |
+
" \n",
|
299 |
+
" # # Line 8: PyTorch prefers gradient descent \n",
|
300 |
+
" # optimizer.zero_grad()\n",
|
301 |
+
" # policy_loss.backward()\n",
|
302 |
+
" # optimizer.step()\n",
|
303 |
+
" \n",
|
304 |
+
" if i_episode % print_every == 0:\n",
|
305 |
+
" print('Episode {}\\tAverage Score: {:.2f}'.format(i_episode, np.mean(scores_deque)))\n",
|
306 |
+
" \n",
|
307 |
+
" return scores\n",
|
308 |
+
"\n",
|
309 |
+
" #\n",
|
310 |
+
" # Loads a saved model\n",
|
311 |
+
" #https://medium.com/@Bloomore/how-to-write-a-custom-loss-function-with-additional-arguments-in-keras-5f193929f7a0\n",
|
312 |
+
" #\n",
|
313 |
+
" def load(self, name):\n",
|
314 |
+
" self.model.load_weights(name)\n",
|
315 |
+
"\n",
|
316 |
+
" #\n",
|
317 |
+
" # Saves parameters of a trained model\n",
|
318 |
+
" #\n",
|
319 |
+
" def save(self, name):\n",
|
320 |
+
" self.model.save_weights(name)\n",
|
321 |
+
"\n",
|
322 |
+
" def play(self, state):\n",
|
323 |
+
" return np.argmax(self.model.predict(np.array([state]), verbose=0)[0])"
|
324 |
+
]
|
325 |
+
},
|
326 |
+
{
|
327 |
+
"cell_type": "code",
|
328 |
+
"execution_count": null,
|
329 |
+
"metadata": {
|
330 |
+
"colab": {
|
331 |
+
"base_uri": "https://localhost:8080/"
|
332 |
+
},
|
333 |
+
"id": "z64C2rO7NnJq",
|
334 |
+
"outputId": "fd4c1942-c7b6-49af-cead-e6a48ee987d0"
|
335 |
+
},
|
336 |
+
"outputs": [
|
337 |
+
{
|
338 |
+
"name": "stdout",
|
339 |
+
"output_type": "stream",
|
340 |
+
"text": [
|
341 |
+
"action space [0, 1]\n",
|
342 |
+
"Model: \"train_only\"\n",
|
343 |
+
"_________________________________________________________________\n",
|
344 |
+
" Layer (type) Output Shape Param # \n",
|
345 |
+
"=================================================================\n",
|
346 |
+
" x_input (InputLayer) [(None, 4)] 0 \n",
|
347 |
+
" \n",
|
348 |
+
" dense_6 (Dense) (None, 16) 80 \n",
|
349 |
+
" \n",
|
350 |
+
" dense_7 (Dense) (None, 16) 272 \n",
|
351 |
+
" \n",
|
352 |
+
" y_pred (Dense) (None, 2) 34 \n",
|
353 |
+
" \n",
|
354 |
+
"=================================================================\n",
|
355 |
+
"Total params: 386\n",
|
356 |
+
"Trainable params: 386\n",
|
357 |
+
"Non-trainable params: 0\n",
|
358 |
+
"_________________________________________________________________\n",
|
359 |
+
"Episode 100\tAverage Score: 66.31\n",
|
360 |
+
"Episode 200\tAverage Score: 161.58\n",
|
361 |
+
"Episode 300\tAverage Score: 282.58\n"
|
362 |
+
]
|
363 |
+
}
|
364 |
+
],
|
365 |
+
"source": [
|
366 |
+
"env = gym.make('CartPole-v1')\n",
|
367 |
+
"\n",
|
368 |
+
"model = Policy(env=env, action_size=2)\n",
|
369 |
+
"# model.learn(total_steps=6_000)\n",
|
370 |
+
"\n",
|
371 |
+
"model.learn(n_training_episodes=1000, max_t=1000, print_every=100)\n",
|
372 |
+
"env.close()\n"
|
373 |
+
]
|
374 |
+
},
|
375 |
+
{
|
376 |
+
"cell_type": "code",
|
377 |
+
"execution_count": 11,
|
378 |
+
"metadata": {
|
379 |
+
"id": "7zS2PuLSNnJr"
|
380 |
+
},
|
381 |
+
"outputs": [],
|
382 |
+
"source": [
|
383 |
+
"model.save(\"./alt/policy_grad_cartpole.h5\")"
|
384 |
+
]
|
385 |
+
},
|
386 |
+
{
|
387 |
+
"cell_type": "code",
|
388 |
+
"execution_count": 15,
|
389 |
+
"metadata": {
|
390 |
+
"colab": {
|
391 |
+
"base_uri": "https://localhost:8080/"
|
392 |
+
},
|
393 |
+
"id": "vgDMDokeNnJr",
|
394 |
+
"outputId": "13a744e9-5119-4c5f-d681-39b0933fd661"
|
395 |
+
},
|
396 |
+
"outputs": [
|
397 |
+
{
|
398 |
+
"name": "stdout",
|
399 |
+
"output_type": "stream",
|
400 |
+
"text": [
|
401 |
+
"action space [0, 1]\n",
|
402 |
+
"Model: \"train_only\"\n",
|
403 |
+
"_________________________________________________________________\n",
|
404 |
+
" Layer (type) Output Shape Param # \n",
|
405 |
+
"=================================================================\n",
|
406 |
+
" x_input (InputLayer) [(None, 4)] 0 \n",
|
407 |
+
" \n",
|
408 |
+
" dense_4 (Dense) (None, 16) 80 \n",
|
409 |
+
" \n",
|
410 |
+
" dense_5 (Dense) (None, 16) 272 \n",
|
411 |
+
" \n",
|
412 |
+
" y_pred (Dense) (None, 2) 34 \n",
|
413 |
+
" \n",
|
414 |
+
"=================================================================\n",
|
415 |
+
"Total params: 386\n",
|
416 |
+
"Trainable params: 386\n",
|
417 |
+
"Non-trainable params: 0\n",
|
418 |
+
"_________________________________________________________________\n",
|
419 |
+
"Total reward 189.0\n"
|
420 |
+
]
|
421 |
+
}
|
422 |
+
],
|
423 |
+
"source": [
|
424 |
+
"eval_env = gym.make('CartPole-v1')\n",
|
425 |
+
"model = Policy(env=eval_env, action_size=2)\n",
|
426 |
+
"model.load(\"./alt/policy_grad_cartpole.h5\")\n",
|
427 |
+
"eval_env = wrappers.Monitor(eval_env, \"./alt/gym-results\", force=True)\n",
|
428 |
+
"state = eval_env.reset()\n",
|
429 |
+
"total_reward = 0\n",
|
430 |
+
"for _ in range(1000):\n",
|
431 |
+
" action = model.play(state)\n",
|
432 |
+
" observation, reward, done, info = eval_env.step(action)\n",
|
433 |
+
" total_reward +=reward\n",
|
434 |
+
" state = observation\n",
|
435 |
+
" if done: \n",
|
436 |
+
" print(f\"Total reward {total_reward}\")\n",
|
437 |
+
" break\n",
|
438 |
+
"eval_env.close()"
|
439 |
+
]
|
440 |
+
},
|
441 |
+
{
|
442 |
+
"cell_type": "code",
|
443 |
+
"execution_count": null,
|
444 |
+
"metadata": {
|
445 |
+
"id": "2HxBAnLQUoZW"
|
446 |
+
},
|
447 |
+
"outputs": [],
|
448 |
+
"source": []
|
449 |
+
}
|
450 |
+
],
|
451 |
+
"metadata": {
|
452 |
+
"accelerator": "GPU",
|
453 |
+
"colab": {
|
454 |
+
"provenance": []
|
455 |
+
},
|
456 |
+
"gpuClass": "standard",
|
457 |
+
"kernelspec": {
|
458 |
+
"display_name": "Python 3.9.16 ('rl3')",
|
459 |
+
"language": "python",
|
460 |
+
"name": "python3"
|
461 |
+
},
|
462 |
+
"language_info": {
|
463 |
+
"codemirror_mode": {
|
464 |
+
"name": "ipython",
|
465 |
+
"version": 3
|
466 |
+
},
|
467 |
+
"file_extension": ".py",
|
468 |
+
"mimetype": "text/x-python",
|
469 |
+
"name": "python",
|
470 |
+
"nbconvert_exporter": "python",
|
471 |
+
"pygments_lexer": "ipython3",
|
472 |
+
"version": "3.9.16"
|
473 |
+
},
|
474 |
+
"orig_nbformat": 4,
|
475 |
+
"vscode": {
|
476 |
+
"interpreter": {
|
477 |
+
"hash": "9070e15ca35f8308b0c5d51e893fc04d77e428fe4d803a6d9ae4f68a65d8ce17"
|
478 |
+
}
|
479 |
+
}
|
480 |
+
},
|
481 |
+
"nbformat": 4,
|
482 |
+
"nbformat_minor": 0
|
483 |
+
}
|
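The return computation in `Policy.learn` and the loss in `CustomModel.custom_loss` can be checked outside Keras with a NumPy sketch. The discount of 0.95 and the float32 epsilon come from the notebook; the function names are hypothetical, and the backward loop writes into an array instead of using the notebook's `deque.appendleft`, so treat it as an equivalent variant under those assumptions and not the notebook's exact code.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.95):
    """Compute G_t = r_t + gamma * G_(t+1) for every step, iterating backwards."""
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def standardize(returns):
    """Shift to zero mean and scale to unit variance; eps guards against a zero std."""
    eps = np.finfo(np.float32).eps.item()
    return (returns - returns.mean()) / (returns.std() + eps)


def reinforce_loss(one_hot_actions, probs, returns):
    """Sum over steps of -log pi(a_t | s_t) * G_t, as in the notebook's custom loss."""
    one_hot_actions = np.asarray(one_hot_actions, dtype=np.float64)
    probs = np.asarray(probs, dtype=np.float64)
    returns = np.asarray(returns, dtype=np.float64).reshape(-1)
    # The one-hot mask keeps only the log-probability of the action actually taken.
    log_prob_taken = np.sum(one_hot_actions * np.log(probs), axis=1)
    return float(np.sum(-log_prob_taken * returns))
```

For example, `discounted_returns([1.0, 1.0, 1.0], gamma=0.5)` yields `[1.75, 1.5, 1.0]`: the last step keeps its own reward, and each earlier step adds half of the return that follows it.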
Test_custom_loss.ipynb
ADDED
@@ -0,0 +1,804 @@
1 |
+
{
|
2 |
+
"cells": [
|
3 |
+
{
|
4 |
+
"cell_type": "markdown",
|
5 |
+
"metadata": {
|
6 |
+
"id": "nwaAZRu1NTiI"
|
7 |
+
},
|
8 |
+
"source": [
|
9 |
+
"# Test custom loss in Keras\n",
|
10 |
+
"\n"
|
11 |
+
]
|
12 |
+
},
|
13 |
+
{
|
14 |
+
"cell_type": "code",
|
15 |
+
"execution_count": 1,
|
16 |
+
"metadata": {
|
17 |
+
"id": "LNXxxKojNTiL"
|
18 |
+
},
|
19 |
+
"outputs": [
|
20 |
+
{
|
21 |
+
"name": "stderr",
|
22 |
+
"output_type": "stream",
|
23 |
+
"text": [
|
24 |
+
"2023-02-06 12:11:59.127639: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA\n",
|
25 |
+
"To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
|
26 |
+
]
|
27 |
+
},
|
28 |
+
{
|
29 |
+
"name": "stdout",
|
30 |
+
"output_type": "stream",
|
31 |
+
"text": [
|
32 |
+
"2.11.0\n"
|
33 |
+
]
|
34 |
+
}
|
35 |
+
],
|
36 |
+
"source": [
|
37 |
+
"import tensorflow as tf\n",
|
38 |
+
"from tensorflow.keras import layers, Model, Input\n",
|
39 |
+
"from tensorflow.keras.utils import to_categorical\n",
|
40 |
+
"import tensorflow.keras.backend as K\n",
|
41 |
+
"\n",
|
42 |
+
"from matplotlib import pyplot as plt\n",
|
43 |
+
"import numpy as np\n",
|
44 |
+
"\n",
|
45 |
+
"print(tf.__version__)\n"
|
46 |
+
]
|
47 |
+
},
|
48 |
+
{
|
49 |
+
"cell_type": "code",
|
50 |
+
"execution_count": 184,
|
51 |
+
"metadata": {},
|
52 |
+
"outputs": [],
|
53 |
+
"source": [
|
54 |
+
"class CustomModel(tf.keras.Model):\n",
|
55 |
+
" def train_step(self, data):\n",
|
56 |
+
" # Unpack the data. Its structure depends on your model and\n",
|
57 |
+
" # on what you pass to `fit()`.\n",
|
58 |
+
" if len(data) == 3:\n",
|
59 |
+
" x, y, sample_weight = data\n",
|
60 |
+
" else:\n",
|
61 |
+
" sample_weight = None\n",
|
62 |
+
" x, y = data\n",
|
63 |
+
"\n",
|
64 |
+
" # check if we passed the d_return\n",
|
65 |
+
" if isinstance(x, tuple):\n",
|
66 |
+
" d_return = x[1]\n",
|
67 |
+
" x = x[0]\n",
|
68 |
+
"\n",
|
69 |
+
"\n",
|
70 |
+
" with tf.GradientTape() as tape:\n",
|
71 |
+
" y_pred = self(x, training=True) # Forward pass\n",
|
72 |
+
" # Compute the loss value.\n",
|
73 |
+
" # The loss function is configured in `compile()`.\n",
|
74 |
+
" # loss = self.compiled_loss(\n",
|
75 |
+
" # y,\n",
|
76 |
+
" # y_pred,\n",
|
77 |
+
" # sample_weight=sample_weight,\n",
|
78 |
+
" # regularization_losses=self.losses,\n",
|
79 |
+
" # )\n",
|
80 |
+
" y = tf.cast(y, tf.float32)\n",
|
81 |
+
" loss = K.mean(K.square(y_pred - y), axis=-1)\n",
|
82 |
+
"\n",
|
83 |
+
" # Compute gradients\n",
|
84 |
+
" trainable_vars = self.trainable_variables\n",
|
85 |
+
" gradients = tape.gradient(loss, trainable_vars)\n",
|
86 |
+
"\n",
|
87 |
+
" # Update weights\n",
|
88 |
+
" self.optimizer.apply_gradients(zip(gradients, trainable_vars))\n",
|
89 |
+
"\n",
|
90 |
+
" # Update the metrics.\n",
|
91 |
+
" # Metrics are configured in `compile()`.\n",
|
92 |
+
" self.compiled_metrics.update_state(y, y_pred, sample_weight=sample_weight)\n",
|
93 |
+
"\n",
|
94 |
+
" # Return a dict mapping metric names to current value.\n",
|
95 |
+
" # Note that it will include the loss (tracked in self.metrics).\n",
|
96 |
+
" return {m.name: m.result() for m in self.metrics}"
|
97 |
+
]
|
98 |
+
},
|
99 |
+
{
|
100 |
+
"cell_type": "code",
|
101 |
+
"execution_count": 185,
|
102 |
+
"metadata": {},
|
103 |
+
"outputs": [
|
104 |
+
{
|
105 |
+
"name": "stdout",
|
106 |
+
"output_type": "stream",
|
107 |
+
"text": [
|
108 |
+
"Model: \"train_only\"\n",
|
109 |
+
"_________________________________________________________________\n",
|
110 |
+
" Layer (type) Output Shape Param # \n",
|
111 |
+
"=================================================================\n",
|
112 |
+
" x_input (InputLayer) [(None, 1)] 0 \n",
|
113 |
+
" \n",
|
114 |
+
" dense_47 (Dense) (None, 1) 2 \n",
|
115 |
+
" \n",
|
116 |
+
"=================================================================\n",
|
117 |
+
"Total params: 2\n",
|
118 |
+
"Trainable params: 2\n",
|
119 |
+
"Non-trainable params: 0\n",
|
120 |
+
"_________________________________________________________________\n"
|
121 |
+
]
|
122 |
+
}
|
123 |
+
],
|
124 |
+
"source": [
|
125 |
+
"# Simplest NN without custom loss\n",
|
126 |
+
"\n",
|
127 |
+
"x_input = Input(shape=(1,), name='x_input')\n",
|
128 |
+
"output = layers.Dense(1, activation=None)(x_input)\n",
|
129 |
+
"\n",
|
130 |
+
"model = CustomModel(inputs=x_input, outputs=output, name='train_only')\n",
|
131 |
+
"model.compile(loss=None, optimizer=tf.keras.optimizers.Adam(learning_rate=0.1))\n",
|
132 |
+
"\n",
|
133 |
+
"model.summary()"
|
134 |
+
]
|
135 |
+
},
|
136 |
+
{
|
137 |
+
"cell_type": "code",
|
138 |
+
"execution_count": 186,
|
139 |
+
"metadata": {},
|
140 |
+
"outputs": [],
|
141 |
+
"source": [
|
142 |
+
"x = np.array([ [i] for i in range(1,100) ])\n",
|
143 |
+
"y = np.array([ [i] for i in range(1,100) ])\n",
|
144 |
+
"\n",
|
145 |
+
"history = model.train_on_batch(x=(x,x),y=y)\n",
|
146 |
+
"# history = model.fit([x,x], y, epochs=700, verbose=0)\n",
|
147 |
+
"# history = model.fit(x, y, validation_split=0.2, epochs=500, verbose=0)\n"
|
148 |
+
]
|
149 |
+
},
|
150 |
+
{
|
151 |
+
"cell_type": "code",
|
152 |
+
"execution_count": 147,
|
153 |
+
"metadata": {},
|
154 |
+
"outputs": [
|
155 |
+
{
|
156 |
+
"ename": "AttributeError",
|
157 |
+
"evalue": "'float' object has no attribute 'history'",
|
158 |
+
"output_type": "error",
|
159 |
+
"traceback": [
|
160 |
+
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
161 |
+
"\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)",
|
162 |
+
"Cell \u001b[0;32mIn[147], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m plt\u001b[38;5;241m.\u001b[39mplot(\u001b[43mhistory\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mhistory\u001b[49m[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mloss\u001b[39m\u001b[38;5;124m'\u001b[39m])\n\u001b[1;32m 2\u001b[0m \u001b[38;5;66;03m# plt.plot(history.history['val_loss'])\u001b[39;00m\n\u001b[1;32m 3\u001b[0m plt\u001b[38;5;241m.\u001b[39mtitle(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mmodel loss\u001b[39m\u001b[38;5;124m'\u001b[39m)\n",
|
163 |
+
"\u001b[0;31mAttributeError\u001b[0m: 'float' object has no attribute 'history'"
|
164 |
+
]
|
165 |
+
}
|
166 |
+
],
|
167 |
+
"source": [
|
168 |
+
"plt.plot(history.history['loss'])\n",
|
169 |
+
"# plt.plot(history.history['val_loss'])\n",
|
170 |
+
"plt.title('model loss')\n",
|
171 |
+
"plt.ylim(-0.1, 1)\n",
|
172 |
+
"plt.ylabel('loss')\n",
|
173 |
+
"plt.xlabel('epoch')\n",
|
174 |
+
"plt.legend(['train', 'test'], loc='upper left')\n",
|
175 |
+
"plt.show()"
|
176 |
+
]
|
177 |
+
},
|
178 |
+
{
|
179 |
+
"cell_type": "code",
|
180 |
+
"execution_count": 97,
|
181 |
+
"metadata": {},
|
182 |
+
"outputs": [
|
183 |
+
{
|
184 |
+
"data": {
|
185 |
+
"text/plain": [
|
186 |
+
"0.0"
|
187 |
+
]
|
188 |
+
},
|
189 |
+
"execution_count": 97,
|
190 |
+
"metadata": {},
|
191 |
+
"output_type": "execute_result"
|
192 |
+
}
|
193 |
+
],
|
194 |
+
"source": [
|
195 |
+
"history.history['loss'][-1]"
|
196 |
+
]
|
197 |
+
},
|
198 |
+
{
|
199 |
+
"cell_type": "code",
|
200 |
+
"execution_count": null,
|
201 |
+
"metadata": {},
|
202 |
+
"outputs": [],
|
203 |
+
"source": [
|
204 |
+
"pred = model.predict(x)\n",
|
205 |
+
"pred"
|
206 |
+
]
|
207 |
+
},
|
208 |
+
{
|
209 |
+
"cell_type": "code",
|
210 |
+
"execution_count": 99,
|
211 |
+
"metadata": {},
|
212 |
+
"outputs": [
|
213 |
+
{
|
214 |
+
"data": {
|
215 |
+
"image/png": "",
|
216 |
+
"text/plain": [
|
217 |
+
"<Figure size 640x480 with 1 Axes>"
|
218 |
+
]
|
219 |
+
},
|
220 |
+
"metadata": {},
|
221 |
+
"output_type": "display_data"
|
222 |
+
}
|
223 |
+
],
|
224 |
+
"source": [
|
225 |
+
"\n",
|
226 |
+
"plt.scatter(x, y, c='blue')\n",
|
227 |
+
"plt.plot(x, pred, color='g')\n",
|
228 |
+
"plt.show()"
|
229 |
+
]
|
230 |
+
},
|
231 |
+
{
|
232 |
+
"cell_type": "code",
|
233 |
+
"execution_count": null,
|
234 |
+
"metadata": {},
|
235 |
+
"outputs": [],
|
236 |
+
"source": [
|
237 |
+
"for i in np.arange(1.5,100.5, 1):\n",
|
238 |
+
" print(i)"
|
239 |
+
]
|
240 |
+
},
|
241 |
+
{
|
242 |
+
"cell_type": "code",
|
243 |
+
"execution_count": null,
|
244 |
+
"metadata": {},
|
245 |
+
"outputs": [],
|
246 |
+
"source": []
|
247 |
+
}
|
248 |
+
],
|
249 |
+
"metadata": {
|
250 |
+
"colab": {
|
251 |
+
"provenance": []
|
252 |
+
},
|
253 |
+
"kernelspec": {
|
254 |
+
"display_name": "Python 3.9.16 ('rl3')",
|
255 |
+
"language": "python",
|
256 |
+
"name": "python3"
|
257 |
+
},
|
258 |
+
"language_info": {
|
259 |
+
"codemirror_mode": {
|
260 |
+
"name": "ipython",
|
261 |
+
"version": 3
|
262 |
+
},
|
263 |
+
"file_extension": ".py",
|
264 |
+
"mimetype": "text/x-python",
|
265 |
+
"name": "python",
|
266 |
+
"nbconvert_exporter": "python",
|
267 |
+
"pygments_lexer": "ipython3",
|
268 |
+
"version": "3.9.16"
|
269 |
+
},
|
270 |
+
"orig_nbformat": 4,
|
271 |
+
"vscode": {
|
272 |
+
"interpreter": {
|
273 |
+
"hash": "9070e15ca35f8308b0c5d51e893fc04d77e428fe4d803a6d9ae4f68a65d8ce17"
|
274 |
+
}
|
275 |
+
},
|
276 |
+
"widgets": {
|
277 |
+
"application/vnd.jupyter.widget-state+json": {
|
278 |
+
"01a2dbcb714e40148b41c761fcf43147": {
|
279 |
+
"model_module": "@jupyter-widgets/base",
|
280 |
+
"model_module_version": "1.2.0",
|
281 |
+
"model_name": "LayoutModel",
|
282 |
+
"state": {
|
283 |
+
"_model_module": "@jupyter-widgets/base",
|
284 |
+
"_model_module_version": "1.2.0",
|
285 |
+
"_model_name": "LayoutModel",
|
286 |
+
"_view_count": null,
|
287 |
+
"_view_module": "@jupyter-widgets/base",
|
288 |
+
"_view_module_version": "1.2.0",
|
289 |
+
"_view_name": "LayoutView",
|
290 |
+
"align_content": null,
|
291 |
+
"align_items": null,
|
292 |
+
"align_self": null,
|
293 |
+
"border": null,
|
294 |
+
"bottom": null,
|
295 |
+
"display": null,
|
296 |
+
"flex": null,
|
297 |
+
"flex_flow": null,
|
298 |
+
"grid_area": null,
|
299 |
+
"grid_auto_columns": null,
|
300 |
+
"grid_auto_flow": null,
|
301 |
+
"grid_auto_rows": null,
|
302 |
+
"grid_column": null,
|
303 |
+
"grid_gap": null,
|
304 |
+
"grid_row": null,
|
305 |
+
"grid_template_areas": null,
|
306 |
+
"grid_template_columns": null,
|
307 |
+
"grid_template_rows": null,
|
308 |
+
"height": null,
|
309 |
+
"justify_content": null,
|
310 |
+
"justify_items": null,
|
311 |
+
"left": null,
|
312 |
+
"margin": null,
|
313 |
+
"max_height": null,
|
314 |
+
"max_width": null,
|
315 |
+
"min_height": null,
|
316 |
+
"min_width": null,
|
317 |
+
"object_fit": null,
|
318 |
+
"object_position": null,
|
319 |
+
"order": null,
|
320 |
+
"overflow": null,
|
321 |
+
"overflow_x": null,
|
322 |
+
"overflow_y": null,
|
323 |
+
"padding": null,
|
324 |
+
"right": null,
|
325 |
+
"top": null,
|
326 |
+
"visibility": null,
|
327 |
+
"width": null
|
328 |
+
}
|
329 |
+
},
|
330 |
+
"20b0f38ec3234ff28a62a286cd57b933": {
|
331 |
+
"model_module": "@jupyter-widgets/controls",
|
332 |
+
"model_module_version": "1.5.0",
|
333 |
+
"model_name": "PasswordModel",
|
334 |
+
"state": {
|
335 |
+
"_dom_classes": [],
|
336 |
+
"_model_module": "@jupyter-widgets/controls",
|
337 |
+
"_model_module_version": "1.5.0",
|
338 |
+
"_model_name": "PasswordModel",
|
339 |
+
"_view_count": null,
|
340 |
+
"_view_module": "@jupyter-widgets/controls",
|
341 |
+
"_view_module_version": "1.5.0",
|
342 |
+
"_view_name": "PasswordView",
|
343 |
+
"continuous_update": true,
|
344 |
+
"description": "Token:",
|
345 |
+
"description_tooltip": null,
|
346 |
+
"disabled": false,
|
347 |
+
"layout": "IPY_MODEL_01a2dbcb714e40148b41c761fcf43147",
|
348 |
+
"placeholder": "",
|
349 |
+
"style": "IPY_MODEL_90c874e91b304ee1a7ef147767ac00ce",
|
350 |
+
"value": ""
|
351 |
+
}
|
352 |
+
},
|
353 |
+
"270cbb5d6e9c4b1e9e2f39c8b3b0c15f": {
|
354 |
+
"model_module": "@jupyter-widgets/controls",
|
355 |
+
"model_module_version": "1.5.0",
|
356 |
+
"model_name": "VBoxModel",
|
357 |
+
"state": {
|
358 |
+
"_dom_classes": [],
|
359 |
+
"_model_module": "@jupyter-widgets/controls",
|
360 |
+
"_model_module_version": "1.5.0",
|
361 |
+
"_model_name": "VBoxModel",
|
362 |
+
"_view_count": null,
|
363 |
+
"_view_module": "@jupyter-widgets/controls",
|
364 |
+
"_view_module_version": "1.5.0",
|
365 |
+
"_view_name": "VBoxView",
|
366 |
+
"box_style": "",
|
367 |
+
"children": [
|
368 |
+
"IPY_MODEL_a02224a43d8d4af3bd31d326540d25da",
|
369 |
+
"IPY_MODEL_20b0f38ec3234ff28a62a286cd57b933",
|
370 |
+
"IPY_MODEL_f6c845330d6743c0b35c2c7ad834de77",
|
371 |
+
"IPY_MODEL_f1675c09d16a4251b403f9c56255f168",
|
372 |
+
"IPY_MODEL_c1a82965ae26479a98e4fdbde1e64ec2"
|
373 |
+
],
|
374 |
+
"layout": "IPY_MODEL_3fa248114ac24656ba74923936a94d2d"
|
375 |
+
}
|
376 |
+
},
|
377 |
+
"2dc5fa9aa3334dfcbdee9c238f2ef60b": {
|
378 |
+
"model_module": "@jupyter-widgets/controls",
|
379 |
+
"model_module_version": "1.5.0",
|
380 |
+
"model_name": "DescriptionStyleModel",
|
381 |
+
"state": {
|
382 |
+
"_model_module": "@jupyter-widgets/controls",
|
383 |
+
"_model_module_version": "1.5.0",
|
384 |
+
"_model_name": "DescriptionStyleModel",
|
385 |
+
"_view_count": null,
|
386 |
+
"_view_module": "@jupyter-widgets/base",
|
387 |
+
"_view_module_version": "1.2.0",
|
388 |
+
"_view_name": "StyleView",
|
389 |
+
"description_width": ""
|
390 |
+
}
|
391 |
+
},
|
392 |
+
"3e753b0212644990b558c68853ff2041": {
|
393 |
+
"model_module": "@jupyter-widgets/base",
|
394 |
+
"model_module_version": "1.2.0",
|
395 |
+
"model_name": "LayoutModel",
|
396 |
+
"state": {
|
397 |
+
"_model_module": "@jupyter-widgets/base",
|
398 |
+
"_model_module_version": "1.2.0",
|
399 |
+
"_model_name": "LayoutModel",
|
400 |
+
"_view_count": null,
|
401 |
+
"_view_module": "@jupyter-widgets/base",
|
402 |
+
"_view_module_version": "1.2.0",
|
403 |
+
"_view_name": "LayoutView",
|
404 |
+
"align_content": null,
|
405 |
+
"align_items": null,
|
406 |
+
"align_self": null,
|
407 |
+
"border": null,
|
408 |
+
"bottom": null,
|
409 |
+
"display": null,
|
410 |
+
"flex": null,
|
411 |
+
"flex_flow": null,
|
412 |
+
"grid_area": null,
|
413 |
+
"grid_auto_columns": null,
|
414 |
+
"grid_auto_flow": null,
|
415 |
+
"grid_auto_rows": null,
|
416 |
+
"grid_column": null,
|
417 |
+
"grid_gap": null,
|
418 |
+
"grid_row": null,
|
419 |
+
"grid_template_areas": null,
|
420 |
+
"grid_template_columns": null,
|
421 |
+
"grid_template_rows": null,
|
422 |
+
"height": null,
|
423 |
+
"justify_content": null,
|
424 |
+
"justify_items": null,
|
425 |
+
"left": null,
|
426 |
+
"margin": null,
|
427 |
+
"max_height": null,
|
428 |
+
"max_width": null,
|
429 |
+
"min_height": null,
|
430 |
+
"min_width": null,
|
431 |
+
"object_fit": null,
|
432 |
+
"object_position": null,
|
433 |
+
"order": null,
|
434 |
+
"overflow": null,
|
435 |
+
"overflow_x": null,
|
436 |
+
"overflow_y": null,
|
437 |
+
"padding": null,
|
438 |
+
"right": null,
|
439 |
+
"top": null,
|
440 |
+
"visibility": null,
|
441 |
+
"width": null
|
442 |
+
}
|
443 |
+
},
|
444 |
+
"3fa248114ac24656ba74923936a94d2d": {
|
445 |
+
"model_module": "@jupyter-widgets/base",
|
446 |
+
"model_module_version": "1.2.0",
|
447 |
+
"model_name": "LayoutModel",
|
448 |
+
"state": {
|
449 |
+
"_model_module": "@jupyter-widgets/base",
|
450 |
+
"_model_module_version": "1.2.0",
|
451 |
+
"_model_name": "LayoutModel",
|
452 |
+
"_view_count": null,
|
453 |
+
"_view_module": "@jupyter-widgets/base",
|
454 |
+
"_view_module_version": "1.2.0",
|
455 |
+
"_view_name": "LayoutView",
|
456 |
+
"align_content": null,
|
457 |
+
"align_items": "center",
|
458 |
+
"align_self": null,
|
459 |
+
"border": null,
|
460 |
+
"bottom": null,
|
461 |
+
"display": "flex",
|
462 |
+
"flex": null,
|
463 |
+
"flex_flow": "column",
|
464 |
+
"grid_area": null,
|
465 |
+
"grid_auto_columns": null,
|
466 |
+
"grid_auto_flow": null,
|
467 |
+
"grid_auto_rows": null,
|
468 |
+
"grid_column": null,
|
469 |
+
"grid_gap": null,
|
470 |
+
"grid_row": null,
|
471 |
+
"grid_template_areas": null,
|
472 |
+
"grid_template_columns": null,
|
473 |
+
"grid_template_rows": null,
|
474 |
+
"height": null,
|
475 |
+
"justify_content": null,
|
476 |
+
"justify_items": null,
|
477 |
+
"left": null,
|
478 |
+
"margin": null,
|
479 |
+
"max_height": null,
|
480 |
+
"max_width": null,
|
481 |
+
"min_height": null,
|
482 |
+
"min_width": null,
|
483 |
+
"object_fit": null,
|
484 |
+
"object_position": null,
|
485 |
+
"order": null,
|
486 |
+
"overflow": null,
|
487 |
+
"overflow_x": null,
|
488 |
+
"overflow_y": null,
|
489 |
+
"padding": null,
|
490 |
+
"right": null,
|
491 |
+
"top": null,
|
492 |
+
"visibility": null,
|
493 |
+
"width": "50%"
|
494 |
+
}
|
495 |
+
},
|
496 |
+
"42d140b838b844819bc127afc1b7bc84": {
|
497 |
+
"model_module": "@jupyter-widgets/controls",
|
498 |
+
"model_module_version": "1.5.0",
|
499 |
+
"model_name": "DescriptionStyleModel",
|
500 |
+
"state": {
|
501 |
+
"_model_module": "@jupyter-widgets/controls",
|
502 |
+
"_model_module_version": "1.5.0",
|
503 |
+
"_model_name": "DescriptionStyleModel",
|
504 |
+
"_view_count": null,
|
505 |
+
"_view_module": "@jupyter-widgets/base",
|
506 |
+
"_view_module_version": "1.2.0",
|
507 |
+
"_view_name": "StyleView",
|
508 |
+
"description_width": ""
|
509 |
+
}
|
510 |
+
},
|
511 |
+
"90c874e91b304ee1a7ef147767ac00ce": {
|
512 |
+
"model_module": "@jupyter-widgets/controls",
|
513 |
+
"model_module_version": "1.5.0",
|
514 |
+
"model_name": "DescriptionStyleModel",
|
515 |
+
"state": {
|
516 |
+
"_model_module": "@jupyter-widgets/controls",
|
517 |
+
"_model_module_version": "1.5.0",
|
518 |
+
"_model_name": "DescriptionStyleModel",
|
519 |
+
"_view_count": null,
|
520 |
+
"_view_module": "@jupyter-widgets/base",
|
521 |
+
"_view_module_version": "1.2.0",
|
522 |
+
"_view_name": "StyleView",
|
523 |
+
"description_width": ""
|
524 |
+
}
|
525 |
+
},
|
526 |
+
"9d847f9a7d47458d8cd57d9b599e47c6": {
|
527 |
+
"model_module": "@jupyter-widgets/base",
|
528 |
+
"model_module_version": "1.2.0",
|
529 |
+
"model_name": "LayoutModel",
|
530 |
+
"state": {
|
531 |
+
"_model_module": "@jupyter-widgets/base",
|
532 |
+
"_model_module_version": "1.2.0",
|
533 |
+
"_model_name": "LayoutModel",
|
534 |
+
"_view_count": null,
|
535 |
+
"_view_module": "@jupyter-widgets/base",
|
536 |
+
"_view_module_version": "1.2.0",
|
537 |
+
"_view_name": "LayoutView",
|
538 |
+
"align_content": null,
|
539 |
+
"align_items": null,
|
540 |
+
"align_self": null,
|
541 |
+
"border": null,
|
542 |
+
"bottom": null,
|
543 |
+
"display": null,
|
544 |
+
"flex": null,
|
545 |
+
"flex_flow": null,
|
546 |
+
"grid_area": null,
|
547 |
+
"grid_auto_columns": null,
|
548 |
+
"grid_auto_flow": null,
|
549 |
+
"grid_auto_rows": null,
|
550 |
+
"grid_column": null,
|
551 |
+
"grid_gap": null,
|
552 |
+
"grid_row": null,
|
553 |
+
"grid_template_areas": null,
|
554 |
+
"grid_template_columns": null,
|
555 |
+
"grid_template_rows": null,
|
556 |
+
"height": null,
|
557 |
+
"justify_content": null,
|
558 |
+
"justify_items": null,
|
559 |
+
"left": null,
|
560 |
+
"margin": null,
|
561 |
+
"max_height": null,
|
562 |
+
"max_width": null,
|
563 |
+
"min_height": null,
|
564 |
+
"min_width": null,
|
565 |
+
"object_fit": null,
|
566 |
+
"object_position": null,
|
567 |
+
"order": null,
|
568 |
+
"overflow": null,
|
569 |
+
"overflow_x": null,
|
570 |
+
"overflow_y": null,
|
571 |
+
"padding": null,
|
572 |
+
"right": null,
|
573 |
+
"top": null,
|
574 |
+
"visibility": null,
|
575 |
+
"width": null
|
576 |
+
}
|
577 |
+
},
|
578 |
+
"a02224a43d8d4af3bd31d326540d25da": {
|
579 |
+
"model_module": "@jupyter-widgets/controls",
|
580 |
+
"model_module_version": "1.5.0",
|
581 |
+
"model_name": "HTMLModel",
|
582 |
+
"state": {
|
583 |
+
"_dom_classes": [],
|
584 |
+
"_model_module": "@jupyter-widgets/controls",
|
585 |
+
"_model_module_version": "1.5.0",
|
586 |
+
"_model_name": "HTMLModel",
|
587 |
+
"_view_count": null,
|
588 |
+
"_view_module": "@jupyter-widgets/controls",
|
589 |
+
"_view_module_version": "1.5.0",
|
590 |
+
"_view_name": "HTMLView",
|
591 |
+
"description": "",
|
592 |
+
"description_tooltip": null,
|
593 |
+
"layout": "IPY_MODEL_caef095934ec47bbb8b64eab22049284",
|
594 |
+
"placeholder": "",
|
595 |
+
"style": "IPY_MODEL_2dc5fa9aa3334dfcbdee9c238f2ef60b",
|
596 |
+
"value": "<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.svg\nalt='Hugging Face'> <br> Copy a token from <a\nhref=\"https://huggingface.co/settings/tokens\" target=\"_blank\">your Hugging Face\ntokens page</a> and paste it below. <br> Immediately click login after copying\nyour token or it might be stored in plain text in this notebook file. </center>"
|
597 |
+
}
|
598 |
+
},
|
599 |
+
"a2cfb91cf66447d7899292854bd64a07": {
|
600 |
+
"model_module": "@jupyter-widgets/base",
|
601 |
+
"model_module_version": "1.2.0",
|
602 |
+
"model_name": "LayoutModel",
|
603 |
+
"state": {
|
604 |
+
"_model_module": "@jupyter-widgets/base",
|
605 |
+
"_model_module_version": "1.2.0",
|
606 |
+
"_model_name": "LayoutModel",
|
607 |
+
"_view_count": null,
|
608 |
+
"_view_module": "@jupyter-widgets/base",
|
609 |
+
"_view_module_version": "1.2.0",
|
610 |
+
"_view_name": "LayoutView",
|
611 |
+
"align_content": null,
|
612 |
+
"align_items": null,
|
613 |
+
"align_self": null,
|
614 |
+
"border": null,
|
615 |
+
"bottom": null,
|
616 |
+
"display": null,
|
617 |
+
"flex": null,
|
618 |
+
"flex_flow": null,
|
619 |
+
"grid_area": null,
|
620 |
+
"grid_auto_columns": null,
|
621 |
+
"grid_auto_flow": null,
|
622 |
+
"grid_auto_rows": null,
|
623 |
+
"grid_column": null,
|
624 |
+
"grid_gap": null,
|
625 |
+
"grid_row": null,
|
626 |
+
"grid_template_areas": null,
|
627 |
+
"grid_template_columns": null,
|
628 |
+
"grid_template_rows": null,
|
629 |
+
"height": null,
|
630 |
+
"justify_content": null,
|
631 |
+
"justify_items": null,
|
632 |
+
"left": null,
|
633 |
+
"margin": null,
|
634 |
+
"max_height": null,
|
635 |
+
"max_width": null,
|
636 |
+
"min_height": null,
|
637 |
+
"min_width": null,
|
638 |
+
"object_fit": null,
|
639 |
+
"object_position": null,
|
640 |
+
"order": null,
|
641 |
+
"overflow": null,
|
642 |
+
"overflow_x": null,
|
643 |
+
"overflow_y": null,
|
644 |
+
"padding": null,
|
645 |
+
"right": null,
|
646 |
+
"top": null,
|
647 |
+
"visibility": null,
|
648 |
+
"width": null
|
649 |
+
}
|
650 |
+
},
|
651 |
+
"c1a82965ae26479a98e4fdbde1e64ec2": {
|
652 |
+
"model_module": "@jupyter-widgets/controls",
|
653 |
+
"model_module_version": "1.5.0",
|
654 |
+
"model_name": "HTMLModel",
|
655 |
+
"state": {
|
656 |
+
"_dom_classes": [],
|
657 |
+
"_model_module": "@jupyter-widgets/controls",
|
658 |
+
"_model_module_version": "1.5.0",
|
659 |
+
"_model_name": "HTMLModel",
|
660 |
+
"_view_count": null,
|
661 |
+
"_view_module": "@jupyter-widgets/controls",
|
662 |
+
"_view_module_version": "1.5.0",
|
663 |
+
"_view_name": "HTMLView",
|
664 |
+
"description": "",
|
665 |
+
"description_tooltip": null,
|
666 |
+
"layout": "IPY_MODEL_9d847f9a7d47458d8cd57d9b599e47c6",
|
667 |
+
"placeholder": "",
|
668 |
+
"style": "IPY_MODEL_42d140b838b844819bc127afc1b7bc84",
|
669 |
+
"value": "\n<b>Pro Tip:</b> If you don't already have one, you can create a dedicated\n'notebooks' token with 'write' access, that you can then easily reuse for all\nnotebooks. </center>"
|
670 |
+
}
|
671 |
+
},
|
672 |
+
"caef095934ec47bbb8b64eab22049284": {
|
673 |
+
"model_module": "@jupyter-widgets/base",
|
674 |
+
"model_module_version": "1.2.0",
|
675 |
+
"model_name": "LayoutModel",
|
676 |
+
"state": {
|
677 |
+
"_model_module": "@jupyter-widgets/base",
|
678 |
+
"_model_module_version": "1.2.0",
|
679 |
+
"_model_name": "LayoutModel",
|
680 |
+
"_view_count": null,
|
681 |
+
"_view_module": "@jupyter-widgets/base",
|
682 |
+
"_view_module_version": "1.2.0",
|
683 |
+
"_view_name": "LayoutView",
|
684 |
+
"align_content": null,
|
685 |
+
"align_items": null,
|
686 |
+
"align_self": null,
|
687 |
+
"border": null,
|
688 |
+
"bottom": null,
|
689 |
+
"display": null,
|
690 |
+
"flex": null,
|
691 |
+
"flex_flow": null,
|
692 |
+
"grid_area": null,
|
693 |
+
"grid_auto_columns": null,
|
694 |
+
"grid_auto_flow": null,
|
695 |
+
"grid_auto_rows": null,
|
696 |
+
"grid_column": null,
|
697 |
+
"grid_gap": null,
|
698 |
+
"grid_row": null,
|
699 |
+
"grid_template_areas": null,
|
700 |
+
"grid_template_columns": null,
|
701 |
+
"grid_template_rows": null,
|
702 |
+
"height": null,
|
703 |
+
"justify_content": null,
|
704 |
+
"justify_items": null,
|
705 |
+
"left": null,
|
706 |
+
"margin": null,
|
707 |
+
"max_height": null,
|
708 |
+
"max_width": null,
|
709 |
+
"min_height": null,
|
710 |
+
"min_width": null,
|
711 |
+
"object_fit": null,
|
712 |
+
"object_position": null,
|
713 |
+
"order": null,
|
714 |
+
"overflow": null,
|
715 |
+
"overflow_x": null,
|
716 |
+
"overflow_y": null,
|
717 |
+
"padding": null,
|
718 |
+
"right": null,
|
719 |
+
"top": null,
|
720 |
+
"visibility": null,
|
721 |
+
"width": null
|
722 |
+
}
|
723 |
+
},
|
724 |
+
"eaba3f1de4444aabadfea2a3dadb1d80": {
|
725 |
+
"model_module": "@jupyter-widgets/controls",
|
726 |
+
"model_module_version": "1.5.0",
|
727 |
+
"model_name": "DescriptionStyleModel",
|
728 |
+
"state": {
|
729 |
+
"_model_module": "@jupyter-widgets/controls",
|
730 |
+
"_model_module_version": "1.5.0",
|
731 |
+
"_model_name": "DescriptionStyleModel",
|
732 |
+
"_view_count": null,
|
733 |
+
"_view_module": "@jupyter-widgets/base",
|
734 |
+
"_view_module_version": "1.2.0",
|
735 |
+
"_view_name": "StyleView",
|
736 |
+
"description_width": ""
|
737 |
+
}
|
738 |
+
},
|
739 |
+
"ee4a21bedc504171ad09d205d634b528": {
|
740 |
+
"model_module": "@jupyter-widgets/controls",
|
741 |
+
"model_module_version": "1.5.0",
|
742 |
+
"model_name": "ButtonStyleModel",
|
743 |
+
"state": {
|
744 |
+
"_model_module": "@jupyter-widgets/controls",
|
745 |
+
"_model_module_version": "1.5.0",
|
746 |
+
"_model_name": "ButtonStyleModel",
|
747 |
+
"_view_count": null,
|
748 |
+
"_view_module": "@jupyter-widgets/base",
|
749 |
+
"_view_module_version": "1.2.0",
|
750 |
+
"_view_name": "StyleView",
|
751 |
+
"button_color": null,
|
752 |
+
"font_weight": ""
|
753 |
+
}
|
754 |
+
},
|
755 |
+
"f1675c09d16a4251b403f9c56255f168": {
|
756 |
+
"model_module": "@jupyter-widgets/controls",
|
757 |
+
"model_module_version": "1.5.0",
|
758 |
+
"model_name": "ButtonModel",
|
759 |
+
"state": {
|
760 |
+
"_dom_classes": [],
|
761 |
+
"_model_module": "@jupyter-widgets/controls",
|
762 |
+
"_model_module_version": "1.5.0",
|
763 |
+
"_model_name": "ButtonModel",
|
764 |
+
"_view_count": null,
|
765 |
+
"_view_module": "@jupyter-widgets/controls",
|
766 |
+
"_view_module_version": "1.5.0",
|
767 |
+
"_view_name": "ButtonView",
|
768 |
+
"button_style": "",
|
769 |
+
"description": "Login",
|
770 |
+
"disabled": false,
|
771 |
+
"icon": "",
|
772 |
+
"layout": "IPY_MODEL_a2cfb91cf66447d7899292854bd64a07",
|
773 |
+
"style": "IPY_MODEL_ee4a21bedc504171ad09d205d634b528",
|
774 |
+
"tooltip": ""
|
775 |
+
}
|
776 |
+
},
|
777 |
+
"f6c845330d6743c0b35c2c7ad834de77": {
|
778 |
+
"model_module": "@jupyter-widgets/controls",
|
779 |
+
"model_module_version": "1.5.0",
|
780 |
+
"model_name": "CheckboxModel",
|
781 |
+
"state": {
|
782 |
+
"_dom_classes": [],
|
783 |
+
"_model_module": "@jupyter-widgets/controls",
|
784 |
+
"_model_module_version": "1.5.0",
|
785 |
+
"_model_name": "CheckboxModel",
|
786 |
+
"_view_count": null,
|
787 |
+
"_view_module": "@jupyter-widgets/controls",
|
788 |
+
"_view_module_version": "1.5.0",
|
789 |
+
"_view_name": "CheckboxView",
|
790 |
+
"description": "Add token as git credential?",
|
791 |
+
"description_tooltip": null,
|
792 |
+
"disabled": false,
|
793 |
+
"indent": true,
|
794 |
+
"layout": "IPY_MODEL_3e753b0212644990b558c68853ff2041",
|
795 |
+
"style": "IPY_MODEL_eaba3f1de4444aabadfea2a3dadb1d80",
|
796 |
+
"value": true
|
797 |
+
}
|
798 |
+
}
|
799 |
+
}
|
800 |
+
}
|
801 |
+
},
|
802 |
+
"nbformat": 4,
|
803 |
+
"nbformat_minor": 0
|
804 |
+
}
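Note on the `AttributeError` recorded in Test_custom_loss.ipynb above: in Keras, `train_on_batch` runs a single gradient update and returns the loss value (or a list of loss and metric values). It does not return the `History` object that `fit` returns, which is why `history.history['loss']` fails on a float. Below is a minimal sketch of two ways to get a loss curve. It assumes a plain `tf.keras.Model` with an `mse` loss standing in for the notebook's `CustomModel`, whose definition is not shown in this part of the diff. The step and epoch counts are arbitrary illustration values.

```python
import numpy as np
import tensorflow as tf

# Same toy data as the notebook: learn y = x for x in 1..99.
x = np.array([[i] for i in range(1, 100)], dtype="float32")
y = x.copy()

# Assumption: a plain Keras model replaces the notebook's CustomModel,
# and "mse" replaces whatever loss CustomModel computes internally.
inputs = tf.keras.Input(shape=(1,), name="x_input")
outputs = tf.keras.layers.Dense(1, activation=None)(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs, name="train_only")
model.compile(loss="mse", optimizer=tf.keras.optimizers.Adam(learning_rate=0.1))

# Option 1: train_on_batch returns the loss for one update step,
# so the per-step losses are collected by hand.
losses = []
for _ in range(20):
    loss = model.train_on_batch(x, y)
    losses.append(float(loss))

# Option 2: fit() returns a History object whose .history dict
# holds one loss value per epoch.
history = model.fit(x, y, epochs=3, verbose=0)
epoch_losses = history.history["loss"]
```

Either `losses` or `epoch_losses` can then be passed to `plt.plot(...)` in place of `history.history['loss']` from the failing cell.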
|
fin_rl_dqn_v1.ipynb
CHANGED
@@ -2045,6 +2045,11 @@
|
|
2045 |
"Sequential random (175 days)- Trade the test set sequentially from start to end day with random actions "
|
2046 |
]
|
2047 |
},
|
|
|
|
|
|
|
|
|
|
|
2048 |
{
|
2049 |
"cell_type": "markdown",
|
2050 |
"metadata": {},
|
|
|
2045 |
"Sequential random (175 days)- Trade the test set sequentially from start to end day with random actions "
|
2046 |
]
|
2047 |
},
|
2048 |
+
{
|
2049 |
+
"cell_type": "markdown",
|
2050 |
+
"metadata": {},
|
2051 |
+
"source": []
|
2052 |
+
},
|
2053 |
{
|
2054 |
"cell_type": "markdown",
|
2055 |
"metadata": {},
|
fin_rl_dqn_v2.ipynb
ADDED
The diff for this file is too large to render.
|
|
fin_rl_policy_gradiant_v1.ipynb
ADDED
The diff for this file is too large to render.
See raw diff
|
|
fin_rl_qlearning_v1.ipynb
CHANGED
@@ -784,7 +784,7 @@
|
|
784 |
"\n",
|
785 |
"| Model | 1000 trades 20 steps | Sequential trading | 1000 trades 20 steps random actions | Sequential random|\n",
|
786 |
"|------------|----------------------|--------------------|-------------------------------------|------------------|\n",
|
787 |
-
"|Q-learning |
|
788 |
"\n"
|
789 |
]
|
790 |
},
|
|
|
784 |
"\n",
|
785 |
"| Model | 1000 trades 20 steps | Sequential trading | 1000 trades 20 steps random actions | Sequential random|\n",
|
786 |
"|------------|----------------------|--------------------|-------------------------------------|------------------|\n",
|
787 |
+
"|Q-learning | 113.14 | 563.67 | -18.10 | 39.30 |\n",
|
788 |
"\n"
|
789 |
]
|
790 |
},
|