
Inconsistency when running the keras-io/examples/timeseries/timeseries_classification_transformer.py #1908

Open
condor-cp opened this issue Aug 13, 2024 · 2 comments

@condor-cp

Running the example shows an inconsistency in the number of parameters and in model performance compared to what is displayed.

It seems that the global average pooling layer needs data_format set to "channels_first" to reach the same number of parameters and an accuracy consistent with the displayed console log (tried on Google Colab).

x = layers.GlobalAveragePooling1D(data_format="channels_last")(x)

But then there is no actual pooling, only removal of the feature dimension => maybe another layer should be used.
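For context, the two data_format settings average over different axes. Here is a minimal NumPy sketch (not the Keras implementation itself) of what GlobalAveragePooling1D computes, assuming an input of shape (batch, 500, 1) as in this example:

```python
import numpy as np

# (batch, steps, features), matching the example's input layout
x = np.random.rand(4, 500, 1)

# data_format="channels_last": average over the steps axis -> (batch, features)
pooled_last = x.mean(axis=1)   # shape (4, 1)

# data_format="channels_first": Keras treats the input as (batch, features, steps)
# and averages over the last axis -> (batch, 500). With a single trailing
# dimension this performs no real averaging; it just drops that dimension.
pooled_first = x.mean(axis=2)  # shape (4, 500)

print(pooled_last.shape, pooled_first.shape)
```

This matches the two summaries below: channels_last feeds a 1-dim vector into the dense head, channels_first a 500-dim vector.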


mw66 commented Sep 17, 2024

Experienced the same problem:

with

    x = layers.GlobalAveragePooling1D(data_format="channels_last")(x)

the number of parameters is as follows (showing only the last few rows that differ):

...
 global_average_pooling1d (Glob  (None, 1)           0           ['tf.__operators__.add_7[0][0]']
 alAveragePooling1D)

 dense (Dense)                  (None, 128)          256         ['global_average_pooling1d[0][0]']

 dropout_8 (Dropout)            (None, 128)          0           ['dense[0][0]']

 dense_1 (Dense)                (None, 2)            258         ['dropout_8[0][0]']

==================================================================================================
Total params: 29,258
Trainable params: 29,258
Non-trainable params: 0

And the training stops very quickly with a bad result:

45/45 [==============================] - 24s 545ms/step - loss: 0.6922 - sparse_categorical_accuracy: 0.5208 - val_loss: 0.6952 - val_sparse_categorical_accuracy: 0.4799
42/42 [==============================] - 4s 91ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5159

While the page https://keras.io/examples/timeseries/timeseries_classification_transformer/ shows:

│ global_average_poo… │ (None, 500)       │       0 │ add_7[0][0]          │
│ (GlobalAveragePool… │                   │         │                      │
├─────────────────────┼───────────────────┼─────────┼──────────────────────┤
│ dense (Dense)       │ (None, 128)       │  64,128 │ global_average_pool… │
├─────────────────────┼───────────────────┼─────────┼──────────────────────┤
│ dropout_12          │ (None, 128)       │       0 │ dense[0][0]          │
│ (Dropout)           │                   │         │                      │
├─────────────────────┼───────────────────┼─────────┼──────────────────────┤
│ dense_1 (Dense)     │ (None, 2)         │     258 │ dropout_12[0][0]     │
└─────────────────────┴───────────────────┴─────────┴──────────────────────┘
 Total params: 93,130 (363.79 KB)
 Trainable params: 93,130 (363.79 KB)
 Non-trainable params: 0 (0.00 B)


mw66 commented Sep 17, 2024

Hi @fchollet,

x = layers.GlobalAveragePooling1D(data_format="channels_last")(x)

This line needs to be changed to channels_first to reproduce the good result shown on:

https://keras.io/examples/timeseries/timeseries_classification_transformer/

Can you make this change? Could you also add an explanation of why channels_first is needed instead of channels_last?

The input training data's shape is (3601, 500, 1), i.e. channels_last for sure; so why do we need to set channels_first to get the good training result?
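The parameter counts in the two summaries are consistent with the shape difference: a Dense layer has in_features * units + units parameters, so a 500-dim pooled output gives 64,128 for the first Dense(128), while a 1-dim output gives only 256. A quick sanity check (dense_params is a hypothetical helper, not a Keras API):

```python
def dense_params(in_features, units):
    # Weights plus biases of a fully connected layer.
    return in_features * units + units

print(dense_params(500, 128))  # 64128, matching the keras.io summary (channels_first)
print(dense_params(1, 128))    # 256, matching the channels_last run above
```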

Thanks.
