
Remove extra activation layer in image_classification_from_scratch.py #1418

Closed
SuryanarayanaY wants to merge 4 commits

Conversation

@SuryanarayanaY (Contributor) commented Jul 6, 2023

The image_classification_from_scratch.py tutorial has an extra relu activation layer applied to a layer that has already gone through relu activation, which makes one line of code redundant. Although it does not affect the output, it unnecessarily confuses users and also consumes some resources during training. Hence I propose removing the unwanted line of code.

Fixes #1119

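As a quick sanity check (a minimal sketch, not taken from the tutorial itself), relu is idempotent, so applying the activation a second time returns the same tensor:

import numpy as np
from tensorflow.keras import layers

x = np.array([[-2.0, -0.5, 0.0, 1.5]], dtype="float32")
relu = layers.Activation("relu")
once = relu(x)
twice = relu(once)  # applying relu again to already-activated values
print(np.allclose(once, twice))  # True: relu(relu(x)) == relu(x)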
@divyashreepathihalli (Contributor) left a comment


Thank you for the PR. It is recommended to remove the first call at line 238 instead of the one in the loop.

@SuryanarayanaY (Contributor Author)
Actually, at line 240, previous_block_activation = x (where x = layers.Activation("relu")(x) is defined at line 238) stores the activated output as the residual, which is then used later at line 254:

x = layers.Activation("relu")(x)
previous_block_activation = x # Set aside residual
for size in [256, 512, 728]:
x = layers.Activation("relu")(x)
x = layers.SeparableConv2D(size, 3, padding="same")(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.SeparableConv2D(size, 3, padding="same")(x)
x = layers.BatchNormalization()(x)
x = layers.MaxPooling2D(3, strides=2, padding="same")(x)
# Project residual
residual = layers.Conv2D(size, 1, strides=2, padding="same")(
previous_block_activation
)

@divyashreepathihalli (Contributor)
@SuryanarayanaY in that case this change will break the implementation. I will close this PR. Thanks!

@SuryanarayanaY (Contributor Author) commented Jul 25, 2023

@divyashreepathihalli,

I was thinking from the perspective of the user request to remove the extra layer, but the issue here seems to be more than that. The author used the activation layer output as the residual. I think it's better to pass residuals without activation; in that case, removing the first activation call at line 238 itself makes sense, like the amendment in #1398.

If the author's intention is to pass the activation layer output as the residual, then my earlier proposal is the solution. In any case, passing an already activated output through the same activation layer again gives the same result, hence the proposed changes will not break anything. I have tested the code with the earlier proposal and it executes fine, as per the attached gist. Correct me if I am missing anything here.

I think what we need to pay more attention to here is whether the residuals should be passed with or without relu activation. If the residuals should bypass the activation, then we need to remove line 238 as you proposed.

Please suggest.
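For reference, a rough sketch of that second option (an illustration only, assuming the activation call at line 238 is dropped so the residual is set aside before activation; not an actual diff):

from tensorflow.keras import layers

# x is the output of the preceding Conv2D/BatchNormalization block
previous_block_activation = x  # Set aside residual (pre-activation; line 238 removed)

for size in [256, 512, 728]:
    x = layers.Activation("relu")(x)
    x = layers.SeparableConv2D(size, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)

    x = layers.Activation("relu")(x)
    x = layers.SeparableConv2D(size, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)

    x = layers.MaxPooling2D(3, strides=2, padding="same")(x)

    # Project residual and add it back, as in the tutorial's loop
    residual = layers.Conv2D(size, 1, strides=2, padding="same")(
        previous_block_activation
    )
    x = layers.add([x, residual])
    previous_block_activation = x  # Set aside next residual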

@SuryanarayanaY (Contributor Author)
@divyashreepathihalli,

Can I go ahead with the proposal you referred to earlier in this comment?

@divyashreepathihalli (Contributor)

Sure, please go ahead and update the changes.

Removed activation layer on residuals
Removed unwanted activation layer on residuals
@SuryanarayanaY (Contributor Author)

@divyashreepathihalli, done with the changes. Please review.

@gbaned commented Oct 31, 2023

Hi @divyashreepathihalli, can you please review this PR? Thank you!

@gbaned commented Jan 8, 2024

Hi @fchollet, can you please review this PR? Thank you!

@sachinprasadhs (Collaborator)

Files are outdated; can you please make the changes to the latest file https://keras.io/examples/vision/image_classification_from_scratch/

This PR is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions bot added the stale label Jan 26, 2024
This PR was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.

github-actions bot closed this Feb 10, 2024
Successfully merging this pull request may close these issues: Problems with the image classification example
4 participants