Computing flops #35
Comments
A1: Flops can be estimated as soon as the architecture of your model has been defined. For classical training schemes (e.g. training ResNet-50 on ImageNet via SGD) the architecture is fixed in advance, and the flops do not change during training.
A2: Giga means 10^6, so in your example 0.03 GMac = 30 MMac, you're right.
A3: See #16
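To illustrate A1, here is a minimal sketch (assuming a standard torchvision ResNet-50 and the usual ptflops entry point, get_model_complexity_info): the complexity is computed from a freshly constructed, untrained model, since only the architecture matters.

```python
import torchvision.models as models
from ptflops import get_model_complexity_info

# A freshly constructed, untrained ResNet-50: the architecture alone
# determines the MAC/parameter counts, no training loop is involved.
model = models.resnet50()

macs, params = get_model_complexity_info(
    model,
    (3, 224, 224),              # input resolution expected by the model
    as_strings=True,
    print_per_layer_stat=False,
)
print(f"Computational complexity: {macs}")
print(f"Number of parameters: {params}")
```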
Thank you very much. Regarding the second answer, A2: I think Giga means 10^9 based on your code, so when we convert to million MACs we should multiply GMacs by 1000. Regarding A3, I have seen those issues but it is actually still not clear to me.
Yes, giga is 10^9.
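As a quick arithmetic check of the unit conversion (nothing project-specific here):

```python
# 1 GMac = 10^9 MACs and 1 MMac = 10^6 MACs,
# so GMacs * 1000 gives MMacs.
gmacs = 0.03
mmacs = gmacs * 1000    # 30.0 MMac, matching the example above
print(f"{gmacs} GMac = {mmacs} MMac")
```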
This is the computation cost of MobileNetV2, which I think is not correct. What do you think?
[screenshot of the reported result omitted]
It seems OK to me. MobileNetV2 for ImageNet has a different number of params/MACs than MobileNetV2 on CIFAR-10. Why do you think this result is not correct?
I compared, and that is why I am confused. If we look at page 5 of this paper, we can see that the highest flops are 42.0 million for MobileNetV2, while the above result is 90.0 million flops. Also, your code does not count Block, while there are some convolution layers inside the block: "Warning: module Block is treated as a zero-op."
The numbers in this paper are quite weird: what are the differences between MobileNetV2 for CIFAR-10 and MobileNetV2 for SVHN? Both datasets have 32x32 images and 10 classes, so from the architecture perspective the MobileNets should be identical in tables 5 and 6 and have the same number of flops, yet the paper reports fewer flops for SVHN. Maybe you'd better ask the authors of the paper whether there is a difference between the stock MobileNetV2 and their versions.
Regarding the warnings: you should treat them carefully. The module Block is custom and ptflops doesn't have a rule for it, but at the same time it's just a container for other modules that can be parsed correctly. Unfortunately, I couldn't figure out a criterion for distinguishing such containers from modules that really need a custom rule to count flops correctly, so ptflops just outputs a warning for any unknown module.
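To make the container point concrete, here is a small hypothetical sketch (the Block below is made up, not the one from the model above): the warning fires for Block because ptflops has no rule for it, but the Conv2d/BatchNorm2d children are still counted, so the total stays correct.

```python
import torch.nn as nn
from ptflops import get_model_complexity_info

class Block(nn.Module):
    """A pure container: no computation of its own, only child modules
    that ptflops already knows how to count."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

model = nn.Sequential(Block(3, 16), Block(16, 32))

# ptflops prints "Warning: module Block is treated as a zero-op", yet the
# convolutions and batch norms inside each Block are still included in macs.
macs, params = get_model_complexity_info(
    model, (3, 32, 32), as_strings=True, print_per_layer_stat=False
)
print(macs, params)
```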
Thank you. Yes, that is really a problem; I am confused about why these two flop counts are different.
Please, could you answer my questions:
Q1- Can we compute flops for a model without training the model? Is there any relation between flops and training? Can training affect flops? When can flops be computed?
I am asking this question because I defined a model and then I computed the flops, and here are the results:
Computational complexity: 0.03 GMac
Number of parameters: 2.24 M
Q2- If we want the flops in millions, should we multiply 0.03 by 1000? If yes, then in this case the computational complexity is 30.0 million.
Q3- From what I understand of your code, a MAC is a flop, am I right? (See also the sketch after this message.)
Thank you
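On Q3 (also discussed in #16): ptflops reports multiply-accumulate operations (MACs). A common convention, though not something this thread settles, is to count one MAC as two FLOPs (one multiplication plus one addition), so a rough conversion would look like:

```python
# Rough conversion under the common "1 MAC ≈ 2 FLOPs" convention;
# exact definitions of "flops" vary between papers.
gmacs = 0.03
flops = gmacs * 1e9 * 2          # ≈ 6.0e7, i.e. about 60 million FLOPs
print(f"{gmacs} GMac ≈ {flops / 1e6:.1f} MFLOPs")
```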