change default option and format #43
ruler/ruler.py (Outdated)

```python
    arr: Iterable[np.ndarray],
    phys_size: Iterable[np.ndarray],
    margin_size: Iterable[np.ndarray] = np.zeros((2, 2)),
    measure_region: str = 'min',
```
Suggest making this an enum instead of a string, which will alleviate validation and the need to enumerate all the variations above. Similar suggestion for the `pad_mode` parameter.
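A rough sketch of the enum suggestion (the member names here are hypothetical; the actual options accepted by ruler.py may differ):

```python
from enum import Enum

class MeasureRegion(Enum):
    # Hypothetical member names; the real options in ruler.py may differ.
    SOLID = 'solid'
    VOID = 'void'
    MIN = 'min'

# An invalid value now fails loudly at lookup time, so minimum_length()
# no longer needs manual string validation:
region = MeasureRegion('min')  # MeasureRegion.MIN
```

Passing `MeasureRegion('bogus')` raises a `ValueError` immediately, instead of silently falling through a chain of string comparisons.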
ruler/ruler.py (Outdated)

```python
def minimum_length(
    arr: Iterable[np.ndarray],
    phys_size: Iterable[np.ndarray],
```
Does it make sense for these to be `np.ndarray`s (which, in general, can have any number of dimensions / shape)? If I understand these parameters correctly, `phys_size` specifies the extent in physical units of the design `arr` along each of its axes? If so, could we just type this as `Tuple[float, ...]` or `List[float]` and validate that `len(phys_size) == arr.ndim`? IMO this would be more straightforward.

Similar comment for `margin_size`. If I understand correctly, it is meant to be interpreted similarly to a padding, but is instead a sort of de-padding. I would suggest a type similar to the `pad_width` in numpy's pad operation, like `List[Tuple[float, float]]`, where the quantities are `[(before_1, after_1), ... (before_N, after_N)]` for each axis.
> Does it make sense for these to be np.ndarrays (which, in general, can have any number of dimensions / shape)?

I think it makes sense for the input design pattern `arr` to be an `np.ndarray`, but `phys_size` does not need to be. The code can accept `phys_size` as an array, a list, a tuple, or simply a number (for a 1d design pattern).

> If I understand these parameters correctly, phys_size specifies the extent in physical units of the design arr along each of its axes?

Yes, this is correct.

> If so, could we just type this as Tuple[float, ...] or List[float] and validate that len(phys_size) == arr.ndim?

The argument `phys_size` can have different types, as mentioned above. If we have to choose one of the types for the type hints, why is `Tuple` or `List` preferred over `Iterable[np.ndarray]` (apart from appearing more straightforward)?

> Similar comment for margin_size. If I understand correctly, it is meant to be interpreted similarly to a padding, but is instead a sort of de-padding.

Yes, this is correct.

> I would suggest a type similar to the pad_width in numpy's pad operation like List[Tuple[float, float]] where the quantities are [(before_1, after_1), ... (before_N, after_N)] for each axis.

This will be adopted in the next version.
> The code can accept phys_size as an array, a list, a tuple, or simply a number (for 1d design pattern).

Supporting so many possibilities places a much higher burden on validating the correctness of the inputs. In particular, `np.ndarray` is a completely unconstrained type and seems quite awkward (IMO) for specifying the extent and border "throw away" region. Constraining the API for `phys_size` and `margin_size` to be either a tuple or a list whose length matches `arr.ndim` seems reasonable and has precedent in various numpy APIs.
Agreed that we should keep things as simple as possible. To that end, it may be worthwhile removing the `phys_size` here and simply computing the minimum feature size in units of pixels. A separate API with a physical size could then be created by wrapping this simpler function.
> To that end, it may be worthwhile removing the phys_size here and simply compute the minimum feature size in units of pixels. A separate API with a physical size could then be created by wrapping this simpler function.

The resolution along different dimensions may not always be the same. For example, in a 2d design pattern, the resolution along the x direction may differ from the resolution along the y direction, although we have not encountered this situation in this project. Such anisotropy makes it infeasible to decouple the two steps. We can only decouple them safely for 1d design patterns, and that is how it was done in `ruler.py`.
> If so, could we just type this as Tuple[float, ...] or List[float] and validate that len(phys_size) == arr.ndim?

Sometimes the `phys_size` and `arr` provided by users may have redundant dimensions, and `phys_size` may contain zeros. In `_ruler_initialize`, zeros in `phys_size` are removed and `arr` is squeezed, after which the dimension check is performed. To remove zeros from `phys_size`, my approach is to convert it to an array and then apply `.nonzero()` to obtain the indices of the nonzero elements.
ruler/ruler.py
Outdated
def minimum_length(arr,phys_size,margin_size=np.zeros((2,2)),measure_region='solid',pad_mode=None): | ||
Compute the minimum length scale in a design pattern. | ||
|
||
arr: input array that represents a design pattern. |
These first three parameters are typed as `Iterable`, and it's not clear from these docstrings how that is supposed to be interpreted. Is one supposed to provide a list of designs, with the function returning a list of measured feature sizes (per provided design)?
> Is one supposed to provide a list of designs and the function will return a list of measured feature sizes (per provided design)?

No, the user is supposed to provide one design at a time rather than a list of designs. As mentioned in the annotation, "arr: input array that represents a design pattern." Do you think it is still unclear?
Then `Iterable[np.ndarray]` is not the correct type annotation. It should be `np.ndarray`.
I see. Thanks!
In this design, it seems natural that the minimum length scale is the diameter of the circle. This effectively assumes that there is more void to the right of the image, which seems better justified than the alternative, i.e. assuming that there is solid to the right of the image. To me, it seems like edge padding is generally the most reasonable choice.

(edit) Actually, upon reflection I think I see the utility of trimming the border, for example when a feature cuts diagonally across the design region. In this case, edge padding could create sharp 45 degree corners which have small minimum length scales. Let me try to run some tests with my version of the ruler.
In …

Yes, users can always do this manually. The padding option was added to …

Yes. Two examples are here.
In a diagonal line problem, if the border is not trimmed, the sharp corners should have a length scale that is much larger than the size of one pixel. In your example, I guess the length scale of the sharp corners is larger than the width of the strip; consequently, those sharp corners do not affect the results.

Here is a similar example that causes incorrect estimation. The width of the strip is 200, which is also the declared minimum length scale if the void gaps are ignored. However, the minimum length scales estimated by …

The code is as follows. (I got errors when I tried to rerun the notebook in your link.)
Ah yes, thanks for this example. It does motivate some flexibility to ignore the borders of the design, although I would still question whether we need the flexibility of arbitrary trimming along each edge. In any case, this motivates some changes to my own ruler implementation, so let me work on that, and I may have a proposal thereafter.
If we use the edge mode, what would be the minimum length scale of the void region in this pattern? When the edge mode is used, the result given by …
ruler/test_ruler.py (Outdated)

```python
class TestRuler(unittest.TestCase):
    def setUp(self):
```
There is no logic or parameterization being done in this `setUp()` routine. Therefore, it is better to set these constants at the module level, i.e.

```python
RESOLUTION = 1
PHYS_SIZE = (200, 200)
# ...
```

above the class definition.
ruler/ruler.py
Outdated
def void_minimum_length( | ||
arr: np.ndarray, | ||
phys_size: Tuple[float, ...], | ||
margin_size: List[Tuple[float, float]] = ((0, 0), (0, 0)) |
I think you would actually want `[(0, 0), (0, 0)]`, since your typing indicates a list of tuples. OTOH, this is probably not a reasonable default value, since it is fixed to two dimensions and `arr` could be 1, 2, or 3 dimensional. If you want the default behavior to be zero margin on top and bottom of all dims of `arr`, I suggest typing this as `Optional[...]`, setting the default to `None`, and handling the creation of an appropriate zero-valued `margin_size` for whatever dimension of `arr` is passed inside the function.
> I think you would actually want [(0, 0), (0, 0)], since your typing indicates a list of tuples.

I think something like `((0, 0), (0, 0))` may sound better, because it would follow the same syntax as `pad_width` in `numpy.pad`, as you suggested before.

> I suggest typing this as Optional[ ... ], setting the default to None

In the next version, this line will be `margin_size: Optional[Tuple[Tuple[float, float], ...]] = None`.
One other high-level comment is that we should document what each function returns in its docstring. I won't require that we adhere to a particular formatting style, but my bias would be to suggest the Google style, which is consistent with …
ruler/ruler.py
Outdated
phys_size: physical size of the design pattern. | ||
margin_size: physical size of borders that need to be discarded. | ||
''' | ||
if isinstance(phys_size, np.ndarray): |
Are these checks still needed? You're constraining `phys_size` to be a tuple of floats at the top level now, correct?
Although the type hint indicates that `phys_size` should be a tuple, I think some checks here are still needed if we want some flexibility in user input. I personally prefer keeping such flexibility for users. One reason is that a user can easily input a `phys_size` that has a reasonable type but is slightly different from our expected one. Another reason is that type hints do not report errors. In addition, we need to peel away redundant dimensions and remove zeros in `phys_size`, which can be done more easily when `phys_size` is converted to an array.

That being said, the code here can be simplified, because the `if` and the first `elif` branches can be merged into one `if` statement.
Can you explain the approach based on padding that you mentioned today, @mfschubert? I think trimming helps but padding does not. In recent versions, I removed the padding options for users.
Yes, let me try. I believe there is a problematic case, which is very similar to the one you shared: a feature that runs into the edge of the design region at a very shallow angle. I created such a design …

In all cases I am using a diameter-40 circular brush. Currently, the minimum feature size for solid is computed by comparing the binary opening of the edge-padded …

Turning now to the minimum feature size for void: this is computed by comparing the binary opening of …

My current belief is that we will need to check for violations twice: once with entirely solid padding, and once with entirely void padding. Then, if any pixel has no violations in even one of the two cases, we would consider it as having no violations.
Thanks for clarifying, @mfschubert!

I agree that this double-checking scheme can probably work well in many cases. The logic seems to be that entirely solid padding does not affect the result for the solid minimum length scale, provided the design pattern has a well-defined solid minimum length scale somewhere. Likewise, entirely void padding does not affect the result for the void minimum length scale, provided the design pattern has a well-defined void minimum length scale somewhere. Overall, apart from the edge-mode padding, it seems a good idea to do entirely solid padding when we estimate solid minimum length scales, and entirely void padding when we estimate void minimum length scales. However, when we want to estimate minimum length scales through open minus close, which is done by the function …
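The double-check idea above can be sketched roughly as follows. The function names are hypothetical and `scipy.ndimage` stands in for the project's actual morphology code; the point is only that a pixel counts as a solid violation when it is flagged under both all-solid and all-void border padding.

```python
import numpy as np
from scipy import ndimage

def _solid_violations(design, kernel, pad_value):
    # Solid pixels removed by binary opening, under a constant border padding.
    r = kernel.shape[0] // 2
    padded = np.pad(design.astype(bool), r, constant_values=pad_value)
    opened = ndimage.binary_opening(padded, structure=kernel)
    interior = (slice(r, -r),) * design.ndim
    return padded[interior] & ~opened[interior]

def robust_solid_violations(design, kernel):
    # A pixel is a violation only if it violates under BOTH paddings
    # (the "check twice" scheme described above).
    return (_solid_violations(design, kernel, True)
            & _solid_violations(design, kernel, False))
```

A 10-pixel square passes a 3x3 kernel under both paddings, while a 1-pixel-wide line is flagged under both, so the AND keeps the genuine violation.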
ruler/regular_shapes.py (Outdated)

```python
        declared_mls: float,
        angle: float = 0,
        center: Tuple[float, float] = (0, 0)) -> np.ndarray:
    '''
```
Complete the docstring for this and the disc() function below?
Sure!
@mawc2019 Nice work on the formatting / documentation! I think the code is much easier to follow now. As a minor general stylistic suggestion, I suggest using …
A non-docstring annotation may occupy several lines. Perhaps the …
It's a recommended approach: https://google.github.io/styleguide/pyguide.html#385-block-and-inline-comments
Force-pushed from 101f97b to 0df9cf8
Conclusion: let's go with min(open-design, design-close), use OpenCV, use the new definition of interfacial pixels, and use the simplest-to-explain definition of the discretized "circular" kernel.
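For a binary 0/1 design, the min(open-design, design-close) convention can be sketched as below, with `scipy.ndimage` standing in for the OpenCV calls for brevity: the elementwise minimum is -1 exactly where the pattern violates either the solid or the void length scale.

```python
import numpy as np
from scipy import ndimage

def open_close_violations(design, kernel):
    # design: binary 2d array (1 = solid); kernel: structuring element whose
    # diameter is the probed length scale.  Returns -1 at violating pixels.
    d = design.astype(bool)
    opened = ndimage.binary_opening(d, structure=kernel).astype(int)
    closed = ndimage.binary_closing(d, structure=kernel).astype(int)
    di = d.astype(int)
    # open - design flags solid violations (-1); design - close flags void
    # violations (-1); the minimum combines the two.
    return np.minimum(opened - di, di - closed)
```

The smallest kernel for which violations appear then gives the (combined) minimum length scale, without measuring solid and void separately.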
One other feature relates to the padding of the edge pixels of the design region. Our convention will be:
Commits:
- …ting regular shapes and testing ruler
- …ace phys_size with pixel_size in various places, and add more docstrings
- …ions, add the 3d metagrating optimized by meep
A few more issues need to be finalized, but they do not influence the results of minimum length scales.
Force-pushed from 0df9cf8 to 9a52603
To be clear, is the 1D codepath different from the 2D codepath? I would expect the 2D algorithm to work for 1D designs, and if we do have a dedicated 1D codepath, I would actually recommend eliminating it entirely. We don't need 1D length scales for the paper, and if I'm not mistaken, the 2D algorithm will work for 1D designs (as long as you add a dummy dimension). With this change, we can eliminate …

Regarding your question (5), best to name things descriptively so the user has an idea of what they are working with. For example: …
Yes. Both 1d and 2d share the same user functions, but inside the user functions, 1d problems are solved with a different approach from 2d. No morphological operations are involved in the 1d approach.

There are two reasons for using a dedicated 1d algorithm instead of making it 2d by adding a dummy dimension.

It does not affect any results in the paper and we do not need to mention it in the paper. But for the completeness of a program for computing minimum length scales, I suggest we keep it. By the way, we were also reluctant to eliminate the 3d code (as mentioned in the first point of @stevengj's comment here). We finally eliminated it not only because the paper does not need it, but also because we had to do so in order to use OpenCV. So, unlike the 3d code, nothing forces us to remove the 1d code.
Sorry, there was a mistake in my previous reply. It has been corrected. It should be "the argument …
Thanks for the suggestion. The first three sound good to me. I will change the names correspondingly. But for the last one, I still prefer …
I would be surprised if the 2D algorithm were inaccurate for 1D designs. In my own ruler I used 1D designs in some test cases. Are you sure this is the case? And, regarding the dummy dimension, all I meant was …

I don't think we actually need the new dimension to have size >1 for the code to work. Finally, regarding the naming convention, it's your call, but I wouldn't have two functions with the names …
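For reference, the dummy-dimension trick being discussed is just adding a size-1 axis so a 1d design can pass through the 2d codepath (a minimal numpy sketch, not the project's code):

```python
import numpy as np

arr_1d = np.array([0, 0, 1, 1, 1, 0, 0])
# Add a dummy axis of size 1: the array is now 2d with shape (1, 7).
arr_2d = arr_1d[np.newaxis, :]
```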
Thanks for this example. I guess this is a corner case I hadn't considered sufficiently. It is definitely unintended behavior. In any case, I am OK with the 1D codepath, as it seems there is some more work needed to merge it with the 2D case.
This is just what the current version does in …
…nd void minimum lengths separately and taking the minimum
Now the algorithm of the function …
In this PR, according to @stevengj's suggestion, the default option in `ruler.py` is changed back to open minus close. This default approach generates the smaller minimum length scale between the solid and void regions. The number could be slightly different from calculating minimum length scales for solid and void regions separately and then taking the minimum.

According to @ianwilliamson's suggestion, the output is always a two-element tuple. When a user requires only one number, the second element is `None`. The code in `ruler.py` has been formatted by `yapf`.

@oskooi suggested showing various designs in `README.md`. It was done in a previous PR. Do you think those were too many? Shall we move all of them to `example.ipynb` and provide a link in `README.md`?

The python test file will be uploaded later. After everything is done, shall we move it to the NanoComp site? In addition, the name `ruler` is quite arbitrary. Shall we rename it by extracting some letters from a descriptive phrase, e.g., `miles` from minimum length scale?