Fix test failures and address feedback
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
AyanSinhaMahapatra committed Aug 30, 2024
1 parent 6d93bba commit c427e01
Showing 24 changed files with 167 additions and 48 deletions.
10 changes: 10 additions & 0 deletions src/licensedcode/data/rules/cclrc_1.RULE
@@ -0,0 +1,10 @@
+---
+license_expression: cclrc
+is_license_notice: yes
+referenced_filenames:
+- External_License/CCLRC_CDAT_License.txt
+---
+
+* This software may be distributed under the terms of the
+* {{CCLRC Licence}} for CCLRC Software
+* <CDATDIR>/External_License/CCLRC_CDAT_License.txt
7 changes: 7 additions & 0 deletions src/licensedcode/data/rules/cclrc_2.RULE
@@ -0,0 +1,7 @@
+---
+license_expression: cclrc
+is_license_notice: yes
+---
+
+* This software may be distributed under the terms of the
+* {{CCLRC Licence}} for CCLRC Software
1 change: 1 addition & 0 deletions src/licensedcode/data/rules/mit_1182.RULE
@@ -1,6 +1,7 @@
 ---
 license_expression: mit
 is_license_reference: yes
+is_required_phrase: yes
 relevance: 100
 ---
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/mit_334.RULE
@@ -6,5 +6,5 @@ referenced_filenames:
 - LICENSE
 ---

-This software is released under the MIT software license.
+This software is released under the {{MIT software license}}.
 This license, including disclaimer, is available in the 'LICENSE' file.
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/mit_337.RULE
@@ -6,5 +6,5 @@ ignorable_urls:
 - http://www.opensource.org/licenses/mit-license.php
 ---

-This is the http://www.opensource.org/licenses/mit-license.php MIT Software License
+This is the http://www.opensource.org/licenses/mit-license.php {{MIT Software License}}
 which is OSI-certified, and GPL-compatible.
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/mit_392.RULE
@@ -8,5 +8,5 @@ ignorable_urls:
 - http://www.opensource.org/licenses/mit-license.php
 ---

-"Distributed under the MIT software license, see the accompanying file COPYING or
+"Distributed under the {{MIT software license}}, see the accompanying file COPYING or
 http://www.opensource.org/licenses/mit-license.php.
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/mit_396.RULE
@@ -6,5 +6,5 @@ referenced_filenames:
 - COPYING
 ---

-// Distributed under the MIT software license, see the accompanying
+// Distributed under the {{MIT software license}}, see the accompanying
 // file COPYING
6 changes: 3 additions & 3 deletions src/licensedcode/data/rules/mit_397.RULE
@@ -5,8 +5,8 @@ relevance: 100
 referenced_filenames:
 - COPYING
 ignorable_urls:
-- http://www.opensource.org/licenses/mit-license.php
+- https://www.opensource.org/licenses/mit-license.php
 ---

-// Distributed under the MIT software license, see the accompanying
-// file COPYING or shttp://www.opensource.org/{{licenses/mit}}-license.php.
+// Distributed under the {{MIT software license}}, see the accompanying
+// file COPYING or https://www.opensource.org/licenses/mit-license.php.
16 changes: 5 additions & 11 deletions src/licensedcode/models.py
@@ -1832,10 +1832,6 @@ def license_flags(self):
 for license_flag_name in self.license_flag_names
 }

-@property
-def license_flag_values(self):
-return self.license_flags.values()
-
 def validate(self, licensing=None, thorough=False):
 """
 Validate this rule using the provided ``licensing`` Licensing and yield
@@ -1844,8 +1840,7 @@ def validate(self, licensing=None, thorough=False):
 is_false_positive = self.is_false_positive

 has_license_flags = any(self.license_flag_values)
-has_no_license_flags = len([l for l in self.license_flag_values if l]) == 0
-has_many_license_flags = len([l for l in self.license_flag_values if l]) > 1
+license_flags_count = sum(self.license_flags.values())

 license_expression = self.license_expression

@@ -1888,11 +1883,8 @@ def validate(self, licensing=None, thorough=False):

 if not (0 <= self.relevance <= 100):
 yield 'Invalid rule relevance. Should be between 0 and 100.'

-if has_no_license_flags:
-yield 'Invalid rule no is_license_* flags present.'
-
-if has_many_license_flags:
+if license_flags_count:
 yield 'Invalid rule is_license_* flags. Only one allowed.'

 if not has_license_flags:
@@ -2295,6 +2287,8 @@ def dump(self, rules_data_dir, **kwargs):
 rule_file = self.rule_file(rules_data_dir=rules_data_dir)

 metadata = self.to_dict()
+# This can be used to pass objects to dump on the rule file with
+# other rule metadata, like debugging collection of required phrases
 if kwargs:
 metadata.update(kwargs)
 content = self.text
@@ -2328,7 +2322,7 @@ def load(self, rule_file, with_checks=True):
 raise e

 known_attributes = set(attr.fields_dict(self.__class__))
-# This is an attirbute used to debug marking required phrases, and is not needed
+# This is an attribute used to debug marking required phrases, and is not needed
 if "sources" in data:
 data.pop("sources")
 data_file_attributes = set(data)
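A minimal sketch of the flag-counting idea behind the `validate` change above: Python booleans sum as 0 and 1, so `sum(flags.values())` gives the number of flags that are set. The function name `check_license_flags`, the messages, and the "exactly one flag" thresholds are assumptions made for illustration; this is not the code from the commit.

```python
def check_license_flags(flags):
    """
    Yield an error message for each violated constraint on a mapping of
    is_license_* flag names to booleans (hypothetical helper).
    """
    # True counts as 1 and False as 0, so this is the number of set flags.
    count = sum(flags.values())

    if count == 0:
        yield 'Invalid rule is_license_* flags. At least one is needed.'
    elif count > 1:
        yield 'Invalid rule is_license_* flags. Only one allowed.'


ok = {'is_license_notice': True, 'is_license_tag': False}
none_set = {'is_license_notice': False, 'is_license_tag': False}
two_set = {'is_license_notice': True, 'is_license_tag': True}

print(list(check_license_flags(ok)))
print(list(check_license_flags(none_set)))
print(list(check_license_flags(two_set)))
```

Counting once and branching on the count replaces the two separate list comprehensions removed in the hunk above.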
7 changes: 1 addition & 6 deletions src/licensedcode/required_phrases.py
@@ -52,11 +52,6 @@ def get_required_phrase_spans(text):
 >>> x = get_required_phrase_spans(text)
 >>> assert x == [Span(4, 6)], x
->>> text = 'This is enclosed in {{double curly braces}}'
->>> # 0 1 2 3 4 5 6
->>> x = get_required_phrase_spans(text)
->>> assert x == ['double', 'curly', 'braces'], x
 >>> text = 'This is {{enclosed}} a {{double curly braces}} or not'
 >>> # 0 1 2 SW 3 4 5 6 7
 >>> x = get_required_phrase_spans(text)
@@ -114,7 +109,7 @@ def get_required_phrase_texts(text):
 >>> text = 'This is enclosed in {{double curly braces}}'
 >>> # 0 1 2 3 4 5 6
 >>> x = get_required_phrase_texts(text=text)
-assert x == ['double', 'curly', 'braces'], x
+assert x == ['double curly braces'], x
 """
 return [
 required_phrase.text
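The corrected doctest above expects the whole marked phrase back as one string. As a rough illustration of that behaviour, here is a regex-based sketch; the name `extract_required_phrase_texts` is hypothetical, and a plain regex is a stand-in for illustration rather than the approach used in `required_phrases.py`.

```python
import re

# Non-greedy match of anything between "{{" and "}}" on a single line.
REQUIRED_PHRASE = re.compile(r'\{\{(.*?)\}\}')


def extract_required_phrase_texts(text):
    """Return the text of each {{...}} marked phrase, in order of appearance."""
    return [phrase.strip() for phrase in REQUIRED_PHRASE.findall(text)]


print(extract_required_phrase_texts('This is enclosed in {{double curly braces}}'))
print(extract_required_phrase_texts('This is {{enclosed}} a {{double curly braces}} or not'))
```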
19 changes: 7 additions & 12 deletions src/licensedcode/tokenize.py
@@ -181,25 +181,21 @@ def get_non_overlapping_spans(old_required_phrase_spans, new_required_phrase_spa
 return new_required_phrase_spans

 for new_span in new_required_phrase_spans:
-if not any(
+if any(
 old_span.overlap(new_span) != 0
 for old_span in old_required_phrase_spans
 ):
-yield new_span
+continue
+
+yield new_span


 def combine_tokens(token_tuples):
 """
 Returns a string `combined_text` combining token tuples from the list `token_tuples`,
 which are token tuples created by the tokenizer functions.
-"""
-combined_text = ''
-
-for token_tuple in token_tuples:
-_value, token = token_tuple
-combined_text += token
-
-return combined_text
+"""
+return ''.join(token for _, token in token_tuples)


 def add_required_phrase_markers(text, required_phrase_span):
@@ -208,11 +204,10 @@ def add_required_phrase_markers(text, required_phrase_span):
 markers to the `text` around the tokens which the span represents, while
 being mindful of whitespace and stopwords.
 """
-tokens_tuples_without_markers = list(matched_query_text_tokenizer(text))
 tokens_tuples_with_markers = []
 token_index = 0

-for token_tuple in tokens_tuples_without_markers:
+for token_tuple in matched_query_text_tokenizer(text):

 is_word, token = token_tuple
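The two helpers touched in this file can be sketched in isolation. The version below assumes spans are plain inclusive `(start, end)` tuples rather than the project's `Span` class, and `overlaps` and `non_overlapping` are hypothetical names; `combine_tokens` mirrors the one-line form introduced by the commit.

```python
def overlaps(a, b):
    """Return True if the inclusive (start, end) spans a and b share a position."""
    return a[0] <= b[1] and b[0] <= a[1]


def non_overlapping(old_spans, new_spans):
    """Yield each new span that overlaps none of the old spans."""
    for new_span in new_spans:
        if any(overlaps(old_span, new_span) for old_span in old_spans):
            # Skip spans that collide with an existing required phrase.
            continue
        yield new_span


def combine_tokens(token_tuples):
    """Join the token strings of (is_word, token) tuples back into text."""
    return ''.join(token for _, token in token_tuples)


kept = list(non_overlapping([(4, 6)], [(0, 2), (5, 8), (7, 9)]))
text = combine_tokens([(True, 'MIT'), (False, ' '), (True, 'license')])
print(kept, repr(text))
```

The early `continue` keeps the single `yield` at the bottom of the loop, which is the same shape as the rewritten loop in the hunk above.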
30 changes: 27 additions & 3 deletions tests/formattedcode/data/common/manifests-expected.yaml
@@ -29,9 +29,9 @@ headers:
 system_environment:
 operating_system: linux
 cpu_architecture: 64
-platform: Linux-5.15.0-116-generic-x86_64-with-glibc2.35
-platform_version: '#126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024'
-python_version: 3.10.12 (main, Mar 22 2024, 16:50:05) [GCC 11.4.0]
+platform: Linux-5.15.0-119-generic-x86_64-with-glibc2.35
+platform_version: '#129-Ubuntu SMP Fri Aug 2 19:25:20 UTC 2024'
+python_version: 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0]
 spdx_license_list_version: '3.24'
 files_count: 4
 summary:
@@ -1601,6 +1601,8 @@ license_rule_references:
 is_license_tag: yes
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
@@ -1626,6 +1628,8 @@ license_rule_references:
 is_license_tag: no
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
@@ -1651,6 +1655,8 @@ license_rule_references:
 is_license_tag: yes
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
@@ -1677,6 +1683,8 @@ license_rule_references:
 is_license_tag: no
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
@@ -1703,6 +1711,8 @@ license_rule_references:
 is_license_tag: yes
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
@@ -1728,6 +1738,8 @@ license_rule_references:
 is_license_tag: no
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
@@ -1765,6 +1777,8 @@ license_rule_references:
 is_license_tag: no
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
@@ -1790,6 +1804,8 @@ license_rule_references:
 is_license_tag: no
 is_license_intro: yes
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
@@ -1818,6 +1834,8 @@ license_rule_references:
 is_license_tag: yes
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: yes
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
@@ -1843,6 +1861,8 @@ license_rule_references:
 is_license_tag: yes
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
@@ -1868,6 +1888,8 @@ license_rule_references:
 is_license_tag: yes
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
@@ -1893,6 +1915,8 @@ license_rule_references:
 is_license_tag: no
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
22 changes: 17 additions & 5 deletions tests/formattedcode/data/yaml/package-and-licenses-expected.yaml
@@ -21,18 +21,18 @@ headers:
 for any legal advice.
 ScanCode is a free software code scanning tool from nexB Inc. and others.
 Visit https://github.com/nexB/scancode-toolkit/ for support and download.
-output_format_version: 3.1.0
+output_format_version: 3.2.0
 message:
 errors: []
 warnings: []
 extra_data:
 system_environment:
 operating_system: linux
 cpu_architecture: 64
-platform: Linux-5.15.0-112-generic-x86_64-with-glibc2.35
-platform_version: '#122-Ubuntu SMP Thu May 23 07:48:21 UTC 2024'
-python_version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
-spdx_license_list_version: '3.23'
+platform: Linux-5.15.0-119-generic-x86_64-with-glibc2.35
+platform_version: '#129-Ubuntu SMP Fri Aug 2 19:25:20 UTC 2024'
+python_version: 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0]
+spdx_license_list_version: '3.24'
 files_count: 4
 summary:
 declared_license_expression: apache-2.0
@@ -626,6 +626,8 @@ license_rule_references:
 is_license_tag: no
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: yes
@@ -854,6 +856,8 @@ license_rule_references:
 is_license_tag: yes
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
@@ -879,6 +883,8 @@ license_rule_references:
 is_license_tag: no
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no
@@ -904,6 +910,8 @@ license_rule_references:
 is_license_tag: no
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: yes
 is_builtin: yes
 is_from_license: no
@@ -929,6 +937,8 @@ license_rule_references:
 is_license_tag: no
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: yes
@@ -972,6 +982,8 @@ license_rule_references:
 is_license_tag: no
 is_license_intro: no
 is_license_clue: no
+is_required_phrase: no
+skip_collecting_required_phrases: no
 is_continuous: no
 is_builtin: yes
 is_from_license: no