Add required phrase rules automatically #3254

Closed
wants to merge 15 commits into from
Original file line number Diff line number Diff line change
@@ -73,7 +73,7 @@ More (advanced) rules options:
be present in the result license detections. These just have the license text and a
`is_false_positive` flag set to True.

- you can specify key phrases by surrounding one or more words between the `{{`
- you can specify required phrases by surrounding one or more words between the `{{`
and `}}` tags. Key phrases are words that **must** be matched/present in order
for a RULE to be considered a match.
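The `{{` and `}}` markup described above can be illustrated with a small sketch. The helper names `required_phrases` and `strip_markers` are hypothetical and are not part of ScanCode; the sketch only shows how marked spans could be pulled out of a rule text, assuming markers are never nested.

```python
import re

# Hypothetical helpers for illustration only; not ScanCode's implementation.
# A required phrase is any run of text between "{{" and "}}".
REQUIRED_PHRASE = re.compile(r"\{\{(.+?)\}\}", re.DOTALL)


def required_phrases(rule_text):
    """Return the phrases marked with {{ }} in rule_text, whitespace-normalized."""
    return [" ".join(phrase.split()) for phrase in REQUIRED_PHRASE.findall(rule_text)]


def strip_markers(rule_text):
    """Return rule_text with the {{ and }} markers removed."""
    return rule_text.replace("{{", "").replace("}}", "")


rule_text = 'licensed under {{Apache License 2.0}}("{{ALv2}}"),'
print(required_phrases(rule_text))  # ['Apache License 2.0', 'ALv2']
print(strip_markers(rule_text))     # licensed under Apache License 2.0("ALv2"),
```

The `re.DOTALL` flag lets a marked phrase span a line break, as in the rules in this diff where `{{Apache` and `2.0}}` sit on consecutive lines.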

20 changes: 2 additions & 18 deletions etc/scripts/licenses/buildrules.py
@@ -16,6 +16,7 @@
from licensedcode import models
from licensedcode import match_hash
from licensedcode import frontmatter
from licensedcode.models import get_rule_id_for_text
from license_expression import Licensing

"""
@@ -129,23 +130,6 @@ def load_data(location="00-new-licenses.txt"):
return rules


def rule_exists(text):
"""
Return the matched rule identifier if the text is an existing rule matched
exactly, False otherwise.
"""
idx = cache.get_index()

matches = idx.match(query_string=text)
if not matches:
return False
if len(matches) > 1:
return False
match = matches[0]
if match.matcher == match_hash.MATCH_HASH and match.score() == 100:
return match.rule.identifier


def all_rule_by_tokens():
"""
Return a mapping of {tuples of tokens: rule id}, with one item for each
@@ -346,7 +330,7 @@ def cli(licenses_file, dump_to_file_on_errors=False):

text = rule.text

existing_rule = rule_exists(text)
existing_rule = get_rule_id_for_text(text)
skinny_text = " ".join(text[:80].split()).replace("{", " ").replace("}", " ")

existing_msg = (
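The hunks above replace the script-local `rule_exists` with `get_rule_id_for_text` imported from `licensedcode.models`; the body of the new function is not shown in this diff. The removed helper only reported a rule when there was exactly one match, made by the hash matcher with a score of 100. As a rough illustration of that "exact text only" idea, here is a sketch that uses a plain dictionary keyed by a hash of normalized tokens. Every name in it (`normalized_tokens`, `text_key`, `build_index`, `find_rule_id`) is hypothetical, and it does not use ScanCode's index or tokenizer.

```python
import hashlib
import re


def normalized_tokens(text):
    """Lowercase the text, drop {{ }} markers, and split into alphanumeric tokens."""
    cleaned = text.replace("{{", " ").replace("}}", " ").lower()
    return re.findall(r"[a-z0-9]+", cleaned)


def text_key(text):
    """Return a stable hash of the normalized token sequence."""
    joined = " ".join(normalized_tokens(text))
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def build_index(rules):
    """Map each rule text hash to its rule id; rules is {rule_id: rule_text}."""
    return {text_key(text): rule_id for rule_id, text in rules.items()}


def find_rule_id(index, text):
    """Return the id of the rule whose text matches exactly, or None."""
    return index.get(text_key(text))


index = build_index({
    "apache-2.0_13.RULE": "- {{Apache 2.0}} : http://www.apache.org/licenses/LICENSE-2.0",
})
print(find_rule_id(index, "- Apache 2.0 : http://www.apache.org/licenses/LICENSE-2.0"))
print(find_rule_id(index, "MIT License"))
```

Because the markers are dropped before hashing, a text is found whether or not its stored rule carries required phrase markup. If two rules normalize to the same tokens, the later one in `rules` wins in this sketch.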
2 changes: 2 additions & 0 deletions etc/scripts/licenses/report_license_rules.py
@@ -62,6 +62,8 @@
"is_license_reference",
"is_license_intro",
"is_license_clue",
"is_required_phrase",
"skip_collecting_required_phrases",
"is_deprecated",
"has_unknown",
"only_known_words",
1 change: 1 addition & 0 deletions setup-mini.cfg
@@ -158,6 +158,7 @@ console_scripts =
scancode-reindex-licenses = licensedcode.reindex:reindex_licenses
scancode-license-data = licensedcode.license_db:dump_scancode_license_data
regen-package-docs = packagedcode.regen_package_docs:regen_package_docs
add-required-phrases = licensedcode.required_phrases:add_required_phrases

# These are configurations for ScanCode plugins as setuptools entry points.
# Each plugin entry hast this form:
1 change: 1 addition & 0 deletions setup.cfg
@@ -158,6 +158,7 @@ console_scripts =
scancode-reindex-licenses = licensedcode.reindex:reindex_licenses
scancode-license-data = licensedcode.license_db:dump_scancode_license_data
regen-package-docs = packagedcode.regen_package_docs:regen_package_docs
add-required-phrases = licensedcode.required_phrases:add_required_phrases

# These are configurations for ScanCode plugins as setuptools entry points.
# Each plugin entry hast this form:
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-1.1_45.RULE
@@ -7,7 +7,7 @@ ignorable_urls:
- http://www.apache.org/
---

APACHE 1.1
{{APACHE 1.1}}

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-1.1_82.RULE
@@ -14,7 +14,7 @@ ignorable_emails:
- [email protected]
---

========================= Apache-1.1 =========================
========================= {{Apache-1.1}} =========================


The Apache Software License, Version 1.1
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-1.1_89.RULE
@@ -14,7 +14,7 @@ ignorable_emails:
- [email protected]
---

Apache License 1.1
{{Apache License 1.1}}

Copyright (c) 2000 The Apache Software Foundation. All rights reserved.

2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-1.1_92.RULE
@@ -13,7 +13,7 @@ ignorable_emails:
- [email protected]
---

Apache License 1.1 Copyright (c) 2000 The Apache Software Foundation. All
{{Apache License 1.1}} Copyright (c) 2000 The Apache Software Foundation. All
rights reserved.

Redistribution and use in source and binary forms, with or without modification,
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1009.RULE
@@ -8,7 +8,7 @@ ignorable_urls:
- https://www.apache.org/licenses/LICENSE-2.0
---

License: Apache-2.0
License: {{Apache-2.0}}
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
4 changes: 2 additions & 2 deletions src/licensedcode/data/rules/apache-2.0_1010.RULE
@@ -7,7 +7,7 @@ ignorable_urls:
- http://www.apache.org/licenses/LICENSE-2.0
---

License: Apache-2.0
License: {{Apache-2.0}}
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
@@ -22,4 +22,4 @@ License: Apache-2.0
.
On Debian systems, the text of the Apache License, Version 2.0 can be
found in the file
`/usr/share/common-licenses/Apache-2.0'.
`/usr/share/common-licenses/{{Apache-2.0}}'.
4 changes: 2 additions & 2 deletions src/licensedcode/data/rules/apache-2.0_1013.RULE
@@ -7,7 +7,7 @@ ignorable_urls:
- https://www.apache.org/licenses/LICENSE-2.0
---

License: Apache-2.0
License: {{Apache-2.0}}
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
@@ -22,4 +22,4 @@ License: Apache-2.0
.
On Debian systems, the text of the Apache License, Version 2.0 can be
found in the file
`/usr/share/common-licenses/Apache-2.0'.
`/usr/share/common-licenses/{{Apache-2.0}}'.
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1027.RULE
@@ -6,4 +6,4 @@ ignorable_urls:
- http://www.apache.org/licenses/LICENSE-2.0.html
---

licensed under the http://www.apache.org/licenses/LICENSE-2.0.html[Apache 2.0] license.
licensed under the http://www.apache.org/licenses/LICENSE-2.0.html[{{Apache 2.0}}] license.
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1048.RULE
@@ -8,4 +8,4 @@ ignorable_urls:
- http://www.apache.org/licenses
---

'Apache 2.0 http://www.apache.org/licenses/': 'Apache-2.0',
'{{Apache 2.0}} http://www.apache.org/licenses/': '{{Apache-2.0}}',
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1049.RULE
@@ -8,4 +8,4 @@ ignorable_urls:
- http://www.apache.org/licenses/
---

'Apache 2.0 http://www.apache.org/licenses/
'{{Apache 2.0}} http://www.apache.org/licenses/
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1063.RULE
@@ -6,4 +6,4 @@ ignorable_urls:
- http://www.apache.org/licenses/LICENSE-2.0
---

Redistributed under the the Apache License 2.0 at http://www.apache.org/licenses/LICENSE-2.0
Redistributed under the the {{Apache License 2.0}} at http://www.apache.org/licenses/LICENSE-2.0
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1094.RULE
@@ -7,4 +7,4 @@ ignorable_urls:
- http://www.apache.org/licenses/LICENSE-2.0
---

<license uri="http://www.apache.org/licenses/LICENSE-2.0">Apache 2.0</license>
<license uri="http://www.apache.org/licenses/LICENSE-2.0">{{Apache 2.0}}</license>
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1095.RULE
@@ -4,4 +4,4 @@ is_license_notice: yes
relevance: 100
---

licensed under Apache License 2.0("{{ALv2}}"),
licensed under {{Apache License 2.0}}("{{ALv2}}"),
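Most rule changes in this diff wrap a license name in `{{ }}` while leaving existing markers, such as `{{ALv2}}` above, untouched. The sketch below shows one way such an edit could be automated. It is an assumption-laden illustration, not the code behind the `add-required-phrases` entry point: the function name `add_required_phrase` is hypothetical, and the phrase is matched case-insensitively with flexible whitespace.

```python
import re


def add_required_phrase(rule_text, phrase):
    """
    Wrap each occurrence of phrase in {{ }} unless it already sits inside a
    marked span. Hypothetical helper for illustration only.
    """
    words = [re.escape(word) for word in phrase.split()]
    if not words:
        return rule_text
    # Allow any run of whitespace (including a line break) between the words.
    pattern = re.compile(r"\s+".join(words), re.IGNORECASE)
    # Split into alternating unmarked and marked chunks; the capture group
    # keeps the already marked {{...}} chunks in the result list.
    parts = re.split(r"(\{\{.*?\}\})", rule_text, flags=re.DOTALL)
    marked = []
    for part in parts:
        if part.startswith("{{"):
            # Already a required phrase: keep it verbatim.
            marked.append(part)
        else:
            marked.append(pattern.sub(lambda m: "{{" + m.group(0) + "}}", part))
    return "".join(marked)


before = 'licensed under Apache License 2.0("{{ALv2}}"),'
after = add_required_phrase(before, "Apache License 2.0")
print(after)  # licensed under {{Apache License 2.0}}("{{ALv2}}"),
```

Running the function a second time on its own output changes nothing, since marked spans are skipped. Note that matching a phrase shorter than the full license name would split a version number, giving text of the form `{{the Apache 2}}.0`.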
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1101.RULE
@@ -6,4 +6,4 @@ ignorable_urls:
- https://spdx.org/licenses/Apache-2.0.html
---

- [Apache-2.0](https://spdx.org/licenses/Apache-2.0.html).
- [{{Apache-2.0}}](https://spdx.org/licenses/Apache-2.0.html).
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1105.RULE
@@ -6,4 +6,4 @@ referenced_filenames:
- LICENSE
---

This project uses [the Apache 2.0 license](./LICENSE).
This project uses [the {{Apache 2.0}} license](./LICENSE).
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1107.RULE
@@ -6,4 +6,4 @@ ignorable_urls:
- http://www.apache.org/licenses/LICENSE-2.0
---

library and tools are licenced under Apache 2.0: http://www.apache.org/licenses/LICENSE-2.0
library and tools are licenced under {{Apache 2.0}}: http://www.apache.org/licenses/LICENSE-2.0
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1113.RULE
@@ -6,4 +6,4 @@ referenced_filenames:
- LICENSE
---

released under the Apache 2.0 license. See the [LICENSE](LICENSE) file for details.
released under the {{Apache 2.0}} license. See the [LICENSE](LICENSE) file for details.
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1139.RULE
@@ -7,4 +7,4 @@ referenced_filenames:
---

# Li`cense
released under the Apache 2.0 license. See the [LICENSE](LICENSE) file for details.
released under {{the Apache 2}}.0 license. See the [LICENSE](LICENSE) file for details.
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1141.RULE
@@ -6,4 +6,4 @@ referenced_filenames:
- License.txt
---

Licensed under the Apache-2.0 License. See License.txt in the project root for license information.
Licensed under the {{Apache-2.0}} License. See License.txt in the project root for license information.
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1147.RULE
@@ -5,5 +5,5 @@ ignorable_urls:
- http://www.apache.org/licenses/LICENSE-2.0
---

libraries are under the Apache License 2.0
libraries are under the {{Apache License 2.0}}
(see @url{http://www.apache.org/licenses/LICENSE-2.0} for details)
4 changes: 2 additions & 2 deletions src/licensedcode/data/rules/apache-2.0_1153.RULE
@@ -8,8 +8,8 @@ ignorable_urls:
<licenses>
<license>
<name>
All files contained in this JAR are licensed under the Apache
2.0 license, unless noted differently in their source (see
All files contained in this JAR are licensed under the {{Apache
2.0}} license, unless noted differently in their source (see
swing2swt).
</name>
<url>
4 changes: 2 additions & 2 deletions src/licensedcode/data/rules/apache-2.0_1154.RULE
@@ -8,8 +8,8 @@ ignorable_urls:
<licenses>
<license>
<name>
All files contained in this JAR are licensed under the Apache
2.0 license, unless noted differently in their source (see
All files contained in this JAR are licensed under the {{Apache
2.0}} license, unless noted differently in their source (see
swing2swt).
</name>
<url>
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1158.RULE
@@ -6,4 +6,4 @@ ignorable_urls:
- http://www.opensource.org/licenses/apache2.0.php
---

http://www.opensource.org/licenses/apache2.0.php Apache License, 2.0
http://www.opensource.org/licenses/apache2.0.php {{Apache License, 2.0}}
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1175.RULE
@@ -6,4 +6,4 @@ ignorable_urls:
- http://www.apache.org/licenses/LICENSE-2.0.txt
---

Apache License 2.0 (http://www.apache.org/licenses/LICENSE-2.0.txt)
{{Apache License 2.0}} (http://www.apache.org/licenses/LICENSE-2.0.txt)
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1179.RULE
@@ -6,4 +6,4 @@ ignorable_urls:
- https://www.apache.org/licenses/LICENSE-2.0.txt
---

Apache-2.0 (https://www.apache.org/licenses/LICENSE-2.0.txt)
{{Apache-2.0}} (https://www.apache.org/licenses/LICENSE-2.0.txt)
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1190.RULE
@@ -7,5 +7,5 @@ ignorable_urls:
---

<license>
<name> Apache-2.0 </name>
<name> {{Apache-2.0}} </name>
<url> (https://www.apache.org/licenses/LICENSE-2.0.txt)</url>
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1193.RULE
@@ -6,5 +6,5 @@ ignorable_urls:
---

<license>
<name> Apache License 2.0 </name>
<name> {{Apache License 2.0}} </name>
<url> (http://www.apache.org/licenses/LICENSE-2.0.txt)</url>
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1228.RULE
@@ -7,7 +7,7 @@ ignorable_urls:

<licenses>
<license>
<name>Apache License 2.0</name>
<name>{{Apache License 2.0}}</name>
<url>http://www.apache.org/licenses/LICENSE-2.0</url>
<distribution>repo</distribution>
</license>
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1254.RULE
@@ -8,7 +8,7 @@ ignorable_urls:

<licenses>
<license>
<name>Apache License 2.0</name>
<name>{{Apache License 2.0}}</name>
<url>http://www.apache.org/licenses/LICENSE-2.0.html</url>
<distribution>repo</distribution>
</license>
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_126.RULE
@@ -6,7 +6,7 @@ ignorable_urls:
- http://www.apache.org/licenses/LICENSE-2.0
---

Apache 2.0 License
{{Apache 2.0}} License

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1271.RULE
@@ -8,5 +8,5 @@ ignorable_urls:

licenses {
license {
name 'Apache 2.0'
name '{{Apache 2.0}}'
url 'https://www.apache.org/licenses/LICENSE-2.0.html'
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1287.RULE
@@ -6,7 +6,7 @@ ignorable_urls:
- http://www.apache.org/licenses/LICENSE-2.0
---

* Apache License 2.0:
* {{Apache License 2.0}}:
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1298.RULE
@@ -7,7 +7,7 @@ ignorable_urls:

Apache License Version 2.0, January 2004

The following applies to all products licensed under the Apache 2.0
The following applies to all products licensed under the {{Apache 2.0}}
License: You may not use the identified files except in compliance
with the Apache License, Version 2.0 (the "License.") You may obtain a
copy of the License at http://www.apache.org/licenses/LICENSE-2.0. A
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_13.RULE
@@ -6,4 +6,4 @@ ignorable_urls:
- http://www.apache.org/licenses/LICENSE-2.0
---

- Apache 2.0 : http://www.apache.org/licenses/LICENSE-2.0
- {{Apache 2.0}} : http://www.apache.org/licenses/LICENSE-2.0
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1311.RULE
@@ -6,6 +6,6 @@ ignorable_urls:
---

<license>
<name>Apache-2.0
<name>{{Apache-2.0}}
<url>https://opensource.org/licenses/Apache-2.0
<comments>Apache License, Version 2.0
2 changes: 1 addition & 1 deletion src/licensedcode/data/rules/apache-2.0_1312.RULE
@@ -5,6 +5,6 @@ ignorable_urls:
- https://opensource.org/licenses/Apache-2.0
---

<name>Apache-2.0
<name>{{Apache-2.0}}
<url>https://opensource.org/licenses/Apache-2.0
<comments>Apache License, Version 2.0