Releases: se-sic/coronet
v4.4
Changes in detail
Announcement
- Due to a bug in the package `igraph` (igraph/rigraph#1158), which is present in versions 2.0.0 to 2.0.3, the functions `metrics.scale.freeness` and `metrics.is.scale.free` currently cannot be used with these `igraph` versions. If you need to call either of these two functions, you need to either install `igraph` version 1.6.0 or wait until the bug is fixed in a future version of `igraph`.
Added
- Add issue-based artifact-networks, in which issues form vertices connected by edges that represent issue references. If possible, disambiguate duplicate JIRA issue references that originate from codeface-extraction (PR #244, PR #249, 98a93ee, 771bcc8, 368e792, fa3167c, 4646d581d5e1f63260692b396a8bd8f51b0da48fda, ed77bd7)
- Add a new `split.data.by.bins` function (not to be confused with a previously existing function that had the same name and was renamed in this context), which splits data based on given activity-based bins (PR #244, ece569c, ed5feb2)
- Add `get.bin.dates.from.ranges` function to convert date ranges into bins format (PR #249, a1842e9, 858b181)
- Add the possibility to simplify edges of multiple-relation networks into a single edge overall instead of a single edge per relation (PR #250, PR #255, 2105ea8, a34b5bd, 3451641, 78f4351, d310fdc, 58d77b0)
- Add network simplification to showcase file (PR #255, dc32d44)
- Add tests for network simplification (PR #255, 338b069, 8a6f47b, e01908c, 7b6848f, 666d784)
- Add an `assert.sparse.matrices.equal` function to compare two sparse matrices for equality for testing purposes (PR #248, 9784cdf, d9f1a8d)
- Add tests for file `util-networks-misc.R` (#242, PR #248, PR #258, f3202a6, 030574b, 380b022, 8b803c5, 7335c3d, 6b600df, a53fab8, faf19fc)
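The two simplification modes for multiple-relation networks listed above (one edge per relation versus one edge overall) can be illustrated with a small Python sketch. This is not coronet's R implementation: the tuple-based edge format, the function name `simplify_edges`, and the use of a merged-edge count as weight are assumptions made only for this example.

```python
from collections import Counter

def simplify_edges(edges, per_relation=True):
    """Merge parallel undirected edges and count how many were merged.

    edges: iterable of (vertex_a, vertex_b, relation) tuples (hypothetical format).
    per_relation: keep one edge per relation if True, one edge overall if False.
    """
    merged = Counter()
    for a, b, relation in edges:
        pair = tuple(sorted((a, b)))  # undirected: (A, B) equals (B, A)
        key = pair + (relation,) if per_relation else pair
        merged[key] += 1
    return dict(merged)

edges = [("A", "B", "cochange"), ("B", "A", "cochange"), ("A", "B", "mail")]
merged_per_relation = simplify_edges(edges, per_relation=True)
merged_overall = simplify_edges(edges, per_relation=False)
```

In this toy input, the per-relation mode keeps two edges between A and B (one for each relation), while the overall mode collapses all three edges into one.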
Changed/Improved
- Add input validation for the `bins` parameter in `split.data.time.based` and `split.data.by.bins` (PR #244, ed0a530, 5e5ecba)
- Test for the presence and validity of the `bins` attribute on network and data splits (PR #249, c064aff, 93051ab)
- Simplify call chains and branching routes in network-splitting functions and consequently set the `bins` attribute on every output network split (while minimizing recalculations) (PR #249, #256, PR #257, a1842e9, 8695fbe)
- Rename `split.data.by.bins` into `split.dataframe.by.bins` as this is what it does (PR #244, ed5feb2)
- Throw an error in `split.data.time.based.by.timestamps` if no custom event timestamps are available in the ProjectData object (6305adc)
- Enhance testing data by adding `add_link` and `referenced_by` issue events, which connect issues to form edges in issue-based artifact-networks. This includes duplicate edge information in JIRA data as produced by codeface-extraction (PR #244, 9f840c0, ea4fe8d, 6eb7311)
- Add a check for empty networks in the functions `metrics.scale.freeness` and `metrics.is.scale.free` and return `NA` if the network is empty (29418f2)
- Enhance `get.author.names.from.network` and `get.author.names.from.data` to always have the same output format. The format no longer depends on the `global` flag (PR #248, d87d325, ddbfe68)
- Change `util-tensor.R` to correctly use the new output format of `get.author.names.from.network` (PR #248, 72b663e)
- Throw an error in `convert.adjacency.matrix.list.to.array` if the function is called with wrong parameters (PR #248, ece2d38, 1a3e510)
- Rename `compare.networks` to `assert.networks.equal` to better match the purpose of the function (PR #248, d9f1a8d)
- Explicitly add R version 4.3 to the CI test pipeline (9f346d5)
Fixed
- Reformat `event.info.1` column of issue data according to the <issue-%source-%id> format, if the content of the `event.info.1` field references another issue (PR #244, 62ff9d0)
- Rename vertex attribute `IssueEvent` to `Issue` in multi-networks, to be consistent with bipartite-networks (PR #244, 26d7b7e)
- Fix an issue in activity-based splitting where elements close to the border of bins might be assigned to the wrong bin. The issue was caused by the usage of `split.data.time.based` inside `split.data.activity.based` to split data into the previously derived bins, when elements close to bin borders share the same timestamps. It is fixed by replacing `split.data.time.based` by `split.data.by.bins` (PR #244, ece569c)
- Remove the last range when using a sliding-window approach and the last range's elements are fully contained in the second-last range (PR #244, 48ef4fa, 943228f)
- Fix broken error logging in `metrics.smallworldness` (03e0688)
- Fix `get.expanded.adjacency` to work if the provided author list does not contain all authors from the network, and add a warning when that happens since it causes some authors from the network to be lost in the resulting matrix (PR #248, ff59017)
- Fix `get.expanded.adjacency.matrices` to have correct names for the columns and rows (PR #248, PR #258, e72eff8, a53fab8)
- Fix `get.expanded.adjacency.cumulated` so that it works if the `weighted` parameter is set to `FALSE` (PR #248, 2fb9a5d)
- Fix multi-network construction to work with `igraph` version 2.0.1.1, which does not allow adding an empty list of vertices (PR #250, 5547896)
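The bin-border problem in activity-based splitting described in the list above can be reproduced with a toy example. The Python sketch below is an illustration under assumptions, not coronet's R code: it assumes bins hold a fixed number of records, that bin bounds are the timestamps of each bin's first record, and that time-based assignment uses half-open intervals. All function names are hypothetical.

```python
from bisect import bisect_right

def activity_bin_labels(timestamps, size):
    """Label each record (sorted by time) so that every bin holds `size` records."""
    return [index // size for index in range(len(timestamps))]

def time_based_bin_labels(timestamps, bounds):
    """Re-derive the bin of each record from its timestamp alone,
    using half-open intervals [bounds[i], bounds[i + 1])."""
    return [bisect_right(bounds, t) - 1 for t in timestamps]

timestamps = [1, 2, 2, 3]  # two records share the timestamp at the bin border
labels = activity_bin_labels(timestamps, size=2)

# Bounds derived from the first record of each activity-based bin.
bounds = [timestamps[i] for i in range(0, len(timestamps), 2)]
rederived = time_based_bin_labels(timestamps, bounds)
```

Here `labels` keeps two records per bin, whereas `rederived` moves the second record into the later bin because it shares its timestamp with the border. Splitting by the precomputed bin labels avoids this reassignment.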
v4.3
Changes in detail
Added
- Add function `verify.data.frame.columns` to check that a dataframe includes all required columns, optionally with a specified datatype (PR #231, d1d9a03)
- Add helper function `is.single.na` to check whether an element is of length 1 and is `NA` (ddff2b8, ccfc2d1)
- Add CI support for GitHub Actions (PR #234, fa1fc4a)
Changed/Improved
- Include structural verification to almost all functions that read dataframes from files or set a dataframe (setter-functions) (PR #231, b7a9588)
- Include removal of empty and deleted users in the setters of mails, commits, issues, and authors. For commits, also the `committer.name` column is now checked for deleted or empty users (PR #235, 08fbd3e)
- Check for empty values (i.e., values of length < 1) when updating configuration attributes and throw an error if a value is empty (9f36c54)
Fixed
- Fix check for empty input files in utility read functions. Unlike missing files, empty files do not throw an error when being read; a check for `nrow(commit.data) < 1` is therefore required (PR #231, ecfa643)
- Fix various problems regarding the default classes of edge attributes and vertex attributes, and also make sure that the edge attributes for bipartite edges are chosen correctly (PR #240, 4275b93, 98a6deb, b8232c0, a953555, 820a763)
- Add argument to `construct.edge.list.from.key.value.list` function which differentiates if constructed edges are supposed to be artifact edges, in which case we check if the `artifact` attribute is present for edges and replace it by `author.name`. (PR #238, e2c9d6c, 7f42a91)
- Change edge construction algorithm for cochange-based artifact networks to respect the temporal order of data. This avoids duplicate edges. (PR #238, e2c9d6c)
- Clarify in `README.md` that edges in issue-based artifact-networks are not available yet. (PR #238, 18a54f0)
- Fix bugs related to expanded adjacency matrices and update the initialization of sparse matrices to the most recent version of package Matrix, to replace deprecated and defunct function calls. Due to this update, package versions prior to 1.3.0 of the Matrix package cannot be used any more. If the 'install.R' detects that a version prior to 1.3.0 is installed, it now automatically tries to re-install package Matrix once (PR #241, 573fab2, 2f06252)
- Prevent R warnings `'length(x) = 2 > 1' in coercion to 'logical(1)'` in `if` conditions for updating configuration values, in update functions of additional data sources, and in `get.first.activity.data()` (PR #237, PR #241, ddff2b8, e1579ca)
- Prevent R warnings `In xtfrm.data.frame(x) : cannot xtfrm data frames` (PR #237, c24aee7)
- Fix wrong bracket in pasted logging message (PR #241, 50c68cb)
- Replace deprecated R function calls (PR #237, ed43382)
v4.2
Changes in detail
Added
- Incorporate custom event timestamps, i.e., add a configuration entry to the project configuration that allows specifying a file from which timestamps can be read, as well as an entry that allows locking this data; add corresponding functions `get.custom.event.timestamps`, `set.custom.event.timestamps` and `clear.custom.event.timestamps` (PR #227, 0aa3424, 0f237d0, c180398, 54e089d, 54673f8, c5f5403)
- Add function `split.data.time.based.by.timestamps` to allow using custom event timestamps for splitting. Alternatively, timestamps can be specified manually (PR #227, 5b8515f, 43f23a8)
- Add the following vertex attributes for artifact vertices and corresponding helper functions (PR #229, 2072807, 51b5478, 56ed57a, 9b06036, 52d40ba, e91161c)
  - `add.vertex.attribute.artifact.last.edited`
  - `add.vertex.attribute.mail.thread.contributer.count`, `get.mail.thread.contributor.count`
  - `add.vertex.attribute.mail.thread.message.count`, `get.mail.thread.message.count`
  - `add.vertex.attribute.mail.thread.start.date`, `get.mail.thread.start.date`
  - `add.vertex.attribute.mail.thread.end.date`, `get.mail.thread.end.date`
  - `add.vertex.attribute.mail.thread.originating.mailing.list`, `get.mail.thread.originating.mailing.list`
  - `add.vertex.attribute.issue.contributor.count`, `get.issue.contributor.count`
  - `add.vertex.attribute.issue.event.count`, `get.issue.event.count`
  - `add.vertex.attribute.issue.comment.event.count`, `get.issue.comment.count`
  - `add.vertex.attribute.issue.opened.date`, `get.issue.opened.date`
  - `add.vertex.attribute.issue.closed.date`, `get.issue.closed.date`
  - `add.vertex.attribute.issue.last.activity.date`, `get.issue.last.activity.date`
  - `add.vertex.attribute.issue.title`, `get.issue.title`
  - `add.vertex.attribute.pr.open.merged.or.closed`, `get.pr.open.merged.or.closed`
  - `add.vertex.attribute.issue.is.pull.request`, `get.issue.is.pull.request`
Changed/Improved
- Breaking Change: Rename existing vertex attributes for author vertices to be distinguishable from attributes for artifact vertices. With this change, the first word after `add.vertex.attribute.` now signifies the type of vertex the attribute applies to (PR #229, 75e8514)
  - `add.vertex.attribute.commit.count.author` -> `add.vertex.attribute.author.commit.count`
  - `add.vertex.attribute.commit.count.author.not.committer` -> `add.vertex.attribute.author.commit.count.not.committer`
  - `add.vertex.attribute.commit.count.committer` -> `add.vertex.attribute.author.commit.count.committer`
  - `add.vertex.attribute.commit.count.committer.not.author` -> `add.vertex.attribute.author.commit.count.committer.not.author`
  - `add.vertex.attribute.commit.count.committer.and.author` -> `add.vertex.attribute.author.commit.count.committer.and.author`
  - `add.vertex.attribute.commit.count.committer.or.author` -> `add.vertex.attribute.author.commit.count.committer.or.author`
  - `add.vertex.attribute.artifact.count` -> `add.vertex.attribute.author.artifact.count`
  - `add.vertex.attribute.mail.count` -> `add.vertex.attribute.author.mail.count`
  - `add.vertex.attribute.mail.thread.count` -> `add.vertex.attribute.author.mail.thread.count`
  - `add.vertex.attribute.issue.count` -> `add.vertex.attribute.author.issue.count`
  - `add.vertex.attribute.issues.commented.count` -> `add.vertex.attribute.author.issues.commented.count`
  - `add.vertex.attribute.issue.creation.count` -> `add.vertex.attribute.author.issue.creation.count`
  - `add.vertex.attribute.issue.comment.count` -> `add.vertex.attribute.author.issue.comment.count`
  - `add.vertex.attribute.first.activity` -> `add.vertex.attribute.author.first.activity`
  - `add.vertex.attribute.active.ranges` -> `add.vertex.attribute.author.active.ranges`
- Add parameter `use.unfiltered.data` to `add.vertex.attribute.issue.*`. This allows selecting whether the filtered or unfiltered issue data is used for calculating the attribute (PR #229, b77601d, 922258c)
- Improve handling of issue type in vertex attribute name for `add.vertex.attribute.issue.*`. The default attribute name still adjusts to the issue type, but this no longer happens if the same name is specified manually (PR #229, fe5dc61)
v4.1
Changes in detail
Added
- Incorporate gender data, i.e., add a configuration entry to the project configuration, add function `read.gender` for reading gender data, add functions `get.gender` and `set.gender` and corresponding utility functions to automatically merge gender data to the author data (PR #216, 8868ff4, bfbe4de, 0a23862, a7744b5, 6a50fd1, 413e24c, 39db315, 1e4026d)
Changed/Improved
- Add `mode` parameter to `metrics.vertex.degrees` to allow choosing between indegree, outdegree, and total (#219, ae14eb4)
- Adjust `.drone.yml` CI config to prevent pipeline fails: `R` version `3.3` is not tested any more as some packages are not available any more for this `R` version (ca6b474). Also another docker container in the CI pipeline is used as there are problems with the previously used docker instance (937f797)
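The difference between the three degree modes mentioned above (indegree, outdegree, total) can be shown on a small directed edge list. The Python sketch below is only an illustration: the function name `vertex_degrees` and the mode values `"in"`, `"out"`, and `"total"` are assumptions for this example and may not match the values accepted by `metrics.vertex.degrees`.

```python
from collections import Counter

def vertex_degrees(edges, mode="total"):
    """Count degrees per vertex for directed (source, target) edges.

    mode: "in", "out", or "total" (hypothetical values for this sketch).
    """
    if mode not in ("in", "out", "total"):
        raise ValueError("mode must be 'in', 'out', or 'total'")
    degrees = Counter()
    for source, target in edges:
        # Make sure both endpoints appear in the result, even with degree 0.
        degrees[source] += 0
        degrees[target] += 0
        if mode in ("out", "total"):
            degrees[source] += 1
        if mode in ("in", "total"):
            degrees[target] += 1
    return dict(degrees)

edges = [("A", "B"), ("A", "C"), ("B", "C")]
indegree = vertex_degrees(edges, mode="in")
outdegree = vertex_degrees(edges, mode="out")
total = vertex_degrees(edges, mode="total")
```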
Fixed
- Fix values in test for the eigenvector centrality as igraph has changed the calculation of this with version 1.2.7. Also put a warning that we recommend version 1.3.0 in `install.R` and document it in the `README.md` (25fb862, 1bcbca9)
- Fix the filtering of the deleted user in `util-read.R` to always be lowercase as the deleted user can appear with different spellings (#214, 1b4072c)
- Add check to `get.first.activity.data` to look for missing activity types. If no activities are in the RangeData, the function will print a warning and return an empty list (PR #220, #217, 5707517, 42a4bef, d6424c0, ca8a1b4, f6553c6)
v4.0
Changes in detail
Announcement
`coronet` now has a logo and a website: https://se-sic.github.io/coronet (#167, PR #196)
Added
- Add functionality to read and process commit messages in order to merge them to the commit data (see issue #180). Three values are available for the new attribute `commit.messages` in `ProjectConf`: `none`, `title` and `messages` (PR #193, 85b1d05, fdc414a, 43e1894)
- Add functions `cleanup.commit.message.data` and `cleanup.synchronicity.data` to remove commit hashes that are no longer present in the commit data from the commit message data or synchronicity data (PR #193, 98e83b0)
- Add function `metrics.is.smallworld` to the metrics module in order to unify checks for smallworldness (similar to scalefreeness) (PR #195, ce1f812)
- Add function `metrics.vertex.centralities` to metrics module in order to simplify getting a data frame containing author names and their respective centrality values (d3cd528, e7182e7)
- Add function `get.data.sources.from.relations` to `util-networks.R` which extracts the data sources of a network that were used when building it (PR #195, d1e4413)
- Add tests for the `get.data.sources.from.relations` function (PR #195, add0c74)
- Add logo directory containing several logo variants (PR #196, 82f9971, dc4659e, fdc5e67, 752a9b3)
- Add function `preprocess.issue.data`, which implements common issue data filtering operations. (fcf5cee, a566cae, 5ba6feb)
- Add function `get.issues.uncached`, which gets the issues filtered without poisoning or using the cache. (eb919fa)
- Add function `get.issues.unfiltered` to get the unfiltered issues so that these methods follow the naming scheme known from the respective methods for commits (b9dd94c, e05f344)
- Add per-author vertex attributes regarding counting of issues, issue-creations, issue-comments, mails, mail-threads, ... (like mail thread count, issue creation count) (PR #194, issue #188, 9f9150a, 7260d62, 139f70b, eb4f649, 627873c, 1e1406f, 98e11ab, a566cae)
- Add functionality that allows reading any data source at any point in time, even after splitting. In this case, the read data is automatically cut to the corresponding range on the `RangeData` object (PR #201, 7f9394f). Additionally, when changing the configuration parameters concerning additional data sources, the environment of a `ProjectData` object is no longer reset (PR #201, eed45ac)
- Add new configuration parameters `commits.locked`, `mails.locked` and `issues.locked` to `ProjectConf` which, when set to `TRUE`, prevent the respective getters from triggering the read of the data if it is not present yet (PR #201, 3821677)
- Add support for classifying developers on the basis of more count-based classification metrics, including mail-count, mail-thread count, issue-count, issue-comment count, issue-commented-in count, and issue-created count (issue #70, PR #209, d7b2455, 6f737c8)
- Add bot filtering mechanism, which allows removing issues/mails/commits made by bots (838855f, dcce82d)
- Ignore the "deleted user", as well as the author having an empty name "" (1a08140, 24c222a)
Changed/Improved
- Breaking Change: Rename getters for main data sources: Unfiltered data is now acquired using `get.<datasource>.unfiltered`, filtered data is acquired using `get.<datasource>` (edf19cf, e05f344)
- Add check for empty network in `metrics.hub.degree` function. In the case of an empty network, a warning is printed and `NA` is returned (PR #195, 4b164be)
- Adjust the function `ProjectData$get.artifacts`: Rename its single parameter to `data.sources` and change the function so that it can extract the artifacts for multiple data sources at once. The default is still that only artifacts from the commit data are extracted. (PR #195, cf795f2, 70c05ec, 5a46ff4, fd767bb)
- Change the internal representation of empty data from `NULL` to empty data frames and adapt function `get.cached.data.sources()` of `ProjectData` which returns a vector of all data sources that are cached (including additional and filtered data sources) (PR #201, aec898e, e55d088, 24c222a); additionally, introduce new function `is.data.source.cached()` in `util-data.R` that returns a logical vector indicating which of the given data sources are cached (PR #201, b49cc5d, 491e70c, 24c222a)
- Change the threshold calculation for the classification of developers to use a quantile approach when classifying on the basis of network centrality metrics (issue #205, PR #209, PR #210, 5128252, 0d6a3a1)
- Update documentation in `util-network-metrics.R` and `util.conf.R` (PR #195, f929248, de9988c, PR #199, 059b286)
- Splitting no longer loads all (additional) data sources, but only the ones that have already been cached in the `ProjectData` (PR #201, 52a3014, aec898e, de1bbfe)
- Improve the documentation in `util-core-peripheral.R` by adding roxygen skeleton documentation to undocumented functions (issue #70, PR #209, a3d5ca7, 6f737c8)
- Change the `$` notation to the bracket notation in `util-core-peripheral.R` (issue #70, PR #209, 6f737c8)
- Add `.drone.yml` to enable running our CI pipelines on drone.io (PR #191, 1c5804b)
- Not only run test suite in our CI pipeline, but also run the showcase file in our CI pipeline using test data (719a4f0, 3eb31d8)
- Add R version 4.1 to test suite and adjust missing time-zone attributes on `NA` vectors or empty POSIXct vectors which are correctly added as of R version 4.1 (PR #203, 6b7fb36, 98c5671, 09d11ab)
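To make the quantile-based threshold for centrality-based developer classification more concrete, here is a minimal Python sketch. It is not coronet's implementation: the 0.8 quantile, the linear-interpolation quantile definition, the "greater than or equal" comparison, and the function names are all assumptions chosen for illustration.

```python
def quantile(values, q):
    """Quantile with linear interpolation between the two nearest order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

def classify(centrality, q=0.8):
    """Split authors into core and peripheral by a quantile threshold.

    centrality: dict mapping author name to centrality value.
    q: assumed quantile; authors at or above the threshold count as core.
    """
    threshold = quantile(list(centrality.values()), q)
    core = sorted(name for name, value in centrality.items() if value >= threshold)
    peripheral = sorted(name for name in centrality if name not in core)
    return core, peripheral

centrality = {"A": 10, "B": 7, "C": 3, "D": 2, "E": 1}
core, peripheral = classify(centrality)
```

With these toy values the threshold lies between the two highest centralities, so only author A ends up in the core group.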
Fixed
- Fix fencing issue timing data so that issue events "happen" after the issue was created. Since only `commit_added` events are affected, that only happens for these. (issue #185, 627873c, 6ff585d)
- Fix the function `reset.environment()` of both the `ProjectData` and `NetworkBuilder` class; they now reset all the data (PR #199, de091a5, fc4c086)
- Adjust the functions `update.commit.message.data()`, `update.pasta.data()`, and `update.synchronicity.data()`: no warning is printed anymore when they are called by the corresponding cleanup function (PR #199, e5c60a5)
- Fix issue where the data path on `RangeData` objects was wrong in special cases. Introduce the (private) flag `built.from.range.data.read` that is set according to how the object has been created (splitting manually or reading codeface ranges) and calculate the data path accordingly (PR #199, cce9527, 917bf64, 169c034). Also add tests for this new behaviour (PR #199, ef5bac6, 3aa8e7d, d454e5a, 66ad127)
- Make splitting no longer modify the original `ProjectConf`, instead create a copy (e82d056)
- Fix and update outdated examples in the showcase file (473c094, 287fbfa, 0a5cce4, PR #207)
- Fix generation of Codeface range directory names from commit hashes (5c90d1c)
- Fix plotting an empty network via `plot.network` (03f986d)
- Fix behavior of `construct.ranges` when only one range has to be constructed and `sliding.window = TRUE` (000314b)
- Add package `reshape2` to the install script as this package is used in module `util-plot-...
v3.7
Changes in detail
Added
- Add a new file `util-tensor.R` containing the class `FourthOrderTensor` to create (author x relation x author x relation) tensors from a list of networks (with each network having a different relation) and its corresponding utility function `get.author.networks.for.multiple.relations` (PR #173, c136b1f, e4ee0dc, 051a5f0)
- Add function `calculate.EDCPTD.centrality` for calculating the EDCPTD centrality for a fourth-order tensor in the above described form (c136b1f, e4ee0dc, 051a5f0)
- Add new file `util-networks-misc.R` which contains miscellaneous functions for processing network data and creating and converting various kinds of adjacency matrices: `get.author.names.from.networks`, `get.author.names.from.data`, `get.expanded.adjacency`, `get.expanded.adjacency.matrices`, `get.expanded.adjacency.matrices.cumulated`, `convert.adjacency.matrix.list.to.array` (051a5f0)
- Add tests for sliding-window functionality and make parameterized tests possible (a3ad0a8, 2ed84ac, PR #184)
- Add function `cleanup.pasta.data` to remove wrong commit hashes and message ids from the PaStA data (1797e03, PR #189)
Changed/Improved
- Adjust the function `get.authors.by.data.source`: Rename its single parameter to `data.sources` and change the function so that it can extract the authors for multiple data sources at once. The default value of the parameter is a vector containing all the available data sources (commits, mails, issues) (051a5f0)
- Adjust recommended R version to 3.6.3 in README (92be262)
- Add R version 4.0 to test suite and adjust package installation in `install.R` to improve compatibility with Travis CI (40aa0d8, 1ba0367, #161)
Fixed
- Fix sliding-window creation in various splitting functions (`split.network.time.based`, `split.networks.time.based`, `split.data.time.based`, `split.data.activity.based`, `split.network.activity.based`) and also fix the computation of overlapping ranges in the function `construct.overlapping.ranges` to make sure that the last and the second-last range do not cover the same range (1abc1b8, c34c42a, 097cebc, 9a1b651, 0fc179e, cad28bf, 7602af2, PR #184)
- Fix off-by-1 error in the function `get.data.cut.to.same.date` (f0744c0)
- Fix missing or wrongly set layout when plotting networks (#186, 720cc7b, 877931b)
- Fix reading of the PaStA data since the file format has changed (712bbaf, PR #189)
- Fix bug that duplicates revision set ids in the mail and commit data when merging the PaStA data and also copy-paste error when merging PaStA data to commit data (1797e03, PR #189)
- Fix bug that results in an error when there is a variable called 'c' in the R environment (de42eb2, PR #189)
- Fix bug that when applying `filter.patchstack.mails()` to an environment with no mail data, the mail data gets set to `NULL` (8261475, PR #189)
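The sliding-window fix listed above concerns the tail of the range sequence: the last range should not merely repeat what the second-last range already covers. The following Python sketch shows one way to express that rule. It is an illustration under assumptions, not coronet's code: integer time points, a window shift of half the period, and the function name `sliding_ranges` are all hypothetical.

```python
def sliding_ranges(start, end, period):
    """Build overlapping ranges [lower, upper) of length `period`,
    shifted by half a period (assumed overlap), clipped at `end`."""
    if period < 2:
        raise ValueError("period must be at least 2")
    step = period // 2
    ranges = []
    lower = start
    while lower < end:
        ranges.append((lower, min(lower + period, end)))
        lower += step
    # Lower bounds only increase, so a trailing range is fully contained in
    # its predecessor exactly when its upper bound does not exceed the
    # predecessor's upper bound. Such a range adds nothing and is dropped.
    while len(ranges) > 1 and ranges[-1][1] <= ranges[-2][1]:
        ranges.pop()
    return ranges

ranges = sliding_ranges(0, 10, 4)
short_ranges = sliding_ranges(0, 3, 4)
```

Without the final loop, the first call would end with `(6, 10)` followed by `(8, 10)`, where the second is entirely covered by the first.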
v3.6
Changes in detail
Added
- Add a parameter `editor.definition` to the function `add.vertex.attribute.artifact.editor.count` which can be used to define whether author or committer or both count as editors when computing the attribute values. (#92, ff1e147)
- Add the possibility to filter out patchstack mails from the mails of the `ProjectData`. The option can be toggled using the newly added configuration option `mails.filter.patchstack.mails`. (1608e28, a932c8c)
- Add a new file `util-plot-evaluation.R` containing functions to plot commit edit types per author and project. (PR #171, d4af515, aa542a2, 0a0a590)
Changed/Improved
- Add R version 3.6 to test suite (8b2a52d, #161)
- Update `.travis.yml` to improve compatibility with Travis CI (41ce589)
Fixed
- Ensure sorting of commit-count and LOC-count data.frames to fix tests with R 3.3 (33d63fd)
v3.5
Announcement
- Rename project to `coronet` (#10, 929f8ce, ac1ce80)
  - Be sure to update Git remotes and submodules to the new URL!
Changes in detail
Added
- Add the constants `UNTRACKED.FILE`, `UNTRACKED.FILE.EMPTY.ARTIFACT`, and `UNTRACKED.FILE.EMPTY.ARTIFACT.TYPE`: Commits that do not change any artifact are considered to be carried out on a meta-file called `<untracked.file>`. The constant `UNTRACKED.FILE` is added to hold the string constant. Analogously, the constants `UNTRACKED.FILE.EMPTY.ARTIFACT` (currently, `""`) and `UNTRACKED.FILE.EMPTY.ARTIFACT.TYPE` (currently, `""`) hold the constants for any artifacts and their corresponding types, respectively, "changed" in untracked files. (11428d9, 5ea65b9, dde0dd7, 2284bbe)
- Add the public method `ProjectData$get.commits.filtered.uncached`: The method allows for external filtering of the commits by specifying if untracked files and/or the base artifact should be filtered (this method does not take advantage of caching, whereas the method `ProjectData$get.commits.filtered` does) (11428d9)
- Add the parameters `commits.filter.base.artifact` and `commits.filter.untracked.files` to the `ProjectConf`: In addition to the `ProjectConf` parameter `commits.filter.base.artifact` (previously called `artifact.filter.base`), which configured whether the base artifact should be included in the `get.commits.filtered` method, there is now a similar parameter called `commits.filter.untracked.files` doing the same thing for untracked files (11428d9, 466d8eb)
- Add parameter `edges.for.base.artifacts` to `NetworkConf`: In author networks, edges do not get constructed anymore between authors for solely modifying untracked files. For authors involved in changing the base artifact, it can be configured whether edges should be created or not using the new `NetworkConf` parameter `edges.for.base.artifacts` (c60c2f6, 466d8eb)
- Add method `ProjectData$get.authors.by.data.source` to retrieve authors by given data-source name (#149, 6580427, 137d833)
- Add helper function `create.empty.data.frame`: The function returns empty data.frames (0 rows) with correct columns and, if specified, all the correct data types. In the future, functions that return data in data.frames should always return data.frames of the same shape (regarding columns and data types), especially when they are empty, because this makes later case distinctions easier or unnecessary (67a4fbe, 3513647)
- For the most common types of data.frames (data.frames of commits, mails, issues, and authors), the following utility methods are added: `create.empty.authors.list`, `create.empty.commits.list`, `create.empty.issues.list`, `create.empty.mails.list`, `create.empty.synchronicity.list`, `create.empty.pasta.list`, as well as corresponding constants holding columns and associated data types for all these empty data.frames (5f0f529, 523daef, f8e021d, 3513647, 2f4e6f0, cd3e34a)
- Add mandatory attributes in `create.empty.network` if wanted (cae9d4b, cc8bd86)
- Add function `create.empty.vertex.list` (c00101d)
- Add tests for construction of networks without data (a4b3524)
- Add tests for construction of networks without vertices (6eb214c)
- Add a note on mailing-list threads to README (c6dca27)
- Add cutting functionality to README descriptions (fb40c50)
- Add the parameter `restrict.classification.to.authors` to the functions `get.author.class.by.type`, `get.author.class.overview`, `get.author.class.network.degree`, `get.author.class.network.eigen`, `get.author.class.network.hierarchy`, `get.author.class.commit.count` and `get.author.class.loc.count`. The parameter allows performing classifications on a limited group of authors whose names are specified in this parameter. (2492dd0, #148)
- Add test cases for `util-core-peripheral.R` by adding the new file `test-core-peripheral.R` along with test cases (2627d6c)
- Add project-configuration parameter `issues.from.source` to choose if only issues from JIRA, only issues from GitHub, or all issues shall be read in (PR #159, d677949, a3e7132, ea26181). Therefore two test cases, one that reads in only JIRA issues and one that reads in only GitHub issues, are added to the issue read test (65b1acd, 2d897cb)
- Add class documentation (#157, 6e33d0a, 250f9e0)
Changed/Improved
- Always add mandatory vertex and edge attributes (#154, 0526755)
- Heavily improve addition of PaStA data (cd3e34a)
- The method `read.issues` in `util-read.R` now supports the new issue data format (PR #147, 77c750c, e04ce30, 67b818a, 4020487, 3513647). Therefore, the test issue data and all related tests are updated (39971ee, 0ec6c6c, 6a9f4ad, fda000f, 3513647)
- Rename `ProjectConf` parameter `artifact.filter.base` to `commits.filter.base.artifact` (PR #149, 466d8eb)
- The constant `BASE.ARTIFACTS` is extended by adding untracked files (i.e. the new meta-file `UNTRACKED.FILE`), which is now considered to be a new base artifact in the case of file-level analyses. This implies that, in case of file-level analyses, the base artifact and the untracked files coincide, while in feature-level and function-level analyses they are treated differently (d11d0fb)
- Filtering by artifact kind (e.g. filtering out either `"Feature"` or `"FeatureExpression"`) is now being done in the method `ProjectData$get.commits` instead of the method `ProjectData$get.commits.filtered` (894c9a5)
- Remove `get.commits.filtered.empty` and corresponding `filter.commits.empty` method, the functionality is now included into the methods `get.commits.filtered` and `filter.commits` respectively (11428d9)
- The private method `ProjectData$filter.commits` now takes parameters which configure whether untracked files and/or the base artifact are to be filtered (11428d9)
- Remove `get.commits.raw`, `set.commits.raw` and `read.commits.raw` functions (64a9486, c26e582)
- Add commits on untracked files to test suite (#153, d9f527c)
- In the class `Conf` (and its sub-classes `NetworkConf` and `ProjectConf`), default parameters are not validated anymore to avoid confusion by logging output (ec8c6dd)
- In the class `Conf` (and its sub-classes `NetworkConf` and `ProjectConf`), `stop` is called on errors during parameter updates now (ec8c6dd)
- Change shape of `Vertices` in the legend of plots to avoid confusion (f4fb480)
- Refactor `ProjectData$get.cached.data.sources` to be more concise (a4e7a21)
- Update contribution guide regarding `roxygen2` conventions (#157, fbc2d54, 783ee58, 6e33d0a)
- Update README regarding mandatory edge attributes (641624b)
- Rename misleading parameter names for functions `get.author.class.by.type`, `get.author.class.overview`, `get.author.class.network.degree`, `get.author.class.network.eigen`, `get.author.class.network.hierarchy`, `get.author.class.commit.count` and `get.author.class.loc.count`. Most importantly, the parameter `range.data` was renamed to `proj.data` for these functions. (587ef99, 81568b1, #70)
- Remove the unused functions `get.commit.count.threshold` and `get.loc.count.threshold`. (2534d73, #70)
- The function `verify.argument.for.parameter` was adjusted to be suitable in more general use-cases (557bdcd)
- Do not redundantly initialize data sources when splitting (35698a1)
- Read PaStA and synchronicity data only if enabled (79bf3ca)
- Add and enforce coding convention to use 'vertices' and not 'nodes'. Most importantly, the function `metrics.node.degrees` is renamed to `metrics.vertex.degrees`. (d35ce61...
v3.4
Changes in detail
Added
- Create global constant named `BASE.ARTIFACTS` (7031d45)
- Split data into time-based equally-sized windows in function `split.data.time.based` (#49, 40974ba, a174753)
- Split networks into time-based equally-sized windows in functions `split.network(s).time.based` (#49, 94cc87b, a174753, 5ac1492)
- Add function `delete.authors.without.specific.edges` to delete authors without specific edges from networks (#76, b9319e3, 107854c, 4e211f0, 4850666)
- Add methods `ProjectData$group.authors.by.data.column` and `ProjectData$group.artifacts.by.data.column` (#97, 11f7189)
- Add method `ProjectData$group.data.by.column` (b78f54f, 11f7189, related to #97)
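Splitting into time-based equally-sized windows, as mentioned above, amounts to dividing the overall time span into a given number of equal parts. The Python sketch below illustrates the idea only; the function name `equal_time_windows` and its parameters are hypothetical and do not mirror the signature of `split.data.time.based`.

```python
from datetime import datetime

def equal_time_windows(start, end, number_windows):
    """Divide [start, end] into `number_windows` windows of equal duration.

    Returns a list of (window_start, window_end) pairs.
    """
    if number_windows < 1:
        raise ValueError("number_windows must be at least 1")
    width = (end - start) / number_windows  # timedelta divided by int
    bounds = [start + i * width for i in range(number_windows + 1)]
    return list(zip(bounds[:-1], bounds[1:]))

windows = equal_time_windows(datetime(2020, 1, 1), datetime(2020, 1, 5), 4)
```

For this input, each of the four windows spans exactly one day.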
Changed/Improved
- Add possibility to add multiple first activities for different activity types in one vertex attribute (#92, 04f18b3)
- Add possibility to decide whether first activity should be computed per activity type or over all activity types when added as vertex attribute (#92, 86962a3)
- Refactor computation of vertex attribute `first.activity` for better performance (40b7d87, f518890)
- Move `RELATION.TO.DATASOURCE` to module 'networks' (1ac09f6)
- Determine list of artifacts more reasonably in ProjectData (#97, 23a8aa3, 11f7189)
- Adapt `ProjectData$get.artifacts` to work with all data sources (#97, 0d184b8)
- Improve function `save.and.load` to work without assignment (7f6ab1a)
- Handle incorrect keys and values in `get.key.to.value.from.df` (5b74038, related to #97)
Fixed
- Fix computation of vertex attribute `first.activity` to handle empty data sources (4a9ad23, 425c46b)
- Fortify check on callgraph revision in NetworkBuilder (dcf56ad)
- Move pull-request template to take effect (6df72e9)
- Fix function `split.networks.time.based` regarding case that provided list of networks only contains one element (010a935)
- Fix problem with fractional time periods in `generate.date.sequence` (8d80fa9)
- Handle ARPACK errors in eigen-centrality calculation (c5413c2, f213648)
- Allow merge of empty networks (edges and/or vertices) (#142, 26e3bef)
v3.3
Changes in detail
Added
- Possibility to add the commit count per person as vertex attribute, counting either commits, where the person is committer AND author, or committer OR author (#92 (second task), da87c06, 0f0a90f, 5df541d, 3f97397)
- Add method `metrics.is.scale.free()` to decide whether a network is scale-free or not (80f4751, 97161b1)
- Add tests for comparing networks that are created differently (66d37ce, 4a9d6b9, a37c277)
- Add method `clear.edge.attributes` to clear the edge attributes list of the network configuration (15f7587)
- Add network configuration parameter `author.respect.temporal.order` for determining the edge-construction algorithm (#6, 4fc59a0, fd0b07d)
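The two commit-count variants in the first item of this list (committer AND author versus committer OR author) differ in which commits are credited to a person. A minimal Python sketch of the distinction follows; the pair-based commit representation, the function name `commit_counts`, and the mode values `"and"` and `"or"` are assumptions for this example, not coronet's API.

```python
from collections import Counter

def commit_counts(commits, mode):
    """Count commits per person.

    commits: iterable of (author, committer) name pairs (hypothetical format).
    mode: "and" counts commits where the person is both author and committer;
          "or" counts commits where the person is author or committer.
    """
    counts = Counter()
    for author, committer in commits:
        if mode == "and":
            if author == committer:
                counts[author] += 1
        elif mode == "or":
            # A set ensures a commit is counted once per person,
            # even if author and committer are identical.
            for person in {author, committer}:
                counts[person] += 1
        else:
            raise ValueError("mode must be 'and' or 'or'")
    return dict(counts)

commits = [("A", "A"), ("A", "B"), ("B", "B"), ("C", "A")]
counts_and = commit_counts(commits, "and")
counts_or = commit_counts(commits, "or")
```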
Changed/Improved
- Add committer information to the commit list in the test data
- Set the locale to "english-us" on Windows (b3da10d)
- Update templates for pull requests and issues (0b9ecb7)
- Update the contribution guide regarding things to be done for a pull request (0b9ecb7)
- Update TravisCI script to run a job matrix with R 3.3, 3.4, and 3.5 (9bf7fcb, b34bf75)
- Update README file regarding functionality, network types, data sources, and mandatory attributes (#121, da68b94, 3200c57, baf41aa, bec3a47)
- Adjust legend orientation and placement in plots (now column-oriented) (c93ad2a)
- Refactor 'add.edges.for.bipartite.relation' for better readability (#118, 3d98b40)
- Remove function 'combine.networks' (#118, b349631)
- Do not support missing committer data anymore (871008e)
- Do not serialize Strings when calculating the sha1 hash to generate an event ID for issues (basically due to encoding issues, eb56a87)
- Add implementation of Codeface to compute the scale-free attributes for small networks (80f4751)
- Remove data inconsistencies when re-setting the commit, mail, and PaStA data (5695526)
- Switch the order of the `type` and `kind` attributes of vertices in bipartite networks for testing reasons (351311a)
- Update README file (8380dc6, f590453, 792cb95, 8c2aI8255966cc6c38ea16e64a28b135ef8456e58, 5cfc5ae, ebae9f8, c66321e, 38e7c5d)
- Distinguish directedness of author networks and edge-construction algorithm (#6, 4fc59a0, 70b3c82)
Fixed
- Change the type of all commit count default values to Integer (62c0339)
- Retain network attributes in `simplify.network` (in `igraph` language, graph attributes) (424b2bc)
- Fix showcase file regarding outdated plotting parameters (29d5ac6)
- Eliminate duplicated lines in the raw commit data (dec0005)
- Fix the `split.networks.time.based` method by now splitting the networks from the earliest timestamp to the latest (1f65db3)
- Fix TravisCI build regarding `sudo` commands (baca08e)
commands (baca08e) - Fix direction of edges in exemplary network plot in README file (5c80c25)