Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP on map_io #14

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

WIP on map_io #14

wants to merge 3 commits into from

Conversation

smehringer
Copy link
Collaborator

No description provided.

@smehringer smehringer marked this pull request as ready for review March 11, 2022 09:27
@smehringer smehringer requested a review from h-2 March 11, 2022 09:28
Copy link
Member

@h-2 h-2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make some small changes like double-checking the dates and the brief descriptions.

For the header I would also like to make it a bit more similar to the var_io::header.

All other design things can be discussed later, I think.

include/bio/detail/add_enum_bitwise_operators.hpp Outdated Show resolved Hide resolved
include/bio/detail/add_enum_bitwise_operators.hpp Outdated Show resolved Hide resolved
include/bio/format/sam.hpp Show resolved Hide resolved
include/bio/format/sam.hpp Outdated Show resolved Hide resolved
include/bio/format/sam.hpp Outdated Show resolved Hide resolved
* is not in a correct state (e.g. required fields are not given), but throwing might occur downstream of the actual
* error.
*/
void header::read(std::string_view header_string)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be made private if the header is constructed from a full-header-string.

include/bio/map_io/reader_options.hpp Outdated Show resolved Hide resolved
struct sam_tag_type
{
//!\brief The type for all unknown tags with no extra overload defaults to a std::variant.
using type = detail::sam_tag_variant;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we expose this, should it be public? The variants in var_io are public... In general, the interface of the sam_tag_dictionary is very different from how this is done in var_io at the moment.

I would recommend merging the dictionary as-is, and that we later re-evaluate together whether a common design makes sense.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I just copied over the seqan3 version because I didn't have time to go into every detail of the VCF implementation. Let's again keep track of this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have created discussion points.

BIO_RECORD_MEMBER(tlen)
BIO_RECORD_MEMBER(optionals)
BIO_RECORD_MEMBER(tags)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
BIO_RECORD_MEMBER(tags)
BIO_RECORD_MEMBER(sam_tags)

? Dunno. Maybe tags is OK, too.

Copy link
Collaborator Author

@smehringer smehringer Apr 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm.. none of the other names is prefixed with its format. E.g. tlen also does not really tell you anything.
On the other hand, tags is I think also made up from me :D
Here is the respective paragraph form the SAM specs:

The alignment section: optional fields

All optional fields follow the TAG:TYPE:VALUE format where TAG is a two-character string that matches
/[A-Za-z][A-Za-z0-9]/. Within each alignment line, no TAG may appear more than once and the order
in which the optional fields appear is not significant. A TAG containing lowercase letters is reserved for end
users. In an optional field, TYPE is a single case-sensitive letter which defines the format of VALUE:

because it is called TAG:VALUE I guess I called it tags. I think it's more memorable than optional_fields? But I have no strong feelings on this. I am just against prefixing the format.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like tag because we have already other things called "tag" and it has a very specific meaning in C++ (it's always difficult when a term has one meaning in C++ and another in Bioinformatics).

I also decided to stay close to the format-terminology, so I would propose to just call the field "optionals".

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpedantic"
template <typename char_t, char_t... s>
constexpr uint16_t operator""_tag()
Copy link
Member

@h-2 h-2 Mar 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if we can replace this with a bio::vtag, but this doesn't have to happen now. I will have a think on it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, are you keeping track of this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have now started adding Github discussion items for these things. Feel free to add information there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants