Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filenames with : in them do something odd in MediaInfoList #1359

Closed
hjmallon opened this issue Feb 10, 2021 · 12 comments · Fixed by #1361
Closed

Filenames with : in them do something odd in MediaInfoList #1359

hjmallon opened this issue Feb 10, 2021 · 12 comments · Fixed by #1361
Labels

Comments

@hjmallon
Copy link
Contributor

hjmallon commented Feb 10, 2021

I'm not sure what this code is trying to do? Was there meant to be a mode where lots of filenames could be provided as a string which was ':' separated? ':'s can occur in filenames (on macOS at least).

:.png will go through the top code path and a.png will go through the other one

https://github.com/MediaArea/MediaInfoLib/blob/master/Source/MediaInfo/MediaInfoList_Internal.cpp#L89

    if (Pos!=string::npos && Pos!=1)
        List.push_back(File_Name);
    #if defined(MEDIAINFO_FILE_YES)
    else if (File::Exists(File_Name))
        List.push_back(File_Name);
    #endif //defined(MEDIAINFO_FILE_YES)
    #if defined(MEDIAINFO_DIRECTORY_YES)
    else
    {
        ....
    }
@JeromeMartinez
Copy link
Member

Oops...
Right, it was an ugly hack from very old times when MediaInfo was only on Windows and this was considered safe enough at this moment.
I need to check where/when/how it is/was used before removing this ugly hack. No ETA through.

@hjmallon
Copy link
Contributor Author

I found this while looking for how to ask for info on lists of files. It might be nice to have a mode to ask for details on multiple files at once. That way if I have:

my_folder/seq_1.png
my_folder/seq_2.png
my_folder/seq_3.png

I will get the same results from mediainfocli ./my_folder and mediainfocli ./my_folder/*.png?

@JeromeMartinez
Copy link
Member

I will get the same results from mediainfocli ./my_folder and mediainfocli ./my_folder/*.png?

IIRC mediainfo ./my_folder/*.png the command line parser may expand itself to a list of files, so in theory not the same code and may be some differences if the algo is not same but I expect that it is same.

In any case, AFAIK both commands work, what is the issue with them?

It might be nice to have a mode to ask for details on multiple files at once.

mediainfo my_folder/seq_1.png my_folder/seq_2.png my_folder/seq_3.png works AFAIK.

@hjmallon
Copy link
Contributor Author

hjmallon commented Feb 10, 2021

Sorry bad example, it turns out you need more then 24 frames in a sequence for it to be detected, so you need

my_folder/seq_1.png
my_folder/seq_2.png
my_folder/seq_3.png
... etc ...
my_folder/seq_25.png

Running it with bash expanding the * to a long list of files means that they are Opened individually by MediaInfoList, so the sequence is not detected. Sending the list of file names from a MediaInfoList client would avoid that, something like:

// In MediaInfoList
size_t Open (const std::vector<String> &Files, const fileoptions_t Options=FileOption_Nothing);

@hjmallon
Copy link
Contributor Author

hjmallon commented Feb 10, 2021

Something like this maybe? #1360

@JeromeMartinez
Copy link
Member

I think I get it, if a directory the sequence is detected but if a list of files the sequence is not detected.
Note that providing the first png file name should trigger the sequence detection too.

There is definitely a lot to do for a more coherent interface. But I prefer to have the time to refactor all well instead of touching a part without being sure I improve the overall coherency.

e.g. in your PR, there is a regression for people using the ":" feature. So it would be the addition of an option, unset by default, for accepting filenames with ":", with a warning (not implemented) saying that default will change in the future in case of ":" is used, then option is set by default at some point.
We are aware that we have a long legacy issue and that at some point we need to do clean up, and we also know that it is not easy (we already have lot of complains when we change tiny details), so this is not something we'll do quickly and on specific item only.

In the ":" case, I think that a quick and less risky fix is to invert the File::Exists(File_Name) test and the : test, less risky (people using the : feature don't have files with : in the file names :), I guess ) and more acceptable for the moment IMO.

Note that we do C++03 for the moment due to the support of very old versions of some platforms (hello macOS <= 10.8), we plan to trash that soon but for the moment no auto sutff :).

@hjmallon
Copy link
Contributor Author

I think I get it, if a directory the sequence is detected but if a list of files the sequence is not detected.
Note that providing the first png file name should trigger the sequence detection too.

Currently running mediainfocli on the folder finds the sequence, running it on the first frame finds the sequence, but running it on the list of all the frames (provided on command line) finds the sequence many times (i.e 1-25, 2-25, 3-25, 4-25, etc).

e.g. in your PR, there is a regression for people using the ":" feature.

What is the definition of the ":" feature? If it is a list of file names it is already broken, this doesnt work ./mediainfocli "./example_mediainfolib/fileB.png:./example_mediainfolib/fileA.png".

Note that we do C++03

I'm on it 😄

@JeromeMartinez
Copy link
Member

What is the definition of the ":" feature? If it is a list of file names it is already broken, this doesnt work

Maybe not with the CLI, but by calling the lib.
Main issue is that we didn't add regression test during implementation of a feature, it may be unused, broken or whatever, we don't know (and I don't remember the main purpose of the hack, too old).
I prefer to avoid changes without a good reason.

Currently running mediainfocli on the folder finds the sequence, running it on the first frame finds the sequence [...]

OK, this is bad, and good reason for changing the behavior.
But: if you can keep the ":" stuff, while being able to read files with ":" in the file name, I prefer. In other words, fixing the spotted issue(s) with as less changes as possible.


We have regression tests on our todo-list, when we have some free time for that... At long term it would help, but never easy to prioritize that over sponsored features.

@hjmallon
Copy link
Contributor Author

hjmallon commented Feb 10, 2021

if you can keep the ":" stuff, while being able to read files with ":" in the file name, I prefer [..]

I haven't changed any functionality to do with the ':' at all. Here is that part of my diff with the whitespace removed (git show -w). The old call to Open(String will just make a 1 object vector and call the new function.

diff --git a/Source/MediaInfo/MediaInfoList_Internal.cpp b/Source/MediaInfo/MediaInfoList_Internal.cpp
index 14dfd464..6db110a8 100644
--- a/Source/MediaInfo/MediaInfoList_Internal.cpp
+++ b/Source/MediaInfo/MediaInfoList_Internal.cpp
@@ -76,6 +76,14 @@ MediaInfoList_Internal::~MediaInfoList_Internal()

 //---------------------------------------------------------------------------
 size_t MediaInfoList_Internal::Open(const String &File_Name, const fileoptions_t Options)
+{
+    vector<String> File_Names;
+    File_Names.push_back(File_Name);
+    return Open(File_Names, Options);
+}
+
+//---------------------------------------------------------------------------
+size_t MediaInfoList_Internal::Open(const vector<String> &File_Names, const fileoptions_t Options)
 {
     //Option FileOption_Close
     if (Options & FileOption_CloseAll)
@@ -86,6 +94,8 @@ size_t MediaInfoList_Internal::Open(const String &File_Name, const fileoptions_t

     //Get all filenames
     ZtringList List;
+    for (const auto& File_Name : File_Names)
+    {
         size_t Pos=File_Name.find(__T(':'));
         if (Pos!=string::npos && Pos!=1)
             List.push_back(File_Name);
@@ -96,7 +106,8 @@ size_t MediaInfoList_Internal::Open(const String &File_Name, const fileoptions_t
         #if defined(MEDIAINFO_DIRECTORY_YES)
         else
         {
-        List=Dir::GetAllFileNames(File_Name, (Options&FileOption_NoRecursive)?Dir::Include_Files:((Dir::dirlist_t)(Dir::Include_Files|Dir::Parse_SubDirs)));
+            ZtringList LocalList=Dir::GetAllFileNames(File_Name, (Options&FileOption_NoRecursive)?Dir::Include_Files:((Dir::dirlist_t)(Dir::Include_Files|Dir::Parse_SubDirs)));
+            List.insert(List.end(), LocalList.begin(), LocalList.end());
             sort(List.begin(), List.end());
             if (List.empty())
                 List.push_back(File_Name); // Try directly the file name e.g. "-" for pipe
@@ -118,6 +129,7 @@ size_t MediaInfoList_Internal::Open(const String &File_Name, const fileoptions_t
             #endif //MEDIAINFO_ADVANCED
         }
         #endif //defined(MEDIAINFO_DIRECTORY_YES)
+    }

     #if defined(MEDIAINFO_DIRECTORY_YES)
         Reader_Directory().Directory_Cleanup(List);

@hjmallon
Copy link
Contributor Author

On closer inspection maybe ':' is not meant for lists, maybe it is meant for detecting URLs?

@JeromeMartinez
Copy link
Member

maybe it is meant for detecting URLs?

Ooops, too tired, right.
(there is a list hack somewhere else)

@hjmallon
Copy link
Contributor Author

hjmallon commented Mar 2, 2021

I actually think that this bug is now fixed. My other PR is useful but fixes a different bug (sequences are reported multiple times if you do seq_folder/*, rather than seq_folder).

@hjmallon hjmallon closed this as completed Mar 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants