Commit

Built site for gh-pages
Quarto GHA Workflow Runner committed Sep 23, 2024
1 parent 63e142e commit 1f38998
Showing 18 changed files with 138 additions and 135 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
@@ -1 +1 @@
-c100f47f
+fdc8b9bb
9 changes: 6 additions & 3 deletions Applications/Blogs/blog-music-identification.html
@@ -322,8 +322,9 @@ <h1>The Challenge</h1>
<p>The mystery I have been working on since about 1993 is how to systematically describe the Irish traditional dance music repertoire. This particular musical culture might be the healthiest European folk music tradition that has survived unbroken for centuries as an aurally transmitted culture, with something on the order of 10,000 musically distinct “tunes” (as musical works for dance use are called in this culture) and tens of thousands of active participants around the world. My main work is published at <a href="https://www.irishtune.info">irishtune.info</a> as a combination scholarly reference work and practical day-to-day tool for the global community of musicians at all levels.</p>
<p>One side benefit of the manual work I’ve been doing for 30 years to carefully describe the contents of about a thousand albums of commercially published Irish traditional music is that I (only somewhat intentionally) created an ideal dataset for training a machine-learning solution that can do what only very few human experts can do after a lifetime of experience and use of large archival resources: <strong>Hear any performance and identify what tune it is</strong>. This is the core challenge I have tackled here.</p>
<p>Background explanation: Folk musicians in aurally transmitted traditions generally do not know the “identity” of tunes they play, especially not in any kind of broadly reliable or generally agreed-upon way other than as often-contradictory informal assertions, each held within a subset of the global community. Welcome to the fuzziness of humanities research! :)</p>
-<p>Some benefits of solving this challenge: - Accelerate my work as a human expert as I expand the coverage of albums represented in <a href="https://www.irishtune.info">irishtune.info</a>, which in turn benefits the global community of musicians and the health of the tradition itself. This goal was accomplished as of July 27, 2024, but refinement continues.</p>
+<p>Some benefits of solving this challenge:</p>
<ul>
+<li><p>Accelerate my work as a human expert as I expand the coverage of albums represented in <a href="https://www.irishtune.info">irishtune.info</a>, which in turn benefits the global community of musicians and the health of the tradition itself. This goal was accomplished as of July 27, 2024, but refinement continues.</p></li>
<li><p>Empower the global community of Irish traditional musicians to identify their own tunes and recordings. (I am currently looking for help with this technical challenge.)</p></li>
<li><p>Now that I have solved this for Irish traditional music, how do we enable musicologists and musicians interested in all the other folk musics on planet Earth to create the same kind of solution for those musical cultures?</p></li>
</ul>
@@ -333,16 +334,18 @@ <h1>The Solution</h1>
<p>For better or worse, commercial music is a big-money industry. That means there are both well-funded organizations involved as well as financial incentives to analyze and automate all sorts of business processes related to pop music. For example, social media platforms have to worry about copyright and licensing issues in any kind of social-media post that includes music in any form. YouTube or TikTok, for example, need to make sure that your video of yourself singing a Taylor Swift song in the car doesn’t get shared without the proper licensing fees being paid to the artist. And in order to do that, they needed to solve the problem of “What song is that?” in an automated way. So they have been funding CSI (Cover Song Identification) research for years, and that has occasionally shown up in public as open-source code repositories, like this ground-breaking one that appeared in July 2023 from an industry-funded team of researchers in China: <a href="https://github.com/Liu-Feng-deeplearning/CoverHunter">github.com/Liu-Feng-deeplearning/CoverHunter</a></p>
<p>While humanities research would never get the resources to build a solution from scratch to, say, figure out a way to catalog 10,000 hours of audio in an Armenian folk song archive, we can certainly take advantage of big-industry solutions that solve similar problems.</p>
<p>And that’s exactly what I did. I forked that CoverHunter project and the result is published as <a href="https://github.com/alanngnet/CoverHunterMPS">github.com/alanngnet/CoverHunterMPS</a>.</p>
-<p>It took about 6 months of: - Reverse-engineering the very poorly documented CoverHunter code to understand it enough to proceed.</p>
+<p>It took about 6 months of:</p>
<ul>
+<li><p>Reverse-engineering the very poorly documented CoverHunter code to understand it enough to proceed.</p></li>
<li><p>Teaching myself just enough about machine learning, neural network training, and just enough new Python skills to proceed.</p></li>
<li><p>Fixing bugs and documenting both the Python code itself as well as how to use it. My correspondence with the lead author of CoverHunter confirmed that their industry sponsor required them to remove proprietary aspects of the code, which presumably broke things in the process. Plus they wrote their solution to run on big industry-scale server farms which humanities researchers typically have no way to fund, so I had to revise it to run on a desktop.</p></li>
<li><p>Data wrangling - mainly in the form of writing Python scripts - to automate the large-scale tasks of leveraging my own database and audio library to prepare training data in the format needed by the Python application.</p></li>
<li><p>Hyperparameter tuning (and adding more hyperparameters). So many long training runs, so many TensorBoard graphs to pore over!</p></li>
<li><p>Adding features to take the CoverHunter solution, which was only built out enough to generate the training metrics needed to claim success as a CSI research breakthrough, and turn it into a reliable, easy-to-use solution that answers “What tune is this?” for any arbitrary audio input.</p></li>
</ul>
-<p>In case you are interested in a more technical summary of the solution, it involves: - Conversion of raw audio data to CQT (Constant-Q Transform) 2-D arrays, representing time and frequency dimensions that can also be treated as a visual picture of the audio, so that visual machine-learning methods and models could also be leveraged. For example, a single-instrument melody appears as a line moving vertically higher for higher notes and lower for lower notes, and longer horizontally when notes are sustained for a longer time. CQT is a well-established method of audio analysis in the larger academic research field of MIR (music information retrieval).</p>
+<p>In case you are interested in a more technical summary of the solution, it involves:</p>
<ul>
+<li><p>Conversion of raw audio data to CQT (Constant-Q Transform) 2-D arrays, representing time and frequency dimensions that can also be treated as a visual picture of the audio, so that visual machine-learning methods and models could also be leveraged. For example, a single-instrument melody appears as a line moving vertically higher for higher notes and lower for lower notes, and longer horizontally when notes are sustained for a longer time. CQT is a well-established method of audio analysis in the larger academic research field of MIR (music information retrieval).</p></li>
<li><p>A lot of artificial data augmentation done both in pre-training preprocessing as well as on-the-fly augmentation done during training. For each real-world audio sample you give to this solution, it will generate 4 other variants in pre-processing, and then during training each of those will in turn get augmented (modified) in different, partly random ways during each training step. So by the end of a full training run, the model may have seen hundreds of artificial variants of each real-world data sample.</p></li>
<li><p>A PyTorch-based implementation of a conformer neural network, a somewhat unusual or newer complex network that combines a CNN (convolutional neural network) for small-timescale learning of specific patterns with a transformer network that enables large-timescale and structurally flexible learning. You need the latter because real-world musicians (unlike, say, music played at the school level) are free to modify the structure of a work without hurting the musical identity of the work, and are free to improvise and vary within the structure, likewise without causing a human listener to fail to recognize the identity of the work.</p></li>
<li><p>Leveraging a “bag of tricks” of various neural-network training optimization techniques that the CoverHunter authors adapted from previous deep-learning research, including taking a lot of source code from various - often unidentified - open-source projects in the field of automated speech recognition. This amalgamation of code from many undocumented sources was a big part of why my first task of even understanding their code was so challenging (I did fix a lot of that along the way).</p></li>
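Note on the CQT bullet in the hunk above: the diff does not show how CoverHunterMPS computes its CQT arrays, so the following is only a minimal sketch of the idea, assuming a naive direct evaluation of the transform in plain Python. The constants (`SAMPLE_RATE`, `FMIN`, `BINS_PER_OCTAVE`, `N_BINS`) and the helper names (`cqt_frame`, `cqt`, `tone`, `loudest_bin`) are hypothetical and chosen for illustration. It shows how audio becomes a time-by-frequency array in which a higher note lands in a higher bin.

```python
import cmath
import math

# All constants below are illustrative assumptions, not project settings.
SAMPLE_RATE = 8000      # samples per second
FMIN = 110.0            # center frequency of the lowest bin, in Hz
BINS_PER_OCTAVE = 12    # one bin per semitone
N_BINS = 36             # three octaves above FMIN
# Q is the constant ratio of center frequency to bandwidth.
Q = 1.0 / (2.0 ** (1.0 / BINS_PER_OCTAVE) - 1.0)


def cqt_frame(signal, start):
    """Return the CQT magnitudes of one frame that begins at index `start`."""
    magnitudes = []
    for k in range(N_BINS):
        # Center frequencies are spaced geometrically, like musical pitches.
        freq = FMIN * 2.0 ** (k / BINS_PER_OCTAVE)
        # Lower bins get longer windows so that Q stays constant.
        length = math.ceil(Q * SAMPLE_RATE / freq)
        total = 0j
        for n in range(length):
            # Hamming window to limit leakage into neighboring bins.
            window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            total += signal[start + n] * window * cmath.exp(-2j * math.pi * Q * n / length)
        magnitudes.append(abs(total) / length)
    return magnitudes


def cqt(signal, hop=512):
    """Return a 2-D list indexed as [frame][bin], i.e. time by frequency."""
    longest = math.ceil(Q * SAMPLE_RATE / FMIN)  # window of the lowest bin
    starts = range(0, len(signal) - longest + 1, hop)
    return [cqt_frame(signal, s) for s in starts]


def tone(freq, n_samples=4096):
    """Generate a pure sine tone as a stand-in for a single sustained note."""
    return [math.sin(2.0 * math.pi * freq * n / SAMPLE_RATE) for n in range(n_samples)]


def loudest_bin(frame):
    """Return the index of the strongest frequency bin in one frame."""
    return max(range(len(frame)), key=frame.__getitem__)


spectrogram = cqt(tone(440.0))
print(len(spectrogram), "frames x", len(spectrogram[0]), "bins")
print("440 Hz peaks in bin", loudest_bin(spectrogram[0]))
print("220 Hz peaks in bin", loudest_bin(cqt(tone(220.0))[0]))
```

With these settings a 440 Hz tone peaks 24 bins (two octaves) above `FMIN` and a 220 Hz tone peaks 12 bins above it, which is the "line moving vertically higher for higher notes" described in the bullet. A real pipeline would more likely rely on an optimized library routine than on this quadratic-time loop.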
2 changes: 1 addition & 1 deletion Applications/Blogs/index.html
@@ -343,7 +343,7 @@ <h3 class="anchored" data-anchor-id="coming-soon-explore-ml-stories">Coming soon

<div class="quarto-listing quarto-listing-container-default" id="listing-listing">
<div class="list quarto-listing-default">
-<div class="quarto-post image-right" data-index="0" data-categories="Blogs,Deep learning,Conformer,Transformer,CNN,Humanities,Audio,Music,CSI,Time-series" data-listing-date-sort="1726012800000" data-listing-file-modified-sort="1727100118709" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="13" data-listing-word-count-sort="2572">
+<div class="quarto-post image-right" data-index="0" data-categories="Blogs,Deep learning,Conformer,Transformer,CNN,Humanities,Audio,Music,CSI,Time-series" data-listing-date-sort="1726012800000" data-listing-file-modified-sort="1727100360602" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="13" data-listing-word-count-sort="2572">
<div class="thumbnail">
<p><a href="../../Applications/Blogs/blog-music-identification.html" class="no-external"></a></p><a href="../../Applications/Blogs/blog-music-identification.html" class="no-external">
<p><img loading="lazy" src="../../images/blog-music-identification.jpg" class="thumbnail-image" style="height: 150px;"></p>
14 changes: 7 additions & 7 deletions Applications/Highlights/Forums/index.html
@@ -347,7 +347,7 @@ <h3 class="anchored" data-anchor-id="share-your-work">Share your work!</h3>

<div class="quarto-listing quarto-listing-container-default" id="listing-listing">
<div class="list quarto-listing-default">
-<div class="quarto-post image-right" data-index="0" data-categories="ML+X,Computer vision,Ultrasound,Medical imaging,Agriculture,LSTM,CNN-LSTM,CNN,Deep learning" data-listing-date-sort="1712620800000" data-listing-file-modified-sort="1727100118710" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="4">
+<div class="quarto-post image-right" data-index="0" data-categories="ML+X,Computer vision,Ultrasound,Medical imaging,Agriculture,LSTM,CNN-LSTM,CNN,Deep learning" data-listing-date-sort="1712620800000" data-listing-file-modified-sort="1727100360602" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="4">
<div class="thumbnail">
<p><a href="../../../Applications/Highlights/Forums/mlx_2024-04-09.html" class="no-external"></a></p><a href="../../../Applications/Highlights/Forums/mlx_2024-04-09.html" class="no-external">
<p><img loading="lazy" src="https://img.youtube.com/vi/DHYbBGI7EWc/maxresdefault.jpg" class="thumbnail-image" style="height: 150px;"></p>
@@ -404,7 +404,7 @@ <h3 class="no-anchor listing-title">
</a>
</div>
</div>
-<div class="quarto-post image-right" data-index="1" data-categories="ML+X,Multimodal learning,Foundation models,Model sharing,Hugging Face,LLM,LMM,LLaVA,Deep learning" data-listing-date-sort="1710201600000" data-listing-file-modified-sort="1727100118710" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="222">
+<div class="quarto-post image-right" data-index="1" data-categories="ML+X,Multimodal learning,Foundation models,Model sharing,Hugging Face,LLM,LMM,LLaVA,Deep learning" data-listing-date-sort="1710201600000" data-listing-file-modified-sort="1727100360602" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="222">
<div class="thumbnail">
<p><a href="../../../Applications/Highlights/Forums/mlx_2024-03-12.html" class="no-external"></a></p><a href="../../../Applications/Highlights/Forums/mlx_2024-03-12.html" class="no-external">
<p><img loading="lazy" src="https://img.youtube.com/vi/zs1T3H80Fk4/maxresdefault.jpg" class="thumbnail-image" style="height: 150px;"></p>
@@ -461,7 +461,7 @@ <h3 class="no-anchor listing-title">
</a>
</div>
</div>
-<div class="quarto-post image-right" data-index="2" data-categories="ML+X,Physics,Simulations" data-listing-date-sort="1707782400000" data-listing-file-modified-sort="1727100118710" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="4">
+<div class="quarto-post image-right" data-index="2" data-categories="ML+X,Physics,Simulations" data-listing-date-sort="1707782400000" data-listing-file-modified-sort="1727100360602" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="4">
<div class="thumbnail">
<p><a href="../../../Applications/Highlights/Forums/mlx_2024-02-13.html" class="no-external"></a></p><a href="../../../Applications/Highlights/Forums/mlx_2024-02-13.html" class="no-external">
<p><img loading="lazy" src="http://i3.ytimg.com/vi/LmKMhNiu5Fw/hqdefault.jpg" class="thumbnail-image" style="height: 150px;"></p>
@@ -500,7 +500,7 @@ <h3 class="no-anchor listing-title">
</a>
</div>
</div>
-<div class="quarto-post image-right" data-index="3" data-categories="ML+X,Science communication,Healthcare,Drug synergy,LLM,Text mining" data-listing-date-sort="1702339200000" data-listing-file-modified-sort="1727100118709" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="4">
+<div class="quarto-post image-right" data-index="3" data-categories="ML+X,Science communication,Healthcare,Drug synergy,LLM,Text mining" data-listing-date-sort="1702339200000" data-listing-file-modified-sort="1727100360602" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="4">
<div class="thumbnail">
<p><a href="../../../Applications/Highlights/Forums/mlx_2023-12-12.html" class="no-external"></a></p><a href="../../../Applications/Highlights/Forums/mlx_2023-12-12.html" class="no-external">
<p><img loading="lazy" src="https://img.youtube.com/vi/7Xcsr0mKj4A/maxresdefault.jpg" class="thumbnail-image" style="height: 150px;"></p>
@@ -548,7 +548,7 @@ <h3 class="no-anchor listing-title">
</a>
</div>
</div>
-<div class="quarto-post image-right" data-index="4" data-categories="ML+X,Healthcare,Clustering,Deep learning,LLM,Genomics" data-listing-date-sort="1699315200000" data-listing-file-modified-sort="1727100118709" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="4">
+<div class="quarto-post image-right" data-index="4" data-categories="ML+X,Healthcare,Clustering,Deep learning,LLM,Genomics" data-listing-date-sort="1699315200000" data-listing-file-modified-sort="1727100360602" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="4">
<div class="thumbnail">
<p><a href="../../../Applications/Highlights/Forums/mlx_2023-11-07.html" class="no-external"></a></p><a href="../../../Applications/Highlights/Forums/mlx_2023-11-07.html" class="no-external">
<p><img loading="lazy" src="https://img.youtube.com/vi/P3bO2naMCD4/maxresdefault.jpg" class="thumbnail-image" style="height: 150px;"></p>
@@ -596,7 +596,7 @@ <h3 class="no-anchor listing-title">
</a>
</div>
</div>
-<div class="quarto-post image-right" data-index="5" data-categories="ML+X,Time-series,Genomics,Healthcare" data-listing-date-sort="1696896000000" data-listing-file-modified-sort="1727100118709" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="4">
+<div class="quarto-post image-right" data-index="5" data-categories="ML+X,Time-series,Genomics,Healthcare" data-listing-date-sort="1696896000000" data-listing-file-modified-sort="1727100360602" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="4">
<div class="thumbnail">
<p><a href="../../../Applications/Highlights/Forums/mlx_2023-10-10.html" class="no-external"></a></p><a href="../../../Applications/Highlights/Forums/mlx_2023-10-10.html" class="no-external">
<p><img loading="lazy" src="https://img.youtube.com/vi/MBGtl5lwyFA/maxresdefault.jpg" class="thumbnail-image" style="height: 150px;"></p>
@@ -638,7 +638,7 @@ <h3 class="no-anchor listing-title">
</a>
</div>
</div>
-<div class="quarto-post image-right" data-index="6" data-categories="ML+X,Multimodal learning,Deep learning,Computer vision,Healthcare,Genomics" data-listing-date-sort="1695081600000" data-listing-file-modified-sort="1727100118709" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="4">
+<div class="quarto-post image-right" data-index="6" data-categories="ML+X,Multimodal learning,Deep learning,Computer vision,Healthcare,Genomics" data-listing-date-sort="1695081600000" data-listing-file-modified-sort="1727100360602" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="4">
<div class="thumbnail">
<p><a href="../../../Applications/Highlights/Forums/mlx_2023-09-19.html" class="no-external"></a></p><a href="../../../Applications/Highlights/Forums/mlx_2023-09-19.html" class="no-external">
<p><img loading="lazy" data-src="https://img.youtube.com/vi/W3h9s1CG35c/maxresdefault.jpg" class="thumbnail-image" style="height: 150px;"></p>
2 changes: 1 addition & 1 deletion Applications/Highlights/SILO/index.html
@@ -342,7 +342,7 @@ <h3 class="anchored" data-anchor-id="join-the-next-live-silo">Join the next live

<div class="quarto-listing quarto-listing-container-default" id="listing-listing">
<div class="list quarto-listing-default">
-<div class="quarto-post image-right" data-index="0" data-categories="SILO,VLM,LLM,LMM,Multimodal learning,Foundation models,Knowledge-based" data-listing-date-sort="1700611200000" data-listing-file-modified-sort="1727100118710" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="451">
+<div class="quarto-post image-right" data-index="0" data-categories="SILO,VLM,LLM,LMM,Multimodal learning,Foundation models,Knowledge-based" data-listing-date-sort="1700611200000" data-listing-file-modified-sort="1727100360602" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="451">
<div class="thumbnail">
<p><a href="../../../Applications/Highlights/SILO/23-11-22_KennethMarino_ World-Knowledge-in-the-Time-of-Large-Models.html" class="no-external"></a></p><a href="../../../Applications/Highlights/SILO/23-11-22_KennethMarino_ World-Knowledge-in-the-Time-of-Large-Models.html" class="no-external">
<p><img loading="lazy" src="https://vumbnail.com/891935467.jpg" class="thumbnail-image" style="height: 150px;"></p>