
Merge pull request #273 from partheee/patch-2
Wrong image URL (image not loaded) -- typo
merveenoyan authored Apr 30, 2024
2 parents ff4311a + abfdf19, commit e17e96c
Showing 1 changed file with 0 additions and 1 deletion.
chapters/en/unit4/multimodal-models/a_multimodal_world.mdx (0 additions, 1 deletion)
@@ -26,7 +26,6 @@ _An infographic on multimodality and why it is important to capture the overall
Communication between two people can often feel awkward in a purely textual mode; it improves slightly when voices are involved and improves greatly when body language and facial expressions are visible as well. This has been studied in detail by the American psychologist Albert Mehrabian, who formulated it as the 7-38-55 rule of communication:
"In communication, 7% of the overall meaning is conveyed through the verbal mode (spoken words), 38% through voice and tone, and 55% through body language and facial expressions."

-![Funny Image + Text Meme example](https://huggingface.co/datasets/hf-vision/course-assets/main/resolve/multimodal_fusion_text_vision/bigbang.jpg)

More generally, in the context of AI, 7% of the meaning would be conveyed through the textual modality, 38% through the audio modality, and 55% through the vision modality.
Within the context of deep learning, we refer to each modality as a way data arrives at a deep learning model for processing and prediction. The most commonly used modalities in deep learning are vision, audio, and text. Other modalities, such as LIDAR, EEG data, or eye-tracking data, can also be considered for specific use cases.