Awesome-Video-Streaming-and-Analysis is a curated list of awesome frameworks, applications, and systems dedicated to video streaming and analysis (processing).
With the rise of the metaverse, an increasing number of related projects and technologies have emerged. However, many existing repositories have not been updated for some time, which makes it hard to find current and relevant information. To address this gap, this repository provides an up-to-date and comprehensive collection of papers, resources, and techniques for video streaming and analysis.
Publication Scope:
Streaming and analysis: SIGCOMM, NSDI, MobiCom, INFOCOM, MM, VR, WWW, MMSys, OSDI, NOSSDAV
Video processing: CVPR, ECCV, ICCV, TCSVT, TMM, ToG, TVCG, TIP, Siggraph, Vis
*Note that this repo mainly takes a networking perspective.
I have also packed the conference proceedings on multimedia networking into a single PDF.
- Resources
- 2D Videos
- 360-degree Videos
- Volumetric Videos
- Quality of Experience
- Video Processing
- Tools
- Multimedia Libraries
Back to Table of Contents
Related repo: Paper-Lit, Video-Streaming-Research-Papers, Deep image/video compression, Awesome-360-vision, Awesome-Streaming, Awesome-NeRF, Weekly-NeRF, Awesome-ARKit, Awesome-iot, 3D Machine learning, audio-video-streaming, Awesome Point cloud Analysis, Awesome-System-for-Machine-Learning, Awesome-Deep-Neural-Network-Compression
EE364a-Convex optimization a, EE364b-Convex optimization b, CS349D: Cloud Computing techniques, CSC348K-visual computing system, EE267-virtual reality, EE359-wireless communication, EE398A-Image and Video Compression, CS244 Advanced topics in networking, ECE 5578 Multimedia Communication, 34702 Topics in Networks: Machine Learning for Networking and Systems
SIGCOMM 23 Workshop on Emerging Multimedia Systems
Several Special Issues in IEEE Network 23
MM 22: Advances In Quality Assessment Of Video Streaming Systems: Algorithms, Methods, Tools; Short Video Streaming Challenge
MM 21: Deep Learning for Visual Data Compression
MMSys 23, NTIRE 2023 Challenge on 360° Omnidirectional Image and Video Super-Resolution
OmniCV2022, GAZE2022, 19 MM Grand Challenge:
This section is limited by the author's knowledge and is still incomplete.
- Some articles may be repeated.*
Back to Table of Contents
- Enabling High Quality Real-Time Communications with Adaptive Frame-Rate [NSDI'23]
- Dashlet: Taming Swipe Uncertainty for Robust Short Video Streaming [NSDI'23]
- Robust Saliency-Driven Quality Adaptation for Mobile 360-Degree Video Streaming [TMC'23]
- Buffer Awareness Neural Adaptive Video Streaming for Avoiding Extra Buffer Consumption [INFOCOM 23]
- SJA: Server-driven Joint Adaptation of Loss and Bitrate for Multi-Party Realtime Video Streaming [INFOCOM 23]
- EAVS: Edge-assisted Adaptive Video Streaming with Fine-grained Serverless Pipelines [INFOCOM 23]
- Swift: Adaptive Video Streaming with Layered Neural Codecs [NSDI'22]
- Zwei: A Self-play Reinforcement Learning Framework for Video Transmission Services [TMM 22]
- Lumos: towards Better Video Streaming QoE through Accurate Throughput Prediction [INFOCOM 22]
- SenSei: Aligning Video Streaming Quality with Dynamic User Sensitivity [NSDI'21]
- OnRL: Improving Mobile Video Telephony via Online Reinforcement Learning [Mobicom 20]
- Llama: A Heterogeneous & Serverless Framework for Auto-Tuning Video Analytics Pipelines [SoCC 21]
- PECAM: Privacy-Enhanced Video Streaming and Analytics via Securely-Reversible Transformation [Mobicom 21]
- Interpreting Deep Learning-Based Networking Systems [Sigcomm 20]
- Learning in situ: a randomized experiment in video streaming [NSDI'20]
- Grad: Learning for Overhead-aware Adaptive Video Streaming with Scalable Video Coding [MM'20]
- PERM: Neural Adaptive Video Streaming with Multi-path Transmission [INFOCOM'20]
- Self-play Reinforcement Learning for Video Transmission [NOSSDAV 20]
- End-to-End Transport for Video QoE Fairness [SIGCOMM'19]
- Jigsaw: Robust Live 4K Video Streaming [Mobicom 19]
- PiTree: Practical Implementation of ABR Algorithms Using Decision Trees [MM'19] [Code] [Dataset]
- Comyco: Quality-aware Adaptive Video Streaming via Imitation Learning [MM'19]
- Intelligent Edge-Assisted Crowdcast with Deep Reinforcement Learning for Personalized QoE [Infocom 19] [DeepCast]
- QFlow: A Reinforcement Learning Approach to High QoE Video Streaming over Wireless Networks [Mobihoc 19]
- Edge Computing Assisted Adaptive Mobile Video Streaming [TMC 19]
- Oboe: Auto-tuning Video ABR Algorithms to Network Conditions [SIGCOMM'18]
- Neural Adaptive Content-aware Internet Video Delivery [OSDI'18]
- ABR Streaming of VBR-encoded Videos: Characterization, Challenges, and Solutions [CoNEXT'18]
- Understanding Video Management Planes [IMC'18]
- From Theory to Practice: Improving Bitrate Adaptation in the DASH Reference Player [MMSys'18]
- VideoNOC: assessing video QoE for network operators using passive measurements [MMSys'18]
- Disk|Crypt|Net: rethinking the stack for high-performance video streaming [SIGCOMM'17]
- Neural Adaptive Video Streaming with Pensieve [SIGCOMM'17][Code]
- Pytheas: Enabling Data-Driven QoE Optimization Using Group-Based Exploration-Exploitation [NSDI'17]
- Dissecting VOD Services for Cellular: Performance, Root Causes and Best Practices [IMC'17]
- CS2P: Improving Video Bitrate Selection and Adaptation with Data-Driven Throughput Prediction [SIGCOMM'16]
- MP-DASH: Adaptive Video Streaming Over Preference-Aware Multipath [CoNEXT'16]
- DASH2M: Exploring HTTP/2 for Internet Streaming to Mobile Devices [MM'16]
- mDASH: A Markov Decision-Based Rate Adaptation Approach for Dynamic HTTP Streaming [TMM 16]
- BOLA: Near-Optimal Bitrate Adaptation for Online Videos [INFOCOM'16]
- A Control-Theoretic Approach for Dynamic Adaptive Video Streaming over HTTP [SIGCOMM'15]
- Can Accurate Predictions Improve Video Streaming in Cellular Networks? [HotMobile'15]
- A Control-Theoretic Approach to Rate Adaption for DASH Over Multiple Content Distribution Servers [TCSVT 14]
- A Buffer-Based Approach to Rate Adaptation: Evidence from a Large Video Streaming Service [SIGCOMM'14]
- Improving Fairness, Efficiency, and Stability in HTTP-based Adaptive Video Streaming with FESTIVE [CoNEXT'12]
Year | Method | Detail |
---|---|---|
20 | Fugu [NSDI 20] | DNN (bandwidth prediction)+DP (control) |
20 | OnRL [Mobicom 20] | Online RL |
20 | Stick [Infocom 20] | Buffer-based+Learning-based |
19 | Comyco [MM 19], Concerto [Mobicom 19] | Imitation Learning |
19 | PiTree [MM 19] | Explainable Learning |
18 | Oboe [Sigcomm 18] | Auto-tuning parameters |
17 | Pensieve [Sigcomm 17] Update [ICML 19] | Reinforcement Learning |
16 | CS2P [Sigcomm 16] | Data-driven throughput prediction |
16 | BOLA [Infocom 16] | Buffer-Based+Lyapunov Optimization |
15 | MPC [Sigcomm 15] | MPC |
14 | Buffer-Based [Sigcomm 14] | Buffer-Based |
12 | Rate-Based [CoNEXT 12] | Rate-Based |
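As a rough illustration of the two oldest rows in the table above, the sketch below contrasts a rate-based rule with a buffer-based rule for choosing the next chunk's bitrate. It is a simplified sketch under stated assumptions, not the algorithm from either paper: the bitrate ladder, the safety factor, and the reservoir and cushion thresholds are all hypothetical values chosen for illustration.

```python
from typing import Sequence

# Hypothetical bitrate ladder in kbps; not taken from any paper in this list.
LADDER_KBPS = (300, 750, 1200, 1850, 2850, 4300)


def rate_based(throughput_kbps: float,
               ladder: Sequence[int] = LADDER_KBPS,
               safety: float = 0.9) -> int:
    """Pick the highest bitrate that fits a discounted throughput estimate."""
    budget = throughput_kbps * safety  # assumed safety margin
    candidates = [b for b in ladder if b <= budget]
    # Fall back to the lowest rung when nothing fits the budget.
    return max(candidates) if candidates else min(ladder)


def buffer_based(buffer_s: float,
                 ladder: Sequence[int] = LADDER_KBPS,
                 reservoir_s: float = 5.0,
                 cushion_s: float = 10.0) -> int:
    """Map buffer occupancy linearly onto the ladder between two thresholds."""
    lowest, highest = min(ladder), max(ladder)
    if buffer_s <= reservoir_s:
        return lowest  # buffer nearly empty: play it safe
    if buffer_s >= reservoir_s + cushion_s:
        return highest  # buffer comfortably full: use the top rung
    # Linear target rate between the lowest and highest rung.
    frac = (buffer_s - reservoir_s) / cushion_s
    target = lowest + frac * (highest - lowest)
    return max(b for b in ladder if b <= target)


if __name__ == "__main__":
    print(rate_based(2000))   # decision from a throughput estimate
    print(buffer_based(10))   # decision from buffer occupancy only
```

The rate-based rule reacts only to the throughput estimate, while the buffer-based rule ignores throughput and reacts only to how much video is buffered; the later rows of the table combine or replace these signals with prediction, control, or learning.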
- ARTEMIS: Adaptive bitrate ladder optimization for live video streaming [NSDI 24]
- Who is the Rising Star? Demystifying the Promising Streamers in Crowdsourced Live Streaming [Infocom'23]
- AggCast: Practical Cost-effective Scheduling for Large-scale Cloud-edge Crowdsourced Live Streaming [MM'22]
- Low Latency Live Streaming Implementation in DASH and HLS [MM'22]
- Opte: Online per-title encoding for live video streaming [ICASSP'22]
- Towards Optimal Low-Latency Live Video Streaming [ToN'21]
- ReCLive: Real-Time Classification and QoE Inference of Live Video Streaming Services [IWQoS'21]
- A Low-Latency MPTCP Scheduler for Live Video Streaming in Mobile Networks [TWC'21]
- Look Ahead at the First-mile in Livecast with Crowdsourced Highlight Prediction [Infocom 20]
- Neural-Enhanced Live Streaming: Improving Live Video Ingest via Online Learning [Sigcomm 20] [LiveNas]
- MultiLive: Adaptive Bitrate Control for Low-delay Multi-party Interactive Live Streaming [Infocom 20]
- Vabis: Video Adaptation Bitrate System for Time-Critical Live Streaming [TMM 20]
- Optimizing Social Welfare of Live Video Streaming Services in Mobile Edge Computing [TMC 20]
- Intelligent Edge-Assisted Crowdcast with Deep Reinforcement Learning for Personalized QoE [Infocom 19] [DeepCast]
- Vantage: Optimizing video upload for time-shifted viewing of social live stream [Sigcomm 19]
- Edge-based Transcoding for Adaptive Live Video Streaming [HotEdge 19]
- QARC: Video Quality Aware Rate Control for Real-Time Video Streaming based on Deep Reinforcement Learning [MM 18]
- Characterizing User Behaviors in Mobile Personal Livecast: Towards an Edge Computing-assisted Paradigm [ToMM 18]
- Cloud-Assisted Crowdsourced Livecast [ToMM 17]
- Coping With Heterogeneous Video Contributors and Viewers in Crowdsourced Live Streaming: A Cloud-Based Approach [TMM 16]
- When Crowd Meets Big Video Data: Cloud-Edge Collaborative Transcoding for Personal Livecast [TNSE 15]
- NeuriCam: Key-Frame Video Super-Resolution and Colorization for IoT Cameras [Mobicom 23]
- Collaborative Streaming and Super Resolution Adaptation for Mobile Immersive Videos [INFOCOM 23]
- YuZu: Neural-Enhanced Volumetric Video Streaming [NSDI 22]
- Efficient Video Compression via Content-Adaptive Super-Resolution [ICCV 21]
- Efficient Volumetric Video Streaming Through Super Resolution [HotMobile 21]
- SplitSR: An End-to-End Approach to Super-Resolution on Mobile Devices [IMWUT 21]
- Neural-Enhanced Live Streaming: Improving Live Video Ingest via Online Learning [Sigcomm 20] [LiveNas]
- NEMO: Enabling Neural-enhanced Video Streaming on Commodity Mobile Devices [Mobicom 20]
- Streaming 360-Degree Videos Using Super-Resolution [Infocom 20] [code]
- SR360: Boosting 360-Degree Video Streaming with Super-Resolution [Nossdav 20]
- Improving Quality of Experience by Adaptive Video Streaming with Super-Resolution [Infocom 20]
- Supremo: Cloud-Assisted Low-Latency Super-Resolution in Mobile Devices [TMC 20]
- MobiSR: Efficient On-Device Super-Resolution through Heterogeneous Mobile Processors [Mobicom 19]
- Dejavu: Enhancing Videoconferencing with Prior Knowledge [HotMobile 19]
- Bridging the Edge-Cloud Barrier for Real-time Advanced Vision Analytics [HotCloud 19]
- Neural Adaptive Content-aware Internet Video Delivery [OSDI 18] [NAS] [code]
- NEMO: Enabling Neural-enhanced Video Streaming on Commodity Mobile Devices [MobiCom'20]
- OnRL: Improving Mobile Video Telephony via Online Reinforcement Learning [MobiCom'20]
- LiveNAS - Neural-Enhanced Live Streaming: Improving Live Video Ingest via Online Learning [SIGCOMM'20]
- Jigsaw: Robust Live 4K Video Streaming [MobiCom'19]
- Learning to Coordinate Video Codec with Transport Protocol for Mobile Video Telephony [MobiCom'19]
- Vantage: optimizing video upload for time-shifted viewing of social live streams [SIGCOMM'19]
- Salsify: Low-Latency Network Video Through Tighter Integration Between a Video Codec and a Transport Protocol [NSDI'18]
- Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads [NSDI'17]
- POI360: Panoramic Mobile Video Telephony over LTE Cellular Networks [CoNEXT'17]
- Warehouse-Scale Video Acceleration: Co-design and Deployment in the Wild [ASPLOS'21]
- Deja View: Spatio-Temporal Compute Reuse for Energy-Efficient 360° VR Video Streaming [ISCA '20]
- Distilling the Essence of Raw Video to Reduce Memory Usage and Energy at Edge Devices [MICRO '19]
- Race-To-Sleep + Content Caching + Display Caching: A Recipe for Energy-efficient Video Streaming on Handhelds [MICRO '17]
- Grad: Learning for Overhead-aware Adaptive Video Streaming with Scalable Video Coding [ACM MM'20]
- LBP: Robust Rate Adaptation Algorithm for SVC Video Streaming [IEEE/ACM ToN'19]
- Layer-Assisted Adaptive Video Streaming [ACM NOSSDAV'18]
- Layered Coding vs. Multiple Descriptions for Video Streaming over Multiple Paths [ACM MM'03]
- Two-Layer Coding of Video Signals for VBR Networks [IEEE JSAC'1989]
- Multiple Description Coding
Back to Table of Contents
- Robust Saliency-Driven Quality Adaptation for Mobile 360-Degree Video Streaming [TMC 23]
- Energy-Efficient 360-Degree Video Streaming on Multicore-Based Mobile Devices [INFOCOM 23]
- OmniSense: Towards Edge-Assisted Online Analytics for 360-Degree Videos [INFOCOM 23]
- Sophon: Super-Resolution Enhanced 360° Video Streaming with Visual Saliency-aware Prefetch [MM 22]
- Personalized 360-Degree Video Streaming: A Meta-Learning Approach [MM 22]
- Improving Generalization for Neural Adaptive Video Streaming via Meta Reinforcement Learning [MM 22]
- SalientVR: Saliency-Driven Mobile 360-Degree Video Streaming with Gaze Information [Mobicom 22]
- Popularity-Aware 360-Degree Video Streaming [Infocom 21]
- Robust 360° Video Streaming via Non-Linear Sampling [Infocom 21]
- AdaP-360: User-Adaptive Area-of-Focus Projections for Bandwidth-Efficient 360-Degree Video Streaming [MM 20]
- QuRate: Power-Efficient Mobile Immersive Video Streaming [MMsys 20]
- Deja View: Spatio-Temporal Compute Reuse for Energy-Efficient 360° VR Video Streaming [ISCA 20]
- SR360: Boosting 360-Degree Video Streaming with Super-Resolution [NOSSDAV 20]
- Streaming 360-Degree Videos Using Super-Resolution [Infocom 20]
- Tile Rate Allocation for 360-Degree Tiled Adaptive Video Streaming [MM 20]
- EPASS360: QoE-aware 360-degree Video Streaming over Mobile Devices [TMC 20]
- DRL360: 360-degree Video Streaming with Deep Reinforcement Learning [Infocom 19]
- Pano: Optimizing 360° Video Streaming with a Better Understanding of Quality Perception [Sigcomm 19]
- Proactive Caching for Vehicular Multi-View 3D Video Streaming via Deep Reinforcement Learning [TWC 19]
- CLS: A Cross-user Learning based System for Improving QoE in 360-degree Video Adaptive Streaming [MM 18]
- Viewport-Driven Rate-Distortion Optimized 360° Video Streaming [ICC 18]
- 360-Degree Innovations for Panoramic Video Streaming [HotNets 18]
- 360ProbDASH: Improving QoE of 360 Video Streaming Using Tile-based HTTP Adaptive Streaming [MM 17]
- Adaptive 360-Degree Video Streaming using Scalable Video Coding [MM 17]
- SphericRTC: A System for Content-Adaptive Real-Time 360-Degree Video Communication [MM 20]
- An Analysis of Delay in Live 360° Video Streaming Systems [MM 20]
- Low-latency FoV-adaptive Coding and Streaming for Interactive 360° Video Streaming [MM 20]
- Flocking-based Live Streaming of 360-degree Video [MMsys 20]
- Mobile Streaming of Live 360-Degree Videos [TMM 20]
- MA360: Multi-Agent Deep Reinforcement Learning Based Live 360-Degree Video Streaming on Edge [ICME 20]
- A Measurement Study of YouTube 360° Live Video Streaming [NOSSDAV 19]
- Event-driven Stitching for Tile-based Live 360 Video Streaming [MMsys 19]
- RATS: Adaptive 360-degree Live Streaming [MMsys 19]
- LIME: Understanding Commercial 360° Live Video Streaming Services [MMSys 19]
- How to Evaluate Mobile 360° Video Streaming Systems? [HotMobile 20]
- Freedom: Fast Recovery Enhanced VR Delivery Over Mobile Networks [MobiSys 19]
- Tile-based Caching Optimization for 360° Videos [Mobihoc 19]
- A Two-Tier System for On-Demand Streaming of 360 Degree Video Over Dynamic Networks [TCSVT 19]
- Flare: Practical Viewport-Adaptive 360-Degree Video Streaming for Mobile Devices [Mobicom 18]
- Rubiks: Practical 360-Degree Streaming for Smartphones [MobiSys 18]
- Creating the Perfect Illusion: What will it take to Create Life-Like Virtual Reality Headsets? [HotMobile 18]
- POI360: Panoramic Mobile Video Telephony over LTE Cellular Networks [CoNEXT 17]
- VR is on the Edge: How to Deliver 360° Videos in Mobile Networks [VR/AR Network 17]
- VR/AR Immersive Communication: Caching, Edge Computing, and Transmission Trade-Offs [VR/AR Network 17]
- Personalized 360-Degree Video Streaming: A Meta-Learning Approach [MM 22]
- Subtitle-based Viewport Prediction for 360-degree Virtual Tourism Video
- Graph Learning Based Head Movement Prediction for Interactive 360 Video Streaming [TIP 21]
- Motion-Prediction-based Wireless Scheduling for Multi-User Panoramic Video Streaming [Infocom 21]
- PARIMA: Viewport Adaptive 360-Degree Video Streaming [WWW 21]
- LiveDeep: Online Viewport Prediction for Live Virtual Reality Streaming Using Lifelong Deep Learning [VR 20]
- Viewport Prediction for 360° Videos: A Clustering Approach [NOSSDAV 20]
- A Spherical Convolution Approach for Learning Long Term Viewport Prediction in 360 Immersive Video [AAAI 20]
- Sparkle: User-Aware Viewport Prediction in 360-degree Video Streaming [TMM 20]
- DGaze: CNN-Based Gaze Prediction in Dynamic Scenes [TVCG 20]
- Very Long Term Field of View Prediction for 360-degree Video Streaming [MIPR 19]
- LadderNet: Knowledge Transfer Based Viewpoint Prediction in 360° Video [ICASSP 19]
- Spherical Clustering of Users Navigating 360° Content [ICASSP 19]
- Viewport Prediction for Live 360-Degree Mobile Video Streaming Using User-Content Hybrid Motion Tracking [IMWUT 19]
- Your Attention is Unique: Detecting 360-Degree Video Saliency in Head-Mounted Display for Head Movement Prediction [MM 18]
- Gaze Prediction in Dynamic 360° Immersive Videos [CVPR 18]
- CUB360: Exploiting Cross-Users Behaviors for Viewport Prediction in 360 Video Adaptive Streaming [ICME 18]
- Predictive View Generation to Enable Mobile 360-degree and VR Experiences [VR/AR Network 18]
- Fixation Prediction for 360° Video Streaming to Head-Mounted Displays [NOSSDAV 17]
- Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach [TPAMI 15]
- A Taxonomy and Dataset for 360° Videos [MMSys 19]
- 360-degree Video Gaze Behaviour: A Ground-Truth Data Set and a Classification Algorithm for Eye Movements [MM 19]
- Gaze Prediction in Dynamic 360° Immersive Videos [CVPR 18]
- A Dataset of Head and Eye Movements for 360° Videos [MMSys 18]
- 360° Video Viewing Dataset in Head-Mounted Virtual Reality [MMSys 17]
- 360-Degree Video Head Movement Dataset [MMSys 17]
- A Dataset of Head and Eye Movements for 360 Degree Images [MMSys 17]
- A Dataset for Exploring User Behaviors in VR Spherical Video Streaming [MMSys 17]
Back to Table of Contents
- Habitus: Boosting Mobile Immersive Content Delivery through Full-body Pose Tracking and Multipath Networking [NSDI 24]
- Progressive Frame Patching for FoV-based Point Cloud Video Streaming [Arxiv 23]
- G-PCC++: Enhanced Geometry-based Point Cloud Compression [MM 23]
- Addressing Scalability for Real-time Multiuser Holo-portation: Introducing and Assessing a Multipoint Control Unit (MCU) for Volumetric Video [MM 23]
- Hermes: Leveraging Implicit Inter-Frame Correlation for Bandwidth-Efficient Mobile Volumetric Video Streaming [MM 23]
- Enabling Low Bit-Rate MPEG V-PCC-encoded Volumetric Video Streaming with Sub-sampling [MMSys 23]
- LiveVV: Human-Centered Live Volumetric Video Streaming System [Arxiv 23]
- Spatial Perceptual Quality Aware Adaptive Volumetric Video Streaming [Globecom 23]
- PatchVVC: A Real-time Compression Framework for Streaming Volumetric Videos [MMSys 23]
- Understanding User Behavior in Volumetric Video Watching: Dataset, Analysis and Prediction [MM 23]
- Mobile Volumetric Video Streaming System through Implicit Neural Representation [Sigcomm EMS 23]
- FarfetchFusion: Towards Fully Mobile Live 3D Telepresence Platform [MobiCom 23]
- MetaStream: Live Volumetric Content Capture, Creation, Delivery, and Rendering in Real Time [MobiCom 23]
- Immersive media technologies [Review 23]
- CaV3: Cache-assisted Viewport Adaptive Volumetric Video Streaming [VR 23]
- Volumetric video streaming: Current approaches and implementations [Review 22]
- YuZu: Neural-Enhanced Volumetric Video Streaming [NSDI 22]
- Vues: Practical Volumetric Video Streaming through Multiview Transcoding [Mobicom 22]
- Optimal Volumetric Video Streaming with Hybrid Saliency based Tiling [TMM 22]
- Dynamic Point Cloud Compression with Cross-Sectional Approach [Arxiv 22]
- A QoE Model in Point Cloud Video Streaming [Arxiv 22]
- FRAS: Federated Reinforcement Learning empowered Adaptive Point Cloud Video Streaming [Arxiv 22]
- From Capturing to Rendering: Volumetric Media Delivery with Six Degrees of Freedom [Review 22]
- Innovating Multi-user Volumetric Video Streaming through Cross-layer Design [HotNets 21]
- Efficient Volumetric Video Streaming Through Super Resolution [HotMobile 21]
- Point Cloud Video Streaming: Challenges and Solutions [IEEE Network 21]
- AITransfer: Progressive AI-powered Transmission for Real-Time Point Cloud Video Streaming
- GROOT: A Real-time Streaming System of High-Fidelity Volumetric Videos [Mobicom 20]
- ViVo: Visibility-Aware Mobile Volumetric Video Streaming [Mobicom 20]
- Towards Viewport-dependent 6DoF 360 Video Tiled Streaming for Virtual Reality Systems [MM 20]
- User Centered Adaptive Streaming of Dynamic Point Clouds with Low Complexity Tiling [MM 20]
- VVSec: Securing Volumetric Video Streaming via Benign Use of Adversarial Perturbation [MM 20]
- A Pipeline for Multiparty Volumetric Video Conferencing: Transmission of Point Clouds over Low Latency DASH [MMsys 20]
- Cloud Rendering-based Volumetric Video Streaming System for Mixed Reality Services [MMsys 20]
- Low-latency Cloud-based Volumetric Video Streaming Using Head Motion Prediction [NOSSDAV 20]
- Emerging MPEG Standards for Point Cloud Compression [TCSVT 19]
- Toward Practical Volumetric Video Streaming on Commodity Smartphones [HotMobile 19]
- Rate-Utility Optimized Streaming of Volumetric Media for Augmented Reality [arxiv 18]
- Design, Implementation, and Evaluation of a Point Cloud Codec for Tele-Immersive Video [TCSVT 17]
Volumetric video can be represented in several media, including point clouds, meshes, voxels, NeRF, light fields, and radiance fields. Virtual reality papers study how to render with low latency in an edge/cloud architecture; they often render small objects on the mobile device and the heavy background on the server.
NeRF-based Volumetric Video:
- NeRFPlayer: A Streamable Dynamic Scene Representation with Decomposed Neural Radiance Fields [Arxiv 22]
- Streamable Neural Fields [ECCV 22]
- Streaming Radiance Fields for 3D Video Synthesis [Nips 22]
6DoF VR Video
- Q-VR: System-Level Design for Future Mobile Collaborative Virtual Reality [ASPLOS 21]
- Coterie: Exploiting Frame Similarity to Enable High-Quality Multiplayer VR on Commodity Mobile [ASPLOS 20]
- Firefly: Untethered Multi-user VR for Commodity Mobile Devices [ATC 20]
- MUVR: Supporting Multi-User Mobile Virtual Reality with Resource Constrained Edge Cloud [Edge Computing 18]
- Cutting the Cord: Designing a High-quality Untethered VR System with Low Latency Remote Rendering [MobiSys 18]
- Furion: Engineering High-Quality Immersive Virtual Reality on Today’s Mobile Devices [Mobicom 17]
- CloudVR: Cloud Accelerated Interactive Mobile Virtual Reality [MM 17]
- FlashBack: Immersive Virtual Reality on Mobile Devices via Rendering Memoization [MobiSys 16]
Dynamic Point Cloud
- VOLVQAD: An MPEG V-PCC Volumetric Video Quality Assessment Dataset [MMSys 23]
- A Dynamic Point Cloud Dataset for Immersive Applications [MMSys 23]
- FSVVD: A Dataset of Full Scene Volumetric Video [MMSys 23]
- A 6DoF VR Dataset of 3D Virtual World for Privacy-Preserving Approach and Utility-Privacy Tradeoff [MMSys 23]
- CWIPC-SXR: Point Cloud dynamic human dataset for Social XR [MMSys 22]
- JPEG Pleno Database: 8i Voxelized Full Bodies [Dataset 17]
- JPEG Pleno Database: Microsoft Voxelized Upper Bodies - A Voxelized Point Cloud Dataset [Dataset]
- Sketchfab [Website]
Others:
- JPEG Pleno Database [Website]
- Rebuffering but not Suffering: Exploring Continuous-Time Quantitative QoE by User's Exiting Behaviors [INFOCOM 23]
- Adaptive Bitrate with User-level QoE Preference for Video Streaming [Infocom 22]
- VSiM: Improving QoE Fairness for Video Streaming in Mobile Environments [Infocom 22]
- 360ProbDASH: Improving QoE of 360 Video Streaming Using Tile-based HTTP Adaptive Streaming [MM 17]
- CLS: A Cross-user Learning based System for Improving QoE in 360-degree Video Adaptive Streaming [MM 18]
- 360° Mulsemedia: A Way to Improve Subjective QoE in 360° Videos [Infocom 22]
- Lumos: towards Better Video Streaming QoE through Accurate Throughput Prediction [Infocom 22]
- Coal Not Diamonds: How Memory Pressure Falters Mobile Video QoE [CoNext'22]
- XLINK: QoE-Driven Multi-Path QUIC Transport in Large-scale Video Services [SIGCOMM'21]
- End-to-End Transport for Video QoE Fairness [SIGCOMM 19]
- Impact of Device Performance on Mobile Internet QoE [IMC'18]
- Intelligent Edge-Assisted Crowdcast with Deep Reinforcement Learning for Personalized QoE [Infocom 19]
- Rldish: Edge-Assisted QoE Optimization of HTTP Live Streaming with Reinforcement Learning [Infocom 20]
Back to Table of Contents
- RECL: Responsive Resource-Efficient Continuous Learning for Video Analytics [NSDI 23]
- Boggart: Towards General-Purpose Acceleration of Retrospective Video Analytics [NSDI 23]
- Gemel: Model Merging for Memory-Efficient, Real-Time Video Analytics at the Edge [NSDI 23]
- SEPE Dataset: K Video Sequences and Images for Analysis and Development [MMSys 23]
- Minimizing packet retransmission for real-time video analytics [SoCC 23]
- Crowd^2: Multi-agent Bandit-based Dispatch for Video Analytics upon Crowdsourcing [INFOCOM 23]
- Owl: A Pre- and Post-processing Framework for Video Analytics in Low-light Surroundings [INFOCOM 23]
- AccMPEG: Optimizing Video Encoding for Accurate Video Analytics [MLSys 22]
- DAO: Dynamic Adaptive Offloading for Video Analytics [MM'22]
- Ekya: Continuous Learning of Video Analytics Models on Edge Compute Servers [NSDI 22]
- Privid: Practical, Privacy-Preserving Video Analytics Queries [NSDI 22]
- Understanding the Potential of Server-Driven Edge Video Analytics [HotMobile 22]
- Video Analytics with Zero-streaming Cameras [ATC 21]
- Elf: Accelerate High-resolution Mobile Deep Vision with Content-aware Parallel Offloading [ATC 21]
- Enabling Edge-Cloud Video Analytics for Robotic Applications [TCC 22]
- Enabling Edge-Cloud Video Analytics for Robotic Applications [INFOCOM'21]
- Real-Time Deep Video Analytics on Mobile Devices [MobiHoc'21]
- Soudain: Online Adaptive Profile Configuration for Real-time Video Analytics [IWQoS'21]
- CrossRoI: Cross-camera Region of Interest Optimization for Efficient Real Time Video Analytics at Scale [MMSys'21]
- Vision Paper: Towards Software-Defined Video Analytics with Cross-Camera Collaboration [SenSys'21]
- Server-Driven Video Streaming for Deep Learning Inference [SIGCOMM'20]
- Distream: scaling live video analytics with workload-adaptive distributed edge intelligence [Sensys'20]
- Reducto: On-Camera Filtering for Resource-Efficient Real-Time Video Analytics [SIGCOMM'20]
- Joint Configuration Adaptation and Bandwidth Allocation for Edge-based Real-time Video Analytics [Infocom'20]
- Scaling Video Analytics on Constrained Edge Nodes [SysML'19]
- AWStream: adaptive wide-area streaming analytics [SIGCOMM'18]
- Chameleon: Scalable Adaptation of Video Analytics [SIGCOMM'18]
- Focus: Querying Large Video Datasets with Low Latency and Low Cost [OSDI'18]
- Live Video Analytics at Scale with Approximation and Delay-Tolerance [NSDI'17]
- Glimpse: Continuous, Real-Time Object Recognition on Mobile Devices [Sensys'15]
- The Design and Implementation of a Wireless Video Surveillance System [Mobicom'15]
- DeepMix: Mobility-aware, Lightweight, and Hybrid 3D Object Detection for Headsets [Mobisys'22]
- Hybrid Mobile Vision for Emerging Applications [HotMobile'22]
This part is also known as neural video compression. This repo is currently the most comprehensive list of papers about image/video coding with deep learning.
- Structure-Preserving Motion Estimation for Learned Video Compression [MM'22]
- Learning-Based Video Coding with Joint Deep Compression and Enhancement [MM'22]
- Efficient Video Compression via Content-Adaptive Super-Resolution [ICCV'21] [Code]
- Online-trained Upsampler for Deep Low Complexity Video Compression [ICCV'21]
- ELF-VC: Efficient Learned Flexible-Rate Video Coding [arxiv'21]
- Learning for Video Compression with Hierarchical Quality and Recurrent Enhancement [CVPR'20]
- Learned Video Compression [ICCV'19]
- DVC: An End-to-end Deep Video Compression Framework [CVPR'19]
- Deep Learning-Based Video Coding: A Review and A Case Study [arxiv'19]
- Video Compression through Image Interpolation [ECCV'18]
A summary of representative works for interested readers:
- End-to-end Optimized Image Compression (end-to-end VAE)
- Hyperprior (models p(y,z|x); introduces a distribution over the prior's hyperparameters)
- Joint Autoregressive and Hierarchical Priors for Learned Image Compression
- Integer Networks for Data Compression with Latent-Variable Models
- Channel-wise Autoregressive Entropy Models for Learned Image Compression (different slices model slice dependence; gg20c)
- Learned Image Compression with Discretized Gaussian Mixture Likelihoods and Attention Modules (GMM entropy model + attention; cheng2020)
- Context-adaptive Entropy Model for End-to-end Optimized Image Compression (bit-consuming and bit-free entropy models)
- Learned Variable-Rate Multi-Frequency Image Compression using Modulated Generalized Octave Convolution (lambda sigmoid)
- Variable Rate Deep Image Compression With a Conditional Autoencoder (uses lambda to fine-tune model parameters; conditional autoencoder)
- You Only Train Once: Loss-Conditional Training of Deep Networks (conditional training)
- A Hybrid Image Codec with Learned Residuals (local attention, octave convolution, residual compression)
- Learned Image Compression with Residual Coding
- End-to-End Learned ROI Image Compression (encoder + importance map)
- Learning Convolutional Networks for Content-weighted Image Compression (importance map + binary feature map)
- Conditional Probability Models for Deep Image Compression
- Generative Adversarial Networks for Extreme Learned Image Compression
- Towards Conceptual Compression (convolutional hyperprior)
- Learning to Inpaint for Image Compression (inpainting + residual compression)
- Deep Generative Models for Distribution-Preserving Lossy Compression (GAN)
- Revisiting Video Saliency: A Large-scale Benchmark and a New Model [CVPR'18]
- A semiautomatic saliency model and its application to video compression [ICCP'17]
- AccuMO: Accuracy-Centric Multitask Offloading in Edge-Assisted Mobile Augmented Reality [MobiCom'23]
- MetaStream: Live Volumetric Content Capture, Creation, Delivery, and Rendering in Real Time [MobiCom'23]
- Unique Identification of 50,000+ Virtual Reality Users from Head & Hand Motion Data [Arxiv 23]
- YouTube-8M: A Large-Scale Video Classification Benchmark [arxiv'16]
- Beyond Short Snippets: Deep Networks for Video Classification [CVPR'15]
- Large-scale Video Classification with Convolutional Neural Networks [CVPR'14]
Back to Table of Contents
The following tools and libraries are useful to build your ideas on.
- FFmpeg: A multimedia library with a collection of diverse video codecs, filters, and video streaming capabilities.
- GPAC: A multimedia library with decoding, rendering, and display support. It also supports 360-degree video delivery. It comes with MP4Box, which packages video into DASH segments, and MP4Client, a video player with adaptive video streaming support.
- x265: Open-source implementation of the H.265 video codec.
- OBS Studio: Open-source broadcasting software. It is useful for streaming live video to platforms such as Facebook and Periscope.
- SVT Encoders: Software (multithreaded CPU) implementations of HEVC, VP9, and AV1 encoders.
- Saliency-aware Video Codec: x264 implementation of saliency-aware video compression.
- SHVC: Layered coding - scalable extensions for H.265/HEVC
- SVC: Layered coding - scalable extensions for H.264/AVC
- VVC: Reference implementation of H.266/VVC
- NeRFstudio: A collaboration friendly studio for NeRFs