Metaverse: A dilemma between comfort, immersion, and value (part 1)

The mechanics behind Metaverse: displays first or humans first?

Article by BRELYON
Feb 14, 2022

Virtual worlds are coming. Billions of dollars have been invested, and this is just the beginning of what is now referred to as the metaverse: an immersive construct of the internet in which you can comfortably live, work, play, and generally be present. Regardless of how you define the metaverse, there is no question that we are heading full-speed into uncharted territory of human-computer experience, in which computing is immersive, ubiquitous, and lifelike. As virtual experiences and the formation of the metaverse gain momentum, we are expected to spend more and more time in them, which comes down to spending more time with some type of screen or display interface. But because we are already at our limit, with an average screen time of 7 hours and 11 minutes in the United States [1], we undoubtedly need to replace existing media with new, more immersive and user-friendly ones. That new medium or display technology either has to be stickier than our cell phones, or it has to drastically increase our visual bandwidth and functionality for experiencing the digital world so that we can see more, do more, and essentially get more value out of these virtual worlds. To understand where we are going, it’s essential to study where we are, what the trends are, and what the roadblocks have been.

In recent years, flat displays have increased in resolution, image quality, and size. Research shows that larger displays with higher refresh rates have a direct impact on productivity and immersion [2–8]. Although a larger display increases productivity and reduces visual fatigue (up to a certain limit), it comes at the cost of desk real estate and ties the experience to a desk or a couch. Small displays (15” and below), on the other hand, offer better portability and access, and because of that level of access, repeated and addictive use is more common. In fact, statistics show that mobile screen time surpassed TV screen time in 2018 and now stands at almost 4 hours every day [9]. For three decades, increasing internet speed has gone hand in hand with better, higher-bandwidth displays, which took platforms like YouTube, Netflix, and Twitch to new heights. However, since 2016, with the introduction of 8K, the flat display industry seems to have reached a major point of saturation: scientifically speaking, the resolution of the human retina [10]. Beyond this resolution, your eyes simply can’t discern or appreciate more pixels per degree in a given field of view, and even adding more fidelity and contrast (dynamic range) has diminishing returns in the richness of the experience. This is like adding ultrasonic frequencies to your headphones’ audio range: you can’t hear them, so their presence adds no value to your listening experience.
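To make the retina-resolution argument concrete, the short Python sketch below estimates the average pixels per degree of a flat panel from its pixel count, physical width, and viewing distance. The function names, the 65-inch 16:9 panel, and the 2.5 m viewing distance are illustrative assumptions, not figures from the sources cited here. The 60 pixels-per-degree reference is the commonly quoted value for 20/20 vision (about one arcminute per pixel) and should be read as a rule of thumb rather than a hard limit.

```python
import math

# Commonly quoted rule of thumb for 20/20 vision: about 1 arcminute per pixel.
# This is an assumption for illustration, not a hard physiological limit.
ACUITY_PPD = 60.0


def screen_width_m(diagonal_in, aspect_w=16, aspect_h=9):
    """Physical width in meters of a panel with the given diagonal (inches)."""
    diagonal_m = diagonal_in * 0.0254
    return diagonal_m * aspect_w / math.hypot(aspect_w, aspect_h)


def pixels_per_degree(horizontal_pixels, width_m, distance_m):
    """Average horizontal pixels per degree for a viewer centered on the panel."""
    fov_deg = 2 * math.degrees(math.atan(width_m / (2 * distance_m)))
    return horizontal_pixels / fov_deg


if __name__ == "__main__":
    width = screen_width_m(65)  # hypothetical 65-inch 16:9 TV
    for name, pixels in [("4K", 3840), ("8K", 7680)]:
        ppd = pixels_per_degree(pixels, width, 2.5)  # assumed 2.5 m couch distance
        print(f"{name}: {ppd:.0f} px/deg (reference {ACUITY_PPD:.0f})")
```

Under these assumptions both panels already land above the 60 pixels-per-degree reference, so doubling the pixel count only doubles a number the eye could not fully use in the first place. Sitting closer or using a larger panel lowers the figure, which is why the saturation point depends on the viewing setup.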

Why has this saturation not been overcome? Well, the panel manufacturing industry has been dominated by a few Asian players: Samsung, LG, BOE, AUO, JDI, and Innolux. These mega-manufacturers are extremely efficient at putting tens of millions of panels into the world every year, and because of that, there is not much room for risky experimentation, only incremental, predictable enhancements.

Samsung’s new site in India is expected to increase production capacity from 63 to 93 million units/year [11].

LG alone sells around 6 million OLED TVs per year [12].

On the other side of the world, however, a display renaissance is happening: headset, light field display, and holographic display startups are all popping up to disrupt the human-computer experience.

Unfortunately, the human factor has been a major roadblock, and many big players like Google and Microsoft have now realized that this is not the same as the cell phone revolution in the 1990s and 2000s. Creating a better virtual experience is not as easy as just jamming more pixels into your field of view; optical components do not follow Moore’s law, and we humans are extremely sensitive to our comfort [13]. AR and VR headsets enhance immersion by increasing the field of view and combining it with feedback like head tracking. However, the average session time for VR is only 20 minutes, even for VR enthusiasts; users tend to lose spatial awareness after 30 minutes, and some users report immediate discomfort and disorientation using such devices [14].

So, what exactly is comfort when experiencing a digital medium, and what is immersion? Can we find the building blocks, the mechanics, of these two metrics? If only one modality of display (e.g., flat displays, headsets, or cell phones) is considered, then these are very simple to define. Obviously, flat displays with more pixels, better contrast, and a larger image size are more immersive; likewise, in a headset format, a wider field of view and more accurate depth representation are better. But such within-format metrics are not sufficient for comparing all the possible formats of delivering immersion, especially new formats that are built to replace existing ones.

We must reevaluate our metrics and even reformulate our questions. The more important question is: when will a new display format replace an existing one? When will headsets replace cell phones? When will holographic displays, autostereoscopic displays, or headset-less virtual displays like Brelyon Ultra Reality [15] replace our monitors or TVs? How about other display modalities we haven’t thought of yet? We need to be able to compare different formats, and to do so we need to put human comfort and immersion at the center of the comparison. This is certainly a capstone question in crafting a lasting metaverse experience. Defining a comfort and immersiveness score across different display platforms is challenging because of the different approaches to optical design, the different ways of interacting with the user, and the different types of content being shown; there are just too many variables. A simple story, when told well, can be extremely immersive even without a screen, and the best IMAX 3D experience might still not be engaging if the content is dull and unbelievable. Both immersion and comfort can therefore be extremely hard to pinpoint. To obtain meaningful results and metrics, we need to limit the scope. In other words, to compare apples and oranges, you have to look at their common parameters, like DNA, sugar content, or calorie content! Here, that means narrowing the broad ideas of comfort and immersion down to visual and ergonomic comfort and visual immersion.

Most parameters in evaluating visual comfort and immersiveness involve the performance of a system in projecting certain spatial frequencies (e.g., black and white line pairs) or patterns at different depths, the mitigation of the accommodation-vergence mismatch, and the presence of distortions and aberrations [16]. Moreover, both comfort and immersiveness also depend on the content being projected [17]. The type of content has a direct impact on how the eyes move and the effort they make in focusing on different objects appearing in the image. Watching an action movie does not have the same effect on our eyes as watching a pleasant stationary landscape. To make things more complex, some content might be better experienced on a specific type of display. For example, reading text might be much more comfortable on the flat screen of an ebook reader than on a curved monitor. On the other hand, an engulfing environment is surely better represented by a curved screen than by an ebook reader.
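As a rough illustration of one of these parameters, the accommodation-vergence mismatch is often expressed in diopters as the difference between the inverse of the distance at which the eyes converge and the inverse of the distance at which they must focus. The sketch below is a minimal example under that assumption; the function name and the example distances (a display whose optical focal plane sits at 2 m while the content appears at 0.5 m) are hypothetical and not taken from the cited work.

```python
def vergence_accommodation_mismatch(vergence_distance_m, focal_distance_m):
    """Mismatch in diopters (1/m) between where the eyes converge and where they focus."""
    if vergence_distance_m <= 0 or focal_distance_m <= 0:
        raise ValueError("distances must be positive, in meters")
    return abs(1.0 / vergence_distance_m - 1.0 / focal_distance_m)


if __name__ == "__main__":
    # Hypothetical stereoscopic display: optics focus at 2 m, object rendered at 0.5 m.
    print(vergence_accommodation_mismatch(0.5, 2.0))  # 1.5 diopters
    # A flat monitor viewed at 0.6 m shows content at its own surface: no mismatch.
    print(vergence_accommodation_mismatch(0.6, 0.6))  # 0.0 diopters
```

Working in diopters rather than meters reflects that the same gap in distance matters far more for near content than for far content, which is one reason close-up virtual objects tend to be the harder case.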

This pull toward content (addictive access) is very important because it brings more people into an experience or medium and allows community building, social interaction, platforms to exchange value and, ultimately, a metaverse. So the stickiness of the content and how frequently it can be accessed can’t be overlooked. After all, the biggest metaverse economies today are built around addictive gaming experiences like Roblox, Fortnite, and Minecraft. None of these is considered a high-end graphics or super-realistic experience, but they all have addictive mechanics and give you a great sense of presence using contextual cues that are digestible even by kids.

So, the answer to the question of “humans first, or displays first” is: humans first; metaverses will be built around where people and people’s attention are. If the internet was built on connection, the metaverse will be built on presence.

We do not intend to go further into how to engineer content to make it stickier. Instead, we want to look at some physical and psychophysical parameters that represent a sense of presence in a very fundamental way, regardless of the context or meaning of the content shown. In the next part, we will dive into a global human perception model to see how we might be able to craft visual experiences more effectively.

1. Moody, R. Screen Time Statistics: Average Screen Time in U.S. vs. the rest of the world. comparitech https://www.comparitech.com/tv-streaming/screen-time-statistics/ (2021).
2. Olsen, D. R. et al. Comparing usage of a large high-resolution display to single or dual desktop displays for daily work. Proc Sigchi Conf Hum Factors Comput Syst 1005–1014 (2009) doi:10.1145/1518701.1518855.
3. Stegman, A., Ling, C. & Shehab, R. Human Interface and the Management of Information. Interacting with Information, Symposium on Human Interface 2011, Held as Part of HCI International 2011, Orlando, FL, USA, July 9–14, 2011, Proceedings, Part II. Lect Notes Comput Sc 84–93 (2011) doi:10.1007/978-3-642-21669-5_11.
4. Kang, Y. & Stasko, J. Lightweight Task/Application Performance using Single versus Multiple Monitors: A Comparative Study. in Graphics Interface (2008). doi:10.1145/1375714.1375718.
5. Conlon, C. Do Larger Monitors Equal Greater Productivity? https://ciaraconlon.com/2011/do-larger-monitors-equal-greater-productivity/ (2011).
6. Anderson, J. A., Hill, J., Parkin, P. & Garrison, A. Productivity, Screens, and Aspects Ratios. https://collections.lib.utah.edu/ark:/87278/s69w3dmf (2007).
7. Almagro, M. Key Trends in Immersive Display Technologies and Experiences. Sound & Communications (2020).
8. Foster, A. IBC2019 Technical Papers: Immersive. Trends (2019).
9. Zalani, R. Screen Time Statistics 2021: Your Smartphone is Hurting You. Elite Content Marketer https://elitecontentmarketer.com/screen-time-statistics/ (2021).
10. Morrison, G. 8K TV explained, and why you definitely don’t need to buy one. cnet https://www.cnet.com/tech/home-entertainment/8k-tv-explained-and-why-you-definitely-dont-need-to-buy-one/ (2021).
11. Won, L. S. Samsung to shift some smartphone production at Vietnam to India. TheElec http://www.thelec.net/news/articleView.html?idxno=3591#:~:text=Samsung%20is%20aiming%20to%20reduce,units%20per%20year%20by%202022 (2022).
12. Park, S.-Y. LG Display OLED TV panel sales top 20 mn units on strong demand. The Korea Economic Daily https://www.kedglobal.com/newsView/ked202112220001#:~:text=Accumulated%20sales%20of%20OLED%20TV,units%20so%20far%20this%20year (2021).
13. Heshmat, B. Is Augmented reality doomed? A look into the future of AR. (2019).
14. Vailshery, L. S. Average session time of virtual reality (VR) users in the United States between the 2nd and 3rd quarter of 2019, by user type. Statista https://www.statista.com/statistics/1098976/average-session-time-of-vr-users-by-user-type/ (2021).
15. Heshmat, B. The first no-headset virtual monitor. (2020).
16. Aghasi, A., Heshmat, B., Wei, L., Tian, M. & Cholewiak, S. A. Optimal allocation of quantized human eye depth perception for multi-focal 3D display design. Opt Express 29, 9878 (2021).
17. Zhang, C., Perkis, A. & Arndt, S. Spatial immersion versus emotional immersion, which is more immersive? 2017 Ninth Int Conf Qual Multimedia Exp Qomex 1–6 (2017) doi:10.1109/qomex.2017.7965655.

BRELYON

We are a team of scientists and entrepreneurs focused on the future of human-computer evolution. Our expertise is in display technology and computer science.