WO2016139532A1 - Method and apparatus for transmitting a video


Info

Publication number
WO2016139532A1
Authority: WIPO (PCT)
Prior art keywords: eye, video frames, quality, tracking information, predicted area
Application number: PCT/IB2016/000262
Other languages: French (fr)
Inventor: Yu Chen
Original Assignee: Alcatel Lucent
Priority claimed from CN201510207760.4A (CN106162363B)
Application filed by Alcatel Lucent
Publication of WO2016139532A1

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/41 - Structure of client; Structure of client peripherals
    • H04N 21/422 - Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N 21/4223 - Cameras
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 - Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/234 - Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N 21/2343 - Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N 21/234354 - Reformatting operations of video signals by altering signal-to-noise ratio parameters, e.g. requantization
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations; Client middleware
    • H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N 21/44008 - Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/60 - Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client
    • H04N 21/65 - Transmission of management data between client and server
    • H04N 21/658 - Transmission by the client directed to the server
    • H04N 21/6587 - Control parameters, e.g. trick play commands, viewpoint selection


Abstract

The invention provides a method, for use in a communication device, of transmitting a video, the method comprising the steps of: receiving encoded video frames from a video server; decoding the received video frames; buffering the decoded video frames; receiving eye tracking information from a user device; determining a first predicted area of eye fixation based on the eye tracking information; re-encoding buffered video frames with a first quality in the first predicted area and re-encoding buffered video frames with a fourth quality outside the predicted area, wherein the first quality is better than the fourth quality; and sending the re-encoded video frames to the user device.

Description

METHOD AND APPARATUS FOR TRANSMITTING A VIDEO
Field of the Invention
The present disclosure relates generally to communication systems, and more particularly, to video transmissions based on eye tracking techniques.
Background of the Invention
The study of the human visual system has a long history. One of the most important findings is the structure of the retina. The retina covers the back surface of the eyeball, connecting to the ciliary body, and hosts a huge number of photoreceptors: rod and cone cells. Cone cells are smaller than rod cells but more important for color vision. Most of the cone cells are located in the macula, which is near the blind spot where blood vessels and nerves pass into the brain. There are around 90 million rod cells, compared to 4.5 million cone cells. Rods are responsible for night (scotopic) vision and cannot differentiate colors. In bright light, the color-sensitive cones are predominant, so we see a colorful world. In a bright daylight environment, rods may saturate, leaving only the cones working. In an indoor environment, both rods and cones contribute to vision. It also takes some time for the cells to adapt to the environment: rods take about 30 minutes to become fully adapted to dim light, slower than cones. Hence, when discussing video, only cone cells are relevant.
The density of the photoreceptors in the retina varies greatly. For rods, the peak appears at about 20 degrees from the center and the density decreases toward the edge. In contrast, most of the cones are located in the center, in a very small area called the fovea, around 1.5 mm wide; this is the core of human vision, yet it is rod-free. The distribution of rods and cones is shown in Figure 1. Human color vision is limited to this area of a few degrees. Hence, humans only have good eyesight at the gaze point. Few people notice this because the eyeball moves and our brain pieces the small patches of good vision together.
One may consider that if a system only transmits where the eye gazes, using an eye tracking technique, huge gains may be achievable. However, the eye moves at a very high speed, about 400 degrees per second. This requires the system to perform with extremely low latency. For example, when the eye moves from one corner of a display to the other, spanning 20 degrees, it takes only 50 ms. Hence, the eye tracking information reporting and the video transmission switching should be finished within 50 ms. This poses a big challenge to the network, which may only be supported by 5G. 5G is the network expected to be available by 2020, characterized by high capacity and extremely low transmission delay. New techniques may be used to support these goals, including large-scale antenna arrays, new frame structures, new scheduling mechanisms, etc. Moreover, fundamental changes to the network architecture are also required to support end-to-end eye tracking based video transmission.
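As a quick sanity check on this timing requirement, using the figures just stated:

$$t = \frac{20^{\circ}}{400^{\circ}/\mathrm{s}} = 0.05\ \mathrm{s} = 50\ \mathrm{ms}.$$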
Object and Summary of the Invention
There are some studies linking human visual perception and video transmission.
In "Robert-Inacio, F. ; Scaramuzzino, R. ; Stainer, Q. ; Kussener-Combier, E., Biologically inspired image sampling for electronic eye, Biomedical Circuits and Systems Conference (BioCAS), 2010, pages: 246-249", an image sampling scheme is proposed for electronic eye. The sampling is made based on a hexagon pavement where the area of each hexagon increases with the distance to the focus. The human gazing behavior is studied in "Laura Muir Iain, Iain Richardson, Steven Leaper, Gaze Tracking and Its Application to Video Coding for Sign Language, Picture Coding Symposium 2003, pages 32-325" to discover which part of a picture is likely to be gazed by human. Mohsen M. proposed a real-time eye tracking based video coding system in "Mohsen M. Farid, Fatih Kurugollu, Fionn D. Murtaghk, Adaptive wavelet eye-gaze-based video compression, Proc. SPIE 4877, Opto-Ireland 2002: Optical Metrology, Imaging, and Machine Vision, 255 (March 17, 2003)". In this system, the video frame is sub-blocked and encoded according to the eye tracker feedback. This scheme is realized between computers in laboratory, where the latency might not be the constraint. At the moment, the model of human visual perception is only studied in "Robert-Inacio, F. ; Scaramuzzino, R. ; Stainer, Q. ; Kussener-Combier, E., Biologically inspired image sampling for electronic eye, Biomedical Circuits and Systems Conference (BioCAS), 2010, pages: 246-249" but without the consideration of eye movement behavior. The next aspect is latency, which is the main challenge of video transmission in a mobile network and is not studied yet.
One potential solution for a mobile network like 5G could be that the eye tracker feeds back the eye gaze information to the video server, and the server encodes the video accordingly based on the gaze information. However, this suffers from too long a delay. The general network architecture is depicted in Figure 2, where the video server distributes the video to the base stations. Based on such an architecture, the total delay is evaluated in Table I.
The end-to-end delay is summarized in Table I, totaling 106 ms. The response delay from saccadic movement to fixation is around 30 ms, and the maximum delay should be less than 50 ms. Technical advances may shorten these delay components. For example, by using a 100 Hz eye tracker, the measurement delay could be shortened to 10 ms. However, the transmission delay is still not acceptable. There is a need to optimize the main delay component, the transmission between the base station and the video server.
Table I Delay analysis
Based on the above concerns, the purpose of this invention is to provide an eye tracking based video transmission system which can reduce system delay and save resources.
In one aspect of the invention, there is provided a method, for use in a communication device, of transmitting a video, the method comprising the steps of: receiving encoded video frames from a video server; decoding the received video frames; buffering the decoded video frames; receiving eye tracking information from a user device; determining a first predicted area of eye fixation based on the eye tracking information; re-encoding buffered video frames with a first quality in the first predicted area and re-encoding buffered video frames with a fourth quality outside the predicted area, wherein the first quality is better than the fourth quality; sending the re-encoded video frames to the user device.
In an example, the method may further comprise the steps of: determining a second predicted area of eye fixation based on the buffered video frames and the eye tracking information; re-encoding buffered video frames with a second quality in the second predicted area, wherein the first quality is better than the second quality and the second quality is better than the fourth quality.
In an example, the method may further comprise the steps of: determining a predicted area of eye saccadic route based on the buffered video frames and the eye tracking information; re-encoding buffered video frames with a third quality in the predicted area of eye saccadic route, wherein the first quality is better than the third quality and the third quality is better than the fourth quality.
In another aspect of the invention, there is provided a method, for use in a communication device, of transmitting a video, the method comprising the steps of:
- receiving encoded video frames from a video server;
- decoding the received video frames;
- buffering the decoded video frames;
- receiving eye tracking information from a user device;
- determining an eye status based on the eye tracking information;
- if the eye status is in a fixation status, then re-encoding the buffered video frames using a resolution $y$:

$$y = \frac{1 - e^{-t_l/33.3}}{a_1\,(g(x) - \bar{x})^2 + a_2\,(g(x) - \bar{x}) + a_3}, \qquad \bar{x} = \arg\big(g(x) > \bar{x} + s\big)$$

wherein $x$ is the position of a point on a screen of the user device, $g(x)$ is a distance from the point to a center of a gaze point, $t_l$ is a system delay, $s$ is a diameter of the gaze point, $\bar{x}$ is derived from the equation $y = \frac{1}{a_1 \bar{x}^2 + a_2 \bar{x} + a_3}$, and $\arg$ is a function to calculate a suitable $\bar{x}$ according to an input formula;

- if the eye status is in a saccadic status, then re-encoding the buffered video frames using a resolution $y$:

$$y = \max\!\left(\frac{1 - e^{-t/33.3}}{a_1\,\max(f(x), s)^2 + a_2\,\max(f(x), s) + a_3},\ k_i y_i\right), \qquad t = \frac{\Delta x}{v}$$

wherein $\Delta x$ is an eye tracker resolution, $v$ is a velocity of eye movement, $x$ is the position of a point on a screen of the user device, $f(x)$ is a minimum distance from the point to an estimated moving trajectory of the eye,

$$y_i = \frac{1 - e^{-t_i/33.3}}{a_1\,\big(\max(g_i(x), \bar{x})\big)^2 + a_2\,\max(g_i(x), \bar{x}) + a_3},$$

where $g_i(x)$ is a distance from the point to a center of a predicted fixation area $i$ and $k_i \le 1$ is a parameter to control the resolution of the predicted area $i$;

- sending the re-encoded video frames to the user device.
Brief Description of the Drawings
The invention is explained in further detail, and by way of example, with reference to the accompanying drawings, wherein:
Figure 1 shows a schematic view of rod and cone cell distribution;
Figure 2 shows a schematic view of network architecture;
Figure 3 shows a schematic view of equally readable chart;
Figure 4 shows a schematic view of visual acuity;
Figure 5 shows a schematic view of an eye tracking based video transmission system according to one embodiment of the invention;
Figure 6 shows a flow chart of a method of transmitting a video according to one embodiment of the invention;
Figure 7 shows a schematic view of video information adapted to the eye.
Throughout the above drawings, like reference numerals will be understood to refer to like, similar or corresponding features or functions.
Detailed Description
First, the human visual system modeling is described below.
The human visual acuity is related to the density of cone cells, and this is also the baseline in most studies. However, other factors affect the visual acuity as well, e.g. the ganglion cells. Multiple photoreceptor cells connect to one ganglion cell, and usually more photoreceptor cells share one connection in the peripheral retina. Hence, the most accurate model of human acuity is still obtained by experiment. "Anstis SM. A chart demonstrating variations in acuity with retinal position (Letter). Vision Res. 1974;14:589-592" ("Anstis" for short) gives an acuity threshold model and an interesting equally readable chart. As shown in Figure 3, when the eye fixates on the center, all the letters should be equally readable even though the letters in the outer ring are much bigger than those in the center. This reveals, conversely, that the visual acuity in the central retina is much better than in the peripheral area.
The recognition threshold model given in "Anstis" is:

$$y = 0.046\,x - 0.031 \tag{1}$$

where the unit is degrees and $x$ represents the eccentricity from the fovea center. It is indicated in "Anstis" that the small negative intercept is probably caused by experimental error, so we use general coefficients to replace the numbers. Next, the area recognition threshold should vary with the square of the eccentricity. The visual acuity is defined as the inverse of the threshold:

$$y = \frac{1}{a_1 x^2 + a_2 x + a_3} \tag{2}$$
When we watch a video, a picture or anything else, the eye repeats the two major types of movement it performs: saccadic movement and fixation. When the eye starts to gaze at a new element in a picture, it needs some time to prepare and also to accumulate sufficient light to stimulate the neurons. This has special meaning for video, because video frames change and the eye needs time to recognize them; if there is no change in the picture, there is no information. Hence, for the eye to recognize the picture, two conditions are needed: time and size.
In the study of congenital nystagmus, an exponential relationship between visual acuity and time is found in "Mario Cesarelli, Paolo Bifulco, Luciano Loffredo, Marcello Bracale, Relationship between visual acuity and eye position variability during foveations in congenital nystagmus, Documenta Ophthalmologica, July 2000, Volume 101, Issue 1, pp 59-72" ("Mario Cesarelli" for short). Hence, taking this into account, the visual acuity model is:

$$y = \frac{1 - e^{-t/33.3}}{a_1 x^2 + a_2 x + a_3} \tag{3}$$

where $t$ is the foveation time and 33.3 is the coefficient defined in "Mario Cesarelli", in units of milliseconds. With $a_1 = 0.046$ and corresponding values for $a_2$ and $a_3$, for a retina eccentricity from 2 to 15 degrees and a duration from 0 to 100 ms, the visual acuity is illustrated in Figure 4 based on equation (3). It can be seen that the visual acuity increases distinctly with time for the first tens of milliseconds. However, the main factor affecting the acuity is retina eccentricity. The visual acuity drops quickly to the floor beyond 8 degrees.
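For illustration, the acuity model of equations (1) to (3) can be evaluated with the following minimal Python sketch. Only $a_1 = 0.046$ is given above, so the values chosen for $a_2$ and $a_3$ are placeholders, not coefficients from the patent.

```python
import math

# Coefficients of the acuity denominator a1*x^2 + a2*x + a3.
# Only a1 = 0.046 is stated in the text; a2 and a3 here are
# illustrative placeholders, not values from the patent.
A1, A2, A3 = 0.046, 0.0, 1.0

def recognition_threshold(x_deg):
    """Equation (1): recognition threshold at eccentricity x (degrees)."""
    return 0.046 * x_deg - 0.031

def visual_acuity(x_deg, t_ms):
    """Equation (3): acuity at eccentricity x after foveation time t (ms).
    The numerator models the exponential build-up of acuity with
    foveation time; equation (2) is the t -> infinity special case."""
    build_up = 1.0 - math.exp(-t_ms / 33.3)
    return build_up / (A1 * x_deg ** 2 + A2 * x_deg + A3)

# Qualitative behaviour described around Figure 4: acuity rises during
# the first tens of milliseconds and collapses beyond ~8 degrees.
print("threshold at 10 degrees:", recognition_threshold(10.0))
for x in (0, 2, 8, 15):
    print(x, [round(visual_acuity(x, t), 3) for t in (10, 50, 100)])
```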
The eye movement behavior can be modeled by a piecewise function corresponding to fixation and saccadic movement. The visual acuity is very low during saccadic movement, as the eye turns at a speed as high as 400 degrees per second. Hence, only the fixation movement needs to be considered. It is straightforward to model the movement by a Markov transition model, which could be as simple as two states. The saccadic movement is quite stereotyped and can be modeled by three steps: initial preparation, fast open-loop movement, and final adjustment, where the second step depends on the distance between the eye and the target. The duration of a saccadic movement is therefore:

$$D(r) = \delta_1 + S(r) + \delta_2 \tag{4}$$

where $r$ is the screen size of the display, $\delta_1$ is the preparation latency, $\delta_2$ is the final adjustment delay, and $S(r)$ is the second-stage delay. Usually the total delay varies between 20 ms and 200 ms. The point of modeling the saccadic movement is that the visual acuity is low in this stage, which is useful to optimize the video transmission.
The fixation duration can be modeled by a lognormal or exponential distribution, as proposed in "Arthur Lugtigheid, Distributions of fixation durations and visual acquisition rates, Lugtigheid, A.J.P., 2007". The duration is usually on the level of hundreds of milliseconds, depending on the content of the video. This means the eye might not "watch" for about 1/3 of the time, corresponding to a 30% transmission resource saving in principle.
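A rough sketch of this timing model follows; the components $\delta_1$, $\delta_2$ and $S(r)$ of equation (4) are not quantified above, so the values below are illustrative assumptions, as are the lognormal parameters for the fixation duration.

```python
import random

# Illustrative saccade-duration model after equation (4):
# D(r) = delta1 + S(r) + delta2. The text only bounds the total to
# 20-200 ms, so the component values below are assumptions.
DELTA1_MS = 10.0   # initial preparation latency (assumed)
DELTA2_MS = 15.0   # final adjustment delay (assumed)

def saccade_duration_ms(distance_deg):
    """Equation (4) with an assumed linear open-loop stage S(r)."""
    s_r = 2.0 * distance_deg       # assumed ~2 ms per degree travelled
    return DELTA1_MS + s_r + DELTA2_MS

def fixation_duration_ms():
    """Fixation duration from a lognormal distribution, as proposed in
    the text; mu = 5.5, sigma = 0.4 are illustrative and give a median
    of exp(5.5) ~ 245 ms, i.e. the hundreds-of-milliseconds level."""
    return random.lognormvariate(5.5, 0.4)

samples = [fixation_duration_ms() for _ in range(10_000)]
print("mean fixation duration:", sum(samples) / len(samples), "ms")
print("saccade across 20 degrees:", saccade_duration_ms(20.0), "ms")
```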
Then, respective embodiments of the present invention are described below.
Referring to Figure 5, the eye tracking based video transmission system comprises a video server 101, three communication devices 102a, 102b and 102c, and a user device 103. The communication device may be a base station or an eNode B, for example. The user device may be a cell phone or a tablet, for example.
In the following, a method of transmitting a video based on eye tracking techniques according to one embodiment of the invention will be described, using a primary cell (Pcell) as an example of the communication device 102a and secondary cells (Scells) as examples of the communication devices 102b and 102c. Referring to Figure 6, in step S201, the Pcell 102a receives encoded video frames from the video server 101. Then, in step S202, the Pcell 102a decodes the received video frames. For example, the video frames may be encoded by a low-decoding-complexity encoder in the video server 101, so that it is easier for the Pcell 102a to do the transcoding. Next, in step S203, the Pcell 102a buffers the decoded video frames.
Also, in step S204, the Pcell 102a receives eye tracking information from the user device 103. The eye tracking information may comprise eye gaze position and/or eye movement direction, for example. As the eye movement is quite stereotyped, alternating between saccades and fixations, the fixation area can be predicted as soon as the saccadic movement starts; saccadic movement has an interesting characteristic in that it is ballistic. The fixation area of interest is usually predictable, e.g., moving items, humans, objects with striking colors, etc. Thus, based on the buffered video frames and the eye tracking information, in step S205, the Pcell 102a determines at least one predicted area of eye fixation. For example, the Pcell 102a may determine two predicted areas of eye fixation, i.e., a first predicted area of eye fixation and a second predicted area of eye fixation.
For the two predicted areas of eye fixation, in step S206, the Pcell 102a re-encodes buffered video frames with a first quality in the first predicted area, re-encodes buffered video frames with a second quality in the second predicted area, and re-encodes buffered video frames with a fourth quality outside the first and second predicted areas. The first quality and the second quality are better than the fourth quality. The first quality may be the same as the second quality, or better than the second quality if the first predicted area is closer to the eye. The quality may comprise resolution, for example.
Moreover, the Pcell 102a may further determine a predicted area of eye saccadic route based on the buffered video frames and the eye tracking information. For the predicted area of eye saccadic route, the Pcell 102a re-encodes buffered video frames with a third quality in the predicted area of eye saccadic route. The first quality and second quality are better than the third quality and the third quality is better than the fourth quality.
Then, in step S207, the Pcell 102a sends the re-encoded video frames to the user device 103.
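The pipeline of steps S201 to S207 can be sketched as follows; the decode, predict and re-encode helpers are placeholders for whatever codec and predictor a base station would actually use, not an API defined by this disclosure.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class EyeTrackingInfo:
    gaze_xy: tuple        # eye gaze position on the screen
    direction: tuple      # eye movement direction

@dataclass
class Pcell:
    """Sketch of the Pcell behaviour in steps S201 to S207. The decode,
    predict and re-encode methods are placeholder hooks, not a real
    codec API."""
    frame_buffer: deque = field(default_factory=lambda: deque(maxlen=30))

    def on_frames_from_server(self, encoded_frames):        # S201
        for f in encoded_frames:
            self.frame_buffer.append(self.decode(f))        # S202 + S203

    def on_eye_tracking_info(self, info):                   # S204
        areas = self.predict_fixation_areas(info)           # S205
        out = [self.reencode(frame, areas,                  # S206: tiered quality
                             area_qualities=("first", "second"),
                             outside_quality="fourth")
               for frame in self.frame_buffer]
        self.send_to_user_device(out)                       # S207

    # --- placeholder hooks (assumed, not defined by the patent) -----
    def decode(self, frame): return frame
    def predict_fixation_areas(self, info): return [info.gaze_xy]
    def reencode(self, frame, areas, area_qualities, outside_quality):
        return frame
    def send_to_user_device(self, frames): pass
```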
For multi-cell transmission, e.g., CoMP, the Pcell 102a sends video content to the Scells 102b and 102c. In one example, the Pcell 102a may send the decoded video frames to the Scells 102b and 102c respectively. In another example, the Pcell 102a may directly forward the encoded video frames to the Scells 102b and 102c after receiving them from the video server 101. The Pcell 102a and the Scells 102b, 102c use a video control protocol to ensure the video frames are re-encoded in the same way, so that they can be combined at the user device 103. The video control protocol may define the video encoder and decoder types and their versions. It may also define the encoder parameters, e.g. the quantization configuration and the parameters in equation (3). It may also include the timing information for the video frames to be re-encoded. Moreover, for each transmission, the Pcell 102a sends the eye tracking information to the Scells 102b and 102c, so that each cell can do the same video re-encoding based on that information.
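A control message carrying the fields named above might look like the following sketch; the field names themselves are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class VideoControlMessage:
    """Illustrative Pcell-to-Scell control message. The field names are
    assumptions; only their contents (codec type and version, encoder
    parameters, frame timing) are named in the text."""
    codec_type: str            # video encoder/decoder type
    codec_version: str         # codec version
    quantization_config: dict  # encoder quantization configuration
    acuity_params: dict        # the a1, a2, a3 parameters of equation (3)
    frame_timing: list         # timing info for the frames to re-encode

msg = VideoControlMessage(
    codec_type="H.264",                    # assumed example codec
    codec_version="4.1",
    quantization_config={"qp_high": 22, "qp_low": 38},
    acuity_params={"a1": 0.046, "a2": 0.0, "a3": 1.0},
    frame_timing=[(1001, 16.7)],           # (frame id, presentation ms)
)
print(msg.codec_type, msg.acuity_params)
```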
Additionally, as for complexity: the video content is distributed to the related cells and is decoded and buffered for some time to absorb the delay variation, so the eye tracking based video encoder only needs to perform the encoding process, not transcoding (decoding and then encoding). As the decoded video content is buffered for some time, for example 1 second, this is very useful to smooth the decoding and encoding computation demand.
Moreover, a shortened subframe structure is used, where the granularity is one slot, i.e. 0.5 ms. This reduces the latency of one transmission plus one retransmission from 16 ms to 8 ms. Supposing the video re-encoding can be decreased to 5 ms and the inter-base-station signaling to 2 ms, the total delay of the system will be 25 ms. Further possible latency reductions include shortening the HARQ retransmission cycle, reducing the re-encoding delay, and reducing the eye tracker processing delay.
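Under these assumptions, the 25 ms total can be decomposed as below; the 10 ms eye tracker figure is taken from the earlier 100 Hz tracker example, and the decomposition itself is a reading of the text rather than an explicit breakdown given in it.

```python
# Rough end-to-end delay budget (ms) under the stated assumptions. The
# eye tracker figure comes from the earlier 100 Hz example; treating the
# 25 ms total as this sum is a reading of the text, not quoted from it.
budget_ms = {
    "eye tracker measurement (100 Hz)": 10,
    "radio tx + one HARQ retransmission (0.5 ms slots)": 8,
    "video re-encoding": 5,
    "inter-base-station signaling": 2,
}
assert sum(budget_ms.values()) == 25   # matches the stated total
print(budget_ms)
```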
In another embodiment, after receiving the eye tracking information from the user device 103, the Pcell 102a determines the eye status based on that information.
If the eye status is in a fixation status, then the Pcell 102a re-encodes the buffered video frames using the resolution $y$:

$$y = \frac{1 - e^{-t_l/33.3}}{a_1\,(g(x) - \bar{x})^2 + a_2\,(g(x) - \bar{x}) + a_3}, \qquad \bar{x} = \arg\big(g(x) > \bar{x} + s\big) \tag{4}$$

wherein $x$ is the position of a point on the screen of the user device 103, $g(x)$ is the distance from this point to the center of the gaze point, $t_l$ is the system delay, $s$ is the diameter of the gaze point, $\bar{x}$ is derived from the equation $y = \frac{1}{a_1 \bar{x}^2 + a_2 \bar{x} + a_3}$, and $\arg$ is a function to calculate the suitable $\bar{x}$ according to an input formula.
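A minimal sketch of this fixation-status resolution map follows, assuming that the $\arg$ step amounts to solving $y = 1/(a_1\bar{x}^2 + a_2\bar{x} + a_3)$ for the eccentricity $\bar{x}$ at a chosen acuity floor, and using placeholder values for $a_2$ and $a_3$:

```python
import math

A1, A2, A3 = 0.046, 0.0, 1.0   # only a1 is from the text; a2, a3 are placeholders

def delay_margin(acuity_floor=0.1):
    """Assumed reading of the 'arg' step: solve y = 1/(a1*x^2 + a2*x + a3)
    for the eccentricity x_bar at which acuity falls to acuity_floor
    (the floor must satisfy acuity_floor < 1/a3 for a real root)."""
    c = A3 - 1.0 / acuity_floor
    disc = A2 * A2 - 4.0 * A1 * c
    return (-A2 + math.sqrt(disc)) / (2.0 * A1)

def fixation_resolution(g_x, t_l_ms, s, acuity_floor=0.1):
    """Equation (4): resolution for a point at distance g_x (degrees)
    from the gaze centre, given system delay t_l_ms (ms) and gaze-point
    diameter s (degrees). Inside the delay-enlarged circle the point is
    kept at full resolution, giving the two-step shape of Figure 7."""
    x_bar = delay_margin(acuity_floor)
    if g_x <= x_bar + s:
        return 1.0
    e = g_x - x_bar
    return (1.0 - math.exp(-t_l_ms / 33.3)) / (A1 * e * e + A2 * e + A3)

# With these placeholder coefficients the margin x_bar comes out near
# the ~12 degrees mentioned for the 25 ms delay in Figure 7.
print(round(delay_margin(), 1), "degrees")
print(fixation_resolution(g_x=5.0, t_l_ms=25.0, s=3.0))
print(fixation_resolution(g_x=25.0, t_l_ms=25.0, s=3.0))
```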
If the eye status is in a saccadic status, then the Pcell 102a re-encodes the buffered video frames using the resolution $y$:

$$y = \max\!\left(\frac{1 - e^{-t/33.3}}{a_1\,\max(f(x), s)^2 + a_2\,\max(f(x), s) + a_3},\ k_i y_i\right), \qquad t = \frac{\Delta x}{v} \tag{5}$$

wherein $\Delta x$ is the eye tracker resolution, $v$ is the velocity of eye movement, $x$ is the position of a point on the screen of the user device 103, and $f(x)$ is the minimum distance from this point to an estimated moving trajectory of the eye. Here

$$y_i = \frac{1 - e^{-t_i/33.3}}{a_1\,\big(\max(g_i(x), \bar{x})\big)^2 + a_2\,\max(g_i(x), \bar{x}) + a_3},$$

where $g_i(x)$ is the distance from this point to the center of the predicted fixation area $i$. In addition, $k_i \le 1$ is the parameter to control the resolution of each predicted area $i$. In an example, for the first predicted fixation area, $k_1$ can be set to 1. The first predicted area is the one nearest to the previous gaze point. When the eye passes the first predicted fixation area, the second one may be upgraded to the first one, and so on. A passed predicted fixation area is then deleted. There may be zero, one or multiple predicted fixation areas.
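A companion sketch for the saccadic-status resolution of equation (5), taking the larger of the trajectory-based term and the predicted-fixation-area terms $k_i y_i$, with the same placeholder coefficients; the tuple layout for predicted areas is an assumption.

```python
import math

A1, A2, A3 = 0.046, 0.0, 1.0   # only a1 is from the text; a2, a3 are placeholders

def acuity_term(e, t_ms):
    """Shared form (1 - e^(-t/33.3)) / (a1*e^2 + a2*e + a3)."""
    return (1.0 - math.exp(-t_ms / 33.3)) / (A1 * e * e + A2 * e + A3)

def saccadic_resolution(f_x, s, dx, v, x_bar, predicted_areas):
    """Equation (5). f_x: minimum distance (degrees) from the point to the
    estimated eye trajectory; s: gaze-point diameter; dx: eye tracker
    resolution; v: velocity of eye movement (t = dx / v); x_bar: delay
    margin as in equation (4); predicted_areas: list of (g_i, t_i, k_i)
    tuples, i.e. distance to the centre of predicted fixation area i,
    its expected foveation time, and its weight k_i <= 1."""
    t = dx / v
    base = acuity_term(max(f_x, s), t)
    area_terms = [k * acuity_term(max(g, x_bar), t_i)
                  for (g, t_i, k) in predicted_areas]
    return max([base] + area_terms)

# Areas ordered nearest-first to the previous gaze point, k_1 = 1:
areas = [(1.0, 100.0, 1.0), (6.0, 100.0, 0.5)]
print(saccadic_resolution(f_x=2.0, s=3.0, dx=0.5, v=400.0,
                          x_bar=14.0, predicted_areas=areas))
areas.pop(0)   # once the eye passes area 1 it is deleted and area 2 is upgraded
```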
Then, the Pcell 102a sends the re-encoded video frames to the user device 103.
The end-to-end delay depends on multiple factors. Hence, adaptive delay compensation is proposed. The high-resolution area size is set based on equation (3). A threshold can be set such that, if the system delay exceeds it, the system switches to a non-eye-tracking mode. Next, a slow-start-like transmission can be used to absorb the delay variation. Since an improper video encoding configuration under delay variation may affect the user experience, a target can be set, e.g. that the configuration should work in 99% of cases; then, based on the end-to-end delay statistics, the base station can derive the optimal configuration.
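The 99%-of-cases rule could, for instance, be realized by configuring against a high percentile of the measured end-to-end delays, as in this sketch; the fallback threshold value is an assumed parameter.

```python
import statistics

FALLBACK_THRESHOLD_MS = 50.0   # assumed switch-over threshold

def configure_delay_compensation(delay_samples_ms, target=0.99):
    """Pick the delay the encoding configuration must tolerate so that
    it works in `target` (e.g. 99%) of cases, per the collected
    end-to-end delay statistics."""
    cuts = statistics.quantiles(delay_samples_ms, n=100)
    design_delay = cuts[int(target * 100) - 1]     # ~99th percentile
    if design_delay > FALLBACK_THRESHOLD_MS:
        return {"mode": "no-eye-tracking"}
    # A larger design delay means a larger x_bar, i.e. a larger
    # high-resolution area per equation (3).
    return {"mode": "eye-tracking", "design_delay_ms": design_delay}

print(configure_delay_compensation([18, 22, 25, 24, 30, 21, 26, 23, 27, 25]))
```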
Below, simulation of the gains based on the proposed model is described.
Based on the proposed model, i.e. equation (3), the gains for different terminals are tested. First, a maximum end-to-end delay of 25 ms is assumed. The distance between the phone and the eye is assumed to be 60 cm, based on the author's measurement with his own phone. The eye movement is modeled as saccadic -> fixation -> saccadic -> ...: each time, a random position on the screen is selected and the eye makes a saccadic movement from the current position to that position. At the new position, the fixation is modeled by an ex-Gaussian process based on "Adrian Staub, Ashley Benatar, Individual differences in fixation duration distributions in reading, Psychonomic Bulletin & Review, December 2013, Volume 20, Issue 6, pp 1304-1311".
The end-to-end delay is considered in the modeling to ensure that, when the eye starts a saccadic movement, the viewer will not notice a change of the video quality. This enlarges the high-resolution circle. The effect is shown in Figure 7. The 25 ms end-to-end delay corresponds to about 12 degrees, which yields a two-step-shaped resolution distribution.
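The saccadic/fixation alternation with ex-Gaussian fixation durations can be sketched as follows; the ex-Gaussian parameters and screen geometry are illustrative, not the values used to produce Figure 7.

```python
import random

def ex_gaussian_ms(mu=250.0, sigma=50.0, tau=100.0):
    """Ex-Gaussian fixation duration: a Gaussian plus an exponential
    component. These parameter values are illustrative, not the ones
    behind Figure 7."""
    return random.gauss(mu, sigma) + random.expovariate(1.0 / tau)

def simulate(screen_deg=20.0, total_ms=60_000.0):
    """Alternate saccadic -> fixation -> saccadic over random screen
    positions (saccades at 400 deg/s) and return the fraction of time
    spent fixating."""
    t, fixating, pos = 0.0, 0.0, random.uniform(0, screen_deg)
    while t < total_ms:
        target = random.uniform(0, screen_deg)     # next random position
        t += abs(target - pos) / 400.0 * 1000.0    # saccade duration (ms)
        pos = target
        d = ex_gaussian_ms()
        fixating += d
        t += d
    return fixating / t

print(f"fraction of time fixating: {simulate():.2f}")
```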
The simulation results are summarized in the table below, where one can see the proposed model can save 55.5% to 80.6% of resources for the different terminals. A bigger screen gives a higher performance gain.
[Table: per-terminal simulation results and resource savings]
In one or more exemplary designs, the functions of the present application may be implemented using hardware, software, firmware, or any combinations thereof. In the case of implementation with software, the functions may be stored on a computer readable medium as one or more instructions or codes, or transmitted as one or more instructions or codes on the computer readable medium. The computer readable medium comprises a computer storage medium and a communication medium. The communication medium includes any medium that facilitates transmission of the computer program from one place to another. The storage medium may be any available medium accessible to a general or specific computer. The computer-readable medium may include, for example, but not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disc storage devices, magnetic disk storage devices, or other magnetic storage devices, or any other medium that carries or stores desired program code means in a manner of instructions or data structures accessible by a general or specific computer or a general or specific processor. Furthermore, any connection may also be considered as a computer-readable medium. For example, if software is transmitted from a website, server or other remote source using a co-axial cable, an optical cable, a twisted pair wire, a digital subscriber line (DSL), or radio technologies such as infrared, radio or microwave, then the co-axial cable, optical cable, twisted pair wire, digital subscriber line (DSL), or radio technologies such as infrared, radio or microwave are also covered by the definition of medium.
The various illustrative logical blocks, modules, and circuits described in connection with the disclosure herein may be implemented or performed with a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any normal processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The above depiction of the present disclosure is to enable any of those skilled in the art to implement or use the present invention. For those skilled in the art, various modifications of the present disclosure are obvious, and the general principle defined herein may also be applied to other transformations without departing from the spirit and protection scope of the present invention. Thus, the present invention is not limited to the examples and designs as described herein, but should be consistent with the broadest scope of the principle and novel characteristics of the present disclosure.

Claims

What is claimed is:
1. A method, for use in a communication device, of transmitting a video, the method comprising the steps of:
- receiving encoded video frames from a video server;
- decoding the received video frames;
- buffering the decoded video frames;
- receiving eye tracking information from a user device;
- determining a first predicted area of eye fixation based on the eye tracking information;
- re-encoding buffered video frames with a first quality in the first predicted area and re-encoding buffered video frames with a fourth quality outside the predicted area, wherein the first quality is better than the fourth quality;
- sending the re-encoded video frames to the user device.
2. The method of claim 1, further comprising the steps of:
- determining a second predicted area of eye fixation based on the buffered video frames and the eye tracking information;
- re-encoding buffered video frames with a second quality in the second predicted area, wherein the first quality is better than the second quality and the second quality is better than the fourth quality.
3. The method of claim 1, further comprising the steps of:
- determining a predicted area of eye saccadic route based on the buffered video frames and the eye tracking information;
- re-encoding buffered video frames with a third quality in the predicted area of eye saccadic route, wherein the first quality is better than the third quality and the third quality is better than the fourth quality.
4. The method of claim 1, further comprising the steps of:
- sending the encoded video frames to one or more other communication devices; or sending the decoded video frames to the one or more other communication devices;
- sending the eye tracking information to the one or more other communication devices.
5. The method of claim 1, wherein the eye tracking information comprises eye gaze position and/or eye movement direction.
6. The method of claim 1, wherein the communication device is a base station or an eNode B.
7. A method, for use in a communication device, of transmitting a video, the method comprising the steps of:
- receiving encoded video frames from a video server;
- decoding the received video frames;
- buffering the decoded video frames;
- receiving eye tracking information from a user device;
- determining an eye status based on the eye tracking information;
- if the eye status is in a fixation status, then re-encoding the buffered video frames using a resolution $y$:

$$y = \frac{1 - e^{-t_l/33.3}}{a_1\,(g(x) - \bar{x})^2 + a_2\,(g(x) - \bar{x}) + a_3}, \qquad \bar{x} = \arg\big(g(x) > \bar{x} + s\big)$$

wherein $x$ is the position of a point on a screen of the user device, $g(x)$ is a distance from the point to a center of a gaze point, $t_l$ is a system delay, $s$ is a diameter of the gaze point, $\bar{x}$ is derived from an equation $y = \frac{1}{a_1 \bar{x}^2 + a_2 \bar{x} + a_3}$, and $\arg$ is a function to calculate a suitable $\bar{x}$ according to an input formula;

- if the eye status is in a saccadic status, then re-encoding the buffered video frames using a resolution $y$:

$$y = \max\!\left(\frac{1 - e^{-t/33.3}}{a_1\,\max(f(x), s)^2 + a_2\,\max(f(x), s) + a_3},\ k_i y_i\right), \qquad t = \frac{\Delta x}{v}$$

wherein $\Delta x$ is an eye tracker resolution, $v$ is a velocity of eye movement, $x$ is the position of a point on a screen of the user device, $f(x)$ is a minimum distance from the point to an estimated moving trajectory of the eye,

$$y_i = \frac{1 - e^{-t_i/33.3}}{a_1\,\big(\max(g_i(x), \bar{x})\big)^2 + a_2\,\max(g_i(x), \bar{x}) + a_3},$$

where $g_i(x)$ is a distance from the point to a center of a predicted fixation area $i$ and $k_i \le 1$ is a parameter to control the resolution of the predicted area $i$;
- sending the re-encoded video frames to the user device.
8. The method of claim 7, wherein the eye tracking information comprises eye gaze position and/or eye movement direction.
9. An apparatus, for use in a communication device, for transmitting a video, the apparatus comprising:
a receiver configured to receive encoded video frames from a video server and eye tracking information from a user device;
a decoder configured to decode the received video frames;
a buffer configured to buffer the decoded video frames;
a determining unit configured to determine a first predicted area of eye fixation based on the eye tracking information;
an encoder configured to re-encode buffered video frames with a first quality in the first predicted area and to re-encode buffered video frames with a fourth quality outside the predicted area, wherein the first quality is better than the fourth quality;
a transmitter configured to send the re-encoded video frames to the user device.
10. The apparatus of claim 9, wherein the determining unit is further configured to determine a second predicted area of eye fixation based on the buffered video frames and the eye tracking information; and the encoder is further configured to re-encode buffered video frames with a second quality in the second predicted area, wherein the first quality is better than the second quality and the second quality is better than the fourth quality.
11. The apparatus of claim 9, wherein the determining unit is further configured to determine a predicted area of eye saccadic route based on the buffered video frames and the eye tracking information; and the encoder is further configured to re-encode buffered video frames with a third quality in the predicted area of eye saccadic route, wherein the first quality is better than the third quality and the third quality is better than the fourth quality.
12. The apparatus of claim 9, wherein the transmitter is further configured to send the encoded video frames to one or more other communication devices, or send the decoded video frames to the one or more other communication devices; and to send the eye tracking information to the one or more other communication devices.
13. The apparatus of claim 9, wherein the eye tracking information comprises eye gaze position and/or eye movement direction.
14. The apparatus of claim 9, wherein the communication device is a base station or an eNode B.
15. An apparatus, for use in a communication device, for transmitting a video, the apparatus comprising:
a receiver configured to receive encoded video frames from a video server and eye tracking information from a user device;
a decoder configured to decode the received video frames;
a buffer configured to buffer the decoded video frames;
a determining unit configured to determine an eye status based on the eye tracking information;
an encoder configured to re-encode the buffered video frames using a resolution $y$, if the eye status is in a fixation status:

$$y = \frac{1 - e^{-t_l/33.3}}{a_1\,(g(x) - \bar{x})^2 + a_2\,(g(x) - \bar{x}) + a_3}, \qquad \bar{x} = \arg\big(g(x) > \bar{x} + s\big)$$

wherein $x$ is the position of a point on a screen of the user device, $g(x)$ is a distance from the point to a center of a gaze point, $t_l$ is a system delay, $s$ is a diameter of the gaze point, $\bar{x}$ is derived from an equation $y = \frac{1}{a_1 \bar{x}^2 + a_2 \bar{x} + a_3}$, and $\arg$ is a function to calculate a suitable $\bar{x}$ according to an input formula;

and to re-encode the buffered video frames using a resolution $y$, if the eye status is in a saccadic status:

$$y = \max\!\left(\frac{1 - e^{-t/33.3}}{a_1\,\max(f(x), s)^2 + a_2\,\max(f(x), s) + a_3},\ k_i y_i\right), \qquad t = \frac{\Delta x}{v}$$

wherein $\Delta x$ is an eye tracker resolution, $v$ is a velocity of eye movement, $x$ is the position of a point on a screen of the user device, $f(x)$ is a minimum distance from the point to an estimated moving trajectory of the eye,

$$y_i = \frac{1 - e^{-t_i/33.3}}{a_1\,\big(\max(g_i(x), \bar{x})\big)^2 + a_2\,\max(g_i(x), \bar{x}) + a_3},$$

where $g_i(x)$ is a distance from the point to a center of a predicted fixation area $i$ and $k_i \le 1$ is a parameter to control the resolution of the predicted area $i$;
a transmitter configured to send the re-encoded video frames to the user device.
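For orientation only, the following Python sketch restates the claimed behavior in executable form. It is a minimal reading of claims 9-15 under stated assumptions, not Alcatel Lucent's implementation: the coefficients A1, A2, A3, all function and parameter names, the (center, radius) shape assumed for predicted areas, and the distance margin used for the saccadic route are introduced here for illustration, and x̄ is treated as a precomputed eccentricity bound rather than derived via the claimed arg function.

import math

# Placeholder acuity-model coefficients; a1, a2, a3 are unspecified
# parameters in the claims, so these values are assumptions.
A1, A2, A3 = 0.05, 0.1, 1.0

def acuity(d):
    # Relative visual acuity at eccentricity d, per the claimed
    # form y = 1 / (a1*d^2 + a2*d + a3).
    return 1.0 / (A1 * d * d + A2 * d + A3)

def temporal_gain(t_ms):
    # Perception build-up factor (1 - e^(-t/33.3)), t in milliseconds.
    return 1.0 - math.exp(-t_ms / 33.3)

def fixation_resolution(point, gaze, t_ms, x_bar):
    # Resolution for a screen point while the eye fixates at `gaze`;
    # x_bar stands in for the eccentricity bound of the claims.
    g = math.dist(point, gaze)
    return temporal_gain(t_ms) * acuity(max(g, x_bar))

def saccadic_resolution(point, trajectory, v, dx, s, predicted_areas):
    # Resolution for a screen point during a saccade.
    # trajectory: non-empty list of (x, y) samples of the estimated eye path;
    # v: eye velocity; dx: eye-tracker resolution; s: gaze-point diameter;
    # predicted_areas: (center, k_i, t_i_ms, x_bar_i) per predicted fixation.
    t_ms = dx / v                                   # claimed t = Δx / v
    f = min(math.dist(point, p) for p in trajectory)
    y = temporal_gain(t_ms) * acuity(max(f, s))
    for center, k_i, t_i_ms, x_bar_i in predicted_areas:
        g_i = math.dist(point, center)
        y = max(y, k_i * temporal_gain(t_i_ms) * acuity(max(g_i, x_bar_i)))
    return y

def quality_tier(block, first_area, second_area, route, margin):
    # Map a macroblock center to the four quality tiers of claims 9-11:
    # first (gaze area), second (predicted fixation area), third (saccadic
    # route), fourth (background). Areas are assumed (center, radius) pairs.
    def inside(area):
        center, radius = area
        return math.dist(block, center) <= radius
    if inside(first_area):
        return "first"
    if second_area and inside(second_area):
        return "second"
    if route and min(math.dist(block, p) for p in route) <= margin:
        return "third"
    return "fourth"

In such a sketch, the re-encoder would evaluate quality_tier once per macroblock and map the first to fourth tiers onto progressively coarser quantization; how the tiers translate into concrete encoder settings is left open by the claims.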
PCT/IB2016/000262 2015-03-03 2016-01-26 Method and apparatus for transmitting a video WO2016139532A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201510094975 2015-03-03
CN201510094975.X 2015-03-03
CN201510207760.4A CN106162363B (en) 2015-03-03 2015-04-27 Method and apparatus for transmitting a video
CN201510207760.4 2015-04-27

Publications (1)

Publication Number Publication Date
WO2016139532A1 (en) 2016-09-09

Family

ID=55697237

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2016/000262 WO2016139532A1 (en) 2015-03-03 2016-01-26 Method and apparatus for transmitting a video

Country Status (1)

Country Link
WO (1) WO2016139532A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4479784A (en) * 1981-03-03 1984-10-30 The Singer Company Eye line-of-sight responsive wide angle visual system
US6028608A (en) * 1997-05-09 2000-02-22 Jenkins; Barry System and method of perception-based image generation and encoding
US20120146891A1 (en) * 2010-12-08 2012-06-14 Sony Computer Entertainment Inc. Adaptive displays using gaze tracking

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
ADRIAN STAUB; ASHLEY BENATAR: "Individual differences in fixation duration distributions in reading", PSYCHONOMIC BULLETIN & REVIEW, vol. 20, no. 6, December 2013 (2013-12-01), pages 1304 - 1311
ANSTIS SM: "A chart demonstrating variations in acuity with retinal position (Letter)", VISION RES., vol. 14, 1974, pages 589 - 592
ARTHUR LUGTIGHEID: "Distributions of fixation durations and visual acquisition rates", LUGTIGHEID, A.J.P., 2007
LAURA MUIR; IAIN RICHARDSON; STEVEN LEAPER: "Gaze Tracking and Its Application to Video Coding for Sign Language", PICTURE CODING SYMPOSIUM, 2003, pages 32 - 325
MARIO CESARELLI; PAOLO BIFULCO; LUCIANO LOFFREDO; MARCELLO BRACALE: "Relationship between visual acuity and eye position variability during foveations in congenital nystagmus", DOCUMENTA OPHTHALMOLOGICA, vol. 101, no. 1, July 2000 (2000-07-01), pages 59 - 72
MOHSEN M. FARID; FATIH KURUGOLLU; FIONN D. MURTAGH: "Adaptive wavelet eye-gaze-based video compression", PROC. SPIE 4877, OPTO-IRELAND 2002: OPTICAL METROLOGY, IMAGING, AND MACHINE VISION, vol. 255, 17 March 2003 (2003-03-17)
PARKHURST D ET AL: "Variable-Resolution Displays: A theoretical, practical and behavioral evaluation", HUMAN FACTORS, HUMAN FACTORS AND ERGONOMICS SOCIETY, SANTA MONICA, CA, US, vol. 44, no. 4, 1 January 2002 (2002-01-01), pages 611 - 629, XP002332177, ISSN: 0018-7208, DOI: 10.1518/0018720024497015 *
ROBERT-INACIO, F.; SCARAMUZZINO, R.; STAINER, Q.; KUSSENER-COMBIER, E.: "Biologically inspired image sampling for electronic eye", BIOMEDICAL CIRCUITS AND SYSTEMS CONFERENCE (BIOCAS), 2010, pages 246 - 249

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11290699B2 (en) 2016-12-19 2022-03-29 Dolby Laboratories Licensing Corporation View direction based multilevel low bandwidth techniques to support individual user experiences of omnidirectional video

Similar Documents

Publication Publication Date Title
CN106162363B (en) Method and apparatus for transmitting a video
CN105892647B (en) Display screen adjustment method, device and display apparatus
CN105630167B (en) Adaptive screen adjustment method, adaptive screen adjustment apparatus and terminal device
EP3188479A1 (en) Adaptive video definition adjustment method and apparatus, terminal device, and storage medium
US20170277258A1 (en) Method for adjusting screen luminance and electronic device
CN106406522B (en) Virtual reality scene content adjusting method and device
CN105630143A (en) Screen display adjusting method and device
CN108591868B (en) Automatic dimming desk lamp based on eye fatigue degree
CN110352033A (en) Determining eye openness with an eye tracking device
KR101987837B1 (en) Method for providing eyesight shielding service based on multi-media device
CN105719618A (en) Method for saving energy by means of automatically adjusting brightness of mobile phone screen
CN110969116B (en) Gaze point position determining method and related device
CN102436306A (en) Method and device for controlling 3D display system
DE112019003229T5 (en) Video processing in virtual reality environments
CN105900496A (en) Fast dormancy system and process
KR102638468B1 (en) Electronic apparatus and operating method thereof
WO2016139532A1 (en) Method and apparatus for transmitting a video
CN108986770B (en) Information terminal
CN103941986B (en) Portable terminal and adaptive input method interface adjustment method thereof
CN105575364A (en) Intelligent watch and brightness adaptive adjusting system and method
CN105898057A (en) Mobile terminal and method of adjusting brightness of VR glasses
WO2015024328A1 (en) Eyesight-protection imaging system and eyesight-protection imaging method
CN106604130A (en) Video playing method based on line-of-sight tracking
JP2021530918A (en) Foveation and HDR
CN104811802A (en) Image playing method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16715083

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16715083

Country of ref document: EP

Kind code of ref document: A1