US20140092087A1 - Adaptive load balancing in software emulation of gpu hardware - Google Patents
- Publication number
- US20140092087A1 (application US 13/631,803)
- Authority
- US
- United States
- Prior art keywords
- pixels
- tiles
- rasterization
- rendered
- threads
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06T17/10—Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/001—Texturing; Colouring; Generation of texture or colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/08—Volume rendering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N1/00—Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
- H04N1/46—Colour picture communication systems
- H04N1/56—Processing of colour picture signals
- H04N1/60—Colour correction or control
- H04N1/6058—Reduction of colour to a range of reproducible colours, e.g. to ink- reproducible colour gamut
- H04N1/6063—Reduction of colour to a range of reproducible colours, e.g. to ink- reproducible colour gamut dependent on the contents of the image to be reproduced
- H04N1/6069—Reduction of colour to a range of reproducible colours, e.g. to ink- reproducible colour gamut dependent on the contents of the image to be reproduced spatially varying within the image
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N1/00—Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
- H04N1/46—Colour picture communication systems
- H04N1/64—Systems for the transmission or the storage of the colour picture signal; Details therefor, e.g. coding or decoding means therefor
- H04N1/642—Adapting to different types of images, e.g. characters, graphs, black and white image portions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/431—Generation of visual interfaces for content selection or interaction; Content or additional data rendering
- H04N21/4318—Generation of visual interfaces for content selection or interaction; Content or additional data rendering by altering the content in the rendering process, e.g. blanking, blurring or masking an image region
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5017—Task decomposition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5018—Thread allocation
Definitions
- the present disclosure is related to video game emulation.
- this application describes a method and apparatus for emulating a graphics processing unit (GPU) over a cloud based network with tile-based rasterization.
- a graphics processing unit may transform a three-dimensional virtual object into a two-dimensional image that may be displayed on a screen.
- the GPU may use one or more graphics pipelines for processing information initially provided to the GPU, such as graphics primitives.
- Graphics primitives are properties that are used to describe a three-dimensional object that is being rendered.
- graphics primitives may be lines, triangles, or vertices that form a three dimensional object when combined.
- Each of the graphics primitives may contain additional information to further define the three dimensional object such as, but not limited to X-Y-Z coordinates, red-green-blue (RGB) values, translucency, texture, and reflectivity.
- Rasterization is the process by which the graphics primitives describing the three-dimensional object are transformed into a two-dimensional image representation of the scene.
- the two-dimensional image is comprised of individual pixels, each of which may contain unique RGB values.
- the GPU will rasterize a three-dimensional image by stepping across the entire three-dimensional object in a raster pattern along a two-dimensional plane. Each step along the line represents one pixel.
- the GPU must determine if the pixel should be rendered and delivered to the frame buffer. If the pixel has not changed from a previous rendering, then there is no need to deliver an updated pixel to the frame buffer. Therefore, each raster line may have a variable number of pixels that must be processed.
- a plurality of rasterization threads may each be assigned one or more of the raster lines to process, and the rasterization threads may be executed in parallel.
- the emulation software may dedicate an increased number of available rasterization threads to the rasterization process. This increases the demand on the processor running the emulation software. Also, in the case of cloud-based services, the number of instances of the emulation software that will be running at a given time is not known beforehand. If the emulation software requires extensive processing power, then scaling the system for increased users becomes prohibitively expensive. By way of example, during peak usage hours, there may be many instances of the emulator being executed on the network. This requires that resources such as processing power be used as efficiently as possible.
- however, processing efficiency cannot be gained by decreasing the frame rate that the emulator is capable of producing.
- the frame rate should ideally remain above 24 frames per second in order to ensure smooth animation.
- a rasterization method that allows for efficient load balancing is needed.
- FIG. 1 is a schematic diagram of a snapshot generator and an emulator communicating over a network according to an aspect of the present disclosure.
- FIGS. 2A-2B are flow diagrams of methods for using tile-based rasterization as part of a software based emulation of a GPU implemented over a cloud-based network according to various aspects of the present disclosure.
- FIG. 3A-3B are schematics of software based emulators of a GPU implemented over a cloud-based network that are configured to rasterize a virtual image with tile based rasterization according to various aspects of the present disclosure.
- FIG. 4A-4B are block diagrams describing the instructions for how a software based emulator of a GPU implemented over a cloud-based network rasterizes a virtual image with tile based rasterization according to various aspects of the present disclosure.
- a virtual image containing graphics primitives is first divided into a plurality of tiles. Each of the tiles has a predetermined number of image pixels. The emulator may then scan each of the tiles to determine how many of the image pixels in each tile need to be rendered. The number of pixels that need to be rendered for each tile is then delivered to a load balancer. The load balancer distributes the processing between rasterization threads. Each rasterization thread will be assigned approximately the same total number of pixels to be rendered. The rasterization threads then rasterize their assigned tiles, and render the pixels that require rendering. Additionally, the rasterization threads may deliver the rendered pixels to a frame buffer. The frame buffer builds a frame from the rendered pixels and then delivers the frame over the network to a client device platform.
- a virtual image containing graphics primitives is first divided into a plurality of tiles. Each of the tiles has a predetermined number of image pixels. The emulator may then scan each of the tiles to determine if any of the image pixels that are within a tile need to be rendered. Pixels that do not need to be rendered are sometimes referred to herein as “ignorable” pixels. If at least one image pixel in a tile needs to be rendered, then a message is sent to a load balancer indicating that the tile is “full”. Once each tile has been scanned, the load balancer can divide the “full” tiles evenly between the available rasterization threads. Each rasterization thread then rasterizes the assigned tiles and delivers the rendered pixels to a frame buffer. The frame buffer builds a frame from the rendered pixels and then delivers the frame over the network to a client device platform.
- FIG. 1 is a schematic of an embodiment of the present invention.
- Emulator 107 may be accessed by a client device platform 103 over a network 160 .
- Client device platform 103 may access alternative emulators 107 over the network 160 .
- Emulators 107 may be identical to each other, or they may each be programmed to emulate unique legacy game titles 106 or unique sets of legacy game titles 106 .
- Client device platform 103 may include a central processor unit (CPU) 131 .
- a CPU 131 may include one or more processors, which may be configured according to, e.g., a dual-core, quad-core, multi-core, or Cell processor architecture.
- the client device platform 103 may also include a memory 132 (e.g., RAM, DRAM, ROM, and the like).
- the CPU 131 may execute a process-control program 133 , portions of which may be stored in the memory 132 .
- the client device platform 103 may also include well-known support circuits 140 , such as input/output (I/O) circuits 141 , power supplies (P/S) 142 , a clock (CLK) 143 and cache 144 .
- the client device platform 103 may optionally include a mass storage device 134 such as a disk drive, CD-ROM drive, tape drive, or the like to store programs and/or data.
- the client device platform 103 may also optionally include a display unit 137 and a user interface unit 138 to facilitate interaction between the client device platform 103 and a user.
- the display unit 137 may be in the form of a cathode ray tube (CRT) or flat panel screen that displays text, numerals, or graphical symbols.
- the user interface unit 138 may include a keyboard, mouse, joystick, light pen, or other device.
- a controller 145 may be connected to the client device platform 103 through the I/O circuit 141 or it may be directly integrated into the client device platform 103 .
- the controller 145 may facilitate interaction between the client device platform 103 and a user.
- the controller 145 may include a keyboard, mouse, joystick, light pen, hand-held controls or other device.
- the controller 145 may be capable of generating a haptic response 146 .
- the haptic response 146 may be vibrations or any other feedback corresponding to the sense of touch.
- the client device platform 103 may include a network interface 139 , configured to enable the use of Wi-Fi, an Ethernet port, or other communication methods.
- the network interface 139 may incorporate suitable hardware, software, firmware or some combination of two or more of these to facilitate communication via an electronic communications network 160 .
- the network interface 139 may be configured to implement wired or wireless communication over local area networks and wide area networks such as the Internet.
- the client device platform 103 may send and receive data and/or requests for files via one or more data packets over the network 160 .
- the preceding components may exchange signals with each other via an internal system bus 150 .
- the client device platform 103 may be a general purpose computer that becomes a special purpose computer when running code that implements embodiments of the present invention as described herein.
- the emulator 107 may include a central processor unit (CPU) 131 ′.
- a CPU 131 ′ may include one or more processors, which may be configured according to, e.g., a dual-core, quad-core, multi-core, or Cell processor architecture.
- the emulator 107 may also include a memory 132 ′ (e.g., RAM, DRAM, ROM, and the like).
- the CPU 131 ′ may execute a process-control program 133 ′, portions of which may be stored in the memory 132 ′.
- the emulator 107 may also include well-known support circuits 140 ′, such as input/output (I/O) circuits 141 ′, power supplies (P/S) 142 ′, a clock (CLK) 143 ′ and cache 144 ′.
- the emulator 107 may optionally include a mass storage device 134 ′ such as a disk drive, CD-ROM drive, tape drive, or the like to store programs and/or data.
- the emulator 107 may also optionally include a display unit 137 ′ and user interface unit 138 ′ to facilitate interaction between the emulator 107 and a user who requires direct access to the emulator 107 .
- a snapshot generator or engineer 102 may need direct access to the emulator 107 in order to program the emulator 107 to properly emulate a desired legacy game 106 or to add additional mini-game capabilities to a legacy game 106 .
- the display unit 137 ′ may be in the form of a cathode ray tube (CRT) or flat panel screen that displays text, numerals, or graphical symbols.
- the user interface unit 138 ′ may include a keyboard, mouse, joystick, light pen, or other device.
- the emulator 107 may include a network interface 139 ′, configured to enable the use of Wi-Fi, an Ethernet port, or other communication methods.
- the network interface 139 ′ may incorporate suitable hardware, software, firmware or some combination of two or more of these to facilitate communication via the electronic communications network 160 .
- the network interface 139 ′ may be configured to implement wired or wireless communication over local area networks and wide area networks such as the Internet.
- the emulator 107 may send and receive data and/or requests for files via one or more data packets over the network 160 .
- the preceding components may exchange signals with each other via an internal system bus 150 ′.
- the emulator 107 may be a general purpose computer that becomes a special purpose computer when running code that implements embodiments of the present invention as described herein.
- Emulator 107 may access a legacy game 106 that has been selected by the client device platform 103 for emulation through the internal system bus 150 ′.
- the legacy games may also be stored in the memory 132 ′ or in the mass storage device 134 ′. Additionally, one or more legacy games 106 may be stored at a remote location accessible to the emulator 107 over the network 160 .
- Each legacy game 106 contains game code 108 . When the legacy game 106 is emulated, the game code 108 produces legacy game data 109 .
- a legacy game 106 may be any game that is not compatible with a target platform.
- the legacy game 106 may have been designed to be played on Sony Computer Entertainment's PlayStation console, but the target platform is a home computer.
- the legacy game 106 may have been designed to be played on a PlayStation 2 console, but the target platform is a PlayStation 3 console.
- a legacy game 106 may have been designed to be played on a PlayStation console, but the target platform is a hand held console such as the PlayStation Vita from Sony Computer Entertainment.
- Emulator 107 may be a deterministic emulator.
- a deterministic emulator is an emulator that may process a given set of game inputs the same way every time that the same set of inputs are provided to the emulator 107 . This may be accomplished by eliminating any dependencies in the code run by the emulator 107 that depend from an asynchronous activity.
- Asynchronous activities are events that occur independently of the main program flow. This means that actions may be executed in a non-blocking scheme in order to allow the main program flow to continue processing. Therefore, by way of example, and not by way of limitation, the emulator 107 may be deterministic when the dependencies in the code depend from basic blocks that always begin and end with synchronous activity.
- basic blocks may be predetermined increments of code at which the emulator 107 checks for external events or additional game inputs.
- the emulator 107 may also wait for anything that runs asynchronously within a system component to complete before proceeding to the next basic block.
- a steady state within the emulator 107 may be when all of the basic blocks are in lock step.
- FIG. 2A is a flow diagram of a method 200 for implementing the rasterization step in a graphics pipeline with a software based emulator for a GPU on a cloud-based network.
- the emulator 107 may divide a virtual image 320 into smaller tiles 315 .
- the height and width of each tile may be 8 pixels by 8 pixels, 16 pixels by 16 pixels, or 32 pixels by 32 pixels.
- Each tile 315 corresponds to a portion of a frame 319 that may be displayed by the client device platform's display 137 .
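- The tiling step described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `split_into_tiles`, the `(x0, y0, x1, y1)` tuple layout, and the clamping of edge tiles are all assumptions introduced here.

```python
def split_into_tiles(width, height, tile_size=8):
    """Return (x0, y0, x1, y1) bounds for each tile covering a width x height image.

    x1 and y1 are exclusive. Clamping partial tiles at the right and bottom
    edges is an assumption; the source only names 8x8, 16x16 and 32x32 tiles.
    """
    tiles = []
    for y0 in range(0, height, tile_size):
        for x0 in range(0, width, tile_size):
            tiles.append((x0, y0,
                          min(x0 + tile_size, width),
                          min(y0 + tile_size, height)))
    return tiles


# A hypothetical 32x16 virtual image split into 8x8 tiles.
tiles = split_into_tiles(32, 16, tile_size=8)
```

- Each tuple identifies the portion of the frame that a tile corresponds to, so later stages can scan or rasterize a tile without touching the rest of the image.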
- FIG. 3A is a diagram of an emulator system 300 .
- the virtual image 320 contains the graphics primitives 310 that will be rendered to produce a frame 319 that is viewable on the display 137 of the client device platform 103 .
- the graphics primitives 310 shown in FIG. 3A are a series of triangles.
- the virtual image 320 may contain any alternative type of graphic primitives, such as, but not limited to, lines, points, arcs, vertices, or any combination thereof.
- the graphics primitives 310 are displayed in two dimensions, but the virtual image 320 may also include three-dimensional objects.
- method 200 continues with the emulator 107 determining which tiles 315 have pixels that need to be rendered at 262 .
- Each tile 315 will be scanned by the emulator 107 to determine how many of the pixels within the tile 315 need to be rendered.
- a pixel needs to be rendered if the value of the new pixel for the frame 319 being rasterized is different from the value of the pixel presently stored in the frame buffer 318 . Otherwise, the pixel is “ignorable”.
- a pixel value may include X-Y-Z coordinates, RGB values, translucency, texture, reflectivity or any combination thereof.
- the number of pixels that need to be rendered for a given tile 315 may then be delivered to the load balancer 317 at 263 .
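- The per-tile scan can be sketched as below, assuming the new image and the frame buffer are both available as row-major lists of comparable pixel values. The name `count_pixels_to_render` and the data layout are hypothetical; the source only states that a pixel needs rendering when its new value differs from the value stored in the frame buffer.

```python
def count_pixels_to_render(new_image, frame_buffer, tile):
    """Count pixels in a tile whose new value differs from the frame buffer.

    tile is (x0, y0, x1, y1) with exclusive upper bounds (an assumed layout).
    Pixels that are unchanged are the "ignorable" pixels.
    """
    x0, y0, x1, y1 = tile
    count = 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            if new_image[y][x] != frame_buffer[y][x]:
                count += 1
    return count


# Hypothetical 8x8 frame where three pixels changed since the last frame.
frame_buffer = [[0] * 8 for _ in range(8)]
new_image = [row[:] for row in frame_buffer]
new_image[1][2] = 255
new_image[5][6] = 128
new_image[7][7] = 64
changed = count_pixels_to_render(new_image, frame_buffer, (0, 0, 8, 8))
```

- The resulting count is the value that would be reported to the load balancer for that tile.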
- the emulator 107 may determine how many pixels need to be rendered for each tile by determining whether the tile is entirely within a polygon.
- the emulator 107 may determine whether all corners of the tile lie within the polygon. If all four corners are within the polygon, then that tile is fully covered and it may be easy to apply a texture or calculate RGB values from the top left corner pixel value. If the tile is partially outside the polygon then the pixel values are determined on a per-pixel basis.
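- The four-corner test can be illustrated for the simplest polygon, a triangle. The sketch below uses a sign-of-cross-product point-in-triangle test, which is an assumption chosen for illustration; the source does not say how corner containment is computed. The shortcut is valid here because a triangle is convex, so a tile whose four corners are inside lies entirely inside.

```python
def edge(a, b, p):
    """Signed area term: positive on one side of edge a->b, negative on the other."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def point_in_triangle(p, tri):
    """True if p is inside or on the boundary of tri (either winding order)."""
    a, b, c = tri
    d0, d1, d2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
    has_neg = d0 < 0 or d1 < 0 or d2 < 0
    has_pos = d0 > 0 or d1 > 0 or d2 > 0
    return not (has_neg and has_pos)


def tile_fully_covered(tile, tri):
    """True if all four corner pixels of the tile fall within the triangle."""
    x0, y0, x1, y1 = tile  # exclusive upper bounds (assumed layout)
    corners = [(x0, y0), (x1 - 1, y0), (x0, y1 - 1), (x1 - 1, y1 - 1)]
    return all(point_in_triangle(c, tri) for c in corners)


tri = ((0, 0), (100, 0), (0, 100))
inside = tile_fully_covered((8, 8, 16, 16), tri)        # every corner inside
straddling = tile_fully_covered((48, 48, 56, 56), tri)  # one corner crosses the hypotenuse
```

- A fully covered tile can then take the fast path described above, while a straddling tile falls back to per-pixel evaluation.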
- the load balancer 317 begins assigning tiles 315 to one or more rasterization threads 316 for rasterization at 264 .
- Load balancer 317 distributes the processing load amongst the available rasterization threads 316 so that each thread 316 has approximately the same processing load. Ideally, the load balancer 317 will distribute the tiles 315 such that each rasterization thread 316 will render the same number of pixels.
- FIG. 3A is an example of the load balancer 317 distributing the load across several rasterization threads 316 A , 316 B , 316 C , and 316 D .
- Each of the tiles 315 assigned to a rasterization thread 316 has the number of pixels that need to be rendered indicated (e.g., the topmost tile assigned to rasterization thread 316 A contains four pixels that need to be rendered).
- rasterization thread 316 A is assigned four tiles 315 and a combined nine pixels that need to be rendered.
- the remaining rasterization threads, 316 B , 316 C , and 316 D each have eight pixels that need to be rendered.
- Rasterization threads 316 B and 316 C each have their eight pixels split between four tiles 315 , whereas rasterization thread 316 D has its eight pixels spread amongst only three tiles 315 .
- It should be noted that the number of rasterization threads 316 , tiles 315 , and pixels displayed in FIG. 3A are given as one example, and that there may be a greater or lesser number of each in an emulator 107 . It should also be noted that if a tile does not contain pixels that require rendering, then the thread may not need to process the tile at all.
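- One way to approximate an equal pixel load per thread is a greedy "largest tile first, least-loaded thread next" heuristic, sketched below. The source does not specify the balancing algorithm, so this heuristic, the function name `balance_by_pixel_count`, and the tile identifiers are assumptions for illustration only.

```python
import heapq


def balance_by_pixel_count(tile_counts, num_threads):
    """Assign tiles to threads so per-thread pixel totals are roughly equal.

    tile_counts maps a tile id to the number of pixels that need rendering.
    Tiles with zero pixels are skipped, since no thread needs to process them.
    """
    heap = [(0, t) for t in range(num_threads)]  # (current load, thread index)
    heapq.heapify(heap)
    assignments = [[] for _ in range(num_threads)]
    # Place the heaviest tiles first, always on the least-loaded thread.
    for tile_id, count in sorted(tile_counts.items(), key=lambda kv: -kv[1]):
        if count == 0:
            continue
        load, thread = heapq.heappop(heap)
        assignments[thread].append(tile_id)
        heapq.heappush(heap, (load + count, thread))
    loads = [sum(tile_counts[t] for t in a) for a in assignments]
    return assignments, loads


# Hypothetical per-tile counts delivered by the scan step.
tile_counts = {"t0": 4, "t1": 3, "t2": 2, "t3": 2, "t4": 1, "t5": 0}
assignments, loads = balance_by_pixel_count(tile_counts, num_threads=2)
```

- With this heuristic the gap between the most and least loaded thread never exceeds the largest single tile count, which is usually small compared with a full frame.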
- the rasterization threads 316 begin rasterizing the tiles 315 assigned to them by the load balancer 317 at 265 .
- the rasterization proceeds according to a traditional raster pattern, except that it is limited to the dimensions of a single tile 315 .
- every pixel that must be rendered is delivered to the frame buffer 318 at 266 .
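- A raster scan confined to one tile, with only changed pixels delivered to the frame buffer, might look like the sketch below. The in-memory list-of-rows frame buffer and the name `rasterize_tile` are assumptions; a real emulator would compute each pixel value from the graphics primitives rather than read it from a precomputed image.

```python
def rasterize_tile(new_image, frame_buffer, tile):
    """Scan one tile line by line and write changed pixels to the frame buffer.

    Returns the number of pixels delivered. tile is (x0, y0, x1, y1) with
    exclusive upper bounds (an assumed layout).
    """
    x0, y0, x1, y1 = tile
    written = 0
    for y in range(y0, y1):        # one raster line of the tile at a time
        for x in range(x0, x1):    # step left to right, one pixel per step
            if new_image[y][x] != frame_buffer[y][x]:
                frame_buffer[y][x] = new_image[y][x]
                written += 1
    return written


frame_buffer = [[0] * 16 for _ in range(8)]
new_image = [row[:] for row in frame_buffer]
new_image[2][3] = 9     # falls in the left 8x8 tile
new_image[4][12] = 7    # falls in the right 8x8 tile
left_written = rasterize_tile(new_image, frame_buffer, (0, 0, 8, 8))
```

- Because each call touches only its own tile, several threads can run this routine on different tiles without writing to the same frame buffer locations.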
- the frame buffer 318 may then build the frame 319 that will be displayed on the display 137 of the client device platform 103 at 267 .
- the emulator 107 delivers the frame 319 to the client device platform 103 over the network 160 .
- the emulator 107 may use a video codec to encode the frame 319 before delivering it to the client device platform 103 .
- the client device platform 103 may have a suitable codec configured to decode the encoded frame 319 .
- a set of emulator instructions 470 may be implemented, e.g., by the emulator 107 .
- the emulator instructions 470 may be formed on a nontransitory computer readable medium such as the memory 132 ′ or the mass storage device 134 ′.
- the emulator instructions 470 may also be part of the process control program 133 ′.
- the instructions may include instructing the emulator to set the predetermined size for each tile 315 of the virtual image 320 .
- the emulator 107 may be instructed to scan each of the tiles 315 to determine the number of pixels that need to be rendered.
- the emulator 107 may then be instructed to deliver the number of pixels to be rendered for each tile 315 to the load balancer 317 at 474 .
- the emulator 107 may then be instructed to have the load balancer 317 evenly distribute the processing load between each of the available rasterization threads 316 at 475 .
- hardware e.g., Power VR
- the assignment number is equal to the processor core number.
- there are multiple asynchronous threads e.g., four threads, but not as many threads as queues.
- a queue is a group of tiles that need to be processed.
- Each queue can have a state ID that allows state to be maintained.
- the state for an arbitrary number of tiles may be stored separately, e.g., in a different buffer. Storing the states separately reduces the amount of memory copying that needs to be done.
- the load balancer 317 may then assign an empty thread to a queue that is waiting for rendering. This maintains cache locality by keeping the threads occupied.
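- The arrangement of fewer threads than queues, with an idle thread picking up the next waiting queue, can be sketched with Python's standard `queue` and `threading` modules. The function `process_tile_queues`, the use of the state ID as a dictionary key, and the `render_tile` callback are assumptions for illustration; the source does not describe the data structures involved.

```python
import queue
import threading


def process_tile_queues(tile_queues, num_threads, render_tile):
    """Let a small pool of threads drain a larger set of tile queues.

    tile_queues maps a state ID to the group of tiles sharing that state.
    Each thread takes a whole queue at a time, so tiles that share state
    are processed together on the same thread.
    """
    pending = queue.Queue()
    for state_id, tiles in tile_queues.items():
        pending.put((state_id, tiles))
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                state_id, tiles = pending.get_nowait()
            except queue.Empty:
                return  # no queue is waiting, so this thread is done
            rendered = [render_tile(state_id, t) for t in tiles]
            with lock:
                results[state_id] = rendered

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Six hypothetical queues served by four threads; rendering is a stand-in.
tile_queues = {0: [1, 2], 1: [3], 2: [4, 5, 6], 3: [7], 4: [8, 9], 5: [10]}
results = process_tile_queues(tile_queues, num_threads=4,
                              render_tile=lambda state_id, tile: tile * 2)
```

- Keying the work by state ID rather than copying state into each tile reflects the idea of storing state separately to reduce memory copying.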
- the emulator 107 may be instructed to have the rasterization threads 316 begin rasterizing each of the tiles 315 . During the rasterization, the emulator 107 may be instructed to deliver the rendered pixels to the frame buffer 318 at 477 . The emulator 107 may then be instructed to generate the frame 319 from the pixels in the frame buffer 318 . Thereafter, the emulator 107 may be provided with instructions for delivering the frame 319 to a client device platform 103 over a network 160 .
- FIG. 2B is a flow diagram of a method 201 for implementing the rasterization step in a graphics pipeline with a software based emulator for a GPU on a cloud-based network according to an additional aspect of the present disclosure.
- the emulator 107 may divide a virtual image 320 into smaller tiles 315 .
- the height and width of each tile may be 8 pixels by 8 pixels, 16 pixels by 16 pixels, or 32 pixels by 32 pixels.
- Each tile 315 corresponds to a portion of a frame 319 that may be displayed by the client device platform's display 137 .
- FIG. 3B is a diagram of an emulator system 301 .
- the virtual image 320 contains the graphics primitives 310 that will be rendered to produce a frame 319 that is viewable on the display 137 of the client device platform 103 .
- the graphics primitives 310 are a series of triangles.
- the virtual image 320 may contain any alternative type of graphic primitives, such as, but not limited to, lines, points, arcs, vertices, or any combination thereof.
- the graphics primitives 310 are displayed in two dimensions, but the virtual image 320 may also include three-dimensional objects.
- method 201 continues with the emulator 107 determining if any pixels need to be rendered for each tile at 272 . If there is at least one pixel that needs to be rendered in a tile 315 , then that tile may be designated as a “full” tile 315 . If there are no pixels that need to be rendered in a tile 315 (i.e., all pixels in the tile are ignorable), then that tile may be designated as an “empty” tile 315 .
- a “full” designation will be interpreted by the load balancer 317 as indicating that all pixels in the tile 315 need to be rendered, and an “empty” designation will be interpreted by the load balancer 317 as indicating that none of the pixels in the tile 315 need to be rendered.
- the use of “empty” and “full” designations may improve the scanning speed of the emulator 107 because each tile 315 does not need to be completely scanned. Once a single pixel that requires rendering is detected, the scan of the tile 315 may be ceased. The identification of which tiles 315 are “full” may then be delivered to the load balancer 317 at 273 .
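- The early-exit scan can be sketched as below. It assumes the same row-major pixel layout and `(x0, y0, x1, y1)` tile bounds used for illustration elsewhere; the name `classify_tile` is hypothetical.

```python
def classify_tile(new_image, frame_buffer, tile):
    """Return "full" as soon as one changed pixel is found, else "empty"."""
    x0, y0, x1, y1 = tile  # exclusive upper bounds (assumed layout)
    for y in range(y0, y1):
        for x in range(x0, x1):
            if new_image[y][x] != frame_buffer[y][x]:
                return "full"   # stop scanning: one changed pixel is enough
    return "empty"              # every pixel in the tile is ignorable


frame_buffer = [[0] * 16 for _ in range(8)]
new_image = [row[:] for row in frame_buffer]
new_image[0][1] = 200  # a single change in the left tile
left = classify_tile(new_image, frame_buffer, (0, 0, 8, 8))
right = classify_tile(new_image, frame_buffer, (8, 0, 16, 8))
```

- In the best case the scan stops after the first pixel, whereas an exact count always visits every pixel in the tile.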
- the load balancer 317 begins assigning “full” tiles 315 to one or more rasterization threads 316 for rasterization at 274 .
- Load balancer 317 distributes the processing load amongst the available rasterization threads 316 so that each thread 316 has approximately the same processing load. Ideally, the load balancer 317 will distribute the tiles 315 such that each rasterization thread 316 will render the same number of pixels.
- FIG. 3B illustrates one example of the load balancer 317 distributing the load across several rasterization threads 316 A , 316 B , 316 C , and 316 D .
- each of the tiles 315 assigned to a rasterization thread 316 has been identified as a “full” tile 315 . Therefore, it is assumed that each tile will require that every pixel within it will need to be rendered (e.g., in an 8 pixel by 8 pixel tile, it is assumed that there will be 64 pixels that must be rendered). This simplifies the load balancing, because each rasterization thread will be assigned an equal number of “full” tiles 315 to process. However, it should be noted that if the number of tiles 315 that are designated as “full” is not evenly divisible by the number of available rasterization threads 316 , then there may be one or more threads 316 that are assigned an additional tile 315 to process. As shown in FIG.
- the load may be divided such that three of the rasterization threads 316 A , 316 B , and 316 C are each randomly assigned four “full” tiles, and the fourth rasterization thread 316 D is randomly assigned three “full” tiles.
- the use of randomization ensures that the load of each rasterization thread 316 will be approximately even. It should be noted that the number of rasterization threads 316 , tiles 315 , and pixels displayed in FIG. 3B are given as one example, and that there may be a greater or lesser number of each in an emulator 107 . It should also be noted that if a tile does not contain pixels that require rendering, then the thread may not need to process the tile at all.
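- Random, near-equal assignment of "full" tiles can be sketched with a shuffle followed by round-robin slicing. The function name `distribute_full_tiles` and the optional `seed` parameter are assumptions added for reproducibility; the source only says that tiles are randomly assigned.

```python
import random


def distribute_full_tiles(full_tiles, num_threads, seed=None):
    """Shuffle the "full" tiles and deal them out round-robin to the threads.

    Thread sizes differ by at most one tile when the count is not evenly
    divisible by the number of threads.
    """
    rng = random.Random(seed)
    shuffled = list(full_tiles)
    rng.shuffle(shuffled)
    return [shuffled[i::num_threads] for i in range(num_threads)]


# Fifteen hypothetical "full" tiles shared between four threads.
assignment = distribute_full_tiles(range(15), num_threads=4, seed=0)
sizes = [len(a) for a in assignment]
```

- With fifteen tiles and four threads, three threads receive four tiles and one receives three, matching the split described above.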
- the rasterization threads 316 begin rasterizing the tiles 315 assigned to them by the load balancer 317 at 275 .
- the rasterization proceeds according to a traditional raster pattern, except that it is limited to the dimensions of a single tile 315 .
- every pixel that must be rendered is delivered to the frame buffer 318 at 276 .
- the frame buffer 318 may then build the frame 319 that will be displayed on the display 137 of the client device platform 103 at 277 .
- the emulator 107 delivers the frame 319 to the client device platform 103 over the network 160 .
- the emulator 107 may use a video codec to encode the frame 319 before delivering it to the client device platform 103 .
- the client device platform 103 may have a suitable codec configured to decode the encoded frame 319 .
- a set of emulator instructions 480 may be implemented, e.g., by the emulator 107 .
- the emulator instructions 480 may be formed on a nontransitory computer readable medium such as the memory 132 ′ or the mass storage device 134 ′.
- the emulator instructions 480 may also be part of the process control program 133 ′.
- the instructions may include instructing the emulator to set the predetermined size for each tile 315 of the virtual image 320 .
- the emulator 107 may be instructed to scan each of the tiles 315 to determine if each tile is “full” or “empty”.
- the emulator 107 may then be instructed to deliver the identities of each “full” tile 315 to the load balancer 317 at 474 .
- the emulator 107 may then be instructed to have the load balancer 317 evenly distribute the processing load between each of the available rasterization threads 316 at 475 .
- the emulator 107 may be instructed to have the rasterization threads 316 begin rasterizing each of the tiles 315 .
- the emulator 107 may be instructed to deliver the rendered pixels to the frame buffer 318 at 477 .
- the emulator 107 may then be instructed to generate the frame 319 from the pixels in the frame buffer 318 .
- the emulator 107 may be provided with instructions for delivering the frame 319 to a client device platform 103 over a network 160 .
- certain aspects of the present disclosure may be used to facilitate distribution of the processing load for rasterization of a virtual image containing graphics primitives through the use of tiling. Tiling makes it possible to determine the processing loads that need to be distributed.
Abstract
Aspects of the present disclosure describe a software based emulator of a graphics processing unit (GPU) that is configured to operate over a cloud-based network. A virtual image containing graphics primitives is divided into a plurality of tiles. A load balancer assigns tiles to rasterization threads in order to evenly distribute the processing load. The rasterization threads then rasterize their assigned tiles and deliver rendered pixels to a frame buffer. The frame buffer builds a frame from the rendered pixels and then delivers the frame over the network to a client device platform. It is emphasized that this abstract is provided to comply with the rules requiring an abstract that will allow a searcher or other reader to quickly ascertain the subject matter of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.
Description
- This application is related to commonly-assigned, co-pending provisional application Ser. No. 61/666,628, (Attorney Docket Number SCEA12004US00) filed Jun. 29, 2012, and entitled “DETERMINING TRIGGERS FOR CLOUD-BASED EMULATED GAMES”, the entire disclosures of which are incorporated herein by reference.
- This application is related to commonly-assigned, co-pending provisional application Ser. No. 61/666,645, (Attorney Docket Number SCEA12005US00) filed Jun. 29, 2012, and entitled “HAPTIC ENHANCEMENTS FOR EMULATED VIDEO GAME NOT ORIGINALLY DESIGNED WITH HAPTIC CAPABILITIES”, the entire disclosures of which are incorporated herein by reference.
- This application is related to commonly-assigned, co-pending provisional application Ser. No. 61/666,665, (Attorney Docket Number SCEA12006US00) filed Jun. 29, 2012, and entitled “CONVERSION OF HAPTIC EVENTS INTO SCREEN EVENTS”, the entire disclosures of which are incorporated herein by reference.
- This application is related to commonly-assigned, co-pending provisional application Ser. No. 61/666,679, (Attorney Docket Number SCEA12007US00) filed Jun. 29, 2012, and entitled “SUSPENDING STATE OF CLOUD-BASED LEGACY APPLICATIONS”, the entire disclosures of which are incorporated herein by reference.
- This application is related to commonly-assigned, co-pending application Ser. No. 13/631,725, (Attorney Docket Number SCEA12008US00), filed the same day as the present application, and entitled “REPLAY AND RESUMPTION OF SUSPENDED GAME” to Brian Michael Christopher Watson, Victor Octav Suba Miura, Jacob P. Stine and Nicholas J. Cardell, the entire disclosures of which are incorporated herein by reference.
- This application is related to commonly-assigned, co-pending application Ser. No. 13/631,740, (Attorney Docket Number SCEA12009US00), filed the same day as the present application, and entitled “METHOD FOR CREATING A MINI-GAME” to Brian Michael Christopher Watson, Victor Octav Suba Miura, and Jacob P. Stine, the entire disclosures of which are incorporated herein by reference.
- This application is related to commonly-assigned, co-pending application Ser. No. 13/631,785, (Attorney Docket Number SCEA12010US00), filed the same day as the present application, and entitled “PRE-LOADING TRANSLATED CODE IN CLOUD BASED EMULATED APPLICATIONS”, to Jacob P. Stine, Victor Octav Suba Miura, Brian Michael Christopher Watson, and Nicholas J. Cardell the entire disclosures of which are incorporated herein by reference.
- This application is related to commonly-assigned, co-pending application serial no. 13/, (Attorney Docket Number SCEA12012US00), filed the same day as the present application, entitled “METHOD AND APPARATUS FOR IMPROVING EFFICIENCY WITHOUT INCREASING LATENCY IN EMULATION OF A LEGACY APPLICATION TITLE”, to Jacob P. Stine and Victor Octav Suba Miura, the entire disclosures of which are incorporated herein by reference.
- The present disclosure is related to video game emulation. Among other things, this application describes a method and apparatus for emulating a graphics processing unit (GPU) over a cloud based network with tile-based rasterization.
- In three dimensional graphics rendering, a graphics processing unit (GPU) may transform a three-dimensional virtual object into a two-dimensional image that may be displayed on a screen. The GPU may use one or more graphics pipelines for processing information initially provided to the GPU, such as graphics primitives. Graphics primitives are properties that are used to describe a three-dimensional object that is being rendered. By way of example, graphics primitives may be lines, triangles, or vertices that form a three dimensional object when combined. Each of the graphics primitives may contain additional information to further define the three dimensional object such as, but not limited to X-Y-Z coordinates, red-green-blue (RGB) values, translucency, texture, and reflectivity.
- A critical step in a graphics pipeline is the rasterization step. Rasterization is the process by which the graphics primitives describing the three-dimensional object are transformed into a two-dimensional image representation of the scene. The two-dimensional image is composed of individual pixels, each of which may contain unique RGB values. Typically, the GPU will rasterize a three-dimensional image by stepping across the entire three-dimensional object in a raster pattern along a two dimensional plane. Each step along a raster line represents one pixel. At each step, the GPU must determine if the pixel should be rendered and delivered to the frame buffer. If the pixel has not changed from a previous rendering, then there is no need to deliver an updated pixel to the frame buffer. Therefore, each raster line may have a variable number of pixels that must be processed. In order to quickly process the three-dimensional object, a plurality of rasterization threads may each be assigned one or more of the raster lines to process, and the rasterization threads may be executed in parallel.
- When a GPU is being emulated through software, the processing capabilities may not be as efficient or as highly optimized as they would be in the original hardware based GPU. Therefore, if the processing load on each rasterization thread is not properly balanced, a delay or latency in the execution of the rasterization may develop. Further, it is difficult to predict the number of pixels that will be rendered along each raster line before it is processed. Without knowing a priori the processing load each rasterization thread is assigned, it is difficult to ensure that load can be evenly balanced.
- In order to prevent latencies, the emulation software may dedicate an increased number of available rasterization threads to the rasterization process. This increases the demand on the processor running the emulation software. Also, in the case of cloud-based services, the number of instances of the emulation software that will be running at a given time is not known beforehand. If the emulation software requires extensive processing power, then scaling the system for increased users becomes prohibitively expensive. By way of example, during peak usage hours, there may be many instances of the emulator being executed on the network. This requires that resources such as processing power be used as efficiently as possible.
- Further, processing efficiency cannot be gained by decreasing the frame rate that the emulator is capable of producing. The frame rate should ideally remain above 24 frames per second in order to ensure smooth animation. In order to provide a scalable software emulator of a GPU that is implemented over a cloud-based network, a rasterization method that allows for efficient load balancing is needed.
- It is within this context that aspects of the present disclosure arise.
- FIG. 1 is a schematic diagram of a snapshot generator and an emulator communicating over a network according to an aspect of the present disclosure.
- FIGS. 2A-2B are flow diagrams of methods for using tile-based rasterization as part of a software based emulation of a GPU implemented over a cloud-based network according to various aspects of the present disclosure.
- FIGS. 3A-3B are schematics of software based emulators of a GPU implemented over a cloud-based network that are configured to rasterize a virtual image with tile based rasterization according to various aspects of the present disclosure.
- FIGS. 4A-4B are block diagrams describing the instructions for how a software based emulator of a GPU implemented over a cloud-based network rasterizes a virtual image with tile based rasterization according to various aspects of the present disclosure.
- Although the following detailed description contains many specific details for the purposes of illustration, anyone of ordinary skill in the art will appreciate that many variations and alterations to the following details are within the scope of the present disclosure. Accordingly, the aspects of the present disclosure described below are set forth without any loss of generality to, and without imposing limitations upon, the claims that follow this description.
- Aspects of the present disclosure describe a software based emulator of a graphics processing unit (GPU) that is configured to operate over a cloud-based network. A virtual image containing graphics primitives is first divided into a plurality of tiles. Each of the tiles has a predetermined number of image pixels. The emulator may then scan each of the tiles to determine how many of the image pixels in each tile need to be rendered. The number of pixels that need to be rendered for each tile is then delivered to a load balancer. The load balancer distributes the processing between rasterization threads. Each rasterization thread will be assigned approximately the same total number of pixels to be rendered. The rasterization threads then rasterize their assigned tiles, and render the pixels that require rendering. Additionally, the rasterization threads may deliver the rendered pixels to a frame buffer. The frame buffer builds a frame from the rendered pixels and then delivers the frame over the network to a client device platform.
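The balancing step described above, in which each rasterization thread receives approximately the same total number of pixels to render, can be sketched as follows. This is a minimal illustration only: the disclosure does not specify a balancing algorithm, so the greedy largest-first heuristic, the function name `balance_by_pixel_count`, and the dictionary of per-tile counts are all assumptions made for the example.

```python
import heapq


def balance_by_pixel_count(tile_pixel_counts, num_threads):
    """Assign tiles to threads so per-thread pixel totals are roughly equal.

    tile_pixel_counts maps a (hypothetical) tile ID to the number of pixels
    in that tile that need to be rendered. Greedy largest-first assignment
    is an assumed heuristic, not a method taken from the disclosure.
    """
    # Min-heap of (pixels assigned so far, thread index).
    heap = [(0, t) for t in range(num_threads)]
    heapq.heapify(heap)
    assignments = [[] for _ in range(num_threads)]

    # Visit tiles from the most pixels to the fewest.
    order = sorted(tile_pixel_counts, key=tile_pixel_counts.get, reverse=True)
    for tile_id in order:
        count = tile_pixel_counts[tile_id]
        if count == 0:
            continue  # tile contains only ignorable pixels; no thread needs it
        load, thread = heapq.heappop(heap)  # least-loaded thread so far
        assignments[thread].append(tile_id)
        heapq.heappush(heap, (load + count, thread))
    return assignments


if __name__ == "__main__":
    # Made-up per-tile counts for illustration.
    counts = dict(enumerate([4, 3, 1, 1, 2, 2, 2, 2, 3, 2, 2, 1, 4, 1, 2, 1]))
    for thread, tiles in enumerate(balance_by_pixel_count(counts, 4)):
        print(thread, tiles, sum(counts[t] for t in tiles))
```

With this heuristic the heaviest and lightest threads never differ by more than the pixel count of the single largest tile, which is one way to read "approximately the same" in the paragraph above.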
- Additional aspects of the present disclosure describe a software based emulator of a GPU that is configured to operate over a cloud-based network. A virtual image containing graphics primitives is first divided into a plurality of tiles. Each of the tiles has a predetermined number of image pixels. The emulator may then scan each of the tiles to determine if any of the image pixels that are within a tile need to be rendered. Pixels that do not need to be rendered are sometimes referred to herein as “ignorable” pixels. If at least one image pixel in a tile needs to be rendered, then a message is sent to a load balancer indicating that the tile is “full”. Once each tile has been scanned, the load balancer can divide the “full” tiles evenly between the available rasterization threads. Each rasterization thread then rasterizes the assigned tiles and delivers the rendered pixels to a frame buffer. The frame buffer builds a frame from the rendered pixels and then delivers the frame over the network to a client device platform.
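The “full” or “empty” scan described above can stop at the first pixel that needs rendering. The sketch below illustrates that early exit. It assumes, for illustration only, that a tile is a list of rows of comparable pixel values and that the buffered copy of the same tile is available for comparison; `tile_is_full` and `find_full_tiles` are hypothetical names.

```python
def tile_is_full(new_tile, buffered_tile):
    """Return True as soon as one pixel differs from the buffered copy."""
    for new_row, old_row in zip(new_tile, buffered_tile):
        for new_px, old_px in zip(new_row, old_row):
            if new_px != old_px:
                return True  # one changed pixel is enough; stop scanning
    return False  # every pixel is ignorable, so the tile is "empty"


def find_full_tiles(new_tiles, buffered_tiles):
    """Return the IDs of the tiles to report to the load balancer as "full"."""
    return [
        tile_id
        for tile_id, tile in new_tiles.items()
        if tile_is_full(tile, buffered_tiles[tile_id])
    ]
```

Because the scan returns at the first difference, a tile with many changed pixels is usually classified after inspecting only a few of them, while an “empty” tile still has to be scanned completely.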
- FIG. 1 is a schematic of an embodiment of the present invention. Emulator 107 may be accessed by a client device platform 103 over a network 160. Client device platform 103 may access alternative emulators 107 over the network 160. Emulators 107 may be identical to each other, or they may each be programmed to emulate unique legacy game titles 106 or unique sets of legacy game titles 106.
- Client device platform 103 may include a central processor unit (CPU) 131. By way of example, a CPU 131 may include one or more processors, which may be configured according to, e.g., a dual-core, quad-core, multi-core, or Cell processor architecture. The client device platform 103 may also include a memory 132 (e.g., RAM, DRAM, ROM, and the like). The CPU 131 may execute a process-control program 133, portions of which may be stored in the memory 132. The client device platform 103 may also include well-known support circuits 140, such as input/output (I/O) circuits 141, power supplies (P/S) 142, a clock (CLK) 143 and cache 144. The client device platform 103 may optionally include a mass storage device 134 such as a disk drive, CD-ROM drive, tape drive, or the like to store programs and/or data. The client device platform 103 may also optionally include a display unit 137 and a user interface unit 138 to facilitate interaction between the client device platform 103 and a user. The display unit 137 may be in the form of a cathode ray tube (CRT) or flat panel screen that displays text, numerals, or graphical symbols. The user interface unit 138 may include a keyboard, mouse, joystick, light pen, or other device. A controller 145 may be connected to the client device platform 103 through the I/O circuit 141 or it may be directly integrated into the client device platform 103. The controller 145 may facilitate interaction between the client device platform 103 and a user. The controller 145 may include a keyboard, mouse, joystick, light pen, hand-held controls or other device. The controller 145 may be capable of generating a haptic response 146. By way of example and not by way of limitation, the haptic response 146 may be vibrations or any other feedback corresponding to the sense of touch. The client device platform 103 may include a network interface 139, configured to enable the use of Wi-Fi, an Ethernet port, or other communication methods.
- The network interface 139 may incorporate suitable hardware, software, firmware or some combination of two or more of these to facilitate communication via an electronic communications network 160. The network interface 139 may be configured to implement wired or wireless communication over local area networks and wide area networks such as the Internet. The client device platform 103 may send and receive data and/or requests for files via one or more data packets over the network 160.
- The preceding components may exchange signals with each other via an internal system bus 150. The client device platform 103 may be a general purpose computer that becomes a special purpose computer when running code that implements embodiments of the present invention as described herein.
- The emulator 107 may include a central processor unit (CPU) 131′. By way of example, a CPU 131′ may include one or more processors, which may be configured according to, e.g., a dual-core, quad-core, multi-core, or Cell processor architecture. The emulator 107 may also include a memory 132′ (e.g., RAM, DRAM, ROM, and the like). The CPU 131′ may execute a process-control program 133′, portions of which may be stored in the memory 132′. The emulator 107 may also include well-known support circuits 140′, such as input/output (I/O) circuits 141′, power supplies (P/S) 142′, a clock (CLK) 143′ and cache 144′. The emulator 107 may optionally include a mass storage device 134′ such as a disk drive, CD-ROM drive, tape drive, or the like to store programs and/or data. The emulator 107 may also optionally include a display unit 137′ and user interface unit 138′ to facilitate interaction between the emulator 107 and a user who requires direct access to the emulator 107. By way of example and not by way of limitation, a snapshot generator or engineer 102 may need direct access to the emulator 107 in order to program the emulator 107 to properly emulate a desired legacy game 106 or to add additional mini-game capabilities to a legacy game 106. The display unit 137′ may be in the form of a cathode ray tube (CRT) or flat panel screen that displays text, numerals, or graphical symbols. The user interface unit 138′ may include a keyboard, mouse, joystick, light pen, or other device. The emulator 107 may include a network interface 139′, configured to enable the use of Wi-Fi, an Ethernet port, or other communication methods.
- The network interface 139′ may incorporate suitable hardware, software, firmware or some combination of two or more of these to facilitate communication via the electronic communications network 160. The network interface 139′ may be configured to implement wired or wireless communication over local area networks and wide area networks such as the Internet. The emulator 107 may send and receive data and/or requests for files via one or more data packets over the network 160.
- The preceding components may exchange signals with each other via an internal system bus 150′. The emulator 107 may be a general purpose computer that becomes a special purpose computer when running code that implements embodiments of the present invention as described herein.
- Emulator 107 may access a legacy game 106 that has been selected by the client device platform 103 for emulation through the internal system bus 150′. There may be more than one legacy game 106 stored in the emulator. The legacy games may also be stored in the memory 132′ or in the mass storage device 134′. Additionally, one or more legacy games 106 may be stored at a remote location accessible to the emulator 107 over the network 160. Each legacy game 106 contains game code 108. When the legacy game 106 is emulated, the game code 108 produces legacy game data 109.
- By way of example, a legacy game 106 may be any game that is not compatible with a target platform. By way of example and not by way of limitation, the legacy game 106 may have been designed to be played on Sony Computer Entertainment's PlayStation console, but the target platform is a home computer. By way of example, the legacy game 106 may have been designed to be played on a PlayStation 2 console, but the target platform is a PlayStation 3 console. Further, by way of example and not by way of limitation, a legacy game 106 may have been designed to be played on a PlayStation console, but the target platform is a hand held console such as the PlayStation Vita from Sony Computer Entertainment.
- Emulator 107 may be a deterministic emulator. A deterministic emulator is an emulator that may process a given set of game inputs the same way every time that the same set of inputs is provided to the emulator 107. This may be accomplished by eliminating any dependencies in the code run by the emulator 107 that depend from an asynchronous activity. Asynchronous activities are events that occur independently of the main program flow. This means that actions may be executed in a non-blocking scheme in order to allow the main program flow to continue processing. Therefore, by way of example, and not by way of limitation, the emulator 107 may be deterministic when the dependencies in the code depend from basic blocks that always begin and end with synchronous activity. By way of example, basic blocks may be predetermined increments of code at which the emulator 107 checks for external events or additional game inputs. The emulator 107 may also wait for anything that runs asynchronously within a system component to complete before proceeding to the next basic block. A steady state within the emulator 107 may be when all of the basic blocks are in lock step.
- FIG. 2A is a flow diagram of a method 200 for implementing the rasterization step in a graphics pipeline with a software based emulator for a GPU on a cloud-based network. At 261 the emulator 107 may divide a virtual image 320 into smaller tiles 315. By way of example, and not by way of limitation, the height and width of each tile may be 8 pixels by 8 pixels, 16 pixels by 16 pixels, or 32 pixels by 32 pixels. Each tile 315 corresponds to a portion of a frame 319 that may be displayed by the client device platform's display 137.
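The division at 261 can be illustrated with a short sketch that enumerates square tiles over an image of a given size. The function name `divide_into_tiles`, the `(x, y, width, height)` tuple layout, and the clipping of tiles at the right and bottom edges are assumptions made for the example; the disclosure only states that each tile has a predetermined size.

```python
def divide_into_tiles(image_width, image_height, tile_size=16):
    """Return (x, y, width, height) rectangles that cover the image.

    Tiles are tile_size by tile_size pixels. Tiles on the right and bottom
    edges are clipped when the image dimensions are not multiples of
    tile_size (an assumption, not a detail from the disclosure).
    """
    tiles = []
    for y in range(0, image_height, tile_size):
        for x in range(0, image_width, tile_size):
            width = min(tile_size, image_width - x)
            height = min(tile_size, image_height - y)
            tiles.append((x, y, width, height))
    return tiles
```

Smaller tiles give the load balancer finer-grained units to distribute, at the cost of more tiles to scan and track.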
- FIG. 3A is a diagram of an emulator system 300. In FIG. 3A the arrows represent the flow of data between components. The virtual image 320 contains the graphics primitives 310 that will be rendered to produce a frame 319 that is viewable on the display 137 of the client device platform 103. The graphics primitives 310 shown in FIG. 3A are a series of triangles. However, it should be noted that the virtual image 320 may contain any alternative type of graphic primitives, such as, but not limited to, lines, points, arcs, vertices, or any combination thereof. Additionally, the graphics primitives 310 are displayed in two dimensions, but the virtual image 320 may also include three-dimensional objects.
- Once the virtual image 320 has been divided into the tiles 315, method 200 continues with the emulator 107 determining which tiles 315 have pixels that need to be rendered at 262. Each tile 315 will be scanned by the emulator 107 to determine how many of the pixels within the tile 315 need to be rendered. A pixel needs to be rendered if the value of the new pixel for the frame 319 being rasterized is different from the value of the pixel presently stored in the frame buffer 318. Otherwise, the pixel is “ignorable”. By way of example, and not by way of limitation, a pixel value may include X-Y-Z coordinates, RGB values, translucency, texture, reflectivity or any combination thereof. The number of pixels that need to be rendered for a given tile 315 may then be delivered to the load balancer 317 at 263.
- By way of example, and not by way of limitation, the emulator 107 may determine how many pixels need to be rendered for each tile by determining whether the tile is entirely within a polygon. Each polygon is defined by its vertices. Two vertices of a polygon may be used to generate a line equation in the form of Ax+By+C=0. Each polygon may be made up of multiple lines. Once the size and location of the polygon have been defined, the emulator 107 may determine whether all corners of the tile lie within the polygon. If all four corners are within the polygon, then that tile is fully covered and it may be easy to apply a texture or calculate RGB values from the top left corner pixel value. If the tile is partially outside the polygon, then the pixel values are determined on a per-pixel basis.
- The load balancer 317 begins assigning tiles 315 to one or more rasterization threads 316 for rasterization at 264. Load balancer 317 distributes the processing load amongst the available rasterization threads 316 so that each thread 316 has approximately the same processing load. Ideally, the load balancer 317 will distribute the tiles 315 such that each rasterization thread 316 will render the same number of pixels. FIG. 3A is an example of the load balancer 317 distributing the load across several rasterization threads 316 A, 316 B, 316 C, and 316 D. Each of the tiles 315 assigned to a rasterization thread 316 has the number of pixels that need to be rendered indicated (e.g., the topmost tile assigned to rasterization thread 316 A contains four pixels that need to be rendered). By way of example, rasterization thread 316 A is assigned four tiles 315 and a combined nine pixels that need to be rendered. The remaining rasterization threads, 316 B, 316 C, and 316 D, each have eight pixels that need to be rendered. Rasterization threads 316 B and 316 C each have their eight pixels split between four tiles 315, whereas rasterization thread 316 D has its eight pixels spread amongst only three tiles 315. It should be noted that the number of rasterization threads 316, tiles 315, and pixels displayed in FIG. 3A are given as one example, and that there may be a greater or lesser number of each in an emulator 107. It should also be noted that if a tile does not contain pixels that require rendering, then the thread may not need to process the tile at all.
- According to method 200, the rasterization threads 316 begin rasterizing the tiles 315 assigned to them by the load balancer 317 at 265. The rasterization proceeds according to a traditional raster pattern, except that it is limited to the dimensions of a single tile 315. During the rasterization, every pixel that must be rendered is delivered to the frame buffer 318 at 266. The frame buffer 318 may then build the frame 319 that will be displayed on the display 137 of the client device platform 103 at 267. At 268, the emulator 107 delivers the frame 319 to the client device platform 103 over the network 160. Additionally, the emulator 107 may use a video codec to encode the frame 319 before delivering it to the client device platform 103. The client device platform 103 may have a suitable codec configured to decode the encoded frame 319.
- As shown in FIG. 4A, a set of emulator instructions 470 may be implemented, e.g., by the emulator 107. The emulator instructions 470 may be formed on a nontransitory computer readable medium such as the memory 132′ or the mass storage device 134′. The emulator instructions 470 may also be part of the process control program 133′. At 472, the instructions may include instructing the emulator to set the predetermined size for each tile 315 of the virtual image 320. Thereafter at 473, the emulator 107 may be instructed to scan each of the tiles 315 to determine the number of pixels that need to be rendered. The emulator 107 may then be instructed to deliver the number of pixels to be rendered for each tile 315 to the load balancer 317 at 474. The emulator 107 may then be instructed to have the load balancer evenly distribute the processing load between each of the available rasterization threads 316 at 475.
- By way of example, in a static load balancing arrangement, hardware (e.g., PowerVR) statically assigns responsibility for different tiles to different processors. The assignment number is equal to the processor core number. However, in a dynamic case, there are multiple asynchronous threads, e.g., four threads, but not as many threads as queues. A queue is a group of tiles that need to be processed. Each queue can have a state ID that allows state to be maintained. The state for an arbitrary number of tiles may be stored separately, e.g., in a different buffer. Storing the states separately reduces the amount of memory copying that needs to be done. By way of example, there may be one or more queues. The load balancer 317 may then assign an empty thread to a queue that is waiting for rendering. This maintains cache locality by keeping the threads occupied.
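The dynamic arrangement described above, in which an idle thread is handed the next queue of tiles that is waiting for rendering, can be sketched with Python's standard `queue` and `threading` modules. The sketch is illustrative only: `run_dynamic`, the `(state_id, tiles)` tuple layout, the `states` dictionary standing in for a separate state buffer, and the `rasterize_tile` callback are hypothetical names, and Python threads are used here to show the scheduling pattern rather than to achieve real parallel speedup.

```python
import queue
import threading


def run_dynamic(tile_queues, states, num_threads, rasterize_tile):
    """Let idle worker threads pull waiting tile queues until none remain.

    tile_queues: list of (state_id, [tile, ...]) groups of tiles.
    states: dict mapping state_id to shared state, stored apart from the
        tiles so that it is looked up by ID rather than copied per tile.
    rasterize_tile: caller-supplied function (tile, state) -> rendered data.
    """
    pending = queue.Queue()
    for group in tile_queues:
        pending.put(group)

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                state_id, tiles = pending.get_nowait()
            except queue.Empty:
                return  # no queue is waiting, so this thread is done
            state = states[state_id]
            rendered = [rasterize_tile(tile, state) for tile in tiles]
            with results_lock:
                results.append((state_id, rendered))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because each worker takes a whole queue at a time, a thread that finishes early simply picks up the next waiting queue, so there can be more queues than threads without any static tile-to-thread mapping.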
- Next at 476, the emulator 107 may be instructed to have the rasterization threads 316 begin rasterizing each of the tiles 315. During the rasterization, the emulator 107 may be instructed to deliver the rendered pixels to the frame buffer 318 at 477. The emulator 107 may then be instructed to generate the frame 319 from the pixels in the frame buffer 318. Thereafter, the emulator 107 may be provided with instructions for delivering the frame 319 to a client device platform 103 over a network 160.
- FIG. 2B is a flow diagram of a method 201 for implementing the rasterization step in a graphics pipeline with a software based emulator for a GPU on a cloud-based network according to an additional aspect of the present disclosure. At 271 the emulator 107 may divide a virtual image 320 into smaller tiles 315. By way of example, and not by way of limitation, the height and width of each tile may be 8 pixels by 8 pixels, 16 pixels by 16 pixels, or 32 pixels by 32 pixels. Each tile 315 corresponds to a portion of a frame 319 that may be displayed by the client device platform's display 137.
- FIG. 3B is a diagram of an emulator system 301. The virtual image 320 contains the graphics primitives 310 that will be rendered to produce a frame 319 that is viewable on the display 137 of the client device platform 103. In the example shown in FIG. 3B, the graphics primitives 310 are a series of triangles. However, it should be noted that the virtual image 320 may contain any alternative type of graphic primitives, such as, but not limited to, lines, points, arcs, vertices, or any combination thereof. Additionally, the graphics primitives 310 are displayed in two dimensions, but the virtual image 320 may also include three-dimensional objects.
- Once the virtual image 320 has been divided into the tiles 315, method 201 continues with the emulator 107 determining if any pixels need to be rendered for each tile at 272. If there is at least one pixel that needs to be rendered in a tile 315, then that tile may be designated as a “full” tile 315. If there are no pixels that need to be rendered in a tile 315 (i.e., all pixels in the tile are ignorable), then that tile may be designated as an “empty” tile 315. A “full” designation will be interpreted by the load balancer 317 as indicating that all pixels in the tile 315 need to be rendered, and an “empty” designation will be interpreted by the load balancer 317 as indicating that none of the pixels in the tile 315 need to be rendered. The use of “empty” and “full” designations may improve the scanning speed of the emulator 107 because each tile 315 does not need to be completely scanned. Once a single pixel that requires rendering is detected, the scan of the tile 315 may be ceased. The identification of which tiles 315 are “full” may then be delivered to the load balancer 317 at 273.
- The load balancer 317 begins assigning “full” tiles 315 to one or more rasterization threads 316 for rasterization at 274. Load balancer 317 distributes the processing load amongst the available rasterization threads 316 so that each thread 316 has approximately the same processing load. Ideally, the load balancer 317 will distribute the tiles 315 such that each rasterization thread 316 will render the same number of pixels. FIG. 3B illustrates one example of the load balancer 317 distributing the load across several rasterization threads 316 A, 316 B, 316 C, and 316 D. In this example, each of the tiles 315 assigned to a rasterization thread 316 has been identified as a “full” tile 315. Therefore, it is assumed that every pixel within each tile will need to be rendered (e.g., in an 8 pixel by 8 pixel tile, it is assumed that there will be 64 pixels that must be rendered). This simplifies the load balancing, because each rasterization thread will be assigned an equal number of “full” tiles 315 to process. However, it should be noted that if the number of tiles 315 that are designated as “full” is not evenly divisible by the number of available rasterization threads 316, then there may be one or more threads 316 that are assigned an additional tile 315 to process. As shown in FIG. 3B there are 15 tiles 315 that have been indicated as “full”. Therefore, the load may be divided such that three of the rasterization threads 316 A, 316 B, and 316 C are each randomly assigned four “full” tiles, and the fourth rasterization thread 316 D is randomly assigned three “full” tiles. The use of randomization ensures that the load of each rasterization thread 316 will be approximately even. It should be noted that the number of rasterization threads 316, tiles 315, and pixels displayed in FIG. 3B are given as one example, and that there may be a greater or lesser number of each in an emulator 107.
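The random, near-even split of “full” tiles described above can be sketched as a shuffle followed by dealing the tiles out in round-robin order. The function name `assign_full_tiles` and the optional `seed` parameter are assumptions added for the example so that a run can be reproduced; the disclosure states only that the tiles are randomly assigned.

```python
import random


def assign_full_tiles(full_tiles, num_threads, seed=None):
    """Randomly deal "full" tiles to threads so tile counts differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(full_tiles)
    rng.shuffle(shuffled)  # randomize which thread receives which tile
    # Round-robin deal: thread t receives items t, t + n, t + 2n, ...
    return [shuffled[t::num_threads] for t in range(num_threads)]
```

For 15 “full” tiles and four threads, the deal yields three threads with four tiles each and one thread with three tiles, matching the split described for FIG. 3B.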
It should also be noted that if a tile does not contain pixels that require rendering, then the thread may not need to process the tile at all.
- According to method 201, the rasterization threads 316 begin rasterizing the tiles 315 assigned to them by the load balancer 317 at 275. The rasterization proceeds according to a traditional raster pattern, except that it is limited to the dimensions of a single tile 315. During the rasterization, every pixel that must be rendered is delivered to the frame buffer 318 at 276. The frame buffer 318 may then build the frame 319 that will be displayed on the display 137 of the client device platform 103 at 277. At 278, the emulator 107 delivers the frame 319 to the client device platform 103 over the network 160. Additionally, the emulator 107 may use a video codec to encode the frame 319 before delivering it to the client device platform 103. The client device platform 103 may have a suitable codec configured to decode the encoded frame 319.
- As shown in FIG. 4B, a set of emulator instructions 480 may be implemented, e.g., by the emulator 107. The emulator instructions 480 may be formed on a nontransitory computer readable medium such as the memory 132′ or the mass storage device 134′. The emulator instructions 480 may also be part of the process control program 133′. At 482, the instructions may include instructing the emulator to set the predetermined size for each tile 315 of the virtual image 320. Thereafter at 483, the emulator 107 may be instructed to scan each of the tiles 315 to determine if each tile is “full” or “empty”. The emulator 107 may then be instructed to deliver the identities of each “full” tile 315 to the load balancer 317 at 474. The emulator 107 may then be instructed to have the load balancer 317 evenly distribute the processing load between each of the available rasterization threads 316 at 475. Next at 476, the emulator 107 may be instructed to have the rasterization threads 316 begin rasterizing each of the tiles 315. During the rasterization, the emulator 107 may be instructed to deliver the rendered pixels to the frame buffer 318 at 477. The emulator 107 may then be instructed to generate the frame 319 from the pixels in the frame buffer 318. Thereafter, the emulator 107 may be provided with instructions for delivering the frame 319 to a client device platform 103 over a network 160.
- As may be seen from the foregoing, certain aspects of the present disclosure may be used to facilitate distribution of the processing load for rasterization of a virtual image containing graphics primitives through the use of tiling. Tiling makes it possible to determine the processing loads that need to be distributed.
- While the above is a complete description of the preferred embodiment of the present invention, it is possible to use various alternatives, modifications and equivalents. Therefore, the scope of the present invention should be determined not with reference to the above description but should, instead, be determined with reference to the appended claims, along with their full scope of equivalents. Any feature described herein, whether preferred or not, may be combined with any other feature described herein, whether preferred or not. In the claims that follow, the indefinite article “A”, or “An” refers to a quantity of one or more of the item following the article, except where expressly stated otherwise. The appended claims are not to be interpreted as including means-plus-function limitations, unless such a limitation is explicitly recited in a given claim using the phrase “means for.”
Claims (11)
1. A nontransitory computer readable medium containing program instructions for rasterizing a virtual image, wherein the virtual image comprises one or more graphic primitives, and wherein execution of the program instructions by one or more processors of a computer system causes the one or more processors to carry out a method, the method comprising:
a) dividing the virtual image to be rasterized into a plurality of tiles, wherein each of the tiles includes a predetermined number of image pixels, wherein each of the image pixels is either an ignorable pixel or a pixel that needs to be rendered;
b) determining how many of the image pixels in each of the tiles are pixels that need to be rendered;
c) assigning each of the plurality of tiles to one of a plurality of rasterization threads, wherein each rasterization thread is assigned a quantity of tiles such that a total number of pixels that need to be rendered by each rasterization thread is approximately the same;
d) rasterizing each of the plurality of tiles with the rasterization threads, wherein pixels that need to be rendered are rendered and are delivered to a frame buffer;
e) generating a frame of the virtual image from the pixels in the frame buffer; and
f) delivering the frame to a client device platform over a network.
2. The non-transitory computer readable medium of claim 1, wherein all of the image pixels in a tile are assumed to be pixels that need to be rendered when at least one of the image pixels in the tile is a pixel that needs to be rendered.
3. The non-transitory computer readable medium of claim 2, wherein c) further includes randomly assigning each of the tiles that have at least one pixel that needs to be rendered to the one of the plurality of rasterization threads.
4. The non-transitory computer readable medium of claim 1, wherein the plurality of rasterization threads operate in parallel.
5. The non-transitory computer readable medium of claim 1, wherein the one or more graphic primitives form a three-dimensional object.
6. The non-transitory computer readable medium of claim 1, wherein the one or more graphic primitives are lines, points, arcs, vertices, triangles, polygons, or any combination thereof.
7. The non-transitory computer readable medium of claim 1, wherein generating the frame includes encoding the pixels in the frame buffer.
8. The non-transitory computer readable medium of claim 1, wherein the size of each tile is 8 pixels by 8 pixels.
9. The non-transitory computer readable medium of claim 1, wherein the size of each tile is 16 pixels by 16 pixels.
10. In an emulator of a graphics processing unit (GPU) configured to operate on a network, a method of rasterizing a virtual image, wherein the virtual image comprises one or more graphic primitives, comprising:
a) dividing the virtual image to be rasterized into a plurality of tiles, wherein each of the tiles includes a predetermined number of image pixels, wherein each of the image pixels is either an ignorable pixel or a pixel that needs to be rendered;
b) determining how many of the image pixels in each of the tiles are pixels that need to be rendered;
c) assigning each of the plurality of tiles to one of a plurality of rasterization threads, wherein each rasterization thread is assigned a quantity of tiles such that a total number of pixels that need to be rendered by each rasterization thread is approximately the same;
d) rasterizing each of the plurality of tiles with the rasterization threads, wherein pixels that need to be rendered are rendered and are delivered to a frame buffer;
e) generating a frame of the virtual image from the pixels in the frame buffer; and
f) delivering the frame to a client device platform over a network.
11. An emulator configured to operate on a network, comprising:
a processor;
a memory coupled to the processor;
one or more instructions embodied in memory for execution by the processor, the instructions being configured to implement a method for rasterizing a virtual image, wherein the virtual image comprises one or more graphic primitives, the method comprising:
a) dividing the virtual image to be rasterized into a plurality of tiles, wherein each of the tiles includes a predetermined number of image pixels, wherein each of the image pixels is either an ignorable pixel or a pixel that needs to be rendered;
b) determining how many of the image pixels in each of the tiles are pixels that need to be rendered;
c) assigning each of the plurality of tiles to one of a plurality of rasterization threads, wherein each rasterization thread is assigned a quantity of tiles such that a total number of pixels that need to be rendered by each rasterization thread is approximately the same;
d) rasterizing each of the plurality of tiles with the rasterization threads, wherein pixels that need to be rendered are rendered and are delivered to a frame buffer;
e) generating a frame of the virtual image from the pixels in the frame buffer; and
f) delivering the frame to a client device platform over a network.
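Steps b) through d) of the claimed method can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the claims do not specify how tiles are assigned so that per-thread pixel totals are approximately the same, so a greedy largest-first heuristic is swapped in here. The names `count_render_pixels`, `assign_tiles`, `rasterize`, and the `shade` callback are hypothetical, and the virtual image is assumed to be a 2D mask whose truthy entries mark pixels that need to be rendered.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Tile = Tuple[int, int]  # (tile_row, tile_col); hypothetical tile identity


def count_render_pixels(mask, tile_size: int) -> Dict[Tile, int]:
    """Step b): count the pixels that need rendering in each tile."""
    counts: Dict[Tile, int] = {}
    for y, row in enumerate(mask):
        for x, needs_render in enumerate(row):
            if needs_render:
                key = (y // tile_size, x // tile_size)
                counts[key] = counts.get(key, 0) + 1
    return counts  # tiles with no renderable pixels are simply absent


def assign_tiles(counts: Dict[Tile, int], num_threads: int) -> List[List[Tile]]:
    """Step c): greedy heuristic (an assumption): largest tile first,
    always given to the thread with the smallest pixel total so far."""
    heap = [(0, i) for i in range(num_threads)]  # (pixel load, thread index)
    buckets: List[List[Tile]] = [[] for _ in range(num_threads)]
    for tile, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        load, i = heapq.heappop(heap)
        buckets[i].append(tile)
        heapq.heappush(heap, (load + n, i))
    return buckets


def rasterize(mask, tile_size: int, buckets: List[List[Tile]],
              shade: Callable[[int, int], object]):
    """Step d): each worker renders only its own tiles into a shared buffer."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    frame_buffer = [[None] * width for _ in range(height)]

    def work(tiles: List[Tile]) -> None:
        for ty, tx in tiles:
            for y in range(ty * tile_size, min((ty + 1) * tile_size, height)):
                for x in range(tx * tile_size, min((tx + 1) * tile_size, width)):
                    if mask[y][x]:  # ignorable pixels are skipped
                        frame_buffer[y][x] = shade(x, y)

    with ThreadPoolExecutor(max_workers=len(buckets)) as pool:
        list(pool.map(work, buckets))  # consume results to surface errors
    return frame_buffer
```

Since each tile belongs to exactly one bucket, the workers write to disjoint regions of the frame buffer and need no locking in this sketch. Encoding the resulting frame and delivering it over a network (steps e and f) are outside its scope.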
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/631,803 US20140092087A1 (en) | 2012-09-28 | 2012-09-28 | Adaptive load balancing in software emulation of gpu hardware |
US15/225,361 US10354443B2 (en) | 2012-09-28 | 2016-08-01 | Adaptive load balancing in software emulation of GPU hardware |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/631,803 US20140092087A1 (en) | 2012-09-28 | 2012-09-28 | Adaptive load balancing in software emulation of gpu hardware |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/225,361 Continuation US10354443B2 (en) | 2012-09-28 | 2016-08-01 | Adaptive load balancing in software emulation of GPU hardware |
Publications (1)
Publication Number | Publication Date |
---|---|
US20140092087A1 (en) | 2014-04-03 |
Family
ID=50384706
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/631,803 Abandoned US20140092087A1 (en) | 2012-09-28 | 2012-09-28 | Adaptive load balancing in software emulation of gpu hardware |
US15/225,361 Active US10354443B2 (en) | 2012-09-28 | 2016-08-01 | Adaptive load balancing in software emulation of GPU hardware |
Family Applications After (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/225,361 Active US10354443B2 (en) | 2012-09-28 | 2016-08-01 | Adaptive load balancing in software emulation of GPU hardware |
Country Status (1)
Country | Link |
---|---|
US (2) | US20140092087A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11776507B1 (en) | 2022-07-20 | 2023-10-03 | Ivan Svirid | Systems and methods for reducing display latency |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040179019A1 (en) * | 2003-03-12 | 2004-09-16 | Nvidia Corporation | Double-buffering of pixel data using copy-on-write semantics |
US20090303245A1 (en) * | 2008-04-30 | 2009-12-10 | Alexei Soupikov | Technique for performing load balancing for parallel rendering |
US20110299105A1 (en) * | 2010-06-08 | 2011-12-08 | Canon Kabushiki Kaisha | Rastering disjoint regions of the page in parallel |
Family Cites Families (62)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6009458A (en) | 1996-05-09 | 1999-12-28 | 3Do Company | Networked computer game system with persistent playing objects |
US6280323B1 (en) | 1996-11-21 | 2001-08-28 | Konami Co., Ltd. | Device, method and storage medium for displaying penalty kick match cursors in a video soccer game |
US6402620B1 (en) | 1998-12-02 | 2002-06-11 | Technology Creations, Inc. | Amplified stereo sound and force feed back accessory for video game devices |
US6955606B2 (en) | 2000-03-30 | 2005-10-18 | Nintendo Co., Ltd. | Game information storage medium and game system using the same |
US6699127B1 (en) | 2000-06-20 | 2004-03-02 | Nintendo Of America Inc. | Real-time replay system for video game |
US7159008B1 (en) | 2000-06-30 | 2007-01-02 | Immersion Corporation | Chat interface with haptic feedback functionality |
GB2364484B (en) | 2000-06-30 | 2004-10-13 | Nokia Mobile Phones Ltd | Apparatus and methods for a client server system |
US6884171B2 (en) | 2000-09-18 | 2005-04-26 | Nintendo Co., Ltd. | Video game distribution network |
US7470196B1 (en) | 2000-10-16 | 2008-12-30 | Wms Gaming, Inc. | Method of transferring gaming data on a global computer network |
US6904408B1 (en) | 2000-10-19 | 2005-06-07 | Mccarthy John | Bionet method, system and personalized web content manager responsive to browser viewers' psychological preferences, behavioral responses and physiological stress indicators |
US20020065915A1 (en) | 2000-11-30 | 2002-05-30 | Anderson Elizabeth A. | System and method for server-host connection management to serve anticipated future client connections |
US20030037030A1 (en) | 2001-08-16 | 2003-02-20 | International Business Machines Corporation | Method and system for storage, retrieval and execution of legacy software |
US8065394B2 (en) | 2001-08-20 | 2011-11-22 | Bally Gaming, Inc. | Local game-area network method |
JP3559024B2 (en) | 2002-04-04 | 2004-08-25 | マイクロソフト コーポレイション | GAME PROGRAM AND GAME DEVICE |
US9849372B2 (en) | 2012-09-28 | 2017-12-26 | Sony Interactive Entertainment Inc. | Method and apparatus for improving efficiency without increasing latency in emulation of a legacy application title |
US7440885B2 (en) | 2002-06-03 | 2008-10-21 | Broadcom Corporation | Method and system for deterministic control of an emulation |
US8661496B2 (en) | 2002-12-10 | 2014-02-25 | Ol2, Inc. | System for combining a plurality of views of real-time streaming interactive video |
US9446305B2 (en) | 2002-12-10 | 2016-09-20 | Sony Interactive Entertainment America Llc | System and method for improving the graphics performance of hosted applications |
US7921302B2 (en) | 2003-03-10 | 2011-04-05 | Igt | Universal game download methods and system for legacy gaming machines |
US7549924B2 (en) | 2003-05-09 | 2009-06-23 | Microsoft Corporation | Instant messaging embedded games |
US20040266529A1 (en) | 2003-06-30 | 2004-12-30 | Sony Computer Entertainment America Inc. | Methods and systems for remote execution of game content and presentation on a wireless portable device |
US7978194B2 (en) | 2004-03-02 | 2011-07-12 | Ati Technologies Ulc | Method and apparatus for hierarchical Z buffering and stenciling |
US7286132B2 (en) | 2004-04-22 | 2007-10-23 | Pinnacle Systems, Inc. | System and methods for using graphics hardware for real time two and three dimensional, single definition, and high definition video effects |
US20060080702A1 (en) | 2004-05-20 | 2006-04-13 | Turner Broadcasting System, Inc. | Systems and methods for delivering content over a network |
US20060117260A1 (en) | 2004-11-30 | 2006-06-01 | Microsoft Corporation | Grouping of representations in a user interface |
US8274518B2 (en) | 2004-12-30 | 2012-09-25 | Microsoft Corporation | Systems and methods for virtualizing graphics subsystems |
US7496495B2 (en) | 2005-05-12 | 2009-02-24 | Microsoft Corporation | Virtual operating system device communication relying on memory access violations |
JP2007068581A (en) | 2005-09-02 | 2007-03-22 | Nintendo Co Ltd | Game device and game program |
US7887420B2 (en) | 2005-09-12 | 2011-02-15 | Igt | Method and system for instant-on game download |
US8572604B2 (en) | 2005-11-12 | 2013-10-29 | Intel Corporation | Method and apparatus to support virtualization with code patches |
JP3908772B1 (en) | 2005-12-26 | 2007-04-25 | 株式会社コナミデジタルエンタテインメント | GAME DEVICE, GAME DEVICE CONTROL METHOD, AND PROGRAM |
EP2032224A2 (en) | 2006-06-26 | 2009-03-11 | Icosystem Corporation | Methods and systems for interactive customization of avatars and other animate or inanimate items in video games |
US7841946B2 (en) | 2006-06-29 | 2010-11-30 | Spawn Labs, Inc. | System for remote game access |
US20080032794A1 (en) | 2006-07-24 | 2008-02-07 | Rambus, Inc. | Partitioned game console system |
US8085264B1 (en) * | 2006-07-26 | 2011-12-27 | Nvidia Corporation | Tile output using multiple queue output buffering in a raster stage |
US8271962B2 (en) | 2006-09-12 | 2012-09-18 | Brian Muller | Scripted interactive screen media |
US9311774B2 (en) | 2006-11-10 | 2016-04-12 | Igt | Gaming machine with externally controlled content display |
US8360847B2 (en) | 2006-11-13 | 2013-01-29 | Igt | Multimedia emulation of physical reel hardware in processor-based gaming machines |
US8073676B2 (en) | 2007-09-21 | 2011-12-06 | Sony Computer Entertainment Inc. | Method and apparatus for emulation enhancement |
US20090088236A1 (en) | 2007-09-28 | 2009-04-02 | Michael Laude | Game of Misdirection and Detection |
EP2232848A4 (en) | 2007-12-20 | 2012-10-24 | Ati Technologies Ulc | Adjusting video processing in a system having a video source device and a video sink device |
US8494833B2 (en) | 2008-05-09 | 2013-07-23 | International Business Machines Corporation | Emulating a computer run time environment |
US20100088296A1 (en) | 2008-10-03 | 2010-04-08 | Netapp, Inc. | System and method for organizing data to facilitate data deduplication |
US9292282B2 (en) | 2009-03-31 | 2016-03-22 | International Business Machines Corporation | Server-side translation for custom application support in client-side scripts |
US8698823B2 (en) | 2009-04-08 | 2014-04-15 | Nvidia Corporation | System and method for deadlock-free pipelining |
GB2471887B (en) | 2009-07-16 | 2014-11-12 | Advanced Risc Mach Ltd | A video processing apparatus and a method of processing video data |
US20110218037A1 (en) | 2010-03-08 | 2011-09-08 | Yahoo! Inc. | System and method for improving personalized search results through game interaction data |
US8935487B2 (en) | 2010-05-05 | 2015-01-13 | Microsoft Corporation | Fast and low-RAM-footprint indexing for data deduplication |
US10010793B2 (en) | 2010-06-14 | 2018-07-03 | Nintendo Co., Ltd. | Techniques for improved user interface helping super guides |
US20120052930A1 (en) | 2010-06-24 | 2012-03-01 | Dr. Elliot McGucken | System and method for the heros journey mythology code of honor video game engine and heros journey code of honor spy games wherein one must fake the enemy's ideology en route to winning |
US20120142425A1 (en) | 2010-10-22 | 2012-06-07 | Bally Gaming, Inc. | Legacy Game Download and Configuration |
US20130137518A1 (en) | 2011-11-29 | 2013-05-30 | Keith V. Lucas | System for Pre-Caching Game Content Based on Game Selection Probability |
US9694276B2 (en) | 2012-06-29 | 2017-07-04 | Sony Interactive Entertainment Inc. | Pre-loading translated code in cloud based emulated applications |
US9248374B2 (en) | 2012-06-29 | 2016-02-02 | Sony Computer Entertainment Inc. | Replay and resumption of suspended game |
US9623327B2 (en) | 2012-06-29 | 2017-04-18 | Sony Interactive Entertainment Inc. | Determining triggers for cloud-based emulated games |
US9656163B2 (en) | 2012-06-29 | 2017-05-23 | Sony Interactive Entertainment Inc. | Haptic enhancements for emulated video game not originally designed with haptic capabilities |
US20140004941A1 (en) | 2012-06-29 | 2014-01-02 | Sony Computer Entertainment Inc. | Conversion of haptic events into screen events |
US9925468B2 (en) | 2012-06-29 | 2018-03-27 | Sony Interactive Entertainment Inc. | Suspending state of cloud-based legacy applications |
US10406429B2 (en) | 2012-08-29 | 2019-09-10 | Sony Interactive Entertainment, LLC | User-based mini-game generation and distribution |
US9707476B2 (en) | 2012-09-28 | 2017-07-18 | Sony Interactive Entertainment Inc. | Method for creating a mini-game |
US20140092087A1 (en) | 2012-09-28 | 2014-04-03 | Takayuki Kazama | Adaptive load balancing in software emulation of gpu hardware |
US9258012B2 (en) | 2013-03-15 | 2016-02-09 | Sony Computer Entertainment Inc. | Compression of state information for data transfer over cloud-based networks |
Cited By (69)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10084751B2 (en) * | 2011-02-16 | 2018-09-25 | Fortinet, Inc. | Load balancing among a cluster of firewall security devices |
US9717989B2 (en) | 2012-06-29 | 2017-08-01 | Sony Interactive Entertainment Inc. | Adding triggers to cloud-based emulated games |
US11724205B2 (en) | 2012-06-29 | 2023-08-15 | Sony Computer Entertainment Inc. | Suspending state of cloud-based legacy applications |
US9248374B2 (en) | 2012-06-29 | 2016-02-02 | Sony Computer Entertainment Inc. | Replay and resumption of suspended game |
US10293251B2 (en) | 2012-06-29 | 2019-05-21 | Sony Interactive Entertainment Inc. | Pre-loading translated code in cloud based emulated applications |
US9623327B2 (en) | 2012-06-29 | 2017-04-18 | Sony Interactive Entertainment Inc. | Determining triggers for cloud-based emulated games |
US9925468B2 (en) | 2012-06-29 | 2018-03-27 | Sony Interactive Entertainment Inc. | Suspending state of cloud-based legacy applications |
US9656163B2 (en) | 2012-06-29 | 2017-05-23 | Sony Interactive Entertainment Inc. | Haptic enhancements for emulated video game not originally designed with haptic capabilities |
US9694276B2 (en) | 2012-06-29 | 2017-07-04 | Sony Interactive Entertainment Inc. | Pre-loading translated code in cloud based emulated applications |
US10668390B2 (en) | 2012-06-29 | 2020-06-02 | Sony Interactive Entertainment Inc. | Suspending state of cloud-based legacy applications |
US10406429B2 (en) | 2012-08-29 | 2019-09-10 | Sony Interactive Entertainment, LLC | User-based mini-game generation and distribution |
US11058947B2 (en) | 2012-08-29 | 2021-07-13 | Sony Interactive Entertainment LLC | User-based mini-game generation and distribution |
US20210162295A1 (en) * | 2012-09-28 | 2021-06-03 | Sony Interactive Entertainment Inc. | Method and apparatus for improving efficiency without increasing latency in graphics processing |
US10953316B2 (en) * | 2012-09-28 | 2021-03-23 | Sony Interactive Entertainment Inc. | Method and apparatus for improving efficiency without increasing latency in graphics processing |
US9707476B2 (en) | 2012-09-28 | 2017-07-18 | Sony Interactive Entertainment Inc. | Method for creating a mini-game |
US10525359B2 (en) | 2012-09-28 | 2020-01-07 | Sony Interactive Entertainment Inc. | Method for creating a mini-game |
US10518182B2 (en) | 2012-09-28 | 2019-12-31 | Sony Interactive Entertainment Inc. | Method for creating a mini-game |
US20190299089A1 (en) * | 2012-09-28 | 2019-10-03 | Sony Interactive Entertainment Inc. | Method and apparatus for improving efficiency without increasing latency in graphics processing |
US9849372B2 (en) | 2012-09-28 | 2017-12-26 | Sony Interactive Entertainment Inc. | Method and apparatus for improving efficiency without increasing latency in emulation of a legacy application title |
US11013993B2 (en) | 2012-09-28 | 2021-05-25 | Sony Interactive Entertainment Inc. | Pre-loading translated code in cloud based emulated applications |
US11660534B2 (en) | 2012-09-28 | 2023-05-30 | Sony Interactive Entertainment Inc. | Pre-loading translated code in cloud based emulated applications |
US10350485B2 (en) | 2012-09-28 | 2019-07-16 | Sony Interactive Entertainment Inc. | Method and apparatus for improving efficiency without increasing latency in emulation of a legacy application title |
US11904233B2 (en) * | 2012-09-28 | 2024-02-20 | Sony Interactive Entertainment Inc. | Method and apparatus for improving efficiency without increasing latency in graphics processing |
US10354443B2 (en) | 2012-09-28 | 2019-07-16 | Sony Interactive Entertainment Inc. | Adaptive load balancing in software emulation of GPU hardware |
US9787746B2 (en) * | 2013-02-01 | 2017-10-10 | Samsung Electronics Co., Ltd. | Method and apparatus for processing multimedia content on a graphic cloud |
US20140222965A1 (en) * | 2013-02-01 | 2014-08-07 | Samsung Electronics Co., Ltd | Method and apparatus for processing multimedia content on a graphic cloud |
US11826656B2 (en) | 2013-03-14 | 2023-11-28 | Sony Interactive Entertainment Inc. | Latency compensation for interface type in emulation |
US11185783B2 (en) | 2013-03-14 | 2021-11-30 | Sony Interactive Entertainment Inc. | Controller emulation for cloud gaming |
US9658776B2 (en) | 2013-03-15 | 2017-05-23 | Sony Interactive Entertainment Inc. | Compression of state information for data transfer over cloud-based networks |
US9258012B2 (en) | 2013-03-15 | 2016-02-09 | Sony Computer Entertainment Inc. | Compression of state information for data transfer over cloud-based networks |
US20150113514A1 (en) * | 2013-10-18 | 2015-04-23 | Nec Laboratories America, Inc. | Source-to-source transformations for graph processing on many-core platforms |
US9335981B2 (en) * | 2013-10-18 | 2016-05-10 | Nec Corporation | Source-to-source transformations for graph processing on many-core platforms |
US20150379672A1 (en) * | 2014-06-27 | 2015-12-31 | Samsung Electronics Co., Ltd | Dynamically optimized deferred rendering pipeline |
US9842428B2 (en) * | 2014-06-27 | 2017-12-12 | Samsung Electronics Co., Ltd. | Dynamically optimized deferred rendering pipeline |
US20160006788A1 (en) * | 2014-07-03 | 2016-01-07 | Hob Gmbh & Co. Kg | Client-server-communication system running a client-side-script-program |
US11189083B2 (en) | 2015-01-27 | 2021-11-30 | Splunk Inc. | Clipping polygons based on a scan of a storage grid |
US10789279B2 (en) | 2015-01-27 | 2020-09-29 | Splunk Inc. | Ray casting technique for geofencing operation |
US10657680B2 (en) | 2015-01-27 | 2020-05-19 | Splunk Inc. | Simplified point-in-polygon test for processing geographic data |
US11734878B1 (en) * | 2015-01-27 | 2023-08-22 | Splunk Inc. | Polygon clipping based on traversing lists of points |
US10688394B2 (en) | 2015-01-27 | 2020-06-23 | Splunk Inc. | Three-dimensional point-in-polygon operation to facilitate visualizing a 3D structure surrounding a data point |
US10748330B2 (en) * | 2015-01-27 | 2020-08-18 | Splunk Inc. | Clipping polygons to fit within a clip region |
US10860624B2 (en) | 2015-01-27 | 2020-12-08 | Splunk Inc. | Using ray intersection lists to visualize data points bounded by geometric regions |
US11532068B2 (en) * | 2016-02-01 | 2022-12-20 | Imagination Technologies Limited | Sparse rendering in computer graphics |
US11295524B2 (en) * | 2016-02-01 | 2022-04-05 | Imagination Technologies Limited | Frustum rendering in computer graphics |
US10878626B2 (en) * | 2016-02-01 | 2020-12-29 | Imagination Technologies Limited | Frustum rendering in computer graphics |
US20170221261A1 (en) * | 2016-02-01 | 2017-08-03 | Imagination Technologies Limited | Frustum Rendering in Computer Graphics |
US20170221177A1 (en) * | 2016-02-01 | 2017-08-03 | Imagination Technologies Limited | Sparse Rendering in Computer Graphics |
US11887212B2 (en) | 2016-02-01 | 2024-01-30 | Imagination Technologies Limited | Sparse rendering in computer graphics |
CN107025681A (en) * | 2016-02-01 | 2017-08-08 | 想象技术有限公司 | It is sparse to render |
US11830144B2 (en) * | 2016-02-01 | 2023-11-28 | Imagination Technologies Limited | Frustum rendering in computer graphics |
US10489974B2 (en) * | 2016-02-01 | 2019-11-26 | Imagination Technologies Limited | Frustum rendering in computer graphics |
US11587290B2 (en) * | 2016-02-01 | 2023-02-21 | Imagination Technologies Limited | Frustum rendering in computer graphics |
US20220172432A1 (en) * | 2016-02-01 | 2022-06-02 | Imagination Technologies Limited | Frustum Rendering in Computer Graphics |
US10531030B2 (en) | 2016-07-01 | 2020-01-07 | Google Llc | Block operations for an image processor having a two-dimensional execution lane array and a two-dimensional shift register |
TWI767190B (en) * | 2016-07-01 | 2022-06-11 | 美商谷歌有限責任公司 | Block operations for an image processor having a two-dimensional execution lane array and a two-dimensional shift register |
TWI687896B (en) * | 2016-07-01 | 2020-03-11 | 美商谷歌有限責任公司 | Block operations for an image processor having a two-dimensional execution lane array and a two-dimensional shift register |
TWI656508B (en) * | 2016-07-01 | 2019-04-11 | 美商谷歌有限責任公司 | Block operation for image processor with two-dimensional array of arrays and two-dimensional displacement register |
US11196953B2 (en) | 2016-07-01 | 2021-12-07 | Google Llc | Block operations for an image processor having a two-dimensional execution lane array and a two-dimensional shift register |
US9986187B2 (en) | 2016-07-01 | 2018-05-29 | Google Llc | Block operations for an image processor having a two-dimensional execution lane array and a two-dimensional shift register |
US10334194B2 (en) | 2016-07-01 | 2019-06-25 | Google Llc | Block operations for an image processor having a two-dimensional execution lane array and a two-dimensional shift register |
TWI625697B (en) * | 2016-07-01 | 2018-06-01 | 谷歌有限責任公司 | Block operations for an image processor having a two-dimensional execution lane array and a two-dimensional shift register |
AU2018201757B2 (en) * | 2017-03-23 | 2019-06-13 | Fusion Holdings Limited | Multi-threaded rendering system |
US10783008B2 (en) | 2017-05-26 | 2020-09-22 | Sony Interactive Entertainment Inc. | Selective acceleration of emulation |
CN109213607A (en) * | 2017-06-30 | 2019-01-15 | 武汉斗鱼网络科技有限公司 | A kind of method and apparatus of multithreading rendering |
US11803936B2 (en) | 2018-06-29 | 2023-10-31 | Imagination Technologies Limited | Tile assignment to processing cores within a graphics processing unit |
US11508028B2 (en) * | 2018-06-29 | 2022-11-22 | Imagination Technologies Limited | Tile assignment to processing cores within a graphics processing unit |
US11210816B1 (en) * | 2018-08-28 | 2021-12-28 | Apple Inc. | Transitional effects in real-time rendering applications |
WO2022095010A1 (en) * | 2020-11-09 | 2022-05-12 | Qualcomm Incorporated | Methods and apparatus for rasterization of compute workloads |
CN116578425A (en) * | 2023-07-11 | 2023-08-11 | 沐曦集成电路(上海)有限公司 | Load balancing method and system based on rasterization |
Also Published As
Publication number | Publication date |
---|---|
US10354443B2 (en) | 2019-07-16 |
US20160364906A1 (en) | 2016-12-15 |
Similar Documents
Publication | Title |
---|---|
US10354443B2 (en) | Adaptive load balancing in software emulation of GPU hardware |
Montrym et al. | InfiniteReality: A real-time graphics system |
US8111264B2 (en) | Method of and system for non-uniform image enhancement |
TWI559729B (en) | Pixel shader bypass for low power graphics rendering |
KR101640904B1 (en) | Computer-based methods, machine-readable non-transitory medium and server system to provide online gaming experience |
KR20130108609A (en) | Load balancing between general purpose processors and graphics processors |
CN107392836B (en) | Stereoscopic multi-projection using a graphics processing pipeline |
KR102381945B1 (en) | Graphic processing apparatus and method for performing graphics pipeline thereof |
TW201432609A (en) | Distributed tiled caching |
CN102609971A (en) | Quick rendering system using embedded GPU (Graphics Processing Unit) for realizing 3D-GIS (Three Dimensional-Geographic Information System) |
KR20180087356A (en) | Process and apparatus for particle system |
KR20160130629A (en) | Apparatus and Method of rendering for binocular disparity image |
US9558573B2 (en) | Optimizing triangle topology for path rendering |
US8907979B2 (en) | Fast rendering of knockout groups using a depth buffer of a graphics processing unit |
US20220083385A1 (en) | System and method for multi-tenant implementation of graphics processing unit |
KR20170040698A (en) | Method and apparatus for performing graphics pipelines |
JP7164761B2 (en) | Asset-aware computing architecture for graphics processing |
Abraham et al. | A load-balancing strategy for sort-first distributed rendering |
CN115298686B (en) | System and method for efficient multi-GPU rendering of geometry by pre-testing for interlaced screen regions prior to rendering |
KR102644276B1 (en) | Apparatus and method for processing graphic |
EP2954495B1 (en) | Information processing apparatus, method of controlling the same, program, and storage medium |
KR20160025894A (en) | Method and apparatus for power control for GPU resources |
US11514549B2 (en) | System and method for efficient multi-GPU rendering of geometry by generating information in one rendering phase for use in another rendering phase |
JP7335454B2 (en) | Systems and Methods for Efficient Multi-GPU Rendering of Geometry with Region Testing During Rendering |
US20210089423A1 (en) | Flexible multi-user graphics architecture |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: SONY COMPUTER ENTERTAINMENT INC., JAPAN. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: SUBA MIURA, VICTOR OCTAV; KAZAMA, TAKAYUKI; SIGNING DATES FROM 20121207 TO 20121210; REEL/FRAME: 029494/0455 |
AS | Assignment | Owner name: SONY INTERACTIVE ENTERTAINMENT INC., JAPAN. Free format text: CHANGE OF NAME; ASSIGNOR: SONY COMPUTER ENTERTAINMENT INC.; REEL/FRAME: 039239/0343. Effective date: 20160401 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |