Pyarelal Knowles

Hi. Here is somewhat of a portfolio of my work. Below is a list of some published research from my PhD. This is a website I’ve been working on in my spare time. You can find me on these sites too:

My old website, assuming it’s still up, has some past projects listed too:


Real-Time Deep Image Rendering and Order Independent Transparency - 2015

Pyarelal Knowles

My Ph.D. thesis.

In computer graphics some operations can be performed in either object space or image space. Image space computation can be advantageous, especially with the high parallelism of GPUs, improving speed, accuracy and ease of implementation. For many image space techniques the information contained in regular 2D images is limiting. Recent graphics hardware features, namely atomic operations and dynamic memory location writes, now make it possible to capture and store all per-pixel fragment data from the rasterizer in a single pass in what we call a deep image. A deep image provides a state where all fragments are available and gives a more complete image based geometry representation, providing new possibilities in image based rendering techniques. This thesis investigates deep images and their growing use in real-time image space applications. A focus is new techniques for improving fundamental operation performance, including construction, storage, fast fragment sorting and sampling.

A core and driving application is order-independent transparency (OIT). A number of deep image sorting improvements are presented, through which an order of magnitude performance increase is achieved, significantly advancing the ability to perform transparency rendering in real time. In the broader context of image based rendering we look at deep images as a discretized 3D geometry representation and discuss sampling techniques for raycasting and antialiasing with an implicit fragment connectivity approach. Using these ideas a more computationally complex application is investigated — image based depth of field (DoF). Deep images are used to provide partial occlusion, and in particular a form of deep image mipmapping allows a fast approximate defocus blur of up to full screen size.

Available on the university’s website:

Fast Sorting for Exact OIT of Complex Scenes - 2014

Pyarelal Knowles, Geoff Leach and Fabio Zambetta

A paper in Computer Graphics International 2014, and a special issue of The Visual Computer.

Exact order independent transparency (OIT) techniques capture all fragments during rasterization. The fragments are then sorted per-pixel by depth and composited in order using alpha transparency. The sorting stage is a bottleneck for high depth complexity scenes, taking 70–95% of the total time for those investigated. In this paper we show that typical shader based sorting speed is impacted by local memory latency and occupancy. We present and discuss the use of both registers and an external merge sort in register-based block sort to better use the memory hierarchy of the GPU for improved OIT rendering performance. This approach builds upon backwards memory allocation, achieving an OIT rendering speed up to 1.7× that of the best previous method and 6.3× that of the common straightforward OIT implementation. In some cases the sorting stage is reduced to no longer be the dominant OIT component.
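As a rough illustration of the per-pixel sort and composite step described above, here is a minimal CPU-side sketch in Python. It uses a plain library sort, not the register-based block sort from the paper, and the fragment layout (depth, colour, alpha) and the name `composite_pixel` are assumptions made for this example.

```python
from typing import List, Tuple

# Assumed fragment layout: (depth, (r, g, b), alpha).
# Smaller depth is taken to be nearer the camera.
Fragment = Tuple[float, Tuple[float, float, float], float]


def composite_pixel(fragments: List[Fragment], background=(0.0, 0.0, 0.0)):
    """Sort one pixel's fragments by depth and blend them back to front."""
    color = background
    # Farthest first, so each nearer fragment is blended over the result so far.
    for _, rgb, alpha in sorted(fragments, key=lambda f: f[0], reverse=True):
        color = tuple(alpha * c + (1.0 - alpha) * dst for c, dst in zip(rgb, color))
    return color


# Half-transparent red in front of half-transparent blue, on black.
frags = [(0.2, (1.0, 0.0, 0.0), 0.5), (0.8, (0.0, 0.0, 1.0), 0.5)]
print(composite_pixel(frags))        # (0.5, 0.0, 0.25)
print(composite_pixel(frags[::-1]))  # same result: submission order does not matter
```

The result is the same whichever order the fragments arrive in, which is the property that makes the technique order independent. The sort inside the loop is the stage the paper identifies as the bottleneck.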

A pre-print is available here: rbs-preprint.pdf

The final publication is available at Springer via:

I’ll also provide code at some point. Time is pretty tight right now. Though feel free to email me.

  author={Knowles, Pyarelal and Leach, Geoff and Zambetta, Fabio},
  title={Fast sorting for exact OIT of complex scenes},
  journal={The Visual Computer},
  publisher={Springer Berlin Heidelberg},
  keywords={Sorting; OIT; Transparency; Shaders; Performance; registers; Register-based block sort},

Backwards Memory Allocation and Improved OIT - 2013

Pyarelal Knowles, Geoff Leach and Fabio Zambetta

A short paper in Pacific Graphics 2013.

Order independent transparency (OIT) is a graphics technique which sorts surfaces per-pixel for correct alpha blending. The sorting stage requires relatively large amounts of temporary memory in shaders that is usually conservatively allocated at a maximum, which impacts occupancy and performance. To address this issue we introduce backwards memory allocation (BMA), a strategy which creates a set of shaders with varying static allocation size in lieu of dynamic allocation. Batches of threads are then executed directly with the appropriate shader. This also allows optimizations for each generated shader such as choosing the sorting algorithm based on allocation size with no additional overhead. BMA gives both a more flexible OIT (BMA-OIT) for dynamic scenes of varying depth complexity and up to a 3× speedup.
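The batching idea can be sketched on the CPU as follows. This is only an illustration under stated assumptions: the power-of-two sizes in `ALLOCATION_SIZES`, the function names, and the dictionary of per-pixel fragment counts are all hypothetical, and the sketch does not show how batches are actually dispatched on the GPU.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

# Hypothetical static allocation sizes, one per generated shader variant.
ALLOCATION_SIZES = (8, 16, 32, 64, 128)


def pick_allocation(count: int, sizes: Sequence[int] = ALLOCATION_SIZES) -> int:
    """Return the smallest static size that holds `count` fragments."""
    for size in sizes:
        if count <= size:
            return size
    raise ValueError(f"{count} fragments exceed the largest variant ({sizes[-1]})")


def batch_pixels(fragment_counts: Dict[int, int]) -> Dict[int, List[int]]:
    """Group pixel ids by the shader variant that should process them."""
    batches = defaultdict(list)
    for pixel, count in fragment_counts.items():
        if count > 0:  # empty pixels need no sorting pass
            batches[pick_allocation(count)].append(pixel)
    return dict(batches)


# Pixel id -> number of captured fragments.
print(batch_pixels({0: 3, 1: 9, 2: 0, 3: 100, 4: 8}))
# {8: [0, 4], 16: [1], 128: [3]}
```

Each batch would then run a variant compiled with exactly that array size, so a pixel with three fragments does not pay for storage sized for the deepest pixel in the scene.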

A pre-print is available here: bma-preprint.pdf.

The definitive version is available at:

  crossref = {PG2013short-proc},
  author = {Pyarelal Knowles and Geoff Leach and Fabio Zambetta},
  title = {{Backwards Memory Allocation and Improved OIT}},
  booktitle = {Proceedings of Pacific Graphics 2013 (short papers)},
  pages = {59-64},
  location = {Singapore},
  month = {October},
  year = {2013},
  URL = {},
  DOI = {10.2312/PE.PG.PG2013short.059-064}

Efficient Layered Fragment Buffer Techniques - 2012

Pyarelal Knowles, Geoff Leach and Fabio Zambetta

A book chapter in OpenGL Insights.

Rasterization typically resolves visible surfaces using the depth buffer, computing just the front-most layer of fragments. However, some applications require all fragment data, including that of hidden surfaces. In this chapter, we refer to this data and the technique to compute it as a layered fragment buffer (LFB).

The chapter is a comparison of then-current order independent transparency techniques which captured all fragments in a single rendering pass: 3D array, linked list and linearized array LFB techniques. There is a focus on the linearized method which packs the fragment data using a prefix sum scan. A discussion of the implementation and performance results is included.

The code sample can be found in the book’s GitHub repository.
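Separately from the book's code sample, the linearized packing can be illustrated with a short sequential Python sketch: count fragments per pixel, take a prefix sum to get each pixel's start offset, then write the fragments into one tightly packed array. The function name and the `(pixel, value)` fragment layout are assumptions for this example. On a GPU the counts and write cursors would be updated with atomic operations across two rendering passes.

```python
from itertools import accumulate
from typing import List, Tuple


def build_linearized_lfb(fragments: List[Tuple[int, float]], num_pixels: int):
    """Pack (pixel, value) fragments into one array, grouped by pixel.

    Returns (offsets, data), where pixel p's fragments are
    data[offsets[p]:offsets[p + 1]].
    """
    # Pass 1: count fragments per pixel.
    counts = [0] * num_pixels
    for pixel, _ in fragments:
        counts[pixel] += 1

    # Exclusive prefix sum turns counts into start offsets (plus the total).
    offsets = [0] + list(accumulate(counts))

    # Pass 2: write each fragment to the next free slot in its pixel's range.
    cursor = offsets[:-1]  # slicing copies, so offsets stays intact
    data = [0.0] * offsets[-1]
    for pixel, value in fragments:
        data[cursor[pixel]] = value
        cursor[pixel] += 1
    return offsets, data


offsets, data = build_linearized_lfb([(2, 0.5), (0, 0.1), (2, 0.7), (1, 0.3)], 4)
print(offsets)  # [0, 1, 2, 4, 4]
print(data)     # [0.1, 0.3, 0.5, 0.7]
```

Compared with per-pixel linked lists, this layout keeps each pixel's fragments contiguous in memory at the cost of the extra counting pass.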

  author = {Pyarelal Knowles and Geoff Leach and Fabio Zambetta},
  title = {Efficient Layered Fragment Buffer Techniques},
  booktitle = {{O}pen{GL} {I}nsights},
  pages = {279-292},
  editor = {Patrick Cozzi and Christophe Riccio},
  month = {July},
  year = {2012},
  isbn = {978-1439893760},
  publisher = {CRC Press},
  note = {\url{}}