Graphic Image Display

Pixels and Frame Buffers:

Most contemporary graphics is raster graphics: an image is defined by a mapping from the points of a discrete 2D grid to color values. Each point on the grid is called a pixel, short for "picture element". The logical block of memory containing the pixel values is called the frame buffer. Physically, the frame buffer often resides in special-purpose memory on the graphics card in your PC, often referred to as VRAM (video RAM) or DRAM. The exact name does not matter; it depends on the technology of the memory chip. What matters more is the amount of this memory, since it determines the possible frame buffer configurations (resolution and number of bits per pixel).
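
To make the link between memory size and frame buffer configuration concrete, here is a small sketch of the arithmetic: one frame buffer needs width x height x bits-per-pixel bits. The function name and the three example configurations are chosen for illustration only.

```python
def frame_buffer_bytes(width, height, bits_per_pixel):
    """Return the memory needed to hold one frame buffer, in bytes."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to a whole number of bytes


# Illustrative configurations (not taken from any particular graphics card).
for width, height, bpp in [(640, 480, 8), (1024, 768, 16), (1280, 1024, 24)]:
    size = frame_buffer_bytes(width, height, bpp)
    print(f"{width}x{height} at {bpp} bits/pixel: {size} bytes ({size / 2**20:.2f} MiB)")
```

A card with, say, 2 MiB of video memory could therefore hold the first two configurations but not the third.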

Color:

Color is a complex subject, which is covered by several lectures in the Computer Graphics class. Understanding color properly involves understanding the physics of light (photons, wavelengths, spectra), as well as the perception of color in the human visual system. Some of this material might also be covered in the Image Processing course. To put it very briefly and inaccurately, the human eye has three types of light-sensitive cells on the retina. One type responds best to red light, another to green light, and the third to blue light. So, mathematically, we often treat colors as points in a 3D color space (referred to as the RGB color space). It should be noted that many different color spaces have been defined for different purposes over the years. Mathematically, the RGB color space is continuous and unbounded. In practice, in order to display colors we need to restrict ourselves to finite precision inside a finite range of intensities. Typically, 8 bits per color channel are used. Thus, we can represent 256^3 (about 16.7 million) RGB combinations using 24 bits per pixel. The quantization to 8 bits per channel must be performed before the image can be displayed, but for accuracy of color computations, we often represent each color channel as a floating point number in the range [0,1].
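
The quantization step can be sketched as follows. This is a minimal illustration, assuming round-to-nearest and clamping of out-of-range values; the function names and the R-G-B packing order are hypothetical choices, not a fixed standard.

```python
def quantize_channel(value):
    """Map a float intensity in [0, 1] to an 8-bit integer in 0..255."""
    clamped = min(max(value, 0.0), 1.0)   # restrict to the displayable range
    return int(clamped * 255 + 0.5)       # round to the nearest of 256 levels


def quantize_rgb(rgb):
    """Quantize an (r, g, b) triple of floats to three 8-bit integers."""
    return tuple(quantize_channel(c) for c in rgb)


def pack_rgb24(rgb8):
    """Pack three 8-bit channels into one 24-bit pixel value (assumed order: R, G, B)."""
    r, g, b = rgb8
    return (r << 16) | (g << 8) | b


print(256 ** 3)                         # number of representable RGB combinations
print(quantize_rgb((1.0, 0.5, 0.0)))    # an orange color, quantized
```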

The CRT:

The RGB color space corresponds in the most direct and natural way to the manner in which colors are displayed on a cathode-ray tube (CRT) display, such as your TV or your computer monitor. Part of the surface of the tube is covered with phosphors of three types: red, green, and blue. These phosphors emit red, green, or blue light for a short while when hit by a beam of electrons. The electrons are emitted from three electron guns inside the tube. A shadow mask controls which phosphors are hit by which beam. Because the phosphors only light up for a brief period of time, the screen must be refreshed (about 60 times or more each second) in order to maintain a steady image. The image on the CRT is created by the three electron beams, scanning the screen line after line. The lines can be scanned in either an interlaced or non-interlaced order.

Summary: a graphics application ultimately generates pixel values, written into the frame buffer. The frame buffer physically resides in the video memory on the graphics adapter. On this adapter there is some hardware (RAMDAC) that scans the digital values stored in the video memory and converts them to three separate analog signals (R, G, and B). These signals travel along the video cable to the CRT monitor, where they drive the three electron guns.


From B-Reps to Pixels

Rendering a 3D geometric model is analogous to taking a photograph of a real object. The pinhole camera model applies to both cases (real and synthetic).
Given a B-rep model, we need to know which points on the surface correspond to which pixels in the image, or vice versa. This is a many-to-one mapping (because of occlusions), so we must be able to determine which point on the model is visible at each pixel in the image. Once we know what is visible at each pixel, we should somehow compute the color.

The geometric part in the imaging process with a pinhole camera is modelled by a transformation from 3D to 2D, which we call the perspective projection. The mathematics of this and other transformations in graphics is covered in depth in the computer graphics course.
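
As a rough illustration of the perspective projection, the sketch below assumes a pinhole at the origin looking along the +z axis, with the image plane at distance focal_length in front of it. These conventions and the function name are assumptions made for the example; the graphics course develops the general matrix form.

```python
def project_point(point, focal_length=1.0):
    """Project a 3D point onto the image plane of a pinhole camera.

    Assumed setup: pinhole at the origin, viewing direction +z,
    image plane at z = focal_length.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point is not in front of the pinhole")
    # Similar triangles: image coordinates shrink in proportion to depth.
    return (focal_length * x / z, focal_length * y / z)


print(project_point((2.0, 4.0, 2.0)))  # a nearby point
print(project_point((2.0, 4.0, 4.0)))  # same x, y but twice as far away
```

Doubling the depth halves the projected coordinates, which is why distant objects appear smaller.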

Establishing what is visible through each pixel in the image is called "hidden-surface removal". There are many different hidden-surface removal algorithms. Several of them are covered in the computer graphics class. Here, we'll briefly explain two: ray tracing and Z-buffer.

Ray Tracing: The idea is simple. If we want to find out what is visible at a particular pixel in the image, we can simply take a ray whose origin is the pinhole of the camera and direct it at the center of the pixel. Then we could intersect the ray with all the surfaces in our model, and find the intersection point that is closest to the origin of the ray.
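
The closest-intersection search can be sketched as below. To keep the example short, the scene is assumed to consist of spheres rather than B-rep faces; the function names and the scene layout are hypothetical.

```python
import math


def intersect_sphere(origin, direction, center, radius):
    """Return the smallest positive ray parameter t of a ray/sphere hit, or None."""
    # Solve |origin + t * direction - center|^2 = radius^2 for t.
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return None  # the ray misses the sphere
    root = math.sqrt(disc)
    for t in ((-b - root) / (2 * a), (-b + root) / (2 * a)):
        if t > 1e-9:  # ignore hits behind the ray origin
            return t
    return None


def closest_hit(origin, direction, spheres):
    """Intersect the ray with every sphere and keep the nearest hit as (t, name)."""
    best = None
    for name, center, radius in spheres:
        t = intersect_sphere(origin, direction, center, radius)
        if t is not None and (best is None or t < best[0]):
            best = (t, name)
    return best


# Two spheres along the viewing direction; the nearer one hides the farther one.
scene = [("far", (0.0, 0.0, 10.0), 1.0), ("near", (0.0, 0.0, 5.0), 1.0)]
print(closest_hit((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), scene))
```

Testing every ray against every surface is the simple form described above; practical ray tracers add spatial data structures to avoid most of these tests.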

Z-buffer: Assume that we have a triangulated B-rep model. It turns out that we can very efficiently scan-convert a triangle. That is, we can very quickly generate the coordinates of all the pixels covered by this triangle on the screen. Furthermore, at each (x,y) pixel coordinate covered by the triangle we can generate (with very little extra effort) the depth of the corresponding surface point (depth = orthogonal distance from the image plane). So, in order to get an image of a model with hidden surfaces removed we maintain another buffer alongside the frame buffer. This buffer is called the Z-buffer. The Z-buffer is initialized to some large value. Now we scan-convert the triangles in the model one by one, but for each pixel, before writing it to the frame buffer, we test if its depth is smaller than the one already in the Z-buffer. Only pixels that pass this test are written to the frame buffer (and their depth value is written to the Z-buffer).
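
A minimal sketch of the Z-buffer loop follows. For brevity it finds the covered pixels by testing every pixel center in the triangle's bounding box with barycentric weights, instead of the faster incremental scan conversion mentioned above. It also assumes the vertices are already given in screen coordinates (x, y, depth) and that a single character stands in for the pixel color.

```python
import math


def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p), using x and y only."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(triangles, width, height):
    """Draw (v0, v1, v2, color) triangles with hidden surfaces removed."""
    frame = [["."] * width for _ in range(height)]
    zbuf = [[math.inf] * width for _ in range(height)]  # initialized to "far away"
    for v0, v1, v2, color in triangles:
        area = edge(v0, v1, v2)
        if area == 0:
            continue  # degenerate triangle covers no pixels
        x_lo = max(0, math.floor(min(v0[0], v1[0], v2[0])))
        x_hi = min(width - 1, math.ceil(max(v0[0], v1[0], v2[0])))
        y_lo = max(0, math.floor(min(v0[1], v1[1], v2[1])))
        y_hi = min(height - 1, math.ceil(max(v0[1], v1[1], v2[1])))
        for y in range(y_lo, y_hi + 1):
            for x in range(x_lo, x_hi + 1):
                p = (x + 0.5, y + 0.5)  # pixel center
                w0 = edge(v1, v2, p) / area
                w1 = edge(v2, v0, p) / area
                w2 = edge(v0, v1, p) / area
                if w0 < 0 or w1 < 0 or w2 < 0:
                    continue  # pixel center lies outside the triangle
                depth = w0 * v0[2] + w1 * v1[2] + w2 * v2[2]
                if depth < zbuf[y][x]:  # the depth test
                    zbuf[y][x] = depth
                    frame[y][x] = color
    return frame, zbuf


tri_a = ((0, 0, 5.0), (8, 0, 5.0), (0, 8, 5.0), "A")  # large, farther away
tri_b = ((0, 0, 2.0), (4, 0, 2.0), (0, 4, 2.0), "B")  # small, closer
frame, _ = rasterize([tri_b, tri_a], 8, 8)
for row in frame:
    print("".join(row))
```

Because of the depth test, the result does not depend on the order in which the triangles are drawn: the closer triangle B stays visible even though A is drawn after it.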


Shading Models

We could assign each object a constant color, but that would be boring and unrealistic, and it would be hard to discern the shape of an object shaded in this manner.

In order to assign more realistic colors to the surface points visible at each pixel we attempt to simulate the physical world. This is difficult to do precisely, so we resort to various shading models, which are simplified models of physical reality.

Lambertian reflection is the simplest shading model in computer graphics. The brightness of a Lambertian surface is proportional to the cosine of the angle between the surface normal and a vector directed at the light source; where the surface faces away from the light the cosine is negative and the contribution is taken to be zero. An ambient term is often added in order to create some illumination in areas that face away from the light source.
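
The Lambertian model with an ambient term can be sketched in a few lines. The coefficient names and default values (diffuse, ambient) are hypothetical and chosen only for illustration.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lambert(normal, to_light, diffuse=1.0, ambient=0.1):
    """Brightness of a Lambertian surface point lit by a single light source."""
    n = normalize(normal)
    l = normalize(to_light)
    cos_theta = max(dot(n, l), 0.0)  # no diffuse light when facing away
    return ambient + diffuse * cos_theta


print(lambert((0, 0, 1), (0, 0, 1)))    # light directly above the surface
print(lambert((0, 0, 1), (0, 0, -1)))   # light behind the surface: ambient only
```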

The Lambertian reflection model is appropriate for rough matte surface finishes. For shiny surfaces we expect to see specular highlights whose location depends on the position of the observer relative to the surface and the light source. This shortcoming was addressed by the Phong reflection model and the Blinn-Phong reflection model. Both of these include a cosine term raised to a power (the shininess exponent) that succeeds in producing nice-looking view-dependent highlights.
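
The two specular terms can be compared with the following sketch. Phong uses the angle between the mirror-reflected light direction and the viewer; Blinn-Phong uses the angle between the normal and the halfway vector. The function names and the shininess value of 32 are illustrative assumptions.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def phong_specular(normal, to_light, to_viewer, shininess=32):
    """Phong highlight: (r . v)^shininess, with r the mirror reflection of the light."""
    n, l, v = normalize(normal), normalize(to_light), normalize(to_viewer)
    n_dot_l = dot(n, l)
    if n_dot_l <= 0:
        return 0.0  # light is behind the surface
    r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    return max(dot(r, v), 0.0) ** shininess


def blinn_phong_specular(normal, to_light, to_viewer, shininess=32):
    """Blinn-Phong highlight: (n . h)^shininess, with h halfway between l and v."""
    n, l, v = normalize(normal), normalize(to_light), normalize(to_viewer)
    if dot(n, l) <= 0:
        return 0.0
    h_raw = tuple(a + b for a, b in zip(l, v))
    if all(c == 0 for c in h_raw):
        return 0.0  # light and viewer exactly opposite: no halfway vector
    return max(dot(n, normalize(h_raw)), 0.0) ** shininess


n, l = (0, 0, 1), (-1, 0, 1)
print(phong_specular(n, l, (1, 0, 1)), blinn_phong_specular(n, l, (1, 0, 1)))  # mirror direction
print(phong_specular(n, l, (0, 0, 1)), blinn_phong_specular(n, l, (0, 0, 1)))  # off the mirror direction
```

Both terms peak when the viewer sits in the mirror direction and fall off quickly away from it; for the same exponent the Blinn-Phong highlight is broader.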

The Cook-Torrance model is a physically-based model. It produces more accurate approximations than the empirical models, but it is also considerably more complicated, and requires various physical information about the surface and the material. The analysis is based on the assumption that the surface is made out of mirror-like microfacets, whose normals are distributed about the average surface normal according to some distribution function. The analysis takes into account the self-occlusion of the surface by its own microfacets.
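
A rough sketch of a Cook-Torrance style specular term is given below. Several choices here are assumptions not fixed by the text above: the Beckmann function is used as the microfacet distribution, Schlick's formula stands in for the full Fresnel equations, the product is normalized by 4 (n.l)(n.v) as in one commonly used convention, and the parameters roughness and f0 are illustrative values.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cook_torrance_specular(normal, to_light, to_viewer, roughness=0.3, f0=0.04):
    """Specular reflectance D * F * G / (4 (n.l)(n.v)) under the stated assumptions."""
    n, l, v = normalize(normal), normalize(to_light), normalize(to_viewer)
    n_l, n_v = dot(n, l), dot(n, v)
    if n_l <= 0 or n_v <= 0:
        return 0.0  # light or viewer below the surface
    h = normalize(tuple(a + b for a, b in zip(l, v)))  # microfacet normal that mirrors l into v
    n_h, v_h = dot(n, h), dot(v, h)

    # D: Beckmann distribution of microfacet normals about the surface normal.
    cos2 = n_h * n_h
    tan2 = max(1.0 - cos2, 0.0) / cos2
    m2 = roughness * roughness
    d = math.exp(-tan2 / m2) / (math.pi * m2 * cos2 * cos2)

    # G: geometric attenuation (microfacets shadowing and masking each other).
    g = min(1.0, 2.0 * n_h * n_v / v_h, 2.0 * n_h * n_l / v_h)

    # F: Fresnel reflectance, Schlick's approximation.
    f = f0 + (1.0 - f0) * (1.0 - v_h) ** 5

    return d * f * g / (4.0 * n_l * n_v)


n, l = (0, 0, 1), (-1, 0, 1)
print(cook_torrance_specular(n, l, (1, 0, 1)))  # viewer in the mirror direction
print(cook_torrance_specular(n, l, (0, 0, 1)))  # viewer off the mirror direction
```

The physical information mentioned above enters through the parameters: roughness controls how widely the microfacet normals are spread, and f0 is the reflectance of the material at normal incidence.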

Even more complex physically-based models have been derived; however, they are rarely used in practice.