Game Graphics 101: Static 2D (Vectors/Rasters, Color Spaces, 2D Anti-Aliasing, etc.)

 
 
 
[Figure: raster vs vector]

#DDMoG, Vol. V
[[This is Chapter 17(b) from “beta” Volume V of the upcoming book “Development&Deployment of Multiplayer Online Games”, which is currently being beta-tested. Beta-testing is intended to improve the quality of the book, and provides a free e-copy of the “release” book to those who help with improving it; for further details see “Book Beta Testing“. All the content published during Beta Testing is subject to change before the book is published.

To navigate through the book, you may want to use Development&Deployment of MOG: Table of Contents.]]

As noted at the beginning of this Chapter, please keep in mind that

in this book you will NOT find any advanced topics related to graphics.

What you will find in this chapter, however, is the very basics of graphics – just enough to start reading other books on the topic, AND (last but not least) to understand other things which are essential for network programming and game development flow.

Bottom line:

if you’re a gamedev with at least some graphics experience – it is probably better to skip this Chapter to avoid reading about those-things-you-know-anyway.

This Chapter is more oriented towards those developers who are coming from radically different fields such as, for example, webdev or business app development (and yes, switching from webdev into gamedev does happen).

2D Basics

As noted above, graphics is an almost endless field, and we need to start somewhere. And the simplest graphics I can think of is static 2D.

Static 2D

First of all, we need to define our basic building blocks, and one of the very basic building blocks for all kinds of graphics is the static 2D image.

In general, each static image can be represented as either a raster image or a vector one. I won’t go into much detail on raster vs vector here, but will note a few of their properties:

  • Vector images tend to scale MUCH better than raster ones (hey, that’s the whole point of having vectors)
    • That being said, when trying to scale a vector image to VERY small pixel sizes, you might get problems (these problems are closely related to aliasing, see below). One prominent example of this is TTF fonts, which are vector-based, BUT which have so-called “hints” to avoid problems at very small sizes (like 20px and below)
      • When applying the same observation to games: if you need to fit your character into 30-50 pixels or so, you will likely need an artist skilled in so-called “pixel art”, and to have that resolution drawn separately. Of course, there is no exact size below which you really need it, but IMO for sizes over 100×100 the need for pixel art is quite rare, while for 30×30 and non-trivial graphics it is almost universally needed (and you’ll need a REALLY good pixel artist to get anything remotely resembling a character into 30×30 px).
    • Outside of very small pixel sizes, conversion from vector to raster is trivial, but the reverse conversion can range from “tricky” to “outright ugly”
    • Most of the time, vector image files are significantly smaller than raster image files
    • Different types of image sources usually dictate the “vector vs raster” choice
      • Photos are usually raster
        • While converting photos into vectors is technically possible, more often than not you won’t really want to use such converted stuff
      • Hand-drawn cartoons are often drawn as vectors (in particular, Wacom tablets are rather popular among professional artists for this purpose)
        • Even if you have hand-drawn stuff as raster, you MIGHT be able to convert it to vector, though the quality of conversion is not guaranteed. In other words – if you want a vector image, it is generally better to ask your artist to draw it as a vector from the very beginning.
      • Graphics such as rectangles, circles, splines, etc. are inherently vector
        • Most fonts are already vectorised too1

1 Actually, TrueType fonts are inherently vector-based spline outlines (at least if we forget about hints for small sizes)

 

Vector vs Raster for Games

When answering the question “whether we want to use vector or raster images for our game”, we need to realize that there are actually TWO separate questions. The first one is “what we want our artists to produce”, and the second one is “what we want our game engine to use”, and

answers to these two questions may well be different.

As noted above in this Chapter (section [TODO: toolchain]), it is perfectly viable to have our artists produce vector images, then have our asset pipeline / toolchain convert them into different raster versions (for example, aimed at different devices) – and then have our graphics engine(s) deal with different rasterized images on different devices. While I am not saying that this is The Model to follow (it really depends on lots of considerations which are well beyond the scope of this book) – such things DO work, so if you need this kind of workflow, keep it in mind.

On RGB and Colors

As we all know, colors are often represented as a (red, green, blue) tuple, a.k.a. RGB. This (R,G,B) tuple is normally the way our displays work with colors. Each of the elements of the tuple is represented by a certain number of bits. This number of bits is what’s behind “16-bit color” devices (which actually use either 5-6-5 bits, with the extra bit given to green, or just 15 bits at 5 bits per channel); “24-bit color” devices (8 bits per channel); and “30-bit color” devices (10 bits per channel). “32-bit color” device settings usually mean just 8 bits per channel with an additional “alpha channel” for images (more on “alpha channel” below) – so in fact it is not RGB, but RGBA.2 In practice, 8 bits per channel is enough for almost all scenes, except maybe when certain slow gradients are involved (and/or if you’re looking at it from the perspective of a professional artist).
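For illustration, here is a minimal sketch of packing four 8-bit channels into one 32-bit value, along the lines of what graphics code commonly does. NB: the exact byte order (ARGB vs RGBA vs BGRA) varies between platforms and APIs, so treat this particular layout as an assumption:

```cpp
#include <cstdint>

// Pack four 8-bit channels into one 32-bit "ARGB" value.
// NB: the channel order (ARGB vs RGBA vs BGRA) differs between platforms
// and graphics APIs; this particular layout is just an assumption.
inline uint32_t packARGB(uint8_t r, uint8_t g, uint8_t b, uint8_t a) {
    return (uint32_t(a) << 24) | (uint32_t(r) << 16) |
           (uint32_t(g) << 8)  |  uint32_t(b);
}

inline uint8_t alphaOf(uint32_t argb) { return uint8_t(argb >> 24); }
inline uint8_t redOf  (uint32_t argb) { return uint8_t(argb >> 16); }
inline uint8_t greenOf(uint32_t argb) { return uint8_t(argb >> 8);  }
inline uint8_t blueOf (uint32_t argb) { return uint8_t(argb);       }
```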

I am Really Happy that these days we don’t normally need to deal with “color palettes” (a limited number of colors on the whole screen, like 16 or 256, with an index into the palette used to encode each pixel); I’ve done it before (yes, I’m one of those dinosaurs who still remember pre-ice-age stuff 😉 ), and can say it was a Really Big Headache.

With regards to RGB, we need to keep in mind that any (R,G,B) tuple can be alternatively represented as a (Hue, Saturation, Brightness) tuple (a.k.a. Hue, Saturation, Value; HSV or HSB), or as (Hue, Saturation, Lightness) (a.k.a. HSL). These are just that – alternative representations of RGB (up to the precision of conversion calculations), which may be more convenient in some use cases. Also, let’s keep in mind that displays are RGB, so we’ll need to get back to RGB at some point anyway.
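For the curious, a minimal sketch of the standard RGB→HSV conversion (with channels assumed to be normalized to [0,1]) might look as follows:

```cpp
#include <algorithm>
#include <cmath>

struct HSV { float h; float s; float v; };  // h in degrees [0,360), s and v in [0,1]

// Standard RGB -> HSV conversion; r, g, b are assumed to be in [0,1].
HSV rgbToHsv(float r, float g, float b) {
    float mx = std::max({r, g, b});
    float mn = std::min({r, g, b});
    float delta = mx - mn;

    HSV out;
    out.v = mx;                              // "value" is simply the largest channel
    out.s = (mx > 0.f) ? delta / mx : 0.f;   // gray pixels have zero saturation

    if (delta <= 0.f)
        out.h = 0.f;                         // hue is undefined for grays; pick 0
    else if (mx == r)
        out.h = 60.f * std::fmod((g - b) / delta + 6.f, 6.f);
    else if (mx == g)
        out.h = 60.f * ((b - r) / delta + 2.f);
    else
        out.h = 60.f * ((r - g) / delta + 4.f);
    return out;
}
```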


2 as there is no way to see alpha after the image is shown on a physical display, using “32-bit color” for device settings is IMO quite a misnomer

 

On Gamma Correction

One more thing which we need to take into account when speaking about colors is gamma correction. In short – while light as such is linear (so that linear math within our apps is physically correct), devices such as scanners and cameras on one end of our asset pipeline, and displays on the other end of the pipeline, are non-linear 🙁 . As a result, a process known as “gamma correction”3 needs to be performed on the images at two points:

  1. to enter linear color space from real-world image sources. Performed after scanning (or after taking an RGB from a JPEG image, as JPEG images are usually already gamma-corrected)
    • in this case, “gamma correction” should be done in a manner specific to source (scanner/camera/JPEG image/…)
  2. to exit linear space back into real-world. Performed before displaying
    • in this case, “gamma correction” should be done in a manner specific to target display

BTW, as mentioned in [Gritz], “two wrongs don’t make a right”; in other words – skipping gamma correction on both sides, while being better than doing correction only once, exhibits substantial deficiencies 🙁 .

I don’t want to go into the details of gamma correction, but just want to note that the classical formula (Vout=A*Vin^gamma) – the one which actually gave the name to the term – is just one of the possible gamma corrections. In general, “gamma correction” (at least in the sense in which I’m using the term here) is an arbitrary color space conversion which needs to be applied to convert between a device-specific color space and linear color space, and vice versa. Sometimes it is described by the formula above (in particular, JPEG usually carries already-gamma-converted data using the formula above with gamma=2.2); sometimes gamma correction is device-dependent (with one prominent example being the sRGB conversion, which roughly corresponds to CRT monitors) – and it doesn’t need to be derivable from the formula above (in particular, conversions to/from the sRGB space are more complicated than that; see, for example, [Wikipedia.GammaCorrection]).
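To make the sRGB example concrete, here is a sketch of the standard sRGB transfer functions. Note the piecewise shape – a small linear segment near black plus a power segment with exponent 2.4 – which is close to, but not exactly, a pure gamma of 2.2:

```cpp
#include <cmath>

// Standard sRGB <-> linear conversions (values in [0,1]).
// The official sRGB curve is piecewise: a small linear segment near
// black, plus a power segment with exponent 2.4 - so it is NOT exactly
// the naive pow(v, 2.2), even though it is quite close to it.
float srgbToLinear(float v) {
    return (v <= 0.04045f) ? v / 12.92f
                           : std::pow((v + 0.055f) / 1.055f, 2.4f);
}

float linearToSrgb(float v) {
    return (v <= 0.0031308f) ? v * 12.92f
                             : 1.055f * std::pow(v, 1.f / 2.4f) - 0.055f;
}
```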

Overall – if you’re serious about the realistic appearance of your game, you’ll need to keep track of your gamma-related stuff and color spaces; there is a very good description of this in [Gritz]. However, for quite a few non-AAA games you might be able to live without worrying about gamma correction for a while.


3 NB: there are differences in the use of the term “gamma correction” out there; I am using it to cover ALL the non-linear color space conversions, whether described by the “power of gamma” formula or not

 

On Transparency and “Alpha Channel”

Historically, transparent images were implemented in one of two ways. The first way (implemented, for example, in GIF) was to declare one of the colors a “transparent” color. This does work, but has quite substantial issues in practice, because of aliasing.

To deal properly with aliasing when handling transparency and overlaps (more on aliasing below), you DO need partially transparent pixels (i.e. pixels which can have transparency in between “100% opaque” and “100% transparent”). This is normally achieved by using a so-called “alpha channel”. The “alpha channel” consists of a value for each pixel, with the value saying “how transparent this pixel is”. As it is a per-pixel value, it means that for images with an alpha channel, we effectively have an (R,G,B,A) tuple instead of the usual (R,G,B) tuple to represent our pixels.

Note that if your image format doesn’t support an “alpha channel” (for example, JPEG doesn’t), you can implement the “alpha channel” yourself (BTW, quite a few game engines do it too). To do it, you need to have two images – one being a usual non-transparent RGB image, and another being a grayscale image serving as a “mask” (which in this context can be seen as a different name for the “alpha channel”).
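To illustrate how such a mask is applied, here is a minimal sketch of the classic “source over destination” blend; function names are made up for the example, and – strictly speaking – blending should be done on values in linear color space (see the section on gamma correction above):

```cpp
#include <cstdint>

// Classic "source over destination" blend of a single channel, with
// alpha in [0,255] coming from a separate grayscale "mask" image.
// NB: strictly speaking, this should operate on LINEAR values (see the
// gamma correction section above); blending gamma-encoded values
// directly is a common - if often tolerated - sin.
inline uint8_t blendChannel(uint8_t src, uint8_t dst, uint8_t alpha) {
    // out = src*a + dst*(1-a), in integer math with approximate rounding
    return uint8_t((src * alpha + dst * (255 - alpha) + 127) / 255);
}

// Hypothetical per-pixel blend for a 3-channel RGB image plus mask:
inline void blendPixel(const uint8_t src[3], uint8_t mask, uint8_t dst[3]) {
    for (int c = 0; c < 3; ++c)
        dst[c] = blendChannel(src[c], dst[c], mask);
}
```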

On Aliasing

One rather nasty thing which we need to keep in mind when working with graphics (whether 2D or 3D) is that the number of pixels we’re dealing with is limited, at least for all real-world displays. It means that whenever we have a perfect image (such as a vector image, a vector font, or a rendering of a 3D scene), and we need to show it on a screen with a limited number of pixels, we need to approximate. In the real world, quite a bit depends on the way we’re doing this approximation.

For example, if we’re trying to resize a 2D raster image using “nearest-neighbor” scaling, we’ll get a rather jagged4 result – exactly because of aliasing. Another well-known example of 2D aliasing arises when we’re trying to render a vector image (in particular, a vector font) at a relatively small pixel size; see, for example, [Wikipedia.FontRasterization] for an explanation and examples.


4 or Badly Jagged, depending on specifics of the images

 

Naïve aliased rasterization

A naïve rasterization of a vector image, without anti-aliasing, will likely go along the following lines:5

  • We need to calculate the color of the pixel (x,y) in our rasterized image
  • To do it, we calculate (x’,y’) – the position within the source vector image which corresponds to (x,y) in our rasterized image
  • Then, we use (x’,y’) within the source vector space to get c’(x’,y’) – the color of the vector image at (x’,y’) – and use this c’(x’,y’) as the color c(x,y) of our pixel

While this approach is simple and obvious, it is not anti-aliased, which will lead to jagged images. The reason for this jaggedness will become obvious if we note that with the naïve method above, if our original vector image is black-and-white, we can get only white and black pixels in our rasterized image – which (as can be seen, for example, in the illustrations within [Wikipedia.FontRasterization]) is inherently jagged. BTW, if we apply this naïve method to the scaling of 2D images (considering our source image as a vector image made of squares, each square corresponding to a pixel), we’ll get results which are similar to “nearest-neighbor” resizing (i.e. also ranging from “jagged” to “badly jagged”).
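For reference, the naïve approach applied to resizing is literally a few lines of code; here is a sketch of a “nearest-neighbor” resize (names are made up for the example):

```cpp
#include <cstdint>
#include <vector>

// Naive "nearest-neighbor" resize of a 32-bit-per-pixel raster image.
// This is exactly the aliased approach described above: each destination
// pixel takes the color of ONE source point, so edges come out jagged.
std::vector<uint32_t> resizeNearest(const std::vector<uint32_t>& src,
                                    int srcW, int srcH, int dstW, int dstH) {
    std::vector<uint32_t> dst(size_t(dstW) * dstH);
    for (int y = 0; y < dstH; ++y) {
        int sy = y * srcH / dstH;             // the (x', y') for this (x, y)
        for (int x = 0; x < dstW; ++x) {
            int sx = x * srcW / dstW;
            dst[size_t(y) * dstW + x] = src[size_t(sy) * srcW + sx];
        }
    }
    return dst;
}
```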

Simple anti-aliased rasterization

Ways to implement anti-aliasing are numerous and complicated, and in general are beyond the scope of this book. However, I still want to mention one simple technique here. To implement some anti-aliasing, we can render our vector image as follows:

  • We still need to calculate the color of the pixel (x,y) in our rasterized image
  • To do it, we’re considering our pixel not as a point anymore, but as a square
  • To represent this square, we calculate (x0’,y0’) and (x1’,y1’) – positions of the opposite corners of this square within the source vector image
  • We can use the whole square defined by the corners (x0’,y0’) and (x1’,y1’) within the source vector space to calculate our c(x,y). Methods of this calculation over the square can vary, but the two most obvious ones are:
    • an average of c’(x’,y’) over the square
      • To get it, you can use your knowledge about the nature of your vector image (i.e. if it is a spline, you can calculate areas based on that knowledge).
      • Alternatively, you can calculate an integral of your c'(x’,y’) function over the square defined by (x0′,y0′) and (x1′,y1′), and divide it by the square’s area (this is exactly what “average over the square” means mathematically, and it is a classical numerical integration problem). One way of doing it is to take a bunch of points within the square defined by (x0’,y0’) and (x1’,y1’) and average the values at those points.6 When applied to resizing (again, considering our source image as a vector image made of squares) – this will produce results which are more or less similar to “box sampling”
      • one simplified way to average is to calculate an average of the colors at the 4 corners of the square: c=average(c'(x0′,y0′), c'(x0′,y1′), c'(x1′,y0′), c'(x1′,y1′))
    • linear interpolation between c’(x0’,y0’) and c’(x1’,y1’). This is obviously MUCH less computationally intensive than the averaging stuff, though it MIGHT produce some artifacts due to information loss. When applied to resizing (once again, considering our source image as a vector image made of pixel-sized squares) – this will produce results which are more or less similar to “bilinear interpolation”
    • One of the questions you’ll need to answer when interpolating or averaging is which representation (more strictly – which “color space”) should be used for the averaging and/or interpolation.
      • One thing here is clear – RGB is usually not good for these purposes. However, [StackOverflow.AverageTwoColors] has a suggestion (didn’t try it myself, but looks reasonable on the surface) to use square roots of an average of RGB squares.
      • Something like HSV usually works better than RGB for averaging/interpolation
      • I’ve heard from credible sources (though never tried it myself) that Lab color space [https://en.wikipedia.org/wiki/Lab_color_space] works even better than HSV.
      • Feel free to try Y’CbCr color space used by JPEG (no warranties of any kind, batteries not included)

Of course, this kind of anti-aliasing is quite limited, and there are significantly better methods out there – but if you’re running out of options, using this simple anti-aliasing is usually MUCH better than staying without any anti-aliasing at all.
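To make the averaging variant concrete, here is a sketch of a box-sampling downscale. It uses the sqrt-of-averaged-squares trick from [StackOverflow.AverageTwoColors] mentioned above for the per-channel averaging, which – as noted – I haven’t verified myself, so treat that choice as an assumption:

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Box-sampling DOWNSCALE (assumes dstW <= srcW and dstH <= srcH): each
// destination pixel is an average over the corresponding rectangle of
// source pixels. Per the suggestion mentioned above, channels are
// averaged as sqrt(mean(v^2)) rather than mean(v) - an assumption,
// not gospel. Pixels are 0x00RRGGBB.
std::vector<uint32_t> downscaleBox(const std::vector<uint32_t>& src,
                                   int srcW, int srcH, int dstW, int dstH) {
    std::vector<uint32_t> dst(size_t(dstW) * dstH);
    for (int y = 0; y < dstH; ++y) {
        int sy0 = y * srcH / dstH, sy1 = (y + 1) * srcH / dstH;
        for (int x = 0; x < dstW; ++x) {
            int sx0 = x * srcW / dstW, sx1 = (x + 1) * srcW / dstW;
            double sum[3] = {0, 0, 0};
            int n = 0;
            for (int sy = sy0; sy < sy1; ++sy)
                for (int sx = sx0; sx < sx1; ++sx, ++n) {
                    uint32_t p = src[size_t(sy) * srcW + sx];
                    double r = double((p >> 16) & 0xFF);
                    double g = double((p >> 8) & 0xFF);
                    double b = double(p & 0xFF);
                    sum[0] += r * r; sum[1] += g * g; sum[2] += b * b;
                }
            uint32_t r = uint32_t(std::lround(std::sqrt(sum[0] / n)));
            uint32_t g = uint32_t(std::lround(std::sqrt(sum[1] / n)));
            uint32_t b = uint32_t(std::lround(std::sqrt(sum[2] / n)));
            dst[size_t(y) * dstW + x] = (r << 16) | (g << 8) | b;
        }
    }
    return dst;
}
```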


5 we’re not speaking about efficiency here, only about the basic idea
6 in extreme cases, you can even use Monte-Carlo simulation to get those points to average

 

Aliasing and transparency

As we’ve seen above, to get rid of aliasing, we need to account for color averaging between pixels. Pretty much the same thing holds when we’re trying to draw a transparent object (such as a sprite or whatever-else) on top of a background. If our semi-transparent object has all its pixels either fully opaque or fully transparent, then after positioning it over the background it will look jagged. To avoid this, border pixels of the image need to be partially transparent (i.e. neither fully opaque nor fully transparent); this is usually achieved using the “alpha channel” described above.

Raster formats: On JPEG vs PNG (vs GIF)

At the risk of being too much of a Captain Obvious, I will still briefly mention the difference between two of the most popular raster formats for game development: JPEG and PNG.

In short:

  • JPEG is lossy, PNG is lossless. That being said, for photo images the quality loss of JPEG is usually not visible7
  • PNG supports transparency (with an alpha channel too), JPEG as such doesn’t.8
  • DO use JPEG for photo-like images
    • Size gains will be significant, quality degradation won’t be visible
    • If you need transparency for photo-like stuff – consider having the main image in JPEG, and a separate mask (either a grayscale JPEG or a PNG) as your “alpha channel”
  • DO use PNG for cartoon-like graphics
    • As a rule of thumb, image quality will be MUCH better than that of JPEG (in particular, JPEG, as well as any Fourier-like compression, tends to produce lots of artefacts near the edges of same-color-filled areas)
    • As a rule of thumb, the size of a PNG for this kind of graphics won’t be too bad; in fact, if the graphics has lots of same-color fills, the PNG will often be smaller than a JPEG (while providing MUCH better quality too).

When it comes to GIF – well, TBH I don’t really see much place for GIFs in modern game development. Most of the things which GIF can do, PNG can do better. While in the webdev world there is still one thing which makes people use GIFs (it is related to animation support in PNGs in existing browsers), it doesn’t really apply to game development. In short – unless your game/graphics engine explicitly says “use GIFs because we have Really Good Reasons to prefer them over PNGs”9 – use PNGs instead.

One last note – if you’re using raster images both as source assets and for your game engine, you will probably want to use PNGs (or any other lossless format, .PSD included 🙂 ) for your source assets even if you’re working with photos and photo-like images; this is related to the observation that EACH re-save of a JPEG tends to degrade quality a bit, so if you have, say, 100 edits, the difference in quality can become visible. Whether this degradation will be visible in your particular case – it depends, but it might, so artists generally prefer to stick to lossless image formats for their source assets.
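If you want to see this generation loss for yourself, here is a toy experiment; it assumes the well-known stb_image / stb_image_write single-header libraries, and the file name is made up for the example (start by copying your original to work.jpg):

```cpp
// Toy experiment: decode and re-encode the same JPEG 100 times to watch
// generation loss accumulate. Assumes the single-header stb_image /
// stb_image_write libraries; "work.jpg" is a made-up name - copy your
// original image to it before running.
#define STB_IMAGE_IMPLEMENTATION
#define STB_IMAGE_WRITE_IMPLEMENTATION
#include "stb_image.h"
#include "stb_image_write.h"

int main() {
    for (int i = 0; i < 100; ++i) {
        int w, h, comp;
        unsigned char* px = stbi_load("work.jpg", &w, &h, &comp, 3);
        if (!px) return 1;
        stbi_write_jpg("work.jpg", w, h, 3, px, 85);  // re-save at quality 85
        stbi_image_free(px);
    }
    return 0;  // now compare work.jpg against the original - edits add up
}
```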


7 assuming that you’ve spent your time on finding optimum JPEG compression for your image
8 an improvement over JPEG, known as JPEG2000, adds support for transparency, but last time I checked, it still wasn’t supported widely enough 🙁
9 and I don’t know of any such engines

 

PNG and JPEG compression technicalities

Here goes a very high-level overview of how PNG and JPEG compress images.

PNG uses the rather usual deflate algorithm to compress information about pixels (strictly speaking, it first runs simple per-scanline prediction filters, and then deflates the result). And deflate is essentially LZ77 (plus Huffman coding), which encodes exactly-repeated portions of the stream rather efficiently. It means that if there are lots of adjacent pixels of the same color (or there is any other exactly repeated, relatively small pattern), PNG is going to be very efficient.
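This is easy to see with deflate itself; the following sketch (using zlib’s compress2(); the resulting sizes in the comments are illustrative, not guaranteed) compresses 100,000 identical bytes vs 100,000 pseudo-random bytes:

```cpp
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <zlib.h>

// Quick illustration of why PNG loves flat fills: deflate squeezes a run
// of identical "pixels" down to almost nothing, while noise stays big.
static uLong deflatedSize(const std::vector<unsigned char>& data) {
    uLongf outLen = compressBound(data.size());
    std::vector<unsigned char> out(outLen);
    compress2(out.data(), &outLen, data.data(), data.size(),
              Z_BEST_COMPRESSION);
    return outLen;
}

int main() {
    std::vector<unsigned char> flat(100000, 0x7F);   // same byte repeated
    std::vector<unsigned char> noise(100000);
    for (size_t i = 0; i < noise.size(); ++i)
        noise[i] = (unsigned char)(rand() & 0xFF);   // pseudo-random bytes

    printf("flat:  %lu bytes\n", (unsigned long)deflatedSize(flat));   // tiny
    printf("noise: %lu bytes\n", (unsigned long)deflatedSize(noise));  // ~100000
}
```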

JPEG, however, is a very different beast. Internally, JPEG uses a DCT (Discrete Cosine Transform, a close cousin of the Fourier Transform), which works over 8×8 pixel blocks.10 As a result, within the JPEG format we get a Fourier-like representation of the picture, with different (spatial) frequencies represented by different coefficients. Then, as human eyes are less sensitive to higher-frequency variations, the higher DCT frequencies are stored with lower accuracy than the lower ones (i.e. we’re losing information, but doing it in a way which is not THAT obvious to a human eye). It means that:

  • JPEG’s quantization of DCT coefficients makes it an inherently lossy algorithm
  • It can be used for Fourier-like scaling (in fact, I’ve seen a system which achieved downsampling quality better than all the bilinear and bicubic stuff by merely supersampling the original image, and then using the 2x downsampling provided by a standard JPEG library)
  • As JPEG operates on 8×8 blocks which are processed independently, it is prone to “blockiness” along the boundaries of these blocks.
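For illustration, here is a naïve (O(8^4) per block) sketch of the 2D DCT-II which sits at the heart of JPEG; real codecs use fast factorizations instead (see footnote 10), but the math is the same:

```cpp
#include <cmath>

// Naive 2D DCT-II over one 8x8 block - the transform at the heart of JPEG.
// O(8^4) per block; real codecs use fast factorizations instead, but the
// math is the same.
void dct8x8(const double in[8][8], double out[8][8]) {
    const double PI = 3.14159265358979323846;
    for (int u = 0; u < 8; ++u)
        for (int v = 0; v < 8; ++v) {
            double sum = 0.0;
            for (int x = 0; x < 8; ++x)
                for (int y = 0; y < 8; ++y)
                    sum += in[x][y]
                         * std::cos((2 * x + 1) * u * PI / 16.0)
                         * std::cos((2 * y + 1) * v * PI / 16.0);
            double cu = (u == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
            double cv = (v == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
            out[u][v] = 0.25 * cu * cv * sum;  // out[0][0] is the "DC" term;
                                               // higher u,v = higher frequencies
        }
}
```

It is the coefficients out[u][v] which then get quantized (more coarsely for higher u,v) – and it is this quantization step which actually loses the information.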

10 a question of “how can a Fourier-like transform be made THAT efficient speed-wise” is well beyond the scope of this book, but apparently it can 😉

 

Static 2D – is it any good for games?

Ok, we’ve discussed the very basics of static 2D graphics. While we haven’t gotten to animation yet, we’ve already described quite a few concepts which are necessary for any further discussion (we cannot describe sprites without defining transparency, we cannot speak about colors without RGB, and so on).

As a little encouragement for all our efforts (me describing it and you reading up to this point 😉 ), let’s mention that, contrary to popular belief, it IS possible to make a successful multiplayer game using only static 2D graphics.11

One prominent example of such a game is Lords&Knights. I won’t concentrate on their gameplay (though IMO it DOES deserve to be studied by your game designers), but will mention that the game has only two screens with any graphics at all (and very rudimentary graphics at that). One is a map with “castles” (the map essentially repeats itself in every direction, with minor random variations). Plus, when a player clicks on her castle, she can see an image of the “castle in its current state”. In fact, it is simply a “base” image with overlaid transparent parts (such as “stables” or a “tower”) depending on whatever-stuff-you-already-built in the castle. That’s it! No animation, nothing but these static images – and they still got hundreds of thousands of loyal players, etc. etc.

I don’t really mean that you should build your game as 2D static graphics, but well – even these things have been seen to be successful :-).


11 actually, you can make a game without any graphics at all, starting from my personal favorite NetHack and mentioning much more recent Dwarf Fortress, Hacker Experience, or Aurora 4x. Not to mention stock exchanges etc.

 

[[To Be Continued…

This concludes beta Chapter 17(b) from the upcoming book “Development and Deployment of Multiplayer Online Games (from social games to MMOFPS, with social games in between)”. Stay tuned for beta Chapter 17(c), where we’ll start discussing animation and double buffering…]]


References

Acknowledgement

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.


Comments

  1. Wanderer says

    Regarding anti-aliased rasterization, as far as I understand, the whole math problem is called “numerical integration”. I.e. you have a piece of surface and you want to calculate average of f(x,y) over it, which in turn is “just” an integral of f(x,y) over (x0, x1) x (y0, y1) divided by area size (i.e. (x1-x0)*(y1-y0));

    Possible approaches are described here – https://en.wikipedia.org/wiki/Numerical_integration

    Personally, I only worked with 1D numerical integration in a sound fx/modeling field, for which nice sampling formulas with pre-calculated constants can be found in books (https://en.wikipedia.org/wiki/Newton%E2%80%93Cotes_formulas)

    For 2D space, one can indeed use Monte-Carlo (which is probably a bit of overkill) as pointed out in the first wiki link under “Multidimensional integrals”. Another approach is to construct a 2D “sampling” version, i.e. google “Trapezoidal Rule in 2D” or https://www.kth.se/social/upload/52a04c17f27654620e188bb0/Ant-Integration.pdf

    Omitting all math stuff there, the idea is pretty simple – one can approximate using “flat” approximation, i.e. taking central point as the average value over block (worst, but quickest result), using “linear” approximation as described in your text (i.e. trapezoidal rule), or going with higher orders (for example, taking corners and central point values with some hardcoded weights) and subdividing the square into sub-squares.

    PS: at the same time, I’m pretty much sure there should be some good rasterization libs out there that already implement all those Newton–Cotes formulas along the lines.

    • "No Bugs" Hare says

      Sure, it is an integral; I just didn’t want to go into too many details here (but tried to point out that it is not rocket science to implement it yourself if you’re out of luck with libraries). Will try to mention that it is an integral, though.

  2. says

    I love what you’re doing, but I’d leave these sections on rendering out. Rendering is a vast and complex topic all on its own with some amazing resources for beginners. I think readers would be much better served if you put this effort into topics specific to massively multiplayer games which have been so underserved in the game book market.

    • "No Bugs" Hare says

      As you have probably felt 🙂 , I didn’t really want to write about graphics (you’re right, it is not exactly my cup of tea), but way too many people have said “how can you possibly have a book on whatever-games without graphics?” Also, in other parts of the book, I have to refer to stuff such as meshes etc. (like in “for server-side states, you usually need to use low-poly meshes”) – so to be self-contained, I need to include a very high-level definition of “what is a mesh” too (and in a similar manner, questions like “what to use – JPEG or PNG” can be within the scope too, especially if speaking about user-uploadable avatars; and animations have an effect on client-side architecture; and so on, and so forth). And there is one more point which I’d like to make here – with 2D engines, it is possible to DIY at a basic cost of dozens of man-weeks (and not dozens of man-years as for 3D); while DIY 2D is not always a good idea, it is an option which is viable even for smaller shops, and I want to mention it.

      Don’t worry though – as advertised ;-), I am NOT going to go into details beyond defining a few basic things in layman terms. Overall, it is going to be one chapter out of 34 or so (and what I am struggling with – is “how to squeeze discussion on graphics into less than 50 pages”).

      BTW, about those good resources – if you could give a few pointers, it could be very useful (I have trouble finding good explanations of 2D graphics stuff which are not written in terms of “which buttons need to be pressed in Unity Editor to get whatever-you-need”).
