OpenGL Instanced Rendering
When working on visualisations, interactive installations, etc., I often
use particles that respond to user interaction. A common way of doing
this is to create a particle system, iterate over the particles, and
call e.g. glDrawArrays() after updating the model matrix for each
particle.
But one of the goals of modern OpenGL is to move as much work as you can to
the GPU and limit the number of calls to OpenGL itself. Imagine having
thousands of particles and calling glDrawArrays() in a loop
like:
for(std::vector<Particle>::iterator it = particles.begin(); it != particles.end(); ++it) {
  Particle& particle = *it;
  glUniformMatrix4fv(matrix_loc, 1, GL_FALSE, particle.model_matrix.getPtr());
  glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
}
This means you're making at least two calls to OpenGL per particle, which is far from optimal and can be a real bottleneck in simulations. Luckily there is a better, faster way called instanced rendering. Once the per-particle data is stored in a VBO, drawing all the particles takes a single call:
glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 4, particles.size());
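To make the difference in call count concrete, here is a minimal sketch that only counts calls. CallCounter and its methods are hypothetical stand-ins for the real OpenGL functions, not part of any library, and the per-frame buffer upload that the instanced version needs is not counted.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the driver: it only counts issued API calls.
struct CallCounter {
  std::size_t calls = 0;
  void uniformMatrix() { ++calls; }                   // stands in for glUniformMatrix4fv()
  void drawArrays() { ++calls; }                      // stands in for glDrawArrays()
  void drawArraysInstanced(std::size_t) { ++calls; }  // stands in for glDrawArraysInstanced()
};

// Loop version: one uniform upload plus one draw call per particle.
std::size_t callsForLoop(std::size_t num_particles) {
  CallCounter gl;
  for (std::size_t i = 0; i < num_particles; ++i) {
    gl.uniformMatrix();
    gl.drawArrays();
  }
  return gl.calls;
}

// Instanced version: a single draw call, regardless of the particle count.
std::size_t callsForInstanced(std::size_t num_particles) {
  CallCounter gl;
  gl.drawArraysInstanced(num_particles);
  return gl.calls;
}
```

With 1000 particles, callsForLoop(1000) counts 2000 calls while callsForInstanced(1000) counts 1.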
Instanced Rendering
The goal of instanced rendering is to make fewer calls to OpenGL. For a particle system it's a minor change to your existing code that can give you a huge boost. For a system I'm working on at the moment, I'm just uploading my complete vector that contains the particles. This means that we probably need to transfer a bit more memory to OpenGL, but this is outweighed by the fact that we can draw thousands of particles with just one call.
The key element in making this work is to use glVertexAttribDivisor().
When using instanced rendering you basically tell OpenGL something like: draw
N instances using e.g. GL_TRIANGLE_STRIP, with K vertices per instance.
So if you have 1000 particles and you want to draw a square for each one, you
tell it to draw 4 vertices (4 vertices make a square when using
GL_TRIANGLE_STRIP) and to do that 1000 times.
You use glVertexAttribDivisor() to control how OpenGL steps through your VBO
data, which contains the vector of your particles: you tell OpenGL to advance
a vertex attribute once every X instances instead of once per vertex. So
glVertexAttribDivisor(0, 1) means that it will step through the data once per
particle. The 0 here refers to the vertex attribute location and the 1 is the
number of instances that are drawn before the attribute advances to the next
element.
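The stepping rule can be written down as a small function. This is a CPU-side sketch of how I understand the divisor to select an element, not actual driver code; attribElementIndex is a hypothetical name and it ignores any base instance offset.

```cpp
#include <cassert>
#include <cstddef>

// Returns which element of an attribute array is read for a given vertex of
// a given instance.
//   divisor == 0: the attribute advances per vertex (the default behaviour).
//   divisor  > 0: the attribute advances once every `divisor` instances.
std::size_t attribElementIndex(std::size_t vertex_id,
                               std::size_t instance_id,
                               unsigned divisor) {
  if (divisor == 0) {
    return vertex_id;
  }
  return instance_id / divisor;
}
```

With a divisor of 1, all 4 vertices of particle 7 read element 7 of the VBO. With a divisor of 2, particles 6 and 7 would both read element 3.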
Using glVertexAttribDivisor() together with a fixed triangle strip that
is stored in your vertex shader is an amazing way to draw a lot of particles
easily. You can use the variable gl_VertexID in your shader, which is automatically
incremented for each vertex you draw and starts over for every instance. So when
you tell OpenGL to repeat 4 vertices for each instance, this number will be
0, 1, 2, 3. By creating an array with position data in your vertex shader you
can simply pick the right vertex position for your shape, in this case a
square. See the vertex shader below:
#version 150

uniform mat4 u_pm;
in vec4 a_pos;
in float a_size;

const vec2 pos[] = vec2[4](
  vec2(-0.5,  0.5),
  vec2(-0.5, -0.5),
  vec2( 0.5,  0.5),
  vec2( 0.5, -0.5)
);

void main() {
  vec2 offset = pos[gl_VertexID];
  gl_Position = u_pm * vec4(a_pos.x + (offset.x * a_size),
                            a_pos.y + (offset.y * a_size),
                            0.0,
                            1.0);
}
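To check what the shader computes for one particle, the same lookup can be mirrored on the CPU. This is only a sketch: Vec2, kCorners and quadCorner are hypothetical names, and the projection by u_pm is left out so the result is in window coordinates.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct Vec2 { float x; float y; };  // hypothetical stand-in for the GLSL vec2

// Same corner table as in the vertex shader; the order forms a triangle strip.
const std::array<Vec2, 4> kCorners = {{
  {-0.5f,  0.5f},
  {-0.5f, -0.5f},
  { 0.5f,  0.5f},
  { 0.5f, -0.5f}
}};

// Mirrors the body of main() in the shader, without the projection matrix.
// vertex_id plays the role of gl_VertexID and must be in the range 0..3.
Vec2 quadCorner(Vec2 a_pos, float a_size, std::size_t vertex_id) {
  const Vec2 offset = kCorners.at(vertex_id);
  return Vec2{ a_pos.x + offset.x * a_size, a_pos.y + offset.y * a_size };
}
```

For a particle at (100, 50) with size 10, vertex 0 ends up at (95, 55) and vertex 3 at (105, 45), so the square is centred on the particle position and is a_size units wide.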
Example
The code below implements a very simple particle system that makes use of this technique:
WaterBall.h
#ifndef WATER_BALL_H
#define WATER_BALL_H

#define ROXLU_USE_ALL
#include <tinylib.h>

class WaterDrop {
 public:
  WaterDrop();
  ~WaterDrop();

 public:
  vec2 position;   // 8 bytes
  vec2 forces;     // 8 bytes - offset 8
  vec2 velocity;   // 8 bytes - offset 16
  float mass;      // 4 bytes - offset 24
  float inv_mass;  // 4 bytes - offset 28
  float size;      // 4 bytes - offset 32
};

class WaterBall {
 public:
  WaterBall();
  ~WaterBall();
  bool setup(int w, int h);
  void update(float dt = 0.016f);
  void draw();
  void addDrop(vec2 position, float mass);

 public:
  int win_w;
  int win_h;
  Program prog;                  /* shaders / prog */
  GLuint vbo;                    /* the vbo that holds the water drop data */
  GLuint vao;                    /* vertex array object */
  size_t bytes_allocated;        /* number of bytes we allocated on gpu */
  std::vector<WaterDrop> drops;  /* the water drop particles */
  mat4 pm;                       /* projection matrix; ortho */
};

#endif
WaterBall.cpp
#include <assert.h>
#include "WaterBall.h"

// ------------------------------------

WaterDrop::WaterDrop()
  :mass(0.0f)
  ,inv_mass(0.0f)
{
}

WaterDrop::~WaterDrop() {
}

// ------------------------------------

WaterBall::WaterBall()
  :win_w(0)
  ,win_h(0)
  ,vbo(0)
  ,vao(0)
  ,bytes_allocated(0)
{
}

WaterBall::~WaterBall() {
}

bool WaterBall::setup(int w, int h) {
  assert(w && h);

  win_w = w;
  win_h = h;

  pm.ortho(0, w, h, 0, 0.0f, 100.0f);

  // create shader
  const char* atts[] = { "a_pos", "a_size" } ;
  prog.create(GL_VERTEX_SHADER, rx_to_data_path("waterdrop.vert"));
  prog.create(GL_FRAGMENT_SHADER, rx_to_data_path("waterdrop.frag"));
  prog.link(2, atts);

  glUseProgram(prog.id);
  glUniformMatrix4fv(glGetUniformLocation(prog.id, "u_pm"), 1, GL_FALSE, pm.ptr());

  float cx = w * 0.5;
  float cy = h * 0.5;
  int num = 10;
  for(int i = 0; i < num; ++i) {
    addDrop(vec2(rx_random(0, w), rx_random(0, h)), 1.0f);
  }

  glGenVertexArrays(1, &vao);
  glBindVertexArray(vao);

  glGenBuffers(1, &vbo);
  glBindBuffer(GL_ARRAY_BUFFER, vbo);

  glEnableVertexAttribArray(0); // pos
  glEnableVertexAttribArray(1); // size

  glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, sizeof(WaterDrop), (GLvoid*) 0);
  glVertexAttribPointer(1, 1, GL_FLOAT, GL_FALSE, sizeof(WaterDrop), (GLvoid*) 32);

  glVertexAttribDivisor(0, 1);
  glVertexAttribDivisor(1, 1);

  return true;
}

void WaterBall::update(float dt) {

  if(!drops.size()) {
    return ;
  }

  vec2 force(16.0, 0.0);

  for(size_t i = 0; i < drops.size(); ++i) {
    WaterDrop& d = drops[i];
    d.forces += force;
    d.forces *= d.inv_mass * dt;
    d.velocity += d.forces * dt;
    d.position += d.velocity;
    d.velocity *= 0.99;
    d.forces = 0;
  }

  glBindBuffer(GL_ARRAY_BUFFER, vbo);

  size_t bytes_needed = sizeof(WaterDrop) * drops.size();
  if(bytes_needed > bytes_allocated) {
    glBufferData(GL_ARRAY_BUFFER, bytes_needed, drops[0].position.ptr(), GL_STREAM_DRAW);
    bytes_allocated = bytes_needed;
  }
  else {
    glBufferSubData(GL_ARRAY_BUFFER, 0, bytes_needed, drops[0].position.ptr());
  }
}

void WaterBall::draw() {
  glBindVertexArray(vao);
  glUseProgram(prog.id);
  glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 4, drops.size());
}

void WaterBall::addDrop(vec2 position, float mass) {

  if(mass < 0.01) {
    mass = 0.01;
  }

  WaterDrop drop;
  drop.mass = mass;
  drop.inv_mass = 1.0f / mass;
  drop.position = position;
  drop.size = 10 ;
  drops.push_back(drop);
}
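The hard-coded offset 32 and the stride sizeof(WaterDrop) passed to glVertexAttribPointer() depend on the memory layout of WaterDrop. A possible way to guard against that layout changing is sketched below. It assumes that tinylib's vec2 is two tightly packed floats, as the byte counts in the header comments suggest; Vec2, Drop and the two helper functions are hypothetical stand-ins so the sketch compiles without tinylib or OpenGL.

```cpp
#include <cassert>
#include <cstddef>

struct Vec2 { float x; float y; };  // hypothetical stand-in for tinylib's vec2

// Same member order as WaterDrop in WaterBall.h.
struct Drop {
  Vec2 position;
  Vec2 forces;
  Vec2 velocity;
  float mass;
  float inv_mass;
  float size;
};

// Compile-time checks that the layout matches the offsets used in setup().
static_assert(offsetof(Drop, position) == 0, "position must be the first member");
static_assert(offsetof(Drop, size) == 32, "size is expected at byte offset 32");
static_assert(sizeof(Drop) == 36, "stride is expected to be 36 bytes");

// Byte offset of the size attribute, derived from the struct instead of typed by hand.
std::size_t sizeAttribOffset() {
  return offsetof(Drop, size);
}

// Stride between two consecutive particles in the VBO.
std::size_t dropStride() {
  return sizeof(Drop);
}
```

In setup() the last argument of the second glVertexAttribPointer() call could then be written as (GLvoid*) offsetof(WaterDrop, size), provided WaterDrop is a standard-layout type.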