OpenGL Instanced Rendering
When working on visualisations, interactive installations and similar projects, I often
use particles that respond to user interaction. A common way of doing
this is to create a particle system, iterate over the particles and
call e.g. glDrawArrays() after updating the model matrix for each particle.
But one of the goals of modern OpenGL is to move as much work as you can to
the GPU and limit the number of calls to OpenGL itself. Imagine having
thousands of particles and calling glDrawArrays() in a loop like:
for(std::vector<Particle>::iterator it = particles.begin(); it != particles.end(); ++it) {
  Particle& particle = *it;
  glUniformMatrix4fv(matrix_loc, 1, GL_FALSE, particle.model_matrix.getPtr());
  glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
}
It means you're making two calls to OpenGL for every particle, every frame, which is far from optimal. This can be a real bottleneck in simulations. Luckily there is a better, faster way called instanced rendering. Once the per-particle data lives in a buffer, drawing all particles comes down to a single call:
glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 4, particles.size());
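To make the difference in call count concrete, here is a minimal sketch that counts calls instead of issuing them. The CallCounter type and the two helper functions are hypothetical stand-ins written for this illustration; they are not part of OpenGL and only mirror the shape of the two approaches shown above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the OpenGL API: it only counts calls.
struct CallCounter {
  std::size_t uniform_calls = 0;
  std::size_t draw_calls = 0;

  void uniformMatrix() { ++uniform_calls; }            // stands in for glUniformMatrix4fv()
  void drawArrays() { ++draw_calls; }                  // stands in for glDrawArrays()
  void drawArraysInstanced(std::size_t instances) {    // stands in for glDrawArraysInstanced()
    assert(instances > 0);
    ++draw_calls;
  }
  std::size_t total() const { return uniform_calls + draw_calls; }
};

// One uniform upload and one draw call per particle.
inline std::size_t callsWithLoop(std::size_t num_particles) {
  CallCounter gl;
  for (std::size_t i = 0; i < num_particles; ++i) {
    gl.uniformMatrix();
    gl.drawArrays();
  }
  return gl.total();
}

// A single instanced draw call for all particles.
inline std::size_t callsWithInstancing(std::size_t num_particles) {
  CallCounter gl;
  gl.drawArraysInstanced(num_particles);
  return gl.total();
}
```

The sketch leaves out the per-frame buffer upload that the instanced version needs, so in practice the instanced path costs a small, fixed number of calls per frame rather than exactly one.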
Instanced Rendering
The goal of instanced rendering is to make fewer calls to OpenGL. For a particle system it's a minor change to your existing code that can give you a huge boost. For a system I'm working on at the moment, I'm just uploading my complete vector that contains the particles. This means that we probably need to transfer a bit more memory to OpenGL, but this is outweighed by the fact that we can draw thousands of particles with just one call.
The key element in making this work is glVertexAttribDivisor().
When using instanced rendering you basically tell OpenGL something like: draw
N instances using e.g. GL_TRIANGLES, with K vertices per instance.
So if you have 1000 particles and you want to draw a square for each of them, you tell it
to repeat a draw of 4 vertices (4 vertices make a square when using
GL_TRIANGLE_STRIP) and to do that 1000 times.
You use glVertexAttribDivisor() to step through your VBO data, which
contains the vector of your particles. It tells OpenGL to advance a vertex
attribute only once every X instances instead of once per vertex.
So glVertexAttribDivisor(0, 1) means that it will step through the data once
per particle. The 0 here refers to the vertex attribute location and the 1 is
the divisor: the number of instances that are drawn before the attribute
advances to the next element in the buffer.
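The stepping rule can be sketched on the CPU. The two functions below are hypothetical helpers for illustration only; they assume the draw starts at vertex 0 and that no base instance is used, and they only record which buffer element each vertex shader invocation would read.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the attribute element that one vertex shader invocation reads.
// divisor == 0: the attribute advances per vertex (the regular behaviour).
// divisor >= 1: the attribute advances once every `divisor` instances.
inline std::size_t attributeIndex(std::size_t vertex_id,
                                  std::size_t instance_id,
                                  unsigned divisor) {
  return (divisor == 0) ? vertex_id : instance_id / divisor;
}

// Walks instances and vertices in the order an instanced draw visits them
// and collects the attribute index used by every invocation.
inline std::vector<std::size_t> fetchedIndices(std::size_t num_vertices,
                                               std::size_t num_instances,
                                               unsigned divisor) {
  assert(num_vertices > 0);
  std::vector<std::size_t> out;
  for (std::size_t instance = 0; instance < num_instances; ++instance) {
    for (std::size_t vertex = 0; vertex < num_vertices; ++vertex) {
      out.push_back(attributeIndex(vertex, instance, divisor));
    }
  }
  return out;
}
```

With 4 vertices, 2 instances and a divisor of 1, all four vertices of the first instance read element 0 and all four vertices of the second instance read element 1, which is exactly the "one element per particle" behaviour described above.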
Using glVertexAttribDivisor() together with a fixed triangle strip that
is stored in your vertex shader is a convenient way to draw a lot of particles.
You can use the built-in variable gl_VertexID in your shader, which holds the
index of the vertex that is being processed. So when you tell OpenGL to draw 4 vertices
for each instance, this number will be 0, 1, 2, 3 for every instance. By creating an array with
position data in your vertex shader you can simply pick the right vertex position
for your shape, in this case a square. See the vertex shader below:
#version 150

uniform mat4 u_pm;
in vec4 a_pos;
in float a_size;

const vec2 pos[] = vec2[4](
  vec2(-0.5,  0.5),
  vec2(-0.5, -0.5),
  vec2( 0.5,  0.5),
  vec2( 0.5, -0.5)
);

void main() {
  vec2 offset = pos[gl_VertexID];
  gl_Position = u_pm * vec4(a_pos.x + (offset.x * a_size),
                            a_pos.y + (offset.y * a_size),
                            0.0,
                            1.0);
}
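To check the corner lookup without a GL context, the position maths of the shader can be mirrored on the CPU. This is only a sketch: Vec2, kCorners and cornerPosition() are hypothetical names introduced here, and the projection matrix u_pm is left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct Vec2 {
  float x;
  float y;
};

// Same corner table as the `pos` array in the shader, in triangle strip order.
constexpr std::array<Vec2, 4> kCorners = {{
  {-0.5f,  0.5f},
  {-0.5f, -0.5f},
  { 0.5f,  0.5f},
  { 0.5f, -0.5f}
}};

// Mirrors the shader: particle position plus the scaled corner offset,
// before the projection matrix is applied.
inline Vec2 cornerPosition(Vec2 center, float size, int vertex_id) {
  assert(vertex_id >= 0 && vertex_id < 4);
  const Vec2 offset = kCorners[static_cast<std::size_t>(vertex_id)];
  return Vec2{center.x + offset.x * size, center.y + offset.y * size};
}
```

For a particle at (100, 50) with a size of 10, vertex IDs 0 to 3 produce the corners (95, 55), (95, 45), (105, 55) and (105, 45), i.e. a 10 by 10 square centred on the particle.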
Example
The code below implements a very simple particle system that makes use of this technique:
WaterBall.h
#ifndef WATER_BALL_H
#define WATER_BALL_H

#define ROXLU_USE_ALL
#include <tinylib.h>
#include <vector>

class WaterDrop {
 public:
  WaterDrop();
  ~WaterDrop();

 public:
  vec2 position;     // 8 bytes
  vec2 forces;       // 8 bytes - offset 8
  vec2 velocity;     // 8 bytes - offset 16
  float mass;        // 4 bytes - offset 24
  float inv_mass;    // 4 bytes - offset 28
  float size;        // 4 bytes - offset 32
};

class WaterBall {
 public:
  WaterBall();
  ~WaterBall();
  bool setup(int w, int h);
  void update(float dt = 0.016f);
  void draw();
  void addDrop(vec2 position, float mass);

 public:
  int win_w;
  int win_h;
  Program prog;                    /* shaders / prog */
  GLuint vbo;                      /* the vbo that holds the water drop data */
  GLuint vao;                      /* vertex array object */
  size_t bytes_allocated;          /* number of bytes we allocated on gpu */
  std::vector<WaterDrop> drops;    /* the water drop particles */
  mat4 pm;                         /* projection matrix; ortho */
};

#endif
WaterBall.cpp
#include <assert.h>
#include "WaterBall.h"

// ------------------------------------

WaterDrop::WaterDrop()
  :mass(0.0f)
  ,inv_mass(0.0f)
  ,size(0.0f)
{
}

WaterDrop::~WaterDrop() {
}

// ------------------------------------

WaterBall::WaterBall()
  :win_w(0)
  ,win_h(0)
  ,vbo(0)
  ,vao(0)
  ,bytes_allocated(0)
{
}

WaterBall::~WaterBall() {
}

bool WaterBall::setup(int w, int h) {
  assert(w && h);

  win_w = w;
  win_h = h;
  pm.ortho(0, w, h, 0, 0.0f, 100.0f);

  // create shader
  const char* atts[] = { "a_pos", "a_size" } ;
  prog.create(GL_VERTEX_SHADER, rx_to_data_path("waterdrop.vert"));
  prog.create(GL_FRAGMENT_SHADER, rx_to_data_path("waterdrop.frag"));
  prog.link(2, atts);

  glUseProgram(prog.id);
  glUniformMatrix4fv(glGetUniformLocation(prog.id, "u_pm"), 1, GL_FALSE, pm.ptr());

  // create some drops at random positions
  int num = 10;
  for(int i = 0; i < num; ++i) {
    addDrop(vec2(rx_random(0, w), rx_random(0, h)), 1.0f);
  }

  glGenVertexArrays(1, &vao);
  glBindVertexArray(vao);

  glGenBuffers(1, &vbo);
  glBindBuffer(GL_ARRAY_BUFFER, vbo);

  glEnableVertexAttribArray(0); // pos
  glEnableVertexAttribArray(1); // size

  // the stride is one complete WaterDrop; 32 is the byte offset of WaterDrop::size
  glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, sizeof(WaterDrop), (GLvoid*) 0);
  glVertexAttribPointer(1, 1, GL_FLOAT, GL_FALSE, sizeof(WaterDrop), (GLvoid*) 32);

  // advance both attributes once per instance (once per drop)
  glVertexAttribDivisor(0, 1);
  glVertexAttribDivisor(1, 1);

  return true;
}

void WaterBall::update(float dt) {

  if(!drops.size()) {
    return ;
  }

  vec2 force(16.0, 0.0);

  for(size_t i = 0; i < drops.size(); ++i) {
    WaterDrop& d = drops[i];
    d.forces += force;
    d.forces *= d.inv_mass * dt;
    d.velocity += d.forces * dt;
    d.position += d.velocity;
    d.velocity *= 0.99;
    d.forces = 0;
  }

  glBindBuffer(GL_ARRAY_BUFFER, vbo);

  // only reallocate when we need more space, otherwise just update the data
  size_t bytes_needed = sizeof(WaterDrop) * drops.size();
  if(bytes_needed > bytes_allocated) {
    glBufferData(GL_ARRAY_BUFFER, bytes_needed, drops[0].position.ptr(), GL_STREAM_DRAW);
    bytes_allocated = bytes_needed;
  }
  else {
    glBufferSubData(GL_ARRAY_BUFFER, 0, bytes_needed, drops[0].position.ptr());
  }
}

void WaterBall::draw() {
  glBindVertexArray(vao);
  glUseProgram(prog.id);

  // 4 vertices per drop, one instance per drop
  glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 4, drops.size());
}

void WaterBall::addDrop(vec2 position, float mass) {

  if(mass < 0.01) {
    mass = 0.01;
  }

  WaterDrop drop;
  drop.mass = mass;
  drop.inv_mass = 1.0f / mass;
  drop.position = position;
  drop.size = 10 ;

  drops.push_back(drop);
}
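The hard-coded offset 32 passed to glVertexAttribPointer() in setup() only works as long as the member order of WaterDrop stays the same. One way to guard against that, sketched below, is to let the compiler verify the layout. DropLayout and Vec2f are hypothetical stand-ins that copy the member order of WaterDrop (the real class uses tinylib's vec2 and has constructors), and the checks assume 4-byte floats with no padding between members.

```cpp
#include <cassert>
#include <cstddef>

struct Vec2f {
  float x;
  float y;
};

// Hypothetical stand-in with the same member order as WaterDrop.
struct DropLayout {
  Vec2f position;
  Vec2f forces;
  Vec2f velocity;
  float mass;
  float inv_mass;
  float size;
};

// Compile-time checks for the offsets noted in the header comments.
static_assert(offsetof(DropLayout, position) == 0, "position must start the struct");
static_assert(offsetof(DropLayout, size) == 32, "size is expected at byte offset 32");
static_assert(sizeof(DropLayout) == 36, "stride is expected to be 36 bytes");

// Number of bytes an upload of `num_drops` drops needs, like bytes_needed in update().
inline std::size_t bytesNeeded(std::size_t num_drops) {
  assert(num_drops > 0);
  return sizeof(DropLayout) * num_drops;
}
```

With checks like these in place, the attribute pointer call could use offsetof(WaterDrop, size) instead of the literal 32, provided WaterDrop is a standard-layout type. If someone later adds or reorders members, the build fails instead of the particles silently rendering with the wrong size.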