Previously, on creation, we would parse the entire map data,
translate it into vertices and upload them once, then render
the entire map on every draw (to keep the draw calls minimal).
This worked great for small and medium-sized maps, but starting
with larger maps (200x200+) it doesn't scale, as the GPU's vertex
processing/culling is overwhelmed by the amount of data each frame.
This rewrite instead changes the strategy to process and upload
only a small subregion of the map (the currently visible part),
regenerating all buffers whenever this subregion changes. The
amount of data transferred is small enough that it can be done
every frame without causing lag.
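
A minimal sketch of how the visible subregion could be derived
and compared between frames. All names here (`TileRange`,
`visibleTileRange`, `floorDiv`) are hypothetical and not taken
from mkxp, and clamping to the map bounds is an assumption; a
tilemap that wraps around would need different edge handling.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper type for illustration; not mkxp's actual class.
struct TileRange {
    int x, y, w, h; // position and size, in tiles

    bool operator==(const TileRange &o) const {
        return x == o.x && y == o.y && w == o.w && h == o.h;
    }
    bool operator!=(const TileRange &o) const { return !(*this == o); }
};

// Integer division rounding toward negative infinity, so that
// negative scroll offsets map to the correct tile column/row.
// Assumes b > 0.
inline int floorDiv(int a, int b) {
    int q = a / b;
    return (a % b < 0) ? q - 1 : q;
}

// Compute which tiles intersect a viewport of viewW x viewH pixels
// whose top-left corner sits at (originX, originY) in map pixels.
// The result is clamped to the map (an assumption of this sketch).
inline TileRange visibleTileRange(int originX, int originY,
                                  int viewW, int viewH,
                                  int tileSize, int mapW, int mapH) {
    assert(tileSize > 0);

    int x0 = floorDiv(originX, tileSize);
    int y0 = floorDiv(originY, tileSize);
    // Exclusive end: one past the last tile touched by the viewport.
    int x1 = floorDiv(originX + viewW - 1, tileSize) + 1;
    int y1 = floorDiv(originY + viewH - 1, tileSize) + 1;

    x0 = std::max(0, std::min(x0, mapW));
    y0 = std::max(0, std::min(y0, mapH));
    x1 = std::max(x0, std::min(x1, mapW));
    y1 = std::max(y0, std::min(y1, mapH));

    return {x0, y0, x1 - x0, y1 - y0};
}
```

A caller would keep the previous `TileRange` around and only
rebuild the vertex data when the newly computed range compares
unequal, so scrolling within a single tile triggers no rebuild.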
The changes also have the convenient side effect that we no longer
require 32 bit indices in mkxp, easing the road to possible GLES2
support in the future.
RGSS allows the source rectangle in both `blt` and
`stretch_blt` to lie outside the source bitmap bounds
(treating the missing data as (0, 0, 0, 0)) and to be
inverted (in which case the blitted image is also inverted).
This commit only handles a corner case that
arises in the game "Last Scenario"; emulating the full
RGSS behavior is, however, desirable.
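
For illustration only, a rough sketch of what handling both
cases might involve; this is not what the commit implements.
`Rect`, `NormalizedSrc`, `normalizeSrc` and `clipToBitmap` are
hypothetical names, and the convention that a negative width
spans from `x + w` to `x` is an assumption.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical types for illustration; not mkxp's actual classes.
struct Rect {
    int x, y, w, h;
};

struct NormalizedSrc {
    Rect rect;  // same area, but with non-negative width/height
    bool flipX; // true if the original width was negative
    bool flipY; // true if the original height was negative
};

// Turn an inverted rectangle into a regular one plus flip flags.
// Assumption: a negative width covers the span [x + w, x).
inline NormalizedSrc normalizeSrc(const Rect &r) {
    NormalizedSrc n{r, false, false};
    if (r.w < 0) {
        n.rect.x = r.x + r.w;
        n.rect.w = -r.w;
        n.flipX = true;
    }
    if (r.h < 0) {
        n.rect.y = r.y + r.h;
        n.rect.h = -r.h;
        n.flipY = true;
    }
    return n;
}

// Intersect a normalized rectangle with the bitmap bounds. The
// part that gets cut away is what would be treated as (0, 0, 0, 0).
// The result has zero width/height if there is no overlap.
inline Rect clipToBitmap(const Rect &r, int bmpW, int bmpH) {
    assert(bmpW >= 0 && bmpH >= 0);
    int x0 = std::max(r.x, 0);
    int y0 = std::max(r.y, 0);
    int x1 = std::min(r.x + r.w, bmpW);
    int y1 = std::min(r.y + r.h, bmpH);
    return {x0, y0, std::max(x1 - x0, 0), std::max(y1 - y0, 0)};
}
```

The destination rectangle would have to be shrunk by the same
proportion as the source, which this sketch leaves out.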
Previously, we would just stuff the entire tilemap vertex data
four times into the buffers, with only the autotile vertices
offset according to the animation frame. This meant we could
prepare the buffers once, and then just bind a different offset
for each animation frame without any shader changes, but it also
led to a huge amount of data being duplicated (blowing up
the buffer sizes).
The new method only requires one buffer, and instead animates by
recognizing vertices belonging to autotiles in a custom vertex
shader, which offsets them on the fly according to the animation
index.
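
The per-vertex decision such a shader makes can be mirrored on
the CPU for illustration. This is only a sketch: the atlas layout
constants and the rule "a texture coordinate inside the autotile
area belongs to an autotile" are assumptions, and
`animateTexCoord` is a hypothetical name.

```cpp
#include <cassert>

struct Vec2 {
    float x, y;
};

// Hypothetical atlas layout (in pixels): the first animation frame
// of every autotile lives in the top-left area of the atlas, and
// each further frame sits kFrameStride pixels to the right.
constexpr float kAutotileAreaW = 96.0f;
constexpr float kAutotileAreaH = 128.0f * 7.0f;
constexpr float kFrameStride = 96.0f;

// CPU mirror of the vertex shader logic: vertices whose texture
// coordinate falls inside the autotile area are shifted to the
// current animation frame; all other vertices pass through as-is.
inline Vec2 animateTexCoord(Vec2 tc, int aniIndex) {
    assert(aniIndex >= 0);
    bool isAutotile = tc.x <= kAutotileAreaW && tc.y <= kAutotileAreaH;
    if (isAutotile)
        tc.x += static_cast<float>(aniIndex) * kFrameStride;
    return tc;
}
```

With this scheme the animation index is the only value that
changes per frame, so the vertex buffer itself stays untouched.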
With giant tilemaps, this method would turn out to be a little
less efficient, but considering the Tilemap is planned to be
rewritten to only hold the range of tiles visible on the screen
in its buffers, the cost of the on-the-fly offsetting will become
negligible, while at the same time the amount of data we have to
send to the GPU every time the tilemap is updated is greatly
reduced; so a net win in the end.
Before, we would blindly rotate through the sources (like a
revolver through its chambers), which worked great if one
assumed all sounds to be relatively short and therefore oldest
use == most likely to be free, but breaks if there is one long
sound playing, which would be stopped and overtaken if we rotated
back to it even though there might be other free sources available.
Instead, keep an ascending priority list of sources, with last
used == highest priority, that is iterated through for the first
free one; only if none is found do we overtake the one with the
lowest priority. This also ensures we're always able to play
'SE_SOURCES' sounds at once independently of their length.
Fixes #37.
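
The selection scheme described above can be sketched as follows.
`SourcePool`, `acquire` and the `isFree` callback are
hypothetical names for illustration; in a real audio backend the
callback would query whether the source has stopped playing.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical pool of sound source slots, identified by index.
class SourcePool {
public:
    explicit SourcePool(std::size_t count) : order(count) {
        assert(count > 0);
        // Initial priority order: 0, 1, 2, ...
        std::iota(order.begin(), order.end(), std::size_t{0});
    }

    // Pick a source to play the next sound on. 'order' is kept in
    // ascending priority: front = lowest (least recently used),
    // back = highest (most recently used).
    std::size_t acquire(const std::function<bool(std::size_t)> &isFree) {
        // Default to the lowest priority source, which is only
        // overtaken if no free source is found below.
        std::size_t pick = 0;
        for (std::size_t i = 0; i < order.size(); ++i) {
            if (isFree(order[i])) {
                pick = i;
                break;
            }
        }

        // Move the chosen source to the back: last used == highest.
        std::size_t src = order[pick];
        order.erase(order.begin() + static_cast<std::ptrdiff_t>(pick));
        order.push_back(src);
        return src;
    }

private:
    std::vector<std::size_t> order;
};
```

With one long sound on source 0 and short ones on the others, a
plain rotation would cut off source 0 on wrap-around, whereas
this list skips it as long as any other source is free.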
Performance can still be crudely measured by turning off
the framelimit and observing the FPS count. For everything
else, there's always callgrind / apitrace.
GL entrypoint resolution is now done manually. This has a couple
of immediate benefits, such as not having to retrieve hundreds of
function pointers that we'll never use. It's also nice to have
an exact overview of all the entrypoints used by mkxp.
This change allows mkxp to run fine with core contexts, though
it's unclear how relevant that is going to be in the future.
What's noteworthy is that _all_ entrypoints, even the ones core
in 1.1 and guaranteed to be in every libGL, are resolved
dynamically.
This has the added benefit of not having to link directly against
libGL anymore, which also cleans up the output of `ldd` quite
a bit (SDL2 loads most system deps dynamically at runtime).
GL headers are still required at build time.
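
A minimal sketch of manual entrypoint resolution, assuming a
resolver with the same shape as SDL2's `SDL_GL_GetProcAddress`
(`void *(const char *)`). `GLFunctions`, `resolve` and `loadGL`
are hypothetical names, only two entrypoints are shown, and the
platform calling convention macro that real GL function pointer
types carry is left out for brevity.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Same shape as SDL_GL_GetProcAddress, passed in so that this
// sketch does not depend on SDL or libGL itself.
using ProcResolver = void *(*)(const char *name);

// Table listing exactly the entrypoints the program uses.
// Plain C types stand in for GLbitfield / GLint / GLsizei.
struct GLFunctions {
    void (*Clear)(unsigned int mask) = nullptr;
    void (*Viewport)(int x, int y, int width, int height) = nullptr;
};

// Resolve one entrypoint by name and fail loudly if it is missing.
template <typename Fn>
void resolve(Fn &out, const char *name, ProcResolver getProc) {
    void *ptr = getProc(name);
    if (!ptr)
        throw std::runtime_error(std::string("missing GL entrypoint: ") + name);
    out = reinterpret_cast<Fn>(ptr);
}

// Resolve everything dynamically, including functions that have
// been core since GL 1.1.
inline GLFunctions loadGL(ProcResolver getProc) {
    assert(getProc != nullptr);
    GLFunctions gl;
    resolve(gl.Clear, "glClear", getProc);
    resolve(gl.Viewport, "glViewport", getProc);
    return gl;
}
```

After a GL context has been created, something like
`GLFunctions gl = loadGL(SDL_GL_GetProcAddress);` would fill the
table, and all calls would then go through `gl.Clear(...)` and
friends instead of symbols linked from libGL.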