Realtime Radiosity - a global illumination technique
that achieves its speed (typically an order of magnitude higher than other GI techniques)
by transporting light between actual scene triangles, rather than between additional artificial
structures (voxels etc.), thus minimizing overhead.
Suitable for rasterizers, all local illumination models, light types, materials, shaders, graphics APIs.
Optimal for indoor games and real-time visualizations with fully dynamic lights.
[ NEWS ] [ DEMOS ] [ TECHNIQUES ] [ PEOPLE ]
Lightsmark and Lightsprint SDK are now free and open source software.
After a decade of silence, I plan to have some fun, so I'm resurrecting old projects.
There are tons of news and demos missing here.
Luckily this website (actually all of my projects) is on github, so feel free to send pull requests.
I just updated the list with several new demos: Voxel cone tracing, Sfera and Apollo 11.
Unfortunately I was not able to run the most recent one, Apollo 11, as none of my GPUs is supported.
Thanks to Yann Yeo for the Voxel cone tracing link, and thanks to Redox for several earlier links.
Wow, interactive pathtracing in the browser, instantly added to demos.
The user interface makes the scene completely dynamic.
Each time you change geometry, a light or a material, JavaScript hardcodes everything into a new GLSL shader and runs it.
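To illustrate that trick with my own sketch (C++ rather than the demo's JavaScript; the Sphere struct and shader layout are made up for the example), baking the current scene state into generated GLSL source lets the shader compiler constant-fold everything:

#include <sstream>
#include <string>
#include <vector>

struct Sphere { float x, y, z, radius; }; // hypothetical scene element

// Build GLSL source with the scene hardcoded as constants; recompiled on every edit.
std::string buildSceneShader(const std::vector<Sphere>& spheres)
{
	std::ostringstream glsl;
	glsl << "#version 300 es\nprecision highp float;\n";
	glsl << "const int NUM_SPHERES = " << spheres.size() << ";\n";
	glsl << "const vec4 spherePosRad[NUM_SPHERES] = vec4[](\n"; // assumes at least one sphere
	for (size_t i = 0; i < spheres.size(); i++)
	{
		glsl << "  vec4(" << spheres[i].x << "," << spheres[i].y << ","
		     << spheres[i].z << "," << spheres[i].radius << ")"
		     << (i+1 < spheres.size() ? ",\n" : ");\n");
	}
	glsl << "// ...the rest of the path tracer follows and gets recompiled...\n";
	return glsl.str();
}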
My wish from 3 years ago
> Scenes are made of roughly 10 spheres. It makes me wonder in a good sense,
> we coders love limits and intros show we can do lots in 256 bytes...
> can we make good gameplay with a renderer limited to 10 spheres?
> I guess building games on Thierry's commented source code would be a piece
> of cake. Any demo organizer here?
was nearly satisfied in The once known as Pong project, a set of realtime pathtraced nearly-minigames made of 10 spheres (or slightly more). It still requires a recent Nvidia GPU. It's only "nearly" because although the user can control e.g. a car made of several spheres, and shoot spheres, and there is sphere physics, there is still no game logic.
A few months ago, Jacco Bikker made a small, cute realtime path traced demo, Simplex Paternitas (Windows binary, recent Nvidia GPUs only).
It was made for the demoparty iGathering 2011 with the competition theme "parenting",
so at the beginning you can see two characters (made of a few spheres) making love... oh, they do it behind closed doors (a sphere),
where can I control transparency?
Should I add it to demos? There already is Brigade, a one year older realtime pathtracer by Jacco, with more complex scenes.
Two more voices for papers with source code (but still no idea how to accomplish that, other than crying out loud and trying to build momentum):
Guido van Rossum started it
> But if you want to be taken seriously as a researcher, you should
> publish your code! Without publication of your *code* research in your
> area cannot be reproduced by others, so it is not science.
In the discussion below, daglwn added
> > Indeed, scientifically, it's *preferable* that the results be validated from *scratch*,
> > to avoid risk of "contaminating" recreations with any mistakes made in the original experiment.
>
> Ideally, perhaps, but we do not live in a world of ideals. In the world of reality researchers
> often do not have the time to reproduce the code, in which case the publication goes without
> challenge (this is the most frequent outcome in the computing field). Other times the researchers
> will want to reproduce the code but cannot due to lack of documentation. A really ambitious
> researcher may spend a ton of time figuring out assumptions and discovering major flaws
> in the previous work, none of which can be published (in the computing field) because
> no conference will accept papers that do not show "improvement."
>
> In the computing field, that last point is the Charybdis to the Scylla of unstated assumptions
> and privately-held code. Either one is grounds to seriously question the validity
> of our research methods, and therefore results, in computing.
>
> I once talked to a few higher-ups at IBM research. They flat out stated that they will
> not accept any computing publication as true until they verify it themselves and they find that
> 99% of it is worthless either because it cannot be reproduced or the papers are flat-out lies.
3d artist Amos Dudley asked me about the slow progress in architectural visualization
> I just wanted to know if you know of some realtime, or even
> semi-realtime solution that would work better than the current
> non-realtime model, that exists or will exist soon? From the demos on
> your page (lightsmark, Interactive illumination using voxel cone
> tracing, etc), the technology looks fairly mature- it just confuses me
> why it doesn't seem to be readily offered on the market.
Good question, let me answer here.
There are many excellent ideas and papers, but only one in ten techniques makes it into an executable demo so that people can actually see it, and then only one in ten demos makes it into a product, so that people can actually use it.
Turning an idea or paper into a demo requires significant effort: there is not enough space in a paper for all important details, so the implementer needs to be as clever as the researcher who wrote the paper. In the end, the implementer can discover problems that were not evident from the paper and throw the whole work away. Still, it is fun, so many try it and we see at least a few new demos every year. Of course, paper authors have working 'demos', but the existing journal and grant system does not require them, so no one releases them. (It's difficult to set up an efficient grant system, I can't say I would do it better, researchers already do miracles for minimal wages. Still, demos would increase the chances of new tech getting into production, which is one of the motivations behind grants.)
Turning a demo into a product needs huge effort too. First demos tend to show a single data set that works on a single GPU. A product needs all corner cases working on diverse hardware, and reaching such maturity can take years. A secondary problem is that a demo written only to demonstrate something, or research code written for testing new ideas, can't be integrated into a product. One has to write new code from scratch to make it as flexible and easy to use as Lego pieces, to adapt it for production. Such a rewrite is not terribly difficult for an experienced, skilled programmer, but writing something again, from scratch, without adding new features is no longer fun, so not many do it.
Then there are slightly inflated expectations, as several companies already use realtime GI for marketing, although closer inspection shows that their actual products still use static lightmaps.
Although realtime GI spreads slowly due to the obstacles I just described, some visualization products have already made it to end users. Realtime pathtracing is the most often demonstrated tech, and I think some plugins already exist; it's only too slow/noisy for presenting architecture to end users. Then there is RenderLights, probably the first noise-free realtime GI product on the market (video of RL 1 year ago, it's considerably more advanced now). I did not mention RL earlier because it's based on my code. Several other visualization companies licensed my code; as far as I know, they have not released anything yet. There may be other products I don't know of, let me know if you know of any. So it's not much, but it's getting better.
High Performance Graphics 2011 paper Real-Time Diffuse Global Illumination Using Radiance Hints
(paper,
video,
video)
calculates diffuse lighting in a grid in a fully dynamic scene, like Light Propagation Volumes.
Unlike LPV, it has a grid spanning the whole scene, and it does not inject occluders into the grid; instead
it relies on the incomplete occlusion information present in reflective shadow maps.
So it suffers from low resolution/missing details, which can be improved by adding SSAO,
and from incomplete occlusion; that is something patient game designers can possibly work around with extra scene tweaking,
but non-game users have no way of fighting it. As usual, this problem would be easier to evaluate with a demo.
Eurographics 2011 paper Guided Image Filtering for Interactive High-quality Global Illumination
(paper)
kills some pathtracing noise by blurring indirect illumination less across geometry edges (found in a normal+depth map).
As a proper pathtracing paper, it starts its motivation by poking at rasterization: "Code complexity increases tremendously for systems that simulate global illumination effects using rasterization because multiple approximations need to be combined to achieve a sufficient visual quality." Hmm, I use rasterization and I haven't filtered indirect illumination across geometry edges since 2005. But I won't poke back at pathtracing, the guys already suffer enough by waiting for their renders to complete ;)
Still, I would like to see this demonstrated in Sponza (i.e. no caustics, the guided filter destroys them); it could produce noiseless interactive pathtracing even faster than in the Brigade ERPT video. For interactive use, this could soon become practical.
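For readers who never tried it, the general idea of edge-aware blurring looks roughly like the following sketch (a simple cross-bilateral filter of my own, NOT the paper's guided filter; the G-buffer layout is made up):

#include <algorithm>
#include <cmath>

struct GBufferPixel { float nx, ny, nz; float depth; float indirectR, indirectG, indirectB; };

// Average indirect light over a window, but give samples whose normal or depth
// differ too much from the center pixel a small weight, so the blur does not
// cross geometry edges.
void filterPixel(const GBufferPixel* gbuf, int width, int height, int x, int y,
                 int radius, float& outR, float& outG, float& outB)
{
	const GBufferPixel& c = gbuf[y*width + x];
	float sumR = 0, sumG = 0, sumB = 0, sumW = 0;
	for (int dy = -radius; dy <= radius; dy++)
	for (int dx = -radius; dx <= radius; dx++)
	{
		int sx = std::min(std::max(x+dx, 0), width-1);
		int sy = std::min(std::max(y+dy, 0), height-1);
		const GBufferPixel& s = gbuf[sy*width + sx];
		float nDot = c.nx*s.nx + c.ny*s.ny + c.nz*s.nz;              // normal similarity
		float wNormal = std::pow(std::max(nDot, 0.f), 32.f);         // falls quickly across edges
		float wDepth  = std::exp(-std::fabs(c.depth - s.depth) * 50.f); // depth similarity
		float w = wNormal * wDepth;
		sumR += s.indirectR * w; sumG += s.indirectG * w; sumB += s.indirectB * w;
		sumW += w;
	}
	outR = sumR / std::max(sumW, 1e-6f);
	outG = sumG / std::max(sumW, 1e-6f);
	outB = sumB / std::max(sumW, 1e-6f);
}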
Upcoming Siggraph 2011 paper Interactive Indirect Illumination Using Voxel Cone Tracing
(short paper,
poster,
video,
author's blog)
looks even better.
Similarly to Voxel-based Global Illumination,
it calculates indirect illumination in a prebuilt, dynamically updated voxel hierarchy;
it works with fully dynamic geometry and lighting; but it supports both diffuse and specular reflections.
It also intersects voxels instead of triangle meshes,
but it gets extra speed by approximating visibility in a cone using a single ray.
As usual, any comparison is difficult without a demo, but the video looks promising.
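My rough sketch of the core loop, as I understand it from the short paper (not the authors' code; sampleVoxelOcclusion() stands in for a fetch from a prefiltered, mip-mapped 3D texture):

#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

// stand-in for a lookup in a prefiltered (mip-mapped) 3D occupancy texture
static float sampleVoxelOcclusion(const Vec3& pos, float mipLevel)
{
	(void)pos; (void)mipLevel;
	return 0.05f; // placeholder
}

// Step along the cone axis, sample the voxel mip level whose footprint matches
// the cone radius, and accumulate opacity front-to-back until the cone is blocked.
float traceConeOcclusion(Vec3 origin, Vec3 dir, float coneHalfAngleTan, float maxDistance, float voxelSize)
{
	float occlusion = 0;
	float t = voxelSize;                              // start one voxel away to avoid self-occlusion
	while (t < maxDistance && occlusion < 0.99f)
	{
		float radius = t * coneHalfAngleTan;          // cone footprint grows with distance
		float mip = std::log2(std::max(radius / voxelSize, 1.f));
		Vec3 p = { origin.x + dir.x*t, origin.y + dir.y*t, origin.z + dir.z*t };
		float a = sampleVoxelOcclusion(p, mip);
		occlusion += (1 - occlusion) * a;             // front-to-back compositing
		t += radius;                                  // step proportional to footprint
	}
	return occlusion;
}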
i3D 2011 paper Voxel-based Global Illumination
(paper,
video)
calculates near field illumination in a prebuilt, dynamically updated voxel hierarchy;
it works with fully dynamic geometry and lighting, with diffuse reflections only.
Speed comes from intersecting secondary rays with voxels instead of triangle meshes,
and from traversing only a limited distance (the near field radius), ignoring indirect light from greater distances.
There seems to be no transparency compensation in mostly empty voxels: tiny objects create darker indirect shadows, as if they fully occupied their voxels.
The paper mentions using it for multi-bounce indirect illumination at reduced performance.
Embree - Photo-Realistic Ray Tracing Kernels is an open source CPU pathtracer released a week ago by Intel.
The interactive renderer works well, but like other plain pathtracing renderers, it is slow/noisy. It is slower than comparable GPU pathtracers on comparably expensive hardware.
Intel claims that the Intel Compiler improves performance by approx 10% (compared to Visual Studio or gcc). Although I did not test the Intel Compiler, 10% is believable for the included trivial scenes; it is hard to believe for complex scenes typically bottlenecked by memory access, something that can't be solved by a compiler. Maybe that's why the included scenes are so small, a Cornell box being the most complex one.
For testing complex scenes, the package includes an .obj importer, however it is a bit difficult to work with, and the noise is sometimes too much. (Also, what I did not find documented: emittance is ignored, emissive triangles have to be specified from the command line, with full coordinates. Textures have to be in .ppm. "Error: file format not supported" sometimes means "your filename contains a space, lol".)
KlayGE (an open source game engine) was announced yesterday with "realtime GI" inside.
I did not look at the source code, but lighting in the GlobalIllumination sample
looks very wrong: indirect illumination is concentrated in a small part of the scene
and completely missing in the rest. It does not even look like a first bounce. (Edit: documentation says it's Splatting Indirect Illumination.)
The missing indirect illumination is masked by good old ugly constant ambient.
Some other non-GI samples are very nice and it's good to see how open source engines improve, but
in this version, the "realtime GI" part is not yet working well.
Anton Kaplanyan/Crytek published this realtime GI technique in 2009-08 (Siggraph).
As far as I know, they did not release any demo or game with it.
(I've seen 20 minutes of Crysis 2 video footage,
but there was no realtime GI visible, can anyone explain where I should look for it?)
Lee Salzman published his open source LPV implementation (source and Linux+Windows executables) probably in 2009-09 (according to the date attribute of the files). I'm not adding it to demos, because I don't see it working; the demo runs strangely lit, without the expected GI effects when spheres move.
Andreas Kirsch published his open source LPV implementation (page, source and Windows executable) in 2010-07. It seems to work in a fixed region, the majority of the scene stays completely black, and lights and objects can't be moved, so it's not easy to test, but those few columns in the center of the scene have the expected indirect illumination.
OpenGL 4.1 was just released. Good news inside.
Aside from OpenGL, I just bought several new GPUs for tests, so next time some confused developer uses CUDA instead of OpenCL, I'll come up with better screenshots, I promise.
Jacco Bikker released Brigade, the next gen Arauna, with an optional GPU path.
Unlike e.g. Lightsmark, which splits work between CPU and GPU by work type (direct illumination is faster rasterized on the GPU, incoherent indirect rays are faster on the CPU),
Brigade uses both CPU and GPU for the same task and balances the load to keep both busy.
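Just to illustrate the idea (a toy sketch of my own, not Brigade's actual scheduler): split each frame's rows between CPU and GPU according to how fast each finished the previous frame.

struct Split { int cpuRows; int gpuRows; };

class LoadBalancer
{
public:
	explicit LoadBalancer(int totalRows) : totalRows(totalRows), cpuShare(0.3f) {}

	// how many rows each device renders this frame
	Split split() const
	{
		int cpuRows = int(totalRows * cpuShare);
		return Split{ cpuRows, totalRows - cpuRows };
	}

	// call after both devices finished their part of the frame
	void update(double cpuSeconds, double gpuSeconds)
	{
		if (cpuSeconds <= 0 || gpuSeconds <= 0) return;
		Split s = split();
		double cpuRowsPerSec = s.cpuRows / cpuSeconds;
		double gpuRowsPerSec = s.gpuRows / gpuSeconds;
		if (cpuRowsPerSec + gpuRowsPerSec <= 0) return;
		// new share proportional to measured throughput, damped to avoid oscillation
		float target = float(cpuRowsPerSec / (cpuRowsPerSec + gpuRowsPerSec));
		cpuShare = 0.8f * cpuShare + 0.2f * target;
		if (cpuShare < 0.05f) cpuShare = 0.05f;
		if (cpuShare > 0.95f) cpuShare = 0.95f;
	}
private:
	int totalRows;
	float cpuShare;
};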
The technique is pathtracing, so it's too slow for games (way slower than any engine with direct illumination rasterized), but this also makes it very simple, the source code is short.
As a side effect, it can run without a GPU.
It uses proprietary CUDA instead of OpenCL, so it can't use the majority of GPUs.
Multi-Image Based Photon Tracing for Interactive Global Illumination of Dynamic Scenes
describes a realtime GI technique for DX9 generation GPUs.
Performance numbers look good, considering it's fully dynamic and with caustics, but it's too slow for games.
It's good for hi-poly scenes as it calculates intersections in depthmaps only, no KD trees.
It's not good for complex architecture, as it would need too many depthmaps to represent the whole scene.
virtual C* createReference();
This single member function makes class C refcounted, without breaking old apps and without breaking the simple general rule that every object has an owner who creates it and deletes it with plain old "delete". Users may look at createReference() as if it creates a full copy, but in fact it increments an internal refcount and returns "this". The trick is in the destructor and operator delete. We can't stop the compiler from calling them, but the implementation is ours, so we can check the refcount and destruct member variables only if it is 0. Some class psychologists might talk about destructor abuse, so keep the implementation in the closet to avoid arrest, expose only the nice createReference(), users will love you and your middleware will dominate.
static void* g_classHeader;
class Ninja
{
public:
//! First in instance lifetime.
void* operator new(size_t n)
{
return malloc(n);
};
//! Second in instance lifetime.
Ninja()
{
refCount = 1;
}
//! When deleting an instance, this is called first.
virtual ~Ninja()
{
if (--refCount)
{
// backup class header, we are going to die temporarily
g_classHeader = *(void**)this;
return;
}
// destruct members as in your usual destructor
}
//! When deleting an instance, this is called last.
void operator delete(void* p, size_t n)
{
if (p)
{
Ninja* ninja = (Ninja*)p;
if (ninja->refCount)
{
// resurrect instance after destructor
*(void**)ninja = g_classHeader;
}
else
{
// delete instance after destructor
free(p);
}
}
}
//! Creates new reference to this. Both pointers must be deleted (in any order).
//
//! It is not thread safe, must not be called concurrently for one instance.
//! It may be called concurrently for different instances.
virtual Ninja* createReference()
{
if (this)
refCount++;
return this;
}
private:
volatile unsigned refCount;
};
With class Image ninja-refcounted, it takes only a few lines to make Image* loadImage(filename)
detect when the same filename is used again and return a new reference to the same image.
User code is not affected, memory is saved, load time is reduced. No other refcounting technique makes this possible.
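For illustration, a minimal sketch of such a cache (assuming a hypothetical Image class using the same createReference() trick as Ninja above, and a hypothetical loadImageFromDisk() doing the real work; the cache keeps one reference per file and never releases it):

#include <map>
#include <string>

class Image
{
public:
	virtual Image* createReference(); // ninja-refcounted, same pattern as Ninja above
	// ... pixels, width, height ...
};

Image* loadImageFromDisk(const std::string& filename); // hypothetical real loader

Image* loadImage(const std::string& filename)
{
	static std::map<std::string, Image*> cache;
	std::map<std::string, Image*>::iterator i = cache.find(filename);
	if (i != cache.end())
		return i->second->createReference(); // same file requested again: share pixels
	Image* image = loadImageFromDisk(filename);
	if (image)
		cache[filename] = image->createReference(); // cache keeps its own reference
	return image;
}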
Is this page dead? Definitely not, but not much happened in RTGI.
~~~
OpenCL develops nicely.
~~~
Startups come up with new big words.
~~~
No stronger game consoles on the horizon, this is a real blocker for widespread RTGI use.
~~~
GPU makers play their usual games; while Intel
says that people don't need high performance graphics,
and Nvidia with uncompetitive hw says "demand is high",
the current leader AMD/ATI plays extremely respectably, it totally makes me forget
the pain of buggy drivers several years ago.
~~~
Introversion released Darwinia+,
that's the game I wrote graphics effects for, several years ago.
I hope the guys make enough money to stay independent, the reviews are great.
Sometimes better 2d compression makes better global illumination,
like in the old Broncs demos
with an adaptively subdivided scene and
quantized vertex irradiances saved to jpg (num.vertices * num.frames).
Of course the GI quality sucked, but that was part of the joke,
people believed it was computed in realtime.
Complete dynamic scene GI was brutally crammed into a few kilobytes
thanks to the fresh new libjpeg 6b (by Thomas G. Lane).
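Not the original Broncs code, just the packing idea: quantize the per-vertex RGB irradiance of every animation frame into one 8-bit image with a row per frame, and hand that image to an ordinary JPEG encoder.

#include <algorithm>
#include <cstdint>
#include <vector>

// framesRGB[frame] holds numVertices*3 floats; values above maxIrradiance clamp to 255.
// The returned image has one row per frame and one column per value; it is what
// would be fed to the JPEG encoder and decompressed at playback.
std::vector<uint8_t> packIrradianceImage(
	const std::vector<std::vector<float> >& framesRGB,
	float maxIrradiance)
{
	size_t numFrames = framesRGB.size();
	size_t numValues = framesRGB.empty() ? 0 : framesRGB[0].size();
	std::vector<uint8_t> image(numFrames * numValues);
	for (size_t f = 0; f < numFrames; f++)
		for (size_t v = 0; v < numValues; v++)
		{
			float normalized = framesRGB[f][v] / maxIrradiance;
			image[f*numValues + v] = (uint8_t)(std::min(std::max(normalized, 0.f), 1.f) * 255.f + 0.5f);
		}
	return image;
}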
Years passed and the 2d compression gurus did not sleep. And although fractal compression did not make the revolution we expected, wavelets went nearly mainstream in jpeg2000. Nearly mainstream, because despite wide support in software and 1.2x better compression, people still store data in jpeg, using libjpeg 6.
This summer, Guido Vollbeding released libjpeg 7, 11 years after v6. The first thing I noticed in the readme was a shocking broncs-like "FILE FORMAT WARS: The ISO JPEG standards committee actually promotes different formats like JPEG-2000 or JPEG-XR which are incompatible with original DCT-based JPEG and which are based on faulty technologies. IJG therefore does not and will not support such momentary mistakes", and then "v7 is basically just a necessary interim release, paving the way for a major breakthrough in image coding technology with the next v8 package which is scheduled for release in the year 2010". Ha! I want to know more, how will that v8 work? Friend google quickly found a forum post from 2004 that shows Guido badmouthing jpeg2000 and promising a revolution in jpg. cpoc commented "I like how the guy always raises the unknown-secret mega property of DCT that trumps everything... so no one can argument." A few years later, Guido revealed his secret (pdf). If you need more proof the guy's crazy, see the last page, figure B-1. However, if you really read the pdf, he does good work and v8 may greatly improve progressive jpg, improve compression at low quality settings and lossless compression. Let's wait for v8.
Or not wait? Guido did not provide jpg compression improvement numbers and I think it won't be enough to catch jpeg2000. Jpeg2000 is already here and it's 1.2x better than jpg. The current state of the art, DLI, is 1.6x better than jpg. It's very slow, but a few months back, Dennis Lee added a "fast" mode that's fast enough for practical use, with only slightly worse compression, still much better than jpeg2000. I think DLI could do what jpeg2000 did not, seriously kick jpg's ass. But Dennis Lee would have to open source it.
AMD made the ATI Stream SDK beta with OpenCL on the CPU freely available.
This makes OpenCL development available also to unregistered developers.
It's still only for early adopters, as it's CPU only.
Remaining problem: no vendor supports OpenCL-OpenGL interoperability. Without a fast switch to OpenGL, it's useless for hybrid rendering. DX11 + compute shaders will probably come sooner than working OpenCL+OpenGL. Until vendors fix the problem, OpenCL and DX11 won't compete, they will cover different markets: OpenCL for GPGPU, DX11 for hybrid rendering.
Update: An OpenCL GPU driver developed by Apple should be present in the just released OSX 10.6. I didn't have time to test it yet, but if it's true, it's the first publicly available OpenCL GPU driver. Others report it doesn't support any extensions. That would mean no OpenGL interoperability.
Last year, Cass Everitt opened an interesting discussion on how to do
next-gen graphics on top of OpenCL.
I only found it now, but it's still interesting reading.
And it reminds me, I predicted OpenGL's death by OpenCL
in 25 years. 24 years to go.
On an unrelated note, this is truly web 2.0, now with all post titles linking to themselves. I considered going even further, but this single static html file design already scales fantastically. See how bandwidth grows exponentially (go internets go) while page size grows linearly (I'm not robots).
Nvidia sent a new OpenCL driver for conformance tests and made it available
to registered developers.
Two missing features still stop me from writing OpenCL code
(damn, it says that my experience must not be discussed),
but our future is bright, with many light bounces.
Update: AMD drivers are also already available to selected developers. AMD expects a public release in the "second half of 2009".
Update 2009-05-28: Another closed beta release from Nvidia. It's good, one of the two features I miss was added.
Many years ago, MinGW was the best gratis C++ compiler for Windows. Everyone supported it.
Then gcc 4.0 brought big changes that broke the Windows port and at the same time, Microsoft started
releasing Visual C++ gratis. One day we will wake up, all base will belong to Microsoft and all functions will have the _safe suffix.
Then MinGW will rise again and save Windows coders from submission. But for now, Visual C++ works great and the incentives for supporting MinGW shrink.
Two pieces of good news from Nvidia:
Jacco Bikker posted new
Arauna GI
demo at gamedev.net.
It's not really dynamic as you can't move lights, and the GI precalculation takes 8 seconds on my 4 cores.
But it's still good to see this development, it's only a matter of time
before Jacco or one of his students calls the GI update every frame and starts
tweaking quality to make it interactive.
Go go!
OpenGL 3.1
says I'm new,
specification and beta drivers were released today.
The spec size of the cleaned up API dropped from 3MB to 2MB.
I can't enjoy 3.1 at the moment, as removing all references to removed functions
would cost me at least two unproductive weeks, but the cleaned up spec is great for coders who start from scratch.
Michael Larabel comments on the release with
"In regards to OpenCL, Khronos also let loose that OpenCL implementations should begin appearing quite soon."
Update: Must.. resist.. temptation.. rewrite.. everything.. for.. 3.1.
I'm weak, already rewriting.
Zack Rusin blogs about
a free OpenCL implementation in Gallium3D
(OpenCL is a perfect layer between realtime radiosity and hardware,
Gallium3D is a new 3D driver architecture recently merged to Mesa trunk).
Zack's wish to have OpenCL supported on the first GPU
platform by summer 2009 looks wildly optimistic, however,
as a Gallium3D developer who attended at least one OpenCL working group meeting,
he is qualified to make such an estimate. Thanks, Zack.
Usually I don't blog about adding a new item to the list of demos, but this time
it's different, because I haven't added anything new for nearly 1.5 years.
What happened to the realtime GI freaks?
Arauna evolved,
but I think Arauna based games did not bring anything significantly new over the already listed Arauna demo.
Lightsmark 2008 greatly improved speed and quality over 2007, but somehow I didn't feel it deserved a new item in the list.
Geometric algebra guys still taint good algebra's name by producing marketing materials only.
So what's new?
First, there was smallPT, GI in 99 lines of C++, a fantastic piece of code that renders a Cornell box like scene made only of spheres. It's very slow, but Thierry Berger-Perrin ported it to CUDA and added new scenes and controls for realtime manipulation; The Once Known as SmallPT was born. Magnificent. Beware, it requires an Nvidia card. Source code included.
Scenes are made of roughly 10 spheres. It makes me wonder in a good sense, we coders love limits and intros show we can do lots in 256 bytes... can we make good gameplay with a renderer limited to 10 spheres? I guess building games on Thierry's commented source code would be a piece of cake. Any demo organizer here?
It would be cool to have a single interchange data format for 3d scenes.
There's a need for such a format and there's nothing better than Collada,
so it's sad that Collada support is not a given everywhere yet.
We have no or very weak native support in the biggest game engines
and 3d content creation tools.
However, it's growing from the bottom up, with
native support in smaller products and 3rd party plugins for the big ones.
Recent good news is
a huge speedup
of the new collada max/maya import/export plugins.
It's December and the OpenCL specification is out.
Let's make some advanced realtime stuff that runs everywhere... oh wait, where's the platform runtime?
Update: this slide at hardware.fr shows Nvidia's plan: the final runtime/driver is expected in the middle of 2009.
From full article:
See
Imperfect Shadow Maps for Efficient Computation of Indirect Illumination (pdf).
Instant Radiosity is improved by rendering random surface points into
shadowmaps; it's faster than rendering triangles.
The paper doesn't show scaling, their 'complex' scenes are pretty medium,
but I think it works and we will see it in a game in less than 10 years
- if we survive all those ninja paradigm shifters throwing raytraced donuts.
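The core trick as I read the paper (my sketch, not the authors' code; projectToVplMap() is a hypothetical projection into one virtual point light's view): splat a sparse random set of surface points instead of rasterizing triangles, keeping the nearest depth per texel; the resulting holes are tolerated, and the paper fills them with a pull-push pass.

#include <cfloat>
#include <vector>

struct Point3 { float x, y, z; };

// hypothetical projection: world point -> texel coordinates + depth for one VPL
bool projectToVplMap(const Point3& p, int vpl, int mapSize, int& tx, int& ty, float& depth);

void splatImperfectShadowMap(const std::vector<Point3>& surfaceSamples,
                             int vpl, int mapSize, std::vector<float>& depthMap)
{
	depthMap.assign(mapSize*mapSize, FLT_MAX);           // clear to "far"
	for (size_t i = 0; i < surfaceSamples.size(); i++)
	{
		int tx, ty; float depth;
		if (!projectToVplMap(surfaceSamples[i], vpl, mapSize, tx, ty, depth))
			continue;                                    // point behind or outside this VPL's view
		float& d = depthMap[ty*mapSize + tx];
		if (depth < d) d = depth;                        // keep nearest occluder
	}
}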
Gamefest 2008 presentations
are online.
Shedding (Indirect) Light on Global Illumination for Games
sounds interesting, however the slides are not available in any standard
format, only .pptx. OpenOffice 2.4 doesn't support it.
PowerPoint Viewer 2007 fails.
PowerPoint Viewer 2007 SP1 was expected to fix it, but it fails too.
There's no free viewer capable of opening it... so the point of my story is:
proprietary formats are baaad, mkay?
I reported it to MS, but before a fix is available (and before OpenOffice releases 3.0 with .pptx support), users will google for a solution. Let's attract them. The viewer says PowerPoint Viewer cannot open the file "Shedding Indirect Light on Global Illumination for Games.pptx". The error in the event log is Faulting application PPCNVCOM.EXE, version 12.0.6211.1000, time stamp 0x46ce621f, faulting module oartconv.dll, version 12.0.6214.1000, time stamp 0x47029c72, exception code 0xc0000005, fault offset 0x0022d94b, process id 0x3b78, application start time 0x01c912bcf5abd300.
See
Bouliiii's CUDA raytracer (source code).
Lighting is simple, direct rays only, but the speed is promising,
12-40Mrays/s on an 8800GT.
Of course, raytracing direct light is much slower than rasterizing it,
but it's a good exercise before going to indirect lighting, where
those rays would really help. So is 12-40Mrays/s practical?
Incoherent rays for indirect lighting will be much slower,
but it's difficult to estimate without trying.
However, even with a very roughly estimated 10x slowdown for indirect rays,
this may be the moment when GPUs overtake CPUs. No marketing fakes,
here's the demo.
Gameplayer lists (real-time) GI in its Top 10 Game Technologies.
BTW, every child knows that when writing a new GI article, it's important to keep the first two sentences
in the standard form
1. "GI is extremely critical stuff"
2. explain why
Gameplayer levels up the scheme by adding a strong third sentence
3.
"If you don't believe us, go back and fire up the original Doom."
Siggraph brought news about OpenCL (pdf presentation).
For me, no surprises inside.
The design looks like it could work on Cell too, although with terribly expensive random access to big data structures in global memory.
Still, OpenCL would be good for Cell in PS3 development, because one would be able to test on a PC
and have the roaring PS3 jet-engines turned off.
One year late, OpenGL 3.0 is out and it doesn't bring the promised changes (the 'new API').
This thread shows what probably happened behind the doors:
some CAD companies were against the new API because it would - ahh - break their precious badly written rendering code. That is what I call politics
The IHVs themselves seem to want the new API badly, because it means greatly simplified driver development and better driver quality.
As far as I'm concerned, I welcome even this smaller step, because driver bugs are no longer a problem,
I became pretty skilled in avoiding them.
Thinking more about Khronos, I hope OpenCL will start soon.
Because once we have a generic compute shader platform with simple memory management, OpenGL and Direct3D are just unimpressive high level APIs
built around hw rasterization, sentenced to a slow death.
AMD dropped CTM in favour of OpenCL. Nvidia will drop CUDA for the same reason 10 years later.
The only two questions are: will Intel manage to add serious OpenGL support before everyone else drops it (in 25 years)? Of course,
serious = runs Lightsmark.
And will Microsoft manage to prevent OpenCL implementations on the next Xbox?
Update:
- That thing about CAD companies, repeated on all forums, originates from Carmack.
- I don't think that the death of the 'new API' is sooo big a disaster for driver developers.
A completely new API in addition to the old one, maintained for compatibility, is a lot of additional work, while
the current deprecation model only forces them to add some #ifdefs and compile twice.
If they want, they can create a separate 'clean' GL3 codebase at any time.
Update2: AMD is not dropping CTM, sorry. Plans are to layer OpenCL on top of CAL, which evolved from CTM. However the effect is the same, mainstream app developers won't use a proprietary API if they get a portable one.
After one year of work, Lightsmark 2008,
a realtime GI benchmark,
improves both speed and quality and adds Linux and native 64bit versions.
Enjoy.
Update: users report scores over 700 fps.
Several years ago, an undisclosed game studio manager
_watched_ the latest id software game. He said it _looks_ extremely bad.
I'm pretty sure his feelings were caused by
the complete lack of indirect illumination,
he just had no words for it because he was a manager.
Finally, id gets it too. Carmack explained that he found some of the complaints about Doom 3's brand of horror to be "completely valid," saying that the "contrived nature of monsters hiding in a closet" and the extreme darkness were two things that caused the company to cancel its game Darkness and begin production of Rage.
Another realtime raytraced game based on Arauna was released - Let there be light.
It immediately crashes before displaying the first frame.
This would probably be good for an open source game, people would get involved to fix it
and the developer base would grow. Bad luck, there's only a binary installer.
I really like this effort, it's 100x more important for realtime raytracing than Intel's marketing dollars. On the other hand, people should not confuse this with realistic lighting. With zero or constant ambient, it looks less realistic than an average game. Of course there are ways to improve the lighting and even without looking into Arauna, I'm sure Jacco has something in development.
OpenMP 3.0 final spec was published.
See what's new.
It's no help until there are compilers supporting it, so
good news: OpenMP 3.0 support was merged into gcc 4.4,
the version currently in development, to be released probably in a few months.
Redway3d is one of several smaller raytracing companies out there.
I never mentioned them because I had no chance to see their results.
That changed today as they released an executable
demo.
The lighting looks conventional and the general impression is way behind Outbound,
so I'm not adding it to demos, however, kudos for releasing an executable.
Btw, the watch hand disappears when manually zooming.
It's cracking in the pure raytracing camp, Pete Shirley admitted that
pure raytracing is a no go
(are hybrid methods a natural architecture for interactive graphics? After much contemplation, I have concluded the answer is yes.)
and joined Nvidia.
After years of confusion and marketing abuse, realtime raytracing
finally appears in a game. I mean, not just in the marketing materials
of company X. or Y., but in an actual game.
Jacco Bikker and his students work on Outbound.
The game seems to be available for download.
Update: Gotchas solved, but still no luck.
With Adblock disabled, the download page finally opens.
The download link shot down Firefox, right click to save is necessary.
The game started, but the image was distorted, then it stopped responding and silently disappeared.
The demo displayed a single static image. Maybe Vista x64 is still too exotic?
Update: The demo starts after hitting space, thanks Dhatz. Very nice scene!
The scene and lights are static, but it's still the best realtime raytracing demo ever.
As for the improved realism:
the majority of pixels have only direct lighting + const ambient,
so it resembles the late 20th century. Exceptions are the pixels with
specular reflection, refraction and some strange noise.
One extremely important point I forgot to mention: the source code
of the Outbound engine, Arauna, is available too.
Zigzagging between driver bugs is my daily bread, but this one is funny:
Lightsmark compiled on Linux,
with the Nvidia GeForce driver, scores roughly 10% higher
than it should. Why? Because application time runs slower. Why?
I think it's a bug in the nvidia binary blob driver.
Removing several opengl commands, without any changes to the rest of the execution
path, makes time run normally.
Update: the problem disappeared after several changes in the rendering code. I haven't tried to track it down to a one-line change, but the changes were related to display lists.
Only as a curiosity - Donald Knuth is
not enthusiastic about parallelism.
I think it's because he doesn't work on global illumination.
While direct lighting (either by rasterization or raytracing)
is so effortlessly parallel that it's no fun,
and a TeX page is so serial that it's no fun,
global illumination sits in the middle and brings lots of
interesting problems. Knuth is right that the ground is fragmented, different
hw pieces need different techniques that will go away with the hw in a few years,
but I think the future is predictable:
Another guy thinks raytracing by Intel is a
sophisticated marketing campaign. He explains why.
See
Meshless Finite Elements for Hierarchical Global Illumination.
GCC Improvements on Windows
was accepted into Google Summer of Code.
Thanks Aaron, thanks Google.
Update at the end of summer: It seems Aaron reached roughly 50% of his goals. Still, I think it's an excellent result.
Rendering needs lots of memory. Clients need Windows.
These two powers pushed me into Vista x64.
Yes, I know everyone says Vista is bad, but Dee's no crybaby.
Only 2 hours after a fresh clean install,
automatic Windows update entered the famous infinite reboot loop.
Then I tested both AMD and NVIDIA hi-end cards with the latest stable drivers,
and both consistently crash.
Maybe I'll pick a different challenge next time...
jump on the realtime raytracing hype and solve the rendering equation with a single ray? ;)
Now also
John Carmack talks about raytracing.
Is it already two years since Intel started using the word 'raytracing'
to get free publicity? Even without thinking about the computational cost,
the total lack of hard evidence from Intel (only videos) makes it smell.
So John's opinion doesn't surprise me. However, he adds something I hadn't
realized before: by buying Project Offset, Intel probably starts
work on hard evidence, creating a real demo or game. That would be nice, finally!
OpenMP 3.0 draft was published
(5 days after I wrote that OpenMP doesn't help with task prioritization).
It's possible that Task Queues will help.
I published a new demo, Lightsmark,
with an engine 10x faster than my previous one.
Real-Time Ray Tracing: Holy Grail or Fools' Errand?
differs greatly from the tens or hundreds of mainstream articles on realtime raytracing.
I finally killed the boss at the end of "happy with OpenMP" level
and entered "needs prioritized threads" level.
Ralph Silberstein expressed it precisely,
"I want to run a pipeline, held to some real-time frame rate, and some second pipeline can utilize what is left in a non-deterministic fashion."
OpenMP is no help here
and no new version seems to be in the works.
Is TBB better here?
Intel answers:
it probably is not, but at least it is still under development.
So... finally, I manage threads 'manually'.
Back in VGA times, I
quantized in 3-d by brute force, assembler and a big table.
It was blazingly fast for displaying truecolor images on a cheap VGA.
I displayed 320x200/24 in 320x400/8 (addressed in chain mode, banking was not supported by all vendors)
with two very similar images alternating at 30Hz, so 1 24bit pixel was approximated by 4 quantized 8bit pixels.
It was probably the most realistic VGA graphics, yo :)
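Decades later and purely from memory, a sketch of the idea (C++ instead of the original assembler, greedy selection instead of the original brute-force table): each 24-bit pixel becomes 4 palette indices, 2 per frame, chosen so their average approximates the source color.

#include <climits>
#include <cstdint>
#include <vector>

struct Rgb { int r, g, b; };

// pick the palette entry that keeps the running average of chosen entries closest to the target
static uint8_t pickEntry(const Rgb palette[256], const Rgb& target, Rgb& accum, int chosenSoFar)
{
	int best = 0, bestDist = INT_MAX;
	for (int i = 0; i < 256; i++)
	{
		int dr = (accum.r + palette[i].r)/(chosenSoFar+1) - target.r;
		int dg = (accum.g + palette[i].g)/(chosenSoFar+1) - target.g;
		int db = (accum.b + palette[i].b)/(chosenSoFar+1) - target.b;
		int dist = dr*dr + dg*dg + db*db;
		if (dist < bestDist) { bestDist = dist; best = i; }
	}
	accum.r += palette[best].r; accum.g += palette[best].g; accum.b += palette[best].b;
	return (uint8_t)best;
}

// source: 320x200 truecolor; frameA/frameB: 320x400 8-bit, alternated on screen
void quantizeTo4(const Rgb palette[256], const std::vector<Rgb>& source,
                 std::vector<uint8_t>& frameA, std::vector<uint8_t>& frameB)
{
	frameA.resize(320*400); frameB.resize(320*400);
	for (int y = 0; y < 200; y++)
	for (int x = 0; x < 320; x++)
	{
		const Rgb& t = source[y*320+x];
		Rgb accum = {0,0,0};
		frameA[(2*y  )*320+x] = pickEntry(palette, t, accum, 0);
		frameB[(2*y  )*320+x] = pickEntry(palette, t, accum, 1);
		frameA[(2*y+1)*320+x] = pickEntry(palette, t, accum, 2);
		frameB[(2*y+1)*320+x] = pickEntry(palette, t, accum, 3);
	}
}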
Somehow I think there's not enough realism yet, even with truecolor added to VESA BIOS some time ago.
Realtime global illumination is not pixel perfect.
How do I quantize in 500000-dimensional space, quickly?
What, NP hard? Starting assembler...
It's common to have several versions of a shader, a simple one for low end cards, a complex
one for high end cards (so complex that it would not run on low end), etc.
How do you automatically select the most complex shader supported?
Direct3D defines approx 10 fixed levels of shader length/complexity/whatever (e.g. SM4.0), and you must always pick one of them. If you can fit your shaders into these levels, everything is fine; simply look at the level supported by the graphics card and use the appropriate shader from your collection.
OpenGL is simpler and more general, it doesn't specify length/complexity/whatever, you simply compile your shader and the driver says OK. Or "Sorry, too many interpolators/whatever used, we'll screw you by running it in software." Now you have several shaders and you want to pick the most complex one supported. The only question is, do you test bottom up or top down?
I started with top down: when the most complex shader failed to compile/link, I went to the second most complex and so on, until I found the first supported one. It worked fine except that every card behaved differently and I had strange things reported. I had to buy 5 graphics cards for testing and write several workarounds, based on vendor and model strings (ouch!), to prevent broken rendering and crashes inside the driver when asked to compile a valid GLSL shader. The behaviour often changed with graphics card driver version. It wasn't getting better.
I switched to bottom up: when the simplest shader compiled/linked, I went to the second simplest and so on, until the next one failed. The number of revealed driver bugs fell significantly, only one card family continued to crash in the driver or render incorrectly.
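For the curious, the bottom-up search is just a few GL calls (a sketch; fragment shaders only for brevity, a real renderer would attach its vertex shader too and keep the winning program instead of deleting it):

#include <GL/glew.h>
#include <string>
#include <vector>

// returns the index of the most complex shader that compiles AND links, or -1
int pickShader(const std::vector<std::string>& sourcesSimplestFirst)
{
	int best = -1;
	for (size_t i = 0; i < sourcesSimplestFirst.size(); i++)
	{
		GLuint shader = glCreateShader(GL_FRAGMENT_SHADER);
		const char* src = sourcesSimplestFirst[i].c_str();
		glShaderSource(shader, 1, &src, 0);
		glCompileShader(shader);
		GLint compiled = GL_FALSE;
		glGetShaderiv(shader, GL_COMPILE_STATUS, &compiled);

		GLint linked = GL_FALSE;
		GLuint program = 0;
		if (compiled)
		{
			program = glCreateProgram();
			glAttachShader(program, shader);
			glLinkProgram(program);
			glGetProgramiv(program, GL_LINK_STATUS, &linked);
		}
		if (program) glDeleteProgram(program);
		glDeleteShader(shader);

		if (!compiled || !linked)
			break;           // bottom-up: the first failure ends the search
		best = (int)i;       // this level works, try the next, more complex one
	}
	return best;
}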
Conclusion? OpenGL is superior in design, but inferior in practical use, due to insufficient driver testing. Searching for the most complex shader won't make you happy.
Try searching for the most complex 5-state Turing machine instead. It's like writing a really good intro in 5 instructions. Ready?... go! After 45 years of pure research without drivers, the problem is still open. Nice intro article. It's about beavers too.
I'm back from my "Realtime Global Illumination for next generation games"
talk at Eurographics.
I met some nice people and the organizers were awesome.
I had never talked at a big event before,
so I was nervous and tried the talk at home. It was a total fiasco.
With unlimited time to prepare,
I wasn't able to say 3 proper sentences in a row.
In the conference room, everything was perfect.
I'm better in realtime, precalculations are not for me.
CryTek published more details on their Real-Time Ambient Map technique,
to be used in Crysis, to be released soon.
See chapter 8.5.4.2 in
Advanced Real-Time Rendering in 3D Graphics and Games Course: Finding next gen - CryEngine 2.
Back when only a video was released, I speculated that it was a copy of our technique from 2000 (used in the RealtimeRadiosity 2 demo), where per-vertex layers of lighting are precomputed for sampled positions of the light / objects. When the light / objects move, the closest samples and their layers of precomputed lighting are selected and per-vertex lighting is reconstructed as a weighted average.
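Reconstructed from memory (not the original 2000 code), the runtime part of that technique is essentially a distance-weighted blend of the precomputed per-vertex layers; a real implementation would limit itself to the few closest samples:

#include <cmath>
#include <vector>

struct LightSample
{
	float x, y, z;                   // light position this layer was precomputed for
	std::vector<float> vertexLight;  // one precomputed value per vertex
};

void reconstructLighting(const std::vector<LightSample>& samples,
                         float lx, float ly, float lz,        // current light position
                         std::vector<float>& outVertexLight)
{
	size_t numVertices = samples.empty() ? 0 : samples[0].vertexLight.size();
	outVertexLight.assign(numVertices, 0.f);
	float sumW = 0;
	for (size_t s = 0; s < samples.size(); s++)
	{
		float dx = samples[s].x-lx, dy = samples[s].y-ly, dz = samples[s].z-lz;
		float w = 1.f / (1e-6f + std::sqrt(dx*dx + dy*dy + dz*dz)); // closer sample = bigger weight
		for (size_t v = 0; v < numVertices; v++)
			outVertexLight[v] += samples[s].vertexLight[v] * w;
		sumW += w;
	}
	if (sumW > 0)
		for (size_t v = 0; v < numVertices; v++)
			outVertexLight[v] /= sumW;
}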
Back to reality. CryTek's technique is much simpler and less correct. Only a scalar ambient occlusion value per texel is precomputed. The rest is a magic expression in the shader (not specified precisely) that combines distance attenuation, normal divergence and ambient occlusion. It doesn't simulate light paths, so it would probably look bad in scenes with mostly indirect light, but it probably looks great in mostly open scenes with no big blockers. They added support for portals... I think that was absolutely necessary to avoid light freely leaking through walls.
Conclusion: In mainstream realtime rendering, indoors, it will be the first big improvement over the aging constant ambient. It's a rough estimate rather than a light simulation, so it needs lots of additional infrastructure (portals) and work from artists to look good/avoid artifacts. That also means it's viable only for the gaming industry and even there, adoption won't be fast. However it's very inexpensive for the end user (gamer), so chances are good that it will be adopted by several other engines before fully automated physically based RTGI techniques take the lead.
After many years of work,
MinGW released a technical preview of the first version
based on gcc 4.
It made me re-compare several compilers: gcc 3.4, gcc 4.2, gcc 4.3,
msvc 8, icc 9 (others report that 10 is not a good one)...
It's still nice to use gcc. It has the best warnings of all the compilers, it always finds some error that was completely ignored by the others.
C++ language designer Bjarne Stroustrup
described the current state of C++0x
in his recent talk at The University of Waterloo (video).
I remember seeing concepts (a new type system for templates)
as a proposal in previous papers; here it is already an accepted part of the
language. With the feature freeze for C++0x around now, it gives a good
overview of the next standard, expected to be finalized around 2009.
It seems only remotely related to rendering, however it has some importance for me as my code is heavily templated and new language features may simplify it.
Intel has released Threading Building
Blocks under GPL 2.
Now the question is, is it a good idea to use TBB in a renderer?
Is another threading API necessary, now?
I think it is.
Check the Implicit Visibility and Antiradiance for Interactive Global Illumination
paper, one of the best works!
According to the video, it's much slower than Lightsprint with nearly identical features, but great work.
And Carsten is a scener, yay!
Bad luck there's no demo... maybe at The Gathering or BreakPoint?
Incremental Instant Radiosity for Real-Time Indirect Illumination
is another attempt to kill instant radiosity artifacts.
It makes things simpler, although less correct, by rendering only the first indirect bounce.
No demo, so one can't see how good/bad it is this time.
As you can see above, I added the page Demos.
I'm filling it with links to executable demos of realtime global illumination
and additional information like papers, source code...
I'll have a Realtime Global Illumination talk
at Game Developers Session 2007 next week.
Come in and see the bleeding edge demos,
including bits of the one I'm cooking right now :)
I see that the recent Nvidia ForceWare 158.19 drivers
are able to compile and run more complex GLSL shaders, with 1-2 more interpolators than previously,
on the same GeForce 8800.
In Lightsprint Demo, this immediately, without any setup,
produces smoother Penumbra Shadows.
Good work Nvidia!
Check Lightsprint Demo, a brand new piece
with shiny female characters. Of course globally illuminated!
Check this
Q&A With Jerry Bautista, research head at Intel.
Interactive k-D Tree GPU Raytracing
(2007) advances raytracing on GPU. Good work!
It's still 50x slower than rasterization in rendering shadows,
but specular bounces rock, rasterizers can't fake this.
It took some time,
but finally both CTM and CUDA, the new APIs for GPUs, are available to the broad
developer community, not only to a few universities.
In CTM's case, one only needs to break a psychic barrier and fill in a
request form whose first required field is "University:" :)
A few hours ago, the OpenGL
Pipeline Newsletter 002 with sexy names for future
OpenGL releases was published. DX10 is dead.
Developing on Nvidia, but targeting all graphics cards, was painful.
Unlike ATI (which was so strict it had problems compiling
even correct GLSL shaders),
Nvidia was known for relaxed rules, even shaders with errors were compiled
(eg. with "float x=1;").
With release 95 of their drivers, this has been fixed:
the compiler still accepts such incompatibilities, but emits warnings,
and it is possible to switch to a strict mode where it reports them as errors.
Thanks!
OpenGL support for G80 is clearly superior to DX10 -
it is publicly available now for everyone on several operating systems,
while DX10 is expected to ship next year on only one
proprietary operating system, without the possibility to spread.
Still everyone talks about revolution: "G80 supports DX10!"
and OpenGL is not mentioned.
I think it would be different if OpenGL had some number associated with the
new features, ideally one that's easy to remember... imagine those big fonts
on front pages, "G80 supports OpenGL 100!" Of course this won't happen.
The process behind OpenGL standardization is its main advantage,
the only drawback is that it doesn't generate such cool numbers for marketing.
OpenGL extensions for G80 were released, cool, great work Nvidia!
Read the Beyond3D forum
and the great techreport.com article
for info on the G80 hardware architecture.
When there were more than 10 vendors, universal APIs like VESA BIOS,
OpenGL and Direct3D emerged.
Maybe the 2 remaining giants (AMD, NVIDIA) are too few for a single universal GPGPU
API and we are returning to the painful days without standards.
Ok, it's still too hot an area to set one standard now. The landscape changes quickly with AMD's Fusion plan to integrate GPU and CPU, and optimal APIs for 3d graphics evolve as never before. So let's live with several APIs for the next few years.
But that reminds me, hacking chaotic VGA registers was fun,
because it was simple and nearly bug free;
on the other hand, using the well designed VESA BIOS Extensions (VBE)
was a pain, because there were bugs in ROM
and no chance to fix them without buying a new graphics card, so we had to
live with the bugs.
I wrote this list of approx 20 bugs and workarounds
for VESA BIOS implementations. With this list,
I was able to run on buggy cards,
while others displayed random pixels or crashed.
Using GPGPU APIs could be fun again, because they are not set in stone (in ROM),
and users can fix bugs with a simple driver/library update, so we don't have to
gather workarounds.
The only thing that makes me worry is the closed nature of those APIs.
Comparing APIs created by one company and by multiple companies,
the multi-company ones look clearly better; see eg. how well specified
OpenGL is in comparison to DirectX.
DirectX must be rewritten every 2 years to catch up with OpenGL ;)
NVIDIA responds to AMD's GPGPU API with its own approach -
CUDA.
What do they have in common with AMD?
This post
lists 35 new OpenGL extension names
(with some redundancies) found in the latest
Nvidia drivers. It could be work in progress, but the chances rise that
tomorrow's G80 release will come together with an API. Good!
More G80 tech details on the Inquirer
(see Beyond3D too).
There's probably nothing left to be revealed on the release day.
"Expect a very limited allocation of these boards, with UK alone getting a mediocre 200 boards."
Realtime radiosity won't get a big boost if it's only for a lucky few;
the cheaper stripped down versions coming probably next spring
(together with competition from AMD) will do us a much better service.
The launch of the NVIDIA G80 GPU is at the door.
Lots of advanced effects including
realtime radiosity are going to benefit from new G80 features
as soon as they are supported in drivers.
The web is full of leaked images, leaked specs, leaked prices, leaked everything
(see eg. Beyond3D),
only support for the new features in drivers is hidden in mist,
it is expected but not guaranteed.
If anything goes bad, it's no problem for an experienced marketing machine
to release hw without software support for new features
and silently add it a few months later.
At least specs for DX10 are already available from Microsoft.
In the world of OpenGL, no specs exist yet; Nvidia promises to
create appropriate extensions for the new features (primitive shader,
primitive adjacency etc).
Some announcement is expected on 2006/11/08, let's see in one week...
An essay on realtime radiosity generated by
essaygenerator.
All Eurographics 2006 GmG contestants (including Realtime Radiosity Bugs) are
available for download.
Realtime radiosity gets popular.
Some guys are working on a realtime radiosity game.
Hohoo, I don't know them but I like their screenshots,
very good work with light ;)
(somehow it reminds me of my shots)
QuteMol renders molecules
with direct illumination from the sun and scattered daylight.
It has no radiosity yet ;), but I love their
ambient occlusion technique, which looks like overlapped shadowmaps.
Shadowmapping produces artifacts, everyone
knows them and considers them ugly. But if you overlap many
shadowmaps, many artifacts in different positions overlap
and create a completely different feeling; the smell of ugly error disappears
because it is less visible and looks more like natural noise,
it can even improve image quality.
I found this effect two years ago and I have overlapped shadowmaps everywhere
since then ;)
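A sketch of the effect (not QuteMol's code; shadowTest() is a hypothetical lookup into one jittered light's depth map): average many shadow map tests, so each map's aliasing lands in a different place and blends into soft noise.

struct Vec3f { float x, y, z; };

// hypothetical: 1 = lit, 0 = shadowed, tested against one jittered light's depth map
float shadowTest(const Vec3f& worldPos, int lightIndex);

// fraction of jittered lights that see this point; each map's artifacts average out
float softShadow(const Vec3f& worldPos, int numJitteredLights)
{
	if (numJitteredLights <= 0) return 1;
	float lit = 0;
	for (int i = 0; i < numJitteredLights; i++)
		lit += shadowTest(worldPos, i);
	return lit / numJitteredLights;
}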
I spent some time googling for realtime radiosity news to feed you, my beloved readers ;).
Startups keep popping up with realtime radiosity announcements. No papers or demos, pure marketing.
They already fill forums with funny remarks like
"Looking at their website, it seems they invented realtime anything simply using Algebra."
Several big players with tons of manpower have started to emit realtime radiosity or global illumination
marketing (without papers or demos) too.
Come on and release, it's not stimulating to work on the only demo on the block.
The expected announcement,
with standard stuff like comparing our new product with a competitor's
very old product. No link to download new GPGPU libs, they are just "pioneering".
It seems that more news on ATI's GPGPU business (which could be good news also for realtime radiosity) is going to be
revealed on Sept 29.
Images.
I hope they won't artificially limit the new GPGPU API to the most expensive cards.
Unfortunately this is business, everyone is trying to sell hardware and software is
used as a hostage. Luckily sometimes
it's just an illusion, like when I tried to find low end graphics cards with OpenGL 2.0
support. The marketing machines of NVIDIA and ATI mark their higher end parts as OpenGL 2.0
ready, while lower end parts are marketed with OpenGL 1.5 or even OpenGL 1.4 support.
Fortunately even quite low end cards with a new driver run fine with OpenGL 2.0
(GLSL and other stuff).
Eurographics 2006 contest results were finally
published.
(Realtime Radiosity Bugs participated.)
"While novelty was the main criteria to choose the 8 contestants, naturally the 'fun factor' of the games had a significant impact on the audience vote."
Radiosity on Graphics Hardware
(2004) describes a nice radiosity-on-GPU technique.
Realtime Radiosity Bugs were finally released,
tested on both NVIDIA and ATI cards. Enjoy!
I'm back from the shop with a shiny new Radeon with an ugly creature on the cover
(ok, no joking, marketing guys with GeForce cards believe in ugly creatures too).
Finally, both a Radeon and a GeForce sit in my computer.
I won't do it again, no more releases that run fine on a GeForce but crash
on a Radeon.
I used to think of myself as a business (in contrast to academic research),
because I left the university with a Masters degree while my friends stayed
there.
I solve real tough problems and sometimes people find my work
valuable, because no other solutions to their problem exist.
But today my self confidence as a businessman was broken.
The real business is made by people who claim they can
turn any color image into a black-and-white picture.
Academic science doesn't necessarily solve all problems.
Known papers concentrate on rendering with higher and higher quality.
If it doesn't add a few more digits of precision, it's rejected.
Look at deer's royal heads: a deer needs the biggest horns in
the neighbourhood to be sexually attractive, even if they
are a useless waste of resources otherwise and may harm him.
Some have even explained how it's nearly impossible for
evolution to revert this and make female deer sexually prefer
males without those handicaps on their heads. Luckily
scientists are no deer females, they can change the trend.
Now back to the papers on rendering:
If speed is a concern, it is always secondary.
However the world also needs the solution
with the best quality possible _in realtime_,
and that could be, but isn't necessarily, based on papers
that go for max precision in unlimited time.
You can see many papers with precomputed global illumination. They offer very fast rendering of precomputed data and it's not always clearly visible that it must be precomputed, it can be hidden deep inside the article. So some of them are misinterpreted by mass media as a final solution to the old problem of global illumination. However those hours of precomputation and the limitations coming from precomputed data (eg. you can't move any object) should not be forgotten.
Imagine architectural modeling. An architect or user interactively manipulates the geometry of a building. He wants to see it nicely lit. Existing papers offer him for example relighting (Direct-to-Indirect Transfer for Cinematic Relighting by M. Hasan) - once the building is modeled, you can run an expensive precomputation and when it's done, manipulate the light in realtime. But don't touch the geometry. And don't touch the camera. Both are fixed. Well... can we have more interaction?
Now let's go to games. Games create strong pressure for technological progress, together with sex, however I'll stick to games. While the previous generation had static buildings and precomputed illumination fits such scenarios, the current trend is to make everything destructible, even walls. Soon game developers will start looking around for fast global illumination in dynamic scenes.
OpenMP is a great API for
guys working on multithreaded renderers,
so I'm eager to see it also in gcc (both icc and msvc have it).
It seems that the merge of the already existing contributed code
was blocked by an enormous regression count on gcc trunk,
however the last 2 weeks after the Moscow meeting
and the regression hack-a-thon
showed unbelievable progress, with the serious bug count going from 160 to 105. Go go gcc 4.2!
Who produces faster code, research or business? This time in ray x triangle mesh intersections.
According to the GameTools
technology page, 6 universities and 6 companies have started
work on Realtime Radiosity (among other tasks). Wow!
Rasterization
vs Ray Tracing in the opinions of AMD, Intel, ATI, NVIDIA etc.
Many predict the use of both techniques in different games.
They are partially right, however rasterization with Realtime Radiosity is such strong
(fast) competition that it will stop Ray Tracing progress everywhere except
shiny car racing games. It's nice and easy to write a simple and powerful
renderer, either raytracing or rasterizing,
but competition is tough and it is necessary to improve performance and quality
even if it makes development longer. So complicated hybrid techniques
with raytracing used only for selected objects will compete with
pure rasterizers.
The time when there's enough photorealism and we can sacrifice some performance
to make renderers simpler is still far away.
Interesting reading on unfair rejections of graphics
papers, especially at SIGGRAPH.
Many guys take it as a known problem (I would say they are pragmatic OR
they see the evolutionary
roots behind it and they know these problems can't be fixed easily).
Some guys are unhappy (I would say they don't see the roots).
A few guys have left the
field or academia, work for themselves and express their happiness.
In my opinion, graphics research is fun even with the existing level of
unfairness. There are fewer chances of a single unfair rejection
in business, but look for example at the sad situation with software patents.
Try to do business with a million lawyers trying to kill you.
ATI's support
for GPGPU looks very promising; scatter (the ability to write
to a random address, not present in DirectX 10 or OpenGL 2.1)
could be very helpful for realtime radiosity.
Although ATI confirms that GPGPU computations using their new API
can run in parallel with rendering using DirectX or OpenGL,
it's not clear if they make it possible to run a program with scatter
eg. on an OpenGL texture.
Good news jumped from the mailbox when I returned from vacation -
Realtime Radiosity Bugs were selected for the
bigscreen
show and competition at Eurographics 2006.