Slinger
<The Crazy Swede>
RCX Development
Hero Member
   
Karma: 1
Offline
Posts: 1619
The owls are not what they seem...
« on: February 19, 2010, 09:50:33 am »

This is a place to discuss bugs and implementation problems with the new multithreading feature of rcx.

The current main problem is that the camera sometimes shakes. A quick summary of what is causing the problem:

When physics is simulated, forces are calculated first, and then the bodies are moved. After all bodies are moved, the camera is moved. When graphics is rendered, the camera position is set first, then each body/geom is rendered. Normally graphics gets rendered just after physics is simulated, but since graphics is running in its own thread, it will sometimes render a frame just as physics is moving bodies. The result is that the camera is set from the last simulation step, but some (or all) bodies have moved one more step. This makes the camera appear to stop for a short time (one frame). It's a small artefact, but unfortunately it's quite noticeable.

So I've tried to come up with a solution... The following are my ideas (in the order I've tried them):

1) don't allow physics and graphics to run at the same time.
pros: solved the problem
cons: on some systems where the graphics thread gets a lot of processing priority (graphics_sleep=0 and a multicore processor), the solution backfires: graphics locks out physics, causing serious problems for physics to catch up with realtime.
discarded.

2) don't allow graphics while physics is simulated
similar to 1, but only physics can lock graphics; graphics can't lock physics.
pros: earlier slowdown problems fixed
cons: graphics begins by checking if physics is running (if so, it sleeps until it's done). Since there's no guarantee physics won't start simulating just after graphics checked, the problem still occurs.
discarded.

3) current idea: only render one frame after it's been simulated
every time graphics is about to render a frame, it sleeps until physics signals that a new simulation step is done.
pros: almost guaranteed not to have any problems (unless the system is running with a stepsize too low for it to handle: misconfiguration by the user = not my problem).
bonus advantage: instead of rendering duplicated frames, graphics will only render as many new frames as necessary (stepsize of 0.01 = 100fps max). this is good for performance (more processing power for other stuff), and extra good for single-core systems (where physics and graphics are loading each other on the same core).
cons: limit on max fps. I understand it would be nice to have the fps as high as possible for benchmarking. The solution could be to add a boolean value to internal to choose whether to limit fps or not. That way the game can normally run with a limited fps, which can temporarily be disabled for benchmarking.

update: ok, the latest commit to "slinger/multithread" should contain a working version of the latest idea. It seems to work perfectly.

btw: "Avarage FPS: 1129" Boyah! (this was on windows. I suspect the drivers are locked in triple buffer mode)
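
For illustration only, idea 3 could be sketched roughly like this. This is not the rcx code: Python is used just to keep it short, and `FrameSync`, `publish_step`, `wait_for_new_step` and `run_demo` are hypothetical names. The real physics step and the real drawing are replaced by stand-ins. The render thread sleeps on a condition variable until physics announces a finished step, so each simulated step is drawn at most once.

```python
import threading


class FrameSync:
    """Hypothetical helper: lets a render thread sleep until physics finishes a step."""

    def __init__(self):
        self._cond = threading.Condition()
        self._step = 0          # number of completed physics steps
        self._running = True

    def publish_step(self):
        """Called by the physics thread after all bodies and the camera have moved."""
        with self._cond:
            self._step += 1
            self._cond.notify_all()

    def stop(self):
        """Wake any waiting renderer so it can exit."""
        with self._cond:
            self._running = False
            self._cond.notify_all()

    def wait_for_new_step(self, last_seen):
        """Block until a step newer than last_seen exists.

        Returns the newest step number, or None once stopped with nothing new.
        """
        with self._cond:
            self._cond.wait_for(lambda: self._step > last_seen or not self._running)
            if self._step > last_seen:
                return self._step
            return None


def run_demo(steps=50):
    """Run a stand-in physics loop and render loop; return the steps that got rendered."""
    sync = FrameSync()
    rendered = []

    def render_loop():
        last = 0
        while True:
            step = sync.wait_for_new_step(last)
            if step is None:
                break
            rendered.append(step)   # stand-in for drawing one frame
            last = step

    thread = threading.Thread(target=render_loop)
    thread.start()
    for _ in range(steps):
        sync.publish_step()         # stand-in for one finished physics step
    sync.stop()
    thread.join()
    return rendered
```

The renderer never draws the same step twice, which is where the "stepsize of 0.01 = 100fps max" limit comes from. Note that this only signals when a step is done; it does not stop physics from starting the next step while a frame is still being drawn, which matches the "almost guaranteed" wording above.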
« Last Edit: February 19, 2010, 11:13:54 am by Slinger »

Mac
« Reply #1 on: February 20, 2010, 02:01:33 pm »

"3) current idea: only render one frame after it's been simulated every time graphics will render a frame, it sleeps until physics signals a new simulation step is done. pros: almost guaranteed not to have any problems (unless the system is running with a too low stepsize to be able to handle: misconfiguration by the user=not my problem). bonus advantage: instead of rendering duplicated frames, graphics will only render as many new frames as necessary (stepsize of 0.01 = 100fps max). this is good for performance (more processing power for other stuff), and extra good for single-core systems (where physics and graphics are loading each others on the same core). cons: limit of max fps."

lol. Who's going to care if the game doesn't benchmark above 200/300fps (physics stepsize speed)? Most people will probably be running this game on systems which won't get above 30fps, and anyone that can will probably use vsync.

The latest one works pretty well even on my laptop (averaging 80fps with stepsize 0.002 (O_o)), although I did notice two peculiar things:

1) set event_sleep to 1 and the flippers act weird (they don't reset fully for some reason)
2) using fraps, the fps was going between 65fps and 110fps. It would stay at about 70 for about ten seconds, then go up to 100-110 for about five seconds, then drop back down to 70. I'm not sure if this has to do with system processes; it was just affecting the fps.

How the hell did you get 1129? I need to try this on my desktop on Monday. I want to trash your bench =P

(btw for the record, I'm not at home, I'm at a friend's house... been here all week. Will be home Sunday. Hence why I haven't been on recently ... internet sucks here.)

edit

Startup time (ms): 777
Race time (ms): 31955
Avarage FPS: 59
Threading mode: Multithreaded (3 threads)
Stepsize-too-low (slowdown) warnings: 120

That's on my desktop, using 1280x800 game window resolution, stepsize 0.005 and iterations 10, sync_graphics false and everything else default.

....I think the game is being framelocked xD damn you Windows 7. FPS never goes above 60, even though my laptop hits 100fps... desktop should trash that (should get 500 at least >_>) Oh well.

btw is stepsize-too-low a bad thing if it has lots of warnings? I was sure that on earlier builds it was just a warning for when frames were dropped. My desktop shouldn't be dropping physics events... graphics I'm not bothered about, but physics it shouldn't.

Also, on the earlier build of this which had the event and physics sleep things in internal.conf, I never got any stepsize warnings. never. >_>; just jumpy camera movement on my laptop, that was it. xD
« Last Edit: February 21, 2010, 09:24:18 am by Mac »

Slinger
« Reply #2 on: February 22, 2010, 04:47:12 am »

The stepsize warning is about (hopefully temporary) slowdowns. Basically: the stepsize indicates how much time each physics step should take (and how far into the future it should simulate). One physics step is performed, then the time it took is calculated, and from that the time until the next step. If there is a positive time until the next step, the physics thread will sleep until that point (to sync with realtime). If there is zero or negative time left, the last physics step took more time to compute than it simulated, in which case a stepsize warning is issued and the physics thread will not sleep (instead it will directly simulate another step, hoping it will manage to catch up with realtime).

Having just a few warnings indicates a good stepsize (it pushes the cpu to the limit of what it can handle), but too many indicates a too low stepsize. In your case, 31955 ms (about 32 seconds) of simulation and 120 warnings, that doesn't seem like too many at all. That is, assuming they weren't all caused at some specific point: did you notice any simulation slowdown at some point?

As for vsync (with graphics syncing disabled):
my system doesn't use vsync for opengl (I get around 120fps - it will increase as the currently unstable drivers develop)
under xp, the ati-developed drivers are locked in triple buffer (I get around 1100fps)
running on my dad's quadcore (nvidious graphics), vsync is enabled, and the fps never goes above 59fps.

vsync or not depends on the drivers, really. It would be possible to make rcx request vsync (or request it to be disabled, based on internal.conf), but really, it's not guaranteed that the OS will accept it, or that the drivers will support it. So I simply decided to leave it at whatever is the default.

Hey, I'm thinking about removing the old single-thread code, forcing rcx to always use multithread. It would make it possible to remove both "multithread" and "graphics_threshold" from internal.conf, to make it slightly cleaner. But if there are performance problems (stepsize warnings, low fps) on some systems, those will have to be fixed first.
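
A rough sketch of the step timing rule described at the start of this post, using a virtual clock instead of real sleeping so it can be run anywhere. It is not the rcx implementation: `simulate_timing` and `step_costs` are hypothetical names, and it assumes the deadline for each step is absolute (advanced by one stepsize per step), which is what makes catching up with realtime possible.

```python
def simulate_timing(stepsize, step_costs):
    """Model the physics timing loop on a virtual clock.

    stepsize   -- simulated seconds per physics step
    step_costs -- seconds of computation each step takes (hypothetical input)

    Returns (warnings, total_slept, lag), where lag is how far the loop
    is behind realtime after the last step (0.0 when in sync).
    """
    now = 0.0        # virtual wall clock
    deadline = 0.0   # realtime point that the simulation has reached
    warnings = 0
    total_slept = 0.0

    for cost in step_costs:
        now += cost            # computing this step took `cost` seconds
        deadline += stepsize   # and it simulated `stepsize` seconds
        remaining = deadline - now
        if remaining > 0:
            # Ahead of realtime: sleep until the deadline.
            total_slept += remaining
            now = deadline
        else:
            # Zero or negative time left: warn and go straight to the next step.
            warnings += 1

    return warnings, total_slept, now - deadline
```

With a stepsize of 0.01 and costs of `[0.004, 0.004, 0.025, 0.004, 0.004, 0.004]`, this model issues three warnings (the slow step plus two catch-up steps) and is back in sync by the sixth step, which is the "temporary slowdown" case.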

Mac
« Reply #3 on: February 22, 2010, 05:35:55 am »

Ah. I think I can force vsync off in nvidia control panel O_o lemme go try.

Success.

Startup time (ms): 964
Race time (ms): 37690
Avarage FPS: 199
Threading mode: Multithreaded (3 threads)
Stepsize-too-low (slowdown) warnings: 113

(stepsize is 0.005 ... 200fps. lol. I realised just now that I have the fps limited.)

Startup time (ms): 773
Race time (ms): 50920
Avarage FPS: 3374
Threading mode: Multithreaded (3 threads)
Stepsize-too-low (slowdown) warnings: 7

fps unlimited. OMFG. 3,300fps?! you're kidding me. Anyway, go beat that.

In the nVidia Control Panel there's a "Vertical Sync" option at the bottom of the Manage 3D Settings list. Change it to "force off", and RCX will run without vsync on windows. I think my 2.85Ghz quad-core and my beastly GTX275 have just set a benchmark for this damn game.

Also I think the mass of stepsize warnings I tend to get (on the first two runs, not the third) is because I move the window when the game opens (9 times out of 10 it opens with part of it off the bottom and right of the screen, so I have to move the window). I guess since window resizing or moving locks the graphics until it's set, it probably locks physics up too and causes it to have to catch back up to realtime (hence the warnings)? Anyway on the third one, it opened slap-bang in the middle of my desktop, so I didn't have to touch the window at all.

Personally, I think you need to get the camera to stop shaking when fps isn't limited and is quite low, if that's possible. So long as the fps stays above 30 (or above 60) it isn't visible, and from what I see it works great (seems singlethreading works better on my laptop though ... but it's shit. O_o)

I dunno how physics simulation slowdown might affect online play though. Probably shouldn't... but I worry about silly things.

Slinger
« Reply #4 on: February 22, 2010, 05:59:23 am »

I wouldn't call that a silly thing, in fact it could be a real problem: if one player gets physics simulation slowdowns... That system might fall behind the others. It should catch up, but what if it keeps on going slow? The system will fall behind the others, and since the inputs need to be synced, he/she will not have realtime control over the car anymore! O_o
(key pressed: sent to others, and simulated by them. But not simulated by the local machine until physics catches up = delay in inputs)

hmm... oh well...

tests on my dad's computer:
first: 59fps
vsync disabled (thanks): 490fps
triple buffer (utterly pointless): 461fps

Strange... triple buffer is meant to bypass some restrictions in some games in which physics requires a high enough fps. This isn't the case in rcx, but it should still get a higher fps from it (although it's a faked fps).

And yeah, it seems graphics is frozen when resizing (on windows), and somehow physics gets frozen by it as well (will have to see why). So that seems to be the reason for your warnings.

"Personally, I think you need to get the camera to stop shaking when fps isn't limited and is quite low, if that's possible."

It might be, but I don't know how. The whole idea of having graphics synced with physics is to prevent graphics from rendering a frame while it's being simulated (by only rendering one frame just after it's been simulated, and then sleeping until the next simulation step is done).

Mac
« Reply #5 on: February 22, 2010, 07:14:35 am »

"It should catch up, but what if it keeps on going slow? the system will fall behind the others, and since the inputs need to be synced, he/she will not have realtime control over the car anymore! O_o (key pressed: sent to others, and simulated by them. But not simulated by the local machine until physics catches up = delay in inputs)"

You just explained why NSR runs so weird on my laptop - the "bullet time" effect (where the CPU isn't fast enough to calculate the physics, and desyncs with realtime for a short time before suddenly jumping back up to realtime - a period of slow gameplay followed by a sudden burst of quick gameplay, repeated constantly).

The only solution to prevent it from happening on bad systems is the (gasp) "minimum system requirements". My old laptop can currently handle the 0.005 stepsize at about 30fps, but obviously that's with very simple graphics. When proper graphics (3D, shading, shadows, etc, etc) get implemented, we'll need to find some sort of "ultimate low" system spec where the game will run without any noticeable physics/realtime desyncs (bullet time / input lag), and at the same time have a frame rate that lets the player see where they're going (lol. I remember winning races on trackmania using 2-5fps ... not a joke. my first desktop was trash).

Trackmania isn't really physics-heavy but more graphics-heavy, and it actually runs on anything above 500Mhz, providing it has good-enough dedicated graphics to back it up (I ran Trackmania on an AMD K6-2 @ 433Mhz before, with 64MB RAM ... at one point I got 11fps. woot). NSR on the other hand is a complete resource bastard, and needs 2.4Ghz of processing power (preferably a dual-core). My laptop doesn't meet that requirement too well (2x 1.66Ghz processors), so with a lot of cars on the screen it lags like mad and at most points is desynced from realtime (constantly jolting back and forth ... makes racing hard).

RCX will probably be able to run fine on any decent processor, so long as the system doesn't rely entirely on the CPU for everything (i.e. onboard graphics, poor RAM, etc, etc). I reckon it should run fine on anything over 1.5Ghz ... might run on a 1Ghz processor. Obviously dual/quad cores will have it easier, but if the game frame rate is good enough, having one or two physics frames of input lag every few seconds isn't going to get noticed, especially if the system catches right back up afterwards.

And for online, inputs are never going to get synced properly because of internet latency. RCX is probably going to have to do what GTA4 and Trackmania do and use predictive car movement, or just update the opponents' positions as quickly as possible (done in NSR, although very buggy, as all opponent cars teleport between locations rather than moving smoothly between them, basically causing any collisions to be death (xD)). I don't know how things such as objects are going to work online either, as lag could cause someone to hit a rock, and it would either bounce off in a direction that shouldn't be possible, or ... if they lag enough ... it might suddenly jolt back to its original position as if they didn't hit it (this happens often on GTA4 online, especially with hackers (guilty)).

Slinger
« Reply #6 on: February 22, 2010, 08:55:11 am »

*agrees with everything you just wrote*

Hopefully, the game will not require more cpu performance than it currently does, but will begin putting more load on the graphics processor (dedicated card or on-board, it should be possible to run as long as not all eye-candy is used and it has enough memory).

A random idea for online: what about having a central system simulating the race, and sending important coordinates (cars, weapons) to the other systems. They can then perform their own simulation around that. I guess at some point this could lead to some building exploding for all players - except one with a slower system? "What was that blast wave, I didn't see anything explode?!"

oh, I almost forgot: the whole problem with stepsize warnings on singlethread... My mistake: I decreased stepsize (0.02 to 0.01) without changing graphics_threshold (it still had the old values used on my old laptop).

Mac
« Reply #7 on: February 22, 2010, 09:10:02 am »

My laptop can't render more than 80fps anyway (averages at 86) so even 0.01 will produce stepsize warnings. lol. plus, stepsize 0.02 sucks.

Onboard graphics are terrible though, but my old old desktop has 32MB onboard using DDR RAM, my old desktop 256MB using DDR2500, and my laptop 128MB using DDR2333 (though it has a decent dedicated processor for a laptop). My old laptop has 64MB onboard as well, but I hammered some newer drivers on it and whacked it onto full performance mode, so it should probably run stuff as well as my laptop (just that it lacks any directX pixel shaders ... not sure on OpenGL).

Speaking of OpenGL, when you start utilizing the graphics, make sure that it doesn't go haywire on systems that don't have cards that support certain OpenGL things - my laptop is old so it can't use some OGL things that my desktop can probably take. Obviously, toggles to enable/disable stuff are needed, and eventually some sort of system benchmarker to determine what graphics settings are best will be needed too (I will make sure not to use a shitty benchmark replay like Trackmania does).

"A random idea for online: what about having a central system simulating the race, and sending important coordinates (cars, weapons) to the other systems. They can then perform their own simulation around that. I guess at some point this could lead to some building exploding for all players - except one with a slower system? 'What was that blast wave, I didn't see anything explode?!'"

Certain things, such as buildings exploding, are just triggered events, right? Although it would obviously lag in the scenario where one player sets off a building explosion, since it's an event that triggers the explosion, that event would just be passed along to every other player, and the building would automatically detonate, right?

I guess that having online only give co-ordinates of other cars, with the game objects simulating around that, would work. It could give some weird collisions though, such as a car hitting a large box/object ... if the car isn't being simulated, the box might react as if the car is a non-placeable geom (i.e. it goes PING! ZOOOOOOM!). But if any collisions are detected and calculated correctly, then the only thing that would happen is that the object hit would probably go in a slightly different trajectory on the system of the one hitting the box and on those of everyone else who sees that person hit it (that trajectory could then probably be corrected by the host player's system, so the box ultimately goes the same way for everyone).

Online always lags more than offline because of the extra calculations required to try and keep collisions in sync on all systems. I bet we'll see a lot of teleporting.

(shit, come to think of it, that would mean online races would need their replays saved as co-ord logs, not input logs, to account for weird movements caused by lag ..... LOL.)

Slinger
« Reply #8 on: February 23, 2010, 08:53:42 am »

But if you're using multithread, the fps should not affect the simulation step (stepsize warnings), and you might be able to use "0.01" on the laptop. - I hope.

I've experimented a bit with the slowest system I have access to, an old dell laptop. Technically I wouldn't call it a "laptop", since it eventually gets hot enough to burn through skin (slight exaggeration...). It's a really old system (1.6GHz mobile p4 I think). Using a stepsize of 0.005 on it is only possible with 5 iterations per step. And even then, it seems a bit too resource demanding.

I guess this means a stepsize of 0.01 and an iteration level of 10-20 (resulting in: 100 steps per second, 1000-2000 iterations per second) could work even for old systems. But how much of a problem does a stepsize of 0.01 and 10 iterations give on your old desktop and laptop (when multithreading)?

"Certain things, such as buildings exploding, are just triggered events, right?"

... man, I feel so stupid! Ok, assuming the movement of the resulting bodies of the explosion comes out the same on all systems, that might actually solve it! Just share car/weapon positions and events.

update: since I've got nothing better to do: I'll merge the multithread branch into 0.06_cleanup...
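
The arithmetic behind "100 steps per second, 1000-2000 iterations per second" can be checked with a couple of lines. `solver_load` is a hypothetical helper written for this illustration, not something from rcx; it only multiplies steps per second by iterations per step.

```python
def solver_load(stepsize, iterations):
    """Return (steps per second, solver iterations per second).

    stepsize   -- simulated seconds per physics step
    iterations -- solver iterations performed in each step
    """
    steps_per_second = round(1.0 / stepsize)
    return steps_per_second, steps_per_second * iterations
```

By this count, a stepsize of 0.005 with 5 iterations costs the same number of iterations per second as 0.01 with 10, although other per-step work (collision detection, for example) still doubles with the smaller stepsize.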
« Last Edit: February 23, 2010, 09:49:39 am by Slinger »

Mac
« Reply #9 on: February 23, 2010, 01:42:21 pm »

0.01 on 10 iterations gives me too many random collision bugs, because the stepsize isn't low enough. Ironically, on the objloader build, I use 0.005 with 5 iterations, and I honestly don't see a difference in anything compared to when I use 10 iterations, other than that buildings don't spaz the fps when they get hit.

Plus, that laptop is really old. Try running any modern game on it. My laptop won't run any modern games (mostly because it doesn't have pixel shader 3 on it ... only shader 2), and even my old old laptop (which has a 1.5Ghz celeron M) can run my hack-up rescale build properly - hell, it runs Trackmania - on the 0.005 stepsize and 10 iterations setup.

Also don't forget that RCX will probably be aimed at systems that are better than today's (for the time being we can make it work on rubbish setups, but I think by the time the game gets finished enough to be considered for a true beta release, low-end systems are going to be using 512MB graphics chips as standard (most laptops today come with ATI HD3000 or 4000 series chipsets you know)).

If it runs on my laptop, I think that's good enough. I consider my laptop the "minimum specs" system. If it won't run on my laptop, it sucks. Plus, all of Raydium's demos work at about 40-60fps on it, so if you optimize the VBO enough, it should start to work well even on trash systems.

That old laptop probably CAN load the game, but it just doesn't have the graphics to back it up. A friend of mine has tested RCX on two identical systems except one had a trash GPU and one a new one, and it ran like crap on the trash GPU (about 5fps) whilst it ran at about 100fps on the new one. Same processor, mobo, RAM, everything, just different graphics cards. O_o

"But if you're using multithread, the fps should not affect simulation step (stepsize warnings), and you might be able to use "0.01" on the laptop. - I hope."

I actually run 0.005 on my laptop (and 10 iterations). I think the stepsize warnings are more due to Windows Aero sucking, because if I try to run my laptop in classic (windows 2000 display) mode, no OpenGL apps actually load in windowed mode. I dunno why, because when I had W7 RC on the laptop, doing that both worked and gave me about 40fps extra. =/

I just wanted to clear that up. I use 0.005 on everything, and if it won't tolerate it, I change iterations to 5. Obviously it does have an effect (probably just on collisions on a very small scale) but I don't really notice any difference. You should find out what stepsize and iterations raydium uses, really.

"assuming the movement of the resulting bodies of the explosion gets the same on all systems"

I doubt they would. Things exploding on GTA4 are never synced when online (the server has to pull stuff to the correct direction of travel). Plus, the building might for some reason have slightly different threshold values (because of lag ... i.e. a player hitting a pillar from the building, but on another computer that player lags and their car teleports through the building and doesn't hit it) and that could cause the explosion to happen differently... :/

"But how much problem does a stepsize of 0.01 and 10 iterations give on your old desktop and laptop (when multithreading)?"

I dunno. I don't readily have access to either (as they are used by other members of my family) but both are singlecore so I assume that multithread will just trash them. You really should have left the singlethreading option in.

Mac
« Reply #10 on: February 23, 2010, 03:47:16 pm »

Doublepost.

First off, your coding sucks. (joke)

Two typos in the makefile; you've hit enter on two lines along with the space. (makefile goes WTF?)

....a coding error in main.o means it doesn't compile anyway. :<

Needs fixing lol. (yes, I do just randomly check commits to see that they compile. codie is going to rant over this you know.)

Btw, this is the most recent 0.06_cleanup build. The previous one - where you corrected something in the Text_File - compiles and works.
« Last Edit: February 23, 2010, 03:51:17 pm by Mac »

Slinger
« Reply #11 on: February 25, 2010, 10:07:55 am »

Do you still have that compilation problem? Make sure you begin with a "make -f Makefile.doze clean" before building. Sometimes this is necessary (unfortunately), but you might be right, there could be a typo somewhere.

---

More info on the old laptop: a stepsize of 0.005 and 5 iterations gives only a few stepsize warnings (at the start), but then it runs stable and gets just above 40fps. It's a 1.6GHz p4 mobile, with a 32MB GeForce2 Go. So this is not the most powerful system any longer (all settings at low and it barely runs trackmania at the lowest resolution).

...and I removed singlethreading when I discovered multithreading gave better performance (fewer stepsize warnings, more fps) even on my old singlecore laptop.

---

I must point out a thing I saw in the SDL_Delay manpage:

"Count on a delay granularity of at least 10 ms. Some platforms have shorter clock ticks but this is the most common."

So a stepsize under 0.01 could be quite unreliable. For now, the safest bet could be 0.01 and 20 iterations (possibly 10 if simulation stability seems unaffected).

just random stuff: my htpc (which has a 3GHz c2d and intel g45 graphics) can run rcx at 0.005 stepsize and 200 iterations. It's a bit slow the first second, but then runs stable and seems to go at 60fps at 1920x1080 (intel graphics ftw!).
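
To make the granularity concern concrete, here is a deliberately simplified model. It assumes every sleep request is rounded up to the next multiple of 10 ms, which is an assumption made for illustration and not documented SDL behaviour; real timers depend on the platform. `quantized_delay_ms` and `count_warnings` are hypothetical names, and times are whole milliseconds to avoid rounding noise.

```python
def quantized_delay_ms(requested_ms, granularity_ms=10):
    """Round a requested delay up to the next multiple of the timer granularity.

    Assumed model only: real delay behaviour varies between platforms.
    """
    if requested_ms <= 0:
        return 0
    return -(-requested_ms // granularity_ms) * granularity_ms


def count_warnings(stepsize_ms, cost_ms, steps, granularity_ms=10):
    """Count stepsize warnings when every sleep is quantized.

    stepsize_ms -- simulated time per physics step
    cost_ms     -- time spent computing each step (hypothetical, constant)
    """
    now = 0        # virtual wall clock in ms
    deadline = 0   # realtime point the simulation has reached
    warnings = 0
    for _ in range(steps):
        now += cost_ms
        deadline += stepsize_ms
        remaining = deadline - now
        if remaining > 0:
            # The sleep overshoots because of the coarse timer.
            now += quantized_delay_ms(remaining, granularity_ms)
        else:
            warnings += 1   # behind realtime: no sleep, step again at once
    return warnings
```

In this model, ten steps costing 1 ms each produce 6 warnings at a 5 ms stepsize but only 1 at a 10 ms stepsize: below the granularity, steps arrive in bursts separated by oversized sleeps even though the cpu is mostly idle.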

Mac
« Reply #12 on: February 25, 2010, 10:43:56 am »

O_o I don't notice anything strange with 0.005, but 0.01 doesn't seem like it will be fast enough to keep up with high-speed things.

Maybe when we get replays working and stuff we'll be able to find out. You could try to find out which platforms are affected and which aren't, but I don't notice anything strange between 0.005 and 0.01 except that 0.01 is more prone to weird collisions (due to the coarser physics step) and also the wheel axes aren't quite as stable as they are on 0.005. :<

As I said, the previous commit worked - I deleted all files before compiling the first, and then deleted them again before trying the second. You should try compiling it on windows and see O_o maybe your most recent changes have solved it. I dunno. xD

And my desktop could probably run the game on 0.005 and 200 iterations with where it's currently at =P I mean, 3,400fps on 0.005 and 10 iterations? please. But obviously fewer iterations will be possible when the game gets more complex (proper 3D graphics, textures, effects, scripting, AI, etc).

I think the game could run on 0.01 without too many problems (personally I want to know what Rollcage uses ... I don't think it's very high since the game is framelocked to 30fps - it's probably 60 or 120 steps per second or something like that) but I prefer 0.005 if it proves to be reliable enough.

edit

Most recent commit compiles. yayuhz.
« Last Edit: February 25, 2010, 10:51:14 am by Mac »

Slinger
« Reply #13 on: February 25, 2010, 10:56:30 am »

I guess rollcage used some approach of "do one collision detection, one iteration" (it didn't really simulate any joints or other modern dynamic stuff, so it could probably get away with some really cpu-easy solution). And it probably moved the cars out of anything they collided with (could be possible in ode, but is not the default or the easiest solution).

"Most recent commit compiles. yayuhz."

Yay!

*realizes I've been sitting in front of the computer way too long...* well, I'm gonna call it a day, be back tomorrow!